python - Modifying values of small slice of a column

April 15, 2014

I'm trying to center (subtract the average from) a slice of a column. In the following example, supercase is the variable that groups the observations: for each group I take the average and assign, in the same position, the old value minus the average. I'm working with a bigger DataFrame (477 rows × 85 columns), but I made a test df to show the point.

    import random as rd
    import pandas as pd

    # 10-row, 3-column DataFrame of random floats
    test = pd.DataFrame([[rd.random() for n in range(3)] for n in range(10)],
                        columns=["var{}".format(n+1) for n in range(3)])
    # supercase is the column that groups observations (rows)
    test["supercase"] = [1000]*2 + [2000]*4 + [3000]*3 + [4000]
    # random metadata fluff
    for n, _lett in zip(range(3), list("abc")):
        test["metadata{}".format(n+1)] = [_lett*int(rd.random()*10) for _ in range(len(test.index))]

    # the vars I want to work on
    _vars = test.columns[:3]
    # list of supercases to work on
    supercases = test.supercase.unique()

    # go through the calculations
    for var in _vars:
        for sc in supercases:
            test[var][test.supercase == sc] = test[var][test.supercase == sc] - test[var][test.supercase == sc].mean()

(I realize that a group with one observation will have a centered value of zero.)

Nevertheless, after waiting quite a bit (with the original df), I get the following warning:

    c:\python27\lib\site-packages\ipython\kernel\__main__.py:5: SettingWithCopyWarning:
    A value is trying to be set on a copy of a slice from a DataFrame

    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

I wasn't sure what that meant, so I tried creating a copy of the df and doing the assignments on the new df:

    test_ctr = pd.DataFrame(test)  # to avoid two variables pointing to the same object

    for var in _vars:
        for sc in supercases:
            test_ctr[var][test_ctr.supercase == sc] = test[var][test.supercase == sc] - test[var][test.supercase == sc].mean()

This made me notice that both test_ctr (as expected) and test were being modified, which made me even more confused.

How should this be done, then? The link above describes the following as the proper way, but that would make me have to save the index values:

dfc.loc[0,'a'] = 11

Is there something I'm missing? Especially in the case of the test df being modified?

Cheers, and thanks!

I'm not sure I can give a great explanation of the warning beyond what's in the documentation, but it appears that what you did works fine and that the warning doesn't really apply in the case where it appears.
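For what it's worth, the `.loc` form quoted in the question also accepts a boolean mask in place of row labels, so the index values don't need to be saved. The following is only a minimal sketch of that idea: the small frame `df` and its values are made up for illustration and stand in for the question's `test`.

```python
import pandas as pd

# Hypothetical stand-in for the question's `test` frame
df = pd.DataFrame({
    "var1": [1.0, 3.0, 2.0, 4.0, 6.0, 5.0],
    "supercase": [1000, 1000, 2000, 2000, 2000, 3000],
})

for sc in df["supercase"].unique():
    mask = df["supercase"] == sc  # boolean row selector for this group
    # A single .loc call selects rows and column together, so the
    # assignment targets df itself rather than an intermediate object.
    df.loc[mask, "var1"] = df.loc[mask, "var1"] - df.loc[mask, "var1"].mean()

print(df)
```

Because rows and column are selected in one indexing step, this form avoids the chained `df[col][mask] = ...` pattern that the warning is about.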

Nevertheless, there's a faster and easier way to do what you want, which is close to the groupby() example in the documentation.

    test[['var1','var2','var3','supercase']]

           var1      var2      var3  supercase
    0  0.107989  0.275314  0.688784       1000
    1  0.743372  0.726421  0.457137       1000
    2  0.946661  0.469229  0.145584       2000
    3  0.562564  0.040528  0.150148       2000
    4  0.213042  0.934673  0.713870       2000
    5  0.851200  0.371629  0.239308       2000
    6  0.555617  0.502027  0.862414       3000
    7  0.386040  0.954245  0.392592       3000
    8  0.431534  0.088997  0.016639       3000
    9  0.207693  0.269625  0.189688       4000

    test.groupby('supercase')[_vars].transform( lambda x: x - x.mean() )

           var1      var2      var3
    0 -0.317692 -0.225554  0.115823
    1  0.317692  0.225554 -0.115823
    2  0.303294  0.015214 -0.166643
    3 -0.080803 -0.413487 -0.162079
    4 -0.430325  0.480658  0.401643
    5  0.207833 -0.082386 -0.072920
    6  0.097887 -0.013063  0.438533
    7 -0.071691  0.439156 -0.031290
    8 -0.026196 -0.426092 -0.407242
    9  0.000000  0.000000  0.000000
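The transform call above returns a new frame and leaves test unchanged. If the centered values should replace the originals, one possible way is to assign the result back to the same columns. This is a sketch under assumptions: the frame `df`, its values, and the `cols` list are invented for illustration.

```python
import pandas as pd

# Hypothetical frame with two numeric columns, a grouping column, and metadata
df = pd.DataFrame({
    "var1": [1.0, 3.0, 2.0, 4.0, 6.0, 5.0],
    "var2": [10.0, 20.0, 1.0, 2.0, 3.0, 7.0],
    "supercase": [1000, 1000, 2000, 2000, 2000, 3000],
    "metadata1": list("abcdef"),
})
cols = ["var1", "var2"]  # columns to center (illustrative names)

# transform returns a frame with the same index as df, so it can be
# assigned straight back to the selected columns
centered = df.groupby("supercase")[cols].transform(lambda x: x - x.mean())
df[cols] = centered

# After centering, every group mean should be (numerically) zero
group_means = df.groupby("supercase")[cols].mean()
print(group_means)
```

The non-numeric columns are untouched because only `cols` is selected from the groupby and only `cols` is assigned.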

In terms of getting a copy of the DataFrame, this is the standard way:

test_ctr = test.copy()

I would have guessed that what you tried, test_ctr = pd.DataFrame(test), would have worked, but apparently not!
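A likely explanation, offered with some caution, is that when the DataFrame constructor is given an existing DataFrame it does not copy the data by default, so the new object can end up sharing the underlying values with the original. The exact behavior depends on the pandas version. The sketch below only demonstrates the `copy()` side, using a small made-up frame.

```python
import pandas as pd

# Hypothetical stand-in for the question's `test` frame
test = pd.DataFrame({"var1": [1.0, 2.0, 3.0], "supercase": [1000, 1000, 2000]})

test_ctr = test.copy()           # deep copy by default: the data is duplicated
test_ctr.loc[0, "var1"] = -99.0  # modify only the copy

print(test.loc[0, "var1"])       # the original value is untouched
print(test_ctr.loc[0, "var1"])
```

If the constructor is preferred for some reason, passing `copy=True` explicitly, as in `pd.DataFrame(test, copy=True)`, should request an independent copy as well.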
