python - Pivoting a pandas dataframe with duplicate index values
I have a data frame that has a row for each user joining my site and for each user making a purchase.
+---+-----+--------------------+---------+--------+-----+
|   | uid | msg                | _time   | gender | age |
+---+-----+--------------------+---------+--------+-----+
| 0 | 1   | confirmed_settings | 1/29/15 | m      | 37  |
| 1 | 1   | sale               | 4/13/15 | m      | 37  |
| 2 | 3   | confirmed_settings | 4/19/15 | m      | 35  |
| 3 | 4   | confirmed_settings | 2/21/15 | m      | 21  |
| 4 | 5   | confirmed_settings | 3/28/15 | m      | 18  |
| 5 | 4   | sale               | 3/15/15 | m      | 21  |
+---+-----+--------------------+---------+--------+-----+

I want to change the dataframe so that each row is a unique uid, and there are columns called sale and confirmed_settings that hold the timestamp of the action. Note that not every user has a sale, but every user has a confirmed_settings. Like below:
+---+-----+--------------------+---------+---------+--------+-----+
|   | uid | confirmed_settings | sale    | _time   | gender | age |
+---+-----+--------------------+---------+---------+--------+-----+
| 0 | 1   | 1/29/15            | 4/13/15 | 1/29/15 | m      | 37  |
| 1 | 3   | 4/19/15            | null    | 4/19/15 | m      | 35  |
| 2 | 4   | 2/21/15            | 3/15/15 | 2/21/15 | m      | 21  |
| 3 | 5   | 3/28/15            | null    | 3/28/15 | m      | 18  |
+---+-----+--------------------+---------+---------+--------+-----+

To do this, I am trying:
    df1 = df.pivot(index='uid', columns='msg', values='_time').reset_index()
    df1 = df1.merge(df[['uid', 'gender', 'age']].drop_duplicates(), on='uid')

but I get the error:

    ValueError: Index contains duplicate entries, cannot reshape
How can I pivot a df with duplicate index values to transform my dataframe?
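For context, `pivot` raises this error when at least one (uid, msg) pair occurs more than once, because it cannot put two values into a single cell. A minimal sketch for locating such pairs, using a small hypothetical frame (the repeated `sale` row for uid 1 is invented for illustration; the sample table above contains no such repeat):

```python
import pandas as pd

# Hypothetical data: uid 1 has two 'sale' rows, which is what breaks pivot().
df = pd.DataFrame({
    "uid": [1, 1, 1, 3],
    "msg": ["confirmed_settings", "sale", "sale", "confirmed_settings"],
    "_time": ["1/29/15", "4/13/15", "5/02/15", "4/19/15"],
})

# keep=False marks every member of a duplicated (uid, msg) group.
dupes = df[df.duplicated(subset=["uid", "msg"], keep=False)]
print(dupes)
```

Any rows this prints are the ones that need to be aggregated or dropped before `pivot` can reshape the frame.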
Edit:

    df1 = df.pivot_table(index='uid', columns='msg', values='_time').reset_index()

gives the error `DataError: No numeric types to aggregate`, so I'm not sure this is the right path to go on.
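`pivot_table` does aggregate duplicates, but its default `aggfunc` is the mean, which cannot be applied to string timestamps. One possible approach is to pass an aggregation that works on any dtype, such as `'first'`. This is a sketch that assumes keeping the first timestamp per (uid, msg) pair is acceptable; the frame below reproduces the sample table from the question:

```python
import pandas as pd

df = pd.DataFrame({
    "uid": [1, 1, 3, 4, 5, 4],
    "msg": ["confirmed_settings", "sale", "confirmed_settings",
            "confirmed_settings", "confirmed_settings", "sale"],
    "_time": ["1/29/15", "4/13/15", "4/19/15", "2/21/15", "3/28/15", "3/15/15"],
    "gender": ["m"] * 6,
    "age": [37, 37, 35, 21, 18, 21],
})

# aggfunc='first' keeps the first timestamp per (uid, msg) instead of averaging.
wide = df.pivot_table(index="uid", columns="msg", values="_time",
                      aggfunc="first").reset_index()

# Re-attach the per-user attributes, one row per uid.
users = df[["uid", "gender", "age"]].drop_duplicates()
df1 = wide.merge(users, on="uid")
print(df1)
```

If the latest event should win instead, `aggfunc='last'` (after sorting by a parsed `_time`) would be the analogous choice.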
x is the data frame that you have as input:

       uid                 msg    _time gender  age
    0    1  confirmed_settings  1/29/15      m   37
    1    1                sale  4/13/15      m   37
    2    3  confirmed_settings  4/19/15      m   35
    3    4  confirmed_settings  2/21/15      m   21
    4    5  confirmed_settings  3/28/15      m   18
    5    4                sale  3/15/15      m   21

    y = x.pivot(index='uid', columns='msg', values='_time')
    x.join(y).drop('msg', axis=1)

gives you:
       uid    _time gender  age confirmed_settings     sale
    0    1  1/29/15      m   37                NaN      NaN
    1    1  4/13/15      m   37            1/29/15  4/13/15
    2    3  4/19/15      m   35                NaN      NaN
    3    4  2/21/15      m   21            4/19/15      NaN
    4    5  3/28/15      m   18            2/21/15  3/15/15
    5    4  3/15/15      m   21            3/28/15      NaN
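In the output above, `x.join(y)` matches x's row labels (0 to 5) against y's `uid` index, so the pivoted columns are aligned by row label rather than by user. A sketch of one way to align on `uid` and keep one row per user, assuming `x` is the input frame shown above and that the `confirmed_settings` row should supply `_time` (both choices are assumptions, not part of the original answer):

```python
import pandas as pd

x = pd.DataFrame({
    "uid": [1, 1, 3, 4, 5, 4],
    "msg": ["confirmed_settings", "sale", "confirmed_settings",
            "confirmed_settings", "confirmed_settings", "sale"],
    "_time": ["1/29/15", "4/13/15", "4/19/15", "2/21/15", "3/28/15", "3/15/15"],
    "gender": ["m"] * 6,
    "age": [37, 37, 35, 21, 18, 21],
})

y = x.pivot(index="uid", columns="msg", values="_time")

# on='uid' matches x's uid column (not its row labels) against y's index.
joined = x.join(y, on="uid")

# Keep only the confirmed_settings row for each user, then drop msg.
result = (joined[joined["msg"] == "confirmed_settings"]
          .drop("msg", axis=1)
          .reset_index(drop=True))
print(result)
```

This still relies on each (uid, msg) pair being unique; if the real data has repeats, `pivot` will raise the same ValueError and the duplicates need to be aggregated first.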