python - When combining pandas DataFrames (concat or append), can I set the default value?
Starting from a previous question: pandas merge 2 dataframes with different columns

If I concat two dataframes (A & B) that have some of the same columns, but also have columns that are not present in both, then in the resulting dataframe the entries in the columns that are not common to both A & B have a value of NaN. Is there a way to make these entries have a different default value?

I would rather not simply replace NaN after the concat operation, as there may be NaN values in the original dataframes that I want to preserve.

Here are two example dataframes:
    hello  world  how
        1      2    3      g
        5   -666   11      h
       13    nan
       23      7   29      j

    extra          how
             1.1    31
        b   -666    37
        c    1.3    41
        d    nan    43
     -666    1.7  -666
If, for example, the default value to use in the disjoint columns is "w4l" instead of NaN, the desired result would be:
    hello  world   how
        1      2     3      g    w4l
        5   -666    11      h    w4l
       13    nan                 w4l
       23      7    29      j    w4l
      w4l    w4l    31           1.1
      w4l    w4l    37      b   -666
      w4l    w4l    41      c    1.3
      w4l    w4l    43      d    nan
      w4l    w4l  -666   -666    1.7
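To make the concern concrete, here is a minimal sketch of why filling after the concat does not work. The frames `a` and `b` and their column names are made up for illustration; they are not the example data above.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frames (assumed for illustration only).
a = pd.DataFrame({"x": [1.0, np.nan], "only_a": ["p", "q"]})
b = pd.DataFrame({"x": [3.0, 4.0], "only_b": [0.5, np.nan]})

combined = pd.concat([a, b], ignore_index=True)

# concat fills the columns that one frame lacks with NaN ...
introduced = combined["only_b"].iloc[:2].isna().all()

# ... so a blanket fillna cannot tell those cells apart from the NaN
# that was already present in a["x"], and replaces both kinds.
filled = combined.fillna("w4l")
original_nan_lost = filled.loc[1, "x"] == "w4l"
```

Here `original_nan_lost` comes out true, which is exactly the side effect the question wants to avoid.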
A possible solution is to 'conform' the indices before concatenating both dataframes; in this step it is possible to define a fill_value:
    common_columns = df1.columns.union(df2.columns)
    df1 = df1.reindex(columns=common_columns, fill_value='w4l')
    df2 = df2.reindex(columns=common_columns, fill_value='w4l')
    pd.concat([df1, df2])

With the example data:
    In [32]: common_columns = df1.columns.union(df2.columns)

    In [34]: df1 = df1.reindex(columns=common_columns, fill_value='w4l')

    In [35]: df1
    Out[35]:
             hello   how  world
    0     g      1     3      2   w4l
    1     h      5    11   -666   w4l
    2           13          nan   w4l
    3     j     23    29      7   w4l

    In [36]: df2 = df2.reindex(columns=common_columns, fill_value='w4l')

    In [37]: pd.concat([df1, df2])
    Out[37]:
             hello   how  world
    0     g      1     3      2   w4l
    1     h      5    11   -666   w4l
    2           13          nan   w4l
    3     j     23    29      7   w4l
    0          w4l    31    w4l   1.1
    1     b    w4l    37    w4l  -666
    2     c    w4l    41    w4l   1.3
    3     d    w4l    43    w4l   nan
    4  -666    w4l  -666    w4l   1.7

As you can see, the original NaNs are preserved.
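The same idea can be wrapped in a small helper that handles any number of frames. This is only a sketch: the function name `concat_with_default`, the toy frames `a` and `b`, and the use of `ignore_index=True` are assumptions for illustration and are not part of the answer above.

```python
import numpy as np
import pandas as pd

def concat_with_default(frames, fill_value):
    """Concatenate frames, using fill_value only for columns a frame lacks."""
    # Union of all column labels across the frames.
    all_columns = frames[0].columns
    for frame in frames[1:]:
        all_columns = all_columns.union(frame.columns)
    # reindex adds the missing columns filled with fill_value;
    # cells that already exist (including NaN) are left untouched.
    aligned = [f.reindex(columns=all_columns, fill_value=fill_value) for f in frames]
    # ignore_index=True renumbers the rows (an assumption; drop it to keep
    # the original row labels, as in the session above).
    return pd.concat(aligned, ignore_index=True)

# Hypothetical miniature frames (assumed for illustration only).
a = pd.DataFrame({"x": [1.0, np.nan], "only_a": ["p", "q"]})
b = pd.DataFrame({"x": [3.0, 4.0], "only_b": [0.5, np.nan]})
result = concat_with_default([a, b], "w4l")
```

Note that a column mixing numbers with a string default such as "w4l" ends up with object dtype, so a numeric sentinel may be preferable if later arithmetic on those columns is needed.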