Python Pandas: How to groupby and compare columns
Here is my DataFrame 'df':

    match           name                   group
    adamant         adamant home network   86
    adamant         adamant, ltd.          86
    adamant bild    tov adamant-bild       86
    360works        360works               94
    360works        360works.com           94

Per group number, I want to compare the names one by one and see whether each is matched to the same word in the 'match' column.
So the desired output is a count: if the name matches, count it as 'tp'; if not, count it as 'fn'.

I had the idea of counting the number of match words per group number, but that is not what I want:
    df.groupby('group').count()

Does anybody have an idea how to do it?
If I understood your unclear question correctly, this should work:
    import re

    import pandas

    df = pandas.DataFrame(
        [['adamant', 'adamant home network', 86],
         ['adamant', 'adamant, ltd.', 86],
         ['adamant bild', 'tov adamant-bild', 86],
         ['360works', '360works', 94],
         ['360works ', '360works.com ', 94]],
        columns=['match', 'name', 'group'])


    def my_function(group):
        for i, row in group.iterrows():
            # Keep only the letters of each value, lowercased,
            # then check that 'match' is included in 'name'.
            match_letters = ''.join(re.findall("[a-zA-Z]+", row['match'])).lower()
            name_letters = ''.join(re.findall("[a-zA-Z]+", row['name'])).lower()
            if match_letters not in name_letters:
                # If one of the inclusions fails, return 'fn'.
                return 'fn'
        # If all inclusions succeed, return 'tp'.
        return 'tp'


    res_series = df.groupby('group').apply(my_function)
    res_series.name = 'count'
    res_df = res_series.reset_index()
    print(res_df)

This will give you the following dataframe:
       group count
    0     86    tp
    1     94    tp
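
Note that this returns a single label per group, not a count. If what you actually want is the number of 'tp' and 'fn' rows in each group, one possible variation is to label every row first and then tabulate the labels per group. This is only a sketch under that assumption: the helpers `letters_only` and `label_row` and the `label` column are names made up for illustration, and the letters-only comparison is the same one used above.

```python
import re

import pandas as pd

df = pd.DataFrame(
    [
        ["adamant", "adamant home network", 86],
        ["adamant", "adamant, ltd.", 86],
        ["adamant bild", "tov adamant-bild", 86],
        ["360works", "360works", 94],
        ["360works ", "360works.com ", 94],
    ],
    columns=["match", "name", "group"],
)


def letters_only(text):
    """Hypothetical helper: keep only ASCII letters, lowercased."""
    return "".join(re.findall("[a-zA-Z]+", text)).lower()


def label_row(row):
    """Hypothetical helper: 'tp' if the cleaned match is contained in the cleaned name."""
    return "tp" if letters_only(row["match"]) in letters_only(row["name"]) else "fn"


# Label each row individually instead of the whole group at once.
df["label"] = df.apply(label_row, axis=1)

# One row per group, one column per label; cells hold the number of rows.
# reindex makes sure both columns exist even if a label never occurs.
counts = pd.crosstab(df["group"], df["label"]).reindex(
    columns=["tp", "fn"], fill_value=0
)
print(counts)
```

With the sample data every row matches, so the 'fn' column should contain only zeros; it is kept so the result has the same shape whatever the data looks like.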