Python Pandas: How to groupby and compare columns

August 15, 2015

Here is my DataFrame 'df':

match           name                   group
adamant         adamant home network   86
adamant         adamant, ltd.          86
adamant bild    tov adamant-bild       86
360works        360works               94
360works        360works.com           94

For each group number, I want to compare the names one by one and see whether they are matched to the same word in the 'match' column.

So the desired output is a set of counts: if the names match, count it as 'tp'; if not, count it as 'fn'.

I had the idea of counting the number of match words per group number, but that is not exactly what I want:

df.groupby('group').count()

Does anybody have an idea how to do it?
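For context, here is a minimal sketch of what the count attempt returns on the sample data, assuming the column name is passed as the string 'group'. The variable names row_counts and distinct_matches are only illustrative. It shows that count() gives the number of rows per group, which is why it does not answer the question, while nunique() gives the number of distinct 'match' words per group.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    [
        ["adamant", "adamant home network", 86],
        ["adamant", "adamant, ltd.", 86],
        ["adamant bild", "tov adamant-bild", 86],
        ["360works", "360works", 94],
        ["360works", "360works.com", 94],
    ],
    columns=["match", "name", "group"],
)

# count() reports the number of non-null values per column in each group
row_counts = df.groupby("group").count()

# nunique() reports how many distinct 'match' words each group contains
distinct_matches = df.groupby("group")["match"].nunique()

print(row_counts)
print(distinct_matches)
```

Neither result compares 'name' against 'match', so a per-row comparison is still needed.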

If I understood your unclear question correctly, this should work:

import re
import pandas

df = pandas.DataFrame([['adamant', 'adamant home network', 86],
                       ['adamant', 'adamant, ltd.', 86],
                       ['adamant bild', "tov adamant-bild", 86],
                       ['360works', '360works', 94],
                       ['360works ', "360works.com ", 94]],
                      columns=['match', 'name', 'group'])


def my_function(group):
    for i, row in group.iterrows():
        # Keep only the letters of each value, lowercased, and check
        # whether the 'match' string is contained in the 'name' string.
        if ''.join(re.findall("[a-zA-Z]+", row['match'])).lower() not in ''.join(
                re.findall("[a-zA-Z]+", row['name'])).lower():
            # If one of the inclusions fails, return 'fn'.
            return 'fn'
    # If all inclusions succeed, return 'tp'.
    return 'tp'


res_series = df.groupby('group').apply(my_function)
res_series.name = 'count'
res_df = res_series.reset_index()
print(res_df)

This gives the following DataFrame:

   group count
0     86    tp
1     94    tp
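The answer above yields one label per group. If the goal is instead a count of 'tp' and 'fn' rows within each group, one possible variant is sketched below. It is not part of the original answer: the helper names letters_only and label_row are hypothetical, and it assumes the same containment rule (letters only, lowercased, so digits and punctuation are ignored).

```python
import re
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    [
        ["adamant", "adamant home network", 86],
        ["adamant", "adamant, ltd.", 86],
        ["adamant bild", "tov adamant-bild", 86],
        ["360works", "360works", 94],
        ["360works", "360works.com", 94],
    ],
    columns=["match", "name", "group"],
)


def letters_only(text):
    # Hypothetical helper: keep ASCII letters only, lowercased
    return "".join(re.findall("[a-zA-Z]+", text)).lower()


def label_row(row):
    # 'tp' when the normalized 'match' is contained in the normalized 'name'
    if letters_only(row["match"]) in letters_only(row["name"]):
        return "tp"
    return "fn"


df["label"] = df.apply(label_row, axis=1)

# Cross-tabulate groups against labels; reindex so both columns always exist
counts = pd.crosstab(df["group"], df["label"]).reindex(
    columns=["tp", "fn"], fill_value=0
)
print(counts)
```

On the sample data every row is labelled 'tp', so group 86 has three 'tp' rows and group 94 has two, with no 'fn' rows in either group.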
