nlargest of each group
df.groupby('team').apply(lambda x: x.nlargest(3, 'points')).reset_index(drop=True)

  team   player  points
0    A    Alice      15
1    A   Carmen      13
2    A    Becky      11
3    B    Greta      29
4    B     Fran      28
5    B     Iris      25
6    C     Lucy      23
7    C    Molly      18
8    C  Ophelia      15
Here is what the above code is doing:
1. groupby('team')
2. apply(lambda x: x.nlargest(3, 'points'))
3. reset_index(drop=True)
1. groupby('team')
This is the same as before.
2. apply(lambda x: x.nlargest(3, 'points'))
This is the new part.
apply() is a method that takes another function as an argument and calls it once for each group.
In this case, the function we’re passing to apply() is lambda x: x.nlargest(3, 'points').
lambda x: x.nlargest(3, 'points') is a function that takes a dataframe as an argument and returns the 3 rows with the highest values in the 'points' column, sorted from highest to lowest.
So, when we call apply(lambda x: x.nlargest(3, 'points')) on the grouped dataframe, we’re telling pandas to run that function on each team’s rows and stack the results, which leaves the 3 rows with the highest 'points' values for every team.
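To make the per-group step concrete, here is a minimal sketch that spells it out by hand: it loops over the groups, runs the same nlargest() call on each one, and stacks the pieces. The dataframe is an assumption, since the original data is not shown: the nine top rows are taken from the output above, while Dana, Hana and Nina are invented lower scorers so that every team has something to filter out. The helper name top3 is also just illustrative.

```python
import pandas as pd

# Hypothetical data: nine rows match the output shown above; Dana, Hana and
# Nina are invented lower scorers so each team has four players.
df = pd.DataFrame({
    'team':   ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
    'player': ['Alice', 'Becky', 'Carmen', 'Dana', 'Fran', 'Greta',
               'Hana', 'Iris', 'Lucy', 'Molly', 'Nina', 'Ophelia'],
    'points': [15, 11, 13, 9, 28, 29, 20, 25, 23, 18, 12, 15],
})

def top3(group):
    # Same job as the lambda: keep the 3 rows with the highest 'points'.
    return group.nlargest(3, 'points')

# Iterating over a groupby yields (group key, sub-dataframe) pairs.
pieces = [top3(group) for _, group in df.groupby('team')]

# Stack the per-team results and renumber the rows from 0.
result = pd.concat(pieces).reset_index(drop=True)
print(result)
```

With this data, the loop leaves three rows per team, in the same order as the output shown above. apply() does the looping and stacking for you, which is why the one-liner is shorter.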
3. reset_index(drop=True)
This is the same as before: it throws away the index left over from the grouping step and renumbers the rows starting from 0.
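As a small illustration of what reset_index(drop=True) changes, the sketch below prints the row labels before and after the call. The four-row dataframe is hypothetical, and the per-team rows are collected with pd.concat() rather than apply() so that the leftover labels are easy to see.

```python
import pandas as pd

# Hypothetical two-team frame, used only to show the index labels.
df = pd.DataFrame({
    'team':   ['A', 'A', 'B', 'B'],
    'player': ['Alice', 'Becky', 'Fran', 'Greta'],
    'points': [15, 11, 28, 29],
})

# Top scorer per team; each row keeps the label it had in df.
top = pd.concat([g.nlargest(1, 'points') for _, g in df.groupby('team')])
print(top.index.tolist())      # original labels: [0, 3]

# drop=True discards the old labels and numbers the rows 0, 1, ...
clean = top.reset_index(drop=True)
print(clean.index.tolist())    # renumbered: [0, 1]

# Without drop=True the old labels are kept as a new 'index' column.
kept = top.reset_index()
print(kept.columns.tolist())   # ['index', 'team', 'player', 'points']
```

Leaving out drop=True is useful when the old labels still matter, for example to trace a row back to its position in the original dataframe.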