How to group DataFrame rows into lists with pandas groupby
Grouping rows into lists with pandas' groupby can be done using the agg() function with list:
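A minimal sketch of that call, assuming a small made-up DataFrame (the `team`, `player` and `score` columns are illustrative, not from any real dataset):

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "player": ["Ann", "Bob", "Cy", "Di", "Ed"],
    "score": [1, 2, 3, 4, 5],
})

# Collect every non-key column into one list per group.
grouped = df.groupby("team").agg(list)
print(grouped)
```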
You will get a DataFrame where each cell is a list of the grouped values. As easy as pie.
More efficient aggregations
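For larger datasets, efficiency is crucial. Passing pd.Series.tolist to agg() is a common alternative to list; time both on your own data, because any speed difference depends on the pandas version and the column dtype: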
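As a rough sketch, here is `pd.Series.tolist` used as the aggregation function on a single column. The data is hypothetical, and no speed-up is guaranteed:

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "score": [1, 2, 3, 4, 5],
})

# Select one column, then turn each group's Series into a plain list.
grouped = df.groupby("team")["score"].agg(pd.Series.tolist)
print(grouped)
```

Selecting only the column you need before aggregating avoids building lists for columns you will never use.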
Or try grouping multiple columns with a dictionary:
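One way this might look, assuming the same kind of made-up data, is a dictionary that maps each column name to the function applied to it:

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "player": ["Ann", "Bob", "Cy", "Di", "Ed"],
    "score": [1, 2, 3, 4, 5],
})

# Keys are column names, values are the aggregation for that column.
grouped = df.groupby("team").agg({"player": list, "score": list})
print(grouped)
```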
Grouping: beyond groupby
When the categories are known and few, np.unique combined with array splitting and a list comprehension can be a quick alternative:
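The idea can be sketched like this: sort the rows by key, find where each key starts with `np.unique`, and split the value array at those positions. The data and variable names are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["b", "a", "b", "a", "b"],
    "score": [3, 1, 4, 2, 5],
})

# A stable sort keeps the original row order inside each group.
team = df["team"].to_numpy(dtype=str)
order = np.argsort(team, kind="stable")
keys = team[order]
values = df["score"].to_numpy()[order]

# Sorted unique keys plus the index where each key first appears.
unique_keys, first_idx = np.unique(keys, return_index=True)

# Split the values at every group start except the leading 0.
chunks = np.split(values, first_idx[1:])
grouped = {key: chunk.tolist() for key, chunk in zip(unique_keys.tolist(), chunks)}
print(grouped)
```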
Advanced grouping methods
For more complex aggregations, apply multiple functions to the same column:
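A small sketch using named aggregation, where each output column is defined by a (column, function) pair. The output names `scores`, `total` and `average` are made up for this example:

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "score": [1, 2, 3, 4, 5],
})

# Three different aggregations of the same "score" column.
summary = df.groupby("team").agg(
    scores=("score", list),
    total=("score", "sum"),
    average=("score", "mean"),
)
print(summary)
```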
You can also go one step further with custom aggregation functions:
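For instance, a custom function could return only the distinct values of each group, sorted. The helper `unique_sorted` below is a hypothetical name, not a pandas function:

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "a", "b", "b"],
    "tag": ["y", "x", "y", "z", "z"],
})


def unique_sorted(values: pd.Series) -> list:
    """Return the group's distinct values as a sorted list."""
    return sorted(set(values))


# Any callable that takes a Series and returns one object can be passed to agg().
grouped = df.groupby("team")["tag"].agg(unique_sorted)
print(grouped)
```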
Pro tip: Enhance performance
In large groupby operations, small changes can notably affect performance. Try sorting before grouping:
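A sketch of that idea follows. Whether it actually helps depends on your data and pandas version, so treat it as something to measure rather than a rule:

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["b", "a", "b", "a", "b"],
    "score": [3, 1, 4, 2, 5],
})

# Sort once up front; a stable sort preserves row order within each team.
df_sorted = df.sort_values("team", kind="stable")

# sort=False tells groupby not to sort the group keys again.
grouped = df_sorted.groupby("team", sort=False)["score"].agg(list)
print(grouped)
```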
Benchmark different methods. For small datasets a loop might even be quicker:
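One possible way to compare the two is with `timeit` from the standard library. The function names and the tiny dataset are assumptions, and the timings will vary by machine:

```python
import timeit

import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["b", "a", "b", "a", "b"],
    "score": [3, 1, 4, 2, 5],
})


def with_groupby() -> dict:
    """Group with pandas and convert the result to a dict of lists."""
    return df.groupby("team")["score"].agg(list).to_dict()


def with_loop() -> dict:
    """Group with a plain Python loop over the two columns."""
    out = {}
    for key, value in zip(df["team"], df["score"]):
        out.setdefault(key, []).append(value)
    return out


# Time each approach over the same number of repetitions.
groupby_time = timeit.timeit(with_groupby, number=100)
loop_time = timeit.timeit(with_loop, number=100)
print(f"groupby: {groupby_time:.4f}s, loop: {loop_time:.4f}s")
```

Rerun the comparison with data of a realistic size before drawing conclusions, since the balance usually shifts as the number of rows grows.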
Also, don't forget that grouping data is not exclusive to pandas: plain Python can do it too, for example with itertools.groupby or collections.defaultdict.
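As an illustration outside pandas, here is a sketch with `itertools.groupby` from the standard library, using made-up (team, score) tuples:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (team, score) rows.
rows = [("b", 3), ("a", 1), ("b", 4), ("a", 2), ("b", 5)]

# itertools.groupby only merges consecutive equal keys, so sort by key first.
rows_sorted = sorted(rows, key=itemgetter(0))

grouped = {
    key: [score for _, score in group]
    for key, group in groupby(rows_sorted, key=itemgetter(0))
}
print(grouped)
```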
Deploy the power of lambda
For more tailored data conversion, introduce lambda functions to your groupby operations:
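A minimal sketch, assuming you want to transform the values while collecting them (here, upper-casing hypothetical player names):

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "b"],
    "player": ["Ann", "Bob", "Cy"],
})

# The lambda receives each group's Series and returns a transformed list.
grouped = df.groupby("team")["player"].agg(
    lambda names: [name.upper() for name in names]
)
print(grouped)
```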
It opens up possibilities like chaining operations:
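Chaining might look like the following sketch, which builds sorted lists, flattens the index, and derives a new column in one expression. The column names `scores` and `top_score` are invented for the example:

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "score": [1, 2, 3, 4, 5],
})

result = (
    df.groupby("team")["score"]
    # Collect each group's scores, highest first.
    .agg(lambda s: sorted(s, reverse=True))
    # Turn the group key back into a regular column.
    .reset_index(name="scores")
    # Derive a new column from the first element of each list.
    .assign(top_score=lambda d: d["scores"].map(lambda xs: xs[0]))
)
print(result)
```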