How to group data in pandas by a specific column?
Published on Aug. 22, 2023, 12:17 p.m.
To group data in pandas by a specific column, call the DataFrame's groupby() method with the name of the column you want to group on, then select the column to aggregate and apply an aggregation method. Here’s an example:
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
        'Score': [85, 75, 90, 95, 80, 70],
        'Subject': ['Math', 'Math', 'Math', 'English', 'English', 'English']}
df = pd.DataFrame(data)
# Group the data by the 'Name' column and calculate the mean score for each group
grouped = df.groupby('Name')['Score'].mean()
print(grouped)
This will output:
Name
Alice      90.0
Bob        77.5
Charlie    80.0
Name: Score, dtype: float64
In the example above, we grouped the data by the ‘Name’ column and calculated the mean score for each group. The result is a Series indexed by the group keys. Other aggregation methods, such as sum(), min(), max(), and count(), can be called on the grouped object in the same way to compute per-group results.
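As a minimal sketch of swapping in other aggregations, the snippet below reuses the same data and computes a per-name total with sum(), then several statistics at once with agg(). The variable names totals and summary are only illustrative and are not part of the original example.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
        'Score': [85, 75, 90, 95, 80, 70],
        'Subject': ['Math', 'Math', 'Math', 'English', 'English', 'English']}
df = pd.DataFrame(data)

# A single aggregation: total score for each name (returns a Series)
totals = df.groupby('Name')['Score'].sum()
print(totals)

# Several aggregations at once: passing a list to agg() returns a
# DataFrame with one column per aggregation function
summary = df.groupby('Name')['Score'].agg(['sum', 'min', 'max'])
print(summary)
```

Passing a list to agg() is convenient when you want to compare several statistics side by side. For a single statistic, calling the method directly, as in the mean() example, is usually simpler.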