How to Group By data in Python

Group by is one of the key operations in data analysis. The data preparation and exploration stages often require aggregation at multiple levels. This article covers how to group by data in Python. Aggregation can be done at various levels, for example:

  • Demographic variables like Country, City
  • Age, Gender
  • Product type or product category, like Fashion, Electronics
  • Ticket size bands like $0-$10, $100-$200, etc.
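
As a rough sketch of the last item, a numeric column can be cut into bands with pandas before grouping. The DataFrame, its column names, the values, and the middle band "$10-$100" are assumptions made up for illustration; they are not part of John's data described in this article.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
orders = pd.DataFrame({
    "Country": ["India", "India", "USA", "USA", "UK"],
    "Ticket Size": [5, 150, 8, 120, 180],
})

# Bin the numeric ticket size into labelled bands.
# pd.cut includes the right edge of each bin by default: (0, 10], (10, 100], (100, 200].
bins = [0, 10, 100, 200]
labels = ["$0-$10", "$10-$100", "$100-$200"]
orders["Ticket Band"] = pd.cut(orders["Ticket Size"], bins=bins, labels=labels)

# Count orders per band; observed=False keeps bands that have no orders.
band_counts = orders.groupby("Ticket Band", observed=False)["Ticket Size"].count()
print(band_counts)
```

Once the band column exists, it can be used as a grouping column in the same way as Gender or Country.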

The Pandas module provides an easy way to aggregate data.

John has yearly spend data for his customers. He wants to understand the total spend by Gender. He also wants to know the spend distribution by Country.

We will see how John can use the groupby function to achieve these results.

  • Step 1: The first step is to identify the grouping column. Here the requirement is to group at the Gender level first, so the aggregating column is “Gender”.
         df1.groupby(by="Gender")...
  • Step 2: The next step is to find the metric that requires aggregation. Here that metric is “Yearly Spend”.
         df1.groupby(by="Gender")["Yearly Spend"]...
  • Step 3: Finally, identify the type of aggregation. Here we need to sum up the column Yearly Spend, so the type of aggregation is “sum”. Now that all the important parameters are known, the final step is to run the statement.
         df1.groupby(by="Gender")["Yearly Spend"].sum()
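
The three steps can be tried end to end with a small stand-in for John's data. This is a minimal sketch: the column names "Gender", "Country" and "Yearly Spend" come from the article, while the "Customer" column and all the values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for John's data; the values are invented.
df1 = pd.DataFrame({
    "Customer": ["A", "B", "C", "D", "E"],
    "Gender": ["Female", "Male", "Female", "Male", "Female"],
    "Country": ["India", "USA", "USA", "India", "UK"],
    "Yearly Spend": [1200, 800, 1500, 600, 900],
})

# Step 1: group by Gender. Step 2: pick the metric. Step 3: sum it.
gender_spend = df1.groupby(by="Gender")["Yearly Spend"].sum()
print(gender_spend)
```

The result is a Series indexed by Gender, with one total per group. With this sample data, Female totals 3600 and Male totals 1400.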

Aggregation at country level

  • Step 1: The first step is to identify the grouping column. Here the requirement is to group at the Country level, so the aggregating column is “Country”.
         df1.groupby(by="Country")...
  • Step 2: The next step is to find the metric that requires aggregation. Here that metric is “Yearly Spend”.
         df1.groupby(by="Country")["Yearly Spend"]...
  • Step 3: Finally, identify the type of aggregation. Here we need to sum up the column Yearly Spend, so the type of aggregation is “sum”. Now that all the important parameters are known, the final step is to run the statement.
         df1.groupby(by="Country")["Yearly Spend"].sum()
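
The country-level steps can be sketched the same way. Since John is interested in the spend distribution, this example also derives each country's share of total spend. The share calculation is an addition for illustration, not a step from the article, and the sample values are invented.

```python
import pandas as pd

# Hypothetical stand-in for John's data; the values are invented.
df1 = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Male", "Female"],
    "Country": ["India", "USA", "USA", "India", "UK"],
    "Yearly Spend": [1200, 800, 1500, 600, 900],
})

# Total yearly spend per country.
country_spend = df1.groupby(by="Country")["Yearly Spend"].sum()

# Each country's share of overall spend, in percent (an illustrative extra).
country_share = (country_spend / country_spend.sum() * 100).round(1)

# Combine both views into one summary table indexed by Country.
summary = pd.DataFrame({"Total Spend": country_spend, "Share (%)": country_share})
print(summary)
```

Calling summary.reset_index() turns the Country index back into a regular column, which can be convenient when exporting the result.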

Thus, John is able to create the summaries he needs from the Pandas DataFrame in Python.

