Groupby – Data Analysis with Python and Pandas p.3

Hello and welcome to another data analysis with Python and Pandas tutorial. In this tutorial, we’re going to change up the dataset and play with minimum wage data now.

Text-based tutorial: https://pythonprogramming.net/groupby-python3-pandas-data-analysis/

Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join
Discord: https://discord.gg/sentdex
Support the content: https://pythonprogramming.net/support-donate/
Twitter: https://twitter.com/sentdex
Facebook: https://www.facebook.com/pythonprogramming.net/
Twitch: https://www.twitch.tv/sentdex
G+: https://plus.google.com/+sentdex

29 comments

  1. Sebastian Mantey on

    10:49 That’s what my head does, every time I think about what the best way is to get a data frame into the desired format.

  2. markobe08 on

    Another day, another like! I showed this to mortals in my office. They do not understand powa (but few of us do). We’ll be an army for a better world. You teach, we reach.

  4. Milan Lora on

    Quick question. In the for loop we have an if statement that checks whether act_min_wage is empty. If it is, we just use the df. If it isn’t, we join act_min_wage with df. Wouldn’t this repeat rows that were already in act_min_wage?

    PS. Keep up the great work!
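
A minimal sketch of the loop this question describes may help. It assumes the loop iterates over df.groupby("State") and joins one renamed column per state on a shared Year index. The names act_min_wage and df come from the comment; the column name "Low.2018" is borrowed from another comment on this page, and the wage values are invented for illustration. Under those assumptions, join aligns on the index and adds a column, so the row count stays the same.

```python
import pandas as pd

# Invented sample data: three states, two years each (values are illustrative only).
df = pd.DataFrame({
    "Year": [2017, 2018, 2017, 2018, 2017, 2018],
    "State": ["Alabama", "Alabama", "Alaska", "Alaska", "Arizona", "Arizona"],
    "Low.2018": [1.0, 1.5, 2.0, 2.5, 3.0, 3.5],
})

act_min_wage = pd.DataFrame()
for name, group in df.groupby("State"):
    # One single-column frame per state, indexed by Year and named after the state.
    state_col = group.set_index("Year")[["Low.2018"]].rename(columns={"Low.2018": name})
    if act_min_wage.empty:
        act_min_wage = state_col
    else:
        # join aligns on the Year index: it adds a column, it does not append rows.
        act_min_wage = act_min_wage.join(state_col)
    print(name, act_min_wage.shape)
```

In this sketch the number of rows stays at the number of distinct years while the number of columns grows by one per state. Rows would only multiply if the index being joined on contained duplicate values.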

  5. Marcos Daniel Torres on

    Hi everyone, would you recommend the Deep Learning book by Ian Goodfellow for a complete beginner? I feel lost when thinking about where I should start with these complex topics. There are so many tutorials that I feel overwhelmed.

  6. Ahmed Hany on

    A bit unrelated to the video, but I have a small question. In a regression problem with neural networks, such as predicting house prices, how can the same model take continuous values (like the area of the house), binary values (like whether the house has a garden, represented as 0 or 1), and categorical values? For example, a heart disease dataset might have 3 types of diabetes, which we could encode as 1, 2, or 3. How can I put all of those binary, categorical, and continuous values into one neural network model? By the way, this is the best pandas tutorial on the internet. I love it!

  7. John Scott on

    Best video cheat sheet to Python and now Pandas as well. Bare bones cutting straight to the “It kinda works like this” by example. Coming in so handy right now. Thanks.

  8. mokus603 on

    I’ve been trying to introduce my colleagues to data analysis in pandas. These videos with clear instructions and easy to understand overview can make people understand the process and properties of good data analysis. Thank you.

  9. SeamusHarper1234 on

    When you write to CSV, include index=False. That way you don’t have to deal with the “Unnamed” column.
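
A short sketch of what this tip changes, using an in-memory buffer instead of a file so it runs anywhere. The sample frame and its values are invented for illustration; only to_csv(index=False) and the “Unnamed” column come from the comment.

```python
import io
import pandas as pd

# Invented sample data (values are illustrative only).
df = pd.DataFrame({
    "Year": [2017, 2018],
    "State": ["Alabama", "Alaska"],
    "Low.2018": [1.0, 2.5],
})

# Default: the index is written as an extra, unnamed first column.
buf_with_index = io.StringIO()
df.to_csv(buf_with_index)
buf_with_index.seek(0)
reloaded_with_index = pd.read_csv(buf_with_index)

# index=False: only the real columns are written.
buf_without_index = io.StringIO()
df.to_csv(buf_without_index, index=False)
buf_without_index.seek(0)
reloaded_clean = pd.read_csv(buf_without_index)

print(list(reloaded_with_index.columns))  # includes 'Unnamed: 0'
print(list(reloaded_clean.columns))       # only Year, State, Low.2018
```

If the index does carry meaning (for example, Year was set as the index), an alternative is to keep it in the file and pass index_col=0 to read_csv when loading.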

  10. Ori Rosenthal on

    Instead of the for loop, you can do: pd.pivot_table(df, values="Low.2018", index="Year", columns="State")
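
A small runnable sketch of this suggestion, assuming df has one row per (Year, State) pair with the columns named in the comment. The sample values are invented. Note that pivot_table aggregates with the mean by default, so if a (Year, State) pair appeared more than once the cell would hold an average rather than a single reading.

```python
import pandas as pd

# Invented sample data: one row per (Year, State) pair (values are illustrative only).
df = pd.DataFrame({
    "Year": [2017, 2017, 2018, 2018],
    "State": ["Alabama", "Alaska", "Alabama", "Alaska"],
    "Low.2018": [1.0, 2.0, 1.5, 2.5],
})

# Years become the index, states become the columns, wages fill the cells.
pivoted = pd.pivot_table(df, values="Low.2018", index="Year", columns="State")
print(pivoted)
```

When every (Year, State) pair is unique, df.pivot(index="Year", columns="State", values="Low.2018") gives the same layout without aggregating, and raises an error on duplicates instead of silently averaging them.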

  11. Boris Dessimond on

    I was like, “Whatever, I already know how to use pandas. I’ll watch it at 2x speed, you never know.”
    But at 9:00 it blew my mind, ahaha. I never thought it could be so efficient. Damn! Thanks.
    Will watch the whole series, so much to learn from you!

  12. Art Curious on

    Python error returns are not pythonic whatsoever. It’s a strange departure for a language that is supposed to be more intuitive, but the error handling is atrociously non-intuitive.

    Real python error handling would simply have an arrow pointing to the offending line of code with a very short explanation and maybe a key code error number for reference. Instead Python slams you with a thousand lines of pink byte rage that is the equivalent of cold water being dumped on you.

  13. Kin Cheng on

    I’m trying to do the trick from the last video to pivot the state names into the columns, but I can’t figure out why my pivoted_df only returns Alabama in the DataFrame. Anyone have ideas?

    import pandas as pd

    pivoted_df = pd.DataFrame()
    for state in df['State'].unique():
        # Rows for this state only, indexed and sorted by year
        state_df = df[df['State'] == state].copy()
        state_df.set_index('Year', inplace=True)
        state_df.sort_index(inplace=True)
        state_df[f'{state}_CPI.Average'] = state_df['CPI.Average']

        if pivoted_df.empty:
            pivoted_df = state_df[[f'{state}_CPI.Average']]
        else:
            # Assign back to the same name that is tested above; a differently
            # spelled target leaves the accumulator stuck at the first state
            pivoted_df = pivoted_df.join(state_df[f'{state}_CPI.Average'])
