I'm trying to automate some data manipulation and analysis using Python / Pandas.
Context:
- The graph below shows the current 'as is' state and 'to be' theory relating to the data I receive
- Each row corresponds to an individual payment source. So, a business which has three separate payment sources will have three separate rows.
- Each 'category' can have multiple business names
- The actual sales figures [based on given monthly totals] are currently split out into individual columns, one per month (see below). This isn't ideal from a data analysis perspective.
Goal: I'd like to get to a point where I can see the 'pct_change()' on a monthly basis across (i) payment sources, (ii) business names and (iii) categories. [Presumably, 'groupby' will come in handy here]
Questions:
- How can I consolidate these separate columns into a 'pivot-ready' format?
- Are there any general data analysis resources or links anyone can recommend to help me achieve my goal?
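
To make the question concrete, here is a minimal sketch of the kind of reshaping and grouping described above. It is only an illustration under stated assumptions: the column names (`category`, `business_name`, `payment_source`), the month column labels, and all figures are hypothetical stand-ins, not the real data. It uses `DataFrame.melt` to go from one column per month to one row per payment source and month, then `groupby` with `pct_change()` at each of the three levels.

```python
import pandas as pd

# Hypothetical wide-format data: column names and figures are illustrative only.
wide = pd.DataFrame({
    "category": ["Retail", "Retail", "Food"],
    "business_name": ["Acme", "Acme", "Bistro"],
    "payment_source": ["card", "cash", "card"],
    "2023-01": [100.0, 50.0, 200.0],
    "2023-02": [110.0, 55.0, 180.0],
    "2023-03": [121.0, 44.0, 198.0],
})

id_cols = ["category", "business_name", "payment_source"]

# Wide -> long: one row per (payment source, month).
long_df = wide.melt(id_vars=id_cols, var_name="month", value_name="sales")
long_df["month"] = pd.to_datetime(long_df["month"], format="%Y-%m")


def monthly_pct_change(df, keys):
    """Sum sales per key and month, then compute the month-over-month
    change within each key (the first month of each key is NaN)."""
    totals = df.groupby(keys + ["month"], as_index=False)["sales"].sum()
    totals = totals.sort_values(keys + ["month"]).reset_index(drop=True)
    totals["pct_change"] = totals.groupby(keys)["sales"].pct_change()
    return totals


by_source = monthly_pct_change(long_df, id_cols)
by_business = monthly_pct_change(long_df, ["category", "business_name"])
by_category = monthly_pct_change(long_df, ["category"])

print(by_category)
```

Summing before calling `pct_change()` matters at the business and category levels, since otherwise the change would be computed between unrelated payment-source rows rather than between monthly totals.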