- I have 930 observations of 1148 variables. It is a rather messy Excel dataset that has passed through many hands over almost 20 years. I need to tidy it up in R for my analysis.
- Of those 1148 variables, around half of them should be in rows.
- Example: height_2003 and height_2011 should not be separate variables. They should be a single variable (height) that can be grouped by another (year).
- Of the 1148 variables, some were measured in both 2003 and 2011, some only in 2003 and some only in 2011. A lot of NA values are therefore expected, which is fine, since the analysis will not use all variables at the same time.
- So this is a simplified example of what I want to achieve:
Turn this: [image: the dataframe now]
Into this: [image: the desired dataframe]
Is this transformation possible with R or do I need to spend weeks in Excel?
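
For reference, here is a minimal sketch of the kind of reshape described above, on made-up data. The column names (`id`, `sex`, `weight_2003`, `bmi_2011`) and all values are assumptions for illustration, not the real dataset. It also assumes every year-specific column ends in an underscore followed by a four-digit year. It uses `pivot_longer()` from the tidyr package; other approaches (for example base R `reshape()`) would probably work too.

```r
library(tidyr)

# Made-up wide data: one row per subject, one column per variable-and-year.
wide <- data.frame(
  id          = 1:3,
  sex         = c("f", "m", "f"),      # not year-specific, left untouched
  height_2003 = c(150, 160, 155),
  height_2011 = c(165, 172, 168),
  weight_2003 = c(45, 52, 48),         # measured only in 2003
  bmi_2011    = c(21.3, 23.0, 22.1)    # measured only in 2011
)

long <- pivot_longer(
  wide,
  # Pivot only the columns whose names end in "_<four digits>".
  cols = matches("_\\d{4}$"),
  # ".value" keeps the part before the year as the new column name;
  # the captured year goes into a "year" column.
  names_to = c(".value", "year"),
  # Split on the LAST underscore, so names such as "arm_length_2003" also work.
  names_pattern = "^(.*)_(\\d{4})$",
  names_transform = list(year = as.integer)
)

print(long)
```

The result has one row per subject and year. Variables that were measured in only one of the years (here `weight` and `bmi`) are filled with NA for the other year, which matches the expectation stated above.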