
My question is similar to this one, but framed in the context of survey data.

I'd like to format data from a survey in a tidy manner, where some questions have yes/no answers and others have numerical answers, such as "time taken to complete x task".

The raw data are in wide format: each column represents a question on the survey and each row represents one set of answers. It looks like this: [image: raw survey data in wide format]

Considering the formal definition of tidy data (from Hadley Wickham's paper), I am unsure whether the tidy form would require melting the data set so that "question" becomes a column, with each row of that column containing a specific question label (e.g. question1, question2, etc.).

The data would then look like this: [image: melted survey data in long format]
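To make the melted layout concrete, here is a minimal sketch in plain Python. The `respondent_id` column, the question names, and all the values are made up for illustration; they are not my actual survey, and `melt` is just a hypothetical helper, not a library function.

```python
# Hypothetical wide-format survey data: one row per respondent,
# one column per question (names and values are invented).
wide_rows = [
    {"respondent_id": 1, "question1": "yes", "question2": "no", "question3": 12.5},
    {"respondent_id": 2, "question1": "no", "question2": "yes", "question3": 8.0},
]


def melt(rows, id_key):
    """Turn every (respondent, question) cell into its own row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_key:
                continue  # the identifier stays as a column, it is not a question
            long_rows.append(
                {id_key: row[id_key], "question": column, "response": value}
            )
    return long_rows


long_rows = melt(wide_rows, "respondent_id")
for row in long_rows:
    print(row)
```

Each respondent now spans three rows, and the single `response` column holds both strings and floats.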

Doing so, however, would create mixed data types in the resulting "question response" column, and separating the types out again would require grouping and aggregation. To me this seems less tidy, but it doesn't seem to violate the principles of tidy data, and it is actually the advice given by Tableau themselves (article found here).
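This is roughly the grouping and aggregation I mean, sketched in plain Python under some assumptions: the rows are invented, and summarising yes/no questions as the share of "yes" and numeric questions as the mean is only one possible choice.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical melted survey data (invented values, one row per answer).
long_rows = [
    {"respondent_id": 1, "question": "question1", "response": "yes"},
    {"respondent_id": 1, "question": "question2", "response": "no"},
    {"respondent_id": 1, "question": "question3", "response": 12.5},
    {"respondent_id": 2, "question": "question1", "response": "no"},
    {"respondent_id": 2, "question": "question2", "response": "yes"},
    {"respondent_id": 2, "question": "question3", "response": 8.0},
]

# Group the mixed-type "response" column back out by question.
by_question = defaultdict(list)
for row in long_rows:
    by_question[row["question"]].append(row["response"])

# Aggregate each group according to the type of its answers.
summary = {}
for question, responses in by_question.items():
    if all(isinstance(value, str) for value in responses):
        # yes/no question: fraction of respondents answering "yes"
        summary[question] = sum(value == "yes" for value in responses) / len(responses)
    else:
        # numeric question: average value
        summary[question] = mean(responses)

print(summary)
```

The type check inside the loop is exactly the extra step that the wide format never needed, since there each column already had a single type.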

Working with surveys has left me a bit confused about what tidy data is. The definition of a "variable" seems to be contextual.

Considering all of that, what would be the best way to organize data like this so that it both adheres to the principles of tidy data and stays easy to compute on and analyze?
