I tried to make a boxplot from this dataset (https://www.kaggle.com/freecodecamp/2016-new-coder-survey-/scripts), but I get this error message:

"Error in boxplot.default(split(mf[[response]], mf[-response]), ...) : invalid first argument"

I've removed the NAs, but it still doesn't work. Do you know what the problem might be? For the record, I also tried converting Age to numeric and EmploymentField to a factor, and it still doesn't work.

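library(RColorBrewer)  # provides brewer.pal()

boxplot(Age ~ EmploymentField, data = newCoders,
    col = brewer.pal(5, "Set1"),     # box fill colours
    whisklty = 1,                    # solid whisker lines
    staplelty = 0,                   # no staples at the whisker ends
    main = "Age of New Coders vs Employment Field",
    xlab = "Employment Field",
    outcol = brewer.pal(5, "Set1"),  # outlier colours
    outpch = 16,                     # outlier symbol
    ylab = "Age")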
  • Could you provide a brief [example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) data set? Using `dput` could help. – bouncyball Jun 08 '16 at 18:03
  • Can you edit the post to include the output of `dput(newCoders[c("Age", "EmploymentField")])`, please? If that is too large, use `str(newCoders[c("Age", "EmploymentField")])` instead. – user20650 Jun 08 '16 at 18:05
  • I would have expected Age to already be numeric. What does `str(newCoders)` return now? – IRTFM Jun 08 '16 at 18:11
  • Please include data and code in your question. Avoid linking to off-site resources. – Roman Luštrik Jun 10 '16 at 08:55

1 Answer

Possible causes of this error are that the response variable is entirely missing, is non-numeric, or is being referenced incorrectly in the call. Try running a summary of the data (for example with `summary()` or `str()`) to check that the columns you are plotting are referenced correctly, are not all missing, and are in fact numeric.
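
The checks described above could look something like the following sketch. Since the question does not include the data, the small `newCoders` data frame here is a made-up stand-in with the same two column names, and `plotData` is a hypothetical helper name. It assumes Age may have been read in as text or as a factor, which is why the conversion goes through `as.character()` first.

```r
# Made-up stand-in for the survey data (an assumption; the real data is not shown).
newCoders <- data.frame(
  Age = c("25", "31", NA, "42", "28", "35"),
  EmploymentField = c("education", "finance", "education", NA, "finance", "education"),
  stringsAsFactors = FALSE
)

# Inspect column types and missing values before plotting.
str(newCoders[c("Age", "EmploymentField")])
summary(newCoders[c("Age", "EmploymentField")])

# Go through character first so a factor-coded Age is not turned into level codes.
newCoders$Age <- as.numeric(as.character(newCoders$Age))
newCoders$EmploymentField <- factor(newCoders$EmploymentField)

# Keep rows that are complete in the two plotted columns only,
# rather than dropping every row that has an NA anywhere.
plotData <- newCoders[complete.cases(newCoders[c("Age", "EmploymentField")]), ]

# Fail early if nothing is left to plot or Age is still not numeric.
stopifnot(nrow(plotData) > 0, is.numeric(plotData$Age))

boxplot(Age ~ EmploymentField, data = plotData,
        main = "Age of New Coders vs Employment Field",
        xlab = "Employment Field", ylab = "Age")
```

Restricting `complete.cases()` to the plotted columns is a deliberate choice in this sketch: removing every row with an NA in any column of a wide survey table can leave very few rows, or none, to plot.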

Aron