I have spent the last hour troubleshooting this and looking for answers that I can tailor to my question, but I keep running into errors. I would truly appreciate either an answer to my question or another answer that you can find that parallels my problem.

I have a large data frame named "swh" with more than 300,000 responses for 15 variables. I would like to create a new data frame with the following:

I want to include column 1 "c13vrgenhth" if it is greater than column two "r13vrgenhth".

I want to include column 2 "r13vrgenhth" if column 1 "c13vrgenhth" is included.

I want to include column 5 "c13smoke" if it is less than 3 AND if column 6 "r13smoke" is equal to 3.

I want to include column 6 "r13smoke" if column 5 "c13smoke" is included.

I tried to write it out here:

swhcount<-subset(swh[1] if(swh{[1]>swh[2])}) 

& {swh[2] if swh[2]<swh[1]}  & {swh[5] if swh[5]<3 & swh[6]==3} & {swh[6] if     swh[5]<3 & swh[6]==3})

I realize that cannot be correct because there are no "else's," but that may help to communicate what I want to do.

  • I don't understand what you are trying to do by conditionally "including" columns. Some rows will have `c13vrgenhth > r13vrgenhth` and some will not, but the final dataset returned by `subset()` will have the same number of columns for each row (that's how data.frames work). It would be helpful to create a small, [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data and the desired output. Start with a small, manageable number of rules. – MrFlick May 06 '15 at 22:30
  • Thank you for letting me know. I will edit my question. – Colby Lea May 06 '15 at 22:40
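
To make the point in the comments concrete, here is a minimal sketch of two ways the rules in the question might be read. It is not a confirmed answer, since the desired output was never posted. The toy values below are made up and stand in for the real `swh` data frame; only the four column names come from the question. The sketch also assumes there are no missing values in these columns.

```r
# Hypothetical toy data standing in for the real `swh` data frame
swh <- data.frame(
  c13vrgenhth = c(3, 1, 4, 2, 5),
  r13vrgenhth = c(2, 2, 1, 2, 3),
  c13smoke    = c(1, 2, 3, 1, 2),
  r13smoke    = c(3, 3, 3, 1, 3)
)

# One logical value per row for each rule (assumes no NA in these columns)
health_rule <- swh$c13vrgenhth > swh$r13vrgenhth
smoke_rule  <- swh$c13smoke < 3 & swh$r13smoke == 3

# Reading 1: keep only the rows where both rules hold,
# and only the four columns of interest
swhcount <- subset(
  swh,
  c13vrgenhth > r13vrgenhth & c13smoke < 3 & r13smoke == 3,
  select = c(c13vrgenhth, r13vrgenhth, c13smoke, r13smoke)
)

# Reading 2: keep every row, but set a pair of columns to NA
# in the rows where that pair's rule fails
swhmasked <- swh
swhmasked[!health_rule, c("c13vrgenhth", "r13vrgenhth")] <- NA
swhmasked[!smoke_rule,  c("c13smoke", "r13smoke")] <- NA
```

Under the first reading the result has fewer rows than `swh`; under the second it has the same number of rows, with `NA` marking the values that were "not included". Either way every row ends up with the same columns, which is the constraint MrFlick describes.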
