I'm creating a new data.frame by doing the opposite of "flattening" an input data.frame (in other words going from "wide" to "narrow", creating more rows).

I'll be looping over columns of the input data.frame, and appending to the output data.frame. I know it's more efficient to create the full output data.frame outright and fill it within the loop, but my question is why is it possible to create a 0 x 4 data.frame, but apparently not possible to name those 4 columns... Thanks.

 dff <- data.frame()
 dim( dff ) <- c(0,4)
 colnames(dff) <- c("first","second","third","fourth")

 Error in `colnames<-`(`*tmp*`, value = c("first", "second", "third", "fourth" : 
   'names' attribute [4] must be the same length as the vector [0]
user2105469

1 Answer

Here are four possibilities (I'm sure there are also others):

> data.frame(first=numeric(), second=numeric(), third=numeric(), fourth=numeric())
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

> data.frame(first=1,second=1,third=1,fourth=1)[0,]
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

> as.data.frame(matrix(nrow=0,ncol=4,dimnames=list(c(),c("first","second","third","fourth"))))
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

> setNames(as.data.frame(matrix(nrow=0,ncol=4)), c("first","second","third","fourth"))
[1] first  second third  fourth
<0 rows> (or 0-length row.names)

Note that for the first solution, you can specify whatever column classes you want (e.g., replacing numeric() with character(), etc.).
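
As a minimal sketch of that point, the call below mixes column classes in one zero-row data.frame. The object name `empty` and the particular classes are only illustrative, and `stringsAsFactors = FALSE` is passed so the character column stays character on older versions of R.

```r
# Zero-row data.frame with a different class for each column
empty <- data.frame(first  = numeric(),
                    second = character(),
                    third  = logical(),
                    fourth = integer(),
                    stringsAsFactors = FALSE)

dim(empty)            # number of rows, then number of columns
sapply(empty, class)  # class of each (empty) column
```

Rows appended later with rbind() or by index will then be coerced towards these column classes rather than having them guessed from the first row.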

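Also, assigning to dim does not give a data.frame any columns, because a data.frame does not store its shape in a dim attribute. Rather, it is a list of columns with a row.names attribute, and dim() is computed from those. That is why dff in the question still has zero columns when colnames<- runs, which is exactly what the error message reports. The str function can be helpful for understanding what these objects are.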
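
A short sketch of how to check this yourself; the two-column `dff` here is just an example object, not the one from the question.

```r
dff <- data.frame(first = numeric(), second = numeric())

is.list(dff)            # a data.frame is a list of columns
names(attributes(dff))  # names, class and row.names, with no "dim" entry
dim(dff)                # computed on demand from the columns and row names
str(dff)                # compact view of the underlying structure
```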

Thomas
  • It's also important to note that creating an empty data.frame and adding rows after the fact is just a bad idea in general. Looping and appending is a bad strategy when it sounds like the problem is really about reshaping. Better to use the `reshape()` function or the `reshape2` package. – MrFlick Aug 10 '14 at 21:15
  • @MrFlick Yes, I leave this post without comment about the logic of doing any of these things. – Thomas Aug 10 '14 at 22:04
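
For reference, a minimal sketch of the reshaping approach suggested in the comment, using base `reshape()`. The input `wide` and the output column names `variable` and `value` are made up for illustration, since the question does not show its actual data.

```r
# Hypothetical wide input: one row per id, one column per measurement
wide <- data.frame(id = 1:2, first = c(10, 20), second = c(30, 40))

# Wide to long: each (id, measurement) pair becomes its own row
long <- reshape(wide,
                direction = "long",
                varying   = c("first", "second"),  # columns to stack
                v.names   = "value",               # name of the stacked column
                timevar   = "variable",            # column recording the source column
                times     = c("first", "second"),
                idvar     = "id")

long
```

This builds the full long data.frame in one call, so no empty data.frame or row-by-row appending is needed.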