I have been using data frames in R for quite some time, and I feel that I have a pretty good handle on what they can and can't do. Recently I have become interested in data tables because of their much more efficient lookups, but I have run into a bit of an issue right out of the gate.
Typically with a data frame I will assign rownames and use those for indexing later. The nice thing about doing this is that the rownames need not be a column in the data. So suppose I read in a CSV file of the form:
Name, val1, val2, …, valN
where Name is a (unique) string and the vals are numbers. Then I will set rownames(x) = x[,1] and remove the first column. That leaves me with an entirely numeric data frame that I can add, subtract, etc., without having to be concerned about doing math operations on string fields. I can then do something like apply(x, 1, mean) with no problems.
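To make the data frame workflow concrete, here is a minimal sketch in base R. The three-row data frame is a made-up stand-in for the CSV (the real file and its column count are not shown here), and `row_means` is just an illustrative variable name.

```r
# Toy stand-in for the CSV described above (values are invented).
x <- data.frame(
  Name = c("a", "b", "c"),
  val1 = c(1, 2, 3),
  val2 = c(4, 5, 6),
  stringsAsFactors = FALSE
)

rownames(x) <- x[, 1]   # use the unique Name strings as row names
x <- x[, -1]            # drop the Name column; only numeric columns remain

# Row-wise mean over every remaining column, named by row name.
row_means <- apply(x, 1, mean)

# Rows can still be looked up by name even though Name is no longer a column.
row_b <- x["b", ]
```

After this, x contains only val1 and val2, so arithmetic on the whole data frame never touches a string column.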
However, it seems that in the data table world I would do something like this:
DT = as.data.table(x); setkey(DT, Name)
But now the character column sticks around. So suppose I want to take an average of each row. Do I now have to constantly tell it to only act on columns 2:ncol?
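For illustration, this is roughly the kind of thing I mean. It is only a sketch with made-up data, and `num_cols` is a name I invented; the point is that the key column has to be excluded explicitly (here via `.SDcols`) every time I want to operate on the numbers.

```r
library(data.table)

# Same toy data as a keyed data.table (values are invented).
x <- data.frame(
  Name = c("a", "b", "c"),
  val1 = c(1, 2, 3),
  val2 = c(4, 5, 6),
  stringsAsFactors = FALSE
)
DT <- as.data.table(x)
setkey(DT, Name)

# Name is still a column, so it has to be excluded by hand.
num_cols <- setdiff(names(DT), "Name")

# Row-wise mean restricted to the numeric columns.
row_means <- DT[, rowMeans(.SD), .SDcols = num_cols]

# Keyed lookup by Name works, which is the reason for switching.
row_b <- DT["b"]
```

This does what I want, but having to carry something like num_cols around for every calculation is exactly what I am hoping to avoid.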
I assume there is a way around this, but my googling has come up empty.