I want to fit a regression model of a response vector (IC50) against a number of different molecular descriptors (A, B, C, D, etc.).

I want to use,

model <- lm (IC50 ~ A + B + C + D)

The molecular descriptors are the columns of a data.frame. I would like to write a function that takes the IC50 vector and the appropriately subsetted data.frame as inputs.

My problem is that I can't convert the columns into a formula for the model.

Can anyone help?

Sample data and feeble attempt,

IC50  <- c(0.1,0.2,0.55,0.63,0.005)

descs  <- data.frame(A=c(0.002,0.2,0.654,0.851,0.654),
                     B=c(56,25,89,55,60),
                     C=c(0.005,0.006,0.004,0.009,0.007),
                     D=c(189,202,199,175,220))

model  <- function(x=IC50,y=descs) {
  a  <- lm(x ~ y)
  return(a)
}

I went down the substitute/deparse route but this didn't import the data.

DarrenRhodes

1 Answer

You can simply do:

model  <- function(x = IC50, y = descs) 
  lm(x ~ ., data = y)
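
For anyone who wants to run this end to end, here is a minimal sketch using the sample data from the question. The names `fit_dot` and `fit_named` are made up for illustration. `fit_named` is an alternative that is not part of the answer above: it assumes you would rather build the formula from the column names with `reformulate()`, so that the fitted model reports `IC50 ~ A + B + C + D` instead of `x ~ .`.

```r
# Sample data from the question
IC50  <- c(0.1, 0.2, 0.55, 0.63, 0.005)
descs <- data.frame(A = c(0.002, 0.2, 0.654, 0.851, 0.654),
                    B = c(56, 25, 89, 55, 60),
                    C = c(0.005, 0.006, 0.004, 0.009, 0.007),
                    D = c(189, 202, 199, 175, 220))

# Approach from the answer: "." stands for every column of `y`;
# `x` is not in `y`, so lm() looks it up in the function's environment.
fit_dot <- function(x, y) {
  lm(x ~ ., data = y)
}

# Hypothetical alternative: build the formula from the column names.
# The response is bound to the descriptors under the name given in `response`.
fit_named <- function(x, y, response = "IC50") {
  dat <- cbind(setNames(data.frame(x), response), y)
  lm(reformulate(names(y), response = response), data = dat)
}

m1 <- fit_dot(IC50, descs)
m2 <- fit_named(IC50, descs)

formula(m2)  # the explicit formula built from the column names
coef(m1)     # one coefficient per descriptor plus the intercept
```

Note that the sample data has five observations and the model has five parameters, so the fit is saturated. It is fine for checking that the formula is built correctly, but not for judging the model.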
Julius Vainora
  • That works, thanks. Any background literature on the use of the dot? – DarrenRhodes Nov 30 '15 at 16:00
  • @user1945827, you may start with the answers in http://stackoverflow.com/questions/7526467/what-does-the-dot-mean-in-r-personal-preference-naming-convention-or-more, http://stackoverflow.com/questions/13446256/meaning-of-tilde-dot-argument, http://stats.stackexchange.com/questions/10712/what-is-the-meaning-of-the-dot-in-r, http://stackoverflow.com/questions/9652943/usage-of-dot-period-in-r-functions – Julius Vainora Nov 30 '15 at 16:02
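
As a quick illustration of what the dot does in a model formula, the sketch below uses `terms()` to show how `.` is expanded against the `data` argument. The data frame is the one from the question; the variable names `all_terms` and `without_D` are only for this example.

```r
descs <- data.frame(A = c(0.002, 0.2, 0.654, 0.851, 0.654),
                    B = c(56, 25, 89, 55, 60),
                    C = c(0.005, 0.006, 0.004, 0.009, 0.007),
                    D = c(189, 202, 199, 175, 220))

# "." expands to all columns of `data` that are not on the left-hand side
all_terms <- attr(terms(IC50 ~ ., data = descs), "term.labels")
all_terms

# Terms can still be removed from the expansion with "-"
without_D <- attr(terms(IC50 ~ . - D, data = descs), "term.labels")
without_D
```

This dot is specific to formulas and is unrelated to dots in object or function names, which is what some of the linked questions discuss.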