3

Say we have two variables Y and X and we want to make a simple regression model with Transform > Run R script in Power Query. Here we have sample data:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSAeJYnWglYyDLCMwyBbKMwSwLIMtEKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Y = _t, X = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Y", Int64.Type}, {"X", Int64.Type}})
in
    #"Changed Type"

Then I add the Run R script step to Power Query with the following code:

output <- lm(Y ~ X, data=dataset)

enter image description here

But I get an empty table:

enter image description here

How to modify R script to get the model summary?

vestland
  • 43,687
  • 29
  • 146
  • 249
Przemyslaw Remin
  • 5,514
  • 16
  • 92
  • 164

1 Answers1

4

To my knowledge, in order to retrieve any output from your R script, it has to be in the form of a data.frame. However, if you try to run something like df<-data.frame(output), you'll get the error

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = > > stringsAsFactors) : cannot coerce class ‘"summary.lm"’ to a data.frame

But you can retrieve parts of that very same summary and organize it in a dataframe. Building on your snippet, here's how you can do it for the model coefficients. I've taken the libery to rename your output to model.

Code:

model <- lm(Y ~ X, dataset)
df<- data.frame(coef(model))
names(df)[names(df)=="coef.model."] <- "coefficients"
df['variables'] <- row.names(df)

Output 1:

enter image description here

Next, just click Table to get:

Output 2:

enter image description here

Since I don't have your data, I've just used Enter Data and this simple data sample:

enter image description here

Edit: Other model estimates

If you'd like to retrieve estimated series from the model such as residuals or fitted values, just add the following lines to the snippet above:

df_estimates <- data.frame(fitted(model), residuals(model))
colnames(df_estimates) <- c('fitted', 'residuals')

The preliminary output will now be:

enter image description here

And this is a pretty hande feature in Power BI. Just click on table next to the dataframe you'd like to keep working with. In the latter case, you'll get this:

enter image description here

Edit 2 - Include original dataset in your output:

#Y <- c(1,2,3,4,4)
#X <- c(1,2,3,4,5)

#dataset <- data.frame(X, Y)

model <- lm(Y ~ X, dataset)
df<- data.frame(coef(model))
names(df)[names(df)=="coef.model."] <- "coefficients"
df['variables'] <- row.names(df)
df_estimates <- data.frame(dataset$X, dataset$Y, fitted(model), residuals(model))
colnames(df_estimates) <- c('X', 'Y', 'fitted', 'residuals')
df_estimates
vestland
  • 43,687
  • 29
  • 146
  • 249