4

I have a gene expression dataset that I want to investigate. In particular, I would like to know whether each gene's expression is associated with quantitative or qualitative variables (for example, whether the expression of gene 'XPTO' is associated with body mass index and race).

One possible way to test this would be logistic regression, but is this a good approach, or are there caveats I should know about when using it?

My question is the following: which methods would you advise to measure such correlations, and why?

(This post was cross-posted on Biostars.)

Sos

1 Answer

3

Logistic regression would generally be a bad choice for non-binary outcomes. In such cases, linear regression (or a GLM more generally) still works fine. You can already do that in the standard R packages for RNA-seq (DESeq2, edgeR, and limma) by including the trait in the design, where the reported (log) fold-change is then per unit of whatever you're measuring your quantitative trait in.
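
To make the "per unit of the trait" interpretation concrete, here is a minimal sketch in Python using plain ordinary least squares on log-scale expression. It is not what DESeq2, edgeR, or limma do internally (those use count-based GLMs or moderated linear models); it only illustrates a design with one quantitative covariate and one categorical covariate. All data are simulated, and the names `bmi`, `group`, and `log_expr`, as well as the effect sizes, are assumptions made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # number of samples (synthetic)

# Synthetic covariates, invented purely for illustration
bmi = rng.normal(25.0, 4.0, size=n)      # quantitative trait
group = rng.integers(0, 3, size=n)       # categorical trait with 3 levels

# Simulated log2 expression of one gene:
# 0.10 log2 units per BMI unit, plus a shift for each group level
true_slope = 0.10
group_shift = np.array([0.0, 0.5, -0.3])
log_expr = (5.0 + true_slope * bmi + group_shift[group]
            + rng.normal(0.0, 0.2, size=n))

# Design matrix: intercept, BMI, and dummy columns for groups 1 and 2
# (group 0 is the reference level)
X = np.column_stack([
    np.ones(n),
    bmi,
    (group == 1).astype(float),
    (group == 2).astype(float),
])

# Ordinary least squares fit
coef, *_ = np.linalg.lstsq(X, log_expr, rcond=None)
intercept, bmi_slope, g1_effect, g2_effect = coef

print(f"log2 expression change per BMI unit: {bmi_slope:.3f}")
print(f"group 1 vs group 0: {g1_effect:.3f}")
print(f"group 2 vs group 0: {g2_effect:.3f}")
```

The BMI coefficient is the change in log2 expression per one-unit increase in BMI, holding group fixed, and each dummy coefficient is a log2 fold-change relative to the reference group. For real RNA-seq counts, the dedicated packages above are likely the better choice, since they model count dispersion and handle testing across many genes.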

Devon Ryan