
I'm running Pathifier against the C2 curated pathway database for a specific microarray dataset.

While reading the Pathifier documentation to configure it properly for my dataset, I saw that for the min_std argument they suggest using the technical noise.

min_std: The minimal allowed standard deviation of each gene. Genes with lower standard deviation are divided by min_std instead of their actual standard deviation. (Recommended: set min_std to be the technical noise).

But how do I calculate it? For a first run I just used the value 0.2254005 from the example provided with the package, but this might not be the right value for my dataset.

Update:

So from here, it says: "The minimal standard error (min_std) was set as the first quartile of the standard deviation in the data." How am I supposed to calculate this first quartile of the standard deviation? Any idea? Should I calculate sd() for each gene and then find the first quartile of those values?

What do you think about this function?

gringer
J. Doe

1 Answer


Answer from @llrs, converted from comment:

This is probably a question for the maintainer, but I would guess that it is the variance of the gene in the control dataset. So, if a gene has a standard deviation of 0.5 in normal samples, at least that much is expected in the tumoral cells too. Calculate the sd only on the control samples: if the healthy/control samples show that much variation, the altered ones should show the same or more, according to the article (if I understand it correctly).

std <- apply(data[, controls], 1, sd, na.rm = TRUE); quantile(std, 0.25) should be much faster and more efficient.
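To make the one-liner above concrete, here is a minimal sketch on simulated data. It assumes an expression matrix with genes in rows and samples in columns; the names expr, controls, gene_sd and the simulated values are made up for illustration and do not come from the Pathifier package or its example data.

```r
# Simulated expression matrix: 200 genes (rows) x 10 samples (columns).
# These values are made up purely for illustration.
set.seed(1)
expr <- matrix(rnorm(200 * 10, mean = 8, sd = 1), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:10)))

# Hypothetical logical vector marking which columns are control samples.
controls <- rep(c(TRUE, FALSE), each = 5)

# Per-gene standard deviation, computed across the control samples only.
gene_sd <- apply(expr[, controls, drop = FALSE], 1, sd, na.rm = TRUE)

# First quartile of the per-gene standard deviations: a candidate min_std.
min_std <- unname(quantile(gene_sd, 0.25, na.rm = TRUE))
min_std
```

The resulting min_std would then be passed as the min_std argument. Whether the first quartile of the control-sample standard deviations is a good proxy for technical noise in a given dataset is an assumption worth confirming with the package authors.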

But I would check this with the authors/maintainers on support.bioconductor.

gringer