I would like to know what people verify when designing or using spike-in controls for sequencing experiments (mainly Illumina). So far I have come up with this list:

  • Does it align only to a given genome/synthetic reference?
  • Does it contain G-quadruplexes? E.g. QGRS Mapper
  • Does it contain known Illumina error motifs? (Does this still matter for recent platforms/chemistries?) See this article
  • Does it contain other motifs known to be polymerase inhibitors? Ref
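
The G-quadruplex check in the list above could be roughly sketched as a regular-expression scan. This is a minimal sketch, assuming the commonly used canonical motif (four runs of at least three G separated by loops of 1 to 7 nt); it is not how QGRS Mapper scores candidates, and the function names are hypothetical.

```python
import re

# Assumption: canonical G4 motif, four G-runs (>= 3 G) separated by 1-7 nt loops.
# Dedicated tools such as QGRS Mapper use their own, more permissive scoring.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_candidates(seq):
    """Return (strand, start, match) tuples for putative G4 motifs.

    Minus-strand starts are given in reverse-complement coordinates.
    Only non-overlapping matches are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in G4_PATTERN.finditer(s):
            hits.append((strand, m.start(), m.group()))
    return hits
```

A spike-in candidate with an empty result passes this particular filter; a non-empty result only flags the sequence for a closer look.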

Anything else?

llrs
719016

1 Answer

Disclaimer: I'm a developer for http://www.sequin.xyz.

Sequin is a new set of spike-in controls for next-generation sequencing, including Illumina. We design controls for RNA-Seq, genome sequencing, metagenomics, etc.

Please see our papers if you want more details.

  • Reference Standards For Next-Generation Sequencing by Hardwick should be a good start.

Here's my list:

  1. Measured counts against input concentration: Pearson correlation, Spearman correlation, and regression slope.
  2. PCA
  3. Whether the synthetic reads align to the synthetic reference
  4. Spiked-in amount (e.g. dilution)
  5. Diagnostic statistics
  6. ROC Curve
  7. Detection limit

(1) is important because, in a perfect experiment, the controls should give a correlation of 1 and a slope of 1. This is a sample output (you can read more about it in our papers):

[Figure: sample output, measured counts against input concentration]
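
The statistics in (1) can be sketched in plain Python as follows. This is only an illustration, not the Sequin software itself: the function names are hypothetical, and fitting on log2-transformed values is an assumption (it requires all values to be positive).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ols_slope(x, y):
    """Slope of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def linearity_summary(input_conc, measured):
    """Summarise measured abundance against input concentration.

    Assumption: both axes are log2-transformed before fitting,
    so every value must be greater than zero.
    """
    lx = [math.log2(v) for v in input_conc]
    ly = [math.log2(v) for v in measured]
    return {
        "pearson": pearson(lx, ly),
        "spearman": spearman(lx, ly),
        "slope": ols_slope(lx, ly),
    }
```

With toy values such as `linearity_summary([1, 2, 4, 8, 16], [3, 6, 12, 24, 48])`, where the measured values are exactly proportional to the input, all three statistics come out at 1 up to floating-point error.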

(2) This is our sample PCA (our paper has the details)

[Figure: sample PCA plot]

(3) is also important but you can only do that in simulations.

(5)

[Figure: diagnostic statistics]

(6)

[Figure: ROC curve]
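
A ROC curve like the one in (6) can be computed from any list of scored calls on the controls, since the truth for the controls is known by design. The sketch below assumes each call has a numeric score (higher means more confident) and a true/false label; the function names are hypothetical and this is not taken from the Sequin software.

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low.

    scores: numeric confidence per call (higher = more confident).
    labels: truthy for true positives, falsy for false positives.
    Calls with tied scores are added to the curve together.
    """
    pos = sum(1 for l in labels if l)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every call sharing the current score.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

An area of 1.0 means every true call on the controls outranks every false one; 0.5 is what random scoring would give.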

(7)

[Figure: detection limit]
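
One simple way to read a detection limit off the controls is to take the lowest input concentration at which a control is still detected. The sketch below uses that simplified definition as an assumption; the function name and the `min_count` threshold are hypothetical, and the papers may define the limit differently.

```python
def detection_limit(input_conc, measured, min_count=1):
    """Lowest input concentration among controls that were detected.

    A control counts as detected when its measured count is at least
    min_count (an assumed threshold). Returns None if nothing is detected.
    """
    detected = [c for c, m in zip(input_conc, measured) if m >= min_count]
    return min(detected) if detected else None
```

For toy values such as `detection_limit([0.1, 0.5, 1, 10], [0, 0, 3, 40])`, the two lowest controls are missed, so the limit is 1.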

SmallChess