I would like to use public omics datasets (ChIP-seq, RNA-seq, and ATAC-seq) from different studies to do an integrative analysis as follows:
- Normalise samples, within each type of omics, from different public datasets.
- Convert the normalised values to a common scale so that ChIP-seq, RNA-seq and ATAC-seq signals can be compared with each other.
- Feed the normalised, uniformly scaled values into a machine learning model to infer one feature (e.g. RNA expression) from other features (e.g. TF or histone-mark ChIP-seq).
How could I integrate these datasets to predict one group of variables from another?
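To make the three steps listed above concrete, here is a minimal sketch in Python using only NumPy. Everything in it is an assumption made for illustration: the data are synthetic, the mark names and effect sizes are made up, each assay is assumed to be already summarised as a genes x samples matrix (e.g. promoter signal per gene), and the helper names `quantile_normalise`, `standardise` and `ridge_fit` are hypothetical. Quantile normalisation, z-scoring and ridge regression are just one possible choice for each step, not a recommended pipeline.

```python
import numpy as np

def quantile_normalise(mat):
    """Give every sample (column) the same empirical distribution."""
    ranks = np.argsort(np.argsort(mat, axis=0), axis=0)  # rank of each value within its column
    reference = np.sort(mat, axis=0).mean(axis=1)        # mean of the k-th smallest values
    return reference[ranks]

def standardise(mat):
    """Put a whole assay on a unitless scale (mean 0, sd 1)."""
    return (mat - mat.mean()) / mat.std()

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression, no intercept (inputs are roughly centred)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# --- Synthetic stand-in for public data (all values are made up) ---
rng = np.random.default_rng(0)
n_genes, n_samples = 500, 6
marks = ["H3K4me3", "H3K27ac", "H3K27me3"]
true_w = np.array([1.0, 0.8, -0.6])  # hypothetical effect of each mark on expression

def fake_counts(signal):
    depth = rng.uniform(0.5, 2.0, size=n_samples)  # per-sample sequencing-depth factor
    return np.exp(signal + 3.0) * depth

latent = rng.normal(size=(len(marks), n_genes, n_samples))
chip_raw = {m: fake_counts(latent[i]) for i, m in enumerate(marks)}
noise = rng.normal(scale=0.3, size=(n_genes, n_samples))
rna_raw = fake_counts(np.tensordot(true_w, latent, axes=1) + noise)

# Step 1 + 2: normalise within each assay, then bring all assays to one scale
def preprocess(raw):
    return standardise(quantile_normalise(np.log1p(raw)))

chip = {m: preprocess(x) for m, x in chip_raw.items()}
rna = preprocess(rna_raw)

# Step 3: predict expression from the marks, holding out 100 genes for testing
X = np.stack([chip[m] for m in marks], axis=-1)  # genes x samples x marks
X_train, y_train = X[:400].reshape(-1, len(marks)), rna[:400].reshape(-1)
X_test, y_test = X[400:].reshape(-1, len(marks)), rna[400:].reshape(-1)

beta = ridge_fit(X_train, y_train)
pred = X_test @ beta
r2 = 1 - ((y_test - pred) ** 2).sum() / ((y_test - y_test.mean()) ** 2).sum()
print(dict(zip(marks, beta.round(2))), "held-out R^2:", round(r2, 3))
```

The sketch holds out genes rather than samples, so it only checks whether the mark-to-expression relationship generalises to unseen genes. It does not address batch effects between studies, which would need a separate correction step before samples from different sources are combined.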
My goals are: a) to describe the dynamics of different histone marks during differentiation (A -> B -> C); b) to predict gene expression from these histone marks.
– Firas Nov 06 '18 at 05:46