I have a list of gene symbols:
symbols <- c("cd45", "Tmem119", "CD11b", "P2Yr12", "Csf1r", "Bst2", "Cd74",
"Cx3cr1", "Trem2", "Lyz2", "GLAST", "GFAP", "ALDH1L1", "Aldoc", "Aqp4",
"Glul", "S100a", "Olig1", "Olig2", "Olig3", "Mbp", "Pdgfra", "Pecam",
"Cldn5", "Cldn10", "Epas1", "Crip1")
If I feed it to BioMart to get Ensembl IDs:
library(biomaRt)
mart <- useDataset("mmusculus_gene_ensembl", useMart("ensembl"))
result <- getBM(filters = "mgi_symbol", attributes = c("ensembl_gene_id",
"mgi_symbol", "description"), values = symbols, mart = mart)
the following genes are missed:
"cd45" "cd11b" "p2yr12" "glast" "s100a" "pecam"
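For reference, a minimal sketch of how a list of missed symbols like the one above can be derived. It runs offline: `returned` is a hypothetical stand-in for the `mgi_symbol` column that `getBM()` gives back, filled in here with the input symbols that were not reported as missed. The comparison is lowercased on both sides on the assumption that BioMart may return symbols with different capitalization than the input.

```r
# Input symbols exactly as listed in the question.
symbols <- c("cd45", "Tmem119", "CD11b", "P2Yr12", "Csf1r", "Bst2", "Cd74",
             "Cx3cr1", "Trem2", "Lyz2", "GLAST", "GFAP", "ALDH1L1", "Aldoc",
             "Aqp4", "Glul", "S100a", "Olig1", "Olig2", "Olig3", "Mbp",
             "Pdgfra", "Pecam", "Cldn5", "Cldn10", "Epas1", "Crip1")

# Hypothetical stand-in for the mgi_symbol column of the getBM() result:
# every input symbol except the six reported as missed.
returned <- c("Tmem119", "Csf1r", "Bst2", "Cd74", "Cx3cr1", "Trem2", "Lyz2",
              "GFAP", "ALDH1L1", "Aldoc", "Aqp4", "Glul", "Olig1", "Olig2",
              "Olig3", "Mbp", "Pdgfra", "Cldn5", "Cldn10", "Epas1", "Crip1")

# Case-insensitive comparison: which inputs have no match in the result?
missed <- setdiff(tolower(symbols), tolower(returned))
print(missed)
```

With a live query, `returned` would be replaced by the `mgi_symbol` column of the data frame that `getBM()` returns.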
All of them are pretty well-known genes, and I can find their Ensembl IDs manually by googling them, for instance:
http://www.informatics.jax.org/marker/MGI:97810
I tried supplying aliases, but that does not change the output. So, as far as I understand, either I am doing something wrong here or BioMart itself is not a good tool for this. Is there a better way of getting a mapping that covers all of the gene symbols?
mgi. I tried supplying mgi instead of mgi_symbol, and it did not work. Not sure whether it is an issue with biomart or not. How to do that with AWK? Can you give any links for doing that, tutorials? To learn whole new language just to do translation of ensembl ids to symbols seems like an overkill a bit, if it is not very easy. – Nikita Vlasenko May 11 '18 at 04:56
AWK is a great tool for bioinformaticians, but if you don't want to make use of this tool, don't. I do not want to force you into something. – benn May 11 '18 at 06:49