I have 3224 Ensembl IDs as row names in a data frame "G". To convert the Ensembl IDs into gene symbols I used biomaRt as follows.
library('biomaRt')
mart <- useDataset("hsapiens_gene_ensembl", useMart("ensembl"))
genes <- rownames(G)
G <- G[, -6]
G_list <- getBM(filters = "ensembl_gene_id", attributes = c("ensembl_gene_id", "hgnc_symbol"), values = genes, mart = mart)
Now in G_list I can see only 3200 Ensembl IDs, some with gene symbols and some without. Why are the other 24 Ensembl IDs missing from G_list? If there is no gene symbol for those 24 IDs, the result should at least show "-".
Examples of problematic IDs are: ENSG00000257061, ENSG00000255778, ENSG00000267268. These do not appear in G_list (biomaRt) at all. So I submitted them to bioDBnet, which seems to handle them.
What is the problem here?
NA for genes that are in the database, but don't have symbols. – Ian Sudbery Aug 25 '17 at 12:11
NA with - at some point. – Devon Ryan Aug 25 '17 at 12:13
ENSG00000257061 was last seen in ENSEMBL 84 (you can look this up by searching on the ENSEMBL website). You need to search against the exact same version as you got your gene list from. Replace useMart with useEnsembl(version=X) where X is the version of ensembl you used to generate the gene list. – Ian Sudbery Aug 25 '17 at 12:30
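
The suggestions in the comments could be combined roughly as in the sketch below. It is only an illustration, not a tested fix for this dataset. The helper names `lookup_symbols` and `pad_missing` are hypothetical, and version 84 is used only because the comment above names it for ENSG00000257061; the release your own gene list came from may differ, and an old release may no longer be served (biomaRt's `listEnsemblArchives()` shows what is currently available). The example at the end uses a mock result instead of a live query, so it runs without network access.

```r
# Hypothetical helper: query a specific Ensembl release rather than the
# current one. Needs the biomaRt package and network access when called.
lookup_symbols <- function(ids, version = 84) {
  mart <- biomaRt::useEnsembl(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl",
                              version = version)
  biomaRt::getBM(filters = "ensembl_gene_id",
                 attributes = c("ensembl_gene_id", "hgnc_symbol"),
                 values = ids,
                 mart = mart)
}

# Hypothetical helper: keep every input ID in the output, and use a
# placeholder where the query returned no row or an empty/NA symbol.
pad_missing <- function(ids, bm_result, placeholder = "-") {
  all_ids <- data.frame(ensembl_gene_id = ids, stringsAsFactors = FALSE)
  out <- merge(all_ids, bm_result, by = "ensembl_gene_id", all.x = TRUE)
  out$hgnc_symbol[is.na(out$hgnc_symbol) | out$hgnc_symbol == ""] <- placeholder
  out
}

# Mock data standing in for a getBM() result (no network needed):
ids  <- c("ENSG00000257061", "ENSG00000141510")
mock <- data.frame(ensembl_gene_id = "ENSG00000141510",
                   hgnc_symbol = "TP53",
                   stringsAsFactors = FALSE)

missing_ids <- setdiff(ids, mock$ensembl_gene_id)  # IDs the query did not return
padded <- pad_missing(ids, mock)
print(missing_ids)
print(padded)
```

With real data, the calls would be along the lines of `G_list <- lookup_symbols(rownames(G), version = X)` followed by `pad_missing(rownames(G), G_list)`, where X is the Ensembl release the IDs came from.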