Most Popular
1500 questions
3
votes
2 answers
How to link GDC IDs to CCLE cell line names?
Hello bioinformaticians,
I've recently downloaded a few RNA-seq data files from the Genomic Data Commons (GDC) Data Portal. These files belong to the Broad Institute's CCLE project.
The problem is that the GDC portal does not provide CCLE cell line names:
So my…
user345394
- 675
- 6
- 20
3
votes
3 answers
How is the BLOSUM matrix constructed and calculated?
I would like to ask how the BLOSUM matrix is constructed and calculated. I read the Wikipedia article and still do not understand it. I cannot follow the mathematical calculations, as I know little about logarithms.
Could someone maybe give a…
Heeh wei cheng
- 83
- 1
- 5
3
votes
1 answer
How to remove frame on RDKit figures?
RDKit can plot molecules, though the structures are surrounded by an ugly frame with ticks. How can I turn that off and plot the molecules without the frame?
Note, the %pylab inline portion of this code is from the Jupyter environment.
%pylab…
Soerendip
- 1,295
- 11
- 22
3
votes
2 answers
How to count the number of mapped reads in a 100-bp window from a BAM/SAM file
I know how to get the total number of mapped reads using samtools flagstat (samtools flagstat file_sorted.bam), but I want to count the total number of mapped reads in non-overlapping windows of fixed size (let's say 500 bp) only with respect…
Lot_to_learn
- 530
- 3
- 14
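The windowing step described in the question above can be illustrated with a minimal sketch. It assumes each read is assigned to a window by its 1-based leftmost mapping position (the POS column of a SAM record) and that unmapped reads were already excluded, for example with samtools view -F 4. The function name count_reads_per_window and the example positions are hypothetical, and the sketch handles one reference sequence at a time.

```python
from collections import Counter


def count_reads_per_window(positions, window_size=100):
    """Count reads per non-overlapping window from 1-based start positions."""
    counts = Counter()
    for pos in positions:
        # Window index 0 covers positions 1..window_size, index 1 the next block, etc.
        counts[(pos - 1) // window_size] += 1
    # Report each non-empty window as (start, end) -> read count, in genomic order.
    return {
        (idx * window_size + 1, (idx + 1) * window_size): n
        for idx, n in sorted(counts.items())
    }


# Hypothetical leftmost mapping positions on a single reference sequence.
example_positions = [1, 50, 100, 101, 250, 299, 300]
window_counts = count_reads_per_window(example_positions, window_size=100)
print(window_counts)
```

Windows with no reads are simply absent from the result. Changing window_size to 500 gives the 500-bp case mentioned in the excerpt.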
3
votes
1 answer
De novo motif discovery in protein sequences
I am trying to build a feature matrix to be used for Random Forest-based classification. I'd like to add, as features, short motifs that are common to all the protein sequences belonging to a specific gene family.
I tried to use MEME but I'd like…
wrong_path
- 391
- 1
- 7
3
votes
1 answer
Multiple COSMIC IDs for the same mutation
I would like to know why there are different entries for the same mutation in COSMIC. For example, COSM6954941 and COSM12833 both refer to ERBB4 c.908C>A.
In this specific case the field Gene name in the two entries is different, COSM6954941 report…
mox
- 333
- 2
- 8
3
votes
1 answer
Subset data frame based on ID
I tried to subset a data frame in R. There is an empty space after each line in the sequence, which shows up as 'NA' when I read the files into R. As a result, when I subset, it just takes the sequence Protein_IDS and sequence line by line, not…
kcm
- 1,804
- 12
- 27
3
votes
2 answers
DESeqDataSetFromTximport invalid rownames length
I am trying to use the DESeqDataSetFromTximport function from the DESeq2 package to construct a dds object:
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~Group)
And somehow it is giving me the following error:
Error in `rownames<-`(`*tmp*`, value =…
Nikita Vlasenko
- 2,558
- 3
- 26
- 38
3
votes
4 answers
How to filter a multi-VCF for sex-specific genotypes?
I have a single VCF file of variants for a population that includes males & females and I would like to pull out all variants that are sex-specific. For example, sites where all males are 0/1 and all females are 0/0. Is there a way to do this using…
Joanne
- 305
- 1
- 5
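The per-site check described in the question above (all males 0/1, all females 0/0) can be sketched in plain Python. This is only an illustration of the logic, not a VCF parser: it assumes the genotypes for one site have already been extracted into a dictionary keyed by sample name, that genotypes are unphased strings exactly as they appear in the GT field, and that a sample-to-sex mapping is available. The names is_sex_specific, site_genotypes, and sample_sex are hypothetical.

```python
def is_sex_specific(genotypes, sex_of, male_gt="0/1", female_gt="0/0"):
    """Return True if every male has male_gt and every female has female_gt."""
    males = [gt for sample, gt in genotypes.items() if sex_of[sample] == "M"]
    females = [gt for sample, gt in genotypes.items() if sex_of[sample] == "F"]
    # Require at least one sample of each sex so an empty group cannot pass trivially.
    if not males or not females:
        return False
    return all(gt == male_gt for gt in males) and all(gt == female_gt for gt in females)


# Hypothetical sample-to-sex mapping and genotypes at two sites.
sample_sex = {"s1": "M", "s2": "M", "s3": "F", "s4": "F"}
site_genotypes = {"s1": "0/1", "s2": "0/1", "s3": "0/0", "s4": "0/0"}
mixed_genotypes = {"s1": "0/1", "s2": "0/0", "s3": "0/0", "s4": "0/0"}

print(is_sex_specific(site_genotypes, sample_sex))
print(is_sex_specific(mixed_genotypes, sample_sex))
```

Missing genotypes (./.) and phased calls (0|1) are treated as mismatches here. Whether that is the desired behaviour depends on the data.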
3
votes
1 answer
Difference between genometools installed with anaconda vs apt-get?
I installed genometools with Anaconda: conda install -c bioconda genometools. Something was installed, but apparently there is no binary called gt. If I type gt, Ubuntu suggests installing genometools via apt. What is the difference between the two versions?…
Soerendip
- 1,295
- 11
- 22
3
votes
1 answer
Why is the total read count still higher than the "paired in sequencing" count in samtools flagstat output after removing duplicates?
After alignment using BWA, I removed the duplicates using samtools (version 1.9).
My procedure is as follows:
bwa mem -k 32 -M ref.fa read1 read2 > out.sam
samtools view -@ 0 -b -T ref.fa -o out.bam in.sam
samtools sort -n -o…
Lingzi Xiangli
- 33
- 4
3
votes
2 answers
Downloading multiple SRA files from several SRA accession IDs does not work
I am trying to download multiple SRA files located in several SRA accessions. Some of my accession numbers are as…
Safina A.R
- 119
- 6
3
votes
1 answer
How to paste RSIDs in CADD output
I want to add RSIDs to my CADD output, as CADD does not include an RSID column in its output. For this purpose I am using bedtools intersect to compare two files and get an RSID column in my CADD file.
This is the command I am using.
bedtools intersect -a…
Sarah
- 105
- 5
3
votes
5 answers
How to create a .bed file from .fasta?
I am having problems creating a .bed file for hg19 that I can visualize in IGV.
The .fasta file contains rows of this…
0x90
- 1,437
- 9
- 18
3
votes
2 answers
Using Seurat to compare mutant vs. wt
I am interested in using Seurat to compare wild type vs. mutant. I don't know how to use the package. How can I test whether mutant mice, which have a deleted gene, cluster together?
hua
- 441
- 1
- 7
- 14