Most Popular
1500 questions
6 votes, 7 answers
Download multiple SRA files
I want to download all SRA files from the following project. Is there a way to download all of them at the same time?
user2300940
6 votes, 3 answers
How can I extract gene names for a metabolic pathway from KEGG?
Note: this question has also been asked on Biostars
I need to get the list of gene names involved in glycolysis (to give an example). I need to do this in a script rather than manually, ideally with Python. I think KEGG is the proper database to do this,…
a06e
6 votes, 4 answers
Any fast options to query large VCF bed intervals?
I'm doing some analysis and I need to subset a large VCF file (~8 GB gzipped) given a BED interval and identify a subset of rsIDs within it.
Unfortunately, both my usual choices for this analysis (SnpSift and bedtools) are taking way too long or…
andremrsantos
6 votes, 1 answer
Bacterial genome annotation of a clinical isolate strain?
So I'm basically so new to this that I'm just trying to find out what tools, methods, and keywords I should go look up by myself.
I have a unique strain of a bacterium.
I was given RNAseq data for this unique strain.
I want to analyze the RNAseq…
myflow
6 votes, 1 answer
Does GISTIC (v 2.0) estimate amplified/deleted probabilities on a single sample basis?
Does GISTIC 2.0 estimate the background model:
G = -log(Probability | Background)
by permuting within the sample or across all samples in the set?
The paper describes the probabilistic scoring method based on permutations, but I could not understand…
Emanuel
6 votes, 2 answers
Clarification on Gene Enrichment
When I run a GSEA analysis on two conditions from the same RNA-seq experiment (negative control PBS injection vs. positive control CpG injection), using the same dataset and the same gene list, I get results that look something like this:
Notice in my example that many…
julianstanley
6 votes, 1 answer
How to build a BAM header file with htslib in C++?
I'd like to use C++ to generate a new BAM file programmatically. Here is an example of how to use htslib to generate a new BCF file on the fly:
https://github.com/samtools/htslib/blob/develop/test/test-bcf-translate.c
The two key functions…
SmallChess
6 votes, 1 answer
Checks for spike-in sequence controls
I would like to know what people verify when designing or using spike-in controls for sequencing experiments (mainly Illumina). So far I have come up with this list:
Does it align only to a given genome synthetic reference?
Does it contain…
719016
6 votes, 1 answer
How to convert DNAbin to FASTA in R?
I am trying to convert DNAbin files to FASTA format. Rationale: I want to use the FASTA file to calculate the non-synonymous/synonymous mutation rate.
my_dnabin1 is a DNAbin file of 55 samples and I am using the following code to convert it into a…
Nikita
6 votes, 2 answers
What is indel calling and what is its purpose?
I'm having difficulty grasping the general purpose and concept of indel calling.
What exactly is this process?
AlwaysTrying44
6 votes, 2 answers
Is it possible to create a DESeqDataSet with a user-provided design matrix?
I'm trying to run a differential gene expression analysis using DESeq2, with counts coming from kallisto. I have imported them using tximport and I'm creating the DESeqDataSet (dds) using the DESeqDataSetFromMatrix function.
> dds <-…
mgalardini
6 votes, 3 answers
Removing PCR duplicates in RNA-seq Analysis
After reading some of the forum posts on Biostars and SEQanswers, I find it very confusing whether or not to filter out duplicate reads from aligned files. As far as I understand, it's very difficult to distinguish between highly expressed genes and…
arup
6 votes, 3 answers
Secretome and membrane receptor profiles
I am looking for the secretome profile and the membrane receptor profile of a given cell type.
In my specific case, this would be the secretome and outer membrane receptor profiles of the dorsal root ganglion.
What I've done in the past is taken…
jaslibra
6 votes, 2 answers
How can I edit a specific FASTQ read in place, given the read ID?
I am introducing SNVs into specific samples in order to estimate false negative rates for a variant calling pipeline. I know reads can be simulated, but I would prefer to use the real data so as to keep everything else equal.
Given a read…
dkainer
6 votes, 1 answer
How many false positive duplicates are marked using just the position of first unclipped base?
In the popular Picard MarkDuplicates tool, a read is marked as a duplicate if it has the same position as another read, measured from the first unclipped base in the 5' direction. The implication is that if two reads have this property, it must be…
Ricky
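The position-based rule described in the last question can be illustrated with a minimal sketch. This is not Picard's implementation: it assumes single-end reads, ignores library and base-quality information, and adds chromosome and strand to the grouping key as an assumption. The function names `unclipped_five_prime` and `group_by_five_prime` and the example reads are hypothetical. Positions are 1-based leftmost alignment coordinates, as in the SAM POS field.

```python
import re
from collections import defaultdict

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference
CLIPS = set("SH")             # soft and hard clips


def unclipped_five_prime(pos, cigar, reverse):
    """Return the reference coordinate of the read's 5' end, ignoring clipping."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if not reverse:
        # Forward strand: the 5' end is on the left, so subtract leading clips.
        clipped = 0
        for n, op in ops:
            if op not in CLIPS:
                break
            clipped += n
        return pos - clipped
    # Reverse strand: the 5' end is on the right, so take the alignment end
    # and add trailing clips.
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    clipped = 0
    for n, op in reversed(ops):
        if op not in CLIPS:
            break
        clipped += n
    return pos + ref_len - 1 + clipped


def group_by_five_prime(reads):
    """Group read names by (chromosome, unclipped 5' position, strand)."""
    groups = defaultdict(list)
    for name, chrom, pos, cigar, reverse in reads:
        key = (chrom, unclipped_five_prime(pos, cigar, reverse), reverse)
        groups[key].append(name)
    return groups


# Hypothetical reads: (name, chromosome, POS, CIGAR, is_reverse_strand)
reads = [
    ("r1", "chr1", 100, "50M", False),
    ("r2", "chr1", 103, "3S47M", False),  # soft clip hides the same 5' start as r1
    ("r3", "chr1", 100, "50M", True),
    ("r4", "chr1", 100, "45M5S", True),   # same unclipped 5' end as r3
]
groups = group_by_five_prime(reads)
```

Any group with more than one read name is a set of candidate duplicates under this simplified rule; whether such reads are true PCR duplicates or independent fragments that happen to share a start position is exactly what the question asks.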