Most Popular

1500 questions
6 votes, 7 answers

Download multiple SRA files

I want to download all SRA files from the following project. Is there a method to download all the SRA files at the same time?
user2300940
6 votes, 3 answers

How can I extract gene names for a metabolic pathway from KEGG?

Note: this question has also been asked on Biostars. I need to get the list of gene names involved in glycolysis (as an example). Not manually; I need to do this in a script, ideally with Python. I think KEGG is the proper database to do this,…
a06e
6 votes, 4 answers

Any fast options to query large VCF bed intervals?

I'm doing some analysis and I need to subset a large VCF file (~8 GB gzipped) given a BED interval and identify a subset of rsIDs within it. Unfortunately, both my normal choices for this analysis (SnpSift and bedtools) are taking way too long or…
6 votes, 1 answer

Bacterial genome annotation of a clinical isolate strain?

I'm so new to this that I'm just trying to find out what tools, methods, and keywords I should go look up by myself. I have a unique strain of a bacterium. I was given RNA-seq data for this unique strain. I want to analyze the RNA-seq…
myflow
6 votes, 1 answer

Does GISTIC (v 2.0) estimate amplified/deleted probabilities on a single sample basis?

Does GISTIC 2.0 estimate the background model: G = -log(Probability | Background) by permuting within the sample or across all samples in the set? The paper describes the probabilistic scoring method based on permutations, but I could not understand…
Emanuel
6 votes, 2 answers

Clarification on Gene Enrichment

When I run a GSEA analysis on two conditions from the same RNA-seq experiment (negative control PBS injection vs. positive control CpG injection), using the same dataset and the same gene list, I get results that look something like this: Notice in my example that many…
julianstanley
6 votes, 1 answer

How to build a BAM header file with htslib in C++?

I'd like to use C++ to generate a new BAM file programmatically. Here is an example of how to use htslib to generate a new BCF file on the fly: https://github.com/samtools/htslib/blob/develop/test/test-bcf-translate.c The two key functions…
SmallChess
6 votes, 1 answer

Checks for spike-in sequence controls

I would like to know what people verify when designing or using spike-in controls for sequencing experiments (mainly Illumina). So far I have come up with this list: Does it align only to a given genome synthetic reference? Does it contain…
719016
6 votes, 1 answer

How to convert DNAbin to FASTA in R?

I am trying to convert DNAbin files to FASTA format. Rationale: I want to use the FASTA file to calculate the non-synonymous/synonymous mutation rate. my_dnabin1 is a DNAbin file of 55 samples and I am using the following code to convert it into a…
Nikita
6 votes, 2 answers

What is indel calling and what is its purpose?

I'm having difficulty grasping the general purpose and concept of indel calling. What exactly is this process?
AlwaysTrying44
6 votes, 2 answers

Is it possible to create a DESeqDataSet with a user-provided design matrix?

I'm trying to run a differential gene expression analysis using DESeq2, with counts coming from kallisto. I have imported them using tximport and I'm creating the DESeqDataSet (dds) using the DESeqDataSetFromMatrix function. > dds <-…
mgalardini
6 votes, 3 answers

Removing PCR duplicates in RNA-seq Analysis

After reading some of the forum posts on Biostars and SEQanswers, I find it very confusing whether or not to filter out duplicate reads from aligned files. As far as I understand, it's very difficult to distinguish between highly expressed genes and…
arup
6 votes, 3 answers

Secretome and membrane receptor profiles

I am looking for the secretome profile and the membrane receptor profile for a given cell type. In my specific case, this should be the secretome and outer membrane receptor profiles of dorsal root ganglion. What I've done in the past is taken…
jaslibra
6 votes, 2 answers

How can I edit a specific FASTQ read in place, given the read ID?

I am introducing SNVs into specific samples in order to estimate false negative rates for a variant calling pipeline. I know reads can be simulated, but I would prefer to use the real data so as to keep everything else equal. Given a read…
dkainer
6 votes, 1 answer

How many false positive duplicates are marked using just the position of first unclipped base?

In the popular Picard MarkDuplicates tool, a read is marked as a duplicate if it has the same position as another read, measured from the first unclipped base in the 5' direction. The implication is that if two reads have this property, it must be…
Ricky