Most Popular

1500 questions
5
votes
1 answer

Which are the use cases for the methods for DE in Seurat

In Seurat we can specify multiple methods for finding DE genes. I am wondering when we should use which one, i.e. their use cases. As I understand it, the whole Seurat package is used for single-cell sequencing analysis. I ran different methods for my dataset…
Nikita Vlasenko
  • 2,558
  • 3
  • 26
  • 38
5
votes
2 answers

A reliable fetcher of short read using SRA/ENA accession

I am trying to build a workflow that gets data automatically from databases of sequencing reads (SRA, the Sequence Read Archive; ENA, the European Nucleotide Archive). Until now I was pulling everything from SRA and my script has worked, but now I…
Kamil S Jaron
  • 5,542
  • 2
  • 25
  • 59
5
votes
2 answers

Manually define clusters in Seurat and determine marker genes

I want to define two clusters of cells in my dataset and find marker genes that are specific to one and the other. Is there a way to do this in Seurat? Say, if I produce two subsets by the SubsetData function, is there a way to feed them into some…
Nikita Vlasenko
  • 2,558
  • 3
  • 26
  • 38
5
votes
2 answers

Reads mapped to exonic, intronic and intergenic regions

After the alignment step I checked the RNA-seq metrics of all the samples. Among 40 samples, three show a high percentage of reads mapped to intronic regions. What could be the reason? Samples Exonic Intronic Intergenic Sample1…
stack_learner
  • 1,262
  • 14
  • 26
5
votes
3 answers

Is there an efficient way to check an input BAM in R?

I'm writing a function in R for an R package which takes as input a BAM. my_func = function(my_bam){ if (!file.exists(my_bam)){ stop("Input a valid BAM file.") } ## do awesome stuff with BAM } Trying to write robust software,…
EB2127
  • 1,413
  • 2
  • 10
  • 23
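
One cheap check beyond `file.exists` is to look at the file's magic bytes: a BAM file is BGZF (gzip-compatible) compressed, and its decompressed stream starts with the four bytes `BAM\1`. The sketch below shows the idea in Python rather than R; the function name `looks_like_bam` is hypothetical, and the check only confirms the header magic, not that the file is complete, sorted, or indexed.

```python
import gzip
import zlib

# First four bytes of the decompressed BAM stream, per the SAM/BAM specification.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path):
    """Return True if `path` decompresses as gzip/BGZF and starts with the BAM magic.

    Hypothetical helper for illustration; it does not validate the rest of the file.
    """
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Missing file, not gzip-compressed, truncated, or corrupted data.
        return False
```

Reading only four bytes keeps the check fast even for very large files, which is why it suits an input-validation step at the top of a function.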
5
votes
1 answer

Samtools/bcftools calling indels in noisy reads

I have very noisy nanopore reads and am trying to call SNPs/indels. I'm running into some trouble when trying to use the samtools mpileup | bcftools call combination. It seems to be incorrectly calling very long indels where it looks like there is…
5
votes
1 answer

Keep Format and Individual fields when annotating VCF with VEP

I'm currently updating my variant calling pipeline by switching the VCF annotation software from Annovar to VEP for a variety of reasons, not least how easy it is to annotate with HGVS notation and keep datasets up to date in VEP. For the most part…
5
votes
2 answers

How are Principal Component analyses and Admixture analyses from a genetic alignment different?

How are Principal Component analyses and Admixture analyses from a genetic alignment different? My understanding is that a PCA will take raw genetic differences across the entire alignment and plot them using dimensionality reduction techniques…
5
votes
3 answers

Regressing out unwanted sources of variation in single cell RNA-seq data

I have a dataset of single-cell count data and I want to regress out the variation caused by the number of UMIs and the percentage of mitochondrial genes. I know that count data is discrete data, and commonly follows a negative binomial…
DCZ
  • 200
  • 8
5
votes
1 answer

Do you use SNAP for short-read mapping?

I am calling SNPs from WGS samples produced at my lab. I am currently using bwa-mem for mapping Illumina reads as it is recommended by GATK best practice. However, bwa is a bit slow. I heard from my colleague that SNAP is much faster than bwa. I…
medbe
  • 847
  • 1
  • 7
  • 9
5
votes
2 answers

Is there a Python/R package with the ability to convert an alignment and reference into a CIGAR?

I'm writing a Python function from scratch to do this, but I feel like this must exist in some standard bioinformatics library already. In principle, this is a simple regex operation which many must have written previously. With the goal of having a…
EB2127
  • 1,413
  • 2
  • 10
  • 23
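
To make the task concrete, here is a minimal sketch of the conversion, assuming the alignment is given as two gapped rows of equal length (reference and query) with `-` as the gap character. The excerpt does not state the input format, so that layout and the function name `alignment_to_cigar` are assumptions. It emits only `M`, `I` and `D` operations and does not handle clipping or the `=`/`X` distinction.

```python
from itertools import groupby


def alignment_to_cigar(ref_aln, query_aln, gap="-"):
    """Build a CIGAR string from two gapped alignment rows (hypothetical helper)."""
    if len(ref_aln) != len(query_aln):
        raise ValueError("alignment rows must have the same length")

    ops = []
    for ref_base, query_base in zip(ref_aln, query_aln):
        if ref_base == gap and query_base == gap:
            continue  # a column that is a gap in both rows carries no information
        if ref_base == gap:
            ops.append("I")  # base present only in the query: insertion
        elif query_base == gap:
            ops.append("D")  # base present only in the reference: deletion
        else:
            ops.append("M")  # aligned column (match or mismatch)

    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


print(alignment_to_cigar("ACGT-ACGT", "AC-TTACGT"))  # 2M1D1M1I4M
```

Classifying columns first and run-length encoding afterwards keeps the two concerns separate, so switching to `=`/`X` output would only change the per-column branch.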
5
votes
1 answer

How to convert the given mathematical computation (on biological problem) to mathematical formula, equation?

I have cross-posted this question on Mathematics Stack Exchange. The problem is predominantly mathematical (this question) but the application of the problem is mainly biological. Hoping that people in this forum have faced similar problems, I am posting this…
everestial
  • 200
  • 7
5
votes
1 answer

How to get results from Homo.sapiens package in bioconductor for a specific reference

I want to use the Homo.sapiens package in Bioconductor to retrieve the chromosome location start and end for each gene symbol in a specific reference (e.g. hg19 or hg38). Right now, I am using the following code: > select(Homo.sapiens,…
gc5
  • 1,783
  • 18
  • 32
5
votes
2 answers

How can I easily get the read size distribution of reads mapping on a certain set of regions?

Suppose I have a BAM file indicating where reads in a library have mapped, and a bed file describing a set of genomic regions. Is there a way to easily get the size distribution of the reads mapping on this set of regions?
bli
  • 3,130
  • 2
  • 15
  • 36
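
One possible approach, sketched under assumptions: restrict the alignments to the regions upstream (for example with `samtools view -L`, which keeps only alignments overlapping a BED file) and then tally the length of the SEQ column of the resulting SAM text. The function name `read_length_distribution` is hypothetical. The tally counts alignments rather than unique reads, includes soft-clipped bases, and skips records whose SEQ is `*`.

```python
from collections import Counter


def read_length_distribution(sam_lines):
    """Count SEQ lengths over an iterable of SAM text lines (hypothetical helper)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        seq = fields[9]  # SEQ is the 10th mandatory SAM column
        if seq == "*":
            continue  # sequence not stored for this record
        counts[len(seq)] += 1
    return counts
```

Called on `sys.stdin` from a small wrapper script, this could sit at the end of a pipe such as `samtools view -L regions.bed reads.bam | python wrapper.py`, where the file and script names are placeholders.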
5
votes
1 answer

Which tools for differential expression analysis in scRNA-Seq?

I am starting to run analysis for differential expression in scRNA-Seq. Which tools are available for this kind of analysis? Can tools for bulk RNA-Seq like DESeq be used for scRNA-Seq?
gc5
  • 1,783
  • 18
  • 32