Most Popular
1500 questions
5
votes
2 answers
Difference between samtools mark duplicates and samtools remove duplicates?
What is the difference between samtools mark duplicates and remove duplicates? Is it necessary to mark duplicates before removing duplicates with samtools?
Ammar Sabir Cheema
- 951
- 7
- 20
5
votes
1 answer
Correcting for noise in RT-qPCR gene expression data
I have a training set of RT-qPCR gene expression data (not run in triplicate) for a batch of samples with two phenotypes $A$ and $B$ on which I've trained a "logistic regression classifier".
I also have another smaller set of samples which have been…
Set
- 241
- 1
- 8
5
votes
1 answer
No counts for added gene in cellranger (scRNA-seq)
I have a set of scRNA-seq samples enriched with FACS for cells expressing a specific gene reporter (TdTomato). In particular the gene I want to report has positive counts in the resulting matrix for 97% of cells.
I followed CellRanger documentation…
gc5
- 1,783
- 18
- 32
5
votes
2 answers
Reduce number of transcripts in a highly variable de novo transcriptome assembly
I have a de novo assembly using both multiple SRA and locally sequenced transcriptomes. I started with 270M PE reads from 9 tissues. Here are the assembly stats generated with TrinityStats.pl:
################################
## Counts of…
LinuxBlanket
- 309
- 1
- 10
5
votes
0 answers
Annotating splice junctions from tophat/STAR output
Is there a way to annotate the splice junctions output by tophat/STAR?
By annotate, I mean: can I tell whether a junction was involved in an alternative splicing event, say a skipped exon, MXE, or retained intron...?
I did some research looks like…
novicebioinforesearcher
- 771
- 1
- 6
- 15
5
votes
1 answer
Why do NEBNext indexing primers have sequence between the p5 oligo and index?
In a previous post I asked Why do NEB adapters have non-complementary sequence?
Since then, I realized that there is some other sequence in the p5 indexing primer, as well as in the p7 indexing primer.
Here is a diagram of the NEBNext protocol. The…
conchoecia
- 3,141
- 2
- 16
- 40
5
votes
2 answers
Spearman correlation for large dataset
I have two datasets (DataA and DataB) and I want to find the Spearman correlation between genes and also pull out the gene names (stored in the first column of each dataset) in R. I am using fread from data.table to read the file and cor.test to find Rho and…
user98059
- 347
- 3
- 11
5
votes
1 answer
Macs2 peak calling?
I have paired-end ChIP-seq data (101 bp) with 2 biological replicates for each one. I have done peak calling with macs2 but I have some questions about it.
I also encountered a warning:
WARNING @ Thu, 07 Jun 2018 17:06:05: #2 Since the d (197)…
star
- 153
- 3
5
votes
1 answer
What causes the difference in total length of assembled contigs and scaffolds in SOAPdenovo2?
I use SOAPdenovo2 to assemble a large genome (4.8G) using ~20X paired-end reads. The total length of contig sizes is 6.3G while total length of scaffolds is 2.7G. Note that this is a haploid genome, so there is no issue of heterozygosity for…
user8095614
- 50
- 2
5
votes
1 answer
Viral Metagenomics
I am analyzing viral metagenomics data (Illumina MiSeq) for the first time. I have previously used Ray (reference below) for de novo viral genome assembly, but I haven't done metagenomics analysis before.
I know that there are some tools like Metavelvet…
L R Joshi
- 719
- 3
- 11
5
votes
3 answers
KEGG FTP vs KEGG API
I was reading the KEGG plea and I found that it doesn't forbid using the KEGG API. So what does the FTP server license for personal/academic use provide that is not covered by the API?
Or could I download the whole database via the API?
PS: I…
llrs
- 4,693
- 1
- 18
- 42
5
votes
4 answers
Specific cell type identification in Single Cell Sequencing
To define which cell is of which type we need to identify a set of rules. For instance, neurons should express one of the following: Thy1, Rbfox3, MAP2, Camk2b, Gad1, Cck, Reln, and should not express any of the following: cd45, Tmem119,…
Nikita Vlasenko
- 2,558
- 3
- 26
- 38
5
votes
2 answers
Smallest group size for differential expression in limma (bulk RNA-Seq)
I am reading Smyth et al. (ref. 1). I want to run differential expression analysis on a bulk RNA-Seq dataset in which each group is composed of 2 samples. The paper states that:
Genes must be expressed in at least one…
gc5
- 1,783
- 18
- 32
5
votes
4 answers
Pathway level analysis of single-cell gene expression
I'm looking for single-cell specific methods to construct (using gene expression data) new features that express pathway "level" or "activity", and then use these for clustering cells.
One example for bulk RNA-seq is PLAGE, implemented in the GSVA R…
Peter
- 2,634
- 15
- 33
5
votes
2 answers
Error creating indices using STAR
I am trying to index the wheat genome using STAR with the following command:
STAR --runMode genomeGenerate --genomeFastaFiles Triticum_aestivum.TGACv1.dna_sm.toplevel.fa --runThreadN 28
But I get the following error:
terminate called after throwing an…
Ammar Sabir Cheema
- 951
- 7
- 20