Most Popular

1500 questions
4
votes
1 answer

10x Genomics Chromium single-cell RNA-seq data analysis options?

Provide an overview of 10x data analysis packages. 10x provides Cell Ranger, which prepares a count matrix from the BCL sequencer output files and other files (see bottom of page https://support.10xgenomics.com/docs/license for the programs it…
Peter
  • 2,634
  • 15
  • 33
4
votes
1 answer

How do I find split reads?

How can I detect a split read in a BAM file? Is there anything in the CIGAR string that indicates a split read?
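As a rough illustration of what this question is asking about: in the SAM/BAM format, a split (chimeric) alignment is normally flagged by the supplementary-alignment bit (0x800) and the `SA` tag, while the CIGAR string only shows soft (`S`) or hard (`H`) clipping. The sketch below works on plain strings rather than a real BAM reader; the function names `parse_cigar`, `clipped_bases`, and `looks_split` are hypothetical helpers made up for this example, not part of any library.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

SUPPLEMENTARY_FLAG = 0x800  # SAM flag bit marking a supplementary alignment


def parse_cigar(cigar):
    """Return the CIGAR string as a list of (length, op) tuples."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def clipped_bases(cigar):
    """Count soft- (S) and hard- (H) clipped bases in a CIGAR string."""
    return sum(n for n, op in parse_cigar(cigar) if op in "SH")


def looks_split(flag, tags):
    """Hypothetical check: the record carries an SA tag or the 0x800 flag.

    `flag` is the integer SAM FLAG field; `tags` is a list of raw optional
    fields such as "SA:Z:chr2,100,+,60S40M,60,0;".
    """
    has_sa = any(tag.startswith("SA:Z:") for tag in tags)
    return has_sa or bool(flag & SUPPLEMENTARY_FLAG)
```

Clipping on its own is only a hint, since adapters or low-quality ends are also clipped, so `clipped_bases` is kept separate from `looks_split` here.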
4
votes
1 answer

Old versions of the reference genome and dbSNP

For benchmark purposes I’m trying to find an old version of the human reference genome (pre-GRCh37) and a matching version of dbSNP. Unfortunately it appears as though older versions of dbSNP aren’t archived, or at least these archives are not…
Konrad Rudolph
  • 4,845
  • 14
  • 45
4
votes
1 answer

taxon exclude list for searching local blast database using blastn

I am looking for a solution to exclude certain entries when searching a local BLAST nt database (with blastn), specifically the sequences from uncultured / environmental samples, ideally using their taxon ids. This should ideally replicate the…
Peter Menzel
  • 443
  • 4
  • 9
4
votes
1 answer

Viral genome assembly using broad viral ngs pipeline?

I am trying to assemble an RNA virus genome using the Broad Viral NGS pipeline. I have two questions: 1) As this pipeline requires unaligned BAM format as input, how do I convert FASTQ to BAM? 2) Some of the tools like …
L R Joshi
  • 719
  • 3
  • 11
4
votes
2 answers

How to predict stop codons in Illumina reads?

I have Illumina MiSeq paired-end reads from 150bp amplicons mapped to my reference genome (> 1000X coverage). These reads have indels that may or may not induce frameshifts. If the indel induces a frameshift (i.e. indel is not a multiple of 3bp), I…
francoiskroll
  • 221
  • 1
  • 3
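The frameshift rule stated in the stop-codon question above (an indel shifts the reading frame when its length is not a multiple of 3 bp) can be sketched from a read's CIGAR string. This is a minimal illustration only: it assumes every indel in the read falls inside the coding sequence, and the function names `net_indel_length` and `causes_frameshift` are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def net_indel_length(cigar):
    """Inserted bases minus deleted bases across the whole alignment."""
    net = 0
    for length, op in CIGAR_OP.findall(cigar):
        if op == "I":
            net += int(length)
        elif op == "D":
            net -= int(length)
    return net


def causes_frameshift(cigar):
    """True when the net indel length is not a multiple of 3.

    Assumption: all indels lie within the coding region of interest.
    """
    return net_indel_length(cigar) % 3 != 0
```

For example, `causes_frameshift("50M2D98M")` is true (a 2 bp deletion), while `causes_frameshift("50M3I97M")` is false (a 3 bp insertion keeps the frame). Predicting where a premature stop codon then appears would still require translating the edited sequence.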
4
votes
1 answer

How to extract gene expression tables from this GEO dataset?

I've downloaded this GSE43013 dataset using GEOquery in R. My understanding is that it contains expression data from liver, kidney, and brain for multiple species. I would like to produce gene expression tables for each species and tissue type. The…
PollardMD
  • 79
  • 3
4
votes
1 answer

Attractor Landscape Analysis

I have come across a modeling toolbox, ATLANTIS, which is able to determine cell fates in silico based on the input models provided. This MATLAB-based toolbox is built on a method called "Attractor Landscape Analysis." I tried to find a paper…
4
votes
1 answer

Does the UCSC definition of telomeres make sense?

UCSC provides a database of telomere regions for each chromosome: ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gap.txt.gz It reported the telomere section in chr21 as: grep chr21 gap.txt | grep telomere 585 chr21 0 10000 1 N …
SmallChess
  • 2,699
  • 3
  • 19
  • 35
4
votes
3 answers

Error while loading a big file with Cytoscape

I installed Cytoscape 3.7.1 with Java 1.8.0_191 on Windows Server 2012. I have a 7.4 GB CSV file (about 1,500,000 records), and when I tried to load it into Cytoscape it threw the error java.lang.OutOfMemoryError: Java heap space. My system has…
user4219
  • 41
  • 2
4
votes
1 answer

Understanding PCHeatmap outputs

I am currently trying to understand the purpose of these PCHeatmaps, part of the Seurat package in R. All the online documentation I have found only highlights how they are used to observe heterogeneity in the data. This much I…
h3ab74
  • 836
  • 5
  • 14
4
votes
1 answer

Are there any Genome in a Bottle-like resources for non-humans (especially for invertebrates)?

Genome in a Bottle is an excellent resource that provides many types of DNA and RNA sequencing reads for a single individual/cell line to test genome assembly and analysis tools. For example there are Illumina WGS reads, Oxford Nanopore, PacBio, and…
conchoecia
  • 3,141
  • 2
  • 16
  • 40
4
votes
1 answer

Are duplicate variants against the VCF standard?

I have been trying to understand an error that the EBI's vcf_validator gives when run on my VCF file. Consider this minimal…
terdon
  • 10,071
  • 5
  • 22
  • 48
4
votes
1 answer

Making a bed file for RSeQC

I am making a BED file for RSeQC, so it can do things like compute the number of reads from exons, introns, 5' UTRs, etc. I want to use a BED file that corresponds to my GTF file, so I use gtf2bed to make a BED file, like this: awk '{ if ($0 ~…
Freek
  • 563
  • 4
  • 11
4
votes
1 answer

Help interpreting my phylogenetic tree construction of bacterial species in same genus

I constructed a phylogenetic tree using MOLE-BLAST, which created a neighbor-joining tree. I then used the alignment of those sequences with mine (1-43) in MEGA with the TN93+G+I model, but I don't understand how to interpret why my species in question…