Most Popular
1500 questions
7
votes
2 answers
Canu assembly not making a single consensus?
I've downloaded reads from this BioProject. Using canu with default parameters (no correction), I've got 4 contigs, none of which really look like the reference plasmid here.
The command I used was:
canu -p ip40a -d ip40a_assembly -useGrid=false…
thestatnoob
- 71
- 2
7
votes
1 answer
dividing genome into non-overlapping windows using R
I have nearly 10 million SNPs located on 10 chromosomes. I want to divide the genome into non-overlapping windows of 15, 20 and 30 kb. Here is part of my SNP table:
head (sap_ids)
snp_id chr pos
Chr01__15043 1 15043
…
Anna1364
- 516
- 2
- 8
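The windowing idea in this question can be sketched with integer division. The question asks for R; this is only a minimal Python illustration of the arithmetic, and the function names `window_index` and `count_snps_per_window` are hypothetical. It assumes 1-based positions and windows that start at position 1 of each chromosome.

```python
from collections import Counter


def window_index(pos, window_size):
    """Return the 0-based index of the non-overlapping window holding a 1-based position."""
    if pos < 1 or window_size < 1:
        raise ValueError("pos and window_size must be positive")
    return (pos - 1) // window_size


def count_snps_per_window(snps, window_size):
    """Count SNPs per (chromosome, window index) for an iterable of (chr, pos) pairs."""
    counts = Counter()
    for chrom, pos in snps:
        counts[(chrom, window_index(pos, window_size))] += 1
    return counts


# Chr01__15043 comes from the question's table; the other rows are made up for illustration.
example_snps = [(1, 15043), (1, 15100), (1, 100), (2, 15043)]
per_window = count_snps_per_window(example_snps, 15_000)
```

With 15 kb windows, window 0 covers positions 1 to 15000 and window 1 covers 15001 to 30000, so the SNP at position 15043 falls in window 1. Changing `window_size` to 20000 or 30000 gives the other window sets.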
7
votes
1 answer
How can I colour boxes in Gviz AnnotationTrack in R?
I'm learning the Gviz Bioconductor package and generate a plot as follows:
library(Gviz)
track <- AnnotationTrack(start=c(1,5,7), end=c(2,6,10), strand=c('*','*','*'), stacking="dense", showFeatureId=TRUE, id=c('red','blue',…
Chris_Rands
- 3,948
- 12
- 31
7
votes
2 answers
How to release an R package with several bed files?
I'm currently creating an R package, and my scripts require 1-2 bed files to be loaded before any analysis is done. Normally, I would run the scripts with the following:
library(data.table)
session_bed =…
ShanZhengYang
- 1,691
- 1
- 14
- 20
7
votes
1 answer
Coverage calculation: long reads (RNA-seq)
Say your aim is to calculate the coverage of an RNA-seq experiment generated with long-read sequencing (so, uneven read length).
Up to now, I relied on the Lander/Waterman equation:
$$C = \frac{L \cdot N}{G}$$
where:
$C$ = final coverage
$G$ = haploid…
aechchiki
- 2,676
- 11
- 34
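For reads of uneven length, the product of read length and read count in the formula above can be replaced by the sum of the individual read lengths. The sketch below shows only that arithmetic; it is not a recommendation for RNA-seq specifically, and the name `target_length` (standing in for the denominator of the formula) is an assumption, not taken from the question.

```python
def coverage_from_read_lengths(read_lengths, target_length):
    """Total sequenced bases divided by the length of the target being covered."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return sum(read_lengths) / target_length


def coverage_equal_reads(read_length, n_reads, target_length):
    """Equal-length special case: read_length * n_reads / target_length."""
    return coverage_from_read_lengths([read_length] * n_reads, target_length)


# Made-up read lengths, for illustration only.
example_coverage = coverage_from_read_lengths([1000, 2000, 3000], 3000)
```

When all reads have the same length, both functions return the same value, which is the form the original equation assumes.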
7
votes
3 answers
Retrieve detailed gene descriptions
Given a list of gene IDs, how do you retrieve the gene description, summary and other detailed information in R?
Peter
- 2,634
- 15
- 33
7
votes
2 answers
Waterman-Smith-Beyer Sequence Alignment
I am having trouble understanding the affine gap penalty in the following example -
I am not sure where the 3 and 4 come from or the 4 and 5 in cells 1,3 and 2,4. I'm not understanding how the affine gap penalty works here. I understand how the…
H5159
- 173
- 4
7
votes
3 answers
How to check whether all BAM reads contain defined read groups?
I'm trying to investigate whether there are errors within my BAM. After looking at the BAM header to see whether the read groups exist (using samtools, i.e.
samtools view -H file1.bam
Here, it appears that the header includes @RG tags. However, I…
ShanZhengYang
- 1,691
- 1
- 14
- 20
7
votes
1 answer
Why is this makeblastdb command not working?
I am trying to make local databases for the ncbi-blast+ package (version 2.6.0). I am doing so for 4 T-cell receptor genes. 3 of the 4 (TRAV, TRAJ, TRBV) have worked fine, but I am having problems with TRBJ. I say this because they should be the same…
TW93
- 449
- 3
- 11
7
votes
2 answers
variant calling on ChIP-seq style data: samtools mpileup with minimal filters
I am running samtools mpileup (v1.4) on a bam file with very choppy coverage (ChIP-seq style data). I want to get a first-pass list of positions with SNVs and their frequency as reported by the read counts, but no matter what I do, I keep getting…
719016
- 2,324
- 13
- 19
7
votes
6 answers
How to get a list of genes corresponding to the list of SNPs (rs ids)?
Is there a way for me to get a list of genes given a list of SNP rs ids? I found several questions asked with a similar goal years ago, and the answers are always about using multiple online tools, and/or R programming.
I wonder if there is a…
Haohan Wang
- 521
- 3
- 8
7
votes
2 answers
What is the difference between ENCODE Tier 1, 2 and 3 cell types?
The ENCODE Experiment Matrix at UCSC lists the different available cell types under the categories "Tier 1", "Tier 2" and "Tier 3". What is the difference between these classifications?
What, for example, makes GM12878 a Tier 1 cell type and A549 a…
juniper-
- 900
- 6
- 13
7
votes
1 answer
Server for finding kmers in set of sequences
Is there a server or website where I can submit a list of DNA/RNA sequences and get back the list of k-mer hits and the organisms in which they are found? I checked the Kraken website, but they don't have a web server for it.
I am specifically looking for…
719016
- 2,324
- 13
- 19
7
votes
2 answers
VCF merge containing CNV
How do I merge VCF files containing CNVs?
I use vcf-merge, a VCFtools utility, after compressing and indexing the files with bgzip and tabix (SAMtools), but I don't know if it is the right way.
Andrea Spinelli
- 129
- 6
7
votes
1 answer
How to represent a deletion at position 1 in a VCF file?
I am writing a small script to write a VCF file from short multiple sequence alignments, and stumbled into an interesting corner case.
This is the way a deletion should be represented in VCF format (I assume version 4+):
NC_000016.9 2138199…
mgalardini
- 977
- 7
- 18
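As far as I understand the VCF 4.x specification, a deletion is normally anchored on the base before the event, except at position 1 of a contig, where the REF allele has to include the base after the event instead. The sketch below illustrates that padding rule under this assumption; the function name `deletion_record` and its arguments are hypothetical and not taken from the question's script.

```python
def deletion_record(ref_seq, start, length):
    """Return (POS, REF, ALT) for deleting `length` bases from 1-based `start`.

    Pads with the preceding base, or with the following base when the
    deletion starts at position 1 (assumed VCF 4.x convention).
    """
    if start < 1 or length < 1 or start + length - 1 > len(ref_seq):
        raise ValueError("deletion lies outside the reference sequence")
    deleted = ref_seq[start - 1:start - 1 + length]
    if start == 1:
        if length >= len(ref_seq):
            raise ValueError("no base left after the deletion to anchor on")
        anchor = ref_seq[length]      # base immediately after the deleted bases
        return 1, deleted + anchor, anchor
    anchor = ref_seq[start - 2]       # base immediately before the deleted bases
    return start - 1, anchor + deleted, anchor


# Toy reference sequence, for illustration only.
first_base_deletion = deletion_record("ACGT", 1, 2)
internal_deletion = deletion_record("ACGT", 3, 1)
```

For the toy sequence `ACGT`, deleting the first two bases gives POS 1 with REF `ACG` and ALT `G`, while deleting the third base gives POS 2 with REF `CG` and ALT `C`.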