Most Popular
1500 questions
4
votes
2 answers
Writing a Perl script to hold information for two genes
Basically I have a Perl script in which I have an array (where each element of the array references a hash) and need to be able to print the array with a dumper function.
Thus I need to be able to split the $line on white spaces and save into…
code_pink
- 63
- 3
4
votes
4 answers
Two variants associated with same chromosomal position in genotype data for one individual
I'm trying to play around with my family's raw genetic data from Ancestry.com so that I can do some genetic analyses, and I noticed in the AncestryDNA raw data files that some regions of the genome have more than one variant associated with them and…
Sarah
- 486
- 1
- 4
- 18
4
votes
1 answer
How can I reproduce a manual NCBI search with Biopython Entrez module?
I'm trying to make a Biopython script that reproduces a manual search on the NCBI website.
My manual search gives me the following URL:
https://www.ncbi.nlm.nih.gov/nuccore?term=ANOS1[gene%20name]%20AND%20refseq[filter]
And the "Search details" box…
bli
- 3,130
- 2
- 15
- 36
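A minimal sketch of how the manual search URL quoted in the question above could be turned into arguments for Biopython, assuming the database is the URL's path component and the query is its `term` parameter. The helper names `search_params_from_url` and `run_esearch` are hypothetical and not from the question; `Entrez.esearch` and `Entrez.read` are Biopython calls, and the second helper needs Biopython plus network access. Matching the website's result count is not guaranteed by this sketch.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the question excerpt.
MANUAL_URL = (
    "https://www.ncbi.nlm.nih.gov/nuccore"
    "?term=ANOS1[gene%20name]%20AND%20refseq[filter]"
)


def search_params_from_url(url):
    """Return (database, term) encoded in an NCBI web-search URL.

    Hypothetical helper: assumes the path names the database and the
    query string carries a single `term` parameter.
    """
    parts = urlsplit(url)
    database = parts.path.strip("/")
    # parse_qs percent-decodes the value (%20 becomes a space).
    term = parse_qs(parts.query)["term"][0]
    return database, term


def run_esearch(database, term, email):
    """Submit the same query through Biopython's Entrez module.

    Hypothetical helper: requires Biopython and network access.
    """
    from Bio import Entrez  # lazy import so the parsing above works without Biopython

    Entrez.email = email  # NCBI asks clients to identify themselves
    handle = Entrez.esearch(db=database, term=term)
    record = Entrez.read(handle)
    handle.close()
    return record


if __name__ == "__main__":
    db, term = search_params_from_url(MANUAL_URL)
    print(db)
    print(term)
```

Comparing the `term` recovered here with the text in the website's "Search details" box is one way to check whether the two searches are really the same query.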
4
votes
2 answers
Can I stop my nanopore sequencing run if there are no more reads being produced?
I am sequencing a whole genome on MinION. I have used this flow cell for 6 hours for the phage lambda control. That gave me around 300,000 reads. I started a 48h sequencing run on the same flow cell after washing it. The sequencing has been going on…
Biomagician
- 2,459
- 16
- 30
4
votes
2 answers
What does it mean to over sample?
One of my studies (A) involves sequencing the microbiome. After selecting the variable region and the primers for the targeted region of the 16S sequence, the samples were sent to a platform.
For another study (B) we used the same strategy (same…
llrs
- 4,693
- 1
- 18
- 42
4
votes
1 answer
What are the differences between BNGL (BioNetGen Language) and SBML (Systems Biology Markup Language) formats?
Per request from meta comment.
I am self-learning about whole cell modeling, specifically An introduction to whole-cell modeling and Fundamentals of Systems Biology: From Synthetic Circuits to Whole-cell Models. While my background is decades of…
Guy Coder
- 315
- 1
- 9
4
votes
1 answer
Why is ribosomal RNA difficult to remove even with Poly(A) selection?
In this answer (actually in a comment), it is stated that:
As you've noticed from your own analysis, the ribosomal genes have quite variable expression across cells. They're expressed everywhere, and quite difficult to completely remove from a…
gc5
- 1,783
- 18
- 32
4
votes
2 answers
Filter bam using SNP list in bed format with minimum mapping quality and base quality
I have a bam file and a bed file that defines a list of SNPs. I would like to filter the bam file to contain only those reads with a minimum mapping quality that overlap at least one SNP with a minimum base quality.
Samtools seems to almost solve…
mattm
- 754
- 7
- 19
4
votes
2 answers
Exon-exon junctions: compare experimental transcripts to reference annotation
My aim is to parse an experimental transcript set (obtained by RNAseq) to check which splice junctions are already reported in a reference annotation and which ones are new.
I tried Regtools:
feed the experimental alignment file (BAM: aligned…
aechchiki
- 2,676
- 11
- 34
4
votes
2 answers
How does Li and Durbin's BWA paper compare alignment programs on real data?
Li and Durbin's "Fast and accurate short read alignment with Burrows-Wheeler transform", found here, says:
We evaluate the performance of BWA on ... real paired-end data by
checking the fraction of reads mapped in consistent pairs and by
…
5r9n
- 87
- 5
4
votes
1 answer
Counting the number of paralogues for mouse genes gives me the wrong frequency in R
I am trying to count the number of paralogues for the mouse homologues of the human protein-coding genes using BioMart. But, for example, for the 'PLIN4' gene it's counting 35,000 paralogues instead of 4.
We think it is because some genes have one to…
Jack Dean
- 49
- 1
4
votes
2 answers
How to calculate statistical significance of sequence motifs
I would like to show that certain types of RBP motifs are enriched in RNA editing islands (i.e. clusters of RNA editing). However, I am unsure about how to think about sequence motifs with respect to their occurrence in other genomic features.
I…
tweirick
- 171
- 1
- 7
4
votes
1 answer
Make ipyrad use CUDA-enabled NVIDIA card on Ubuntu
I want to use ipyrad on a new Ubuntu machine that has an NVIDIA Quadro K2000 card with 384 cores. One can configure ipyrad to run on a Linux cluster. Do I have any options to get ipyrad to access these cores?
I'm assuming that I will be able to…
Peter Pearman
- 183
- 1
- 6
4
votes
2 answers
Counting reads within the intron
I have paired-end RNA-seq samples and a list of intron coordinates in bed format. I want to count the intronic reads such that:
1. The reads should overlap the intron by at least 25 bp
2. If both the reads of the pair are within the intron, count it…
user3138373
- 420
- 1
- 5
- 13
4
votes
2 answers
Plot percentage of genome covered
Given an aligned BAM file (WGS/bwa-mem), what steps do I need to perform to generate a plot as seen below (from this paper):
We are trying to see how much of the genome was covered using PacBio and Illumina for cow.
You mean remove this left plot?
novicebioinforesearcher
- 771
- 1
- 6
- 15