Most Popular
1500 questions
5
votes
1 answer
Can someone help me estimate the runtime of the pipeline applied by the Vertebrate Genome Project?
The Vertebrate Genome Project (VGP) has a lot of interesting publications such as this one.
The rough pipeline is outlined below:
Here is the pipeline in more detail:
While the paper describes all the steps of the iterative assembly pipeline, I…
ilam engl
- 280
- 1
- 10
5
votes
0 answers
Simultaneously get data from multiple applied gates in flowCore
Using the Bioconductor flowCore package, I'm applying two parallel and non-overlapping gates to a gatingSet directly under "root":
library(flowCore)
# Import data
file.name <- system.file("extdata", "0877408774.B08", package = "flowCore")
x…
gaspanic
- 183
- 5
5
votes
0 answers
Parse RNA variant effect annotations ("r." format)
I've got annotations for splicing variants in a format like this (this is one variant):
Variant: NM_004092.3:c.88+5G>A
Effect: Retention; r.87_88ins1_88+10:p.(Ala31Glufs*23)
I want to extract which acceptor/donor sites are lost or gained (in this…
Ron
- 51
- 1
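A minimal sketch of how the two example strings in the question above could be split into fields, offered as a starting point rather than a full HGVS parser. The field layout is an assumption inferred from the single example shown; `parse_variant` and `parse_effect` are hypothetical helper names, and only simple substitutions with an optional intronic offset are handled.

```python
import re

# Assumed layout, inferred from the single example in the question:
#   Variant: <transcript>:c.<position><offset><ref>><alt>
#   Effect:  <effect type>; <r. description>:<p. description>
VARIANT_RE = re.compile(
    r"^(?P<transcript>[^:]+):c\.(?P<position>\d+)(?P<offset>[+-]\d+)?"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_variant(variant):
    """Split a simple coding-DNA substitution into its parts."""
    match = VARIANT_RE.match(variant)
    if match is None:
        raise ValueError(f"unsupported variant syntax: {variant!r}")
    fields = match.groupdict()
    fields["position"] = int(fields["position"])
    # An absent intronic offset is stored as 0 (the variant is exonic).
    fields["offset"] = int(fields["offset"] or 0)
    return fields


def parse_effect(effect):
    """Split an effect string into effect type, RNA part and protein part."""
    effect_type, _, description = effect.partition(";")
    rna, _, protein = description.strip().partition(":")
    return {"type": effect_type.strip(), "rna": rna, "protein": protein}
```

Working out which donor or acceptor site is lost or gained would still need the transcript's exon boundaries, which this sketch does not look up.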
5
votes
1 answer
Does SBOL support timing and threshold value parameters?
Like the propagation delay and threshold value parameters in electronic circuits, does the SBOL format support any such parameters or glyphs to represent the propagation delay (time to trigger the output protein once the input is provided) and threshold…
Hasan Baig
- 113
- 3
5
votes
3 answers
Removing duplicate FASTA sequences based on headers with Bash
I used the following command to remove duplicate FASTA sequences based on the header sequence:
paste -d $'\t' - -
aman
- 51
- 3
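For the header-based deduplication question above, a minimal Python sketch of the same idea, shown as an alternative and not as the asker's command (which is truncated in this listing). It keeps the first record seen for each header line; `dedupe_fasta` is a hypothetical helper name, and "duplicate" is assumed to mean an identical full header line.

```python
def dedupe_fasta(lines):
    """Yield FASTA lines, keeping only the first record for each header."""
    seen = set()
    keep = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A header is kept only the first time it appears.
            keep = line not in seen
            seen.add(line)
        if keep and line:
            yield line
```

Because it tracks record boundaries by the `>` character, this also works when a sequence is wrapped over several lines, whereas pairing lines with `paste - -` assumes exactly two lines per record. It can be applied to an open file handle, for example `for line in dedupe_fasta(open(path)): print(line)`, where `path` is a placeholder.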
5
votes
1 answer
How to use SBOL (Synthetic Biology Open Language)?
One of my students is working on the development of a SynBio tool to design and represent genetic circuits in a standardized format. I know there are some libraries out there.
What I am looking for is a hands-on tutorial on how to…
Hasan Baig
- 113
- 3
5
votes
1 answer
KeyError when getting features from a GenBank file with Biopython for some accessions but not others
I'm very new to Python, but I've been using it to extract the sequence of a gene from a GenBank file. The issue is that sometimes I get the output I want (the sequence is printed to a file) and sometimes it returns a KeyError. This depends on…
donna
- 53
- 2
5
votes
1 answer
Installing multiple Bioconductor packages at once
I was wondering if there is a more elegant way of installing and loading multiple packages in Bioconductor, similar to pacman with CRAN packages.
I tried:
# install and load the package manager
if (!requireNamespace("BiocManager", quietly = TRUE))
…
Sam
- 175
- 5
5
votes
2 answers
Do the kinship and inbreeding coefficients depend on the population frequency of an allele?
I am reading Section 5.2, Kinship and Inbreeding Coefficients, of Kenneth Lange, Mathematical and Statistical Methods for Genetic Analysis. There the kinship coefficient $\Phi_{i,j}$ is defined for two relatives $i$ and $j$ as the probability that a…
Hans
- 189
- 6
5
votes
2 answers
Calculating average coverage for .bam files (sequence data)
(Full disclosure: this is my first time working with sequence data and with Bash scripting.)
I need to calculate the average coverage for any .bam file.
After some searching I wrote the following script:
# Script to calculate the average…
Mirte
- 153
- 1
- 1
- 5
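For the coverage question above, a minimal sketch of the averaging step in Python. It is not the asker's script (which is truncated in this listing). It assumes the per-position depths have already been written out in the three-column, tab-separated form produced by `samtools depth` (reference, position, depth); `mean_depth` is a hypothetical helper name.

```python
def mean_depth(depth_lines):
    """Average the third column of three-column per-position depth output.

    Each line is expected to hold: reference name, position, depth.
    """
    total = 0
    positions = 0
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        total += int(fields[2])
        positions += 1
    # Return 0.0 for empty input instead of dividing by zero.
    return total / positions if positions else 0.0
```

Note that the result depends on which positions are listed: by default `samtools depth` omits positions with zero coverage, so the mean is taken over covered positions only, while the `-a` option includes every position.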
5
votes
0 answers
What exactly does each of InterPro, PANTHER, Pfam bring to the table individually in classifying a protein?
I would be very grateful if somebody could sketch out the methods Pfam and PANTHER use to assign a family to a given protein and how they differ. My (cursory) understanding is that InterPro pools databases like Pfam and PANTHER and provides…
rtviii
- 364
- 1
- 7
5
votes
1 answer
Are GSEA and other gene set enrichment analyses supposed to yield extremely different results?
I recently ran four gene set enrichment analyses in R on the same database (TCGA: breast cancer), comparing two intrinsic subtypes. The methods I used were:
MIGSA, which imports the mGSZ package and combines it with an SEA algorithm. Using RNA-seq TMM…
Darío Rocha
- 53
- 4
5
votes
2 answers
Economist article on coronavirus
I am wondering about an article in the Economist here:
https://www.economist.com/briefing/2020/02/29/covid-19-is-now-in-50-countries-and-things-will-get-worse
There is a graph there
The explanation is as follows
The course of an epidemic is shaped…
onyourmark
- 59
- 1
5
votes
1 answer
Snakemake: Migrating from deprecated cluster.json to new profiles.yaml
I am an avid user of Snakemake. Recently we have been refreshing our pipelines and I saw that a cluster.json file is no longer the recommended way to store the cluster configuration.
I used to start my pipelines like this:
snakemake --cluster-config…
Freek
- 563
- 4
- 11
5
votes
3 answers
How to translate amino acid sequences to Nucleotide sequences
I want to convert a list of FASTA protein sequences in a .text file into the corresponding nucleotide sequences. A Google search gives me results for DNA-to-protein conversion but not the reverse. Also, I came across How do I find the nucleotide…
user3289492
- 51
- 1
- 2
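For the back-translation question above, a minimal sketch in Python. Because the genetic code is degenerate (most amino acids are encoded by more than one codon), the original nucleotide sequence cannot be recovered from the protein alone. This sketch simply picks one valid codon per amino acid from the standard genetic code; the choice of codon is arbitrary and is an assumption of the example, as is the helper name `back_translate`.

```python
# One arbitrarily chosen codon per amino acid (standard genetic code).
# Other synonymous codons would be equally valid back-translations.
CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTT", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCT",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
    "*": "TAA",  # stop codon
}


def back_translate(protein):
    """Return one possible DNA sequence encoding the given protein."""
    codons = []
    for residue in protein.upper():
        if residue not in CODON_FOR:
            raise ValueError(f"unknown residue: {residue!r}")
        codons.append(CODON_FOR[residue])
    return "".join(codons)
```

If the actual coding sequence is needed instead of just one possible encoding, it has to be looked up from the corresponding nucleotide record; it cannot be computed from the protein.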