Most Popular

1500 questions
5
votes
1 answer

Can someone help me estimate the runtime of the pipeline applied by the vertebrate genome project?

The vertebrate genome project (VGP) has a lot of interesting publications such as this one. The rough pipeline is outlined below: Here is the pipeline in more detail: While the paper describes all the steps of the iterative assembly pipeline, I…
ilam engl
  • 280
  • 1
  • 10
5
votes
0 answers

Simultaneously get data from multiple applied gates in flowCore

Using the Bioconductor flowCore package, I'm applying two parallel and non-overlapping gates to a gatingSet directly under "root": library(flowCore) # Import data file.name <-system.file("extdata","0877408774.B08",package="flowCore") x…
gaspanic
  • 183
  • 5
5
votes
0 answers

Parse RNA variant effect annotations ("r." format)

I've got annotations for splicing variants in a format like this (this is one variant): Variant: NM_004092.3:c.88+5G>A Effect: Retention; r.87_88ins1_88+10:p.(Ala31Glufs*23) I want to extract which acceptor/donor sites are lost or gained (in this…
Ron
  • 51
  • 1
5
votes
1 answer

Does SBOL support timing and threshold value parameters?

Like the propagation delay and threshold value parameters in electronic circuits, does the SBOL format support any such parameters or glyphs to represent the propagation delay (time to trigger the output protein once the input is provided) and threshold…
Hasan Baig
  • 113
  • 3
5
votes
3 answers

Removing duplicate FASTA sequences based on headers with Bash

I used the following command to remove duplicate FASTA sequences based on the header sequence: paste -d $'\t' - -
aman
  • 51
  • 3
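A minimal Python sketch of the same idea as the question above (dropping FASTA records whose header has already been seen). It is not the asker's `paste` command; the function name `dedupe_fasta_by_header` and the example records are hypothetical, and it assumes the whole header line after `>` is the deduplication key and that the first record for each header is the one to keep.

```python
def dedupe_fasta_by_header(lines):
    """Keep the first record for each header; drop later records with the same header."""
    seen = set()          # headers encountered so far
    kept = []             # output lines, in original order
    keep_current = False  # whether the record being read should be kept
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            header = line[1:].strip()
            keep_current = header not in seen
            seen.add(header)
        # Sequence lines follow the decision made for their header;
        # blank lines and anything before the first header are skipped.
        if keep_current and line:
            kept.append(line)
    return kept


# Hypothetical example: the second "seq1" record (two sequence lines) is dropped.
example = [">seq1", "ACGT", ">seq2", "GGCC", ">seq1", "ACGT", "TTTT"]
deduped = dedupe_fasta_by_header(example)
print("\n".join(deduped))
```

Because the function tracks headers rather than pairing lines, it also handles multi-line sequences, which a fixed two-lines-per-record approach would not.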
5
votes
1 answer

How to use SBOL (Synthetic Biology Open Language)?

One of my students is working on the development of a SynBio tool to design and represent genetic circuits in a standardized format. I know there are some libraries out there. What I am interested in finding is a hands-on tutorial on how to…
Hasan Baig
  • 113
  • 3
5
votes
1 answer

KeyError when getting features from a GenBank file with Biopython with some accessions but not others

I'm very new to Python, but I've been using it to extract the sequence of a gene from a GenBank file. The issue is that sometimes I'll get the output I want (prints the sequence to a file) and sometimes it will return a key error. This depends on…
donna
  • 53
  • 2
5
votes
1 answer

Installing multiple Bioconductor packages at once

I was wondering if there is a more elegant way of installing and loading multiple packages in Bioconductor, similar to pacman with CRAN packages. I tried: # install and load the package manager if (!requireNamespace("BiocManager", quietly = TRUE)) …
Sam
  • 175
  • 5
5
votes
2 answers

Do the kinship and inbreeding coefficients depend on the population frequency of an allele?

I am reading Section 5.2, Kinship and Inbreeding Coefficients, of Kenneth Lange, Mathematical and Statistical Methods for Genetic Analysis. There the kinship coefficient $\Phi_{i,j}$ is defined for two relatives $i$ and $j$ as the probability that a…
Hans
  • 189
  • 6
5
votes
2 answers

Calculating average coverage for .bam files (sequence data)

(Full disclosure: this is my first time working with sequence data and with Bash scripting.) I need to calculate the average coverage for any .bam file. After some searching I wrote the following script: # Script to calculate the average…
Mirte
  • 153
  • 1
  • 1
  • 5
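A rough Python sketch of the averaging step behind the question above. It is not the asker's script, which is truncated in the excerpt. It assumes per-position depth is already available as three-column text (reference name, position, depth), the layout that `samtools depth` prints; the function name `mean_depth` and the example lines are hypothetical.

```python
def mean_depth(depth_lines):
    """Average the third column (depth) over all well-formed input lines."""
    total = 0
    positions = 0
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        total += int(fields[2])
        positions += 1
    # Avoid dividing by zero when no positions were reported.
    return total / positions if positions else 0.0


# Hypothetical three-position example.
example = ["chr1\t1\t10", "chr1\t2\t20", "chr1\t3\t30"]
average = mean_depth(example)
print(average)
```

Note that the result is the mean over the positions present in the input. If zero-coverage positions are omitted from the depth output (the `samtools depth` default unless `-a` is given), the average covers only positions with at least one read.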
5
votes
0 answers

What exactly does each of InterPro, PANTHER, Pfam bring to the table individually in classifying a protein?

I would be very grateful if somebody could sketch out the methods Pfam and PANTHER use to assign a family to a given protein and how they differ. My (cursory) understanding is that InterPro pools databases like Pfam and PANTHER and provides…
rtviii
  • 364
  • 1
  • 7
5
votes
1 answer

Are GSEA and other gene set enrichment analyses supposed to yield extremely different results?

I recently ran four gene set enrichment analyses in R on the same database (TCGA: breast cancer), comparing two intrinsic subtypes. The methods I used were: MIGSA, which imports the mGSZ package and combines it with a SEA algorithm. Using RNA-seq TMM…
5
votes
2 answers

Economist article on coronavirus

I am wondering about an article in the Economist here: https://www.economist.com/briefing/2020/02/29/covid-19-is-now-in-50-countries-and-things-will-get-worse There is a graph there. The explanation is as follows: The course of an epidemic is shaped…
onyourmark
  • 59
  • 1
5
votes
1 answer

Snakemake: Migrating from deprecated cluster.json to new profiles.yaml

I am an avid user of Snakemake. Recently we have been refreshing our pipelines and I saw that a cluster.json file is no longer the recommended way to store the cluster configuration. I used to start my pipelines like this: snakemake --cluster-config…
Freek
  • 563
  • 4
  • 11
5
votes
3 answers

How to translate amino acid sequences to nucleotide sequences

I want to convert a list of FASTA protein sequences in a .text file into the corresponding nucleotide sequences. A Google search gives me results for DNA-to-protein conversion but not vice versa. Also, I came across How do I find the nucleotide…
user3289492
  • 51
  • 1
  • 2
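A minimal Python sketch of one way to read the question above. Because the genetic code is degenerate, a protein sequence does not determine a unique nucleotide sequence, so this only produces one possible coding sequence by picking a single, arbitrarily chosen codon per amino acid from the standard genetic code. The table `CODON_FOR` and the function `back_translate` are hypothetical names, and the result is not the original gene sequence.

```python
# One arbitrarily chosen codon per amino acid (standard genetic code).
CODON_FOR = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
    "*": "TAA",  # stop
}


def back_translate(protein):
    """Return one possible DNA sequence encoding the given protein string."""
    codons = []
    for residue in protein.upper():
        if residue not in CODON_FOR:
            # Ambiguity codes such as X or B have no single codon.
            raise ValueError(f"no codon defined for residue {residue!r}")
        codons.append(CODON_FOR[residue])
    return "".join(codons)


# Hypothetical example: Met-Ala-Trp followed by a stop.
dna = back_translate("MAW*")
print(dna)  # ATGGCTTGGTAA
```

Recovering the actual nucleotide sequence behind a protein record requires looking up the corresponding coding sequence in a database rather than back-translating.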