Most Popular
1500 questions
7
votes
3 answers
Importing GFF file with Biopython
Is there a way to import a GFF file for an organism with Biopython in the same way you can for a Genbank file?
For example,
from Bio import Entrez as ez
ez.email = '...'
handle = ez.efetch(db='gene', id='AE015451.2', rettype='genbank',…
Alex Summers
- 73
- 4
7
votes
4 answers
What is a simple command-line tool for Needleman-Wunsch pairwise alignment?
I have two DNA strings:
GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC
and
AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG
I want a tool that allows me to do something like this on the command line:
$ aligner …
winni2k
- 2,266
- 11
- 28
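For readers unfamiliar with the algorithm named in the question above, here is a minimal sketch of Needleman-Wunsch global alignment in pure Python. It is not one of the command-line tools the asker is looking for; the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen only for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative
    assumptions, not defaults of any particular tool.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # match or mismatch
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s1 = "GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC"
    s2 = "AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG"
    total, top, bottom = needleman_wunsch(s1, s2)
    print(total)
    print(top)
    print(bottom)
```

The table fill takes time and memory proportional to the product of the two sequence lengths, which is fine for short strings like the two in the question but not for long reads or genomes.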
7
votes
1 answer
ANY (technical) reason behind submitting sequences to GenBank versus ENA Sequence
The DNA sequence sections of the three INSDC databases (i.e., DDBJ, ENA Sequence and GenBank) are synchronized periodically and strive to keep their stored data as ubiquitously accessible as possible. Except for idiosyncrasies in their data…
Michael Gruenstaeudl
- 203
- 1
- 6
7
votes
1 answer
Gene not found in Affymetrix expression profiles
I am studying the ABA network in A. thaliana, consisting of HB7, ABI1 and AREB2. The AGI codes I was given are, respectively, AT2G46680, AT4G26080 and AT1G45249.
I downloaded the following file in order to convert the array element name to the AGI…
wrong_path
- 391
- 1
- 7
7
votes
1 answer
Simulating 3' end tag-based scRNA-seq reads
Are there any tools that will simulate 3' end tag-based single-cell RNA-seq reads? For example, Drop-seq, 10X Chromium, CEL-seq?
There are tools that simulate scRNA-seq gene count data (e.g. Splatter), but I can't find anything that will simulate…
merv
- 651
- 5
- 15
7
votes
4 answers
Bash scripting FastQC for multiple fastq files in multiple directories
I am completely new to bioinformatics so I'm looking to learn how to do this.
I have multiple directories with fastq files, e.g., 10 directories for each time series, each with treatment and control directories, each with rep1, rep2 and rep3.
For…
Ryan Carter
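As a rough illustration of the nested layout described in the question above, the following sketch walks a directory tree and builds one FastQC command per fastq file. It is written in Python rather than Bash, and it only constructs the commands; it does not run them. The function name `fastqc_commands`, the file suffixes, and the `fastqc_out` output location are assumptions, and the excerpt is truncated, so the asker's exact requirements are unknown.

```python
from pathlib import Path

# Suffixes treated as fastq files; an assumption about how the files are named.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def fastqc_commands(root, out_root="fastqc_out"):
    """Return a list of (output_dir, command) pairs, one per fastq file under root.

    The output directory mirrors the input tree, so files with the same name
    in different directories (e.g. rep1 in several time points) do not collide.
    """
    root = Path(root)
    out_root = Path(out_root)
    jobs = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name.endswith(FASTQ_SUFFIXES):
            out_dir = out_root / path.parent.relative_to(root)
            # `fastqc --outdir DIR FILE` writes the report into DIR.
            jobs.append((out_dir, ["fastqc", "--outdir", str(out_dir), str(path)]))
    return jobs


if __name__ == "__main__":
    # "data" is a hypothetical top-level directory holding the time series.
    for out_dir, cmd in fastqc_commands("data"):
        print(" ".join(cmd))
```

To actually run the jobs, each output directory would need to be created first (for example with `out_dir.mkdir(parents=True, exist_ok=True)`), because FastQC expects the directory given to `--outdir` to exist, and each command could then be passed to `subprocess.run`.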
7
votes
1 answer
Sequence alignment using BWT
My Problem:
Skipping some specific background, what I want to do is judge whether some soft-clipped sequences are the same, which may result from the same SV event. The colored bases in Fig. 1 are an example of soft-clipped sequences.
I use BWA-MEM as…
CodeUnsolved
- 71
- 3
7
votes
1 answer
How can HISAT2/StringTie report decimal coverage values
I have performed RNA-seq analysis using the HISAT2 and StringTie workflow suggested in "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown".
Some of the transcripts/exons have decimal coverage values (e.g., 1.1…
pogibas
- 203
- 1
- 5
7
votes
2 answers
Converting Gene Symbol to Ensembl ID in R
I'm trying to convert ~20,000 different human gene symbols to Ensembl IDs. I've been trying to use biomaRt to do this, but I keep getting the following error:
getBM( attributes=c("ensembl_gene_id") , filters= "mgi_symbol" ,mart=ensembl)
Error in…
Mea R.
- 71
- 1
- 1
- 2
7
votes
1 answer
Source for whole genome comparisons at NCBI Genomes
The NCBI Genomes database has these dendrograms for (presumably) whole genome comparisons for certain species, e.g. Pseudomonas aeruginosa or Escherichia coli.
How were these comparisons done? Does anyone know the source or paper?
Peter Menzel
- 443
- 4
- 9
7
votes
2 answers
What is the correct way to map Hi-C data with bwa mem?
Library Prep
I have a Hi-C library prepped using an enzyme that cuts at GATC, so it leaves GATCGATC as the junction sequences. This library was sequenced on a 2x150 PE Illumina run.
Data Pre-processing
The reads were adapter and quality trimming…
conchoecia
- 3,141
- 2
- 16
- 40
7
votes
2 answers
How are snakemake's --cluster and --drmaa options implemented?
I'm fairly new to snakemake and I'm trying to understand the difference between the --cluster and --drmaa flags, both of which are used to submit jobs to compute clusters/nodes.
The docs give a few hints about the advantages of using --drmaa…
Chris_Rands
- 3,948
- 12
- 31
7
votes
2 answers
Are gene names the same across species?
I have a bunch of gene names from Apis mellifera (specifically 194). I used these gene names as input to the STRING database to create a network for Drosophila melanogaster. 54 of those genes were also present in D. melanogaster and I received a second…
The Last Word
- 297
- 1
- 7
7
votes
1 answer
Using pysam with cython: htslib/kstring.h not found
I'm trying to learn to use cython to speed up some code based on pysam. My issue is not, strictly speaking, about bioinformatics, but rather about building tools using a bioinformatics library. I hope this is still relevant on this site (I hesitated…
bli
- 3,130
- 2
- 15
- 36
7
votes
3 answers
Plotting two heatmaps with the same order of genes
I expected the same pattern, but I am not able to compare the patterns because the order of genes does not seem to be the same. How can I produce two heatmaps in which the order of genes is the same, so that I would be able to compare the block of yellow…
Zizogolu
- 2,148
- 11
- 44