Most Popular

1500 questions
7
votes
3 answers

Importing GFF file with Biopython

Is there a way to import a GFF file for an organism with Biopython in the same way you can for a Genbank file? For example, from Bio import Entrez as ez ez.email = '...' handle = ez.efetch(db='gene', id='AE015451.2', rettype='genbank',…
7
votes
4 answers

What is a simple command-line tool for doing Needleman-Wunsch pairwise alignment?

I have two DNA strings: GGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGC and AGAAGGAGGGAGGGAAGAGGAGGGAGAAGGAGGGAGGGAAGAGGAGG I want a tool that allows me to do something like this on the command line: $ aligner …
winni2k
7
votes
1 answer

ANY (technical) reason behind submitting sequences to GenBank versus ENA Sequence

The DNA sequence sections of the three INSDC databases (i.e., DDBJ, ENA Sequence and GenBank) are synchronized periodically and strive to keep their stored data as ubiquitously accessible as possible. Except for idiosyncrasies in their data…
7
votes
1 answer

Gene not found in Affymetrix expression profiles

I am studying the ABA network in A. thaliana, consisting of HB7, ABI1 and AREB2. The AGI codes I was given are, respectively: AT2G46680, AT4G26080 and AT1G45249. I downloaded the following file in order to convert the array element name to the AGI…
wrong_path
7
votes
1 answer

Simulating 3' end tag-based scRNA-seq reads

Are there any tools that will simulate 3' end tag-based single-cell RNA-seq reads? For example, Drop-seq, 10X Chromium, CEL-seq? There are tools that simulate scRNA-seq gene count data (e.g. Splatter), but I can't find anything that will simulate…
merv
7
votes
4 answers

Bash scripting FastQC for multiple fastq files in multiple directories

I am completely new to bioinformatics, so I'm looking to learn how to do this. I have multiple directories with fastq files, e.g., 10 directories, one for each time series, each with treatment and control directories, each with rep1, rep2 and rep3. For…
Ryan Carter
7
votes
1 answer

Sequence alignment using BWT

My Problem: Skipping some specific background, what I want to do is judge whether some soft-clipped sequences are the same, which may result from the same SV event. The colored bases in Fig. 1 are an example of soft-clipped sequences. I use BWA-MEM as…
7
votes
1 answer

How can HISAT2/StringTie report decimal coverage values

I have performed RNA-seq analysis using the HISAT2 & StringTie workflow suggested in: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Some of the transcripts/exons have decimal coverage values (e.g., 1.1…
pogibas
7
votes
2 answers

Converting Gene Symbol to Ensembl ID in R

I'm trying to convert ~20,000 different human gene symbols to Ensembl IDs. I've been trying to use biomaRt to do this, but I keep getting the following error: getBM( attributes=c("ensembl_gene_id") , filters= "mgi_symbol" ,mart=ensembl) Error in…
Mea R.
7
votes
1 answer

Source for whole genome comparisons at NCBI Genomes

The NCBI Genomes database has these dendrograms for (presumably) whole genome comparisons for certain species, e.g. Pseudomonas aeruginosa or Escherichia coli. How were these comparisons done? Does anyone know the source or paper?
Peter Menzel
7
votes
2 answers

What is the correct way to map Hi-C data with bwa mem?

Library Prep: I have a Hi-C library prepped using an enzyme that cuts at GATC, so it leaves GATCGATC as the junction sequences. This library was sequenced on a 2x150 PE Illumina run. Data Pre-processing: The reads were adapter and quality trimming…
conchoecia
7
votes
2 answers

How are snakemake's --cluster and --drmaa options implemented?

I'm fairly new to snakemake and I'm trying to understand the difference between the --cluster and --drmaa flags, both of which are used to submit jobs to compute clusters/nodes. The docs give a few hints about the advantages of using --drmaa…
Chris_Rands
7
votes
2 answers

Are gene names the same across species?

I have a bunch of gene names from Apis mellifera (specifically 194). I used these gene names as input to the STRING database to create a network for Drosophila melanogaster. 54 of those genes were also present in D. melanogaster and I received a second…
The Last Word
7
votes
1 answer

Using pysam with cython: htslib/kstring.h not found

I'm trying to learn to use cython to speed up some code based on pysam. My issue is not strictly speaking about bioinformatics, but rather about building tools using a bioinformatics library. I hope this is still relevant on this site (I hesitated…
bli
7
votes
3 answers

Plotting two heatmaps with the same order of genes

I expected the same pattern, but I am not able to compare the patterns because the order of genes does not seem to be the same. How can I produce two heatmaps in which the order of genes is the same, so that I would be able to compare the block of yellow…
Zizogolu