Most Popular
1500 questions
4
votes
1 answer
Can I submit an R package to Bioconductor or CRAN if I have already published it in a journal?
I have written a bioinformatics package in R that I want to publish in a bioinformatics journal. Presently, I am maintaining a local repo of that package and I want to put it in the Bioconductor repository (after publication or at least after…
Wasim Aftab
- 205
- 1
- 4
4
votes
2 answers
Is the quality score of fastq used somewhere besides trimming/fastqc?
In the fastq format, every 4k+4-th line contains the position-wise quality score (ASCII-encoded):
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
...
which program…
Paul
- 327
- 2
- 8
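For the FASTQ excerpt above, a minimal sketch of how the ASCII-encoded quality line can be turned into per-base Phred scores. It assumes the Phred+33 (Sanger / Illumina 1.8+) encoding, which the excerpt does not state; the function names `phred_scores` and `error_probability` are made up for illustration.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into integer Phred scores.

    The offset of 33 is an assumption (Phred+33); older Illumina
    pipelines used an offset of 64.
    """
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Convert a Phred score Q into the base-call error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Quality line copied from the excerpt above.
quality = "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65"
scores = phred_scores(quality)

# One score per base: '!' decodes to 0 (lowest), 'C' to 34.
for ch, q in list(zip(quality, scores))[:5]:
    print(ch, q, error_probability(q))
```

Aligners, variant callers and duplicate-marking tools can consume these per-base values in different ways, so the decoding itself is the only part this sketch claims to show.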
4
votes
1 answer
Interpretation of my coronavirus 2019-nCoV, Wuhan, China BLAST tree?
This is the BLAST tree of the latest coronavirus out of China (from Wuhan Institute of Virology, China). It seems strange that there is so much divergence from all the other coronaviruses. Is this expected of new diseases?
Loopy
- 41
- 1
4
votes
3 answers
Existing tool for converting gff3 to genbank (gbk)
I want to convert my gff3 annotation files to genbank format for use in Mauve. I found the seqret tool here https://www.ebi.ac.uk/Tools/sfc/emboss_seqret/ which can perform this task, but my files (bacterial genomes) are too big.
Can anyone suggest…
Mark
- 143
- 1
- 8
4
votes
2 answers
What is the meaning of the "*" character in bwa fastmap's output?
I am mapping kmers back to a few bacterial genomes using bwa fastmap:
bwa fastmap -l 9 ref.fasta kmers.fasta > out.fastmap
[M::bwa_idx_load_from_disk] read 0 ALT contigs …
mgalardini
- 977
- 7
- 18
4
votes
0 answers
Low Fraction of usable antibody reads in CiteSeq
We performed a combined gene expression and CiteSeq experiment with the 10x VDJ kit and 20 conjugated antibodies and sequenced on a HiSeq. I used cellranger to process the sequencing output. The cellranger summary shows overall good values except for…
gypti
- 41
- 1
4
votes
1 answer
How could I match atom orders between a .mol2 and a .pdb?
While drawing the conformation of an Autodock result, I noticed that the pdbqt Autodock generated from the dlg has some structural problems (e.g. a missing benzene ring), and I want to correct this. However, I noticed that the atom order of the pdbqt doesn't match…
march_happy
- 143
- 5
4
votes
1 answer
How to quickly and robustly convert between mmCIF and PDB?
There is already a question on PDB/CIF to MMTF; however, what is a robust way to programmatically go between PDB and CIF files?
For example, I can use a Python script from this gist that relies on Biopython.
id="4ckh"
wget…
James
- 409
- 2
- 13
4
votes
1 answer
Is there a modern alignment tool tailored for transmembrane regions?
I am looking for a project or tool that allows programmatic pairwise alignments of proteins but that takes care with transmembrane regions of proteins. TM regions are traditionally too information poor for many tools to align accurately (Wong et al,…
James
- 409
- 2
- 13
4
votes
2 answers
What is a samtools mpileup reference skip?
The samtools documentation for mpileup states:
At this column, a dot stands for a match to the reference base on the forward strand, a comma for a match on the reverse strand, a '>' or '<' for a reference skip
...
Similarly, a pattern…
mattm
- 754
- 7
- 19
4
votes
1 answer
How does Picard's MarkDuplicates handle unmapped reads?
Our BAM files are created according to a "lossless" alignment procedure [1] from the Broad Institute GATK documentation, which involves re-adding the unaligned/unmapped reads into an aligned BAM using Picard's MergeBamAlignment.
The BAM files are…
init_js
- 319
- 2
- 9
4
votes
2 answers
Extracting a cytochrome B sequence from NCBI's nucleotide database
Can someone tell me how to extract the fasta sequences for the gene cytb of Acetes japonicus (a shrimp important to China and South Korea)?
Can I extract them directly from NCBI nucleotide database (i.e. nuccore)?
For instance, I'm trying to fetch…
Sofia
- 351
- 2
- 7
4
votes
3 answers
awk working with large files
I have two very large VCF files (2 GB and 6 GB). I want to look for unique combinations of CHROM and POS and output the row that matches. However, because the files are so large, my machine always hangs and stops processing. Is there a way to work around…
user11766958
- 189
- 1
- 7
4
votes
3 answers
Bootstrap analysis - why is it called bootstrap?
Bootstrap analysis, bootstrapping, etc. are quite common jargon in bioinformatics and phylogenetics.
However, it is not very clear to us what exactly is meant by "a boot's straps".
Does it mean a "comparison"? (Such as we hold the two…
user286
4
votes
1 answer
Software to produce a table of post-translational modifications from a peptide list
Does anyone know if there is a program/library/script in R or Python that takes as input a list of proteins/peptides and a list of post-translational modifications (PTMs; like oxidation of methionine and acetylation of cysteine), and as output…
J. Doe
- 575
- 3
- 11