Most Popular
1500 questions
4
votes
1 answer
Biopython: resseq doesn't match pdb file
I have a PDB file, and I need to extract its residue sequence numbers (resSeq's). Based on manual inspection of the first few lines of the PDB file (pasted below), I would think that resSeq's should begin with 22, 23. However, Biopython's PDB module…
GingerBadger
- 191
- 1
4
votes
2 answers
Why are Minimap2 alignments different with CIGAR generation flag?
I am using Minimap2 (v2.26-r1175) in Linux to generate a sequence alignment between the Streptomyces coelicolor A3(2) chromosome (ref.fa) and the Mycobacterium tuberculosis chromosome (query.fa). My desired output is a PAF (Pairwise mApping Format)…
Gawain
- 315
- 1
- 10
4
votes
2 answers
BioPython bootstrap is not reliable?
Here I will show a minimal working example of the code; as you can see, the support values for the tree are always 100.
I am using synthetic sequences of 100bp for 6 elements. The sequences have been generated at random choosing from ATCG for each…
Mirko
- 257
- 5
4
votes
2 answers
RNAseq: Z score, Intensity, and Resources
I'm very new to bioinformatics in general, and I'm trying to understand some basic concepts.
I have RNAseq data, and bioinformatics people tell me that intensities cannot be compared across patients. So there are all of these pipelines to compare…
julianstanley
- 401
- 3
- 9
4
votes
2 answers
How to create a phylogenetic tree from diverse mitochondrial genomes
I would like to create a phylogenetic tree for most of the species in my dataset.
I'm starting with around 1200 species, but since it's not good practice to align short and long sequences I tried filtering only for the species with the [15k-18k bp]…
Mirko
- 257
- 5
4
votes
1 answer
How can I incorporate wildcards in Snakemake in an R script?
I am facing the following issue:
I have a rule in Snakemake that looks something like this:
rule somerule:
    input:
        tables = expand("results/{tables}_table.txt", tables = ["1", "2"]),
    output:
        edit = expand("results/{tables}_edited.txt", tables…
Classy Q
- 43
- 2
4
votes
1 answer
Asterisk (*) calculation method
Does anyone here know how to calculate a value for the asterisk (*) code that appears in substitution matrices?
From my observation, every pair containing one asterisk is assigned the lowest value in the matrix, and the pair of two asterisks (**) is assigned 1. Example: BLOSUM62.
But,…
maciejwww
- 227
- 1
- 14
4
votes
2 answers
Where to download a file with major and minor alleles at every position?
I want a list of all variants, i.e. sites that are known to vary from human to human. For example, it should ideally cover all sites in here, but without samples.
I don't want a giant reference panel with many samples, nor a .fa which has no…
BigMistake
- 543
- 11
4
votes
0 answers
Linking GenBank records to biosamples (and vice versa) using edirect
Assume that I wish to find all complete human mitochondrial genome records on GenBank (or rather, NCBI nuccore) that also have an entry in NCBI's Biosample database.
MYQUERY="(mitochondrion[TITLE] OR mitochondrial[TITLE]) \
AND complete…
Michael Gruenstaeudl
- 203
- 1
- 6
4
votes
1 answer
How to get cytoband and gene level copy number from genome wide SNP array copy number data?
I have (human) Illumina genome wide SNP array copy number data. For each SNP genomewide, I have Log R Ratio (LRR) and B Allele Frequency (BAF).
What tool(s) can I use to get the integer copy numbers (either -3 to +3 or 0 to inf) for each cytoband…
Sylvia Rodriguez
- 257
- 1
- 10
4
votes
2 answers
How to analyse qualitatively the penetration ability of particles in spheroids using fluorescent z-stacks?
I have to establish the penetration profile of particles in tumour spheroids.
For this I have spheroids composed of cells which were exposed to particles, both of which are fluorescently labeled. The spheroids were then imaged using a confocal…
Timi
- 51
- 3
4
votes
2 answers
Question about umap using different numbers of pca components as initialization
I am new to the scRNA-seq field and I have been experimenting with UMAP visualization using different numbers of PCA components for initialization. The process involves projecting scRNA-seq data (count matrix) onto various numbers of PCA…
Zack
- 43
- 3
4
votes
1 answer
Can not launch bcftools using python's subprocess module, as it only accepts first command of commands list
I am trying to remove samples from a chromosome VCF file. I wrote a function that takes a chromosome number and a list of samples to remove. When I try to run bcftools using the subprocess module, it only runs bcftools, as if I was running…
YKY
- 171
- 5
4
votes
0 answers
MergeBamAlignment error
I am aligning samples following the GATK pipeline and running MergeBamAlignment like this:
MergeBamAlignment \
    -ALIGNED $path/file.unsorted.bam \
    -UNMAPPED $path/file.unmapped.bam \
    -O $path/file.merged.bam \
    -R…
Rita Soares
- 101
- 2
4
votes
1 answer
blasting a refseq protein does not show the protein in the result set
Can anyone explain why a BLAST search does not find a specific protein that was previously taken from the NCBI RefSeq database?
Specifically, I was trying to blast the protein with the accession number "NP_420767" and its sequence, respectively, however…
Matthias F.
- 43
- 3