Highest Voted Questions - Bioinformatics Stack Exchange

4

votes

1 answer

Biopython: resseq doesn't match pdb file

I have a PDB file, and I need to extract its residue sequence numbers (resSeq's). Based on manual inspection of the first few lines of the PDB file (pasted below), I would think that resSeq's should begin with 22, 23. However, Biopython's PDB module…

asked Aug 02 '17 at 17:24

GingerBadger

191
1

4

votes

2 answers

Why are Minimap2 alignments different with CIGAR generation flag?

I am using Minimap2 (v2.26-r1175) in Linux to generate a sequence alignment between the Streptomyces coelicolor A3(2) chromosome (ref.fa) and the Mycobacterium tuberculosis chromosome (query.fa). My desired output is a PAF (Pairwise mApping Format)…

asked Dec 12 '23 at 15:21

Gawain

315
1
10

4

votes

2 answers

BioPython bootstrap is not reliable?

Here I will show you a minimal working example of code; as you can see, the support values for the tree are always 100. I am using synthetic sequences of 100 bp for 6 elements. The sequences have been generated at random, choosing from ATCG for each…

asked Dec 05 '23 at 13:26

Mirko

257
5

4

votes

2 answers

RNAseq: Z score, Intensity, and Resources

I'm very new to bioinformatics in general, and I'm trying to understand some basic concepts. I have RNAseq data, and bioinformatics people tell me that intensities cannot be compared across patients. So there are all of these pipelines to compare…

asked Jul 29 '17 at 00:06

julianstanley

401
3
9

4

votes

2 answers

How to create a phylogenetic tree from diverse mitochondrial genomes

I would like to create a phylogenetic tree for most of the species in my dataset. I'm starting with around 1200 species, but since it's not good practice to align short and long sequences, I tried filtering only for the species with the [15k-18k bp]…

asked Oct 23 '23 at 10:31

Mirko

257
5

4

votes

1 answer

How can I incorporate wildcards in Snakemake in an R script?

I am facing the following issue: I have a rule in Snakemake that looks something like this: rule somerule: input: tables = expand("results/{tables}_table.txt", tables = ["1", "2"]), output: edit= expand("results/{tables}_edited.txt", tables…

asked Oct 17 '23 at 18:55

Classy Q

43
2

4

votes

1 answer

Asterisk (*) calculation method

Does anyone here know how to calculate a value for the asterisk (*) code that appears in substitution matrices? From my observation, all pairs with one asterisk are given the lowest value from the matrix. For ** it's 1. Example: BLOSUM62. But,…

asked Oct 15 '23 at 01:27

maciejwww

227
1
14

4

votes

2 answers

Where to download a file with major and minor alleles at every position?

I want a list of all variants, i.e. sites which are known to vary from human to human. For example, it should ideally cover all sites in here, but without samples. I don't want a giant reference panel with many samples, nor a .fa which has no…

asked Oct 09 '23 at 23:24

BigMistake

543
11

4

votes

0 answers

Linking GenBank records to biosamples (and vice versa) using edirect

Assume that I wish to find all complete human mitochondrial genome records on GenBank (or rather, NCBI nuccore) that also have an entry in NCBI's Biosample database. MYQUERY="(mitochondrion[TITLE] OR mitochondrial[TITLE]) \ AND complete…

asked Sep 23 '23 at 23:36

Michael Gruenstaeudl

203
1
6

4

votes

1 answer

How to get cytoband and gene level copy number from genome wide SNP array copy number data?

I have (human) Illumina genome wide SNP array copy number data. For each SNP genomewide, I have Log R Ratio (LRR) and B Allele Frequency (BAF). What tool(s) can I use to get the integer copy numbers (either -3 to +3 or 0 to inf) for each cytoband…

asked Sep 13 '23 at 23:00

Sylvia Rodriguez

257
1
10

4

votes

2 answers

How to analyse qualitatively the penetration ability of particles in spheroids using fluorescent z-stacks?

I have to establish the penetration profile of particles in tumour spheroids. For this I have spheroids composed of cells which were exposed to particles, both of which are fluorescently labeled. The spheroids were then imaged using a confocal…

3d-structure

asked Sep 04 '23 at 14:54

Timi

51
3

4

votes

2 answers

Question about umap using different numbers of pca components as initialization

I am new to the scRNA-seq field and I have been running some UMAP visualization experiments using different numbers of PCA components for initialization. The process involves projecting scRNA-seq data (count matrix) onto various numbers of PCA…

asked Aug 30 '23 at 15:47

Zack

43
3

4

votes

1 answer

Can not launch bcftools using python's subprocess module, as it only accepts first command of commands list

I am trying to remove samples from a chromosome VCF file. I wrote a function that takes a chromosome number and a list of samples to remove. When I try to run bcftools using the subprocess module, it only runs bcftools, as if I was running…

asked Aug 27 '23 at 21:32

YKY

171
5
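A note on the symptom in the question title above: the excerpt does not show the actual call, so this is only a guess at one common cause. On POSIX systems, passing a list to `subprocess.run` together with `shell=True` makes only the first item the command string, and the remaining items become arguments to the shell itself, so the program starts with no arguments. The sketch below shows the list form without a shell. It uses a child Python process as a hypothetical stand-in for bcftools, and the argument values (`view`, `-s`, `^sample1`) are illustrative only.

```python
import subprocess
import sys

# Hypothetical stand-in for a bcftools command line: a child Python process
# that prints the arguments it actually received.
cmd = [
    sys.executable, "-c", "import sys; print(sys.argv[1:])",
    "view", "-s", "^sample1",
]

# With a list and the default shell=False, every element after the program
# name is passed to the program as its own argument.
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
received = result.stdout.strip()
print(received)
```

If a shell is genuinely needed (for pipes or redirection), the usual alternative is to pass one single command string with `shell=True` rather than a list.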

4

votes

0 answers

MergeBamAlignment error

I am aligning samples following the GATK pipeline and running MergeBamAlignment like this: MergeBamAlignment \ -ALIGNED $path/file.unsorted.bam \ -UNMAPPED $path/file.unmapped.bam \ -O $path/file.merged.bam \ -R…

asked Aug 20 '23 at 23:23

Rita Soares

101
2

4

votes

1 answer

blasting a refseq protein does not show the protein in the result set

Can anyone explain why I can't find a specific protein with a BLAST search when it was previously taken from the NCBI RefSeq database? Specifically, I was trying to blast the protein with the accession number "NP_420767" and its sequence, respectively, however…

asked Jul 25 '17 at 11:11

Matthias F.

43
3
