Most Popular
1500 questions
5
votes
2 answers
How do population genetics people define a population?
Do they define it as a layperson would, say Africans, Americans, and so on? Or is there a more scientific way of doing so? For example, I think defining one's population as one's allele frequencies…
Haohan Wang
- 521
- 3
- 8
5
votes
1 answer
What is the correct way of dealing with the analysis of data from flow cytometry?
I would like to detect a change in expression of a molecule present on a cell type by flow cytometry.
Assume I am able to detect, using an antibody, a signal that represents the amount of the molecule I'm interested in. Also assume that the…
gabt
- 348
- 2
- 13
5
votes
1 answer
Python package for NNI neighbors
I am working on protein sequence data files to reconstruct a phylogenetic tree, and I need to generate all NNI neighbours of the tree (two trees are NNI neighbours if one can be transformed into the other by one nearest neighbour interchange…
Sidra Younas
- 503
- 2
- 13
5
votes
3 answers
How to automate NCBI genome download
I need to download the GenBank files (.gbff) of all completely assembled cyanobacterial genomes from NCBI (RefSeq or INSDC FTP data).
For this, I think the steps are:
Find the completely assembled genomes.
Find the GenBank file URL based on the…
Arijit Panda
- 285
- 1
- 8
5
votes
2 answers
Counting letters in phylip alignment columns with Biopython
I have been using Python 3.6 and Biopython 1.72 to work with protein data files. I am using a protein sequence file (PHYLIP format), for example:
14 678
Zebrafish LSSCGVVSGD LISVILPASS LEETQTSSAA AHQTHTDQQA GGSHVSSSSS
Fugu LASCGIVSGD…
Sidra Younas
- 503
- 2
- 13
5
votes
1 answer
At what stage of a transcriptome assembly is it best to perform read contaminant filtering?
I'm trying to assemble a bivalve transcriptome. Since bivalves are filter feeders, their transcriptomes tend to be highly contaminated by bacteria, algae and whatnot. Since I pooled several transcriptomes, I have a large number of reads (>2B reads).…
LinuxBlanket
- 309
- 1
- 10
5
votes
3 answers
Sampling haplotypes
I am trying to simulate the genomes of different people. I have data (VCF files) for various genes from the 1000K Gene project.
I want to simulate different whole genomes, i.e. generate a new population by combining the real haplotypes I have. I am wondering…
Kozolovska
- 241
- 1
- 4
5
votes
1 answer
Should PCA be standardized for gene expression?
This is a theory/good practice question more than a technical one. If samples are being plotted on a PCA projection of gene expression data, I'm wondering whether it is standard (and if so, why) to center and scale the PCs.
The reason I ask is that…
Felipe Flores
- 53
- 3
5
votes
1 answer
How to interpret PCA output statistically and biologically?
How can I interpret the PCA results statistically for biological data?
I have used the FactoMineR and factoextra libraries for PCA.
Scripts used:
library(FactoMineR)
res.PCA = PCA(df, scale.unit=TRUE, ncp=4, graph=F )
par(mfrow=c(1,2))
plot.PCA(res.PCA,…
Dendrobium
- 187
- 3
5
votes
2 answers
How to scale the size of heat map and row names font size?
I have an expression data matrix (120 × 15; 120 genes and 15 samples). My heatmap looks blurred, and the row names (gene names) are too small to read. How can I improve my script?
Here is the example data
df<-structure(list(X_T0 =…
Kynda
- 95
- 1
- 1
- 6
5
votes
1 answer
Simplest way to work out structural variant type?
In VCF 4.2, a structural variant (SV) can be described with the BND keyword in SVTYPE. For example, the following is an insertion (from https://samtools.github.io/hts-specs/VCFv4.2.pdf):
#CHROM POS ID REF ALT QUAL…
SmallChess
- 2,699
- 3
- 19
- 35
5
votes
2 answers
How to run MaxQuant in command line mode?
MaxQuant is a software package for mass spectrometry and proteomics. There is a Windows version and a Linux version. To run it on Linux you have to use a program called Mono. I think it is developed by Microsoft, which I find quite nice from…
Soerendip
- 1,295
- 11
- 22
5
votes
4 answers
tRNAscan-SE error: FATAL: Unable to find /usr/local/bin/cmsearch executable
I have downloaded tRNAscan-SE from here. After decompressing and untarring the file, I installed it using:
./configure
make
make install
When I type tRNAscan-SE --help, I get the help page:
tRNAscan-SE 2.0 (December 2017)
FATAL: No sequence file(s)…
Biomagician
- 2,459
- 16
- 30
5
votes
1 answer
Find paralogs in a draft genome
We generated a (diploid, chordata, highly heterozygous) genome using PacBio and we wanted to see whether it contains lineage-specific duplications (paralogs, basically). The genome is not in Ensembl yet.
The only data we have at the moment…
aechchiki
- 2,676
- 11
- 34
5
votes
2 answers
No variant found using GATK 4.0 HaplotypeCaller
I am doing variant calling on RNA-seq datasets from wheat, which is hexaploid. The binary alignment (BAM) files were created using STAR version 2.6.0c, and variant calling was done using GATK 4.0 HaplotypeCaller. The whole pipeline is as…
Ammar Sabir Cheema
- 951
- 7
- 20