Most Popular
1500 questions
4
votes
1 answer
PSIPRED installation
I am trying to install PSIPRED on a Unix server.
PSIPRED is installed, but it fails because it cannot find the correct database files. I have downloaded uniprod90.fasta. PSIPRED also loads the legacy NCBI BLAST-2.2.26 and BLAST+.
runpsipred and…
aerijman
- 645
- 5
- 14
4
votes
2 answers
Merge / reconcile several de novo transcriptome assemblies with different k-mers
I am building a de novo transcriptome reference assembly for a eukaryotic organism for which I have a genome.
I've created several assemblies with rnaSPAdes using different k-mer sizes (19 to 69 with step 10). Now I would like to merge them into one…
Pitagoras Alves
- 41
- 2
4
votes
2 answers
gene-level versus transcript-level analysis
Traditionally, RNA-seq data were quantified at the gene level. Newer methods quantify at the transcript/isoform level. For example, kallisto only outputs transcript-level abundances. From the DESeq2 vignette:
A newer and recommended pipeline is to use fast…
burger
- 2,179
- 10
- 21
4
votes
2 answers
What is the difference between a transcriptome and a genome?
I have a computer engineering background, not biology.
I recently started working on a bioinformatics project that involves de novo assembly. I came across the terms transcriptome and genome, but I cannot identify the difference between these…
ThisaruG
- 239
- 1
- 4
- 7
4
votes
2 answers
Do any computational phylogenetic methods enable the specification of ancestral states?
Various phylogenetic algorithms estimate the ancestral states of a phylogenetic dataset. Is there a way in maximum parsimony, distance-based methods, or Bayesian inference to specify what the ancestral states of characters were?
Namenlos
- 317
- 1
- 8
4
votes
1 answer
Is removing samples based on clustering the right choice for downstream analysis?
I'm using TCGA lung cancer data. I'm interested in doing a differential expression analysis (DEA) between lung cancer and normal samples. Before the DEA, to check the distance between each pair of samples, I made an MDS plot:
In it, I see that some tumor samples cluster with…
beginner
- 631
- 7
- 15
4
votes
1 answer
Is there a database of protein sequences/structure along with their melting temperature?
Such databases have been constructed, for example here or here, but I can't find them anywhere online. Can someone point me to one that is available online?
Otherwise I'll have to write to these people and I don't know when/whether I'll…
blehblehblecksheep
- 173
- 3
4
votes
1 answer
Is it possible to identify cells that are expressing two or more genes in Seurat?
I'm interested in looking into cells that are positive for two (or in some cases more) genes. I know I have some double-positive cells just by looking at the FeaturePlot of those genes, but now I'm trying to figure out how many are double positive and…
Gabriel Alencar
- 41
- 1
- 4
4
votes
3 answers
Public access to genomics databases
I'm a statistician, and am interested in applying the theory of algebraic Markov models to genomics. Here's one paper I'm interested in: Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near…
Puraṭci Vinnani
- 141
- 2
4
votes
1 answer
Problem installing the latest build of Sniffles
I have followed the steps described here to install Sniffles on my MacBook Pro running macOS High Sierra.
However, I am struggling with this step:
cmake -D CMAKE_C_COMPILER=/opt/local/bin/gcc-mp-4.7 -D CMAKE_CXX_COMPILER=/opt/local/bin/g++-mp-4.7…
Biomagician
- 2,459
- 16
- 30
4
votes
1 answer
What does an output of -I for all amino acids mean in a PSI-BLAST PSSM file?
I have run PSI-BLAST remotely against the NR database, with one iteration, for several sequences to calculate an evolutionary profile (PSSM) for each of those sequences.
However, many of the PSSM files contain lines of -I -I -I -I -I -I -I -I -I -I…
Aalawlx
- 517
- 4
- 12
4
votes
3 answers
Issues performing variant calling with GATK
I am trying to perform variant calling on a BAM file generated with STAR (version STAR_2.6.0b) for the wheat genome, using GATK HaplotypeCaller as follows:
gatk HaplotypeCaller -I sorted.bam -R wheat.fa -O st.vc
Using GATK jar…
Ammar Sabir Cheema
- 951
- 7
- 20
4
votes
1 answer
Test to determine if two genes/exons share the same evolutionary history?
In classic phylogenetic inference, one is usually given orthologous sequences of a given gene from various species. Those sequences are then combined in a multiple sequence alignment and used to construct a phylogenetic tree.
Say, I split the multiple alignment of…
Sebastian Müller
- 600
- 2
- 9
4
votes
2 answers
Use of heterozygous SNPs in cancer research: why?
When reading about allelic fraction (AF) and SNPs in cancer research, I see that authors always mention that they are using heterozygous SNPs (informative SNPs). Why is this? Why can't we use homozygous SNPs?
In this case, what is the reference base (from…
wrong_path
- 391
- 1
- 7
4
votes
1 answer
Statistical approach to compare SNP genotyping data among a set of individuals
So, I have genotyping data for about 650,000 SNPs in 96 individuals. I already know the Y-DNA haplogroups of these individuals, so to some extent I have a rough understanding of their ancestry.
What would be the best way to go about doing this?…
user2887
- 61
- 5