Most Popular

1500 questions
4
votes
1 answer

Psipred installation

I am trying to install psipred on a Unix server. Psipred is installed, but fails because it cannot find the correct database files. I have downloaded uniprod90.fasta. Psipred also loads the old legacy NCBI BLAST-2.2.26 and BLAST+. runpsipred and…
aerijman
  • 645
  • 5
  • 14
4
votes
2 answers

Merge / Reconcile several de novo transcriptome assemblies with different k-mers

I am building a de novo transcriptome reference assembly for a eukaryotic organism for which I have a genome. I've created several assemblies with rnaSPAdes using different k-mer sizes (19 to 69 with step 10). Now I would like to merge them into one…
4
votes
2 answers

gene-level versus transcript-level analysis

Traditionally, RNA-seq data were quantified at the gene level. Newer methods quantify at the transcript/isoform level. For example, Kallisto only outputs transcript-level abundances. From the DESeq2 vignette: A newer and recommended pipeline is to use fast…
burger
  • 2,179
  • 10
  • 21
4
votes
2 answers

What is the difference between a transcriptome and a genome?

I have a computer engineering background, not biology. I started working on a bioinformatics project recently, which involves de-novo assembly. I came to know the terms Transcriptome and Genome, but I cannot identify the difference between these…
ThisaruG
  • 239
  • 1
  • 4
  • 7
4
votes
2 answers

Do any computational phylogenetic methods enable the specification of ancestral states?

Various phylogenetic algorithms estimate ancestral states of a phylogenetic dataset. Is there a way in either maximum parsimony, distance-based methods, or Bayesian inference to indicate what the ancestral states of characters were?
Namenlos
  • 317
  • 1
  • 8
4
votes
1 answer

Is removing samples based on clustering for downstream analysis the right choice?

I'm using TCGA lung cancer data. I'm interested in doing differential analysis between Lung vs Normal. Before DEA, to check the distance between each pair of samples, I made an MDS plot. In it, I see some Tumor samples are clustered with…
beginner
  • 631
  • 7
  • 15
4
votes
1 answer

Is there a database of protein sequences/structure along with their melting temperature?

Such databases have been constructed, for example here or here, but I can't find them anywhere online. Can someone point me to where I can find one available online? Otherwise I'll have to write to these people and I don't know when/whether I'll…
4
votes
1 answer

Is it possible to identify cells that are expressing two or more genes in Seurat?

I'm interested in looking into cells that are positive for two (or in some cases more) genes. I know I have some double-positive cells just by looking at the FeaturePlot of those genes, but now I'm trying to figure out how many are double positive and…
4
votes
3 answers

Public access to genomics databases

I'm a statistician, and am interested in applying the theory of algebraic Markov models to genomics. Here's one paper I'm interested in: Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near…
4
votes
1 answer

Problem installing last build of Sniffles

I have followed the steps described here to install Sniffles on my MacBook Pro running macOS High Sierra. However, I am struggling with this step: cmake -D CMAKE_C_COMPILER=/opt/local/bin/gcc-mp-4.7 -D CMAKE_CXX_COMPILER=/opt/local/bin/g++-mp-4.7…
Biomagician
  • 2,459
  • 16
  • 30
4
votes
1 answer

What does an output of -I for all amino acids mean in a psi-blast pssm matrix file?

I have run psi-blast using the NR database, remotely, with one iteration, for several sequences to calculate an evolutionary profile (PSSM) for each of those sequences. However, many of the PSSM files contain lines of -I -I -I -I -I -I -I -I -I -I…
Aalawlx
  • 517
  • 4
  • 12
4
votes
3 answers

Issues performing variant calling with GATK

I am trying to perform variant calling on a BAM file generated with STAR (version STAR_2.6.0b) for the wheat genome, using GATK HaplotypeCaller as follows: gatk HaplotypeCaller -I sorted.bam -R wheat.fa -O st.vc Using GATK jar…
4
votes
1 answer

Test to determine if two genes/exons share the same evolutionary histories?

In classic phylogenetic inference one is usually given orthologous sequences of a given gene across various species. Those sequences are then combined in a multiple alignment and used to construct a phylogenetic tree. Say, I split the multiple alignment of…
4
votes
2 answers

Use of heterozygous SNPs in cancer research: why?

When reading about allelic fraction (AF) and SNPs in cancer research, authors always mention that they're using heterozygous SNPs (informative SNPs). Why is this? Why can't we use homozygous SNPs? In this case, what is the reference base (from…
wrong_path
  • 391
  • 1
  • 7
4
votes
1 answer

Statistical approach to compare the SNP genotyping data among set of individuals

I have genotyping data for about 650,000 SNPs from 96 individuals. I already know the Y-DNA haplogroup of these individuals, so to some extent, I have a gross understanding of their ancestry. What would be the best way to go about doing this?…
user2887
  • 61
  • 5