Most Popular

1500 questions
3
votes
1 answer

How to extract all variant alleles that do not match "./." from the GT column of a vcf file?

For the two VCF files linked below, I cannot find any variants in the GT column other than "./.". Is it possible to confirm whether the GT column of the VCF files has been annotated (i.e., variants listed as "./1", "1/." or "1/1")? VCF files:…
CoderQ
  • 33
  • 3
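
A minimal sketch of the check the question above describes, assuming a plain-text, tab-separated VCF with a FORMAT column that contains GT. The function name `non_missing_genotypes` and the sample lines are hypothetical and are not taken from the linked files. It reports every sample genotype that is not fully missing:

```python
def non_missing_genotypes(vcf_lines):
    """Yield (chrom, pos, sample_index, gt) for each genotype that is not fully missing.

    Assumes tab-separated VCF records with FORMAT in column 9 and samples from column 10.
    """
    missing = {"./.", ".|.", "."}
    for line in vcf_lines:
        # Skip meta lines, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns on this record
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        for i, sample in enumerate(fields[9:]):
            parts = sample.split(":")
            gt = parts[gt_idx] if gt_idx < len(parts) else "."
            if gt not in missing:
                yield fields[0], int(fields[1]), i, gt


# Hypothetical records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t./.:0\t0/1:12",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t./.\t./.",
]
hits = list(non_missing_genotypes(example))
print(hits)
```

If the generator yields nothing for a real file, every GT entry is missing, which would be consistent with what the asker observed. For a gzipped file, the lines could be read with `gzip.open(path, "rt")` instead.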
3
votes
1 answer

How to prove correlation between gene expression and functions using omics data in bioinformatics?

I am new to bioinformatics, coming from a background in mathematical biology. I am currently participating in a project where the Principal Investigator (PI), who is a biologist, has tasked me with proving a correlation between the expression and…
LOVEMATH
  • 33
  • 4
3
votes
1 answer

Nextflow: Properly chaining process outputs

I have written a pipeline that is composed of three processes. I seem to be misunderstanding how Nextflow passes the outputs from one process to the next or creates a queue of files. I am using a tumor and paired normal sample from SRA (SRX21185781)…
bhumm
  • 33
  • 6
3
votes
1 answer

How to calculate the distance between two regions from a BED-like dataframe in R?

I'm wondering how I can calculate the distance between two regions from a BED-like (chr, start, end) dataframe in R. Precisely, I want to subtract the end of row r from the start of row r+1. Also, the calculation will repeat similarly for each…
Deb
  • 201
  • 1
  • 5
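
The row-to-row subtraction described in the question above (start of row r+1 minus end of row r) can be sketched as follows. The question asks about R; this is a Python illustration only, the function name `consecutive_gaps` and the coordinates are hypothetical, and it assumes the regions are already sorted by chromosome and start:

```python
def consecutive_gaps(regions):
    """Return the gap between each region and the next one.

    regions: list of (chrom, start, end) tuples, assumed sorted.
    A gap is start of row r+1 minus end of row r; None marks a chromosome change.
    """
    gaps = []
    for current, nxt in zip(regions, regions[1:]):
        if current[0] == nxt[0]:
            gaps.append(nxt[1] - current[2])
        else:
            gaps.append(None)  # distance is undefined across chromosomes
    return gaps


# Hypothetical coordinates for illustration only.
regions = [("chr1", 100, 200), ("chr1", 250, 300), ("chr2", 10, 20)]
gaps = consecutive_gaps(regions)
print(gaps)
```

A negative gap would indicate overlapping regions. Whether distances should be reset at chromosome boundaries is an assumption here, since the excerpt is truncated before it says.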
3
votes
1 answer

Finding homologous regions in multiple whole genomes

Right now I have six genomes that I want to compare and identify homologous regions in the genomes. I have run nucmer, show-coords and obtained the output files. An example is shown below with Genome 1 vs Genome 2. More files go through Genome 1 vs…
LORL
  • 43
  • 3
3
votes
1 answer

What does the absence of a variant in a VCF file mean?

I have an individual.snp.vcf.gz file of an individual genome and the referencegenome.snp.vcf.gz file of the reference genome. When I run the following code on the individual genome gunzip -c individual.snp.vcf.gz | grep -v '^##' | awk 'BEGIN…
3
votes
1 answer

What is it about gene names starting with "LOC"?

I was struggling to use AnnotationDbi to convert my Ensembl IDs to gene names for datasets of three different species (human, canine, mouse). Among the gene names for all three species, there are genes with names starting with "LOC" like…
ToTheMoon
  • 59
  • 5
3
votes
2 answers

Printing the results of for loop into a txt file

I have a for loop which prints the available VCF files in the path for a list of patient IDs (using the dx find data function) to my zsh terminal: for i in $(cat patient_id.txt); do dx find data --property patient_id=$i --path "vcf/"; done How can I…
Zizogolu
  • 2,148
  • 11
  • 44
3
votes
0 answers

Q8 Accuracy Evaluation metric mathematical expression used in Protein Secondary Structure Prediction

I am working on protein secondary structure prediction using a recurrent neural network (GRU) model. I came across a few open source projects that have already worked on a similar problem. All of them use Q8 accuracy as the evaluation metric and the…
3
votes
1 answer

What is the best way to acquire protein isoform sequence alignment?

[Update] Thanks @terdon. To clarify my question: I have a bunch of protein isoform sequences (produced by the transcripts in the figure) and I want to align the protein products (e.g., proteins produced by the first 3 protein-coding transcripts…
3
votes
1 answer

Probability of finding 5 amino acids in a row within a proteome

How can I calculate the probability of finding two proteins that share a 5 amino acid long motif in a proteome of around 1067 proteins with an average length of 65 residues? The probability of a specific sequence that is completely the same is…
3
votes
1 answer

How to annotate a bacterial genome automatically?

We recently got some shotgun sequencing results from a soil bacterium and obtained some contigs that contain genes of interest. Is there a way to automatically annotate the whole sequence (a single contig of around 25 kb) and mark…
Irfan
  • 81
  • 5
3
votes
1 answer

How can I convert a BED file to GTF/GFF with gene_ids?

Given a .bed (BED12), how can I convert it to GTF/GFF formats with gene_id attributes? What is the fastest way or available tools to do it? For example, given an input like this: chr27 17266469 17281218 ENST00000541931.8 1000 + 17266469 17281218…
3
votes
1 answer

Where to access the WGCNA tutorial documents: Horvath lab site down

I am currently using the WGCNA package for some analysis and it seems the Horvath lab site is down. Does anyone know of anywhere else I can access the tutorial documents?
3
votes
1 answer

biomaRt conversion of symbols

I wanted to use the code below to convert mouse gene symbols to human symbols, but I have a problem with the mirror. I've tried to set up another one, but I keep getting the same problem. musGenes <- c("Hmmr", "Tlx3", "Cpeb4") # Basic function to convert mouse to…