Most Popular
1500 questions
3
votes
4 answers
How to merge transcript sequences with the same name in a FASTA file
Suppose we have a FASTA file like
>Seq1
GTTGAGAGGTGTATGGACACGAAAAACGAAACTGTATCCCGTGTTTAGCAAAGAAATCAT
>Seq1
AAAAACGAAACTGTATCCCGTGTTT
>Seq2
CGTGTTTAGCAAAGAAAT
I want to…
sksahu
- 51
- 5
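For the FASTA question above, a first step that does not depend on the cut-off part of the excerpt is simply collecting all records that share a header. The sketch below does only that; what "merge" should mean afterwards (concatenate, keep the longest, assemble overlaps) is not visible in the excerpt and is left open. The function name `group_by_header` is made up for illustration.

```python
from typing import Dict, List


def group_by_header(fasta_text: str) -> Dict[str, List[str]]:
    """Collect the sequences of all FASTA records that share a header.

    Multi-line sequences are joined per record; records are kept in
    file order under their header name.
    """
    groups: Dict[str, List[str]] = {}
    current = None
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # New record: start an empty sequence under this header.
            current = line[1:].strip()
            groups.setdefault(current, []).append("")
        elif current is not None:
            # Sequence line: extend the most recent record of this header.
            groups[current][-1] += line
    return groups


# The three records shown in the question excerpt.
example = """>Seq1
GTTGAGAGGTGTATGGACACGAAAAACGAAACTGTATCCCGTGTTTAGCAAAGAAATCAT
>Seq1
AAAAACGAAACTGTATCCCGTGTTT
>Seq2
CGTGTTTAGCAAAGAAAT
"""
groups = group_by_header(example)
```

Here `groups["Seq1"]` holds two sequences and `groups["Seq2"]` holds one, so whichever merge rule is wanted can be applied per header.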
3
votes
3 answers
Effect of mutation in DNA sequence on transcription factor binding sites
How much does a single mutation/alteration of a nucleotide affect the presence of a transcription factor binding site (TFBS)?
I am from a computer science background (obviously).
I want to make a general assumption about the number of mutated bases…
SearchinaDNA
- 31
- 1
3
votes
2 answers
Convert Cotton Probe ID to Gene Symbol
I am new to bioinformatics, my background is in Electrical Engineering. I am trying to convert Affymetrix Cotton Probe IDs to gene symbols. I have a gene expression dataset and I need the expressions for only certain genes. So in the dataset, I have…
Adi
- 41
- 3
3
votes
1 answer
TRAL does not find "Phobos result file"
I want to use TRAL to annotate tandem repeats in the reference genome of Caenorhabditis elegans. For this, I need to install some external software, such as Phobos. I've downloaded Phobos and I am using it by typing the path to its executable like…
Biomagician
- 2,459
- 16
- 30
3
votes
1 answer
Bioinformatics approach to identifying potential PCR primer sequences for transcribed genes
I have an annotated transcriptome and would like to develop PCR primers for particular transcribed genes. My species is a non-model plant. Can I use BLAST or another tool to identify potential PCR primer sequences? Or more generally, is there a…
Peter Pearman
- 183
- 1
- 6
3
votes
1 answer
Where to download baseline/average gene expression level of all human coding genes?
I am looking for the most appropriate dataset for downloading baseline gene expression level across all human coding genes during development. I am aware that EMBL Expression Atlas is one of the resources that provide such information, but I am…
RJF
- 181
- 1
- 8
3
votes
1 answer
How many reads do I need to cover the entire genome?
Suppose my genome is 3 million bases long and my reads are 100 nucleotides long. I need to know how many reads I need to cover the entire genome.
I start from the equation $C = \frac{N \cdot L}{G}$ where C is the coverage, N the number of…
wrong_path
- 391
- 1
- 7
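For the coverage question above, the quoted equation $C = \frac{N \cdot L}{G}$ can be rearranged to $N = \frac{C \cdot G}{L}$. The sketch below plugs in the numbers from the excerpt (G = 3,000,000, L = 100). The function names are made up for illustration, and the second function assumes the idealized Lander-Waterman model (reads placed uniformly at random), under which an average coverage of C leaves roughly a fraction e^(-C) of bases uncovered.

```python
import math


def reads_for_coverage(genome_length: int, read_length: int, coverage: float) -> int:
    """Solve C = N * L / G for N, rounded up to a whole read."""
    if genome_length <= 0 or read_length <= 0 or coverage <= 0:
        raise ValueError("genome_length, read_length and coverage must be positive")
    return math.ceil(coverage * genome_length / read_length)


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes reads land uniformly at random (Lander-Waterman idealization),
    so the chance that a given base is missed is about exp(-coverage).
    """
    return 1.0 - math.exp(-coverage)


G = 3_000_000  # genome length from the question
L = 100        # read length from the question

n_1x = reads_for_coverage(G, L, 1)    # 30000 reads for 1x average coverage
n_30x = reads_for_coverage(G, L, 30)  # 900000 reads for 30x average coverage
frac_1x = expected_fraction_covered(1)
```

Note that 1x average coverage is not the same as every base being covered: under the random-placement assumption, `frac_1x` is only about 0.63, which is why higher average coverage is normally targeted.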
3
votes
2 answers
Associating SNP and GENE
Assuming I have SNP data based on hg19, how can I find out which gene each SNP belongs to?
The data looks like:
chr10_103577643
chr10_124712463
and so on.
I want to add a Gene column that tells which gene each SNP belongs to.
The file is…
Kozolovska
- 241
- 1
- 4
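For the SNP question above, the IDs in the excerpt (`chr10_103577643`) can be split into chromosome and position and then checked against gene intervals. The sketch below shows that lookup only. The gene names and coordinates are placeholders, not real hg19 annotation; in practice the intervals would come from an hg19 gene annotation file, and a dedicated interval tool would scale better than this linear scan.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def parse_snp(snp_id: str) -> Tuple[str, int]:
    """Split an ID like 'chr10_103577643' into ('chr10', 103577643)."""
    chrom, pos = snp_id.rsplit("_", 1)
    return chrom, int(pos)


def genes_for_snp(snp_id: str, genes: List[Gene]) -> List[str]:
    """Return the names of all genes whose interval contains the SNP."""
    chrom, pos = parse_snp(snp_id)
    return [g.name for g in genes if g.chrom == chrom and g.start <= pos <= g.end]


# Placeholder intervals for illustration only (NOT real hg19 coordinates).
genes = [
    Gene("GENE_A", "chr10", 103_500_000, 103_600_000),
    Gene("GENE_B", "chr10", 124_000_000, 124_100_000),
]

snps = ["chr10_103577643", "chr10_124712463"]
hits = {snp: genes_for_snp(snp, genes) for snp in snps}
```

With these placeholder intervals the first SNP falls inside `GENE_A` and the second falls in no gene, so an empty list would become an empty cell in the new Gene column.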
3
votes
1 answer
Finding RNA-protein physical interactions
We all know that for a physical protein-protein interaction, we need to find the distance between residues from the PDB file of that interaction (the distance between the alpha carbons, beta carbons, or centroids of two residues in the PDB data of the two proteins).…
Sara
- 777
- 1
- 6
- 18
3
votes
1 answer
Increase number of threads for GATK 4.0 HaplotypeCaller
I am using GATK version 4.0. I tried to use multiple threads for calling variants with HaplotypeCaller using the following command
gatk --java-options -Xmx90G -nt 28 HaplotypeCaller -I output.bam -R wheat_ref.fa -O final.vcf
and the error is
'-nt'…
Ammar Sabir Cheema
- 951
- 7
- 20
3
votes
1 answer
Is there a way to tell which chromosome a gene is on, by looking at the "Chromosome/scaffold name"
I recently got a data set, from which I need to figure out which chromosome a gene is from, but the head of the data reads like:
Gene ID Description Gene type Gene End (bp) Gene Start (bp) Strand Associated Gene Name …
Haohan Wang
- 521
- 3
- 8
3
votes
1 answer
Batch detection of CRISP proteins in fasta file
Probably a naive question. I am inexperienced.
I am interested in identifying potential CRISP (Cysteine-rich secretory proteins) in a certain tissue transcriptome (ca. 20k sequences in fasta). I have detected signalP and estimated % of cysteine in…
Scientist
- 111
- 7
3
votes
3 answers
Retrieving NCBI Taxon IDs from RefSeq or GenBank assembly accession
I have about 10,000 genome files, all named by either RefSeq or GenBank accession number. Do you know if it's possible to convert these numbers to the corresponding NCBI taxon ID or species?
For example:
GCA_000005845.2 to 79781
In the case of…
Biomage
- 173
- 7
3
votes
1 answer
How to convert files to ADAM format?
I would like to convert BAM and VCF files to ADAM format.
How do I do that?
Jon Deaton
- 399
- 2
- 10
3
votes
1 answer
Error in seq.default in chromPlot
I am using chromPlot to visualise the genome of C. elegans.
library(chromPlot)
I have created the following data frame with the lengths of C. elegans chromosomes.
Chrom Start End Name
1 1 0 15072434 contigs
2 2 0 15279421…
Biomagician
- 2,459
- 16
- 30