How to translate amino acid sequences to Nucleotide sequences

Question

I want to convert a list of FASTA protein sequences in a text file into the corresponding nucleotide sequences. A Google search only turns up DNA-to-protein conversion, not the reverse. I also came across "How do I find the nucleotide sequence of a protein using Biopython?", but that is not what I am looking for. Is there a way to do this in Python? I am sure there must be some way to do it rather than writing code from scratch. Thanks!

Cross-posted: https://www.biostars.org/p/421074/ – Feb 09 '20 at 11:12

score 5 · Answer 1 · answered Feb 09 '20 at 16:57

Although there is not a unique nucleotide sequence that translates to a given protein, one can list all the possible DNA sequences that do translate to that protein.

An online tool that does just that is Backtranambig, from EMBOSS. It produces a DNA sequence representing all the nucleotide sequences matching the input protein, using IUPAC ambiguity codes.
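To illustrate the idea of an ambiguity-coded back-translation in Python, here is a minimal sketch that assumes the standard genetic code (NCBI table 1). It is not EMBOSS's implementation; the names `ambiguous_codon` and `back_translate_ambiguous` are made up for this example. Each codon position is collapsed independently, so for amino acids with split codon families (Leu, Arg, Ser) and for stop the resulting pattern also matches a few codons that encode something else.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def ambiguous_codon(amino_acid):
    """Collapse all codons of one amino acid into a single IUPAC triplet."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    if not codons:
        raise ValueError(f"no codon for residue {amino_acid!r}")
    # For each of the three positions, look up the code for the set of bases seen.
    return "".join(
        IUPAC[frozenset(codon[i] for codon in codons)] for i in range(3)
    )


def back_translate_ambiguous(protein):
    """Back-translate a protein string into an IUPAC-ambiguous DNA string."""
    return "".join(ambiguous_codon(aa) for aa in protein.upper())


print(back_translate_ambiguous("MAW"))  # ATGGCNTGG
```

Alanine becomes `GCN` because its four codons differ only at the third position, while methionine and tryptophan stay unambiguous.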

score 2 · Answer 2 · answered Nov 23 '20 at 16:01


DNA Chisel (written in Python) can reverse translate a protein sequence:

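import dnachisel
from dnachisel.biotools import reverse_translate

# Load the protein record from a FASTA file
record = dnachisel.load_record("seq.fa")
# Back-translate to one of the many possible DNA sequences
print(reverse_translate(str(record.seq)))
# Output: GGTCATATTTTAAAAATGTTTCCT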
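The question asks about a file holding several protein records rather than one. A rough sketch of how that could be handled with only the standard library follows. The helpers `read_fasta`, `first_codon_back_translate` and `back_translate_records` are hypothetical names for this example, not part of DNA Chisel, and the codon choice (the first codon for each residue in TCAG order of the standard genetic code) is an arbitrary assumption. For the peptide GHILKMFP it happens to give the same string as shown above, which says nothing about how DNA Chisel picks codons in general.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Assumption: keep only the first codon encountered for each residue.
FIRST_CODON = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    FIRST_CODON.setdefault(aa, "".join(codon))


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def first_codon_back_translate(protein):
    """Back-translate using one fixed codon per residue ('*' means stop)."""
    try:
        return "".join(FIRST_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue {err.args[0]!r}") from None


def back_translate_records(lines, translate=first_codon_back_translate):
    """Apply `translate` to every record; returns a list of (header, dna)."""
    return [(header, translate(seq)) for header, seq in read_fasta(lines)]


# In-memory example; for a real file, pass an open file handle instead.
example = [">pep1", "GHILKMFP", ">pep2", "MA", "W*"]
for header, dna in back_translate_records(example):
    print(f">{header}\n{dna}")
```

The `translate` argument accepts any function that maps a protein string to a DNA string, so a library routine such as `reverse_translate` could be passed in place of the stand-in.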

answered by Peter

score 1 · Answer 3 · edited Jun 16 '20 at 12:15


You can't do this unambiguously, because the genetic code is redundant: the same protein sequence can be encoded by many different nucleotide sequences. There are 64 codons but only ~20 amino acids; e.g. GCT, GCC, GCA and GCG all encode alanine.
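To put a number on that redundancy, the following sketch counts (and, for short peptides, enumerates) the nucleotide sequences that encode a given protein. It assumes the standard genetic code; `count_encodings` and `all_encodings` are names invented for this example.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# residue -> list of all codons that encode it
CODONS_FOR = defaultdict(list)
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR[aa].append("".join(codon))


def count_encodings(protein):
    """Number of distinct DNA sequences that translate to `protein`."""
    return prod(len(CODONS_FOR[aa]) for aa in protein.upper())


def all_encodings(protein):
    """Yield every DNA sequence encoding `protein` (short peptides only)."""
    choices = [CODONS_FOR[aa] for aa in protein.upper()]
    for combo in product(*choices):
        yield "".join(combo)


print(sorted(all_encodings("A")))   # ['GCA', 'GCC', 'GCG', 'GCT']
print(count_encodings("GHILKMFP"))  # 2304
```

Even an eight-residue peptide has thousands of possible encodings, and the count grows multiplicatively with length, so enumeration is only practical for very short sequences.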

edited Jun 16 '20 at 12:15 by karel

answered Feb 09 '20 at 09:57 by Chris_Rands

For the sake of the programming exercise, one could use IUPAC ambiguity codes to reflect the set of possible codons for a given amino-acid. – bli Feb 09 '20 at 10:33
@bli Agreed, one could even build a model to predict the underlying codons since there is codon usage bias. For practical purposes I'd do a TBLASTN against NT or similar – Chris_Rands Feb 09 '20 at 12:24
