
Can someone tell me how to extract the FASTA sequences for the cytb gene of Acetes japonicus (a shrimp important to China and South Korea)?

Can I extract them directly from the NCBI nucleotide database (i.e. nuccore)?

For instance, I'm trying to fetch the FASTA data for the cytb gene of Acetes japonicus. I'm using Biopython, like this:

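from Bio import Entrez
Entrez.email = "you@example.com"  # placeholder address; NCBI asks for a contact e-mail
handle = Entrez.esearch(db="nucleotide", term="Acetes japonicus[Orgn] AND cytb[Gene]")
record = Entrez.read(handle)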

The output doesn't contain any IDs; however, when I search manually I get a result.

Sofia
  • What sequences? The gene? The CDS? The various spliced transcripts? Protein? In what species? Please [edit] and make your question more specific and we can give you specific answers. – terdon Oct 23 '19 at 09:29
  • Weird, when I search the nucleotide database, I see only the COI gene. You can try handle = Entrez.esearch(db="nucleotide", term="Acetes japonicus[Orgn] AND COI[Gene]",idtype="acc") and you will see that it works. Substitute cytb for COI and the result is empty, which is exactly what I saw in the Entrez nucleotide database. – StupidWolf Oct 23 '19 at 10:17
  • Thanks for the help. Is it normal for the results to appear in the form of HTML? – Sofia Oct 23 '19 at 11:04
  • Are you referring to the result of Entrez.read(handle)? It's something like a dictionary. You can check with type(record). – StupidWolf Oct 23 '19 at 13:06
  • @StupidWolf I concur: I also only see COI. There are no 'cytb' sequences for this species, and that is why you get 0 hits. – user3479780 Oct 28 '19 at 05:15

2 Answers


I should have answered correctly ages back, because I do know it.

Cytochrome b is a massively used target for arthropods - these are shrimps. However, the nomenclature changed:

  • Traditional nomenclature cytb
  • 'New' nomenclature COB

Wait, you might ask, aren't you confusing that with COI, II, III? No, because that nomenclature changed too:

  • Traditional nomenclature COI
  • 'New' nomenclature COX1

'New' was a long time ago though.

Thus the sequence you are looking for is MZ571562; search with:

 ... AND COB [gene] 
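
For the Biopython route used in the question, something like the sketch below should work. It assumes Biopython is installed and NCBI is reachable. The helper names build_term and fetch_fasta are hypothetical, the e-mail address is a placeholder, and the list of gene symbols to try is only a guess at useful synonyms. Note that esearch returns at most 20 IDs unless retmax is raised.

```python
def build_term(organism, gene):
    """Build an Entrez query such as '"Acetes japonicus"[Orgn] AND COB[Gene]'."""
    return f'"{organism}"[Orgn] AND {gene}[Gene]'


def fetch_fasta(organism, genes, email):
    """Try each gene symbol in turn; return FASTA text for the first one with hits."""
    # Imported here so build_term() can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    for gene in genes:
        handle = Entrez.esearch(
            db="nucleotide", term=build_term(organism, gene), idtype="acc"
        )
        record = Entrez.read(handle)  # dictionary-like object
        handle.close()

        ids = record["IdList"]
        print(f"{gene}: {record['Count']} hit(s)")
        if ids:
            # Download the matching records as plain-text FASTA.
            handle = Entrez.efetch(
                db="nucleotide", id=",".join(ids), rettype="fasta", retmode="text"
            )
            fasta = handle.read()
            handle.close()
            return fasta
    return ""  # no symbol produced any hits


# Example call (needs network access and Biopython):
# print(fetch_fasta("Acetes japonicus", ["COB", "cytb"], "you@example.com"))
```

Trying several symbols in a loop makes it obvious which name the database actually uses, since the hit count is printed for each one.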
M__

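Here is how you can do it with the command-line tools (Entrez Direct). The query is wrapped in single quotes so that the inner double quotes around the species name reach esearch intact:

esearch -db nuccore -query 'COB [GENE] AND "Acetes japonicus" [ORGN]' | efetch -format gb

or, for FASTA output:

esearch -db nuccore -query 'COB [GENE] AND "Acetes japonicus" [ORGN]' | efetch -format fasta

Note that the gene name is actually COB.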

Supertech