Is there a way to import a GFF file for an organism with Biopython, in the same way you can for a GenBank file?

For example,

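from Bio import Entrez as ez
ez.email = '...'  # NCBI asks for a contact email address
# AE015451.2 is a nucleotide accession, so query the nucleotide database rather than 'gene'
handle = ez.efetch(db='nuccore', id='AE015451.2', rettype='gb', retmode='xml')
h = handle.read()
handle.close()
print(h)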

This will import a GenBank XML file for Pseudomonas putida. Is there a way to do this for a GFF file?

Daniel Standage

3 Answers

No, there is currently no GFF support in Biopython.

However, you can read GFF files into Python using the gffutils package. There are also a few other packages for reading and writing GFF files, such as gff3.
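For orientation, GFF3 is a tab-delimited text format with nine columns, so the core of what such packages do can be sketched with the standard library alone. The snippet below is only an illustration under that assumption: it is not the gffutils or gff3 API, the function name `parse_gff3` is hypothetical, and the example records are made up rather than taken from a real annotation.

```python
import io
from urllib.parse import unquote

# Made-up GFF3 records for illustration only (not real annotation data).
EXAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=demoA\n"
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "chr1\texample\tCDS\t150\t800\t.\t+\t0\tID=cds1;Parent=mrna1\n"
)

# Names of the first eight GFF3 columns; the ninth holds the attributes.
COLUMNS = ("seqid", "source", "type", "start", "end", "score", "strand", "phase")


def parse_gff3(handle):
    """Yield one dict per feature line, skipping blanks, comments and directives."""
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # anything after this directive is sequence, not features
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
        feature = dict(zip(COLUMNS, fields[:8]))
        feature["start"] = int(feature["start"])
        feature["end"] = int(feature["end"])
        # Column 9 is a ';'-separated list of URL-escaped key=value pairs.
        attributes = {}
        for pair in fields[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attributes[unquote(key)] = unquote(value)
        feature["attributes"] = attributes
        yield feature


features = list(parse_gff3(io.StringIO(EXAMPLE_GFF3)))
for f in features:
    print(f["type"], f["start"], f["end"], f["attributes"].get("ID"))
```

For real files, replace the `io.StringIO` object with an open file handle. A dedicated package is still the better choice for large or messy annotations, since this sketch does no validation beyond counting columns.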

conchoecia

I would use BCBio for GFF handling, as it is written to interface directly with Biopython's object model.

The only downside is that I believe it is no longer actively supported. It is, however, the package that the Biopython docs generally use. According to that link, though, there are plans to properly incorporate GFF/GTF parsers into Biopython in the not-too-distant future.

Joe Healey

Most GFF parsers handle the work of reading annotation data into objects for convenient data access, but most do not handle the important task of resolving relationships between features. The tag Python library does both, grouping related features together for inspection, traversal, and feature-by-feature processing.

CAVEAT: The tag library only supports GFF3.
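To make "resolving relationships between features" concrete: in GFF3, a child feature points at its parent through the `Parent` attribute, which refers to the parent's `ID`. The sketch below shows the general idea with the standard library only. It is not the tag library's API; the feature tuples and the helper names `build_children` and `descendants` are hypothetical, and it assumes each feature has at most one parent, whereas GFF3 allows several.

```python
from collections import defaultdict

# Hypothetical, already-parsed features as (ID, type, Parent) tuples.
# Made up for illustration; Parent is None for top-level features.
FEATURES = [
    ("gene1", "gene", None),
    ("mrna1", "mRNA", "gene1"),
    ("exon1", "exon", "mrna1"),
    ("exon2", "exon", "mrna1"),
    ("gene2", "gene", None),
]


def build_children(features):
    """Map each parent ID to the IDs of its direct children."""
    children = defaultdict(list)
    for feature_id, _type, parent in features:
        if parent is not None:
            children[parent].append(feature_id)
    return children


def descendants(feature_id, children):
    """Return every ID beneath feature_id, in depth-first order."""
    found = []
    for child in children.get(feature_id, []):
        found.append(child)
        found.extend(descendants(child, children))
    return found


children = build_children(FEATURES)
top_level = [fid for fid, _type, parent in FEATURES if parent is None]

# Group each top-level feature with everything that hangs off it.
groups = {fid: descendants(fid, children) for fid in top_level}
for fid, members in groups.items():
    print(fid, members)
```

Once features are grouped this way, each gene can be processed together with its transcripts and exons, rather than as unrelated lines in a file.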

DISCLAIMER: I'm the developer of the tag library, so this is a shameless plug. :-)

Daniel Standage