I'm trying to annotate a genome to find all genes with a specific function. I have the assembly as a FASTA file and the reads as FASTQ files, and I'd like to assign a functional group (e.g. KEGG Orthology) to the identified proteins automatically.
For more context, I have whole-genome sequences of bacteria and I'm trying to find all genes involved in membrane transport. I'd like either a tool that retrieves gene ontology terms from a list of gene names (input: genA, output: Membrane Transport) or a tool that outputs the function as part of the annotation pipeline (e.g. GenBank output with the GO function defined).
Or posed in a different way, how would you identify all genes with a specific function from prokaryotic whole genome sequence data?
Are there any scripts that can do this?
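To make the request concrete, here is a minimal sketch of the kind of filter described above. It assumes the annotation is available as a GFF3 file whose CDS features carry `product`, `gene` and `locus_tag` attributes. The keyword list and the function names are hypothetical, and matching keywords in product names is only a rough heuristic, not a real KEGG or GO assignment.

```python
import urllib.parse

# Hypothetical keyword list (an assumption); tune it for the function of interest.
TRANSPORT_KEYWORDS = ("transport", "permease", "porin", "efflux", "symporter", "antiporter")


def parse_attributes(field):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attrs[key] = urllib.parse.unquote(value)
    return attrs


def find_genes_by_keyword(gff_lines, keywords=TRANSPORT_KEYWORDS):
    """Return (locus_tag, gene, product) for CDS features whose product matches a keyword."""
    hits = []
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # GFF3 files may embed the sequences after this marker
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = parse_attributes(cols[8])
        product = attrs.get("product", "")
        if any(keyword in product.lower() for keyword in keywords):
            locus = attrs.get("locus_tag", attrs.get("ID", ""))
            hits.append((locus, attrs.get("gene", ""), product))
    return hits


# Made-up example records, for illustration only.
example = [
    "##gff-version 3\n",
    "contig1\tProdigal:2.6\tCDS\t1\t900\t.\t+\t0\tID=X_00001;gene=lacY;locus_tag=X_00001;product=Lactose permease\n",
    "contig1\tProdigal:2.6\tCDS\t1000\t1900\t.\t-\t0\tID=X_00002;gene=recA;locus_tag=X_00002;product=Protein RecA\n",
]
for hit in find_genes_by_keyword(example):
    print(hit)
```

On a real file the same function could be called as `find_genes_by_keyword(open("annotation.gff"))`, where the file name is a placeholder. A filter like this misses transporters whose product name contains none of the keywords, which is why an ontology-based assignment would be preferable.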
I'm looking for a way to find all genes present in an assembly that match a specific function (e.g. membrane transport). In the GenBank format, for example, I see "/function=", which is exactly what I want to assign, but I can't find a tool that writes that data. Prokka, for example, does not write it by default.
– MichaelKirst Nov 26 '17 at 14:04