Typically, you want to use a ComponentDefinition when a substructure is something you might want to pull out and reuse in another genetic design, and a SequenceAnnotation when it is not.
For example:
- Promoters, terminators, and protein coding sequences are typically best represented with their own ComponentDefinition, since these are often pulled out and recombined in new genetic designs. Representing such an element with a ComponentDefinition makes it easy to find the commonalities between designs.
- Assembly scars, binding sites, and cut sites are typically best represented with a SequenceAnnotation, because they are usually of little interest outside their surrounding genetic context. Representing such an element with a SequenceAnnotation keeps it tightly associated with that context.
Not every case is cut and dried, however. For example, ribosome entry sites are often treated as separable components, but their performance is very tightly tied to the specifics of the coding sequence that they modulate. Thus, in some cases it may make more sense to represent them separately as a ComponentDefinition and in others to represent them as a SequenceAnnotation on a joint RES/CDS ComponentDefinition.
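To make the distinction concrete, here is a minimal sketch in plain Python. It does not use any real SBOL library: the two dataclasses only borrow the names ComponentDefinition and SequenceAnnotation, and every part name, coordinate, and the `shared_parts` helper are hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SequenceAnnotation:
    """A named region that only has meaning inside its parent design."""
    name: str
    start: int  # illustrative coordinates, 1-based and inclusive
    end: int


@dataclass
class ComponentDefinition:
    """A stand-alone part that can be reused in several designs."""
    name: str
    components: List["ComponentDefinition"] = field(default_factory=list)
    annotations: List[SequenceAnnotation] = field(default_factory=list)


def shared_parts(a: ComponentDefinition, b: ComponentDefinition) -> List[str]:
    """Names of sub-components that appear in both designs (hypothetical helper)."""
    names_in_b = {c.name for c in b.components}
    return [c.name for c in a.components if c.name in names_in_b]


# Reusable parts get their own definition...
promoter = ComponentDefinition("promoter_1")
cds = ComponentDefinition("cds_1")
terminator = ComponentDefinition("terminator_1")

# ...so two designs can point at the same promoter and terminator.
design_a = ComponentDefinition("design_a", components=[promoter, cds, terminator])
design_b = ComponentDefinition(
    "design_b", components=[promoter, ComponentDefinition("cds_2"), terminator]
)

# A scar is only meaningful inside design_a, so it is just an annotation there.
design_a.annotations.append(SequenceAnnotation("assembly_scar", start=36, end=43))

# The ribosome entry site can be modelled either way:
# 1) as its own reusable definition next to the CDS,
res = ComponentDefinition("res_1")
separate = ComponentDefinition("unit_separate", components=[res, cds])
# 2) or as an annotation on a joint RES/CDS definition.
joint = ComponentDefinition(
    "res_cds_1", annotations=[SequenceAnnotation("res", start=1, end=20)]
)

print(shared_parts(design_a, design_b))
```

Because the promoter and terminator are separate definitions, finding what the two designs have in common is a simple lookup, whereas the scar never shows up outside `design_a`. The same trade-off decides which of the two RES representations fits a given case.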
SequenceFeature to SequenceAnnotation. Please confirm which of those is the correct class name. – terdon Jul 29 '19 at 09:09