In our lab, we work with synthetic biology components with partially random sequences (similar to work in directed evolution). So, for example, we have a plasmid design with several components, including one protein coding sequence that will be created from a wild-type sequence via mutagenic PCR.
At the design stage, we don't know the exact DNA sequence for any of the resulting variants of that protein coding sequence. We just know the wild-type sequence and the average number of mutations we'd like to apply.
How can we represent that design in SBOL (via the Component class)?
Then, when we implement the design, we end up with a single sample that contains plasmids with 10^5-10^6 (or more) different protein coding sequences. We then measure the complete DNA sequence for every plasmid variant, but I don't want to end up with >10^5 different SBOL definitions to describe the sample.
How can we best represent that sample in SBOL (via the Implementation class)?
Finally, we also end up measuring the performance of every plasmid variant, and we identify the specific protein coding sequences for useful and/or interesting variants. We then design and build specific, clonal plasmids with those useful/interesting protein coding sequences.
Representing those clonal plasmids as SBOL Components seems pretty straightforward, but is there a good way to capture the relationships with the original random-library plasmid design and the specific implementation of that design that we built and measured?
SequenceFeatures, one for each base in the protein coding sequence. Is that right? This seems a bit cumbersome, but not too hard to write code for. Does the idea of a SequenceFeature for each base in a protein coding sequence raise any red flags for you though? – David Ross Apr 03 '21 at 12:50

SequenceFeature for every base in the sequence does raise a red flag to me. I don't think we've worked out a best practice for this case yet, but my thought would be to mark the whole region as one SequenceFeature and tag it with a variant of all NNNNN.... bases – jakebeal Apr 03 '21 at 17:18
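
To make the "one feature for the whole region" suggestion concrete, here is a minimal plain-Python sketch. It does not use any SBOL library: `RegionFeature` and `library_feature` are hypothetical names invented for illustration, and the example wild-type sequence, start coordinate, and mutation count are made up. It only shows the shape of the data: a single region spanning the protein coding sequence, an all-`N` variant of the same length, and the design-stage average mutation count.

```python
from dataclasses import dataclass


@dataclass
class RegionFeature:
    """Hypothetical stand-in for one feature covering a whole region.

    This is NOT an SBOL class; it only mirrors the idea of one feature
    plus an all-N variant instead of one feature per base.
    """
    start: int             # 1-based, inclusive position in the plasmid
    end: int               # 1-based, inclusive position in the plasmid
    variant: str           # degenerate sequence, one IUPAC code per base
    mean_mutations: float  # design-stage target, not a measured value

    @property
    def per_base_rate(self) -> float:
        """Average mutations per base implied by the design target."""
        return self.mean_mutations / len(self.variant)


def library_feature(wild_type_cds: str, start: int,
                    mean_mutations: float) -> RegionFeature:
    """Describe a randomized coding region as a single feature."""
    if not wild_type_cds:
        raise ValueError("wild-type sequence must not be empty")
    length = len(wild_type_cds)
    return RegionFeature(
        start=start,
        end=start + length - 1,
        variant="N" * length,  # any base allowed at every position
        mean_mutations=mean_mutations,
    )


# Made-up example values, for illustration only.
wild_type = "ATGGCTAGCAAATAA"
feature = library_feature(wild_type, start=101, mean_mutations=1.5)
print(feature.start, feature.end, feature.variant)
```

The record stays the same size regardless of how many variants the library contains, which is the point of the suggestion. How such a record should map onto actual SBOL objects is the open best-practice question mentioned in the comment above.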