Probably a naive question. I am inexperienced.
I am interested in identifying potential CRISPs (cysteine-rich secretory proteins) in a tissue transcriptome (ca. 20k sequences in FASTA format). So far I have predicted signal peptides with SignalP and estimated the percentage of cysteine in each sequence. However, CRISPs are defined by a certain pattern, and I am not aware of any tool that detects it in batch.
Does anyone know the best/simplest way of detecting CRISP candidates?
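In case it helps to make the cysteine-percentage step concrete, here is a minimal sketch of how it could be done in plain Python. It assumes the transcripts have already been translated to protein; the function names `read_fasta` and `cysteine_percent` and the toy records are made up for illustration.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def cysteine_percent(seq):
    """Return the percentage of residues that are cysteine (C)."""
    seq = seq.upper().rstrip("*")  # drop a trailing stop-codon symbol
    if not seq:
        return 0.0
    return 100.0 * seq.count("C") / len(seq)


# Toy input (invented sequences, not real CRISPs).
toy_lines = [">seq1 example", "MKCCAA", "CC", ">seq2", "MKTAYIAK*"]
for name, seq in read_fasta(toy_lines):
    print(name, round(cysteine_percent(seq), 2))
```

For a real file, `read_fasta` can be given an open file handle instead of the toy list.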
[EDIT]
I have been asked for the pattern. I summarise what the literature says:
Several of them have a CAP domain;
There should be a region called the 'cysteine-rich domain' (CRD) in the carboxyl-terminal half of the protein, containing 10 of the roughly 16 conserved cysteines.
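To illustrate, the cysteine criterion above could be turned into a crude pre-filter along these lines. This is only a sketch under assumptions: the thresholds (16 cysteines in total, 10 in the C-terminal half) are taken at face value from the description and left adjustable because the literature says "roughly"; the function names are hypothetical; and the filter counts cysteines without checking their spacing or the presence of a CAP domain.

```python
def c_terminal_cysteines(seq):
    """Count cysteines in the carboxyl-terminal half of a protein sequence."""
    seq = seq.upper().rstrip("*")  # drop a trailing stop-codon symbol
    half = len(seq) // 2
    return seq[half:].count("C")


def is_crisp_candidate(seq, min_total=16, min_c_terminal=10):
    """Crude heuristic: enough cysteines overall and in the C-terminal half.

    The default thresholds are assumptions based on the description above,
    not validated cut-offs.
    """
    seq = seq.upper().rstrip("*")
    total = seq.count("C")
    return total >= min_total and c_terminal_cysteines(seq) >= min_c_terminal


# Toy sequences (invented): one with cysteines concentrated in the
# C-terminal half, one with them concentrated in the N-terminal half.
toy = {
    "c_term_rich": "A" * 20 + "C" * 6 + "A" * 10 + "C" * 10,
    "n_term_rich": "C" * 16 + "A" * 16,
}
for name, seq in toy.items():
    print(name, is_crisp_candidate(seq))
```

Sequences passing such a filter would still be only candidates; it narrows the 20k sequences down, it does not confirm a CRISP.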
I am particularly interested in venom-derived CRISPs, but I am not sure whether they cluster in a certain CRISP family.
References:
Roberts et al. (2007) Structure and function of epididymal protein cysteine-rich secretory protein-1.
Yamazaki & Morita (2004) Structure and function of snake venom cysteine-rich secretory proteins.