I have a 34-mer string like:

ATGGGGTTTCCC...CTG

I want to get all possible 6-mer substrings of this string. Can you suggest a good way to do this?

Bhargav Rao
Ssank
  • Quite close http://stackoverflow.com/questions/21303224/iterate-over-all-pairs-of-consecutive-items-from-a-given-list though not an exact dupe – Bhargav Rao Jun 10 '15 at 18:12

1 Answer

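Assuming the 6-mers have to be contiguous, you can slice the string inside a list comprehension. A string of length n contains n - 5 windows of length 6, which is why the range stops at len(s) - 5: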

>>> s = 'AGTAATGGCGATTGAGGGTCCACTGTCCTGGTAC'
>>> [s[i:i+6] for i in range(len(s)-5)]
['AGTAAT', 'GTAATG', 'TAATGG', 'AATGGC', 'ATGGCG', 'TGGCGA', 'GGCGAT', 'GCGATT', 'CGATTG', 'GATTGA', 'ATTGAG', 'TTGAGG', 'TGAGGG', 'GAGGGT', 'AGGGTC', 'GGGTCC', 'GGTCCA', 'GTCCAC', 'TCCACT', 'CCACTG', 'CACTGT', 'ACTGTC', 'CTGTCC', 'TGTCCT', 'GTCCTG', 'TCCTGG', 'CCTGGT', 'CTGGTA', 'TGGTAC']
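
If you need window sizes other than 6, the same slicing idea can be wrapped in a small helper. This is only a sketch: the function name `kmers` and its parameter `k` are made up for illustration and are not part of any library.

```python
def kmers(seq, k):
    """Return every contiguous length-k substring of seq, in order.

    Hypothetical helper: a string of length n has n - k + 1 such
    windows, so the result is empty when k is larger than len(seq).
    """
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


s = 'AGTAATGGCGATTGAGGGTCCACTGTCCTGGTAC'
six_mers = kmers(s, 6)
print(len(six_mers))  # 29 windows for a 34-character string
print(six_mers[0], six_mers[-1])  # AGTAAT TGGTAC
```

The result is a list, so repeated 6-mers are kept; pass it to `set()` or `collections.Counter` if you only want the distinct ones or their counts.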
Cory Kramer