Many natural languages have prefixes that add some meaning to a word.
For example: anti in antivirus, co in coordinator, counter in counterpart.
Detecting the stem requires these prefixes to be separated. Suppose we have a list of prefixes for a certain language:
prefix_list = ['c', 'ca', 'ata', 'de']
How can I match all possible overlapping occurrences in the word "catastrophic"?
The result should be:
['c', 'ca']
Trials:
- The `|` character doesn't support overlapping matches.
- Otto's solution doesn't match overlaps at the beginning of the word.
- I tried using a look-behind assertion instead in the previous solution, but look-behind requires a fixed-width pattern.
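The alternation problem can be reproduced with a short sketch. The names `pattern` and `anchored` are only illustrative. Alternation consumes the first branch that matches at each position, so `ca` is never reported once `c` has matched:

```python
import re

prefix_list = ['c', 'ca', 'ata', 'de']
word = "catastrophic"

# Plain alternation: matches are non-overlapping and found anywhere in the word.
pattern = re.compile('|'.join(re.escape(p) for p in prefix_list))
print(pattern.findall(word))   # ['c', 'ata', 'c']

# Anchoring at the start drops 'ata', but still yields only the first branch.
anchored = re.compile('^(?:' + '|'.join(re.escape(p) for p in prefix_list) + ')')
print(anchored.findall(word))  # ['c']
```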
Notes:
`ata` can't be a result, as the word doesn't start with `ata`.
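To pin down the expected behaviour, here is a minimal sketch that tests each prefix separately instead of combining them into one pattern. The function names `matching_prefixes` and `matching_prefixes_re` are hypothetical, and this is just one possible approach, not necessarily the most efficient one for a large prefix list:

```python
import re

prefix_list = ['c', 'ca', 'ata', 'de']
word = "catastrophic"

def matching_prefixes(word, prefixes):
    """Return every prefix the word starts with, in list order."""
    return [p for p in prefixes if word.startswith(p)]

def matching_prefixes_re(word, prefixes):
    """Same idea with regexes: re.match only matches at the start of the word."""
    return [p for p in prefixes if re.match(re.escape(p), word)]

print(matching_prefixes(word, prefix_list))     # ['c', 'ca']
print(matching_prefixes_re(word, prefix_list))  # ['c', 'ca']
```

Because every prefix is checked on its own, overlapping candidates such as `c` and `ca` are both reported, while `ata` is rejected since it does not occur at the start of the word.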