
I know that Python's split() method splits a string on a separator and returns a list, and that the separator has to be a single string. But if I have a large text and want to split it wherever any one of several substrings in a list is found, is there an easy way to do that in Python?

For example:

‘Dislike’ is too strong a word, but there is one feature in the Russian language that I find excessive and uncomfortable.

Russian nouns have genders: feminine, masculine and neuter. The adjective you use have to match in gender with the noun. And usually the word order is such that first you say the adjective and then the noun. Like in English — delicious apple or delicious pineapple.

But unlike English the words for apple and pineapple in Russian have different genders: apple is iábloko, neuter, and pineapple is ananás — masculine gender. So the word for 'delicious’ has to incorporate the gender of the succeeding noun. For the neuter it is vkúsnoie, for masculine — vkúsnyi. Sounds already bad for an English speaker, but for a Russian it is very natural.

But sometimes you are making a complex utterance on the go and you don't know with which words exactly you are going to finish your thought. So you start saying the adjective without knowing what noun you will use. You rely on the rich Russian vocabulary and your intuition hoping that no matter what adjective you are starting to say, you'll find a noun matching in gender. And most of the times you manage to pull this trick, but sometimes you can't find the matching noun, and then you have to say the whole phrase all over again. This is annoying.

If I want to split this text every time I find one of the substrings in substring_separators = ['.', '\n', 'and', 'And', 'but', 'But'], I cannot simply call parts = text.split(substring_separators), because split() accepts a single string as its separator, not a list.

What is the workaround for this?

  • Maybe you want to split by regex? – Lei Yang Jun 24 '21 at 03:34
  • Does this answer your question? [Split Strings into words with multiple word boundary delimiters](https://stackoverflow.com/questions/1059559/split-strings-into-words-with-multiple-word-boundary-delimiters) Especially the answer by gimel, rather than the accepted answer. – j1-lee Jun 24 '21 at 03:36
  • Yeah this solves the issue. Thanks. :') – Milind Chakraborty Jun 24 '21 at 03:45
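
For reference, the regex approach suggested in the comments could look roughly like the sketch below. It builds one alternation pattern from the list and passes it to `re.split`. The helper name `split_on_any` and the short sample text are made up for illustration; only `substring_separators` comes from the question. Each separator goes through `re.escape` so that `'.'` is matched literally rather than as "any character".

```python
import re

# Separator list taken from the question.
substring_separators = ['.', '\n', 'and', 'And', 'but', 'But']


def split_on_any(text, separators):
    """Split text wherever any of the given substrings occurs.

    Hypothetical helper, not part of the standard library.
    """
    # Try longer separators first so a short one cannot shadow a longer one.
    ordered = sorted(separators, key=len, reverse=True)
    # Escape regex metacharacters, then join into "a|b|c".
    pattern = "|".join(re.escape(sep) for sep in ordered)
    # Trim whitespace and drop the empty pieces left between adjacent separators.
    return [piece.strip() for piece in re.split(pattern, text) if piece.strip()]


# Short made-up sample instead of the full text from the question.
sample = "I like tea. But I drink coffee\nand water"
print(split_on_any(sample, substring_separators))
```

Note that this splits on the raw substrings, so `'and'` also matches inside words such as "understand". If only whole words should count as separators, wrapping the word separators in `\b` word boundaries is one option.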
