
I was wondering whether two adjacent tokens can be merged into one when a specific token follows another. For example:

This is a test token and it is a test to see if it works.

In the sentence above let's say we get token as:

token = ['This', 'is', 'a', 'test', 'token', 'and', 'it', 'is', 'a', 'test', 'to', 'see', ...]

What I want is this: whenever 'test' is followed by the token 'token', 'test token' should become a single token.

I have looked around and tried everything but I couldn't fix it.

jonrsharpe
Sam

1 Answer


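I think you mean this: put the phrase first in the alternation so it is matched before the generic \S+ pattern splits it.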

>>> import re
>>> s = "This is a test token and it is a test to see if it works."
>>> re.findall(r'\btest token\b|\S+', s)
['This', 'is', 'a', 'test token', 'and', 'it', 'is', 'a', 'test', 'to', 'see', 'if', 'it', 'works.']
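
If the tokens already exist as a list, as in the question, the merge can also be done after tokenization. This is only a minimal sketch: `merge_phrase` is a hypothetical helper name, not part of any library, and it assumes the phrase is given as a list of tokens that must match exactly and in order.

```python
def merge_phrase(tokens, phrase):
    """Return a new list where each run of tokens equal to `phrase`
    is replaced by one space-joined token.

    `merge_phrase` and its parameters are illustrative names only.
    """
    n = len(phrase)
    if n == 0:
        # Nothing to merge; return a copy so the caller's list is untouched.
        return list(tokens)

    merged = []
    i = 0
    while i < len(tokens):
        # Compare the next n tokens against the phrase.
        if tokens[i:i + n] == phrase:
            merged.append(" ".join(phrase))
            i += n  # skip past the whole phrase
        else:
            merged.append(tokens[i])
            i += 1
    return merged


tokens = ['This', 'is', 'a', 'test', 'token', 'and', 'it', 'is', 'a', 'test', 'to', 'see']
print(merge_phrase(tokens, ['test', 'token']))
```

The second 'test' is left alone because it is followed by 'to', not 'token'.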
Avinash Raj