0

I have a list in the following format:

my_list = ['apple', 'apple', 'boy', 'cat', 'cat', 'apple', 'apple',
           'apple', 'boy', 'cat', 'cat', 'dog', 'dog']

My expected output is:

res = ['apple', 'boy', 'cat', 'apple', 'boy', 'cat', 'dog']

Each run of consecutive identical words should be collapsed to a single occurrence, regardless of whether the same word already appeared in an earlier run.

I tried the following code:

test_list = ['apple', 'apple', 'boy', 'cat', 'cat', 'apple', 'apple',
             'apple', 'boy', 'cat', 'cat', 'dog', 'dog']
res = []
for x in test_list:
    # Skips x if it appears anywhere in res, not only when it repeats the previous item
    if x not in res:
        res.append(x)
print("The list after removing duplicates : " + str(res))

Output: ['apple', 'boy', 'cat', 'dog'], which contains only the distinct words. How can I proceed from here to get what I actually need? Thanks in advance.

Epsi95
BiSu

3 Answers

2

Use itertools.groupby, which groups consecutive equal items:

from itertools import groupby

# groupby yields one (key, group) pair per run of equal consecutive items
res = [key for key, _ in groupby(my_list)]
print(res)  # ['apple', 'boy', 'cat', 'apple', 'boy', 'cat', 'dog']
Karl Knechtel
Epsi95
    I simplified your code - the first element of the returned tuples, which you initially ignored, is already exactly what you want (so there's no need to parse the second element). – Karl Knechtel Feb 05 '21 at 10:26
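To make the point about the returned tuples concrete, here is a small sketch of what groupby produces for the list in the question. The names runs and res are only illustrative.

```python
from itertools import groupby

my_list = ['apple', 'apple', 'boy', 'cat', 'cat', 'apple', 'apple',
           'apple', 'boy', 'cat', 'cat', 'dog', 'dog']

# Each (key, group) pair covers one run of equal consecutive items;
# here the group is reduced to its length to show the size of each run.
runs = [(key, len(list(group))) for key, group in groupby(my_list)]
print(runs)
# [('apple', 2), ('boy', 1), ('cat', 2), ('apple', 3), ('boy', 1), ('cat', 2), ('dog', 2)]

# Keeping only the keys collapses every run to a single item.
res = [key for key, _ in runs]
print(res)
# ['apple', 'boy', 'cat', 'apple', 'boy', 'cat', 'dog']
```

Because groupby only compares neighbouring items, 'apple' shows up again when a new run of it starts later in the list.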
0

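Use set(), which discards duplicate values. Note that it removes every repeated value, not only consecutive repeats, and it does not preserve the original order.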

test_list = ['apple', 'apple', 'boy', 'cat', 'cat', 'apple', 'apple', 
         'apple', 'boy', 'cat', 'cat', 'dog', 'dog'] 
         
t = set(test_list)

Output:

{'apple', 'boy', 'cat', 'dog'}

If needed, you can convert the set back into a list with:

list(t)

Output (a set is unordered, so the order may vary):

['dog', 'boy', 'apple', 'cat']
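If first-seen order matters, one possible variation (not part of the original answer) is dict.fromkeys, which keeps insertion order in Python 3.7 and later. The name unique_in_order is only illustrative. Like set(), this still removes every repeat rather than only consecutive ones, so it does not produce the output requested in the question.

```python
test_list = ['apple', 'apple', 'boy', 'cat', 'cat', 'apple', 'apple',
             'apple', 'boy', 'cat', 'cat', 'dog', 'dog']

# dict keys are unique and keep first-seen order (Python 3.7+),
# so this drops all repeats while preserving the order of first appearance.
unique_in_order = list(dict.fromkeys(test_list))
print(unique_in_order)  # ['apple', 'boy', 'cat', 'dog']
```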
pfabri
blaze
0

Try this:

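# Append a sentinel so the last real item always has a successor to compare with
my_list = ['apple', 'apple', 'boy', 'cat', 'cat', 'apple', 'apple',
           'apple', 'boy', 'cat', 'cat', 'dog', 'dog'] + [""]
# Keep an item only when the next item differs, i.e. at the end of each run
res = [my_list[i] for i in range(len(my_list) - 1) if my_list[i + 1] != my_list[i]]
print(res)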
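The same neighbour comparison can also be written without a sentinel value, which avoids dropping the final item if the list itself happens to end with an empty string. This is a minimal sketch; collapse_runs is a hypothetical helper name, not something from the answer above.

```python
def collapse_runs(items):
    """Keep one item from each run of equal consecutive items."""
    result = []
    for item in items:
        # Append only when the item differs from the last one kept.
        if not result or item != result[-1]:
            result.append(item)
    return result


my_list = ['apple', 'apple', 'boy', 'cat', 'cat', 'apple', 'apple',
           'apple', 'boy', 'cat', 'cat', 'dog', 'dog']
print(collapse_runs(my_list))
# ['apple', 'boy', 'cat', 'apple', 'boy', 'cat', 'dog']
```

Comparing against the last kept item rather than the next input item means the original list is left unmodified and an empty list is handled without a special case.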
dimay