
I have two lists, list1 and list2. list2 contains words that need to be removed from list1. For example:

list1=['paste', 'text', 'text', 'here', 'here', 'here', 'my', 'i', 'i', 'me', 'me']

list2=["i","me"]

Desired output:

list3=['paste', 'text', 'text', 'here', 'here', 'here', 'my']

I have tried several versions using a for loop, but none has worked so far.

Any ideas would be appreciated!

Jon Clements
rodrigocf

2 Answers


Use a list comprehension:

>>> list1 = ['paste', 'text', 'text', 'here', 'here', 'here', 'my', 'i', 'i', 'me', 'me']
>>> list2 = ["i","me"]
>>> list3 = [item for item in list1 if item not in list2]
>>> list3
['paste', 'text', 'text', 'here', 'here', 'here', 'my']

NOTE: Membership tests on a list are O(n). Consider building a set from list2 instead, since set lookups are O(1) on average.
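The note above can be sketched as a small helper. This is only an illustration: the name `remove_words` and its parameters are made up here, and it assumes every item in both lists is hashable (true for strings).

```python
def remove_words(words, to_remove):
    """Return a new list with every item found in to_remove filtered out."""
    # Build the set once so each membership test is O(1) on average.
    blocked = set(to_remove)
    # Order and duplicates of the kept items are preserved.
    return [word for word in words if word not in blocked]


list1 = ['paste', 'text', 'text', 'here', 'here', 'here', 'my', 'i', 'i', 'me', 'me']
list2 = ["i", "me"]
list3 = remove_words(list1, list2)
print(list3)  # ['paste', 'text', 'text', 'here', 'here', 'here', 'my']
```

For a two-item list2 the difference is negligible; the set is likely to matter only once list2 grows large.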

alecxe

What about leveraging set arithmetic?

diff = set(list1) - set(list2)
result = [o for o in list1 if o in diff]

Or, more efficiently, convert only list2 to a set:

set2 = set(list2)
result = [o for o in list1 if o not in set2]
Marsellus Wallace
  • It's much less expensive to just check that an element of list1 (without making it a set) is not in a set of list2... – Jon Clements Jul 29 '13 at 21:55
  • Would the removal of duplicates in list1 pay for the set overhead? In the example data there are repeated items - where's the breakeven likely to show up? – theodox Jul 30 '13 at 05:08
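
One way to explore the breakeven question is simply to measure it. Below is a rough sketch using the standard-library `timeit` module. The three function names are hypothetical, chosen only for this illustration, and the timings will vary with list sizes, the amount of duplication, and the machine, so no numbers are claimed here.

```python
import timeit

list1 = ['paste', 'text', 'text', 'here', 'here', 'here', 'my', 'i', 'i', 'me', 'me']
list2 = ["i", "me"]


def filter_with_list(words, to_remove):
    # Membership test against the list itself: O(len(to_remove)) per item.
    return [w for w in words if w not in to_remove]


def filter_with_set(words, to_remove):
    # Only to_remove becomes a set; words is scanned once.
    blocked = set(to_remove)
    return [w for w in words if w not in blocked]


def filter_with_set_difference(words, to_remove):
    # Both lists become sets first, then words is scanned once.
    keep = set(words) - set(to_remove)
    return [w for w in words if w in keep]


results = {}
for func in (filter_with_list, filter_with_set, filter_with_set_difference):
    results[func.__name__] = func(list1, list2)
    # Time 10,000 repetitions of each approach on the sample data.
    seconds = timeit.timeit(lambda: func(list1, list2), number=10_000)
    print(f"{func.__name__}: {seconds:.4f}s for 10,000 runs")
```

To look for a crossover point, rerun this with longer lists and different proportions of repeated items in place of the sample data.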