25

I'd like to flatten lists that may contain other lists without breaking strings apart. For example:

In [39]: list( itertools.chain(*["cat", ["dog","bird"]]) )
Out[39]: ['c', 'a', 't', 'dog', 'bird']

and I would like

['cat', 'dog', 'bird']
Cev

7 Answers

29

Solution:

def flatten(foo):
    for x in foo:
        if hasattr(x, '__iter__') and not isinstance(x, str):
            for y in flatten(x):
                yield y
        else:
            yield x

Old version for Python 2.x:

def flatten(foo):
    for x in foo:
        if hasattr(x, '__iter__'):
            for y in flatten(x):
                yield y
        else:
            yield x

(In Python 2.x, strings conveniently didn't have an __iter__ attribute, unlike pretty much every other iterable object in Python. In Python 3 they do, so this older version only works in Python 2.x.)
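For Python 3 only, the same idea can be written a little more compactly. This is a minimal sketch, not the answer's original code: it assumes that `bytes` should be kept whole just like `str`, and it swaps the `hasattr` check for `collections.abc.Iterable` and the inner loop for `yield from`.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield the leaves of an arbitrarily nested iterable, keeping strings whole."""
    for x in items:
        # str and bytes are iterable too, but are treated as atoms here (assumption)
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x


print(list(flatten(["cat", ["dog", ("bird", [b"fish"])]])))
# ['cat', 'dog', 'bird', b'fish']
```

Because it is a generator, wrap the call in `list()` when an actual list is needed.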

smci
Amber
  • 1
    Only the Python 3 version uses `isinstance()` and there's no `basestring` class in Python 3 since all strings are Unicode. – kindall Mar 13 '11 at 02:56
  • @Hugh Bothwell: `hasattr(u'foo', '__iter__') == False` in Python 2.x, and Python 3 doesn't have a `basestring`, all it has is `str` (which is unicode) and `bytes`. – Amber Mar 13 '11 at 04:56
  • It looks like the Python 3 version would work in Python 2 also, even if it's a little bit redundant. – Mark Ransom Jul 25 '13 at 17:46
8

A slight modification of orip's answer that avoids creating an intermediate list:

import itertools
items = ['cat',['dog','bird']]
itertools.chain.from_iterable(itertools.repeat(x,1) if isinstance(x,str) else x for x in items)
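Note that the expression above returns a lazy iterator, not a list, and that it only removes one level of nesting. A short sketch of both points (the variable names `flat`, `nested`, and `one_level` are just for illustration):

```python
import itertools

items = ['cat', ['dog', 'bird']]

# chain.from_iterable returns an iterator; list() materializes it
flat = itertools.chain.from_iterable(
    itertools.repeat(x, 1) if isinstance(x, str) else x for x in items
)
print(list(flat))  # ['cat', 'dog', 'bird']

# Only one level is flattened: deeper lists survive as-is
nested = ['cat', ['dog', ['bird']]]
one_level = list(itertools.chain.from_iterable(
    itertools.repeat(x, 1) if isinstance(x, str) else x for x in nested
))
print(one_level)  # ['cat', 'dog', ['bird']]
```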
Geoff Reedy
2

A brute-force way would be to wrap each string in its own list, then use itertools.chain:

>>> l = ["cat", ["dog","bird"]]
>>> l2 = [([x] if isinstance(x,str) else x) for x in l]
>>> list(itertools.chain(*l2))
['cat', 'dog', 'bird']
orip
1
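def squash(L):
    # Recursively flatten a list whose elements are strings or lists
    if L == []:
        return []
    elif type(L[0]) == type(""):
        M = squash(L[1:])
        M.insert(0, L[0])
        return M
    elif type(L[0]) == type([]):
        M = squash(L[0])
        # extend, not append: append would nest the flattened tail as one element
        M.extend(squash(L[1:]))
        return M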

def flatten(L):
    return [i for i in squash(L) if i!= []]

>>> flatten(["cat", ["dog","bird"]])
['cat', 'dog', 'bird']

Hope this helps

inspectorG4dget
1

Here's a one-liner approach:

[item for sublist in [[item] if type(item) is not list else item for item in list1] for item in sublist]
Adam Zeldin
1

With the reduce function from the functools library you can do it like this:

import functools
items = ['cat',['dog','bird']]
print(functools.reduce(lambda a, b: [a] + b, items))
Paul Jansen
  • Does not work if the first element of `items` is a list: `functools.reduce(lambda a, b: [a] + b, [["dog", "bird"], "cat"])` gives `TypeError: can only concatenate list (not "str") to list` – Adam Parkin Mar 15 '21 at 18:01
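One way around the problem raised in that comment is to give `reduce` an empty list as its initial value and wrap only the non-list elements. The following is a sketch of that variant, not the answer's original code; the name `flatten_once` is hypothetical, and like the original it only removes one level of nesting.

```python
import functools


def flatten_once(items):
    # Start from [] so the first element gets no special treatment
    return functools.reduce(
        lambda acc, x: acc + (x if isinstance(x, list) else [x]),
        items,
        [],
    )


print(flatten_once([["dog", "bird"], "cat"]))  # ['dog', 'bird', 'cat']
print(flatten_once(["cat", ["dog", "bird"]]))  # ['cat', 'dog', 'bird']
```

Each step builds a new list with `+`, so this is fine for short inputs but copies more than the generator-based answers do for long ones.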
1

A lambda-based approach that works for more than two levels of nesting:

>>> items = ['cat',['dog','bird',['fish']]] 
>>> flatten = lambda y: [k for j in ([i] if not isinstance(i,list) else flatten(i) for i in y) for k in j]
>>> flatten(items)
['cat', 'dog', 'bird', 'fish']