
I have some use cases in which I need to run generator functions without caring about the yielded items.
I cannot make them non-generator functions, because in other use cases I certainly need the yielded values.

I am currently using a trivial self-made function to exhaust the generators.

def exhaust(generator):
    for _ in generator:
        pass

I wondered whether there is a simpler way to do this that I'm missing?

Edit: Here is a use case:

def create_tables(fail_silently=True):
    """Create the respective tables."""

    for model in MODELS:
        try:
            model.create_table(fail_silently=fail_silently)
        except Exception:
            yield (False, model)
        else:
            yield (True, model)

In some contexts, I care about the error and success values…

from sys import stderr

for success, table in create_tables():
    if success:
        print('Creation of table {} succeeded.'.format(table))
    else:
        print('Creation of table {} failed.'.format(table), file=stderr)

… and in some I just want to run the function "blindly":

exhaust(create_tables())
Richard Neumann

4 Answers


Setting up a for loop for this could be relatively expensive, keeping in mind that a for loop in Python is fundamentally a succession of simple assignment statements: you'll be executing n assignments (one per item in the generator), only to discard the assignment targets afterwards.

You can instead feed the generator to a zero-length deque, which consumes it at C speed and does not use up memory the way `list` and other callables that materialise iterators/generators do:

from collections import deque

def exhaust(generator):
    deque(generator, maxlen=0)

Taken from the consume itertools recipe.
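For reference, the full `consume` recipe from the `itertools` documentation also supports advancing only n steps:

```python
from collections import deque
from itertools import islice

def consume(iterator, n=None):
    """Advance the iterator n steps ahead; if n is None, consume entirely."""
    if n is None:
        # Feed the entire iterator into a zero-length deque.
        deque(iterator, maxlen=0)
    else:
        # Advance to the empty slice starting at position n.
        next(islice(iterator, n, n), None)
```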

Moses Koledoye
  • Doesn't a `for` loop also run at C-speed and use the same memory? Maybe show some timings? – Chris_Rands Nov 23 '17 at 13:34
  • 1
    @Chris_Rands Yes it runs at C-speed, but not entirely since callbacks are made to Python, to repeat the loop until completion. Besides, the repeated assignment is just extra overhead. – Moses Koledoye Nov 23 '17 at 13:37
  • 2
    Using `deque` seems to be just a tiny bit faster than the plain loop proposed by the OP; with a `consume.generator` function which yields the first 1000 numbers, running `for _ in consume.generator(): pass` takes 71.8 usec per loop for me, `deque(consume.generator(), maxlen=0)` takes 67.4 (according to `timeit`). – Frerich Raabe Nov 23 '17 at 13:40
  • @FrerichRaabe Thanks for metrics. Try scaling up the numbers a little? – Moses Koledoye Nov 23 '17 at 13:41
  • 1
    @MosesKoledoye For consuming the first million numbers, a for loop needs `77.9` ms here, and `deque` needs `70.8`. I.e. `deque` appears to be a bit less than 10% faster here (it doesn't seem to scale better in my tests). – Frerich Raabe Nov 23 '17 at 13:43
  • 8
    Also, funny that whenever someone asks "how to do this simple thing in a pythonic way", everything turns into "how to do it in least microseconds per loop" – Kos Nov 23 '17 at 13:45
  • 4
    @Kos Indeed, especially since `for _ in generator: pass` is not only less magical than the `deque` solution, it's also one character shorter than `deque(generator, maxlen=0)`. :-) – Frerich Raabe Nov 23 '17 at 13:49
  • @FrerichRaabe Let alone the characters (and time!) spent on `from collections import deque`; and for such one-liners one might not even need a function. (Oops, I just made one of those "why...? / don't...!" comments I dislike so much when I ask "does it exist? / is it possible?") – Max Jan 17 '20 at 02:44
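The timings discussed in the comments can be reproduced with a quick `timeit` sketch along these lines (the generator is recreated inside each statement, so its construction cost is included in both measurements; absolute numbers will differ by machine and Python version):

```python
import timeit

n = 20  # repetitions per measurement

# Plain for loop over a fresh generator each run.
t_loop = timeit.timeit(
    "for _ in (i for i in range(100_000)): pass",
    number=n)

# Zero-length deque over a fresh generator each run.
t_deque = timeit.timeit(
    "deque((i for i in range(100_000)), maxlen=0)",
    setup="from collections import deque",
    number=n)

print(f"for loop: {t_loop:.4f}s  deque: {t_deque:.4f}s")
```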

Based on your use case it's hard to imagine that there would be sufficiently many tables to create that you would need to consider performance.

Additionally, table creation is going to be much more expensive than iteration.

So the for loop that you already have would seem the simplest and most Pythonic solution - in this case.

mhawke

One very simple and possibly efficient solution could be

def exhaust(generator):
    all(generator)

if we can assume that the generator always yields truthy values, as in your case, where a two-element tuple `(success, table)` is truthy even if `success` and `table` are both `False`. If it always yields falsy values, use `any(generator)` instead; and in the "worst case" of mixed values, `all(x or True for x in generator)`.

Being that short & simple, you might not even need a function for it!
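The caveat about truthiness matters because `all()` short-circuits: it stops consuming at the first falsy item, as this sketch shows:

```python
# all() stops at the first falsy value, so the generator is only
# fully exhausted when every item is truthy.
g = (x for x in [1, 2, 0, 3])
all(g)
print(list(g))  # [3] -- the item after the falsy 0 was never consumed

# The defensive variant consumes everything regardless of truthiness.
g = (x for x in [1, 2, 0, 3])
all(x or True for x in g)
print(list(g))  # []
```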

Regarding the "why?" comment (I dislike these...): there are many cases where one may want to exhaust a generator. To cite just one, it's a way of running a for loop as an expression, e.g. `any(print(i, x) for i, x in enumerate(S))`; of course there are less trivial examples.
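That loop-as-expression trick works because `print()` returns `None`, which is falsy, so `any()` never short-circuits and the generator expression runs to completion:

```python
S = ["spam", "eggs"]
# print() returns None (falsy), so any() evaluates every element
# and the side effect (the print) runs for each one.
result = any(print(i, x) for i, x in enumerate(S))
print(result)  # False -- but every item was printed along the way
```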

Max

You could just have two functions that each do one thing and call the appropriate one at the appropriate time?

def create_table(model, fail_silently=True):
    """Create the table."""
    try:
        model.create_table(fail_silently=fail_silently)
    except Exception:
        return (False, model)
    else:
        return (True, model)

def create_tables(MODELS):
    for model in MODELS:
        create_table(model)

def iter_create_tables(MODELS):
    for model in MODELS:
        yield create_table(model)

When you care about the returned values do:

for success, table in iter_create_tables(MODELS):
    if success:
        print('Creation of table {} succeeded.'.format(table))
    else:
        print('Creation of table {} failed.'.format(table), file=stderr)

When you don't, just do:

create_tables(MODELS)
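A runnable sketch of this two-function approach, using a hypothetical `DummyModel` as a stand-in for the real model class:

```python
class DummyModel:
    """Hypothetical stand-in for a real ORM model class."""
    def __init__(self, name, ok=True):
        self.name = name
        self.ok = ok

    def create_table(self, fail_silently=True):
        if not self.ok:
            raise RuntimeError("table creation failed")

    def __str__(self):
        return self.name

def create_table(model, fail_silently=True):
    """Create one table, returning (success, model)."""
    try:
        model.create_table(fail_silently=fail_silently)
    except Exception:
        return (False, model)
    else:
        return (True, model)

def iter_create_tables(models):
    for model in models:
        yield create_table(model)

models = [DummyModel("users"), DummyModel("orders", ok=False)]
for success, table in iter_create_tables(models):
    print("ok" if success else "failed", table)
```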
abrac
JeffUK