2

I have a simple for loop in a Python script:

for filename in filenames:
    outline = getinfo(filename)
    outfile.write(outline)

This for loop is part of a larger script that extracts data from HTML pages. I have nearly 6 GB of HTML pages and want to do some test runs before I try it on all of them.

I have searched but can't find a way to make my for loop break after n iterations (let's say 100).

arshajii
user2475523

4 Answers

11
for filename in filenames[:100]:
    outline = getinfo(filename)
    outfile.write(outline)

The list slice filenames[:100] produces a new list containing just the first 100 file names (or fewer, if the list is shorter). The original list is left unchanged.
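Slicing only works on sequences such as lists. If the file names come from a generator or another lazy iterable, itertools.islice gives the same "first n" behaviour. A minimal sketch, assuming a made-up generator of names (the first_n helper and the page_N.html names are hypothetical, not from the question):

```python
from itertools import islice


def first_n(iterable, n):
    """Return an iterator over at most n items of any iterable."""
    return islice(iterable, n)


# Hypothetical stand-in for a lazily produced stream of file names.
filenames = ("page_%d.html" % i for i in range(1000))

# Only the first 100 names are ever pulled from the generator.
limited = list(first_n(filenames, 100))
print(len(limited))
```

In the original loop this would read `for filename in islice(filenames, 100):`.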

kqr
8

Keep a counter for your for loop. When the counter reaches 100, break:

counter = 0
for filename in filenames:
    if counter == 100:
        break
    outline = getinfo(filename)
    outfile.write(outline)
    counter += 1
waldol1
  • The preferred way to keep a counter is to do `for (counter, filename) in enumerate(filenames)`. – kqr Jun 11 '13 at 17:19
2

I like @kqr's answer, but here is another approach to consider: instead of taking the first 100 files, you could take a random sample of n files:

from random import sample
for filename in sample(filenames, 10):
    pass  # process filename here
Jon Clements
  • I would argue this is the better solution as long as it doesn't have any terrible performance issues. – kqr Jun 11 '13 at 17:28
  • @kqr the main issue I would be worried about is reproducibility. So maybe the compromise is to take 1 in n instead, which could be done nicely with slicing as shown in your answer, and would still be more useful for testing. – Jon Clements Jun 11 '13 at 17:30
  • Yes, that's what I thought too, but discarded since it would probably have similar performance to taking a random sample. I didn't think about testability, but you are indeed correct. – kqr Jun 11 '13 at 17:34
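The two ideas from these comments can be sketched as follows: a step slice takes 1 file in every n, and a seeded random.Random instance makes a random sample repeatable between runs. This is only an illustration. The file list, the step of 10, and the seed 42 are arbitrary assumptions, not values from the thread.

```python
import random

# Hypothetical file list standing in for the real one.
filenames = ["page_%d.html" % i for i in range(1000)]

# Take every 10th file: deterministic and spread across the whole list.
every_tenth = filenames[::10]

# Take a random sample that is repeatable because the generator is seeded.
rng = random.Random(42)  # 42 is an arbitrary seed
random_subset = rng.sample(filenames, 100)

print(len(every_tenth), len(random_subset))
```

Using a separate random.Random instance keeps the seed from affecting other code that relies on the module-level random functions.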
1

Use the built-in function enumerate(), available in both Python 2 and 3.

for idx, filename in enumerate(filenames):
    if idx == 100:
        break
    outline = getinfo(filename)
    outfile.write(outline)

Also look at this.