
Specification of the problem:

I'm searching through a very large number of lines in a log file and sorting those lines into groups according to regular expressions (regexes) that I have stored, using the re.match() function. Unfortunately, some of my regexes are too complicated, and Python sometimes gets stuck in backtracking hell. Because of this, I need to protect the call with some kind of timeout.
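
For illustration only, the grouping step described above could be sketched as follows. The group names and expressions are hypothetical stand-ins, not the actual ones from the log analysis, and the helper `group_lines` is an assumption introduced for this example.

```python
import re
from collections import defaultdict

# Hypothetical patterns; the real expressions are more complicated.
PATTERNS = {
    "error": re.compile(r"ERROR\s+(?P<msg>.*)"),
    "login": re.compile(r"user=(?P<user>\w+)\s+logged in"),
}

def group_lines(lines, patterns=PATTERNS):
    """Assign each line to the first group whose pattern matches it."""
    groups = defaultdict(list)
    for line in lines:
        for name, pattern in patterns.items():
            # re.match anchors at the start of the line
            if pattern.match(line):
                groups[name].append(line)
                break
        else:
            # no pattern matched this line
            groups["unmatched"].append(line)
    return dict(groups)
```

Every line passes through `pattern.match`, so a single expression prone to catastrophic backtracking (a nested quantifier such as `(a+)+$` is the classic case) can stall the whole loop.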

Problems:

  • re.match, which I'm using, is a Python library function, and as I found out somewhere here on StackOverflow (I'm really sorry, I cannot find the link now :-( ), it is very difficult to interrupt a thread that is running a Python library call. For this reason threads are out of the game.
  • Because a single re.match call takes a relatively short time and I want to analyse a large number of lines with it, I need a timeout mechanism that won't take long to set up (this makes threads even less suitable, since initialising a new thread takes a really long time) and that can be set to less than one second.
    For those reasons, the answers here - Timeout on a function call and here - Timeout function if it takes too long to finish with decorator (alarm - 1sec and more) are off the table.

I've spent this morning searching for a solution to this question but did not find any satisfactory answer.

Jendas

1 Answer


Solution:

I've just modified a script posted here: Timeout function if it takes too long to finish.

And here is the code:

from functools import wraps
import errno
import os
import signal

# Note: on Python 3 this shadows the built-in TimeoutError inside this module.
class TimeoutError(Exception):
    pass

def timeout(seconds=10, error_message=os.strerror(errno.ETIME)):
    def decorator(func):
        def _handle_timeout(signum, frame):
            raise TimeoutError(error_message)

        def wrapper(*args, **kwargs):
            signal.signal(signal.SIGALRM, _handle_timeout)
            # setitimer accepts a float, so sub-second timeouts work (alarm takes whole seconds)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                result = func(*args, **kwargs)
            finally:
                # cancel the timer with the same API that armed it
                signal.setitimer(signal.ITIMER_REAL, 0)
            return result
        return wraps(func)(wrapper)
    return decorator

And then you can use it like this:

from timeout import timeout, TimeoutError
import time

@timeout(0.01)
def loop():
    while True:
        pass

try:
    begin = time.time()
    loop()
except TimeoutError:
    print("Time elapsed: {:.3f}s".format(time.time() - begin))

Which prints

Time elapsed: 0.010s
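
Applied to the original use case, the same idea could wrap re.match roughly as below. This is a sketch under assumptions rather than a definitive implementation: `MatchTimeout`, `guarded_match` and `safe_match` are hypothetical names, the 0.05 s limit is arbitrary, and the code needs a Unix system and must run in the main thread because it relies on SIGALRM.

```python
import re
import signal
from functools import wraps

class MatchTimeout(Exception):
    """Hypothetical exception name used only in this sketch."""

def timeout(seconds):
    def decorator(func):
        def _handle_timeout(signum, frame):
            raise MatchTimeout()

        @wraps(func)
        def wrapper(*args, **kwargs):
            # remember the previous handler so it can be restored afterwards
            old_handler = signal.signal(signal.SIGALRM, _handle_timeout)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # disarm the timer and restore the previous handler
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator

@timeout(0.05)  # arbitrary limit chosen for illustration
def guarded_match(pattern, line):
    return re.match(pattern, line)

def safe_match(pattern, line):
    """Return the match object, or None if there is no match or the timer fires."""
    try:
        return guarded_match(pattern, line)
    except MatchTimeout:
        return None
```

Treating a timeout like a non-match keeps the calling loop simple, at the cost of hiding which lines timed out. Python runs signal handlers only between interpreter instructions, so, as the signal documentation warns, a timer that fires during a long C-level regex match may be delivered late.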
Jendas
  • This is basically a wholesale copy of the other answer; the only difference is that you show the seconds parameter can be a float. – Martijn Pieters Aug 10 '12 at 12:23
  • Yes, but using setitimer instead of alarm solved the problem: I can now set the time to a float. I thought it would be clearer if I posted the whole code, and I referenced that answer. I didn't mean to steal someone’s credit. :-) – Jendas Aug 10 '12 at 12:30
  • Ah, that's the difference, and that's indeed more significant. – Martijn Pieters Aug 10 '12 at 12:30
  • @Jendas: I hope you're going to fix that bare except before you accept the answer. They are rarely what you want. – MRAB Aug 11 '12 at 01:17
  • According to the `signal` documentation, this won't work: "Although Python signal handlers are called asynchronously as far as the Python user is concerned, they can only occur between the “atomic” instructions of the Python interpreter. This means that signals arriving during long calculations implemented purely in C (such as regular expression matches on large bodies of text) may be delayed for an arbitrary amount of time." – Phillip Jan 19 '15 at 13:55
  • I'm noticing this sometimes doesn't work and just hangs. It seems to happen at random, especially in some infinite loops that never terminate, and I can't quite pin it down. – Evan Pu Feb 23 '16 at 04:24
  • Well, that is interesting. Might it be the problem Phillip mentioned? Although I have never experienced anything like that... – Jendas Feb 23 '16 at 07:49
  • @Jendas Python now has TimeoutError as a built-in exception. Perhaps you could update :) Thanks for the nice solution – Muhammad Ali Jan 17 '19 at 12:51
  • Since you are using `signal`, this code only works on UNIX... what about Windows? – Dilshat Dec 24 '19 at 22:28