
I'm creating threads with threading.Timer(2, work). Inside each work function, when some condition is met, a global counter must be incremented without conflicting access to the counter variable among the spawned work threads.

I've tried a Queue.Queue holding the counter as well as threading.Lock(). What is the best way to implement a thread-safe increment of a global variable?

Someone previously asked a similar question here: Python threading. How do I lock a thread?


3 Answers


Not sure if you have tried this specific syntax already, but for me this has always worked well:

Define a global lock:

import threading
threadLock = threading.Lock()

and then acquire and release the lock every time you increment the counter in your individual threads (inside a function you also need a `global global_counter` declaration):

with threadLock:
    global_counter += 1
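Put together as a runnable sketch (plain Thread objects stand in here for the Timer threads from the question; the thread count of 10 is arbitrary):

```python
import threading

threadLock = threading.Lock()
global_counter = 0

def work():
    global global_counter
    # the lock serializes the read-modify-write on the counter,
    # so no increment can be lost to interleaving
    with threadLock:
        global_counter += 1

threads = [threading.Thread(target=work) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(global_counter)  # 10
```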

One solution is to protect the counter with a multiprocessing.Lock. You could keep it in a class, like so:

from multiprocessing import Process, RawValue, Lock
import time

class Counter(object):
    def __init__(self, value=0):
        # RawValue because Value would create its own lock; we manage one explicitly:
        self.val = RawValue('i', value)
        self.lock = Lock()

    def increment(self):
        with self.lock:
            self.val.value += 1

    def value(self):
        with self.lock:
            return self.val.value

def inc(counter):
    for i in range(1000):
        counter.increment()

if __name__ == '__main__':
    thread_safe_counter = Counter(0)
    procs = [Process(target=inc, args=(thread_safe_counter,)) for i in range(100)]

    for p in procs: p.start()
    for p in procs: p.join()

    print(thread_safe_counter.value())

The above snippet was first taken from Eli Bendersky's blog, here.

  • Do you need to do that? Provided you are prepared to assume CPython, the Global Interpreter Lock means that each Python operation is atomic, so `self.value += 1` should be atomic. Or do I misunderstand something? – Martin Bonner supports Monica Jan 29 '16 at 15:50
  • Using threads in CPython is mostly nonsensical, and the question isn't CPython specific. Even in CPython, some things happen outside the GIL, so I'd rather not encourage bad habits. – Michael Foukarakis Feb 03 '16 at 08:24
  • Also for anyone who stumbles across this later... `self.value += 1` is not a single python operation in the relevant sense, because it compiles down into multiple bytecodes. Effectively it's `tmp1 = self.value; tmp2 = tmp1.__iadd__(1); self.value = tmp2`, where each of those 3 statements is atomic, but the sequence as a whole is definitely not. – Nathaniel J. Smith Jan 20 '17 at 08:16
  • @MichaelFoukarakis Threads are definitely not a waste of time. You can download 1000 websites much faster with threads than without. – AlexLordThorsen May 21 '17 at 18:02
  • @MichaelFoukarakis Threads have their uses, for example when making a non-blocking single-core program such as a GUI. Threads are not necessarily about multiprocessing: a new thread is not required to run on another core, which is a misconception many people seem to have. Clearly they are not mostly nonsensical; that is an overly broad statement to make about something as essential as threads. – Zelphir Kaltstahl Aug 03 '17 at 12:06
  • Could this instead be done with multiprocessing.sharedctypes.Value with lock=True? `counter = Value(typecode_or_type, *args, lock=True)` From the documentation: "The same as RawValue() except that depending on the value of lock a process-safe synchronization wrapper may be returned instead of a raw ctypes object." – hansonap Dec 14 '20 at 14:28
  • This is a threading question - use `threading.Lock`. `multiprocessing.Lock` is a much more expensive hammer. – Tom Swirly May 02 '22 at 18:20
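Following up on the last two comments, here is a sketch of the simpler multiprocessing.Value approach. Note one subtlety: even though Value carries a built-in lock, `counter.value += 1` by itself is not atomic; the documented pattern is to take the lock explicitly via `get_lock()`. (Process and iteration counts below are arbitrary.)

```python
from multiprocessing import Process, Value

def inc(counter):
    for _ in range(1000):
        # Value's built-in lock must be taken explicitly for a
        # read-modify-write; += on .value alone is not atomic
        with counter.get_lock():
            counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)  # 'i': signed int, initial value 0
    procs = [Process(target=inc, args=(counter,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)  # 4000
```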

If you're using CPython[1], you can do this without explicit locks:

import itertools

class Counter:
    def __init__(self):
        self._incs = itertools.count()
        self._accesses = itertools.count()

    def increment(self):
        next(self._incs)

    def value(self):
        return next(self._incs) - next(self._accesses)

my_global_counter = Counter()

We need two counters: one to count increments and one to count accesses of value(). This is because itertools.count does not provide a way to access the current value, only the next value. So we need to "undo" the increments we incur just by asking for the value.

This is thread-safe because itertools.count.__next__() is atomic in CPython (thanks, GIL!) and we don't persist the difference.

Note that if value() is accessed in parallel, the exact number may not be perfectly stable or strictly monotonically increasing. It could be plus or minus a margin proportional to the number of threads accessing. In theory, self._incs could be updated first in one thread while self._accesses is updated first in another thread. But overall the system will never lose any data due to unguarded writes; it will always settle to the correct value.
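A quick demonstration under thread contention (thread and iteration counts are arbitrary; the Counter class from above is repeated so the snippet stands alone):

```python
import itertools
import threading

class Counter:
    def __init__(self):
        self._incs = itertools.count()
        self._accesses = itertools.count()

    def increment(self):
        next(self._incs)

    def value(self):
        # subtract the reads of _incs that value() itself causes
        return next(self._incs) - next(self._accesses)

def worker(c, n):
    for _ in range(n):
        c.increment()

c = Counter()
threads = [threading.Thread(target=worker, args=(c, 10000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(c.value())  # 80000 once all threads have joined
```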

[1] Not all Python is CPython, but a lot (most?) is.

[2] Credit to https://julien.danjou.info/atomic-lock-free-counters-in-python/ for the initial idea of using itertools.count to increment and a second access counter to correct. They stopped just short of removing all locks.
