
I would like to use Pool within a class, but there seems to be a problem. My code is long, so I created a small demo variant to illustrate the problem. It would be great if you could give me a variant of the code below that works.

from multiprocessing import Pool

class SeriesInstance(object):
    def __init__(self):
        self.numbers = [1,2,3]
    def F(self, x):
        return x * x
    def run(self):
        p = Pool()
        print p.map(self.F, self.numbers)


ins = SeriesInstance()
ins.run()

Outputs:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

And then hangs.

Dan D.
user58925

3 Answers


It looks like instance methods can't be used here because of how the function is passed to the worker processes: it has to be pickled. My first thought was to use lambdas, but it turns out the built-in pickler can't serialize those either. The solution, sadly, is to use a function in the global namespace. As suggested in other answers, you can also use static methods and pass self explicitly to make it look more like an instance method.

from multiprocessing import Pool
from itertools import repeat

class SeriesInstance(object):
    def __init__(self):
        self.numbers = [1,2,3]

    def run(self):
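        # The with-block shuts the pool down once both maps have finished,
        # so repeated calls to run() do not leave worker processes behind.
        with Pool() as p:
            squares = p.map(self.F, self.numbers)
            multiples = p.starmap(self.G, zip(repeat(self), [2, 5, 10]))
        return (squares, multiples)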

    @staticmethod
    def F(x):
        return x * x

    @staticmethod
    def G(self, m):
        return [m * n for n in self.numbers]

if __name__ == '__main__':
    print(SeriesInstance().run())
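
The pickling constraint described above can be checked directly, without starting any worker processes. This is a minimal sketch assuming Python 3; `square` and `can_pickle` are hypothetical helper names that are not part of the original code. On Python 3 I would expect the module-level function and the static method to pickle, and the lambda to be rejected.

```python
import pickle


def square(x):
    # Module-level function: pickled by reference to its importable name.
    return x * x


class SeriesInstance(object):
    @staticmethod
    def F(x):
        # Static method: reachable by name as SeriesInstance.F.
        return x * x


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("module-level function:", can_pickle(square))
    print("static method:", can_pickle(SeriesInstance.F))
    print("lambda:", can_pickle(lambda x: x * x))
```

Anything that passes this check should be acceptable as the first argument to `Pool.map`, since the pool relies on the same pickling step to ship the function to its workers.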
Alex Sherman
  • Thanks, but there seems to be a problem. When I use this principle in my larger code it crashes after a few iterations with the following error "OSError: [Errno 35] Resource temporarily unavailable" – user58925 Sep 01 '15 at 20:45
  • I think the error is due to OS errors when creating too many processes. It seems like you need to properly close your pools when you're done with them; this is just a guess though. Depending on your actual code, which you should provide if you can, you could use a single pool that you pass to run as a parameter, or, for every SeriesInstance, close its pool when you're done with it. – Alex Sherman Sep 01 '15 at 21:22
  • Typically you should limit the size of your pool by "p = mp.Pool(mp.cpu_count())". This works perfectly. – Steve Lihn Jul 01 '21 at 20:28
  • This answer misses the point of the question. What if the function `F` uses class members? – Livne Rosenblum Feb 17 '22 at 12:26
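
The points raised in these comments (closing pools, sharing one pool, and a function that depends on instance state) can be sketched together. This assumes Python 3, where, to my understanding, bound methods can be pickled along with their instance and `Pool` works as a context manager; it would not work on the Python 2.7 shown in the question. The `factor` attribute is a hypothetical instance member added only for illustration.

```python
import multiprocessing


class SeriesInstance(object):
    def __init__(self, factor):
        self.numbers = [1, 2, 3]
        self.factor = factor  # hypothetical instance member used by F

    def F(self, x):
        # Depends on instance state; the instance is pickled and sent
        # to the workers together with the method.
        return self.factor * x

    def run(self, pool):
        # The caller supplies the pool, so many instances can share one.
        return pool.map(self.F, self.numbers)


if __name__ == "__main__":
    # One pool for every instance; it is shut down when the with-block exits.
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        for factor in (2, 5, 10):
            print(SeriesInstance(factor).run(pool))
```

Creating the pool once and passing it in avoids starting a new set of worker processes for each `SeriesInstance`, which is the kind of resource exhaustion the comments suspect. The trade-off is that the whole instance is pickled for each task, so large instances add overhead.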

You can also use multiprocessing with static functions in the class.

stardust

You get the error because pickle can't serialize an instancemethod. You can use this small workaround:

from itertools import repeat
from multiprocessing import Pool


class SeriesInstance:
    def __init__(self):
        self.numbers = [1, 2, 3]

    def F(self, x):
        return x * x

    def run(self):
        p = Pool()
        print(list(p.starmap(SeriesInstance.F, zip(repeat(self), self.numbers))))


if __name__ == '__main__':
    SeriesInstance().run()