Python Multiprocessing with class methods and class variables

Question

I am writing a program that exports PDF files, and I want to speed it up using Python's multiprocessing module and its map function (I am using Python 3.7.9).

I am running into trouble applying multiprocessing to my code. I broke the code down to its essentials, which look like this:

from multiprocessing import Pool


def callFunctionInClass(i):
    #calls the function inside the class which is supposed to be parallelized
    SomeClass().func1(i)
    

class SomeClass:
    
    def setupUi(self):
        # sets up the Graphical user interface
        return
        
    def func1(self, i):
        currentListItem = self.someList[i]
        #from here on out I create a PDF which then is exported
        
    def parallelizationFunction(self):
        #set up the pool
        pool = Pool()
        # range which the map function is supposed to iterate over
        someRange = list(range(0, len(self.someList)))
        pool.map(callFunctionInClass, someRange)
        
        

if __name__ == "__main__":
    #calls the class and sets up the GUI
    ui = SomeClass()
    ui.setupUi()

The pool is initialized inside the class, and map calls the top-level function, which in turn is supposed to call the class method that I want to run in parallel. Similar questions have already been asked here, and I tried to adapt my code according to the answers, but I still run into errors (mostly pickling errors).

So far I have tried several things, such as passing self to the top-level function or using pathos ProcessingPool, but I could not get either to work. My question is: how can I pass instance values to functions outside the class without running into pickling errors (maybe multiprocessing.Value does the job)? I hope my question is not too vague; otherwise I will try to provide more information. Many thanks in advance!

This is not a problem with multiprocessing; it is an OOP issue. You can't call an instance method from outside the class without an instance. Without one, you can only call a `static` function, and inside a static function you can only access class variables, not instance variables. - shoaib30, Aug 27 '21 at 11:04

shoaib30 · Answer 1 · 2021-08-27T11:41:43.913

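You cannot call an instance method from outside a class without an object of that class. In your code, callFunctionInClass creates a new SomeClass() object, which does not carry the someList of the object you set up in the main block. You can either keep the list outside the class and use that, or create a static method that reads a class variable.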
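To make the failure concrete, here is a minimal sketch without any multiprocessing. It assumes someList is attached to the instance some time after construction; the question does not show where that happens, so load_data below is a hypothetical stand-in. The point is only that a freshly created object does not have the attribute.

```python
class SomeClass:
    def load_data(self):
        # Hypothetical stand-in: the question does not show where someList is set
        self.someList = ['a', 'b', 'c']

    def func1(self, i):
        return self.someList[i]


def call_on_fresh_instance(i):
    # Same pattern as callFunctionInClass: a brand-new object with no someList
    return SomeClass().func1(i)


ui = SomeClass()
ui.load_data()
print(ui.func1(0))  # works, because this instance has someList

try:
    call_on_fresh_instance(0)
except AttributeError as exc:
    # The new instance never had load_data called on it
    print("fresh instance fails:", exc)
```

The same thing happens inside each worker process, so the data has to reach the worker some other way.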

Using a List outside the class

from multiprocessing import Pool

dataList = [{'data': 'a'}, {'data': 'b'}, {'data': 'c'}]


def callFunctionInClass(i):
    # calls the function inside the class which is supposed to be parallelized
    SomeClass().func1(i)


class SomeClass:

    def setupUi(self):
        print('UI')
        # sets up the Graphical user interface
        return

    def func1(self, i):
        currentListItem = dataList[i]
        print(currentListItem)
        # from here on out I create a PDF which then is exported

    def parallelizationFunction(self):
        # set up the pool; the with-block shuts it down once map has returned
        with Pool() as pool:
            # indices that map hands to the workers, one per list entry
            someRange = list(range(len(dataList)))
            pool.map(callFunctionInClass, someRange)


if __name__ == "__main__":
    # calls the class and sets up the GUI
    ui = SomeClass()
    ui.setupUi()
    ui.parallelizationFunction()

This simply extends your code to refer to a list outside the class, and it works as expected. Because the list is defined at module level, every worker process has its own copy of it.
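If the list only exists on the instance, another option is to hand the list items themselves to map instead of indices. The following is a rough sketch, not your actual export code: export_item is a hypothetical placeholder for the PDF creation, and it assumes each item is plain picklable data (dicts, strings, numbers) rather than a GUI object.

```python
from multiprocessing import Pool


def export_item(item):
    # Hypothetical placeholder for the PDF export; receives one list item
    return "exported " + item['data']


class SomeClass:
    def __init__(self):
        # Assumed sample data; in the real program this comes from the GUI
        self.someList = [{'data': 'a'}, {'data': 'b'}, {'data': 'c'}]

    def parallelizationFunction(self):
        # Only the items are pickled and sent to the workers, never self
        with Pool(processes=2) as pool:
            return pool.map(export_item, self.someList)


if __name__ == "__main__":
    ui = SomeClass()
    print(ui.parallelizationFunction())
```

Since the worker function is a top-level function and never touches self, the class instance (and anything unpicklable attached to it) stays in the main process.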

Using a static method (Static methods in Python)

from multiprocessing import Pool



def callFunctionInClass(i):
    # func1 is a static method, so it can be called on the class itself
    SomeClass.func1(i)


class SomeClass:
    dataList = [{'data': 'a'}, {'data': 'b'}, {'data': 'c'}]

    def setupUi(self):
        print('UI')
        # sets up the Graphical user interface
        return

    @staticmethod
    def func1(i):
        currentListItem = SomeClass.dataList[i]
        print(currentListItem)
        # from here on out I create a PDF which then is exported

    def parallelizationFunction(self):
        # set up the pool; the with-block shuts it down once map has returned
        with Pool() as pool:
            # indices that map hands to the workers, one per list entry
            someRange = list(range(len(SomeClass.dataList)))
            pool.map(callFunctionInClass, someRange)


if __name__ == "__main__":
    # calls the class and sets up the GUI
    ui = SomeClass()
    ui.setupUi()
    ui.parallelizationFunction()
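One caveat with module-level or class-level data: when worker processes are started with the spawn method (the default on Windows), each worker re-imports the module, so values that were changed at runtime in the main process are not visible there. A possible way around this is to pass the data to every worker once through the initializer and initargs arguments of Pool. The names init_worker, export_by_index and run below are made up for this sketch.

```python
from multiprocessing import Pool

_worker_data = None  # filled in inside each worker by init_worker


def init_worker(data):
    # Runs once in every worker process when the pool starts
    global _worker_data
    _worker_data = data


def export_by_index(i):
    # Hypothetical placeholder for the PDF export of entry i
    return _worker_data[i]['data']


def run(data):
    # data is pickled once per worker instead of relying on module state
    with Pool(processes=2, initializer=init_worker, initargs=(data,)) as pool:
        return pool.map(export_by_index, range(len(data)))


if __name__ == "__main__":
    print(run([{'data': 'a'}, {'data': 'b'}]))
```

This keeps the index-based map call from the examples above, but the list can be built at runtime, as long as it is picklable.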
