I need to run a parameter sweep (a grid search) over a model written in Python, where most of the heavy work is done in Cython. I have thousands of input parameter sets to run through. I have a local desktop computer with 36 cores; each core is hyper-threaded, for a total of 72 hardware threads. The machine has 128 GB of RAM, and each run can take up to 2 GB of RAM.
What is the most efficient way to run concurrent jobs to do the parameter sweep? I have done it in two different ways:
- Using Python multiprocessing:
from multiprocessing import Pool

pool = Pool(processes=70)          # pool of up to 70 worker processes
pool.map(my_model, sweep_params)   # one call to my_model per parameter set
pool.close()                       # no more tasks will be submitted
pool.join()                        # wait for all workers to finish
Here my intent is to use up to 70 concurrent processes at a time, each running my_model with a different set of input parameters. sweep_params is a list of thousands of dictionaries, each containing the inputs required to run my model, including a unique identifier used to map each output back to the concurrent run that generated it (a minimal sketch of this pattern is shown after the two methods below).
- Using SLURM on a cluster, with the environment variable SLURM_ARRAY_TASK_ID used to launch the concurrent Python jobs. SLURM_ARRAY_TASK_ID simply indexes the list of input dictionaries used in method 1 (see the sketch below). However, for any Python job I am limited to 50 parallel tasks on that cluster. I am hesitant to install SLURM on the 72-thread local machine used in method 1, as I am not clear on whether running <=70 tasks with SLURM locally would be more efficient than method 1.
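For clarity, here is a minimal, self-contained sketch of the pattern I use in method 1; my_model, run_id and the parameter values below are dummy stand-ins for my real Cython-backed model and inputs:

from multiprocessing import Pool

def my_model(params):
    # Dummy stand-in for the real Cython-backed model; it returns the
    # unique identifier together with a placeholder result.
    return params["run_id"], params["a"] * params["b"]

if __name__ == "__main__":
    # Placeholder for the real list of thousands of parameter dictionaries.
    sweep_params = [{"run_id": i, "a": 0.1 * i, "b": 2.0} for i in range(1000)]

    with Pool(processes=70) as pool:
        # Each worker runs my_model on one dictionary; the returned pairs
        # let me map every output back to the run that produced it.
        results = dict(pool.map(my_model, sweep_params))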
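And this is roughly what each array task does in method 2 (again, the parameter list and the model call are placeholders; the array is submitted with something like sbatch --array=0-999):

import os

# Placeholder for the same list of parameter dictionaries as in method 1;
# every array task rebuilds (or loads) it in the same order.
sweep_params = [{"run_id": i, "a": 0.1 * i, "b": 2.0} for i in range(1000)]

# SLURM sets SLURM_ARRAY_TASK_ID for each task of the job array.
task_id = int(os.environ["SLURM_ARRAY_TASK_ID"])
params = sweep_params[task_id]

result = params["a"] * params["b"]  # stand-in for running the real model
print(params["run_id"], result)     # in practice I write this to a file named after run_id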
Would SLURM run things more efficiently than method 1 if installed on my local machine?
Is there another more efficient way to perform the parameter sweep on my 72-thread machine?
I have also seen the multiprocessing-based solution at How to use multiprocessing for grid search (parameter optimization) in Python, which is from 7 years ago. Would it be equivalent to what I am already doing in method 1, or more efficient?