5

I am trying to run this constraint problem but the memory runs out, $$S_{i}$$ are 1975 Students that need to be assigned to one of 188 teacher assistants classes, each teacher assistant has to chose a time slot $$TA_{j}$$ from out of 8. Each Teacher assistant and Student have their expressed time slots in dfTA and dfS data frames.

The idea is to assign a teacher assistant to each student and a time slot to each teacher assistant. Of course all students in a class must be able to take that class as well as the teacher assistant to give it.

import constraint
problem = constraint.Problem()
for i in range(0,1974):
    problem.addVariable(f'S_{i}', range(0,187))
for i in range(0,187):
    problem.addVariable(f'TA_{i}', range(0,8))
for S in range(0,1974):
    for TA in range(0,187):
        exec(f"""def timezone{S}_{TA}(s,t):
                if s!=TA:
                    return True
                if s==TA and (dfS.iloc[S,1+t]>0)*(dfTA.iloc[TA,t]>0):
                    return True
                else:
                    return False
problem.addConstraint(timezone{S}_{TA}, ['S_{S}','TA_{TA}'])""")
problem.getSolutions()

If anyone knows how to solve this or optimize this it would be of great use.

Link to a Colab notebook: https://colab.research.google.com/drive/1pb9qM13S2GmpjHAWIAEUCRwYwvyF68IT?usp=sharing

And data: https://drive.google.com/drive/folders/1J6yAfXIKcn0NZrT6xtluxt71nyhuR0ak?usp=sharing

jeroaranda
  • 95
  • 4
  • Can you add sufficient context data and code for someone else to be able to reproduce the issue? – Richard Jun 21 '20 at 15:32
  • that is exactly the code I have, dfTA.iloc[i,j] is 1 if the teacher i is able to give class at time slot j and dfS.iloc[i,j] is 1 if student i is able to take class at time slot j. They are only allowed to choose 2 contigous time slots. What else do you think would be necessary? – jeroaranda Jun 21 '20 at 19:36
  • A place to acquire the dataset, and clearly there is more code since the code you posted doesn't include reading or generating the data frame. A Minimum Working Example is most likely to get you quality help. – Richard Jun 21 '20 at 19:40
  • yeah the other things are just imports and the data frame reads, here is a link to a Colab notebook https://colab.research.google.com/drive/1pb9qM13S2GmpjHAWIAEUCRwYwvyF68IT?usp=sharing – jeroaranda Jun 21 '20 at 19:43
  • Unfortunately, other people cannot read the contents of your Google Drive. – Richard Jun 21 '20 at 20:14
  • Thanks!, data is here https://drive.google.com/drive/folders/1J6yAfXIKcn0NZrT6xtluxt71nyhuR0ak?usp=sharing – jeroaranda Jun 22 '20 at 13:16
  • Have you tried to use another constraint programming solver? Google OR-Tools for example would be my first choice at the moment if you do not want to use commercial tools. – JakobS Jun 30 '20 at 08:15

3 Answers3

8

Be sure that your code reaches the getSolutions line. As of now, you are not sure that it does.

Your Python code is creating more than 3000000 functions! There is a good chance that is what is causing your memory issues. Don't use exec, and create a single timezone function instead with S and TA as additional parameters. You can use a lambda to pass it to addConstraint.

Then you want to make sure your model works. Use a smaller problem size - the current one is way too big, and you won't be able to debug the issues you run into. Try to state each constraint in words and explain what it does and why - it helps.

Ggouvine
  • 1,877
  • 7
  • 13
  • Thanks for the insight, it does reach the getSolutions line, it actually takes a few seconds to reach it. I tried adding an if clause, so that only the student and teachers which do share time slots get a function compiled.

    I did the exec because I was not sure on how to pass the student and teacher indexes as parameters, as you need those to actually check the time constraints. Can you elaborate in the way to use a lambda function to do this. Thank you!

    – jeroaranda Jun 20 '20 at 15:54
  • As I remember, there is an example right on the python-constraint page. – Ggouvine Jun 20 '20 at 17:22
2

You could try using Z3, like so:

#!/usr/bin/env python3

import itertools

import pandas as pd import z3

#Read data df = pd.read_csv('constraints.csv') dfTA = pd.read_csv('ta_timezone.csv') dfS = pd.read_csv('student_timezone.csv')

problem = z3.Solver()

number_of_students = 1974 number_of_tas = 187 number_of_timeslots = 9

student_to_ta = [] for i in range(number_of_students): temp = z3.Int(f"S_{i}") student_to_ta.append(temp) problem.add(0<=temp) problem.add(temp<number_of_tas)

#Assign TAs to timeslots ta_to_timeslot = [] for i in range(number_of_tas): temp = z3.Int(f"TA_{i}") ta_to_timeslot.append(temp) problem.add(0<=temp) problem.add(temp<number_of_timeslots) for t in range(number_of_timeslots): #TA must like this time slot problem.add(z3.Implies(temp==t, bool(dfTA.iloc[i,t]>0)))

#Assign students to TAs for s, ta, t in itertools.product(range(0,number_of_students), range(0,number_of_tas), range(number_of_timeslots)): #If (student is assigned to TA and TA is assigned to this timeslot) then (student must like this time slot) problem.add(z3.Implies(z3.And(student_to_ta[s]==ta, ta_to_timeslot[ta]==t), bool(dfS.iloc[s,1+t]>0)))

if problem.check()!=z3.sat: print("Problem could not be solved!") else: solution = problem.model() print("Students to TAs: ", [solution[x] for x in student_to_ta]) print("TAs to Timeslots: ", [solution[x] for x in ta_to_timeslot])

However, the problem is big and Python is slow, so this formulation may also be problematic. Moving to C++ might be a more robust option for a problem of this size. I'd suggest Julia, but its Z3 package is 5 years old and doesn't work any more.

Richard
  • 543
  • 2
  • 13
2

The problem here is that you are generating way more variables & functions than Pyomo can normally handle. However, I am guessing that you don't strictly need all of them, i.e., it's possible that there is inherent sparsity and/or symmetry in the problem that you are not exploiting. My suggestion would be do add if statements in your definitions to skip the generations of variables & functions that are redundant.

If that is not possible in your case, even though Pyomo is great, it does have its limitations so you can try using commercial modelling software such as AMPL, which would most likely blaze through that problem.

Nikos Kazazakis
  • 12,121
  • 17
  • 59