14

Is there a single crisp and accessible reference which covers how to generate Mixed Integer Programming formulations to linearize products, handle logical constraints and disjunctive constraints, do Big M, et. in optimization problems?

Are there any pitfalls to beware of when using Big M to accomplish any of these?

Mark L. Stone
  • 13,392
  • 1
  • 31
  • 67
  • I believe separate questions fit better to the Q&A format. – ayhan May 31 '19 at 13:24
  • 1
    There is a discussion on meta about whether FAQs and wikis are appropriate at this point. All current answers suggest to wait. If you disagree, please consider participating in the discussion – Michael Feldmeier May 31 '19 at 13:25
  • 7
    I disagree with this approach for two reasons: (1) I think these questions should be separate, not grouped into a large FAQ. It is easier to search for and find a question and answer if the Q&A are narrow and specific. Moreover, separate Q&As give new users opportunities to vote, as well as to earn reputation points, which beta sites need. (2) I don't like the approach of having a bunch of links in the answer. If it's going to be an FAQ, then the answers should be self-contained. – LarrySnyder610 May 31 '19 at 13:26
  • I will strip off the FAQ: – Mark L. Stone May 31 '19 at 13:26
  • On the other hand, I think at some point in the future, it could be very useful to collate the individual Q&As into an FAQ for someone who wants to see it all in one place. That FAQ should have links to the individual Q&As. – LarrySnyder610 May 31 '19 at 13:27
  • I also think such an approach will be closer to Wikipedia rather than SE. The nice thing of SE is to discuss many details of, e.g., Big-M.

    Such an attempt would be equivalent to going to Stack Overflow and writing: "to handle zillion of questions about Pandas/Python, we are sharing all the details of how to handle the issues in the package".

    – independentvariable May 31 '19 at 13:28
  • I know this is an FAQ because of the old OR Exchange site. This one "answer" covers a large fraction of the questions there. Then questions can focus on nuances, more advanced cases, . But having a single good answer is not good if you like lots of repetitive questions. – Mark L. Stone May 31 '19 at 13:29
  • @LarrySnyder610 Should that FAQ be on main or meta? – ə̷̶̸͇̘̜́̍͗̂̄︣͟ May 31 '19 at 13:31
  • I thin it makes sense to have some basic reading material as a prerequisite. Then ask questions. – Mark L. Stone May 31 '19 at 13:31
  • @TheSimpliFire I'm not sure. Are FAQs on other sites usually on main or meta? – LarrySnyder610 May 31 '19 at 13:35
  • 1
    The overwhelming majority of noobies are NOT going to be reading META before posting questions on main. Anyhow, I think FAQs which are technical on topic of the site, belong on main, not META. – Mark L. Stone May 31 '19 at 13:38
  • 1
    Given that I produced an answer which covers it all, it is not too broad. I am not of the analysis by the pound school, nor do I think it a bad thing if one question and answer can knock out a broad swath of questions people have. I think that;s much better than cluttering up with questions on every little thing which can be handled at once. I think one stop shopping is good if it delivers. Anyhow, i am in the minority of active people here, so have at it.. – Mark L. Stone Jun 01 '19 at 02:54
  • OR Exchange is no back online in read only mode. Look what I found https://www.or-exchange.org/questions/14268/half-the-questions-on-this-board-could-be-eliminated-if-there-were-an-faq-directing-people-to-automated-software-for-linearizing-constraints-conditional-constraints-big-m-etc It's no less true now than when I wrote it over 2 1.2 years ago. – Mark L. Stone Jun 04 '19 at 20:14
  • @MarkL.Stone "Half the questions on this board could be eliminated..." I am not sure this is inherently a good thing. True, if there are fewer questions then there is less noise to sift through to find the question you are interested in -- but then there is more noise within that question/answer to find the actual answer you are interested in. – LarrySnyder610 Jun 05 '19 at 13:28
  • @LarrySnyder61 As soon as I got in the biz after school, I saw that there were many practitioners of "analysis by the pound" - page after page of meaningless or low quality drivel.. I don't like it in analysis, I don't like it in forums. I'd rather have a smaller quantity of high quality material, not spread out all over the place. I doubt this forum will be shut down for lack of volume. And if a large amount of clutter is what it takes to prevent shut down, I would view that as a Pyrrhic victory. – Mark L. Stone Jun 05 '19 at 13:38
  • @MarkL.Stone You are assuming that the content in multiple, more narrowly focused questions is "low quality drivel" while the content in fewer, broader questions is "high quality material". I don't think that's the case. If we were to build a single reference for all of the topics in the title of this post, I assume the quality would be the same for each of them than it would be if they were contained in separate posts. Having a single post lets one person (or many, if it's a CW) "curate" the content, but the voting system is the community's way of curating multiple narrow questions. – LarrySnyder610 Jun 05 '19 at 13:47
  • I did not intend to make that assumption. My analogy was not exactly applicable or as artfully presented as it should have been. But clutter still applies. Unified presentations make things easier to follow. Clutter obscures good content. In any event, we have different opinions. – Mark L. Stone Jun 05 '19 at 13:58
  • OK, fair enough. – LarrySnyder610 Jun 08 '19 at 11:43
  • @LarrySnyder61 Think of trying to get up to speed in a new area. You have to read 10 papers, all of which use different notation and convention. Now think of a book or review paper (e.g. in SIAM Review) which presents all that information in a unified manner, with common notation and conventions - much easier to wade through and understand. – Mark L. Stone Jun 08 '19 at 12:05
  • 1
    Oh I agree there is a value to tutorial-style resources for people trying to start a new topic. I just don’t think that’s the best way to answer narrow, specific questions; e.g., if someone wants to know how to linearize a constraint, this Q&A will be more useful than a much longer, more general post with many different topics. – LarrySnyder610 Jun 08 '19 at 12:17
  • There might be value to also having tutorial-style posts, but that depends on (a) whether the community wants them, and (b) whether SE is a good format/has good tools for such posts. In any case I don’t think we should default to that type of post. – LarrySnyder610 Jun 08 '19 at 12:17
  • 1
    I think having QA pairs intended to explain basic material about questions frequently asked can be very useful and can fit in the SE format. There are many questions where the most helpful answer is "Try to understand the basic material, and if you still have problems after that, ask a more specific question". If you have a QA pair covering the relevant basic material, you can then close those questions as a duplicate of the question on the basics. – Discrete lizard Jun 08 '19 at 14:10
  • On [cs.se], we do this a lot. Questions such as https://cs.stackexchange.com/questions/23593/ or https://cs.stackexchange.com/questions/23593/ are frequently used to direct questions at and the system seems to work well. – Discrete lizard Jun 08 '19 at 14:10
  • 1
    I do think that it is best to try to place as much content inside the answers of the question, rather than in links to other resources. But that is not a reason for not asking the question and anyone is of course free to provide such an answer. – Discrete lizard Jun 08 '19 at 14:11
  • I wonder if the 2nd part, "Are there any pitfalls to beware of when using Big M to accomplish any of these?", might be best asked as a separate question. – SecretAgentMan Jun 13 '19 at 16:16
  • @ SecretAgentMan It has subsequently in https://or.stackexchange.com/questions/236/why-is-it-important-to-choose-big-m-carefully-and-what-are-the-consequences-of-d/237#237 and to an extent https://or.stackexchange.com/questions/231/when-to-use-indicator-constraints-versus-big-m-approaches-in-solving-mixed-int/348#348 – Mark L. Stone Jun 13 '19 at 17:48
  • How id this too broad when it;s been answered? – Mark L. Stone Jun 13 '19 at 17:56
  • @MarkL.Stone, thanks for linking those. For the record, I've upvoted this question & your answer. – SecretAgentMan Jun 20 '19 at 23:57

4 Answers4

17

Here is a nice, succinct, and easy to understand reference for how to do all this and more. Answers to many future questions can be handled by referencing the appropriate section number in this document and then addressing any particular difficulties or concerns the questioner may have.

FICO MIP formulations and linearizations Quick reference at https://www.msi-jp.com/xpress/learning/square/10-mipformref.pdf


Introductory references which are more tutorial but less comprehensive, which may help people better understand the previously listed reference.

  1. https://ocw.mit.edu/courses/sloan-school-of-management/15-053-optimization-methods-in-management-science-spring-2013/lecture-notes/MIT15_053S13_lec11.pdf

  2. https://ocw.mit.edu/courses/sloan-school-of-management/15-053-optimization-methods-in-management-science-spring-2013/tutorials/MIT15_053S13_tut09.pdf

  3. Logics and integer-programming representations - YALMIP background on how common mixed-integer representable functions and logic can be modeled, which is helpful even if YALMIP is not used. Includes all the usual stuff, plus general integer, sort, number of non-zeros, and much more.


Many of these formulations can be handled under the hood by modern optimization modeling systems.

For instance, indicator constraints in CPLEX.

See:

  1. https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.cplex.help/CPLEX/UsrMan/topics/discr_optim/indicator_constr/01_indicators_title_synopsis.html

  2. https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.cplex.help/CPLEX/UsrMan/topics/discr_optim/indicator_constr/07_best_prac.html ,

Note that per https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.cplex.help/CPLEX/UsrMan/topics/discr_optim/indicator_constr/06_restrictions.html in CPLEX, "The constraint must be linear; a quadratic constraint is not allowed to have an indicator constraint."

Note that per my updated answer at When to use indicator constraints versus big-M approaches in solving (mixed-)integer programs, Indicator Constraints in CPLEX are immune from the big M trickle flow issue mentioned later in this answer under "Be careful with choice of Big M, and its relation to solver integrality tolerance"

Gurobi General Constraints: https://www.gurobi.com/documentation/8.1/refman/constraints.html

YALMIP "implies" automates modeling of some logical constraints: https://yalmip.github.io/command/implies/ Explicit finite lower and upper bounds on all involved variables are required so that a Big M formulation can be derived under the hood by YALMIP to handle the implies.


Be careful with the choice of Big M, and its relation to solver integrality tolerance.

Keep in mind, as the YALMIP developer Johan Lofberg has stated, that Big M is misnamed. It should be called just big enough M. Excessively large M can result in weak formulations which degrade (slow down) the solution process. In extreme cases (but not uncommon for "amateurs"), as a result of solver constraint satisfaction tolerance, an excessively large value of M can cause the solver to produce a solution that violates the logic constraint as intended. That is because integer (including binary) constraints are only satisfied to within an integrality tolerance, which typically is 1e-5 or 1e-6.

Further details are provided by @prubin at https://orinanobworld.blogspot.com/2018/04/big-m-and-integrality-tolerance.html. IBM CPLEX documentation refers to this as "trickle flow" here: https://www-01.ibm.com/support/docview.wss?uid=swg21399984

If M needs to be so big that it would cause the logic constraint to not be satisfied as intended, the integrality tolerance should be made smaller, and verified to be small enough. Proper scaling of the problem can in many cases ameliorate such difficulties and allow for a not very large M.

Mark L. Stone
  • 13,392
  • 1
  • 31
  • 67
11

I recommend Formulating Integer Linear Programs: A Rogues' Gallery:

The article[1] is very accessible, clear, and has multiple examples of using binary variables to achieve logical constraints. Full citation below.

[1] Gerald G. Brown, Robert F. Dell, (2007) Formulating Integer Linear Programs: A Rogues' Gallery. INFORMS Transactions on Education. 7(2):153–159. http://dx.doi.org/10.1287/ited.7.2.153. Also available from https://faculty.nps.edu/gbrown/docs/Rogues_Gallery.pdf.

SecretAgentMan
  • 1,895
  • 2
  • 13
  • 39
9

What about Mixed Integer Linear Programming Formulation Techniques, J.P. Vielma, SIAM Rev., 57(1), 3–57, 2015?

rocarvaj
  • 191
  • 5
7

Another resource is "Strategies for “Not Linear” Optimization" from the AMPL group.

A.Omidi
  • 8,832
  • 2
  • 13
  • 49