A practical introduction to constraint programming using CP-SAT and Python

254 points| lfittl | 1 year ago |pganalyze.com

39 comments

0cf8612b2e1e|1 year ago

I have used constraint solvers in the past, and they are truly magical in what they can do. The problem is that there are not many available resources for the novice. Most of the material you can find is how to solve sudoku (the hello world of the space) or highly technical primary research literature meant exclusively for domain experts. Which is a shame, because I think huge swaths of problems could be solved by these tools if they were more accessible. “Accessible” still meaning it requires a programmer, because shaping a problem into the constraints DSL is not going to be in the wheelhouse of most.

cchianel|1 year ago

I think the reason why these tools are not as accessible as they could be is that the vast majority of solvers are MIP (Mixed Integer Programming) based, meaning the domain needs to be written down using mathematical equations. This in turn means a user would need to be familiar with both the domain and mathematics in order to correctly write constraints.

That being said, MIPs are not the only kind of solvers. There are also "local search" based constraint solvers, which do not have the restriction that each constraint must be modelled as a relation or equation of integer variables. In local search solvers, the constraints are mostly treated as a black box that tells how good a particular solution is. As a consequence, local search solvers are typically unable to prove they have found the optimal solution (since that would require testing all possible solutions when the constraint is treated as a black box), but rather find a "near-optimal" solution in reasonable time.
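To make the black-box idea concrete, here is a minimal hill-climbing sketch in plain Python. It is not the API or the algorithm of any particular solver; the shifts, employees, availability data, and penalty weights are all made up for illustration. The point is only that the search never inspects the rules, it just calls `score` on a candidate and compares numbers.

```python
import random

# Hypothetical toy data, for illustration only.
SHIFTS = ["mon", "tue", "wed", "thu", "fri", "sat"]
EMPLOYEES = ["ann", "bob", "cy"]
UNAVAILABLE = {("ann", "mon"), ("bob", "sat")}


def score(assignment):
    """Black-box penalty: lower is better. The search never looks inside."""
    penalty = 0
    for shift, employee in assignment.items():
        if (employee, shift) in UNAVAILABLE:
            penalty += 10  # heavily penalise scheduling unavailable staff
    loads = [list(assignment.values()).count(e) for e in EMPLOYEES]
    penalty += max(loads) - min(loads)  # mildly penalise uneven workloads
    return penalty


def local_search(steps=2000, seed=0):
    rng = random.Random(seed)
    current = {s: rng.choice(EMPLOYEES) for s in SHIFTS}
    current_score = score(current)
    for _ in range(steps):
        # Move: reassign one randomly chosen shift to a random employee.
        candidate = dict(current)
        candidate[rng.choice(SHIFTS)] = rng.choice(EMPLOYEES)
        candidate_score = score(candidate)
        # Accept equal scores too, so the search can drift across plateaus.
        if candidate_score <= current_score:
            current, current_score = candidate, candidate_score
    return current, current_score


best, best_score = local_search()
print(best_score, best)
```

Because `score` is arbitrary Python, a rule can call any method on a domain object, which is the flexibility described above. The cost is that a run like this returns a good assignment without any proof that it is the best one.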

One local search based solver is Timefold Solver. In it, users annotate their domain so the solver knows which fields are the variables and what their possible values are. This means that instead of dealing with `int`, your constraints deal with `Shift` and `Employee`, and can access any of their methods.

Disclosure: I work on Timefold Solver

jmjrawlings|1 year ago

You totally nailed it. The actual syntax / API of constraint solvers is so simple it can be learned in no time at all. What actually takes time and expertise is modelling problems in this fashion, and there are almost 0 real world (in size and complexity) examples out there for others to reference.

I have about 5 years of experience in MiniZinc solving scheduling problems but sadly all that code is locked behind closed doors never to be open sourced. I would love to put together some fully worked constraint programming examples complete with containerisation / visualisation / modeling etc but the barrier to doing so is finding problems that are actually worth solving and have open source data to work on.

wjholden|1 year ago

I agree, the theory isn't nearly as difficult as the reductions. Dennis Yurichev's "SAT/SMT by Example" (https://smt.st/) is a great resource on this topic, although pretty intimidating.

dd82|1 year ago

> Most of the material you can find is how to solve sudoku (the hello world of the space) or highly technical primary research literature meant exclusively for domain experts

Exactly. I was looking at using a SAT solver for a rules engine and couldn't make heads or tails of how to use it. After a lot of deduction, I got a basic POC working, but couldn't extend it to what was actually needed. The gulf between toy implementations and anything more substantial was very large.

richardw|1 year ago

I’m a long time coder but a bit rusty now. Last year I built a football team optimiser using Google’s OR-Tools (various optional constraints like being with friends and trying to balance skill levels across teams). LLMs can go quite far in terms of pointing you in approximately the right direction fairly quickly. They fail right now at really getting it right, but I was far enough along that I could then take it the rest of the way.
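For readers wondering what such a model looks like, here is a small CP-SAT sketch of the same kind of problem. It is not the optimiser described above: the player names, skill ratings, friend pairs, and the choice to treat "friends together" as a hard rule are all assumptions made for illustration.

```python
# Hypothetical data: skill ratings and friend pairs are made up.
SKILL = {"ana": 9, "ben": 7, "cat": 6, "dan": 5, "eve": 4, "fay": 3}
FRIENDS = [("ana", "fay"), ("ben", "dan")]


def split_teams(skill, friends):
    # Imported here so the file still loads where OR-Tools is not installed.
    from ortools.sat.python import cp_model

    model = cp_model.CpModel()
    # on_a[p] is 1 if player p is on team A, 0 if on team B.
    on_a = {p: model.NewBoolVar(f"on_a_{p}") for p in skill}

    # Both teams have the same number of players.
    model.Add(sum(on_a.values()) == len(skill) // 2)
    # Friends end up on the same team (a hard constraint in this sketch).
    for p, q in friends:
        model.Add(on_a[p] == on_a[q])

    # Minimise |skill(A) - skill(B)|, where skill(A) - skill(B) = 2*skill(A) - total.
    total = sum(skill.values())
    skill_a = sum(skill[p] * on_a[p] for p in skill)
    gap = model.NewIntVar(0, total, "gap")
    model.Add(gap >= 2 * skill_a - total)
    model.Add(gap >= total - 2 * skill_a)
    model.Minimize(gap)

    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    if status not in (cp_model.OPTIMAL, cp_model.FEASIBLE):
        return None
    team_a = [p for p in skill if solver.Value(on_a[p])]
    team_b = [p for p in skill if not solver.Value(on_a[p])]
    return team_a, team_b, solver.Value(gap)
```

With this toy data, `split_teams(SKILL, FRIENDS)` keeps both friend pairs together and reports a gap of 2 (team totals of 18 and 16). To make friendships optional rather than mandatory, one would drop the equality constraints and add a penalty to the objective for each separated pair.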

taeric|1 year ago

Core to a lot of this is learning how to model things in such a way that you can send them to a solver. After that, it is learning how to take a solution and present it in a way that can be understood.

It is a shame, as most programs work against the ideas here by trying to have a singular representation of their data. This is just not reasonable for most things and leads to a lot of contortions to get the algorithms to work on a new representation.

This article touches on it with the brief touch of declarative at the top. I always regret that more of my code is not translating between representations more often. You can wind up with very concise representations when you do this, and then you can get a double bonus by having things run faster by virtue of being concise.

(And, yes, I realize I'm basically describing many data pipelines. Where you spend most of your time translating and fanning out data to places for more compute to be done on it.)

bartkappenburg|1 year ago

I used a lot of solvers in the early 2000s in my Operations Research master's after my econometrics study. Now that I work on (web) software that uses Python, I’m thrilled to see these deep dives on this subject!

I love the subject and reading this brought back a lot of memories. Also the realization that translating constraints to a model (variables, structure etc) is 90% of the work and the most difficult part.

Murky3515|1 year ago

>the realization that translating constraints to a model (variables, structure etc) is 90% of the work and the most difficult part.

LLMs can help a lot there. I've been wanting to write an LLM => Constraint model adapter that does it for you. It's such low hanging fruit, I wonder if anyone else would benefit from it though.

akutlay|1 year ago

I would say the most difficult part is to run it in production with minimal issues. Scaling them and making them robust to changes in data takes a long time.

d_burfoot|1 year ago

I have a client that runs a sports camp for kids. The kids get to request what sports they want to play, and what friends they want to be in class with. This creates a scheduling problem that's hard for a human, and previously they spent several man-weeks per year dealing with it. I built them a simple system that connects their data to an optimizer based on OR-Tools; now their scheduling is done with a few clicks.

jgalt212|1 year ago

yep, once you have the data, constraints, and utility functions properly* in the system you can brute force your way to many good enough solutions very quickly.

I coach in a basketball league where games have 8 periods. No player can play 2 more periods than any other player. The number of possible line-ups per game while still hitting the playing time constraint is astronomical. Very easy to find a series of line-ups that fits the constraint, but very hard to find an optimal or near-optimal series of line-ups. It gets even more fun when you have to adjust for late arrivals or unannounced no-shows.

* not always completely doable
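The "easy to find a feasible series" half can be sketched with a greedy rotation in plain Python. This assumes 5 players on court, a full roster present for all 8 periods, and a made-up 9-player roster; it makes no attempt at the hard part, which is choosing good line-ups.

```python
def fair_lineups(players, periods=8, on_court=5):
    """Greedy rotation: each period, field the players with the fewest periods so far."""
    played = {p: 0 for p in players}
    lineups = []
    for _ in range(periods):
        # sorted() is stable, so ties are broken by roster order.
        lineup = sorted(players, key=lambda p: played[p])[:on_court]
        for p in lineup:
            played[p] += 1
        lineups.append(lineup)
    return lineups, played


roster = [f"player{i}" for i in range(1, 10)]  # hypothetical 9-player roster
lineups, played = fair_lineups(roster)
for number, lineup in enumerate(lineups, start=1):
    print(number, lineup)
print(played)
```

Always fielding the least-used players keeps the largest and smallest period counts within 1 of each other, so the "no player plays 2 more periods than any other" rule holds after every period. Scoring line-ups by quality, or re-planning after a no-show, is where an optimizer earns its keep.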

turndown|1 year ago

I can guarantee you a blog post detailing how to do this would go triple platinum

Elucalidavah|1 year ago

Is there a parametric CAD that works primarily as a constraint solver?

It so often bothers me that I have to guesstimate some values for parameters I don't initially care about, instead of constraining the parameters I care about and then optimizing the rest.

richard___|1 year ago

How does this compare with mixed integer programming? For problems in physics

sevensor|1 year ago

A whole bunch of problems can be set up either way. MILP always has an objective, and the constraints are always linear combinations of the decisions. Gurobi is so incredibly fast that it might be worth contorting your problem into a MILP just so you can get solutions at all.
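For reference, the general shape being described is usually written like this. It is standard textbook notation rather than anything specific to Gurobi, and the symbols (cost vector $c$, constraint matrix $A$, right-hand side $b$, index set $I$ of integer variables) are not from the thread:

```latex
\begin{aligned}
\min_{x} \quad & c^{\top} x \\
\text{s.t.} \quad & A x \le b, \\
& x_j \in \mathbb{Z} \quad \text{for } j \in I, \\
& x_j \in \mathbb{R} \quad \text{for } j \notin I.
\end{aligned}
```

Each row of $A x \le b$ is one linear combination of the decision variables, which is why anything non-linear has to be contorted into this form first.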

sirwhinesalot|1 year ago

CP-SAT is integer only, so I'm guessing for physics it's not great (you can scale your reals but that's not as good as working with floating point directly).
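A minimal sketch of what "scale your reals" looks like in practice, assuming two decimal places are enough. The constraint, bounds, and objective are invented for illustration.

```python
SCALE = 100  # keep two decimal places: x_real = x / SCALE


def solve_scaled():
    # Imported here so the file still loads where OR-Tools is not installed.
    from ortools.sat.python import cp_model

    model = cp_model.CpModel()
    # Integer stand-ins for reals in [0, 20].
    x = model.NewIntVar(0, 20 * SCALE, "x")
    y = model.NewIntVar(0, 20 * SCALE, "y")

    # Real constraint: 1.5 * x_real + 0.25 * y_real <= 12.3
    # Substituting x_real = x / 100 gives 1.5 * x + 0.25 * y <= 1230;
    # multiplying by 100 again clears the fractional coefficients.
    model.Add(150 * x + 25 * y <= 123000)
    # Real objective: maximise x_real + y_real (same optimum as x + y).
    model.Maximize(x + y)

    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    if status != cp_model.OPTIMAL:
        return None
    return solver.Value(x) / SCALE, solver.Value(y) / SCALE
```

Here `solve_scaled()` returns `(4.86, 20.0)`, while the real-valued optimum has x at roughly 4.8667. That gap is the precision lost to the grid, and a finer grid means larger integers throughout the model.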

The advantage of CP-SAT is that it handles boolean and integer variables and constraints much more efficiently than a MIP solver, especially higher-level constraints like all_different.

taeric|1 year ago

I would assume largely similarly? https://www.amazon.com/gp/product/1107658799/ is the book I last went through on this and it covers a lot of the same ideas. In particular, I'm assuming the section of this post that aims to minimize some value is directly using the same stuff.