lxe|3 months ago
In practice, I find it much more productive to start with a computational solution - write the algorithm, make it work, understand the procedure. Then, if there's elegant mathematical structure hiding in there, it reveals itself naturally. You optimize where it matters.
The problem is math purists will look at this approach and dismiss it as "inelegant" or "brute force" thinking. But that's backwards. A closed-form solution you've memorized but don't deeply understand is worse than an iterative algorithm you've built from scratch and can reason about clearly.
Most real problems have perfectly good computational solutions. The computational perspective often forces you to think through edge cases, termination conditions, and the actual mechanics of what's happening - which builds genuine intuition. The "elegant" closed-form solution often obscures that structure.
I'm not against finding mathematical elegance. I'm against the cultural bias that treats computation as second-class thinking. Start with what works. Optimize when the structure becomes obvious. That's how you actually solve problems.
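To make it concrete, here's a deliberately trivial sketch of that workflow in Python: start with the obvious loop, and the closed form reveals itself once you stare at the computation.

    # Computational first: sum 1..n with an explicit loop.
    def sum_to(n):
        total = 0
        for k in range(1, n + 1):
            total += k
        return total

    # Staring at the loop (pair 1 with n, 2 with n-1, ...) reveals
    # the closed form, and now you understand *why* it holds.
    def sum_to_closed(n):
        return n * (n + 1) // 2

    assert all(sum_to(n) == sum_to_closed(n) for n in range(200))

The loop came first; the closed form fell out of inspecting the computation, not the other way around.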
godelski|3 months ago
Elegance isn't first; first is rough. Solving by math sounds much like what you describe: I find my structures, put them together, and work out the interactions. Elegance comes after cleaning things up; it's towards the end of the process, not the beginning. We don't divine math, just as you don't divine code. I'm just not sure how you'd get elegance from the get-go.
So I find it weird that you criticize a math-first approach, because your description of a math approach doesn't feel all that accurate to me.
Edit: I also want to mention that there's a correspondence between math and code (think Curry-Howard). They aren't completely isomorphic, since math can do a lot more and can be constructed much more arbitrarily, but the correspondence is key to understanding why these techniques are not so different.
saulpw|3 months ago
The top-down (mathematical) approach can also fail, in cases where there's not an existing math solution, or when a perfectly spherical cow isn't an adequate representation of reality. See Minix vs Linux, or OSI vs TCP/IP.
lxe|3 months ago
But I think the Sudoku example is less about top-down vs bottom-up and more about dogmatic adherence to abstractions (OOP in that case). Jeffries wasn't just using a 'hacker' approach - he was forcing everything through an OOP lens that fundamentally didn't fit the problem structure.
But yes, same issue can happen with the 'mathematical' approach - forcing "elegant" closed-form thinking onto problems that are inherently messy or iterative.
liquid_bluing|3 months ago
IMO, the mathematical approach is essentially always better for software; nearly every problem that the industry didn’t inflict upon itself was solved by some egghead last century. But there is a kind of joy in creating pointless difficulties at enormous cost in order to experience the satisfaction of overcoming them without recourse to others, I suppose.
whilenot-dev|3 months ago
Every single YouTube video from tom7[0] or 3blue1brown[1] does far more to transmit the fascination of mathematics.
[0]: https://www.youtube.com/@tom7
[1]: https://www.youtube.com/@3blue1brown
krikou|3 months ago
I could relate what he described as the mathematical experience to what I feel happening in my head/brain when I'm programming.
vatsachak|3 months ago
Just get a proof of the open problem no matter how sketchy. Then iterate and refine.
But people love to reinvent the wheel without caring about abstractions, resulting in languages like Python becoming the de facto standard for machine learning.
wiz21c|3 months ago
Sidenote: I code fluid dynamics stuff (I'm trained in computer science, not at all in physics). It's funny to see how the math and physics deeply affect the way I code (and not the other way around). The laws of math and physics feel inescapable, and my code usually has to be extremely accurate to handle them correctly. When debugging that code, thinking math/physics first is usually the way to go, as it lets you narrow down the (code) bug more quickly. And if all else fails, it's back to the math/physics drawing board :-)
ViscountPenguin|3 months ago
https://www.youtube.com/watch?v=ltLUadnCyi0
Personally, I find a mix of all three approaches (programming, pen and paper, and "pure" mathematical structural thought) to be best.
MITSardine|3 months ago
That said, I mostly agree with you, and I thought I'd share an anecdote where a math result came from a premature implementation.
I was working on maximizing the minimum value of a set of functions f_i that depend on variables X. I.e., solve max_X min_i f_i(X).
The f_i were each cubic, so F(X) = min_i f_i(X) was piecewise cubic. X had dimension 3xN, with N arbitrarily large. This is intractable to solve directly: F being non-smooth (its derivatives are discontinuous), you can't just throw it at Newton's method or gradient descent. Non-differentiable optimization was out of the question due to cost.
To solve this, I'd implemented an optimizer that moved one variable x at a time, so that F(x) became a 1D piecewise-cubic function that I could maximize globally with analytical methods.
This was a simple algorithm: I intersected the graphs of the f_i to figure out where each was minimal, then maximized the whole thing analytically, section by section.
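From memory, the per-coordinate step looked roughly like this (a reconstruction, not my actual code; here the envelope's switch points are located on a sampling grid, whereas I intersected the cubics analytically):

    import numpy as np

    def maxmin_cubics_1d(coeffs, lo, hi, grid=4096):
        # Maximize F(x) = min_i f_i(x) on [lo, hi], each f_i a 1D cubic
        # given as 4 coefficients (highest degree first, numpy order).
        xs = np.linspace(lo, hi, grid)
        vals = np.array([np.polyval(c, xs) for c in coeffs])
        active = vals.argmin(axis=0)  # which f_i forms the lower envelope
        # Candidate maximizers per section: its endpoints, plus interior
        # critical points (derivative roots) of the active cubic.
        breaks = [0] + list(np.flatnonzero(np.diff(active)) + 1) + [grid - 1]
        best_x, best_F = lo, min(np.polyval(c, lo) for c in coeffs)
        for a, b in zip(breaks[:-1], breaks[1:]):
            cands = [xs[a], xs[b]]
            for r in np.roots(np.polyder(coeffs[active[a]])):
                if np.isreal(r) and xs[a] <= r.real <= xs[b]:
                    cands.append(r.real)
            for x in cands:
                F = min(np.polyval(c, x) for c in coeffs)
                if F > best_F:
                    best_x, best_F = x, F
        return best_x, best_F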
In debugging this, something jumped out: the coefficients corresponding to the second and third derivatives were always zero. What the hell was wrong with my implementation?? Had I computed the coefficients wrong?
After a lot of head scratching and back-and-forth in the code, I went back to the scratchpad, looked at these functions more closely, and realized they were cubic in all the variables jointly, but linear in any single variable. This should have been obvious: each was the determinant of a matrix whose rows or columns depended linearly on the variables, and a determinant is linear in each row or column separately. Noticing this is first-year math curriculum.
This changed things radically: I could now recast my maxmin problem as a Linear Program, which has very efficient numerical solvers (e.g. Dantzig's simplex algorithm). These give you the global optimum to machine precision and are very fast on small problems. As a bonus, I could actually move three variables at once, not just one, since those were separate rows of the matrix. I could even move N at once, since those were separate columns. This beat the differentiable-optimization approaches people had been using (based on regularizations of F) on all counts: quality of the extrema and speed.
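For concreteness, the recast is the classic epigraph trick: introduce an auxiliary variable t and solve max t subject to f_i(X) >= t for all i, which is an LP once the f_i are linear. A minimal sketch with scipy, with random data standing in for my actual f_i:

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    m, n = 8, 3                 # 8 linear functions f_i(x) = A[i] @ x + b[i]
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    # Variables z = (x, t). Maximize t s.t. t - A[i] @ x <= b[i];
    # linprog minimizes, so the objective is -t. Box-bound x so the
    # toy problem stays bounded (the real problem had its own bounds).
    c = np.zeros(n + 1); c[-1] = -1.0
    A_ub = np.hstack([-A, np.ones((m, 1))])
    bounds = [(-1.0, 1.0)] * n + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b, bounds=bounds, method="highs")
    x_opt, t_opt = res.x[:n], res.x[-1]
    assert np.isclose(t_opt, (A @ x_opt + b).min())  # t hits min_i f_i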
The end result is what I'd consider one of the few things in my PhD thesis that wasn't busy work: an actual novel result that brings something useful to the table. Whether it has been adopted at all is a different matter, but I'm satisfied with my result, which, in the end, is mathematical in nature. It still baffles me that no one had stumbled on this simple property despite all the compute cycles wasted on this problem, which, coincidentally, is often cited as one of the main reasons the overarching field is not as popular as it could be.
From this episode, I deduced two things. First, the right a priori mathematical insight can save a lot of time otherwise spent designing misfit algorithms, then implementing and debugging them; I don't recall exactly, but this took me about two months as I tried different approaches. Second, the right mathematical insight can be easy to miss. I had been blinded by the fact that no one had solved this problem before, so I assumed it must have a hard solution. Something as trivial as this wasn't even imaginable to me.
Now I try to be a little more careful and not jump into code right away when meeting a novel problem, and at least consider if there isn't a way it can be recast to a simpler problem. Recasting things to simpler or known problems is basically the essence of mathematics, isn't it?