top | item 45919496


lxe | 3 months ago

I think the author makes a good point about understanding structure over symbol manipulation, but there's a slippery slope here that bothers me.

In practice, I find it much more productive to start with a computational solution - write the algorithm, make it work, understand the procedure. Then, if there's elegant mathematical structure hiding in there, it reveals itself naturally. You optimize where it matters.

The problem is math purists will look at this approach and dismiss it as "inelegant" or "brute force" thinking. But that's backwards. A closed-form solution you've memorized but don't deeply understand is worse than an iterative algorithm you've built from scratch and can reason about clearly.

Most real problems have perfectly good computational solutions. The computational perspective often forces you to think through edge cases, termination conditions, and the actual mechanics of what's happening - which builds genuine intuition. The "elegant" closed-form solution often obscures that structure.

I'm not against finding mathematical elegance. I'm against the cultural bias that treats computation as second-class thinking. Start with what works. Optimize when the structure becomes obvious. That's how you actually solve problems.
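As a toy illustration of the workflow described above (write the loop first, then notice the structure), here is a minimal sketch in Python. The example problem, summing the first n odd numbers, and the function name `sum_of_first_odds` are assumptions chosen for illustration; nothing here comes from the article under discussion.

```python
def sum_of_first_odds(n):
    """Add up the first n odd numbers the direct, procedural way."""
    total = 0
    for k in range(n):
        total += 2 * k + 1  # k-th odd number, counting from k = 0
    return total


# Print a few values and look for a pattern before reaching for a formula.
for n in range(1, 6):
    print(n, sum_of_first_odds(n))
```

The printed values are the perfect squares, which suggests the closed form n * n. At that point the loop doubles as a check on the formula rather than something to be discarded.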


godelski|3 months ago

  Mathematics is not the study of numbers, but the relationships between them
  - Henri Poincaré
 
I want to stress this because I think you have too rigid of a definition of math. Your talk about optimization sounds odd to me as someone who starts with math first. Optimization is done with a profiler. Sure, I'll also use math to find that solution but I don't start with optimization nor do I optimize by big O.
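For readers who haven't worked this way, a minimal sketch of what "optimization is done with a profiler" can look like, using Python's standard `cProfile` and `pstats` modules. The workload `slow_sum_of_squares` and the helper `profile_call` are made-up names for illustration, not anything from the thread.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive workload: builds a full list before summing."""
    return sum([i * i for i in range(n)])


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The report shows where time is actually spent, which is the measurement one would consult before deciding what, if anything, to rewrite.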

Elegance is not first. First is rough. Solving by math sounds much like what you describe. I find my structures, put them together, and find the interactions. Elegance comes after cleaning things up. It's towards the end of the process, not the beginning. We don't divine math just as you don't divine code. I'm just not sure how you get elegance from the get-go.

So I find it weird that you criticize a math first approach because your description of a math approach doesn't feel all that accurate to me.

Edit: I do also want to mention that there's a correspondence between math and code. They aren't completely isomorphic because math can do a lot more and can be much more arbitrarily constructed, but the correspondence is key to understanding how these techniques are not so different.

saulpw|3 months ago

Some people like Peter Norvig prefer top-down, hackers like me and you prefer bottom-up. Many problems can be solved either way. But for some problems, if you use the wrong approach, you're gonna have a bad time. See Ron Jeffries' attempt to solve sudoku.

The top-down (mathematical) approach can also fail, in cases where there's not an existing math solution, or when a perfectly spherical cow isn't an adequate representation of reality. See Minix vs Linux, or OSI vs TCP/IP.

lxe|3 months ago

Fair point about problem-fit - some problems do naturally lend themselves to one approach over the other.

But I think the Sudoku example is less about top-down vs bottom-up and more about dogmatic adherence to abstractions (OOP in that case). Jeffries wasn't just using a 'hacker' approach - he was forcing everything through an OOP lens that fundamentally didn't fit the problem structure.

But yes, same issue can happen with the 'mathematical' approach - forcing "elegant" closed-form thinking onto problems that are inherently messy or iterative.

kenjackson|3 months ago

I'd argue that everyone solves problems bottom-up. It's just that some people have done the problem before (or a variant of it) so they have already constructed a top-down schema for it.

liquid_bluing|3 months ago

The hacker’s mentality is like that of the painter who spends months on a portrait in order to produce a beautiful but imperfect likeness, marked with his own personal style, which few can replicate and people pay a lot for. The mathematical approach is to take a photo because someone figured out how to perfectly reproduce images on paper over a hundred years ago and I just want a picture, dammit, but the camera’s manual is in Lojban.

IMO, the mathematical approach is essentially always better for software; nearly every problem that the industry didn’t inflict upon itself was solved by some egghead last century. But there is a kind of joy in creating pointless difficulties at enormous cost in order to experience the satisfaction of overcoming them without recourse to others, I suppose.

orforforof|3 months ago

I really enjoyed the book Mathematica by David Bessis, who writes about his creative process as a mathematician. He makes a case that formal math is usually the last step to refine/optimize an idea, not the starting point as is often assumed. His point is to push against the cultural idea that math == symbols. Sounds similar to some of what you're describing.

whilenot-dev|3 months ago

I really didn't like that book. Its basic premise was that we should separate the idea of mathematics from the formalities of mathematics and aim to imagine mathematical problems visually. The later chapters then consist of an elephant drawing that isn't true to scale and an explanation of why David Bessis thought it would be best to create an AI startup, which put the final nail in the coffin for me. There's a historical note here and there, but that's it - it really could've been a blog post.

Every single YouTube video from tom7[0] or 3blue1brown[1] does far more to convey the fascination of mathematics.

[0]: https://www.youtube.com/@tom7

[1]: https://www.youtube.com/@3blue1brown

krikou|3 months ago

Indeed, this is a fantastic book.

I could relate his description of the mathematical experience to what I feel is happening in my head when I program.

thenobsta|3 months ago

Amazing book. I love how he brings math into something tacit and internal.

vatsachak|3 months ago

I have math papers in top journals and that's exactly how I did math:

Just get a proof of the open problem no matter how sketchy. Then iterate and refine.

But people love to reinvent the wheel without caring about abstractions, resulting in languages like Python being the de facto standard for machine learning.

wiz21c|3 months ago

Now there's engineering and math. Engineering uses math to solve problems, and when writing programs you usually tinker with your data until the right math tool pops into your mind (e.g. first look at your data, then conclude that a normal distribution is the way to think about it). Basically, one uses existing math tools. In math it's more about proving something new, building new tools, I guess.

Sidenote: I code fluid dynamics stuff (I'm trained in computer science, not at all in physics). It's funny to see how the math and physics deeply affect the way I code (and not the other way around). Math and physics laws feel inescapable, and my code usually has to be extremely accurate to handle these laws correctly. When debugging that code, thinking math/physics first is usually the way to go, as it lets you narrow down the (code) bug more quickly. And if all else fails, then it's usually back to the math/physics drawing board :-)

ViscountPenguin|3 months ago

3Blue1Brown has a great video which frames this as a cultural problem that also exists in mathematics pedagogy:

https://www.youtube.com/watch?v=ltLUadnCyi0

Personally, I find a mix of all three approaches (programming, pen and paper, and "pure" mathematical structural thought) to be best.

zdkaster|3 months ago

I completely agree. Start with what works, however rough, understand it a bit more deeply, then develop better solutions. Trial and error, brute force, or an inelegant first pass comes more naturally to a practitioner. I think this aligns with George Pólya's book How to Solve It (https://en.wikipedia.org/wiki/How_to_Solve_It). Brute force is more productive and builds better intuition: once you recognize the pattern, the elegant solution follows.

MITSardine|3 months ago

Math isn't about memorizing closed-form solutions, but analyzing the behavior of mathematical objects.

That said, I mostly agree with you, and I thought I'd share an anecdote where a math result came from a premature implementation.

I was working on maximizing the minimum value of a set of functions f_i that depend on variables X. I.e., solve max_X min_i f_i(X).

The f_i were each cubic, so F(X) = min_i f_i(X) was piecewise cubic. X had dimension 3xN, with N arbitrarily large. This is intractable to solve directly because F is non-smooth (its derivatives are discontinuous), so you can't just throw it at Newton's method or gradient descent. Non-differentiable optimization was out of the question due to cost.

To solve this, I'd implemented an optimizer that moved one variable x at a time, so that F(x) became a 1D piecewise cubic function that I could globally maximize with analytical methods.

This was a simple algorithm where I intersected the graphs of the f_i to figure out where each one is the minimum, then maximized the whole thing analytically, section by section.
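The intersect-then-maximize idea can be sketched for the simplest case, where every f_i is a straight line in the one variable being moved. This is an illustrative simplification, not the commenter's implementation: the function name `max_of_min_lines`, the (slope, intercept) representation, and the sample data are all assumptions. For genuinely cubic pieces one would also have to check the stationary points inside each section.

```python
from fractions import Fraction
from itertools import combinations


def max_of_min_lines(lines, lo, hi):
    """Maximize F(x) = min_i (a_i * x + b_i) over the interval [lo, hi].

    F is concave and piecewise linear, so its maximum sits at an interval
    endpoint or at a point where two of the lines cross.
    """
    candidates = {Fraction(lo), Fraction(hi)}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        if a1 != a2:  # parallel lines never cross
            x = Fraction(b2 - b1, a1 - a2)
            if lo <= x <= hi:
                candidates.add(x)

    def lower_envelope(x):
        return min(a * x + b for a, b in lines)

    best_x = max(candidates, key=lower_envelope)
    return best_x, lower_envelope(best_x)


# Hypothetical data: the lines x, 4 - x, and the constant 3.
lines = [(1, 0), (-1, 4), (0, 3)]
best_x, best_value = max_of_min_lines(lines, 0, 5)
print(best_x, best_value)
```

Exact rational arithmetic via `fractions.Fraction` keeps the crossing points free of rounding error, which makes this kind of sketch easier to debug than a floating-point version.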

In debugging this, something jumped out: the coefficients corresponding to the second and third derivatives were always zero. What the hell was wrong with my implementation?? Did I compute the coefficients wrong?

After a lot of head scratching and code back and forth, I went back to the scratchpad, looked at these functions more closely, and realized they're cubic in the variables taken together, but linear in any single variable. This should have been obvious, as each one was a determinant of a matrix whose columns or rows depended linearly on the variables. Noticing this would have taken first-year math.
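The "cubic overall, linear in each variable" property is easy to check numerically. Below is a minimal sketch with a made-up 3x3 determinant standing in for one f_i; the matrix entries and the names `det3`, `toy_objective`, and `second_and_third_differences` are assumptions for illustration only, since the actual matrices are not given in the comment.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def toy_objective(x, y, z):
    """Hypothetical stand-in for one f_i: each variable sits in its own row."""
    return det3([[x, 1, 0], [0, y, 1], [1, 0, z]])  # expands to x*y*z + 1


def second_and_third_differences(g, t0, step):
    """Finite differences of g around t0; both vanish exactly if g is affine."""
    second = g(t0 + step) - 2 * g(t0) + g(t0 - step)
    third = g(t0 + 2 * step) - 3 * g(t0 + step) + 3 * g(t0) - g(t0 - step)
    return second, third


t0, step = Fraction(1, 3), Fraction(1, 2)

# Move only x, holding y = 2 and z = 3 fixed: the restriction is affine.
along_x = second_and_third_differences(lambda x: toy_objective(x, 2, 3), t0, step)

# Move all three variables together: the restriction is a genuine cubic.
along_diagonal = second_and_third_differences(lambda t: toy_objective(t, t, t), t0, step)

print(along_x)
print(along_diagonal)
```

Along a single variable both differences come out exactly zero, mirroring the vanishing second- and third-order coefficients seen while debugging. Along the diagonal they do not, which is the cubic behavior one would expect from looking at the determinant as a whole.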

This changed things radically, as I could now recast my max-min problem as a Linear Program, which has very efficient numerical solvers (e.g. Dantzig's simplex algorithm). These give you the global optimum to machine precision, and are very fast on small problems. As a bonus, I could actually move three variables at once (not just one), as those were separate rows of the matrix. Or I could even move N at once, as those were separate columns. This could beat, on all counts (quality of the extrema and speed), the differentiable-optimization approaches that people had been using, which relied on regularizations of F.
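For reference, the standard way to write a max-min of affine functions as a Linear Program is the epigraph trick: introduce one extra scalar that sits below every function and push it up. The formulation below is a generic sketch, not taken from the thesis. The symbols a_i, b_i (the affine coefficients of f_i in the variables being moved) and t (the auxiliary variable) are assumed notation, and in practice one would add bounds on X to keep the problem bounded.

```latex
% Assumes each f_i is affine in the moved variables: f_i(X) = a_i^{\top} X + b_i
\begin{aligned}
\max_{X}\ \min_i f_i(X)
\quad\Longleftrightarrow\quad
&\max_{X,\,t}\ t \\
&\text{subject to}\quad t \le a_i^{\top} X + b_i \quad \text{for all } i .
\end{aligned}
```

At the optimum, t equals the smallest f_i(X), so maximizing t maximizes the minimum. The objective and all constraints are linear in (X, t), which is what makes simplex-type solvers applicable.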

The end result is what I'd consider one of the few things in my PhD thesis that isn't busy work: an actual novel result that brings something useful to the table. Whether it has been adopted at all is a different matter, but I'm satisfied with my result which, in the end, is mathematical in nature. It still baffles me that no one had stumbled on this simple property despite the compute cycles wasted on solving this problem, which coincidentally is often stated as one of the main reasons the overarching field is still not as popular as it could be.

From this episode, I deduced two things. Firstly, the right a priori mathematical insight can save a lot of time otherwise spent designing ill-suited algorithms, and then implementing and debugging them. I don't recall exactly, but this took me about two months or so, as I tried different approaches. Secondly, the right mathematical insight can be easy to miss. I had been blinded by the fact that no one had solved this problem before, so I assumed it must have a hard solution. Something as trivial as this was not even imaginable to me.

Now I try to be a little more careful and not jump into code right away when meeting a novel problem, and at least consider if there isn't a way it can be recast to a simpler problem. Recasting things to simpler or known problems is basically the essence of mathematics, isn't it?