I agree with your observation about the exactly noise-free nature of the problem. It lets them formulate the problem as "minimize complexity subject to memorizing the X-y relationship exactly." To generalize to the noisy case, you'd replace exact memorization with a prescribed error budget. But then that error budget looks like an effective complexity hyperparameter, doesn't it, and we're back to square one: cross-validation.
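A minimal sketch of that tuning problem (my own toy construction, not from the thread): take a noisy linear rule, measure "complexity" as bits per quantized coefficient, and ask for the simplest rule that fits within an error budget `eps`. The function names and the quantization scheme are invented for illustration.

```python
import random

random.seed(0)
xs = [i / 49 for i in range(50)]
ys = [2 * x + random.gauss(0, 0.1) for x in xs]  # noisy linear "rule"

def fits(a, b, eps):
    """Does y = a*x + b match every sample to within eps?"""
    return all(abs(a * x + b - y) <= eps for x, y in zip(xs, ys))

def min_bits_within_budget(eps, max_bits=8):
    # Complexity = bits per coefficient; a coarser grid of candidate
    # coefficients is a cheaper (shorter) description of the rule.
    for bits in range(1, max_bits + 1):
        step = 4 / (2 ** bits)  # quantize coefficients in [-2, 2]
        levels = [-2 + k * step for k in range(2 ** bits + 1)]
        if any(fits(a, b, eps) for a in levels for b in levels):
            return bits
    return None  # no rule this simple meets the budget

loose = min_bits_within_budget(eps=0.5)   # generous budget
tight = min_bits_within_budget(eps=0.15)  # stingy budget
```

The point is that `eps` does all the work: loosening it licenses a simpler rule, tightening it demands a more complex one (or none at all), so you end up sweeping `eps` exactly the way you'd sweep any regularization hyperparameter.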
ealexhudson | 1 year ago
But because this is clean data, I wonder if there's a big gap here: the codec that encodes the "correct rule" achieves a step-change lower description length than similar-looking solutions. The most elegant ruleset, at least in this set of puzzles, always compresses markedly better. So you can effectively brute-force the correct rule: try lots of encoding strategies, and identify which one gets you that step-change compression benefit.
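The brute-force idea above can be sketched in a few lines (encodings and names invented for illustration): serialize the same grid under several candidate strategies and look for the one whose length drops far below the rest.

```python
import zlib

grid = [[(r + c) % 2 for c in range(30)] for r in range(30)]  # checkerboard

raw = bytes(cell for row in grid for cell in row)
encodings = {
    "raw": raw,                         # dump every cell verbatim
    "zlib_raw": zlib.compress(raw, 9),  # generic compressor, no rule
    "rule": b"chk30",                   # hypothetical "correct rule" codec:
                                        # 'checkerboard, side 30'
}
lengths = {name: len(data) for name, data in encodings.items()}
best = min(lengths, key=lengths.get)
```

A generic compressor already beats the raw dump, but the codec that captures the generating rule lands an order of magnitude below it; on clean data that outlier drop is the signature you'd search for.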