Training is unlikely to land on that solution given how loss functions work, but in theory a single model could learn to perform exactly the function you describe. When you say "just do X," where X is any function (in this case, a piecewise function), a large enough model could do it.
sandpaper26|2 years ago
liquidpele|2 years ago
treprinum|2 years ago
sandpaper26|2 years ago
And part of the reason single-hidden-layer networks aren't enough in practice, even in continuous, memoryless, Euclidean cases, is, again, how loss functions work: you're unlikely to converge on a good approximation with very few hidden layers.
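To make the "representable in theory" half concrete, here is a minimal sketch in plain Python. It assumes a continuous piecewise-linear target and ReLU activations, neither of which is specified in the thread, and every name in it (`target`, `KNOTS`, `SLOPE_CHANGES`, `one_hidden_layer_net`) is hypothetical. The weights are set by hand, not trained, so it only shows that a single hidden layer can express such a function. It says nothing about whether gradient descent on a loss would find those weights.

```python
def relu(z):
    """Standard ReLU activation."""
    return max(0.0, z)


def target(x):
    """Hypothetical piecewise target: slope 0, then 2, then 0.5 (kinks at 0 and 1)."""
    if x < 0.0:
        return 0.0
    if x < 1.0:
        return 2.0 * x
    return 2.0 + 0.5 * (x - 1.0)


# One hidden unit per kink. Hidden unit i computes relu(x - KNOTS[i]);
# its output weight is the change in slope at that kink.
KNOTS = [0.0, 1.0]
SLOPE_CHANGES = [2.0, -1.5]  # 0 -> 2 at x=0, then 2 -> 0.5 at x=1


def one_hidden_layer_net(x):
    """Single-hidden-layer ReLU network with hand-chosen (untrained) weights."""
    return sum(c * relu(x - k) for k, c in zip(KNOTS, SLOPE_CHANGES))


def max_abs_error(xs):
    """Largest gap between the hand-built network and the target on xs."""
    return max(abs(one_hidden_layer_net(x) - target(x)) for x in xs)


GRID = [i / 100.0 for i in range(-300, 301)]

if __name__ == "__main__":
    print("max abs error on grid:", max_abs_error(GRID))
```

The construction only works exactly because the target is continuous: a ReLU network is itself continuous, so a piecewise function with jumps could only be approximated, not matched.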