This is awesome! This reminds me of MinBLEP audio synthesis of discontinuous functions (https://www.cs.cmu.edu/~eli/papers/icmc01-hardsync.pdf). Instead of doing things at a high sampling rate and explicitly filtering, you generate the band-limited version directly.

In the article, talking about smoothstep approximation of sinc: "I'd argue the smoothstep version looks better" Why would this be? I would have thought the theoretically correct sinc version would look nicer.

GuB-42|5 years ago

Short answer: ringing artefacts

sinc is perfect if you are looking only at frequency response. But in images, you also want to preserve locality: processing one part of the image should not affect the rest of the image. For example, sharpening an edge should affect only that edge, not its surroundings. This conflicts with preserving frequency response. Frequencies are about sine waves, and sine waves are wide, infinitely wide in fact.

BTW, that's also the reason why in quantum mechanics you can't know both position and momentum (a frequency) precisely.

So we need to compromise. As in scaling algorithms, there are three qualities to balance: sharpness (the result of a good frequency response), locality, and freedom from aliasing. You can't have all three, so you need to pick the most pleasant combination.

The extreme cases are:

- Point sampling: excellent locality and sharpness, terrible aliasing

- Linear filtering: excellent locality and no aliasing, very blurry

- sinc filtering: excellent sharpness and no aliasing, terrible locality (ringing artefacts)
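A rough sketch of the locality side of this trade-off, under assumptions of my own: a 64-sample hard edge, a 3-tap linear kernel, and a sinc kernel truncated to 33 taps (the helper names are made up for illustration). The linear filter keeps the output between 0 and 1, while the truncated sinc overshoots on both sides of the edge, which is the ringing.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def tent(x):
    """Linear (triangle) kernel with support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def filter_signal(signal, kernel, radius, scale):
    """Filter signal with taps kernel(j / scale) for j in [-radius, radius].

    Taps are normalized to sum to 1; indices are clamped at the borders.
    """
    offsets = range(-radius, radius + 1)
    taps = [kernel(j / scale) for j in offsets]
    total = sum(taps)
    taps = [t / total for t in taps]
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in zip(offsets, taps):
            acc += t * signal[min(max(i + j, 0), last)]
        out.append(acc)
    return out

# A hard edge: 32 dark samples followed by 32 bright ones.
edge = [0.0] * 32 + [1.0] * 32

linear = filter_signal(edge, tent, radius=2, scale=2.0)
ringing = filter_signal(edge, sinc, radius=16, scale=2.0)

print("linear range:", min(linear), max(linear))
print("sinc range:  ", min(ringing), max(ringing))
```

The sinc output leaves the [0, 1] range well away from the edge itself, so a change in one place shows up as ripples in its surroundings.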

Using smoothstep is a good compromise: it has a bit of aliasing because it is a step function, and a bit of smearing because it is smooth, but neither effect is so bad as to be unpleasant.

Side note: for audio, frequency response is more important than locality, which is why windowed-sinc filters are so popular.

shoo|5 years ago

> "I'd argue the smoothstep version looks better" Why would this be? I would have thought the theoretically correct sinc version would look nicer.

For a fixed well-defined mathematical problem, you might be able to solve it optimally or approximately. One perspective is to treat the problem as given and immutable and then try to compute an exact or optimal solution.

But often the original problem statement is fairly arbitrary, based on a bunch of guesses or simplifications, and you might be able to get a better result by changing the problem definition (perhaps unfortunately making it much messier to solve exactly) and then solving the new problem statement approximately.

What's the actual problem we're trying to solve here? Generate something that looks visually pleasing. Why is an expression involving cosine the natural way to define that problem statement mathematically? There's likely a lot of freedom here to vary our problem definition.

It might be interesting to start with the smoothstep multiplied result and take the derivative and look at how that differs from a normal cosine, and ponder why that might produce a more pleasing result than a cosine.

skybrian|5 years ago

It seems like a theoretically correct box filter might not actually be the best filter to use? By approximating it you get a different filter, and whether it's a better filter is something you need to judge by looking at the result.

It looks like the sinc version is still adding a little bit of some higher frequencies (the damped sine wave), and the approximation doesn't. Maybe those higher frequencies don't actually make things look better?

gnramires|5 years ago

> In the article, talking about smoothstep approximation of sinc: "I'd argue the smoothstep version looks better" Why would this be? I would have thought the theoretically correct sinc version would look nicer.

In this case we are sort of mimicking the eye. The eye doesn't do sinc-bandlimiting (it does a sort of angular integration -- it sums the photons received in a region).

I say "sort of", because we're really doing two steps: first we are projecting a scene onto a screen, and then the eye is viewing the screen. In most cases we want what the eye sees on the screen to correspond to what it would see if it were looking at the scene directly.

The naive rendering approach simply samples an exact point for each pixel. When there's high pixel variation (higher spatial frequency than the pixel frequency), the samples alternate rapidly as you move the camera, which doesn't correspond to the desired eye reconstruction. The eye would see approximately an averaged (integrated) color over a small smooth angular window.

Note we really never get the perfect eye reconstruction unless the resolution of your display is much larger than the resolution your eye can perceive[1]. But through anti-aliasing at least this sampling artifact disappears.

This window integration is not ideal sinc filtering! Actually, it's not bandlimiting at all, since it is a finite-support convolution. Bandlimiting is just a convenient theoretical (approximate/incorrect) description.

In the frequency domain this convolution is not a box (ideal sinc filtering), it's smooth with ripples. In the spatial domain (that's really used here), it probably does look something like a smoothstep (a smooth window)[2]. The details don't matter if the resolution is large[3].
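To make the "smooth with ripples" point concrete, here is a small sketch that assumes the simplest finite window, a plain box of width w (not the eye's actual window). Averaging cos over that box gives cos(x) * sin(w/2) / (w/2), so the gain never cuts off sharply: it decays and changes sign past the first zero. The function names are mine.

```python
import math

def box_average_cos(x, w, n=2000):
    """Midpoint-rule average of cos over the window [x - w/2, x + w/2]."""
    h = w / n
    start = x - 0.5 * w
    return sum(math.cos(start + (i + 0.5) * h) for i in range(n)) / n

def attenuation(w):
    """Gain a box window of width w (in radians of phase) applies to a cosine."""
    return 1.0 if w == 0.0 else math.sin(0.5 * w) / (0.5 * w)

def box_filtered_cos(x, w):
    """Closed form of the same average: cos(x) * sin(w/2) / (w/2)."""
    return math.cos(x) * attenuation(w)

x = 0.7
for w in (0.5, 3.0, 3.0 * math.pi):
    print(w, box_average_cos(x, w), box_filtered_cos(x, w), attenuation(w))
```

At w = 3*pi the gain is negative rather than zero, which is one of the ripples: frequencies beyond the first null still leak through, just attenuated and inverted.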

[1] Plus we would actually need to model other optical effects of the eye (like focus and aberration) that I won't go into in detail :) But you can ask if interested.

[2] It looks something like this: https://foundationsofvision.stanford.edu/wp-content/uploads/... found here: https://foundationsofvision.stanford.edu/chapter-2-image-for... This describes only the optical behavior of the eye, there's also the sampling behavior of the retina.

[3] Because our own eye integrates the pixels anyway. Again this does ignore other optical effects of the eye (such as "focus" and aberration) that vary with distance to the focal plane, and more.

TL;DR: The correct function looks something like this https://foundationsofvision.stanford.edu/wp-content/uploads/... , which seems close to a smoothstep.

gnramires|5 years ago

Just fixing a mistake: what I described is the window function, not the integrated (in this case, cosine) function that was used in the article. In this case there would still be ripples when applying the shown window function (in cosine integration). I do think ripple-free functions (or ones with faster-decaying ripples) are probably better, because limited floating-point precision generates artifacts (which can be seen in the center of the second demo).

Experimentally playing around a little I've found

fcos = cos(x) * sin(0.5 * w)/(0.5 * w) * smoothstep(6.28,0.0,0.38*w)

To be a good compromise between eliminating high frequency ripple and maintaining good definition.
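For anyone who wants to plot it outside a shader, a rough Python port of the expression above. Assumptions: w is passed in as a plain parameter (the filter width for x), and smoothstep follows the usual GLSL formula. With edge0 > edge1 the GLSL spec leaves the result undefined, so the falling behavior here is an assumption about common implementations.

```python
import math

def smoothstep(edge0, edge1, x):
    """Hermite step following the usual GLSL formula.

    With edge0 > edge1 the result falls from 1 to 0 as x grows
    (assumed behavior; the GLSL spec does not define this case).
    """
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def fcos(x, w):
    """cos(x) times the sinc attenuation times a smoothstep fade on w."""
    sinc_term = 1.0 if w == 0.0 else math.sin(0.5 * w) / (0.5 * w)
    return math.cos(x) * sinc_term * smoothstep(6.28, 0.0, 0.38 * w)

for w in (0.0, 2.0, 6.0, 12.0, 20.0):
    print(w, fcos(1.0, w))
```

With these constants the fade reaches zero once 0.38 * w hits 6.28, so the output is exactly flat for w above roughly 16.5.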

CyberDildonics|5 years ago

The smoothstep function is close to a gaussian function, which is very difficult to beat as a pixel filter.
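A quick numerical check of how close the two are, as a sketch under one arbitrary choice: the smoothstep is turned into a bump (1 at the center, 0 at distance 1) and the Gaussian's sigma is picked so that both curves pass through 0.5 at the same point. Other ways of matching them would give different numbers.

```python
import math

def falling_smoothstep(t):
    """1 - smoothstep(0, 1, |t|): equals 1 at t = 0 and 0 for |t| >= 1."""
    u = min(abs(t), 1.0)
    return 1.0 - u * u * (3.0 - 2.0 * u)

# Hypothetical matching rule: both curves equal 0.5 at t = 0.5.
SIGMA = 0.5 / math.sqrt(2.0 * math.log(2.0))

def gaussian(t, sigma=SIGMA):
    """Unnormalized Gaussian with peak value 1."""
    return math.exp(-t * t / (2.0 * sigma * sigma))

ts = [i / 100.0 for i in range(-200, 201)]
max_diff = max(abs(falling_smoothstep(t) - gaussian(t)) for t in ts)
print("sigma:", SIGMA, "max abs difference:", max_diff)
```

Under this matching the two stay within 0.1 of each other everywhere, with the largest gap near the tail where the smoothstep bump reaches zero and the Gaussian does not.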