(no title)
Fiveplus | 1 month ago
I wonder how strictly they interpret behavior here given the architectural divergence?
As an example, focus-stealing prevention. In xfwm4 (and x11 generally), this requires complex heuristics and timestamp checks because x11 clients are powerful and can aggressively grab focus. In wayland, the compositor is the sole arbiter of focus, hence clients can't steal it, they can only request it via xdg-activation. Porting the legacy x11 logic involves the challenge of actually designing a new policy that feels like the old heuristic but operates on wayland's strict authority model.
This leads to my main curiosity regarding the raw responsiveness of xfce. On potato hardware, xfwm4 often feels snappy because it can run as a distinct stacking window manager with the compositor disabled. Wayland, by definition forces compositing. While I am not concerned about rust vs C latency (since smithay compiles to machine code without a GC), I am curious about the mandatory compositing overhead. Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?
kelnos|1 month ago
> I wonder how strictly they interpret behavior here given the architectural divergence?
It's right there in the rest of the sentence (that you didn't quote all of): "... or as much as possible considering the differences between X11 and Wayland."
I'll do my best. It won't be exactly the same, of course, but it will be as close as I can get it.
> As an example, focus-stealing prevention.
Focus stealing prevention is a place where I think xfwl4 could be at an advantage over xfwm4. Xfwm4 does a great job at focus-stealing prevention, but it has to work on a bunch of heuristics, and sometimes it just does the wrong thing, and there's not much we can do about it. Wayland's model plus xdg-activation should at least make the focus-or-don't-focus decision much more consistent.
> I am curious about the mandatory compositing overhead. Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?
I'm not sure yet, but I suspect your fears are well-founded here. On modern (and even not-so-modern) hardware, even low-end GPUs should be fine with all this (on my four-year-old laptop with Intel graphics, I can't tell the difference performance-wise with xfwm4's compositor on or off). But I know people run Xfce/X11 on very-not-modern hardware, and those people may unfortunately be left behind. But we'll see.
argulane|1 month ago
pjmlp|1 month ago
Naturally these kinds of having a language island create some attrition regarding build tooling, integration with existing ecosystem and who is able to contribute to what.
So lets see how it evolves, even with my C bashing, I was a much happier XFCE user than with GNOME and GJS all over the place.
amazari|1 month ago
It is not the performance bottleneck people seem to believe.
simoncion|1 month ago
I think I know what "frame perfect" means, and I'm pretty sure that you've been able to get that for ages on X11... at least with AMD/ATi hardware. Enable (or have your distro enable) the TearFree option, and there you go.
I read somewhere that TearFree is triple buffering, so -if true- it's my (perhaps mistaken) understanding that this adds a frame of latency.
wtallis|1 month ago
True triple buffering doesn't add one frame of latency, but since it enforces only whole frames be sent to the display instead of tearing, it can cause partial frames of latency. (It's hard to come up with a well-defined measure of frame latency when tearing is allowed.)
But there have been many systems that abused the term "triple buffering" to refer to a three-frame queue, which always does add unnecessary latency, making it almost always the wrong choice for interactive systems.
gryn|1 month ago
badsectoracula|1 month ago
mikkupikku|1 month ago
josefx|1 month ago
aktau|1 month ago
i80and|1 month ago
The compositing tax is just waiting for vsync; unless your machine is, like, a Pentium Classic, compositing itself isn't a problem.
jchw|1 month ago
I think this is ultimately correct. The compositor will have to render a frame at some point after the VBlank signal, and it will need to render with it the buffers on-screen as of that point, which will be from whatever was last rendered to them.
This can be somewhat alleviated, though. Both KDE and GNOME have been getting progressively more aggressive about "unredirecting" surfaces into hardware accelerated DRM planes in more circumstances. In this situation, the unredirected planes will not suffer compositing latency, as their buffers will be scanned out by the GPU at scanout time with the rest of the composited result. In modern Wayland, this is accomplished via both underlays and overlays.
There is also a slight penalty to the latency of mouse cursor movement that is imparted by using atomic DRM commits. Since using atomic DRM is very common in modern Wayland, it is normal for the cursor to have at least a fraction of a frame of added latency (depending on many factors.)
I'm of two minds about this. One, obviously it's sad. The old hardware worked perfectly and never had latency issues like this. Could it be possible to implement Wayland without full compositing? Maybe, actually. But I don't expect anyone to try, because let's face it, people have simply accepted that we now live with slightly more latency on the desktop. But then again, "old" hardware is now hardware that can more often than not, handle high refresh rates pretty well on desktop. An on-average increase of half a frame of latency is pretty bad with 60 Hz: it's, what, 8.3ms? But half a frame at 144 Hz is much less at somewhere around 3.5ms of added latency, which I think is more acceptable. Combined with aggressive underlay/overlay usage and dynamic triple buffering, I think this makes the compositing experience an acceptable tradeoff.
What about computers that really can't handle something like 144 Hz or higher output? Well, tough call. I mean, I have some fairly old computers that can definitely handle at least 100 Hz very well on desktop. I'm talking Pentium 4 machines with old GeForce cards. Linux is certainly happy to go older (though the baseline has been inching up there; I think you need at least Pentium now?) but I do think there is a point where you cross a line where asking for things to work well is just too much. At that point, it's not a matter of asking developers to not waste resources for no reason, but asking them to optimize not just for reasonably recent machines but also to optimize for machines from 30 years ago. At a certain point it does feel like we have to let it go, not because the computers are necessarily completely obsolete, but because the range of machines to support is too wide.
Obviously, though, simply going for higher refresh rates can't fix everything. Plenty of laptops have screens that can't go above 60 Hz, and they are forever stuck with a few extra milliseconds of latency when using a compositor. It is unideal, but what are you going to do? Compositors offer many advantages, it seems straightforward to design for a future where they are always on.
drob518|1 month ago
I’m always a little bewildered by frame rate discussions. Yes, I understand that more is better, but for non-gaming apps (e.g. “productivity” apps), do we really need much more than 60 Hz? Yes, you can get smoother fast scrolling with higher frame rate at 120 Hz or more, but how many people were complaining about that over the last decade?
michaelmrose|1 month ago
account42|1 month ago
Not that that's necessarily the best way to do it but nothing stops xfwl4 from simply granting every focus request and then applying their existing heuristics on the result of that.
PunchyHamster|1 month ago
well, the answer is just no, wayland has been consistently slower than X11 and nothing running on top can't really go around that
ca1f|1 month ago
happymellon|1 month ago
Wayland is a specification, it has an inability to be "faster" than other options. That's like saying JSON is 5% slower than Word.
And as for the implementations being slower than X, that also doesn't reflect reality.
https://www.phoronix.com/review/ubuntu-2504-x11-gaming
michaelmrose|1 month ago
imcritic|1 month ago
stonogo|1 month ago
https://gitlab.xfce.org/xfce/xfwm4/-/blob/master/settings-di...