Fiveplus | 1 month ago

>The goal is, that xfwl4 will offer the same functionality and behavior as xfwm4 does...

I wonder how strictly they interpret behavior here given the architectural divergence?

As an example, focus-stealing prevention. In xfwm4 (and x11 generally), this requires complex heuristics and timestamp checks because x11 clients are powerful and can aggressively grab focus. In wayland, the compositor is the sole arbiter of focus, hence clients can't steal it, they can only request it via xdg-activation. Porting the legacy x11 logic involves the challenge of actually designing a new policy that feels like the old heuristic but operates on wayland's strict authority model.

This leads to my main curiosity regarding the raw responsiveness of xfce. On potato hardware, xfwm4 often feels snappy because it can run as a distinct stacking window manager with the compositor disabled. Wayland, by definition, forces compositing. While I am not concerned about rust vs C latency (since smithay compiles to machine code without a GC), I am curious about the mandatory compositing overhead. Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

kelnos|1 month ago

(xfwl4 author here.)

> I wonder how strictly they interpret behavior here given the architectural divergence?

It's right there in the rest of the sentence (that you didn't quote all of): "... or as much as possible considering the differences between X11 and Wayland."

I'll do my best. It won't be exactly the same, of course, but it will be as close as I can get it.

> As an example, focus-stealing prevention.

Focus stealing prevention is a place where I think xfwl4 could be at an advantage over xfwm4. Xfwm4 does a great job at focus-stealing prevention, but it has to work on a bunch of heuristics, and sometimes it just does the wrong thing, and there's not much we can do about it. Wayland's model plus xdg-activation should at least make the focus-or-don't-focus decision much more consistent.
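To make the focus-or-don't-focus decision concrete, here is a minimal sketch of what a token-based policy could look like. It is not xfwl4 or Smithay code: `ActivationToken`, its fields, `should_grant_focus`, and the 10-second expiry are all hypothetical names and values chosen for illustration. The only thing taken from the discussion is the idea that a client can merely request focus via xdg-activation and the compositor decides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActivationToken:
    """Hypothetical record a compositor might keep for each issued token."""
    issuer_had_focus: bool    # did the requesting surface hold focus at issue time?
    input_serial_valid: bool  # did the request cite a real, recent input event?
    issued_at_ms: int         # compositor clock when the token was issued


def should_grant_focus(token: Optional[ActivationToken],
                       now_ms: int,
                       max_age_ms: int = 10_000) -> bool:
    """Decide whether an activation request may take focus (assumed policy)."""
    if token is None:
        return False  # no token at all: never steal focus
    if not token.issuer_had_focus:
        return False  # token came from a background client
    if not token.input_serial_valid:
        return False  # not tied to a user action
    # Stale tokens are rejected so an old click cannot raise a window later.
    return now_ms - token.issued_at_ms <= max_age_ms
```

Every branch here is a deterministic yes or no based on state the compositor itself recorded, which is the consistency gain over timestamp heuristics. A real policy would likely be more lenient in places, for example by marking the window as demanding attention instead of ignoring the request.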

> I am curious about the mandatory compositing overhead. Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

I'm not sure yet, but I suspect your fears are well-founded here. On modern (and even not-so-modern) hardware, even low-end GPUs should be fine with all this (on my four-year-old laptop with Intel graphics, I can't tell the difference performance-wise with xfwm4's compositor on or off). But I know people run Xfce/X11 on very-not-modern hardware, and those people may unfortunately be left behind. But we'll see.

argulane|1 month ago

If xfwl4 plans to implement something like sway's output max_render_time, then input-to-pixel latency should be the same as or even lower than on X11.
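As a rough illustration of why that option helps: as I understand sway's documentation, max_render_time makes the compositor start compositing a fixed number of milliseconds before the next refresh, rather than right after the previous one. The sketch below models only that timing. The function names are hypothetical, and it assumes the compositor always finishes within its budget.

```python
def composite_delay_ms(refresh_hz: float, max_render_time_ms: float) -> float:
    """How long after a vblank the compositor waits before compositing."""
    period_ms = 1000.0 / refresh_hz
    # Never wait a negative amount if the budget exceeds the frame period.
    return max(0.0, period_ms - max_render_time_ms)


def max_content_age_ms(refresh_hz: float, max_render_time_ms: float) -> float:
    """Oldest a freshly committed client buffer can be when it reaches scanout.

    Assumes the compositor picks up the newest buffer at the moment it starts
    compositing and that the result is shown at the very next vblank.
    """
    period_ms = 1000.0 / refresh_hz
    return min(period_ms, max_render_time_ms)


if __name__ == "__main__":
    for budget in (16.0, 4.0, 1.0):
        print(budget, composite_delay_ms(60, budget), max_content_age_ms(60, budget))
```

The smaller the budget, the later the compositor samples client buffers and the fresher the frame that reaches the screen. The cost is that a budget that is too small makes the compositor miss the refresh entirely, which is worse than the latency it was trying to save.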

pjmlp|1 month ago

At least they are honest regarding the reasons, not a wall of text to justify what boils down to "because I like it".

Naturally, having this kind of language island creates some friction regarding build tooling, integration with the existing ecosystem, and who is able to contribute to what.

So let's see how it evolves. Even with my C bashing, I was a much happier XFCE user than with GNOME and GJS all over the place.

amazari|1 month ago

You know that all the Wayland primitives, event handling and drawing in gnome-shell are handled in C/native code through Mutter, right? The JavaScript in gnome-shell is the cherry on top for scripting, similar to C#/Lua (or any GCed language) in game engines, elisp in Emacs, even JS in QtQuick/QML.

It is not the performance bottleneck people seem to believe.

simoncion|1 month ago

> ...or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

I think I know what "frame perfect" means, and I'm pretty sure that you've been able to get that for ages on X11... at least with AMD/ATi hardware. Enable (or have your distro enable) the TearFree option, and there you go.

I read somewhere that TearFree is triple buffering, so -if true- it's my (perhaps mistaken) understanding that this adds a frame of latency.

wtallis|1 month ago

> I read somewhere that TearFree is triple buffering, so -if true- it's my (perhaps mistaken) understanding that this adds a frame of latency.

True triple buffering doesn't add one frame of latency, but since it enforces only whole frames be sent to the display instead of tearing, it can cause partial frames of latency. (It's hard to come up with a well-defined measure of frame latency when tearing is allowed.)

But there have been many systems that abused the term "triple buffering" to refer to a three-frame queue, which always does add unnecessary latency, making it almost always the wrong choice for interactive systems.
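The difference between the two meanings can be seen in a toy simulation. This is a deliberately simplified model under assumed numbers (a renderer that finishes a frame every 5 ms, a 16 ms refresh period), and `simulate` and its mode names are made up for the example. "mailbox" stands for true triple buffering, where the display takes the newest finished frame and older ones are dropped. "fifo" stands for the three-frame queue, where frames are shown strictly in order and the renderer stalls when two frames are already waiting.

```python
from collections import deque


def simulate(mode: str, render_ms: float = 5.0, refresh_ms: float = 16.0,
             vblanks: int = 50) -> float:
    """Return the average age (ms) of the frame picked up at each vblank."""
    pending = deque()        # completion times of frames waiting to be shown
    next_done = render_ms    # when the renderer finishes its current frame
    ages = []
    for i in range(1, vblanks + 1):
        vblank = i * refresh_ms
        # Let the renderer run until this vblank.
        while next_done <= vblank:
            if mode == "mailbox":
                pending.clear()              # newest frame replaces older ones
                pending.append(next_done)
                next_done += render_ms
            elif len(pending) < 2:           # fifo with a free slot
                pending.append(next_done)
                next_done += render_ms
            else:                            # fifo full: stall until the vblank
                next_done = vblank + render_ms
                break
        if pending:
            ages.append(vblank - pending.popleft())
    return sum(ages) / len(ages)


if __name__ == "__main__":
    print("mailbox:", simulate("mailbox"))
    print("fifo:   ", simulate("fifo"))
```

In this model the mailbox variant shows frames that are at most one render time old, while the queue settles into showing frames more than a full refresh period old, because each frame has to wait behind the one queued before it.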

gryn|1 month ago

only on the primary display. once you had more than one display there were only workarounds.

badsectoracula|1 month ago

One thing to keep in mind is that composition does not mean you have to do it with vsync, you can just refresh the screen the moment a client tells you the window has new contents.

mikkupikku|1 month ago

Compositor overhead even with cheapo Intel laptop graphics is basically a non-issue these days. The people still rocking their 20 year old thinkpads might want to choose something else, but besides that kind of user I don't think it's worth worrying too much about.

josefx|1 month ago

It isn't always pure overhead, but also jitter, additional delays and other issues caused by the indirection. Most systems have a way to mostly override the compositor for fullscreen windows, and for games and other applications where visible jitter and delays are an issue, you want that even on modern hardware.

aktau|1 month ago

That matches what I recall too, back when I ran a very cheap integrated intel (at least that's what I recall) card on my underpowered laptop. I posted a few days ago with screenshots of my 2009 setup with awesome+xcompmgr, and I remember it being very snappy (much more so than my tuned Windows XP install at the time). https://news.ycombinator.com/item?id=46717701

i80and|1 month ago

I ran xfwm's compositor back when it was first introduced on a 400 MHz Pentium II with a GeForce 2. It was fully fine.

The compositing tax is just waiting for vsync; unless your machine is, like, a Pentium Classic, compositing itself isn't a problem.

jchw|1 month ago

> Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

I think this is ultimately correct. The compositor will have to render a frame at some point after the VBlank signal, and it will need to render with it the buffers on-screen as of that point, which will be from whatever was last rendered to them.

This can be somewhat alleviated, though. Both KDE and GNOME have been getting progressively more aggressive about "unredirecting" surfaces into hardware accelerated DRM planes in more circumstances. In this situation, the unredirected planes will not suffer compositing latency, as their buffers will be scanned out by the GPU at scanout time with the rest of the composited result. In modern Wayland, this is accomplished via both underlays and overlays.

There is also a slight penalty to the latency of mouse cursor movement that is imparted by using atomic DRM commits. Since using atomic DRM is very common in modern Wayland, it is normal for the cursor to have at least a fraction of a frame of added latency (depending on many factors.)

I'm of two minds about this. One, obviously it's sad. The old hardware worked perfectly and never had latency issues like this. Could it be possible to implement Wayland without full compositing? Maybe, actually. But I don't expect anyone to try, because let's face it, people have simply accepted that we now live with slightly more latency on the desktop. But then again, "old" hardware is now hardware that can, more often than not, handle high refresh rates pretty well on desktop. An on-average increase of half a frame of latency is pretty bad with 60 Hz: it's, what, 8.3 ms? But half a frame at 144 Hz is much less, at somewhere around 3.5 ms of added latency, which I think is more acceptable. Combined with aggressive underlay/overlay usage and dynamic triple buffering, I think this makes the compositing experience an acceptable tradeoff.
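The half-frame figures are easy to check. The helper below is just the arithmetic, assuming the added latency averages half of one refresh period; the function name is made up for the example.

```python
def half_frame_latency_ms(refresh_hz: float) -> float:
    """Average added latency if compositing costs half a frame on average."""
    frame_period_ms = 1000.0 / refresh_hz
    return frame_period_ms / 2.0


if __name__ == "__main__":
    for hz in (60, 100, 144):
        print(f"{hz} Hz: {half_frame_latency_ms(hz):.1f} ms")
```

This gives about 8.3 ms at 60 Hz and about 3.5 ms at 144 Hz, matching the numbers above, with 100 Hz landing at 5 ms.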

What about computers that really can't handle something like 144 Hz or higher output? Well, tough call. I mean, I have some fairly old computers that can definitely handle at least 100 Hz very well on desktop. I'm talking Pentium 4 machines with old GeForce cards. Linux is certainly happy to go older (though the baseline has been inching up there; I think you need at least Pentium now?) but I do think there is a point where you cross a line where asking for things to work well is just too much. At that point, it's not a matter of asking developers to not waste resources for no reason, but asking them to optimize not just for reasonably recent machines but also to optimize for machines from 30 years ago. At a certain point it does feel like we have to let it go, not because the computers are necessarily completely obsolete, but because the range of machines to support is too wide.

Obviously, though, simply going for higher refresh rates can't fix everything. Plenty of laptops have screens that can't go above 60 Hz, and they are forever stuck with a few extra milliseconds of latency when using a compositor. It is not ideal, but what are you going to do? Compositors offer many advantages, and it seems straightforward to design for a future where they are always on.

drob518|1 month ago

Love your post. So, don’t take this as disagreement.

I’m always a little bewildered by frame rate discussions. Yes, I understand that more is better, but for non-gaming apps (e.g. “productivity” apps), do we really need much more than 60 Hz? Yes, you can get smoother fast scrolling with higher frame rate at 120 Hz or more, but how many people were complaining about that over the last decade?

michaelmrose|1 month ago

I couldn't find ready stats on what percentage of displays are 60 Hz, but outside of gaming and high-end machines I suspect 60 Hz is still what the majority of machines used by actual users run at, meaning we should evaluate the latency as it is observed by most users.

account42|1 month ago

> As an example, focus-stealing prevention. In xfwm4 (and x11 generally), this requires complex heuristics and timestamp checks because x11 clients are powerful and can aggressively grab focus. In wayland, the compositor is the sole arbiter of focus, hence clients can't steal it, they can only request it via xdg-activation. Porting the legacy x11 logic involves the challenge of actually designing a new policy that feels like the old heuristic but operates on wayland's strict authority model.

Not that that's necessarily the best way to do it, but nothing stops xfwl4 from simply granting every focus request and then applying its existing heuristics on the result of that.

PunchyHamster|1 month ago

> Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

Well, the answer is just no: wayland has been consistently slower than X11, and nothing running on top can really get around that.

ca1f|1 month ago

Can you cite any sources for that claim? I found this blog post that says wayland is pretty much on par with X11 except for XWayland, which should be considered a band-aid only anyways: https://davidjusto.com/articles/m2p-latency/

happymellon|1 month ago

> wayland has been consistently slower than X11

Wayland is a specification; it cannot itself be "faster" or "slower" than other options. That's like saying JSON is 5% slower than Word.

And as for the implementations being slower than X, that also doesn't reflect reality.

https://www.phoronix.com/review/ubuntu-2504-x11-gaming

michaelmrose|1 month ago

There is no Wayland to run on top of, as it's a standard to implement rather than a server to talk to.