Love such articles where I learn something new. cdb is completely new to me. It's apparently the Microsoft Console Debugger. For others like me who were wondering how `eb win32u!NtUserSetLayeredWindowAttributes c3` neutered the window animation:
"By executing this command, you are effectively replacing the first byte of the `NtUserSetLayeredWindowAttributes` function with a `ret` instruction. This means that any call to `NtUserSetLayeredWindowAttributes` will immediately return without executing any of its original code. This can be used to bypass or disable the functionality of this function"
- eb[0] "enters bytes" into memory at the specified location;
- The RETN[1] instruction is encoded as C3 in x86 opcodes; and
- Debuggers will typically load debug symbols (on Windows that's PDBs rather than ELF) so you can refer to memory locations by name, i.e. a function name refers to its entry point.
Putting those three together, we almost get the author's command. I'm not sure about the "win32u!NtUser" name prefix, though. Is it name-munging performed on the compiler side? Maybe some debugger syntax thrown in to select the dll source of the name?
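(For what it's worth, the `win32u!` prefix appears to be the debugger's own `module!symbol` notation, where `win32u` names the DLL the symbol comes from, rather than any compiler-side name-mangling. A hypothetical cdb session, with `explorer.exe` as an example target only:)

```
C:\> cdb -pn explorer.exe
0:000> x win32u!NtUserSetLayeredWindowAttributes    $$ resolve module!symbol via loaded symbols
0:000> eb win32u!NtUserSetLayeredWindowAttributes c3    $$ write 0xC3 (ret) at the entry point
0:000> qd                                           $$ quit and detach
```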
I thought so, too. I'm not interested enough to benchmark it, but for all practical purposes it's instantaneous on my machine. As fast to open a new terminal as it is to switch to the existing one.
Interesting side note: our brain compensates for delay, and it can do so up to around 250 ms. So if anything lags up to that amount, our brain will compensate and make it feel instantaneous.
There was an interesting experiment that I reproduced at university: create an app that slowly builds up a delay on clicks, allowing the brain to adapt, and then remove the delay completely.
The result is that the app feels like it reacts just before you actually click, until the brain adapts again to the new timing.
> I’ve been using this configuration for a few days, so far it’s working great. I haven’t noticed any issues running it this way.
The journey was very useful, even if the destination may be pretty specific to your needs. The process of how to go about debugging minor annoyances like this is really hard to learn.
Just for fun I did film some video footage from my 60Hz monitor to see how quickly my terminal starts up. Seems like 2-3 frames to show up the terminal window, and 1-2 frames to show shell prompt. So 50 ms - 83 ms. This is with foot terminal on Sway.
My very unscientific methodology was to run
$ echo hello && foot
in a terminal and measure the time between the hello text appearing and the new window appearing. Looking at my video, the time from physical key press to "hello" text appearing might be 20ish ms but that is less clear, so about 100 ms total from key press to shell prompt.
This is a pretty much completely untuned setup; I haven't done any tweaks to improve the figures. Enabling foot server might shave some milliseconds, but tbh I don't feel that's necessary.
It'd be fun to do this with a better camera, and also with better monitors. Idk how difficult it would be to mod an LED into the keyboard to capture the exact moment the key is activated; just trying to eyeball the key movement is not very precise.
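For the frame counts above: at 60 Hz each frame lasts 1000/60 ≈ 16.7 ms, so 3-5 frames works out to roughly the 50-83 ms quoted. A quick sketch of that conversion:

```shell
# Frame-to-latency math for a 60 Hz recording:
# one frame = 1000/60 ms, so N frames ~ N * 16.7 ms.
for frames in 3 5; do
  awk -v f="$frames" 'BEGIN { printf "%d frames ~ %.0f ms\n", f, f * 1000 / 60 }'
done
```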
I'm at the tail end of my career, so working on efficiency gains like this doesn't usually add up for me.
However I was interested in knowing whether it does for the author.
Assuming they do suffer this 1300 ms delay "hundreds" of times a day (let's say 200), and for the sake of argument they use their computer 300 days a year and have 20 years of such work ahead of them with this config, then this inefficiency will total 1300 x 200 x 300 x 20 / 1000 / 60 / 60 hours wasted over the author's remaining career - some 430 hours.
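Checking that arithmetic (shell integer math, so the result floors):

```shell
# 1300 ms x 200 times/day x 300 days/year x 20 years, converted to hours
total_ms=$((1300 * 200 * 300 * 20))
echo "$((total_ms / 1000 / 3600)) hours"    # prints "433 hours"
```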
I had a printout of [1] at my office.
Of course, at its base it is only a simple multiplication table, but nevertheless it reminded me several times that an issue is worth fixing.
I'm so distracted by latency that I run my macOS with vsync disabled 24/7 (through Quartz Debug).
When I used to use Windows 10+ years ago, I had decent luck using xming + cygwin + Cygwin/X + bblean to run xterm in a minimal latency/lag environment.
I also launch Chrome/Spotify/Slack desktop using:
$ open -a Google\ Chrome --args --disable-gpu-vsync --disable-smooth-scrolling
One way to have your cake and eat it too is to upgrade to a high-refresh-rate display: no tearing + less latency + smoother display. Although there are diminishing returns, even 60Hz -> 144Hz+ will make a lot of difference. On a 240Hz display, the vsync penalty is just ~4ms.
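The "penalty" here is on the order of one refresh interval, i.e. 1000/Hz milliseconds:

```shell
# Worst-case wait for the next vsync is roughly one refresh interval
for hz in 60 144 240; do
  awk -v hz="$hz" 'BEGIN { printf "%3d Hz -> %4.1f ms per frame\n", hz, 1000 / hz }'
done
```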
Also if you are using a miniLED M-class MBP, its pixel response is abysmal.
Very nice article, I love such debugging. I sometimes do it myself too.
Anyway, this also made me think about the general bloat we have in new OSes and programs. I'm still on an old OS running spinning rust, and bash here starts instantly when the cache is hot. I think GUI designers lost the engineer's touch...
We need a community of those obsessed with responsive applications. UI latency irks me on every device. Not only computers and smart phones, but now TVs, refrigerators, cars all have atrocious UI latency.
If we're handing out tips, then as noted in a few examples from the article, hyperfine is even more useful when called with multiple commands directly. It presents a concise epilogue with the information you're probably trying to glean from a run such as yours:
    $ hyperfine -L arg '1,2,3' 'sleep {arg}'
    …
    Summary
      sleep 1 ran
        2.00 ± 0.00 times faster than sleep 2
        3.00 ± 0.00 times faster than sleep 3
If your commands don't share enough in common for that approach then you can declare them individually, as in "hyperfine 'blib 1' 'blob x y' 'blub --arg'", and still get the summary.
Besides learning about `hyperfine`, the combination of `xargs` to keep N warm processes ready, `LD_PRELOAD` to trick them into waiting to map their windows, and `pkill --oldest ...` to get one of those to go is quite neat.
But I have a very different solution to this problem: have just one terminal window and use and abuse `tmux`. I only use new windows (or tabs, if the terminal app has those) to run `ssh` to targets where I use `tmux`. I even nest `tmux` sessions, so essentially I've two levels of `tmux` sessions, and I title each window in the top-level session to match the name of the session running in that window -- this helps me find things very quickly. I also title windows running `vi` after the `basename` of the file being edited. Add in a simple PID-to-tmux window resolver script, scripts for utilities like `cscope` to open new windows, and this gets very comfortable, and it's fast. I even have a script that launches this whole setup should I need to reboot. Opening a new `tmux` window is very snappy!
Even 80ms seems unnecessarily slow to me. 300ms would drive me nuts ...
I'm using a tiling window manager (dwm), and interestingly the spawning time varies depending on the position the terminal window has to be rendered at.
I get the fastest startup time in the fullscreen tiling mode.
    $ hyperfine 'st -e true'
    Benchmark 1: st -e true
      Time (mean ± σ):   35.7 ms ± 10.0 ms  [User: 15.4 ms, System: 4.8 ms]
      Range (min … max): 17.2 ms … 78.7 ms  123 runs
The non-fullscreen one ends up at about 60ms which still seems reasonable.
vijucat|1 year ago
(Thanks to GitHub Copilot for the explanation quoted above.)
Also see https://learn.microsoft.com/en-us/windows-hardware/drivers/d...
xelxebar|1 year ago
[0]:https://learn.microsoft.com/en-us/windows-hardware/drivers/d...
[1]:http://ref.x86asm.net/geek64.html#xC3
imp0cat|1 year ago
In fact, keeping something preloaded and ready to go is quite common; these two examples are off the top of my head:
- The Emacs server way - https://ungleich.ch/u/blog/emacs-server-the-smart-way/
- SSH connection reuse.
beachy|1 year ago
So well worth the effort to fix!
sllabres|1 year ago
[1] https://xkcd.com/1205/
tonymet|1 year ago
Great debugging work to come up with a solution!
pimlottc|1 year ago
https://github.com/sharkdp/hyperfine