Back in the Leopard days, I was playing with the x86-64 ABI on Mac OS X (no real documentation existed, or at least none that I could find, apart from the source code). Very soon I accidentally ran into a kernel panic that could be reduced to a three-instruction program:
mov rax, 1
mov rdi, 1
syscall
and it would bring down the entire OS (kernel panic) when run by any user mode application under a standard non-root user. Took a few releases for them to fix that. I stopped trusting them from a low-level reliability perspective since then.
Fuzzing their system call interface may not have been a bad idea.
P.S. XNU is interesting in that it lets userspace use Mach syscalls as well as a bunch of BSD system calls directly. The more esoteric interactions between them are probably not very well thought out. (I hope I am not offending Avie Tevanian here :))
I immediately thought of the htop bug and then she references that in the article. How is this not fixed yet? Like, this is a security bug, right? You can DoS from a simple usermode app with this bug.
I don’t think the attack vector (presumably asking for admin credentials to install a startup item) would be any different than an app that wanted to fork bomb, allocate too much memory, or spin wait on all cores. In that way I don’t think it’s any more security critical than any other bug that hangs the system.
It’s definitely something that should be fixed of course.
> This will be a kernel data structure protected by a mutex or semaphore. task_for_pid waits at pri>=0 for a wakeup that won't happen because race. ps queues behind it at pri<0 (disabling ^C). At least two bugs there.
Seems like a reasonable explanation as to the underlying cause of the behaviour.
Wanted to point out the same. It’s time for Apple to fix something here. Not sure I like the idea of working with timeouts in code just to prevent the bug from happening.
The htop report (in TFA and another comment) seems High Sierra specific (as in, users report the issue starting right when they upgraded to High Sierra). Running the C snippet from TFA on my system (a 2010 MBP running El Cap'), I can run through 15000 iterations without any freeze; according to TFA, "sometimes it needs to try 10 times before it’ll freeze."
Htop also had segfaults after a while in OpenBSD 6.2 i386, when I used OpenBSD exclusively for a couple of weeks. It could also be present in other BSD-like kernels.
At first I remembered this and wondered if it was related:
"The rules for using Objective-C between fork() and exec() have changed in macOS 10.13. Incorrect code that happened to work most of the time in the past may now fail. Some workarounds are available."
Question from someone who knows very little about the Mach/XNU APIs: Does this code leak Mach ports? If you call task_for_pid, you get back a Mach task port. Do you have to close the port with mach_port_deallocate? Could a resource leak be contributing to the system freeze?
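For what it's worth, here is a minimal C sketch of the lookup-then-release pattern the question describes. It assumes the usual Mach convention that a send right returned by task_for_pid stays in the caller's port namespace until it is released with mach_port_deallocate. The helper name probe_task_port and its return codes are made up for illustration and are not from TFA, and the sketch does not answer whether a leak actually contributes to the freeze.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef __APPLE__
#include <mach/mach.h>
#include <mach/mach_traps.h>
#endif

/* Hypothetical helper (not from the article): look up the task port for
 * `pid`, then release the send right again.
 * Returns 0 on success, -1 if a Mach call failed, and -2 when built on a
 * non-Apple system where the Mach APIs do not exist. */
int probe_task_port(pid_t pid)
{
    assert(pid > 0); /* only ordinary user processes are meant here */
#ifdef __APPLE__
    mach_port_name_t task = MACH_PORT_NULL;

    /* Ask the kernel for a send right to the target task's port. */
    kern_return_t kr = task_for_pid(mach_task_self(), (int)pid, &task);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "task_for_pid failed: %d\n", kr);
        return -1;
    }

    /* Assumption: without this call the right stays in our port
     * namespace, which is the kind of leak the question asks about. */
    kr = mach_port_deallocate(mach_task_self(), task);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "mach_port_deallocate failed: %d\n", kr);
        return -1;
    }
    return 0;
#else
    (void)pid;
    return -2; /* Mach ports are not available on this platform */
#endif
}
```

Calling probe_task_port(getpid()) is the least intrusive way to try it, since a process asking for its own task port normally needs no extra privileges. Comparing a loop with and without the mach_port_deallocate line would be an experiment, not a diagnosis.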
You mean Sierra / High Sierra. This is because of compatibility breaking changes to some low-level kernel system calls. Since valgrind is essentially a CPU emulator, it is tightly integrated with the OS kernel, and has to be updated accordingly. The macOS contributors to valgrind seem to be relatively few, probably because most macOS developers primarily use the various sanitisers in clang (they also have UI integration in Xcode).
Have you tried the clang or gcc asan/tsan/ubsan sanitisers as a replacement? There are pros and cons of valgrind vs compile-time instrumentation. The sanitisers increase the memory footprint, but run with less overhead. valgrind can detect some errors that the sanitisers cannot, etc.
I found the style of the article to be quite refreshing somehow. The OP is not trying to look like a smartass about the discovery (a trait very common in the IT industry), and she acknowledges that she doesn't really understand what the underlying cause is. She is just happy that she discovered something and is keen on sharing it with the world.
As long as we are talking about anecdotal evidence: my MBP with High Sierra has been running fine for the last few months. I haven't encountered any issues in my day-to-day work as an iOS app developer, nor in my home use. I think it's a pretty decent release, though it didn't add any new features that I feel I really need.
The Mach kernels in general are buggy, regardless of the release. E.g. I'm pretty sure they end up delivering SIGPIPE to the wrong thread in some circumstances on Sierra. There is also a problem with recvmsg sometimes not returning control messages.
Qualitatively, it feels like Apple’s software quality has been on a slide for several years.
What attracted me to move to Mac OS in the first place some 15 years ago was the sheer quality. It was thrilling to use a computer that Just Worked, with no BSODs or the endless dependency hell that was Linux at the time.
It doesn’t feel like that any more, across either OSX or iOS - it feels fragile. Things crash, behaviours are inconsistent, and it feels like more emphasis has been placed on immediate commerciality than long term retention through quality.
For what it’s worth, I’m in the process of moving to Linux on my MBP. The pros of OSX just aren’t as strong any more.
I've edited 'he' to 'she' in the two otherwise fine comments that made this mistake (https://news.ycombinator.com/item?id=16251566 and https://news.ycombinator.com/item?id=16251562) and grouped several empty replies and one lame off-topic subthread under this one. It's rare that we do something like this (and I've emailed the author), but it seems fairer than to penalize their original posts, which were otherwise informative and on topic.
[+] [-] mehrdada|8 years ago|reply
[+] [-] djsumdog|8 years ago|reply
Nothing official from Apple on this yet?
[+] [-] Moto7451|8 years ago|reply
[+] [-] Cogito|8 years ago|reply
[0] https://twitter.com/cliffordheath/status/957505667568353280
[+] [-] Doctor_Fegg|8 years ago|reply
[+] [-] tomsmeding|8 years ago|reply
[+] [-] hit8run|8 years ago|reply
[+] [-] phreack|8 years ago|reply
[+] [-] masklinn|8 years ago|reply
[+] [-] jacksmith21006|8 years ago|reply
[+] [-] terminalcommand|8 years ago|reply
[+] [-] jrochkind1|8 years ago|reply
http://sealiesoftware.com/blog/archive/2017/6/5/Objective-C_...
But seems like no, actually, at least not obviously.
[+] [-] skissane|8 years ago|reply
[+] [-] stochastic_monk|8 years ago|reply
Which is a huge pain, because it means that if I ever need to use it, I have to debug on a server.
[+] [-] jchb|8 years ago|reply
[+] [-] saagarjha|8 years ago|reply
[+] [-] pcwalton|8 years ago|reply
[+] [-] rokhinip|8 years ago|reply
[+] [-] thibaut_barrere|8 years ago|reply
[+] [-] spooneybarger|8 years ago|reply
[+] [-] crypt1d|8 years ago|reply
[+] [-] icebraining|8 years ago|reply
[+] [-] _asummers|8 years ago|reply
[+] [-] tenaciousDaniel|8 years ago|reply
[+] [-] stevekemp|8 years ago|reply
On the other hand, I find that if every other sentence ends with an exclamation mark or a question mark, it gets quite exhausting.
[+] [-] tambourine_man|8 years ago|reply
I’m sure there’s a cognitive bias partially to blame (since it’s the most recent) but it looks like we are way past that.
[+] [-] wsc981|8 years ago|reply
[+] [-] danieltillett|8 years ago|reply
[+] [-] lmb|8 years ago|reply
Another reason they killed OS X Server maybe.
[+] [-] madaxe_again|8 years ago|reply
[+] [-] dang|8 years ago|reply
[+] [-] Hello71|8 years ago|reply
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] lima|8 years ago|reply
[+] [-] nathell|8 years ago|reply
[+] [-] chirau|8 years ago|reply
[deleted]
[+] [-] freehunter|8 years ago|reply
[+] [-] aw3c2|8 years ago|reply
[+] [-] GirlsCanCode|8 years ago|reply
[deleted]
[+] [-] GoofballJones|8 years ago|reply