anqurvanillapy | 1 year ago
> Headers.
C++20 modules are still unstable and largely unused in the major compilers, but they are in the standard. And C is, ironically, perfect for FFI. As I said, almost every programming language speaks C: Rust’s WebAssembly API is extern "C", Java has JNI, so does every scripting language, and even Go, which talks to the OS solely through the syscall ABI, can only make foreign-function calls through Cgo. For some sad decades, C has been more than just an application/systems language.
> Big elephants.
Since I was in the zoo watching tigers:
A language mostly serves three groups of people: application writers, library writers, and compiler writers (the language itself).
I narrowed it down and started “small” to see whether people writing programs that cross kernel and user space would have more thoughts about C, since it’s the only choice there. That’s also my job: I built a distributed block device (an AWS EBS replacement) using SPDK, a distributed filesystem (a Ceph FS replacement) using FUSE, and a packet introspection module for a router using DPDK. I know how it feels.
As for the elephants you mentioned, I see them as belonging more to general library and application development, so here we go:
> Threading.
Async Rust is painful: Send + Sync + Pin, long signatures full of trait bounds, no async runtime in the standard library, and endless battles among third-party runtimes.
I would prefer Go for such problems. I’m not saying goroutines and channels are perfect (stackful is officially the only choice, and when goroutine stacks become memory intensive, going stackless is only possible with third-party event loops), but the built-in deadlock and race detection wins a lot here. It just crashes on a violation and loops forever on undetected deadlocks; I would probably go in this direction.
> Optimization, hardware.
Quite don’t understand why these concerns are “concerns” here.
The mindset is to carve out more known-safe parts of C, like a disallow list, rather than to put everything under a strong set of rules as Rust does, like an allowlist (mark `unsafe` to be nasty). It is not about making everything reasonable, safe and generally smart, which would be surreal.
C is still, ironically again, the best language for beating assembly on optimized performance, if you know these stories:
- The CPython interpreter recently got a 30% speedup in v3.14.
- The same technique was already applied 3 years ago in LuaJIT-Remake, where they rebuilt a Lua interpreter that beats the original handwritten assembly version, without inline caching.
- Sub-techniques of it have existed for more than a decade, even in the LLVM target for Haskell, and in theory they predate C.
It is essentially just an approach to matching what the real abstract machine looks like underneath.
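The comment does not name the technique. Assuming it refers to tail-call (threaded) dispatch, where every opcode handler ends by calling the next handler in tail position, a minimal sketch in C could look like the following. The toy bytecode, `vm_t`, and `run` are hypothetical names invented for this illustration, and the guaranteed-tail-call attribute that real interpreters depend on (for example Clang's `musttail`) is left out to keep the sketch portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode; names and encoding are made up for illustration. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

typedef struct {
    int64_t stack[64];
    size_t  sp;                 /* number of values currently on the stack */
} vm_t;

typedef int64_t (*handler_t)(vm_t *vm, const uint8_t *pc);

static int64_t op_push(vm_t *vm, const uint8_t *pc);
static int64_t op_add(vm_t *vm, const uint8_t *pc);
static int64_t op_mul(vm_t *vm, const uint8_t *pc);
static int64_t op_halt(vm_t *vm, const uint8_t *pc);

/* One handler per opcode, indexed by the opcode byte. */
static const handler_t dispatch[] = { op_push, op_add, op_mul, op_halt };

/* Every handler ends with a call in tail position, so an optimizing compiler
   can turn it into a plain jump. No stack bounds checks: this is a sketch. */
#define NEXT(vm, pc) return dispatch[*(pc)]((vm), (pc))

static int64_t op_push(vm_t *vm, const uint8_t *pc) {
    vm->stack[vm->sp++] = pc[1];        /* operand is the byte after the opcode */
    pc += 2;
    NEXT(vm, pc);
}

static int64_t op_add(vm_t *vm, const uint8_t *pc) {
    int64_t b = vm->stack[--vm->sp];
    int64_t a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a + b;
    pc += 1;
    NEXT(vm, pc);
}

static int64_t op_mul(vm_t *vm, const uint8_t *pc) {
    int64_t b = vm->stack[--vm->sp];
    int64_t a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a * b;
    pc += 1;
    NEXT(vm, pc);
}

static int64_t op_halt(vm_t *vm, const uint8_t *pc) {
    (void)pc;
    assert(vm->sp > 0);                 /* a result must be on the stack */
    return vm->stack[vm->sp - 1];
}

/* Runs a program until OP_HALT and returns the top of the stack. */
int64_t run(const uint8_t *code) {
    vm_t vm = { .sp = 0 };
    return dispatch[*code](&vm, code);
}
```

Compared with one big `switch` in a loop, each handler here is a small function whose hot state (`vm`, `pc`) travels in argument registers, which is roughly how a handwritten assembly interpreter is laid out.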
> libc.
Like I said, C is more than a language. People who need a different allocator algorithm swap it in behind malloc/free; Rust stopped using jemalloc by default and just uses malloc instead. Libc is a somewhat weird de facto interface.
SleepyMyroslav|1 year ago
>Modules
Nobody plans to provide other interfaces to OSes, middleware, or large established libraries. The economics are just not there.
>Threading
I was not talking about I/O at all. Everything you mention will be miles better in any high-level language, because waiting can be done in any language. Using threads for computation-intensive work is the niche for low-level languages. I would go further and say that copying stuff around and using mutexes will also be fine in high-level languages.
>Optimization/Hardware
It is very important to me. I don't know how it could be irrelevant to your plan for fixing a low-level language. Here are a couple of examples to try to shake things up.
The strlen implementation in glibc is not written in C. UB simply does not allow the same algorithm to be implemented, because reading up to the end of the memory page is outside the abstract machine. Also note how sanitizers are implemented to avoid checking the strlen implementation.
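To make the point concrete, here is a sketch of the word-at-a-time zero-byte test that optimized strlen implementations are generally built around. To stay within defined behavior, this version (a hypothetical `padded_strlen`, not any library's API) makes the caller promise that the buffer size is a multiple of 8 bytes, so every 8-byte load stays inside the object. A real strlen gets no such promise and instead relies on aligned loads never crossing a page boundary, which the C abstract machine cannot express.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero iff at least one byte of w is 0x00 (the classic bit trick). */
static int has_zero_byte(uint64_t w) {
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Hypothetical helper: like strlen, but the caller guarantees that `cap`
   (the buffer size in bytes) is a multiple of 8, so no load leaves the
   object. Returns cap if no terminator is found. */
size_t padded_strlen(const char *s, size_t cap) {
    assert(cap % 8 == 0);
    size_t i = 0;
    for (; i < cap; i += 8) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);    /* 8-byte load with defined behavior */
        if (has_zero_byte(w))
            break;                      /* the terminator is inside this word */
    }
    while (i < cap && s[i] != '\0')     /* locate it byte by byte */
        i++;
    return i;
}
```

Drop the `cap` promise and the loop has to read bytes after the terminator that may not belong to the string's object, which is exactly the step that falls outside the language.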
Pointer provenance is both present in every major compiler and impossible to define at the moment. You need to decide whether your language goes with the abstract machine, or GCC, or Clang, or Linux. None of them agree on it. A good attempt to add a logical model of pointer provenance to the C standard did not produce any results. If you want to read up on that, there was an HN thread about it recently.
>libc
I am pretty sure I can't move you on that. Just consider platforms that need to use new APIs for everything and have horrendous 'never to be used' shims to be POSIX 'compatible'. You can compile legacy things there, but running them does not make sense. Games tend to run there just fine because games have traditionally had the relevant low-level code written per platform anyway.
anqurvanillapy|1 year ago
You don’t. Read the features I listed. One ends up with an alternative C frontend (Cfront, if you love bad jokes) with a type system like Zig’s and without any standard library. No hash tables, no vectors. You tended to write large games with this.
As I said about the three main groups of users: if you’re concerned about application writing, ask about it. The rest of the comments were about possible directions of langdev.
> Modules.
You write C++ and don’t know what a standard is. Motivating examples, real world problems (full and incremental compilation, better compilation cache instead of precompiled headers), decades spent on discussions. Economy would come for projects with modern C++ features.
> Threading.
If you know Rust and Go, talk about them more. Go creates tasks and uses futexes, with the bare-bones syscall ABI. The higher-level primitives are easy to use. The tools and runtime are debugging-friendly.
I wrote Go components using channels that ran faster than atomics with waits, in a distributed filesystem metadata server.
For CPU-intensive work, I would talk about things like automatic vectorization, smarter boxing/unboxing, and smarter memory layout (aka levity, e.g. AoS vs SoA). Not the threading niche.
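As an illustration of the memory-layout point, here is a sketch of the same data stored as an array of structs (AoS) and as a struct of arrays (SoA). The particle types, field names, and `N_PARTICLES` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define N_PARTICLES 4

/* AoS: each element keeps all of its fields together. */
struct particle_aos { float x, y, z, mass; };

/* SoA: each field gets its own contiguous array. */
struct particles_soa {
    float x[N_PARTICLES], y[N_PARTICLES], z[N_PARTICLES], mass[N_PARTICLES];
};

/* Strided access: consecutive masses are sizeof(struct particle_aos) apart. */
float total_mass_aos(const struct particle_aos *p, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p[i].mass;
    return sum;
}

/* Unit-stride access: consecutive masses are adjacent in memory. */
float total_mass_soa(const struct particles_soa *p, size_t n) {
    assert(n <= N_PARTICLES);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p->mass[i];
    return sum;
}
```

Both functions compute the same value; only the layout differs. In C the programmer has to pick one layout by hand and rewrite every access to switch, which is the kind of decision a smarter language could take over.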
> Strlen implementation and plan of low level programming.
Because I keep talking about designing a general purpose language. One can also use LLVM IR to implement such algorithms.
The design space here is to write these if necessary. Go source code is full of assembly.
> Pointer provenance.
Search for Andras Kovacs’s implementation of 2ltt at ICFP 2024 (he actually finished it in 2022) and his dtt-rtcg, and you will realize how trivially these features can be implemented “for a new language”. I design new languages.
> libc.
Like I said, your happy new APIs invoke malloc.
jcranmer|1 year ago
> Quite don’t understand why these concerns are “concerns” here.
One of the most frustrating things about C is that it is generally taught together with assembly, so that there is a general conflation between C and assembly, as if C is both "just" some sort of portable assembler and the unique language with that property. The main consequence of this is that the C abstract machine [1] tends to be assumed to be the model of how processors work, and this ends up creating a lot of friction where the C abstract machine just doesn't match hardware newer than about 40 years old. It can be a little hard to understand just how bad the friction is if you haven't personally run across it, but here's a few examples:
* Registers. C doesn't have a concept of registers [2], and there's not much of an easy way to really distinguish between "things that look like a load/store because the abstract machine assumes everything has a memory location" and "no, this is meant to actually issue a hardware machine load/store or this is meant to actually permanently live in a register." There's also minor stuff like the fact that the language makes it easier to express "A[i]" (load A + i * sizeof(A)) over "&A[i]" (A + i * sizeof(A)) that makes it somewhat annoying if you want to express assembly concepts better.
* SIMD vectors. This is pretty common (at least across a desktop, server, or mobile CPU or GPU). But C has no way of expressing these types or how to use them, outside of compiler extensions (and there's like three incompatible versions of it).
* There's a lack of concept of optimization, and concomitant issues like optimization barriers. Some things have slowly moved in (e.g., there's now an attribute to indicate a function call is speculatable), but in general, it's still difficult to tell the compiler to stop doing some optimization that might break your code.
* No hardware speculation barrier concept, and similar other barriers for more exotic concepts like operations depending on the path condition of the function call (cryptographic code, which wants to be constant-time, or SIMT code tends to care about that a lot more).
[1] Or at least what people assume the semantics of the abstract machine are. Let's be frank, the C userbase isn't very good at actually knowing what the C standard does and doesn't guarantee.
[2] Yes, I know about the register keyword. No, it doesn't give C a meaningful concept of registers.
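As a sketch of the SIMD point above: one of those compiler extensions is the `vector_size` attribute accepted by GCC and Clang. It is not ISO C, and the type name `v4f` and the function `axpy4` are made up for this illustration.

```c
#include <assert.h>

/* GCC/Clang extension, not ISO C: a vector of four floats (16 bytes). */
typedef float v4f __attribute__((vector_size(16)));

static_assert(sizeof(v4f) == 16, "v4f should occupy 16 bytes");

/* Computes a * x + y on four lanes at once; the scalar a is broadcast by hand. */
v4f axpy4(float a, v4f x, v4f y) {
    v4f va = { a, a, a, a };
    return va * x + y;      /* element-wise operators come from the extension */
}
```

With these compilers, individual lanes can be read back with subscripts, for example `axpy4(2.0f, x, y)[0]`. None of this is portable to a compiler that lacks the extension, which is the complaint.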
needlesslygrim|1 year ago
On the other hand, I've found normal threading in Rust quite simple (generally using a thread pool).
anqurvanillapy|1 year ago
Sorry that I didn't really clarify the "pain", though:
It's quite like the experience of using parser combinators in Rust, where you can happily define the grammar and the parsing actions using the existing utilities. But once you have to do some easy wrapping, e.g. to make a combinator called `parenthesized` that surrounds an expression with parentheses, the "pain" kicks in: you have to leave out as many trait bounds as possible, since wiring up the type annotations becomes tedious. That came up while I was using a framework like `winnow`.
Async Rust kinda shares similar characteristics: utility functionality might bring in a lot of "typing wirings" that could terrify some people (well, I love it though).