Raptor22 | 3 years ago

Can someone explain the differences between Google and the C++ standards committee? I understand it has to do with breaking ABI, but what exactly does this imply?

As someone much more experienced in other languages like Python and Swift, my biggest surprise when using C++ is that the language standard and the compiler are two different things. In Python, the default CPython interpreter is the standard - if it's not in CPython, it's not standard Python. But in C++, there's like 3 different compilers, all of which implement different subsets of the latest C++ standards. IIRC, the latest standard that all compilers implement fully is C++ 14. I understand C++ is a broad and complicated language, and I actually have liked using it in the few times I've had the chance, but the compiler/standard situation seems like complete lunacy from where I stand.

deng|3 years ago

> As someone much more experienced in other languages like Python and Swift, my biggest surprise when using C++ is that the language standard and the compiler are two different things.

Yes, that's because C++ is a formally standardized language, and Python and Swift are not. C++ is an ISO standard, and it is published in text form. You are free to create your own compiler, and if it works according to the rules written in the official standard, you may call it a standard-conforming compiler. There are of course other languages which are also officially standardized like this (not necessarily by ISO), for instance Common Lisp, Scheme and of course JavaScript (ECMA). If a language is defined by a reference implementation, then this is an informal way of standardizing your language, and if you want to create your own compiler/interpreter for such a language, you have to carefully evaluate what the reference implementation does.

> I understand it has to do with breaking ABI, but what exactly does this imply?

Very roughly, it means that you cannot safely link object code built against the current ABI with object code that was generated by older compilers using the old ABI (well, it's usually worse: it may link just fine, but your program might or might not crash). This is not unprecedented in C++: we had this when switching from gcc4 to gcc5. It was definitely pretty painful and I'd rather not have this again.

yeputons|3 years ago

Google is a company which uses C++ in a specific way. For example, they build most of their code from source with the same compiler and compilation settings, including external dependencies. They do not use exceptions, or some parts of the standard library, or some parts of the language. They can also upgrade their codebase in more-or-less atomic company-wide refactorings which also run most (if not all) tests for the affected code.

The C++ standards committee is a body of people with different backgrounds, from different companies, and with different needs from C++. Moreover, they kind of have to cater to everybody. There may be people programming microcontrollers, there may be game developers, there may be people supporting/porting decades-old software which uses old versions of Qt, there may be people wanting all the modern bells and whistles in C++, and there may be people using pre-compiled third-party libraries from a long-gone vendor or a vendor not willing to upgrade their compiler.

These needs may contradict each other. ABI (Application Binary Interface) in C++ can be thought of as a `.pyc` file in Python. You don't expect _any_ compatibility of `.pyc` files between Python versions, so all libraries are distributed as source code in `.py`, and it's up to Python to process them into `.pyc` files as it wants. In the C++ world, lots of libraries (including all OS libraries, actually) are distributed in a pre-compiled binary form only. The way your program interacts with the library is the ABI. If you upgrade your compiler and its ABI changes, but the library is not rebuilt to match, you can no longer use the library.

One example is the memory layout of standard library types. E.g. a release version of `std::vector` (the standard dynamic array container) may only need three fields: a pointer to the allocated memory, the capacity of the vector, and its current size. A debug version may also include stuff like "where this vector was created". If one part of your program (or an external library) expects a vector to be 24 bytes and have such and such fields, and another part expects something else, Everyone Dies(tm). To make things worse, everything breaks silently: there are few to no checks for ABI compatibility, and reading a byte almost always works.

As to why the standard is affected by this, even though "ABI" is not mentioned anywhere in the text: you cannot add/remove fields or virtual methods within the standard library in the next standard. If you do, the newer standard library _must_ become incompatible with all the pre-compiled code expecting the older version. (Almost) no way around it. A similar thing in Python would be if one tried to use both Python 2 and Python 3 in the same project simultaneously, in the same process.

So if everyone starts building all their code and dependencies from the source code, there will be no ABI concerns anymore. I don't see it happening any time soon.

> IIRC, the latest standard that all compilers implement fully is C++ 14.

Not even then: garbage collection from C++11 was never implemented by any compiler: https://en.cppreference.com/w/cpp/compiler_support/11

Not a big problem though, as it's never used by anyone. I've heard of people using non-standard garbage collection extensions instead, long before C++11. That's another thing with the standard and compilers: not everyone needs every feature, so some are given a priority depending on the compiler's users. And there are lots of existing solutions which probably won't be migrated to the standard.

pjmlp|3 years ago

> I've heard of people using non-standard garbage collection extensions instead, long before C++11.

Anyone using Unreal C++, Managed C++ (.NET 1.0) or C++/CLI (.NET 2.0 onwards).