robobully | 1 year ago
My question is, how does Flow-IPC compare to projects like Mojo IPC (from Chromium) and Eclipse iceoryx? At first glance they all pursue similar goals and pay much less attention to complex allocation management, yet managing to perform well enough.
ygoldfeld | 1 year ago
In tactical terms, back when this all started, of course we looked around for something to use; after all, why write a whole thing if we could use something existing? We didn't write a serializer, for example, since a kick-butt one (capnp; FlatBuffers also seems fine) already existed. Back then, though, nothing really jumped out. So looking back, it may have simply been a race: a few people/groups out there saw this niche and started developing things. I see iceoryx in particular shares one identical plank, which is workable/general end-to-end zero-copy via SHM; and since it was released a couple of years earlier, it has a super-nice presentation I hugely appreciate: many well-documented examples in particular. Whereas for us, providing that will take some more effort. (That said, we did not skimp on documentation: everything is documented meticulously, and there is a hopefully-reader-friendly Manual as well.)
When it came down to the core abilities we needed, it was like this: 1. We wanted to be able to share arbitrary combinations of native C++ structures, and not just PODs (plain-old-data types). Meaning, things with pointers needed to work, and things with STL-compliant containers needed to work. Boost.interprocess initially looked like it got that job done... but not well enough for our use-case at least. In practice, with Boost.ipc:
- Allocation from a SHM-segment had to be done using a built-in Boost-written heap-allocation algorithm (they provided two of them, and you can plug in your own... as long as all the control structures lived inside SHM).
- The shared data structure had to live entirely within one SHM-segment (mmap()ed area).
But we needed some heavy-duty allocation, and the Boost-supplied algorithms are not that. Plugging in a commercial-grade one - like jemalloc - was an option, but that was itself quite a project, especially since the control structures have to live in SHM for it to work. jemalloc is the most advanced thing available, but it keeps its control structures as globals, so plopping those into SHM meant changing jemalloc (a lot... Eddy actually did pursue this during the design phase). Plus, having both sides of the conversation reading and writing in one shared SHM-segment was not great due to safety concerns.
And, whatever allocation would be used - with Boost.interprocess's straightforward assumptions - had to be constrained to one mmap()ed area (SHM-segment). jemalloc (for example; substitute tcmalloc or any other heap-provider as desired) would want to mmap() new segments at will. Boost.ipc doesn't work in that advanced way.
2. We wanted to send capnp-encoded messages (and, more generally, just "stuff": linear buffers) with end-to-end zero-copy, meaning capnp-segments would need to be allocated in SHM. I spoke with Kenton Varda (Cap'n Proto overlord) very recently; he too felt this straightforward desire of not piping-over copies of capnp-encoded items. Various Akamai teams implemented and reimplemented this by hand, for specific use cases, but as I said earlier, it wasn't reusable in a general way (not for our specific use-case for that original big C++ service that I was tasked with splitting up).
Other niceties were desirable too - not worrying about IPC-resource names/conflicts/..., ensuring SHM cleanup straightforwardly on exit or crash - but they were more tangential (albeit extremely useful) things that came about once we decided to handle the core (1) and (2) in reusable fashion.
At that point, nothing seemed to be around that would just give us those fairly intuitive things. I am not saying these are necessary for every IPC use-case... but they never hurt at the very least, and having them readily available gives one a feeling of power and freedom.
Now, as to the actual question: How does it compare to those? I am not going to lie (because lying is bad): It'll take me a few days to understand the ins and outs of Mojo IPC and iceoryx, so any impression I give here is going to be preliminary and surface-level. To that point, I expect the correct/true answer to your question will be a matter of diving into each API and simply seeing which one seems best to the particular potential user. For Flow-IPC, this Manual page here should be a pretty decent overview of what's available with code snippets: https://flow-ipc.github.io/doc/flow-ipc/versions/main/genera...
That said, my preliminary impression is:
(cont.)
ygoldfeld | 1 year ago
Versus iceoryx:
TL;DR: So far, it looks super-sweet (as well as mature, already supporting macOS for example). However, it's more of an investment to use than Flow-IPC, with a central daemon and a special event-loop model. It also doesn't want to do #1 as described above (no pointers, no using existing STL-compliant container types).
This guy seems really cool, and it directly addresses at least the major part of need #2 above. You can transmit buffers with near-zero latency, and it'll do the SHM stuff for you. (For capnp specifically one would then implement the required SHM-allocating capnp::MessageBuilder, and off we go. Flow-IPC does give you this part out-of-the-box, granted.) Looking over the examples and overview, it seems like integrating it into an event loop might involve some pretty serious learning of iceoryx's event-loop model + subscribe/publish. There is also a central daemon that needs to run.
Flow-IPC, to me, seems to take a lower-learning-curve, lower-maintenance approach to this. There's no central daemon or any equivalent of it. For each asynchronous thing (a transport::Channel, for example, which has receive-oriented methods), you can use one of two supplied APIs. The sync_io-style API will let you plug into anything select()/poll()/epoll()-oriented (and has a syntactic-sugar hook for boost.asio loops). If you've got an event loop, it'll be easy to plug Flow-IPC ops right into it, with no background threads added thereby. Or, use the async-I/O-style API; then it'll create background threads as needed and call your callback (e.g., on message receipt) from there, leaving it to you to handle it there or by posting the "true" handling onto one of your own threads.
Point being, my impression so far is, using Flow-IPC in this sense is a lower-effort enterprise. It's pretty much just there to plug-in. (I really hope that isn't slander. That's my take so far - as I said, it'll take me a few days to understand these products in-depth.)
Now, in terms of need #1. (I acknowledge, this need is not for every C++ IPC use-case ever. 2 processes collaborating on one native C++ data structure full of SHM-compliant containers and/or pointers =/= done every day. Still, though, if 2 threads in one process can do it easily, why shouldn't they as-easily be able to do it across a process boundary? Right?) If I understand iceoryx's example on this topic (https://iceoryx.io/latest/examples/complexdata/)... I quote: "To implement zero-copy data transfer we use a shared memory approach. This requires that every data structure needs to be entirely contained in the shared memory and must not internally use pointers or references. ... Therefore, most of the STL types cannot be used, but we reimplemented some constructs. This example shows how to send/receive a iox::cxx::vector and how to send/receive a complex data structure containing some of our STL container surrogates."
With Flow-IPC, this does not apply. You can share existing STL-compliant containers, and (if you want) can have raw pointers too. We have tests nesting boost::container string/vector/map guys and our own flow::util::Basic_blob STL-compliant guy and sharing them, no problem. We've provided the necessary allocator and fancy-pointer types. Moreover, with a single line you can do this in jemalloc-allocated SHM; or instead choose a Boost.ipc-backed single-segment SHM. (Depends on what you desire for safety versus simplicity, internally. I am being a bit vague on that here, but it's in the docs, I promise.) I believe this is a pretty good illustration of Flow-IPC's "thing":
- Meat-and-potatoes: do what you want to do in your daily C++, without a major learning curve...
- ...but without sacrificing essential power...
- ...and extensibly, meaning you can modify its behavior in core ways without requiring a massive amount of learning of how Flow-IPC is built.
Versus Mojo IPC:
I really need to understand it better, before I can really comment. So far, it seems like its equivalent of Flow-IPC's sessions = super cool, building up a network of processes that can all talk to each other once in the network. Flow-IPC's sessions are basic: you want process A and B to speak, you establish a session (during this step, one is designated as the session-server and can therefore accept more sessions from that app or other apps)... then from there, you can make channels (and access SHM arenas, if you are using SHM directly as opposed to letting the zero-copy channels do it invisibly). It also has various-language bindings; Flow-IPC is C++... straight up.
That established: it looks like it provides super-fast low-level IPC transports (similar to Flow-IPC's unstructured-layer channels) in platform-agnostic fashion, but it does not seem to specifically facilitate end-to-end zero-copy transmission of data structures via SHM. I could be completely wrong here, but it actually looks like one could feasibly plug Mojo IPC pipes in as Flow-IPC Blob_sender/receiver (and/or Native_handle_sender/receiver) concept impls, and get the end-to-end zero-copy goodness.
At least superficially, so far, Flow-IPC again looks like perhaps a more down-to-earth/readily-pluggable effort. (But, still documented out-the-wazoo!)