Show HN: Restate – Low-latency durable workflows for JavaScript/Java, in Rust
185 points | sewen | 1 year ago | restate.dev
https://github.com/restatedev/ https://restate.dev/
It is free and open: the SDKs are MIT-licensed, the runtime is under a permissive BSL (basically just the minimal Amazon defense). We worked on this for a bit over a year. A few points I think are worth mentioning:
- Restate's runtime is a single binary, self-contained, no dependencies aside from a durable disk. It contains basically a lightweight integrated version of a durable log, workflow state machine, state storage, etc. That makes it very compact and easy to run both on a laptop and a server.
- Restate implements durable execution not only for workflows; the core building block is durable RPC handlers (or event handlers). It adds a few concepts on top of durable execution, like virtual objects (turning RPC handlers into virtual actors), durable communication, and durable promises. Here are more details: https://restate.dev/programming-model
- Core design goal for APIs was to keep a familiar style. An app developer should look at Restate examples and say "hey, that looks quite familiar". You can let us know if that worked out.
- Basically every operation (handler invocation, step, ...) goes through a consensus layer, for a high degree of resilience and consistency.
- The lightweight log-centric architecture still gives Restate good latencies: for example, around 50ms roundtrip (invoke to result) for a 3-step durable workflow handler (Restate on EBS with fsync for every step).
We'd love to hear what you think of it!
[+] [-] BenoitP|1 year ago|reply
Question for OP: I'd bet Flink's StateFun comes into Restate's story. Could you please comment on this? Maybe StateFun was sort of a plugin, and you guys wanted to rebase it into the core of a distributed-functions system?
[+] [-] sewen|1 year ago|reply
Yes, Flink Stateful Functions were a first experiment to build a system for the use cases we have here. Specifically in Virtual Objects you can see that legacy.
With Stateful Functions, we quickly realized that we needed something built for transactions, while Flink is built for analytics. That manifests in many ways, maybe most obviously in the latency: Transactional durability takes seconds in Flink (checkpoint interval) and milliseconds in Restate.
Also, we could give Restate a very different dev ex, more compatible with modern app development. Flink comes from the data engineering side, with a very different set of integrations, tools, etc.
[+] [-] pavel_pt|1 year ago|reply
> Stateful Functions (in Apache Flink): Our thoughts started a while back, and our early experiments created StateFun. These thoughts and ideas then grew to be much much more now, resulting in Restate. Of course, you can still recognize some of the StateFun roots in Restate.
The full post is at: https://restate.dev/blog/why-we-built-restate/
[+] [-] sewen|1 year ago|reply
- Blog post with an overview of Restate 1.0: https://restate.dev/blog/announcing-restate-1.0-restate-clou...
- Restate docs: https://docs.restate.dev/
- Discord, for anyone who wants to chat interactively: https://discord.com/invite/skW3AZ6uGd
[+] [-] yaj54|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
https://restate.dev/blog/solving-durable-executions-immutabi...
https://restate.dev/blog/code-that-sleeps-for-a-month/
The key takeaways:
1. Immutable code platforms (like Lambda) make things much more tractable - old code staying executable for "as long as your handlers run" is the property you need. This can also be achieved in Kubernetes with some clever controllers.
2. The ability to make delayed RPCs and span time that way lets you keep your handlers very short-running while still taking action over very long periods. This is far superior to just sleeping over and over in a loop - instead, you do delayed tail calls.
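The delayed-tail-call pattern can be sketched in a few lines. This is illustrative only: `sendDelayed`, `remind`, and the toy scheduler are hypothetical names, not the Restate SDK API. The point is that each invocation finishes in milliseconds, and the "span of time" lives in the scheduler's delayed-call queue rather than in a blocked, sleeping handler.

```typescript
// Toy delayed-RPC queue standing in for a durable scheduler (hypothetical, not the SDK).
type DelayedCall = { args: { remaining: number }; delayMs: number };
const queue: DelayedCall[] = [];
const sent: string[] = [];

// Record a delayed self-invocation instead of blocking the handler.
function sendDelayed(args: { remaining: number }, delayMs: number): void {
  queue.push({ args, delayMs });
}

// Short-running handler: do one unit of work, then tail-call itself a day later.
function remind(args: { remaining: number }): void {
  sent.push(`reminder #${args.remaining}`);
  if (args.remaining > 1) {
    sendDelayed({ remaining: args.remaining - 1 }, 24 * 60 * 60 * 1000);
  }
}

// Drive the sketch: deliver delayed calls immediately (a real scheduler would wait).
remind({ remaining: 3 });
while (queue.length > 0) {
  const call = queue.shift()!;
  remind(call.args);
}
```

Three reminders are sent across what would be three days of wall-clock time, yet no handler invocation ever runs longer than it takes to push one string.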
[+] [-] rockostrich|1 year ago|reply
If we deploy a new version of the workflow, we just keep around the existing deployed version until all of its in-flight runs are completed. Usually this can be done within a few minutes but sometimes we need to wait days.
We don't actually tie service releases 1:1 with the workflow versions just in case we need a hotfix for a given workflow version, but the general pattern has worked very well for our use cases.
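The pinning scheme described above can be sketched as a small bookkeeping structure. All names here are hypothetical (this is not any particular platform's API): new runs pin to the latest deployment, and an old deployment is retired only once its last in-flight run completes.

```typescript
// version -> deployment still serving; runId -> version the run is pinned to.
const deployments = new Map<string, boolean>();
const runVersion = new Map<string, string>();
let latest = "v1";
deployments.set("v1", true);

// New runs always start on the latest deployed version and stay pinned to it.
function startRun(runId: string): string {
  runVersion.set(runId, latest);
  return latest;
}

// Deploying adds a version but does not touch older ones still in use.
function deploy(version: string): void {
  deployments.set(version, true);
  latest = version;
}

// Retire a deployment only when no in-flight run is pinned to it.
function completeRun(runId: string): void {
  const v = runVersion.get(runId)!;
  runVersion.delete(runId);
  const stillInFlight = [...runVersion.values()].includes(v);
  if (!stillInFlight && v !== latest) deployments.delete(v);
}

startRun("r1");                    // pinned to v1
deploy("v2");
startRun("r2");                    // pinned to v2
const r1v = runVersion.get("r1")!; // r1 still runs against v1 code
completeRun("r1");                 // last v1 run done: v1 can be retired
const v1Retired = !deployments.has("v1");
```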
[+] [-] delusional|1 year ago|reply
[+] [-] senorrib|1 year ago|reply
[+] [-] stsffap|1 year ago|reply
[+] [-] hintymad|1 year ago|reply
[+] [-] tempaccount420|1 year ago|reply
[+] [-] threeseed|1 year ago|reply
It's a terrible language for concurrency and transitive dependencies can cause panics which you often can't recover from.
Which means the entire ecosystem is like sitting on old dynamite waiting to explode.
JVM really has proven itself to be by far the best choice for high-concurrency, back-end applications.
[+] [-] swyx|1 year ago|reply
[+] [-] bilalq|1 year ago|reply
1. Max execution duration of a workflow
2. Max input/output payload size in bytes for a service invocation
3. Max timeout for a service invocation
4. Max number of allowed state transitions in a workflow
5. Max Journal history retention time
[+] [-] stsffap|1 year ago|reply
2. Restate currently does not impose a strict size limit on input/output messages by default (though it has the option to limit them, to protect the system). Nevertheless, it is recommended not to go overboard with input/output sizes, because Restate needs to send the input message to the service endpoint in order to invoke it. The larger the inputs/outputs, the longer it takes to invoke a service handler and send the result back to the user (increasing latency). Right now we issue a soft warning whenever a message becomes larger than 10 MB.
3. If the user does not specify a timeout for their call to Restate, then the system won't time it out. Of course, for long-running invocations it can happen that the external client fails or its connection gets interrupted. In this case, Restate allows re-attaching to an ongoing invocation, or retrieving its result if it completed in the meantime.
4. There is no limit on the max number of state transitions of a workflow in Restate.
5. Restate keeps the journal history around for as long as the invocation/workflow is ongoing. Once the workflow completes, we will drop the journal but keep the completed result for 24 hours.
[+] [-] sewen|1 year ago|reply
You can store a lot of data in Restate (workflow events, steps). Logged events move quickly to an embedded RocksDB, which is very scalable per node. The architecture is partitioned, and while we have not finished all the multi-node features yet, everything internally is built in a partitioned, scalable manner.
So it is less a question of what the system can do, maybe more what you want:
- If you keep tens of thousands of journal entries, replays might take a bit of time. (Side note: you also don't need that; Restate's support for explicit state gives you an intuitive alternative to the "forever running infinite journal" workflow pattern some other systems promote.)
- Execution duration for a workflow is not limited by default. It is more a question of how long you want to keep instances of older versions of the business logic around.
- History retention (we do this only for tasks of the "workflow" type right now) is as much as you are willing to invest in storage. RocksDB is decent at letting old data flow down the LSM tree and not get in the way.
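The "explicit state" alternative to an ever-growing journal can be sketched as keyed state, virtual-object style. This is a conceptual sketch with hypothetical names, not the Restate SDK: instead of one long-lived workflow replaying thousands of journal entries, each call reads and writes small named state values for its key, so there is almost nothing to replay.

```typescript
// key -> named state entries (in Restate, such writes would be journaled/durable).
const stateStore = new Map<string, Map<string, unknown>>();

function stateFor(key: string): Map<string, unknown> {
  let s = stateStore.get(key);
  if (!s) { s = new Map(); stateStore.set(key, s); }
  return s;
}

// Virtual-object-style handler: each call updates explicit state for its key,
// rather than appending another entry to an infinite journal.
function increment(key: string): number {
  const state = stateFor(key);
  const next = ((state.get("count") as number) ?? 0) + 1;
  state.set("count", next);
  return next;
}

increment("user-a");
increment("user-a");
const countA = increment("user-a"); // state for "user-a" is one small number
const countB = increment("user-b"); // keys are isolated from each other
```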
Coming up with the best possible defaults would be something we'd appreciate some feedback on, so would love to chat more on Discord: https://discord.gg/skW3AZ6uGd
The only one where I think we need (and have) a hard limit is the message size, because this can adversely affect system stability, if you have many handlers with very large messages active. This would eventually need a feature like out-of-band transport for large messages (e.g., through S3).
[+] [-] bilalq|1 year ago|reply
One big hangup for me is that there's only a single-node orchestrator as a CDK construct. Having an HA setup would be a must for business-critical flows.
I stumbled on Restate a few months ago and left the following message on their discord.
> I was considering writing a framework that would let you author AWS Step Functions workflows as code in a typesafe way when I stumbled on Restate. This looks really interesting and the blog posts show that the team really understands the problem space.
> My own background in this domain was as an early user of AWS SWF internally at AWS many, many years ago. We were incredibly frustrated by the AWS Flow framework built on top of SWF, so I ended up creating a meta Java framework that let you express workflows as code with true type-safety, arrow function based step delegations, and leveraging Either/Maybe/Promise and other monads for expressiveness. The DX was leaps and bounds better than anything else out at the time. This was back around 2015, I think.
> Fast-forward to today, I'm now running a startup that uses AWS Step Functions. It has some benefits, the most notable being that it's fully serverless. However, the lack of type-safety is incredibly frustrating. An innocent looking change can easily result in States.Runtime errors that cannot be caught and ignore all your catch-error logic. Then, of course, is how ridiculous it feels to write logic in JSON or a JSON-builder using CDK. As if that wasn't bad enough, the pricing is also quite steep. $25 for every million state transitions feels like a lot when you need to create so many extra state transitions for common patterns like sagas, choice branches, etc.
> I'm looking forward to seeing how Restate matures!
[+] [-] p10jkle|1 year ago|reply
[+] [-] aleksiy123|1 year ago|reply
Also something about this area always makes me excited. I guess it must be the thought of having all these tasks just working in the background without having to explicitly manage them.
One question I have is does anyone have experience for building data pipelines in this type of architecture?
Does it make sense to fan out on lots of small tasks? Or is it better to batch things into bigger tasks to reduce the overhead.
[+] [-] stsffap|1 year ago|reply
Regarding whether to parallelize or to batch, I think this strongly depends on what the actual operation involves. If it involves some CPU-intensive work like model inference, for example, then running more parallel tasks will probably speed things up.
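One way to reason about the fan-out-vs-batch trade-off is a back-of-the-envelope cost model: with a fixed per-task overhead, many tiny tasks pay the overhead many times, while large batches pay it rarely but cap the parallel speedup at the number of batches. All numbers below are illustrative assumptions, not measured Restate figures.

```typescript
// Split items into batches of a given size.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Rough wall-clock estimate: batches run in rounds across `workers` parallel workers,
// each batch costing a fixed overhead plus per-item work.
function estimateMs(itemCount: number, batchSize: number, workers: number,
                    perTaskOverheadMs: number, perItemWorkMs: number): number {
  const batches = Math.ceil(itemCount / batchSize);
  const perBatchMs = perTaskOverheadMs + batchSize * perItemWorkMs;
  const rounds = Math.ceil(batches / Math.min(workers, batches));
  return rounds * perBatchMs;
}

// 10k items, 50ms overhead per task, 1ms of work per item, 100 workers:
const tiny = estimateMs(10_000, 1, 100, 50, 1);      // one item per task
const batched = estimateMs(10_000, 100, 100, 50, 1); // 100 items per task
```

Under these assumptions batching wins by a wide margin; if per-item work were CPU-heavy (e.g. model inference), the overhead term shrinks relative to the work term and finer-grained fan-out starts to pay off, which matches the point above.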
[+] [-] gvdongen|1 year ago|reply
[+] [-] netvarun|1 year ago|reply
[+] [-] sewen|1 year ago|reply
(1) Restate has latencies that to the best of my knowledge are not achievable with Temporal. Restate's latencies are low because of (a) its event-log architecture and (b) the fact that Restate doesn't need to spawn tasks for activities, but calls RPC handlers.
(2) Restate works really well with FaaS. FaaS needs essentially a "push event" model, which is exactly what Restate does (push event, call handler). IIRC, Temporal has a worker model that pulls tasks, and a pull model is not great for FaaS. Restate + AWS Lambda is actually an amazing task queue that you can submit to super fast and that scales out its workers virtually infinitely automatically (Lambda).
(3) Restate is a self-contained single binary that you download and start, and you're done. I think that is a vastly different experience from most systems out there, not just Temporal. Why do app developers love Redis so much, despite its debatable durability? I think it's that insanely lightweight manner they love, and this is what we want to replicate (with proper durability, though).
(4) Maybe most importantly, Restate does much more than workflows. You can use it for just workflows, but you can also implement services that communicate durably (exactly-once RPC), maintain state in an actor-style manner (via virtual objects), or ingest events from Kafka.
This is maybe not the first thing you build, but it shows you how far you can take this if you want: It is a full app with many services, workflows, digital twins, some connect to Kafka. https://github.com/restatedev/examples/tree/main/end-to-end-...
All execution and communication is async, durable, reliable. I think that kind of app would be very hard to build with Temporal, and if you built it, you'd probably be relying on some really weird quirks around signals (for example when building the state maintenance of the digital twin) that no other app developer would find intuitive.
[+] [-] unknown|1 year ago|reply
[deleted]
[+] [-] mikelnrd|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
Once the http2 stuff is removed, there's nothing particularly odd that our library does that shouldn't work on all platforms, but I'm sure there will be some papercuts until we are actively testing against these targets.
[+] [-] tonyhb|1 year ago|reply
The restate API is extremely similar to ours, and because of the similarities both Restate and Inngest should work on Bun, Deno, or any runtime/cloud. We most definitely do, and have users in production on all TS runtimes in every cloud (GCP, Azure, AWS, Vercel, Netlify, Fly, Render, Railway, Cloudflare, etc).
[+] [-] hamandcheese|1 year ago|reply
[+] [-] pavel_pt|1 year ago|reply
You can absolutely do something similar with a RDBMS.
I tend to think of building services in state machines: every important step is tracked somewhere safe, and causes a state transition through the state machine. If doing this by hand, you would reach out to a DBMS and explicitly checkpoint your state whenever something important happens.
To achieve idempotency, you'd end up peppering your code with prepare-commit type steps where you first read the stored state and decide, at each logical step, whether you're resuming a prior partial execution or starting fresh. This gets old very quickly and so most code ends up relying on maybe a single idempotency check at the start, and caller retries. You would also need an external task queue or a sweeper of some sort to pick up and redrive partially-completed executions.
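The hand-rolled approach described above looks roughly like this. The sketch uses an in-memory Map where a real system would use transactional DBMS writes, and all names (`runStep`, `processOrder`, etc.) are made up for illustration: each step consults the stored checkpoint first, so a redriven execution resumes rather than redoing completed work.

```typescript
// Checkpoint "table": executionId -> last completed step and its results.
const checkpoints = new Map<string, { step: number; results: unknown[] }>();

let chargeCalls = 0;
function chargeCard(orderId: string): string { chargeCalls++; return `charge-${orderId}`; }

let shipCalls = 0;
function ship(orderId: string): string { shipCalls++; return `shipment-${orderId}`; }

// The prepare-commit dance: check stored state, decide whether we are resuming
// a prior partial execution or doing the work fresh, then checkpoint.
function runStep<T>(executionId: string, stepNo: number, fn: () => T): T {
  const cp = checkpoints.get(executionId) ?? { step: 0, results: [] };
  if (cp.step >= stepNo) {
    return cp.results[stepNo - 1] as T; // already done on a prior attempt: resume
  }
  const result = fn();
  cp.results.push(result);
  cp.step = stepNo;
  checkpoints.set(executionId, cp); // in reality: a transactional DB write
  return result;
}

function processOrder(executionId: string, orderId: string): string {
  runStep(executionId, 1, () => chargeCard(orderId));
  return runStep(executionId, 2, () => ship(orderId));
}

// First attempt completes step 1, then "crashes" before step 2.
runStep("exec-1", 1, () => chargeCard("o-42"));

// A sweeper redrives the execution: the charge is NOT repeated, only shipping runs.
const outcome = processOrder("exec-1", "o-42");
```

Even this toy version shows why it gets old quickly: every step must be wrapped by hand, and you still need the external sweeper to redrive `exec-1` at all.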
The beauty of a complete purpose-built system like Restate is that it gives you a durable journal service that's designed for the task of tracking executions, and also provides you with an SDK that makes it very easy to achieve the "chain of idempotent blocks" effect without hand-rolling a giant state machine yourself.
You don't have to use Restate to persist data, though you can - and you get the benefit of having state changes automatically committed with the same isolation properties as part of the journaling process. But you could just as easily orchestrate writes into external stores such as RDBMSs, K-V stores, or queues with the same guaranteed-progress semantics as the rest of your Restate service. Its execution semantics make this easier and more pleasant, as you get retries out of the box.
Finally, it's worth mentioning that we expose a PostgreSQL protocol-compatible SQL query endpoint. This allows you to query any state you do choose to store in Restate alongside service metadata, i.e. reflect on active invocations.
[+] [-] sewen|1 year ago|reply
(1) It is really helpful in getting good latencies.
(2) It makes the system self-contained, so it is easy to start and run anywhere.
(3) There is a simplicity in the deeply integrated architecture, where consensus of the log, fencing of the state machine leaders, etc. go hand in hand. It removes the need to coordinate between different components with different paradigms (pub-sub logs, SQL databases, etc.) that each have their own consistency/transactions. And coordination avoidance is probably the best one can do in distributed systems. This ultimately also leads to easier-to-understand behavior when running/operating the system.
(4) The storage is actually pluggable, because the internal architecture uses virtual consensus. So if the biggest ask from users would be "let me use Kafka or SQS FIFO" then that's doable.
We'd love to go about this the following way: we aim to provide an experience that users would end up preferring over maintaining multiple clusters of storage systems (like Cassandra + ElasticSearch + X servers and Y queues), through this integrated design. If that turns out to not be what anyone wants, we can still relatively easily work with other systems.
[+] [-] AhmedSoliman|1 year ago|reply
[+] [-] azmy|1 year ago|reply
[+] [-] magnio|1 year ago|reply
[+] [-] sewen|1 year ago|reply
One difference is that Airflow seems geared towards heavier operations, as in data pipelines. In contrast, Restate does not spawn any tasks by default; it acts more as a proxy/broker for RPC or event handlers and adds durable retries, journaling, the ability to make durable RPCs, etc.
That makes it quite lightweight: if the handler in a running container is fast, the whole thing results in super fast turnaround times (milliseconds).
You can also deploy the handlers on FaaS and basically get the equivalent of spawning a (serverless) task per step.
The other difference is the way the logic is defined: it can maintain state and make exactly-once calls to other handlers.
[+] [-] akbirkhan|1 year ago|reply
Question though: when will you guys have Python support? I'm an ML researcher, and as you can tell, most of my work is now pipelines between different services, e.g. chaining multiple LLM services. A big bottleneck is when one service returns an error and crashes the full chain.
Big fan of this work nevertheless. Just think you have alpha on the table
[+] [-] pavel_pt|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
[+] [-] jamifsud|1 year ago|reply
[+] [-] stsffap|1 year ago|reply
[+] [-] rubyfan|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
The way we do that is by writing down what your code is doing, while it's doing it, to a store. Then, on any failure, we re-execute your code and fill in any previously stored results, so that it can 'zoom' back to the point where it failed and continue. It's like a much more efficient and intelligent retry, where the code doesn't have to be idempotent.
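That record-and-replay mechanism can be boiled down to a few lines. This is a conceptual sketch of journal-based replay in general, not Restate's actual internals: completed steps return their journaled result on re-execution, so only the step that hadn't finished actually runs again.

```typescript
// The journal: one entry per completed step, persisted durably in a real system.
type Journal = unknown[];

class Execution {
  private cursor = 0;
  constructor(private journal: Journal) {}

  // Run `fn` unless this step already has a journaled result from a prior attempt.
  step<T>(fn: () => T): T {
    if (this.cursor < this.journal.length) {
      return this.journal[this.cursor++] as T; // replay: skip re-execution
    }
    const result = fn();
    this.journal.push(result); // record before moving on
    this.cursor++;
    return result;
  }
}

// A three-step handler; `sideEffects` counts how often real work actually runs.
let sideEffects = 0;
function handler(exec: Execution): number {
  const a = exec.step(() => { sideEffects++; return 1; });
  const b = exec.step(() => { sideEffects++; return a + 1; });
  return exec.step(() => { sideEffects++; return a + b; });
}

// First attempt "crashes" after two steps; the journal survives the crash.
const journal: Journal = [];
const crashed = new Execution(journal);
crashed.step(() => { sideEffects++; return 1; });
crashed.step(() => { sideEffects++; return 2; });

// Retry: steps 1 and 2 are filled in from the journal; only step 3 executes.
const result = handler(new Execution(journal));
```

The code "zooms" past the first two steps without re-running their side effects, which is why the individual steps don't need to be idempotent.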
[+] [-] unknown|1 year ago|reply
[deleted]
[+] [-] mnahkies|1 year ago|reply
I'm particularly interested in the scaling characteristics, and how your approach to durable storage (seems no external database is required?) differs
[+] [-] stsffap|1 year ago|reply
And yes, Restate does not have any external dependencies. It comes as a single self-contained binary that you can easily deploy and operate wherever you are used to running your code.
[+] [-] sharkdoodoo|1 year ago|reply
[+] [-] whoiskatrin|1 year ago|reply