Show HN: Restate – Low-latency durable workflows for JavaScript/Java, in Rust
185 points | sewen | 1 year ago | restate.dev
https://github.com/restatedev/ https://restate.dev/
It is free and open: the SDKs are MIT-licensed, the runtime is under a permissive BSL (basically just the minimal Amazon defense). We worked on this for a bit over a year. A few points I think are worth mentioning:
- Restate's runtime is a single binary, self-contained, no dependencies aside from a durable disk. It contains basically a lightweight integrated version of a durable log, workflow state machine, state storage, etc. That makes it very compact and easy to run both on a laptop and a server.
- Restate implements durable execution not only for workflows; the core building block is durable RPC handlers (or event handlers). It adds a few concepts on top of durable execution, like virtual objects (turning RPC handlers into virtual actors), durable communication, and durable promises. Here are more details: https://restate.dev/programming-model
- Core design goal for APIs was to keep a familiar style. An app developer should look at Restate examples and say "hey, that looks quite familiar". You can let us know if that worked out.
- Basically every operation (handler invocation, step, ...) goes through a consensus layer, for a high degree of resilience and consistency.
- The lightweight log-centric architecture still gives Restate good latencies: for example, around 50ms roundtrip (invoke to result) for a 3-step durable workflow handler (Restate on EBS with fsync for every step).
We'd love to hear what you think of it!
[+] [-] BenoitP|1 year ago|reply
Question for OP: I'd bet Flink's StateFun comes into Restate's story. Could you please comment on this? Maybe StateFun was sort of a plugin, and you guys wanted to rebase it into the core of a distributed-functions system?
[+] [-] sewen|1 year ago|reply
Yes, Flink Stateful Functions were a first experiment to build a system for the use cases we have here. Specifically in Virtual Objects you can see that legacy.
With Stateful Functions, we quickly realized that we needed something built for transactions, while Flink is built for analytics. That manifests in many ways, maybe most obviously in the latency: Transactional durability takes seconds in Flink (checkpoint interval) and milliseconds in Restate.
Also, we could give Restate a very different dev ex, more compatible with modern app development. Flink comes from the data engineering side, with a very different set of integrations, tools, etc.
[+] [-] pavel_pt|1 year ago|reply
> Stateful Functions (in Apache Flink): Our thoughts started a while back, and our early experiments created StateFun. These thoughts and ideas then grew to be much much more now, resulting in Restate. Of course, you can still recognize some of the StateFun roots in Restate.
The full post is at: https://restate.dev/blog/why-we-built-restate/
[+] [-] sewen|1 year ago|reply
- Blog post with an overview of Restate 1.0: https://restate.dev/blog/announcing-restate-1.0-restate-clou...
- Restate docs: https://docs.restate.dev/
- Discord, for anyone who wants to chat interactively: https://discord.com/invite/skW3AZ6uGd
[+] [-] yaj54|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
https://restate.dev/blog/solving-durable-executions-immutabi...
https://restate.dev/blog/code-that-sleeps-for-a-month/
The key takeaways:
1. Immutable code platforms (like Lambda) make things much more tractable - old code staying executable for "as long as your handlers run" is the property you need. This can also be achieved in Kubernetes with some clever controllers.
2. The ability to make delayed RPCs and span time that way lets you keep your handlers very short-running while still taking action over very long periods. This is far superior to just sleeping over and over in a loop - instead, you do delayed tail calls.
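The delayed-tail-call pattern can be sketched in a few lines. This is illustrative only: `sendDelayed`, `remind`, and the toy scheduler are hypothetical names, not the Restate SDK API. The point is that each invocation finishes in milliseconds, and the "span of time" lives in the scheduler's delayed-call queue rather than in a blocked, sleeping handler.

```typescript
// Toy delayed-RPC queue standing in for a durable scheduler (hypothetical, not the SDK).
type DelayedCall = { args: { remaining: number }; delayMs: number };
const queue: DelayedCall[] = [];
const sent: string[] = [];

// Record a delayed self-invocation instead of blocking the handler.
function sendDelayed(args: { remaining: number }, delayMs: number): void {
  queue.push({ args, delayMs });
}

// Short-running handler: do one unit of work, then tail-call itself a day later.
function remind(args: { remaining: number }): void {
  sent.push(`reminder #${args.remaining}`);
  if (args.remaining > 1) {
    sendDelayed({ remaining: args.remaining - 1 }, 24 * 60 * 60 * 1000);
  }
}

// Drive the sketch: deliver delayed calls immediately (a real scheduler would wait).
remind({ remaining: 3 });
while (queue.length > 0) {
  const call = queue.shift()!;
  remind(call.args);
}
```

Three reminders are sent across what would be three days of wall-clock time, yet no handler invocation ever runs longer than it takes to push one string.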
[+] [-] rockostrich|1 year ago|reply
If we deploy a new version of the workflow, we just keep around the existing deployed version until all of its in-flight runs are completed. Usually this can be done within a few minutes but sometimes we need to wait days.
We don't actually tie service releases 1:1 with the workflow versions just in case we need a hotfix for a given workflow version, but the general pattern has worked very well for our use cases.
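The pinning scheme described above can be sketched as a small bookkeeping structure. All names here are hypothetical (this is not any particular platform's API): new runs pin to the latest deployment, and an old deployment is retired only once its last in-flight run completes.

```typescript
// version -> deployment still serving; runId -> version the run is pinned to.
const deployments = new Map<string, boolean>();
const runVersion = new Map<string, string>();
let latest = "v1";
deployments.set("v1", true);

// New runs always start on the latest deployed version and stay pinned to it.
function startRun(runId: string): string {
  runVersion.set(runId, latest);
  return latest;
}

// Deploying adds a version but does not touch older ones still in use.
function deploy(version: string): void {
  deployments.set(version, true);
  latest = version;
}

// Retire a deployment only when no in-flight run is pinned to it.
function completeRun(runId: string): void {
  const v = runVersion.get(runId)!;
  runVersion.delete(runId);
  const stillInFlight = [...runVersion.values()].includes(v);
  if (!stillInFlight && v !== latest) deployments.delete(v);
}

startRun("r1");                    // pinned to v1
deploy("v2");
startRun("r2");                    // pinned to v2
const r1v = runVersion.get("r1")!; // r1 still runs against v1 code
completeRun("r1");                 // last v1 run done: v1 can be retired
const v1Retired = !deployments.has("v1");
```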
[+] [-] delusional|1 year ago|reply
[+] [-] senorrib|1 year ago|reply
[+] [-] stsffap|1 year ago|reply
[+] [-] hintymad|1 year ago|reply
[+] [-] tempaccount420|1 year ago|reply
[+] [-] threeseed|1 year ago|reply
It's a terrible language for concurrency and transitive dependencies can cause panics which you often can't recover from.
Which means the entire ecosystem is like sitting on old dynamite waiting to explode.
JVM really has proven itself to be by far the best choice for high-concurrency, back-end applications.
[+] [-] swyx|1 year ago|reply
[+] [-] bilalq|1 year ago|reply
1. Max execution duration of a workflow
2. Max input/output payload size in bytes for a service invocation
3. Max timeout for a service invocation
4. Max number of allowed state transitions in a workflow
5. Max Journal history retention time
[+] [-] stsffap|1 year ago|reply
2. Restate currently does not impose a strict size limit on input/output messages by default (though it has the option to limit them, to protect the system). Nevertheless, it is recommended not to go overboard with input/output sizes, because Restate needs to send the input message to the service endpoint in order to invoke it. The larger the inputs/outputs, the longer it takes to invoke a service handler and send the result back to the user (increasing latency). Right now we issue a soft warning whenever a message becomes larger than 10 MB.
3. If the user does not specify a timeout for their call to Restate, then the system won't time it out. Of course, for long-running invocations it can happen that the external client fails or its connection gets interrupted. In this case, Restate allows re-attaching to an ongoing invocation, or retrieving its result if it completed in the meantime.
4. There is no limit on the max number of state transitions of a workflow in Restate.
5. Restate keeps the journal history around for as long as the invocation/workflow is ongoing. Once the workflow completes, we will drop the journal but keep the completed result for 24 hours.
[+] [-] sewen|1 year ago|reply
You can store a lot of data in Restate (workflow events, steps). Logged events move quickly to an embedded RocksDB, which is very scalable per node. The architecture is partitioned, and while we have not finished all the multi-node features yet, everything internally is built in a partitioned, scalable manner.
So it is less a question of what the system can do, maybe more what you want:
- If you keep tens of thousands of journal entries, replays might take a bit of time. (Side note: you also don't need that; Restate's support for explicit state gives you an intuitive alternative to the "forever running infinite journal" workflow pattern some other systems promote.)
- Execution duration for a workflow is not limited by default. It is more a question of how long you want to keep instances of older versions of the business logic around.
- History retention (we do this only for tasks of the "workflow" type right now) is as much as you are willing to invest in storage. RocksDB is decent at letting old data flow down the LSM tree and not get in the way.
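The "explicit state" alternative to an ever-growing journal can be sketched as keyed state, virtual-object style. This is a conceptual sketch with hypothetical names, not the Restate SDK: instead of one long-lived workflow replaying thousands of journal entries, each call reads and writes small named state values for its key, so there is almost nothing to replay.

```typescript
// key -> named state entries (in Restate, such writes would be journaled/durable).
const stateStore = new Map<string, Map<string, unknown>>();

function stateFor(key: string): Map<string, unknown> {
  let s = stateStore.get(key);
  if (!s) { s = new Map(); stateStore.set(key, s); }
  return s;
}

// Virtual-object-style handler: each call updates explicit state for its key,
// rather than appending another entry to an infinite journal.
function increment(key: string): number {
  const state = stateFor(key);
  const next = ((state.get("count") as number) ?? 0) + 1;
  state.set("count", next);
  return next;
}

increment("user-a");
increment("user-a");
const countA = increment("user-a"); // state for "user-a" is one small number
const countB = increment("user-b"); // keys are isolated from each other
```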
Coming up with the best possible defaults would be something we'd appreciate some feedback on, so would love to chat more on Discord: https://discord.gg/skW3AZ6uGd
The only one where I think we need (and have) a hard limit is the message size, because this can adversely affect system stability, if you have many handlers with very large messages active. This would eventually need a feature like out-of-band transport for large messages (e.g., through S3).
[+] [-] bilalq|1 year ago|reply
One big hangup for me is that there's only a single-node orchestrator as a CDK construct. Having an HA setup would be a must for business-critical flows.
I stumbled on Restate a few months ago and left the following message on their discord.
> I was considering writing a framework that would let you author AWS Step Functions workflows as code in a typesafe way when I stumbled on Restate. This looks really interesting and the blog posts show that the team really understands the problem space.
> My own background in this domain was as an early user of AWS SWF internally at AWS many, many years ago. We were incredibly frustrated by the AWS Flow framework built on top of SWF, so I ended up creating a meta Java framework that let you express workflows as code with true type-safety, arrow function based step delegations, and leveraging Either/Maybe/Promise and other monads for expressiveness. The DX was leaps and bounds better than anything else out at the time. This was back around 2015, I think.
> Fast-forward to today, I'm now running a startup that uses AWS Step Functions. It has some benefits, the most notable being that it's fully serverless. However, the lack of type-safety is incredibly frustrating. An innocent looking change can easily result in States.Runtime errors that cannot be caught and ignore all your catch-error logic. Then, of course, is how ridiculous it feels to write logic in JSON or a JSON-builder using CDK. As if that wasn't bad enough, the pricing is also quite steep. $25 for every million state transitions feels like a lot when you need to create so many extra state transitions for common patterns like sagas, choice branches, etc.
> I'm looking forward to seeing how Restate matures!
[+] [-] p10jkle|1 year ago|reply
[+] [-] aleksiy123|1 year ago|reply
Also something about this area always makes me excited. I guess it must be the thought of having all these tasks just working in the background without having to explicitly manage them.
One question I have is does anyone have experience for building data pipelines in this type of architecture?
Does it make sense to fan out on lots of small tasks? Or is it better to batch things into bigger tasks to reduce the overhead.
[+] [-] stsffap|1 year ago|reply
Regarding whether to parallelize or to batch, I think this strongly depends on what the actual operation involves. If it involves some CPU-intensive work like model inference, for example, then running more parallel tasks will probably speed things up.
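One way to reason about the fan-out-vs-batch trade-off is a back-of-the-envelope cost model: with a fixed per-task overhead, many tiny tasks pay the overhead many times, while large batches pay it rarely but cap the parallel speedup at the number of batches. All numbers below are illustrative assumptions, not measured Restate figures.

```typescript
// Split items into batches of a given size.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Rough wall-clock estimate: batches run in rounds across `workers` parallel workers,
// each batch costing a fixed overhead plus per-item work.
function estimateMs(itemCount: number, batchSize: number, workers: number,
                    perTaskOverheadMs: number, perItemWorkMs: number): number {
  const batches = Math.ceil(itemCount / batchSize);
  const perBatchMs = perTaskOverheadMs + batchSize * perItemWorkMs;
  const rounds = Math.ceil(batches / Math.min(workers, batches));
  return rounds * perBatchMs;
}

// 10k items, 50ms overhead per task, 1ms of work per item, 100 workers:
const tiny = estimateMs(10_000, 1, 100, 50, 1);      // one item per task
const batched = estimateMs(10_000, 100, 100, 50, 1); // 100 items per task
```

Under these assumptions batching wins by a wide margin; if per-item work were CPU-heavy (e.g. model inference), the overhead term shrinks relative to the work term and finer-grained fan-out starts to pay off, which matches the point above.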
[+] [-] gvdongen|1 year ago|reply
[+] [-] netvarun|1 year ago|reply
[+] [-] sewen|1 year ago|reply
(1) Restate has latencies that to the best of my knowledge are not achievable with Temporal. Restate's latencies are low because of (a) its event-log architecture and (b) the fact that Restate doesn't need to spawn tasks for activities, but calls RPC handlers.
(2) Restate works really well with FaaS. FaaS needs essentially a "push event" model, which is exactly what Restate does (push event, call handler). IIRC, Temporal has a worker model that pulls tasks, and a pull model is not great for FaaS. Restate + AWS Lambda is actually an amazing task queue that you can submit to super fast and that scales out its workers virtually infinitely automatically (Lambda).
(3) Restate is a self-contained single binary that you download and start, and you're done. I think that is a vastly different experience from most systems out there, not just Temporal. Why do app developers love Redis so much, despite its debatable durability? I think it's that insanely lightweight manner they love, and this is what we want to replicate (with proper durability, though).
(4) Maybe most importantly, Restate does much more than workflows. You can use it for just workflows, but you can also implement services that communicate durably (exactly-once RPC), maintain state in an actor-style manner (via virtual objects), or ingest events from Kafka.
This is maybe not the first thing you build, but it shows you how far you can take this if you want: It is a full app with many services, workflows, digital twins, some connect to Kafka. https://github.com/restatedev/examples/tree/main/end-to-end-...
All execution and communication is async, durable, reliable. I think that kind of app would be very hard to build with Temporal, and if you built it, you'd probably be relying on some really weird quirks around signals (for example when building the state maintenance of the digital twin) that no other app developer would find intuitive.
[+] [-] unknown|1 year ago|reply
[deleted]
[+] [-] mikelnrd|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
Once the http2 stuff is removed, there's nothing particularly odd that our library does that shouldn't work on all platforms, but I'm sure there will be some papercuts until we are actively testing against these targets.
[+] [-] tonyhb|1 year ago|reply
The restate API is extremely similar to ours, and because of the similarities both Restate and Inngest should work on Bun, Deno, or any runtime/cloud. We most definitely do, and have users in production on all TS runtimes in every cloud (GCP, Azure, AWS, Vercel, Netlify, Fly, Render, Railway, Cloudflare, etc).
[+] [-] hamandcheese|1 year ago|reply
[+] [-] pavel_pt|1 year ago|reply
You can absolutely do something similar with a RDBMS.
I tend to think of building services in state machines: every important step is tracked somewhere safe, and causes a state transition through the state machine. If doing this by hand, you would reach out to a DBMS and explicitly checkpoint your state whenever something important happens.
To achieve idempotency, you'd end up peppering your code with prepare-commit type steps where you first read the stored state and decide, at each logical step, whether you're resuming a prior partial execution or starting fresh. This gets old very quickly and so most code ends up relying on maybe a single idempotency check at the start, and caller retries. You would also need an external task queue or a sweeper of some sort to pick up and redrive partially-completed executions.
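The hand-rolled approach described above looks roughly like this. The sketch uses an in-memory Map where a real system would use transactional DBMS writes, and all names (`runStep`, `processOrder`, etc.) are made up for illustration: each step consults the stored checkpoint first, so a redriven execution resumes rather than redoing completed work.

```typescript
// Checkpoint "table": executionId -> last completed step and its results.
const checkpoints = new Map<string, { step: number; results: unknown[] }>();

let chargeCalls = 0;
function chargeCard(orderId: string): string { chargeCalls++; return `charge-${orderId}`; }

let shipCalls = 0;
function ship(orderId: string): string { shipCalls++; return `shipment-${orderId}`; }

// The prepare-commit dance: check stored state, decide whether we are resuming
// a prior partial execution or doing the work fresh, then checkpoint.
function runStep<T>(executionId: string, stepNo: number, fn: () => T): T {
  const cp = checkpoints.get(executionId) ?? { step: 0, results: [] };
  if (cp.step >= stepNo) {
    return cp.results[stepNo - 1] as T; // already done on a prior attempt: resume
  }
  const result = fn();
  cp.results.push(result);
  cp.step = stepNo;
  checkpoints.set(executionId, cp); // in reality: a transactional DB write
  return result;
}

function processOrder(executionId: string, orderId: string): string {
  runStep(executionId, 1, () => chargeCard(orderId));
  return runStep(executionId, 2, () => ship(orderId));
}

// First attempt completes step 1, then "crashes" before step 2.
runStep("exec-1", 1, () => chargeCard("o-42"));

// A sweeper redrives the execution: the charge is NOT repeated, only shipping runs.
const outcome = processOrder("exec-1", "o-42");
```

Even this toy version shows why it gets old quickly: every step must be wrapped by hand, and you still need the external sweeper to redrive `exec-1` at all.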
The beauty of a complete purpose-built system like Restate is that it gives you a durable journal service that's designed for the task of tracking executions, and also provides you with an SDK that makes it very easy to achieve the "chain of idempotent blocks" effect without hand-rolling a giant state machine yourself.
You don't have to use Restate to persist data, though you can - and you get the benefit of having state changes automatically committed with the same isolation properties as part of the journaling process. But you could just as easily orchestrate writes into external stores such as RDBMSs, K-V stores, or queues with the same guaranteed-progress semantics as the rest of your Restate service. Its execution semantics make this easier and more pleasant, as you get retries out of the box.
Finally, it's worth mentioning that we expose a PostgreSQL protocol-compatible SQL query endpoint. This allows you to query any state you do choose to store in Restate alongside service metadata, i.e. reflect on active invocations.
[+] [-] sewen|1 year ago|reply
(1) It is really helpful in getting good latencies.
(2) It makes the system self-contained, so it is easy to start and run anywhere.
(3) There is a simplicity in the deeply integrated architecture, where consensus of the log, fencing of the state machine leaders, etc. go hand in hand. It removes the need to coordinate between different components with different paradigms (pub-sub logs, SQL databases, etc.) that each have their own consistency/transactions. And coordination avoidance is probably the best one can do in distributed systems. This ultimately also leads to easier-to-understand behavior when running/operating the system.
(4) The storage is actually pluggable, because the internal architecture uses virtual consensus. So if the biggest ask from users would be "let me use Kafka or SQS FIFO" then that's doable.
We'd love to go about this the following way: we aim to provide an experience that users would end up preferring over maintaining multiple clusters of storage systems (like Cassandra + ElasticSearch + X servers and Y queues), through this integrated design. If that turns out to not be what anyone wants, we can still relatively easily work with other systems.
[+] [-] AhmedSoliman|1 year ago|reply
[+] [-] azmy|1 year ago|reply
[+] [-] magnio|1 year ago|reply
[+] [-] sewen|1 year ago|reply
One difference is that Airflow seems geared towards heavier operations, as in data pipelines. In contrast, Restate does not spawn any tasks by default; it acts more as a proxy/broker for RPC or event handlers and adds durable retries, journaling, the ability to make durable RPCs, etc.
That makes it quite lightweight: if the handler in a running container is fast, the whole thing results in super fast turnaround times (milliseconds).
You can also deploy the handlers on FaaS and basically get the equivalent of spawning a (serverless) task per step.
The other difference is the way the logic is defined: it can maintain state and make exactly-once calls to other handlers.
[+] [-] akbirkhan|1 year ago|reply
Question though: when will you guys have Python support? I'm an ML researcher, and as you can tell, most of my work is now pipelines between different services, e.g. chaining multiple LLM services. A big bottleneck is when one service returns an error and crashes the full chain.
Big fan of this work nevertheless. Just think you have alpha on the table
[+] [-] pavel_pt|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
[+] [-] jamifsud|1 year ago|reply
[+] [-] stsffap|1 year ago|reply
[+] [-] rubyfan|1 year ago|reply
[+] [-] p10jkle|1 year ago|reply
The way we do that is by writing down what your code is doing, while it's doing it, to a store. Then, on any failure, we re-execute your code and fill in any previously stored results, so that it can 'zoom' back to the point where it failed and continue. It's like a much more efficient and intelligent retry, where the code doesn't have to be idempotent.
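That record-and-replay mechanism can be boiled down to a few lines. This is a conceptual sketch of journal-based replay in general, not Restate's actual internals: completed steps return their journaled result on re-execution, so only the step that hadn't finished actually runs again.

```typescript
// The journal: one entry per completed step, persisted durably in a real system.
type Journal = unknown[];

class Execution {
  private cursor = 0;
  constructor(private journal: Journal) {}

  // Run `fn` unless this step already has a journaled result from a prior attempt.
  step<T>(fn: () => T): T {
    if (this.cursor < this.journal.length) {
      return this.journal[this.cursor++] as T; // replay: skip re-execution
    }
    const result = fn();
    this.journal.push(result); // record before moving on
    this.cursor++;
    return result;
  }
}

// A three-step handler; `sideEffects` counts how often real work actually runs.
let sideEffects = 0;
function handler(exec: Execution): number {
  const a = exec.step(() => { sideEffects++; return 1; });
  const b = exec.step(() => { sideEffects++; return a + 1; });
  return exec.step(() => { sideEffects++; return a + b; });
}

// First attempt "crashes" after two steps; the journal survives the crash.
const journal: Journal = [];
const crashed = new Execution(journal);
crashed.step(() => { sideEffects++; return 1; });
crashed.step(() => { sideEffects++; return 2; });

// Retry: steps 1 and 2 are filled in from the journal; only step 3 executes.
const result = handler(new Execution(journal));
```

The code "zooms" past the first two steps without re-running their side effects, which is why the individual steps don't need to be idempotent.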
[+] [-] unknown|1 year ago|reply
[deleted]
[+] [-] mnahkies|1 year ago|reply
I'm particularly interested in the scaling characteristics, and how your approach to durable storage (seems no external database is required?) differs
[+] [-] stsffap|1 year ago|reply
And yes, Restate does not have any external dependencies. It comes as a single self-contained binary that you can easily deploy and operate wherever you are used to running your code.
[+] [-] sharkdoodoo|1 year ago|reply
[+] [-] whoiskatrin|1 year ago|reply