abrazensunset | 2 years ago

This has loose overlap with:

- Materialize
- Flink SQL
- Arroyo
- ReadySet
- RisingWave
- Timeplus
- Pathway
- Dozer
- Snowflake dynamic tables
- Native materialized views in OLTP databases
- Just having a stack of views in your db
- Poor man's MVs with triggers

These all differ subtly along every axis: consistency, UDF support, operator support, latency, scaling/state limits, source/sink integrations, and compatibility with existing protocols.

What seems unique is the focus on "writebacks to the source without Kafka/Connect in between", instead of having either a built-in cache, serving as a stream processor, or both. It looks like the built-in cache is still available through the FDW deployment pattern.

They note that relative to the source tables they are eventually consistent (of course, unless you want to delay transaction writes) but it's not clear what other consistency aspects they respect (such as preserving transactions end to end).

Overall this looks like it's designed to overcome materialized view limitations (which in popular OLTP dbs are pretty severe w.r.t. either what operations are supported, latency, or both) compared to other solutions that basically move the action downstream...curious if it will see much use, or if they'll inevitably introduce sinks and direct access to see if they can compete in the "live ODS" segment with Materialize and RisingWave.
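For anyone who hasn't built one, the "poor man's MVs with triggers" option from the list above looks roughly like this. It is a minimal sketch using SQLite through Python's `sqlite3` module so it runs anywhere; the `orders` / `order_totals` schema and the helper names are made up for illustration, and it only handles inserts and deletes (an UPDATE trigger is left out).

```python
import sqlite3

# Hypothetical schema: `orders` is the source table, `order_totals` is the
# hand-maintained "view" (one row per customer), kept in sync by triggers.
SCHEMA = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    amount INTEGER NOT NULL
);
CREATE TABLE order_totals (
    customer TEXT PRIMARY KEY,
    total INTEGER NOT NULL,
    n INTEGER NOT NULL
);
CREATE TRIGGER orders_after_insert AFTER INSERT ON orders BEGIN
    -- Make sure the group row exists, then apply the delta.
    INSERT OR IGNORE INTO order_totals (customer, total, n)
        VALUES (NEW.customer, 0, 0);
    UPDATE order_totals SET total = total + NEW.amount, n = n + 1
        WHERE customer = NEW.customer;
END;
CREATE TRIGGER orders_after_delete AFTER DELETE ON orders BEGIN
    -- Retract the delta and drop the group once it is empty.
    UPDATE order_totals SET total = total - OLD.amount, n = n - 1
        WHERE customer = OLD.customer;
    DELETE FROM order_totals WHERE customer = OLD.customer AND n = 0;
END;
"""


def open_db():
    """Create an in-memory database with the tables and triggers above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def view_rows(conn):
    """Read the trigger-maintained table."""
    return conn.execute(
        "SELECT customer, total, n FROM order_totals ORDER BY customer"
    ).fetchall()


def recomputed_rows(conn):
    """Recompute the same aggregate from scratch, for comparison."""
    return conn.execute(
        "SELECT customer, SUM(amount), COUNT(*) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
```

The catch is visible even at this size: every write to `orders` pays for the view update inside the same transaction, and each new operator (joins, distinct, window functions) needs its own hand-written delta logic.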

edit: to make my comment more clear: this is a new entrant in a crowded space with several sophisticated, established players and the main differentiation is the deployment pattern. I'd be curious to know if anything else sets them apart

necubi | 2 years ago

I'm with Arroyo [0] — thanks for the mention! I'd be interested to see someone from Epsio chime in with where exactly they're positioning, but you're right that this is (recently) a very crowded space.

I think you can somewhat arbitrarily draw a line between systems like Materialize/RisingWave that are focused on materialized view maintenance (often reading change feeds from your OLTP database) and stream processors like Flink/Arroyo that are focused on supporting more operational use cases and work with a larger variety of sources and sinks.

Epsio seems like it's working primarily in the former mold, with fast incremental computation of materialized views. Unlike Materialize/RisingWave it seems to be designed to run in front of your database, with all queries going through it.
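To make "incremental computation of materialized views" concrete, here is a toy sketch of maintaining a `SUM ... GROUP BY` result from a change feed instead of re-running the query. This is not how Epsio, Materialize, or RisingWave actually implement it; `Change`, `IncrementalSumView`, and `recompute` are hypothetical names, and an update is assumed to arrive as a delete followed by an insert.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One event from a (hypothetical) change feed."""
    op: str      # "insert" or "delete"
    key: str     # the GROUP BY key
    value: int   # the column being summed


class IncrementalSumView:
    """Maintains SELECT key, SUM(value) ... GROUP BY key one change at a time."""

    def __init__(self):
        self._sums = defaultdict(int)
        self._counts = defaultdict(int)  # rows per group, to know when a group vanishes

    def apply(self, change):
        if change.op == "insert":
            sign = 1
        elif change.op == "delete":
            sign = -1
        else:
            raise ValueError(f"unknown op: {change.op!r}")
        # Work is proportional to the change, not to the size of the table.
        self._sums[change.key] += sign * change.value
        self._counts[change.key] += sign
        if self._counts[change.key] == 0:
            del self._sums[change.key]
            del self._counts[change.key]

    def snapshot(self):
        return dict(self._sums)


def recompute(rows):
    """Full recomputation over (key, value) rows, for comparison."""
    out = defaultdict(int)
    for key, value in rows:
        out[key] += value
    return dict(out)
```

Sums are the easy case because a delete is just a negative delta. Operators like joins, `MIN`/`MAX`, or `DISTINCT` need extra state to retract rows, which is where these systems differ most in operator support and state limits.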

ReadySet is a really cool project in the query caching/materialized view space that I think doesn't get enough attention. Rather than making you define your materialized views ahead of time, it acts as an automated caching layer on top of postgres/mysql that performs incremental computation for components of query graphs.

As someone who's been in the streaming space for years, it's really exciting to see so much energy in the space in the past couple of years after a long period of stagnation, with everyone trying to figure out the programming and deployment models that make the most sense.

Most folks right now are gravitating towards materialized views as the model, largely because it's easy and familiar for users. But ultimately I think this approach will end up too limited for most use cases and will remain valuable but somewhat niche.

[0] https://github.com/ArroyoSystems/arroyo