top | item 43730883


gz09 | 10 months ago

I wonder what the authors mean by

> DBSP does make some tradeoffs when compared to differential dataflow. It simplifies the programming model by constraining how time and state management occur. This simplification limits some of the concurrency gains we see in timely and differential dataflow.

FWIW there are no big fundamental differences between DBSP and DD in terms of concurrency. Both models can process data concurrently on many threads/machines, and both do it in similar ways (by sharding the data).

riccomini|10 months ago

DD supports lattices that allow it to compute at multiple points in time simultaneously. As I understand it, DBSP limits time to one diff at a time. Lalith can correct me if I’m off base on this. :)
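
To make "lattices" concrete, here is a toy sketch of partially ordered timestamps. It assumes a timestamp is a tuple such as (epoch, iteration) compared componentwise; `less_equal` and `join` are illustrative names in plain Python, not the API of DD or DBSP.

```python
# Toy model only: timestamps as tuples under the componentwise (product) order.
# The function names are hypothetical, not taken from any library.

def less_equal(a, b):
    """True if a is at or before b in every coordinate."""
    return all(x <= y for x, y in zip(a, b))

def join(a, b):
    """Least upper bound: the earliest time that is at or after both a and b."""
    return tuple(max(x, y) for x, y in zip(a, b))

t1 = (0, 3)  # e.g. epoch 0, iteration 3
t2 = (1, 1)  # e.g. epoch 1, iteration 1

# Neither time precedes the other, so work at t1 and t2 is not forced into
# a single sequence.
print(less_equal(t1, t2), less_equal(t2, t1))
print(join(t1, t2))
```

Because some times are incomparable, a system built on such an order is free to have work outstanding at several of them at once.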

ryzhyk|10 months ago

I'd say the difference is in the type of transaction isolation guarantees each system provides. DBSP can process multiple diffs in parallel, and when it's done it outputs a single diff that captures the effects of all the input diffs. DD can additionally attribute each output diff to a specific input diff by assigning each input diff and matching output diff a logical timestamp. This has a cost in terms of complexity and runtime overhead, but it allows strong isolation of concurrent transactions.
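
A toy sketch of that distinction, assuming a diff is just a map from record to signed weight. `merge_diffs` and `timestamped_diffs` are hypothetical helpers written for illustration; neither system exposes them, and real implementations are far more involved.

```python
from collections import Counter

def merge_diffs(diffs):
    """Fold several input diffs into one net diff, losing per-input attribution."""
    net = Counter()
    for diff in diffs:
        for record, weight in diff.items():
            net[record] += weight
    # Drop records whose insertions and deletions cancel out.
    return {record: weight for record, weight in net.items() if weight != 0}

def timestamped_diffs(diffs):
    """Keep each diff under its own logical timestamp so it stays attributable."""
    return {time: dict(diff) for time, diff in enumerate(diffs)}

tx1 = {"alice": +1, "bob": +1}   # first transaction inserts two rows
tx2 = {"bob": -1, "carol": +1}   # second transaction deletes one, inserts one

print(merge_diffs([tx1, tx2]))        # one combined diff
print(timestamped_diffs([tx1, tx2]))  # one diff per logical time
```

In the merged form, bob's insert and delete cancel and the effect of each transaction can no longer be told apart. The timestamped form keeps them separate, at the cost of carrying a time with every update.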

lsuresh|10 months ago

(lalith here) -- Whatever ryzhyk and gz09 said above. :)