
jagsr123 | 9 years ago

Structured Streaming makes Spark much more competitive for streaming use cases: streams can be treated as DataFrames, which makes it very simple to run continuous SQL queries on them. But often you also need to work with historical data in your analytic query, continuously mutate data (which may require transactional semantics), make sure the data itself is highly available (not just fault tolerant, which matters for low-latency apps), and so on. What we do is fuse a modern in-memory DB with Spark (i.e., the DB cluster nodes run the Spark executors), so there is no need to couple Spark with a separate data management cluster. That said, some of the APIs we introduced for SQL stream processing will be replaced by the new Structured Streaming APIs.
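To make the "stream as a table" idea concrete, here is a rough conceptual sketch in plain Python. It is not Spark or any product's API: the class name `ContinuousCount`, the `(device_id, reading)` event shape, and the device-to-region lookup table are all hypothetical, chosen only for illustration. It treats each micro-batch as rows appended to an unbounded table, joins them against a static "historical" table, and incrementally maintains the result of something like `SELECT region, COUNT(*) ... GROUP BY region`.

```python
from collections import defaultdict


class ContinuousCount:
    """Conceptual model of a continuous query (hypothetical, not a Spark API).

    Maintains a running per-region row count over a stream of
    (device_id, reading) events, joined with a static lookup table.
    """

    def __init__(self, historical):
        # Static "historical" table: device_id -> region (assumed shape).
        self.historical = dict(historical)
        # Running aggregate state, updated as each micro-batch arrives.
        self.counts = defaultdict(int)

    def process_batch(self, events):
        """Fold one micro-batch into the running result and return a snapshot."""
        for device_id, _reading in events:
            region = self.historical.get(device_id)
            if region is None:
                # Inner-join semantics: events with no matching device are dropped.
                continue
            self.counts[region] += 1
        return dict(self.counts)


query = ContinuousCount({"d1": "east", "d2": "west"})
first = query.process_batch([("d1", 20.5), ("d2", 19.0), ("d9", 3.0)])
second = query.process_batch([("d1", 21.0)])
print(first)
print(second)
```

The point of the sketch is only that the query is written once and its result is updated as new rows arrive. It says nothing about mutation, transactions, or high availability, which are the parts that need a real data store.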
