Capillaries is a distributed data processing platform that:
- works with structured row-based data
- splits data into batches that can be processed as separate jobs on multiple machines in parallel
- allows scenarios that involve human operator supervision and data validation
- has ETL/ELT capabilities
- has SQL-like join, grouping, and aggregation capabilities
- allows custom data processing plugins
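
To make the batching and aggregation points concrete, here is a minimal sketch of the general idea in plain Python. It is not the Capillaries API: every name in it (`split_into_batches`, `aggregate_batch`, `merge_partials`, `run`, the `customer` and `amount` fields) is hypothetical, and a local thread pool stands in for separate machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical row-based input; the field names are assumptions for illustration.
ROWS = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 7},
    {"customer": "c", "amount": 1},
    {"customer": "b", "amount": 3},
]


def split_into_batches(rows, batch_size):
    """Cut the row list into consecutive batches of at most batch_size rows."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def aggregate_batch(batch):
    """Process one batch independently: sum amount per customer (a GROUP BY)."""
    totals = defaultdict(int)
    for row in batch:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


def merge_partials(partials):
    """Combine the per-batch partial sums into the final aggregate."""
    merged = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            merged[key] += value
    return dict(merged)


def run(rows, batch_size=2, workers=2):
    """Split rows, aggregate each batch as a separate job, then merge."""
    batches = split_into_batches(rows, batch_size)
    # Each batch is an independent job; the pool stands in for multiple machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(aggregate_batch, batches))
    return merge_partials(partials)


if __name__ == "__main__":
    print(run(ROWS))
```

Because each batch is aggregated without reference to any other, the jobs can run in any order and on any worker; only the final merge needs all partial results.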
kleineshertz|2 years ago