ersatz_username | 2 years ago

A lot of these tools are very, very different from each other, so it's hard to address each individually. Just by way of example, Databend is a full-on data warehouse, while Great Expectations is a testing framework that evaluates data assertions (e.g. "I see there are nulls, but you wrote a test which says there shouldn't be").
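To make the "data assertion" idea concrete, here is a minimal plain-Python sketch of a no-nulls check. It is not the Great Expectations API or any other tool's; the `find_null_rows` helper, the `email` column, and the sample rows are all made up for illustration.

```python
from typing import Any, Iterable, Mapping


def find_null_rows(rows: Iterable[Mapping[str, Any]], column: str) -> list[int]:
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


# Hypothetical sample data: the second row violates the "no nulls" assertion.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
]

null_rows = find_null_rows(rows, "email")
if null_rows:
    # The data contains nulls, but the test says there should be none.
    print(f"FAILED: 'email' has nulls in rows {null_rows}")
else:
    print("PASSED: 'email' has no nulls")
```

A real testing framework would run many such assertions against actual tables and report on them together; the sketch only shows the shape of a single check.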

Here are some things we think are really important, though:

1. Data quality testing ideally happens during CI, not after merge.

2. Developers come first. Virtually every aspect of the tool can be customized, modified, and extended down to the basic data model without changing any upstream core code. Want to build your own custom application on top of your data lineage? Great! Have at it!

3. Users should be able to own not just their own data but their own metadata. We go to great lengths to maintain feature parity between the cloud and self-hosted versions of the application.
