top | item 39227725

(no title)

sztanko | 2 years ago

In data engineering, there are frameworks like dbt that do exactly that. In fact, they are widely treated as the industry standard and the recommended way to do transformations and cleanups nowadays. dbt is essentially a mix of SQL and Jinja (plus YAML files for variables): you can create your own macros, it comes with its own testing framework, and there are strict SQL code formatters for it. It fits git flow quite well. The rationale is that it enables data analysts (analytics engineers) to do quite sophisticated stuff while still using SQL. Also, if you are operating on datasets that are larger than a single machine can process, writing the logic in SQL and handing it to MPP engines like BigQuery and Snowflake is probably the only way to do it with relative ease.
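To make the "SQL plus templating plus macros plus tests" idea concrete, here is a rough sketch of the pattern. It is not dbt and does not use dbt's API: Python's stdlib `string.Template` stands in for Jinja, an in-memory SQLite database stands in for the warehouse, and every name in it (`cents_to_dollars`, `render_model`, `raw_orders`, `run_demo`) is made up for illustration.

```python
import sqlite3
from string import Template


def cents_to_dollars(column: str) -> str:
    """Hypothetical 'macro': a reusable SQL fragment, similar in spirit to a Jinja macro."""
    return f"ROUND({column} / 100.0, 2)"


# Hypothetical 'model': a SQL template. $placeholders stand in for Jinja's {{ ... }}.
MODEL_SQL = Template("""
SELECT customer_id,
       $amount_expr AS total_dollars
FROM $source_table
GROUP BY customer_id
ORDER BY customer_id
""")


def render_model(source_table: str) -> str:
    """Expand the macro and fill in the template, producing plain SQL."""
    return MODEL_SQL.substitute(
        amount_expr=cents_to_dollars("SUM(amount_cents)"),
        source_table=source_table,
    )


def run_demo():
    """Run the rendered model, then a 'not null' style test against its output."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the MPP engine
    conn.execute("CREATE TABLE raw_orders (customer_id INTEGER, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?)",
        [(1, 1250), (1, 750), (2, 999)],
    )
    model_sql = render_model("raw_orders")
    rows = conn.execute(model_sql).fetchall()
    # The test is itself just SQL: count the rows that violate the expectation.
    failures = conn.execute(
        f"SELECT COUNT(*) FROM ({model_sql}) WHERE total_dollars IS NULL"
    ).fetchone()[0]
    conn.close()
    return rows, failures


if __name__ == "__main__":
    rows, failures = run_demo()
    print(rows, "failing rows:", failures)
```

The point of the sketch is that both the transformation and the test end up as plain SQL after rendering, so the engine does all the heavy lifting and the templates live happily in git.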

In any case, this is for data engineering only. I wouldn't imagine doing this for live production stuff.
