top | item 40724646

(no title)

sqlcook | 1 year ago

You’re missing the point of solutions like these, and of the original generation of tools like Informatica. Those tools come with limitations and constraints, but like a box of Legos, they let you build a very powerful pipeline without having to wire up a lot of redundant code to pass data frames between validation stages. Tools like Airflow and Spark are great for what they are, but they don’t come with guidelines or best practices for reusable code at scale; your team has to establish those early on.

You can open a large, fairly complicated DAG and right away understand the data flow and processing steps. If you were to do something similar in code, it becomes a lot harder to follow unless you stick to good modular design practices.

This is also why common game engines and 3D rendering tools come with a UI for flow-driven scripting. It’s intuitive and makes things much easier to organize.
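
To make the "good modular design practices" point concrete, here is a rough sketch of what a code-only pipeline might look like when each stage is a small named function and the flow is just an ordered list. Everything in it (the stage names, the `run_pipeline` and `describe` helpers, the row format) is made up for illustration and is not an Airflow, Spark, or Informatica API.

```python
from typing import Callable

# Hypothetical row format: a plain dict per record.
Row = dict
Stage = Callable[[list[Row]], list[Row]]


def drop_missing_ids(rows: list[Row]) -> list[Row]:
    """Filter out rows that have no usable id."""
    return [r for r in rows if r.get("id") is not None]


def normalize_names(rows: list[Row]) -> list[Row]:
    """Trim whitespace and lowercase the name field."""
    return [{**r, "name": r["name"].strip().lower()} for r in rows]


def validate_unique_ids(rows: list[Row]) -> list[Row]:
    """Fail fast if two rows share an id; otherwise pass rows through."""
    ids = [r["id"] for r in rows]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate ids found")
    return rows


# The whole data flow is visible in one place, in order.
PIPELINE: list[Stage] = [drop_missing_ids, normalize_names, validate_unique_ids]


def run_pipeline(rows: list[Row], stages: list[Stage]) -> list[Row]:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        rows = stage(rows)
    return rows


def describe(stages: list[Stage]) -> str:
    """Text stand-in for the picture a visual DAG tool gives you."""
    return " -> ".join(stage.__name__ for stage in stages)
```

Even with this much discipline, you only get a one-line text summary of the flow from `describe(PIPELINE)`, and nothing forces the next person on the team to follow the same pattern, which is the gap the visual tools fill.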
