top | item 30795514

(no title)

TTPrograms | 3 years ago

FYI some of the Airflow issues are out of date / can be resolved with config changes.

AirFlow 2 is designed to support larger XCOM messages, so the guidance to only use it for small data no longer applies.

Your DAG construction overhead issue is likely due to dagbag refreshing. Airflow checks for DAG changes on a fixed interval, causing a reimport. The default period for that is fairly small, so for large deployments you will want to use a larger period (e.g. at least 5 minutes). I do not know why the default is so short (or was last I checked, anyway). Python files shouldn't do much of note on import regardless IMO.

I am not otherwise familiar with the improvements in Airflow 2, so I cannot say for sure if your other complaints still remain.

discuss

order

tchiotludo|3 years ago

I know that that some issues are fixed in Airflow 2, they have made a large improvement with that release. But not all issues is resolved with this one.

The performance issue is still here, just launch Airflow and submit thousand dagruns with simple python sleep(1) and you will hit the cpu bound very quickly with a total time that will have a large duration. Airflow is not designed for a lot of short duration tasks. When using event driving data flow, it's really complicated to managed.

Imagine a flow that will be triggered for each store for example (thousand of store, with 10+ tasks for each one), Airflow will not be able to manage this kind of workflow quickly (and it's not its goals). Airflow was clearly defined to handle small (hundreds tasks) for a long time.

For the XCOM part, Airflow store this in database, so you can't store data into this, you will need to store a small data (database is not here to store big files). In Kestra, we have a provide a storage that allow storing large data (Go, To, ...) between tasks natively with the pain on multiple node clusters.

TTPrograms|3 years ago

AirFlow 2 was released in 2020. You're saying you knew that these issues were fixed, and then an article is published on your webpage in 2022 knowingly comparing against the technical properties of a major version release 2 years behind? That is not a good look.