Longwelwind | 2 years ago
* Add a GPU resource requirement on one of your steps
* Add an auto-scaler that adds GPU nodes to your cluster based on the GPU resource demand.
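The two steps above can be sketched concretely. Below is a minimal pod manifest (as a plain Python dict, so it runs anywhere) for a step that requests one GPU; the step name and image are hypothetical, but the `nvidia.com/gpu` limit is the real mechanism a GPU-aware cluster autoscaler keys off when deciding to add a GPU node:

```python
# Minimal sketch of a pod spec for a GPU-requiring step.
# "train-step" and the image name are illustrative placeholders.
gpu_step_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "train-step"},
    "spec": {
        "containers": [
            {
                "name": "train",
                "image": "my-registry/train:latest",  # hypothetical image
                "resources": {
                    # GPUs are an "extended resource": they are requested
                    # via limits, and the scheduler only places the pod on
                    # a node that advertises nvidia.com/gpu capacity.
                    "limits": {"nvidia.com/gpu": "1"},
                },
            }
        ],
        "restartPolicy": "Never",
    },
}

print(gpu_step_pod["spec"]["containers"][0]["resources"]["limits"])
```

While this pod is unschedulable, an autoscaler configured with a GPU node group sees the pending `nvidia.com/gpu` demand and scales that group up.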
Having written the above, I realize it might sound like that famous HN comment about how you can /easily/ re-create Dropbox yourself, which might actually prove your point that there is a need for ML-specific tools for the training part.
thundergolfer | 2 years ago
Airflow is also absolutely not built for that purpose. It's ~10-year-old, Hadoop-era technology.
__MatrixMan__ | 2 years ago
As for configuring the Kubernetes pod operator to ask for pods with GPUs: it exposes the k8s Python API in the DAG definition. I haven't done it myself, but I think it's not really Airflow that's going to be a pain there. Getting the pod spec right is going to have to happen regardless of what does the orchestration.
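For illustration, a sketch of what that DAG-side configuration looks like. This assumes Airflow 2.x with the cncf.kubernetes provider installed; the task id, pod name, and image are placeholders, and note that the `container_resources` parameter is the newer spelling (older provider versions used a `resources` argument instead):

```python
# Sketch only: assumes the apache-airflow-providers-cncf-kubernetes
# package. The operator takes kubernetes-client model objects directly,
# so the GPU request is ordinary pod-spec work, not Airflow-specific.
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
from kubernetes.client import models as k8s

train = KubernetesPodOperator(
    task_id="train_model",            # hypothetical task id
    name="train-model",
    image="my-registry/train:latest", # hypothetical image
    container_resources=k8s.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"},
    ),
)
```

The point stands: whatever orchestrator launches the pod, something has to emit this same `V1ResourceRequirements` (or equivalent YAML), so the GPU plumbing is Kubernetes work rather than Airflow work.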
(Full disclosure: my employer offers airflow as a service)