Great question! I am working on a follow-up blog post that will explain the differences in more detail. Flyte does take some inspiration from Airflow, but it has a lot of important differences:
- Flyte natively understands data flow between tasks. This is achieved through its own type system, defined in protobuf
- Flyte tasks are first-class citizens, so they can be shared and reused, and they are always associated with an interface declaration
- Flyte is container- and Kubernetes-native. It is also multi-tenant.
- Flyte's cron scheduler, control plane API, and actual execution engine are decoupled. Each workflow can be independently executed on a different execution engine
- Flyte workflows and tasks are pure specifications, defined in protobuf
- Flyte provides an event stream of each execution
- since Flyte is aware of the data, it comes with built-in memoization and automatic cataloging
- like Airflow, Flyte can have plugins in Python, but it also supports a richer plugin interface
- Flyte is written in Go and built on top of Kubernetes
It is definitely less mature in open source, so please help us make it better. But it has been battle-tested at Lyft for more than three years in production.
mmq|6 years ago
Polyaxon[0] took a similar approach to Flyte for authoring specifications: a strongly typed system in protobuf + an intuitive YAML specification + SDKs in Python/Go/Java/... It also treats operations (tasks in Flyte) as first-class citizens and allows running them in a serverless way. Users can choose to register repetitive operations as components and share them with a description and typed inputs/outputs.
[0] https://github.com/polyaxon/polyaxon
zerovar|6 years ago
kumare3|6 years ago
But, as it stands, we have a FlyteAirflowOperator, so that users can easily connect their Airflow pipelines to Flyte and write new ones on Flyte alone.
Stay tuned for developments on this front :)