top | item 25730630

somurzakov | 5 years ago

I believe what you described is the job of a Platform Engineer / Systems Engineer / data lake architect, especially the JVM aspect of it. The interesting work is at the beginning, when you build the cluster initially or do a major extension; after that, the ops/maintenance is usually outsourced to cheap labor offshore - so this kind of job is personally not for me.

Spark has a DataFrame API that is similar to the pandas API and can be learned in a day, especially if you already know Python.

Same for Airflow and other frameworks: it's just a fancy scheduler that anyone can pick up in a couple of days.
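The "fancy scheduler" point can be seen in how little an Airflow DAG file contains: it is essentially configuration declaring tasks and their ordering. A minimal sketch, assuming Airflow 2.x; the DAG id, callables, and schedule are all made up for the example.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Hypothetical step: pull data from a source system.
    print("pulling data")


def load():
    # Hypothetical step: write data to a target system.
    print("writing data")


# A two-step daily pipeline: extract, then load.
with DAG(
    dag_id="example_etl",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2  # dependency: load runs after extract succeeds
```

That is most of the day-to-day surface area: define tasks, wire up dependencies with `>>`, and let the scheduler handle retries and backfills.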
