top | item 40726660

kaon_ | 1 year ago

I would love to have your advice. What tool would you recommend for straightforward ETLs as a single developer? Think of tasks like copying data from Production to Test or Local, or quickly combining data from two databases to answer a business question.

Six years ago I used Pentaho for this, and it worked really well: it was easy and quick. Maintenance was sometimes hard, though, and it felt very dated: the JavaScript version was ancient, and although I could find a lot of questions answered online, the answers were usually 5-10 years old. I am wondering whether I should use something like Amphi for my next simple ETLs.
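To make the "combine data from two databases" case concrete, here is a minimal sketch using only Python's standard library. It is not tied to Pentaho or Amphi; the table names, columns, and sample rows are hypothetical, and two in-memory SQLite connections stand in for two separate database servers.

```python
import sqlite3


def make_orders_db():
    """Hypothetical 'orders' database (stand-in for the first source)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 20.0), (1, 5.0), (2, 12.5)],
    )
    return conn


def make_crm_db():
    """Hypothetical 'CRM' database (stand-in for the second source)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace"), (3, "Linus")],
    )
    return conn


def revenue_per_customer(orders_conn, crm_conn):
    """Aggregate in one database, then join against the other in Python."""
    totals = dict(
        orders_conn.execute(
            "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
        )
    )
    customers = crm_conn.execute("SELECT id, name FROM customers ORDER BY id")
    # Customers without orders get a total of 0.0 (a left join, in effect).
    return [(name, totals.get(cid, 0.0)) for cid, name in customers]


if __name__ == "__main__":
    for name, total in revenue_per_customer(make_orders_db(), make_crm_db()):
        print(name, total)
```

Pushing the aggregation down to the source keeps the data pulled into Python small; for anything larger than a quick business question, a dedicated tool is likely the better fit.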

NortySpock|1 year ago

I've gotten some quick wins with Benthos (now Redpanda Connect), but I agree it's an unsolved problem, as there are typically gotchas.

If you can get a true CDC stream from the database to analytics, that would be ideal, but when that isn't available you spend 100x more time trying to bodge together an equivalent batch/retry system.
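As a rough illustration of what such a batch/retry fallback can look like, here is a minimal sketch of a high-water-mark copy with retries, using only Python's standard library. The `events` table, its `updated_at` column, and the function names are assumptions made up for this example, not part of Benthos or any other tool. It also shows some of the gotchas: deletes in the source are never seen, and rows committed late with an older `updated_at` are skipped.

```python
import sqlite3
import time

# Hypothetical schema shared by source and target.
SCHEMA = "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT, updated_at INTEGER)"


def make_events_db():
    """In-memory SQLite database standing in for a real source or target."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def copy_new_rows(source, target, retries=3, delay=0.0):
    """Copy rows newer than the target's high-water mark, retrying on errors."""
    for attempt in range(1, retries + 1):
        try:
            # The newest timestamp already in the target is the watermark.
            (watermark,) = target.execute(
                "SELECT COALESCE(MAX(updated_at), 0) FROM events"
            ).fetchone()
            rows = source.execute(
                "SELECT id, payload, updated_at FROM events "
                "WHERE updated_at > ? ORDER BY updated_at",
                (watermark,),
            ).fetchall()
            # One transaction per batch: either all rows land or none do.
            with target:
                target.executemany(
                    "INSERT OR REPLACE INTO events VALUES (?, ?, ?)", rows
                )
            return len(rows)
        except sqlite3.OperationalError:
            if attempt == retries:
                raise
            time.sleep(delay * attempt)  # simple linear backoff
```

Because the upsert is keyed on `id`, re-running a failed batch is safe, which is the property that makes the retry loop harmless.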

rubslopes|1 year ago

I'd like to know that too. The BI team where I work still uses Pentaho. It's buggy and ugly, but it gets the job done most of the time. A few of them know a little Python, so a tool like Amphi could be the next stage.

hipadev23|1 year ago

ClickHouse can enable all the things you mentioned.