shapiro92 | 5 years ago

Before I try yet another ETL tool: how does this work with datasets that do not come from third-party providers like Salesforce? I have had to build ETL pipelines for highly customized datasets, either row-level or XML, with what I would call tricky code, because the nesting and flows were not simple and a lot of data was missing.

How would Meltano or the other mentioned tools handle this?

Examples are EDIFACT, FHIR, or BDT.

tayloramurphy|5 years ago

We're working on an SDK[0] for building taps that should make it much easier to build to the Singer spec with all of the features out of the box. In theory, if you can write some Python against whatever you're pulling data out of, then it can work within the Singer ecosystem and Meltano. It's nearly ready to go, but we'd love feedback if you decide to test it out!

[0] https://gitlab.com/meltano/singer-sdk
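
To make "write some Python against whatever you're pulling data out of" concrete, here is a minimal sketch of a hand-rolled tap for a nested XML feed with missing fields. It does not use the SDK linked above; it only prints the three Singer message types (SCHEMA, RECORD, STATE) as JSON lines. The sample XML, the `patients` stream name, the flattened field names, and the `bookmarks` layout inside the STATE value are all assumptions made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample input; a real EDIFACT/FHIR/BDT feed would be far messier.
SAMPLE_XML = """
<patients>
  <patient id="1"><name>Ada</name><address><city>Berlin</city></address></patient>
  <patient id="2"><name>Grace</name></patient>
</patients>
"""

STREAM = "patients"  # hypothetical stream name

# JSON Schema for the flattened record; nullable types cover missing data.
SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "name": {"type": ["string", "null"]},
        "city": {"type": ["string", "null"]},
    },
}


def to_record(elem):
    """Flatten one <patient> element; missing nested fields become None."""
    return {
        "id": elem.get("id"),
        "name": elem.findtext("name"),
        "city": elem.findtext("address/city"),
    }


def singer_messages(xml_text):
    """Yield Singer-style SCHEMA, RECORD and STATE messages as dicts."""
    yield {
        "type": "SCHEMA",
        "stream": STREAM,
        "schema": SCHEMA,
        "key_properties": ["id"],
    }
    last_id = None
    for elem in ET.fromstring(xml_text.strip()).iter("patient"):
        record = to_record(elem)
        last_id = record["id"]
        yield {"type": "RECORD", "stream": STREAM, "record": record}
    # The STATE value is free-form; this bookmark layout is an assumption.
    yield {"type": "STATE", "value": {"bookmarks": {STREAM: {"last_id": last_id}}}}


if __name__ == "__main__":
    # A tap writes one JSON message per line to stdout for a target to consume.
    for message in singer_messages(SAMPLE_XML):
        print(json.dumps(message))
```

The format-specific work lives entirely in `to_record`, so a messier source would mostly mean a longer flattening function while the message plumbing stays the same.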

shapiro92|5 years ago

Okay, fair enough. But I guess it doesn't offer anything extra for us then, as we already have a self-made ETL platform where we add a new process with just one extra file. Thanks!

estsauver|5 years ago

I can speak for Singer taps, which this is based on for the EL bits. Our Postgres ETL works quite well. Logical replication broke down quite quickly, but primary keys + updated_at keys are working very well for us. I can't speak about XML though.
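
The "primary keys + updated_at" approach can be sketched roughly as below. This is not the commenter's pipeline or the actual Postgres tap: it uses an in-memory SQLite database as a stand-in so it runs anywhere, and the `users` table, its columns, and the `(updated_at, id)` bookmark shape are assumptions for illustration.

```python
import sqlite3


def fetch_incremental(conn, bookmark):
    """Return rows changed since `bookmark` plus the advanced bookmark.

    `bookmark` is None on the first run, otherwise an (updated_at, id) pair.
    The primary key breaks ties between rows sharing the same timestamp.
    """
    if bookmark is None:
        rows = conn.execute(
            "SELECT id, name, updated_at FROM users ORDER BY updated_at, id"
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id, name, updated_at FROM users "
            "WHERE updated_at > ? OR (updated_at = ? AND id > ?) "
            "ORDER BY updated_at, id",
            (bookmark[0], bookmark[0], bookmark[1]),
        ).fetchall()
    if rows:
        last = rows[-1]
        bookmark = (last[2], last[0])
    return rows, bookmark


def demo():
    """Run a full sync, change one row, then run an incremental sync."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "ada", "2021-03-01T10:00:00"), (2, "grace", "2021-03-01T11:00:00")],
    )
    first, bookmark = fetch_incremental(conn, None)
    # Simulate an application update that also bumps updated_at.
    conn.execute(
        "UPDATE users SET name = 'ada l.', updated_at = '2021-03-01T12:00:00' "
        "WHERE id = 1"
    )
    second, bookmark = fetch_incremental(conn, bookmark)
    conn.close()
    return first, second, bookmark


if __name__ == "__main__":
    first, second, bookmark = demo()
    print(len(first), "rows on full sync;", len(second), "on incremental sync")
    print("bookmark:", bookmark)
```

One trade-off of this style compared with logical replication is that hard deletes never show up, because a deleted row has no `updated_at` left to query.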

smilliken|5 years ago

Why did logical replication break down for you?