ThomasMoll | 2 years ago

When I worked at LinkedIn, we did it with ETL clusters, which we had already built out for moving data between datacenters nightly. They would mirror an HDFS cluster, then run batch jobs to transfer the data either directly to the outbound cluster or to another ETL cluster in another DC.

We used one of our ETL clusters to ship data to MSFT for various LinkedIn integrations, like seeing LinkedIn profile information in Outlook or Office products.

junto|2 years ago

Which tools were you using for ETL? Or were they completely custom?