joshwget | 1 year ago

Hightouch | Remote (North America) | Full-time | Backend, Fullstack, and Frontend Engineers

Some background on Hightouch - our mission is to help companies leverage their customer data to grow. We started with the problem of “Reverse ETL” or helping companies sync data from their data warehouse (e.g. Snowflake, Databricks, etc.) to 200+ SaaS tools (Salesforce, Marketo, Facebook Ads, etc.) without coding. Since then, we’ve evolved into a suite of tools around the warehouse (identity resolution, data enrichment, event streaming, etc.). We’ve raised a Series B and scaled to $40m+ ARR in 3 years with 800+ customers including Fortune 500 co’s like Spotify, the NBA, PetSmart, etc. We are hiring for:

Full Stack Engineer, AI Decisioning: https://boards.greenhouse.io/hightouch/jobs/5404573004

Product Manager, Insights Products: https://boards.greenhouse.io/hightouch/jobs/5401455004

Our Talent Team is committed to responding to everyone who applies!

n_u | 1 year ago

Looks like you are also hiring a distributed systems engineer https://news.ycombinator.com/item?id=42920292

From that listing "Sync Speed: Customers want to sync a lot of data to important destinations like Facebook and Snapchat, which requires us to analyze every part of our syncing process and find where we can optimize to sync data more quickly"

I'm curious about this. What workflow requires syncing high volumes of data from a CDW to Facebook & Snapchat at low latency? It's my understanding that businesses mostly use those platforms for advertising. I'm struggling to think of a use case where you want to adjust your advertising with low latency and lots of data. I could understand feeding lots of data from your CDW into an ML model that updates your ads through the FB Ads API but I can't see why

1. it needs to go straight from CDW to FB?

2. it needs to be a lot of data?

3. it needs to be fast?

Perhaps there is some other use-case besides adjusting ads.

4. Also, why do you use the word "syncing" rather than "send"? I tend to think of syncing as involving multiple programs that can edit data (e.g. Google Docs, distributed consensus, etc.). Are Facebook and Snapchat actually updating the data you send, so that you have to sync in the other direction? Or is it just one-way?

kernel_concern | 1 year ago

I work on the syncing team at Hightouch. These are great questions and also good feedback on how we could be clearer when describing the problems we need to solve.

1. We also support the case you describe, in which an ML model processes data and then updates properties in a destination. However, customers still get a lot of value out of keeping downstream systems synced with their warehouse tables. For instance, you can define which people you want to receive different campaigns and make sure that's consistent across all your ad platforms. You can also use it for simple projects like easily keeping Airtable in sync with a Postgres database.

2. Some people have warehouse tables with many billions of rows.

3. If you have a billion rows, you need to hit a very high rows-per-second rate in order to run a sync in a feasible amount of time. Also, we have an event collection product, which allows customers to feed events into Hightouch in real time, and a personalization API product, which allows customers to hit an API and get a low-latency response for how a given user's experience should be personalized. New data flowing into the events API has to be processed quickly so that it is ready in the personalization API for fast fetching.

4. It's true that syncing often implies some bi-directionality. In this case we think of "syncing" as bringing the destination system's state in line with the state present in the source system. It's nice because you can use the source system as the source of truth and trust that any edits you make will be reflected elsewhere, possibly across many destinations.
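
To make the one-way idea in point 4 concrete, here is a minimal sketch of computing what has to change in a destination so that it matches the source. This is an illustration only, not a description of how Hightouch implements syncing: the function name `plan_sync`, the in-memory dict representation, and the sample rows are all hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def plan_sync(
    source: Dict[str, Row], destination: Dict[str, Row]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (to_add, to_update, to_remove) keys.

    The source is treated as the source of truth; the destination is
    never read back into it, so the flow is strictly one-way.
    """
    # Rows present in the source but missing from the destination.
    to_add = sorted(k for k in source if k not in destination)
    # Rows present in both whose contents differ.
    to_update = sorted(
        k for k in source if k in destination and source[k] != destination[k]
    )
    # Rows that no longer exist in the source.
    to_remove = sorted(k for k in destination if k not in source)
    return to_add, to_update, to_remove


# Hypothetical sample data: keys are user ids, values are row contents.
source = {
    "u1": {"campaign": "A"},
    "u2": {"campaign": "B"},
    "u3": {"campaign": "A"},
}
destination = {
    "u1": {"campaign": "A"},
    "u2": {"campaign": "A"},
    "u4": {"campaign": "C"},
}

print(plan_sync(source, destination))
# prints (['u3'], ['u2'], ['u4'])
```

Applying the same plan to several destinations is what would keep them all consistent with one source table; a real system working on billions of rows would presumably not hold both sides in memory like this.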

Is this helpful?
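
For a rough sense of scale on point 3, the arithmetic can be sketched as below. The one-hour and 24-hour windows are assumptions picked for illustration, not figures from Hightouch, and `required_rows_per_second` is a hypothetical helper.

```python
def required_rows_per_second(total_rows: int, window_seconds: float) -> float:
    """Average throughput needed to move total_rows within window_seconds."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return total_rows / window_seconds


rows = 1_000_000_000  # "a billion rows", as in point 3

# Assumed sync windows, chosen only to show the order of magnitude.
for label, seconds in [("1 hour", 3_600), ("24 hours", 86_400)]:
    rate = required_rows_per_second(rows, seconds)
    print(f"{label}: {rate:,.0f} rows/s")
# prints:
# 1 hour: 277,778 rows/s
# 24 hours: 11,574 rows/s
```

These are averages over the whole window, so any time spent on rate limiting, retries, or batching overhead pushes the required peak rate higher.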