Thank you! We will use the data for customer segmentation. For that, we need to merge all of a user's sources into a single entity per user, and link that entity to another table with their purchases so we can do the segmentation. I'm not sure about latency, but honestly that's not important at the moment. I just need a strategy for an end-to-end solution that gathers the data, transforms it, and delivers a concise and coherent "user table" for the machine learning dude.
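To make the goal concrete, here is a rough sketch of the "one entity per user, linked to purchases" idea. Everything in it is an assumption for illustration: the two sources (`crm_users`, `web_users`), the field names, and the choice of a normalized email as the matching key. Real data may need a different key or fuzzier matching.

```python
# Hypothetical records from two sources; all field names are assumptions.
crm_users = [{"email": "Ana@Example.com", "name": "Ana"}]
web_users = [
    {"email": "ana@example.com", "country": "PT"},
    {"email": "bob@example.com", "country": "US"},
]
purchases = [
    {"email": "ana@example.com", "amount": 30.0},
    {"email": "ana@example.com", "amount": 12.5},
    {"email": "bob@example.com", "amount": 8.0},
]


def user_key(record):
    """Normalize the email so the same person matches across sources."""
    return record["email"].strip().lower()


def build_user_table(sources, purchases):
    """Merge all sources into one entity per user, then attach purchase totals."""
    users = {}
    for source in sources:
        for record in source:
            key = user_key(record)
            entity = users.setdefault(
                key, {"email": key, "n_purchases": 0, "total_spent": 0.0}
            )
            for field, value in record.items():
                if field != "email":
                    # If two sources disagree, the first one seen wins.
                    entity.setdefault(field, value)
    for purchase in purchases:
        key = user_key(purchase)
        if key in users:
            users[key]["n_purchases"] += 1
            users[key]["total_spent"] += purchase["amount"]
    return users


user_table = build_user_table([crm_users, web_users], purchases)
```

Each value in `user_table` is one row of the "user table": the merged attributes plus simple purchase aggregates that a segmentation model could start from.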
stadium|4 years ago
Usually I'd recommend bringing the raw data into your database first, before transforming it. It's hard, if not impossible, to predict future needs, and this buys you flexibility. "ELT" (extract, load, transform) describes this approach, as opposed to ETL.
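A minimal sketch of what that could look like, using Python's built-in `sqlite3` as a stand-in for whatever database you end up with. The table names (`raw_users`, `users`) and columns are made up for illustration. The point is only the order of operations: load the rows untouched, then transform with SQL inside the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Load step: store rows exactly as they arrive, with no cleaning.
conn.execute(
    "CREATE TABLE raw_users (source TEXT, email TEXT, name TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO raw_users VALUES (?, ?, ?, ?)",
    [
        ("crm", " Ana@Example.com ", "Ana", None),
        ("web", "ana@example.com", None, "PT"),
    ],
)

# Transform step: derive a clean table from the raw one, inside the database.
# MAX() skips NULLs, so each column picks up whichever source has a value.
conn.execute(
    """
    CREATE TABLE users AS
    SELECT lower(trim(email)) AS email,
           MAX(name)          AS name,
           MAX(country)       AS country
    FROM raw_users
    GROUP BY lower(trim(email))
    """
)

rows = conn.execute("SELECT email, name, country FROM users").fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_users").fetchone()[0]
```

Because `raw_users` is never modified, you can drop and rebuild `users` with different rules later without going back to the original sources.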
Good luck!