(no title)
feqgmmr2 | 4 years ago
Consider a scenario where data comes in periodically, say daily, from some source: server logs, sensor data, whatever. The user wants to train models daily on that data, and they also want to run SQL on it. Maybe they ingest the data directly into SF and copy it out for training, or they do it the other way round: land it in object store and then ingest it into SF. Any single day's load is unlikely to be a humongous amount of data; it's probably not a PB. However, it adds up. For some use cases it becomes a PB in a month, for others in a quarter, and for others only in a year.
Thing is, without a Lakehouse architecture, the user will pay to store and copy that data multiple times (at least twice) no. matter. what. They may not pay for a PB in one shot, but you can bet that eventually they'll pay multiple times to store and copy that PB.
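To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only uses the month/quarter/year horizons mentioned above and the "at least twice" copy count. The helper names and the decimal convention (1 PB = 1000 TB) are my own assumptions, and it deliberately ignores pricing, compression, and retention policies.

```python
# Back-of-the-envelope sketch; all names here are hypothetical helpers,
# not part of any vendor API.
PB_IN_TB = 1000  # assumption: decimal units, 1 PB = 1000 TB


def daily_tb_to_reach_pb(days: int) -> float:
    """Daily ingest (TB/day) at which data accumulates to 1 PB in `days` days."""
    return PB_IN_TB / days


def stored_tb(daily_tb: float, days: int, copies: int) -> float:
    """Total TB held after `days` days when every byte is kept in `copies` places."""
    return daily_tb * days * copies


if __name__ == "__main__":
    # Horizons taken from the comment: a month, a quarter, a year.
    for label, days in [("month", 30), ("quarter", 90), ("year", 365)]:
        rate = daily_tb_to_reach_pb(days)
        one_copy = stored_tb(rate, days, copies=1)
        two_copies = stored_tb(rate, days, copies=2)
        print(
            f"{label}: {rate:.1f} TB/day -> "
            f"{one_copy:.0f} TB single copy, {two_copies:.0f} TB with two copies"
        )
```

The point is just that the daily volume can look modest (a few TB a day on the yearly horizon) and the duplicated footprint still ends up at twice the PB, whichever horizon applies.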