top | item 44884324

salmonellaeater | 6 months ago

It's the wrong architecture from a dependency management perspective. Directly importing a table into Iceberg allows analytics consumers to take dependencies on it. This means the Postgres database schema can't be changed without breaking those consumers. This is effectively a multi-tenant database with extra steps.

This is not to say that this architecture isn't salvageable - if the only consumer of the Iceberg table copy is, e.g., a view that downstream consumers must use, then it's easier to change the Postgres schema, as only the view must be adjusted. My experience with copying tables directly to a data warehouse using CDC, though, suggests it's hard to prevent erosion of the architecture as high-urgency projects start taking direct dependencies to save time.
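To make the view idea concrete, here's a minimal sketch, assuming SQLite as a stand-in for the warehouse (this is not Iceberg or Postgres). The names `orders_raw`, `orders_v1`, and all columns are hypothetical. Consumers query only the view, so when the raw copy's schema changes, only the view definition is rewritten.

```python
import sqlite3

def consumer_query(conn):
    # Downstream consumers only ever touch the view, never the raw copy.
    return conn.execute(
        "SELECT order_id, total_cents FROM orders_v1 ORDER BY order_id"
    ).fetchall()

conn = sqlite3.connect(":memory:")

# Version 1 of the (hypothetical) replicated table and its shield view.
conn.executescript("""
    CREATE TABLE orders_raw (id INTEGER PRIMARY KEY, amount_cents INTEGER);
    INSERT INTO orders_raw VALUES (1, 1250), (2, 990);
    CREATE VIEW orders_v1 AS
        SELECT id AS order_id, amount_cents AS total_cents FROM orders_raw;
""")
before = consumer_query(conn)

# The source schema changes: the amount is split into net and tax columns.
# Only the view is redefined; the consumer query stays the same.
conn.executescript("""
    DROP VIEW orders_v1;
    DROP TABLE orders_raw;
    CREATE TABLE orders_raw (
        id INTEGER PRIMARY KEY, net_cents INTEGER, tax_cents INTEGER
    );
    INSERT INTO orders_raw VALUES (1, 1000, 250), (2, 900, 90);
    CREATE VIEW orders_v1 AS
        SELECT id AS order_id, net_cents + tax_cents AS total_cents
        FROM orders_raw;
""")
after = consumer_query(conn)
```

The catch described above still applies: nothing in this setup stops a rushed project from querying `orders_raw` directly, short of revoking access to it.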

code_biologist|6 months ago

Eh, as long as it isn't life or death, I think allowing direct consumption and explicitly agreeing that breakage is a consumer problem is better for most business use cases (less code, easier to maintain and evolve). If you make a breaking schema change and nobody complains, is it really breaking?

I have spent way too much of my life maintaining consumer shield views and answering hairy schema translation questions for use cases so unimportant the downstream business user forgot they even had the view.

Important downstream data consumers almost always have monitoring/alerting set up (if it's not important enough to have those, it's not important) and usually the business user cares about integrity enough to help data teams set up CI. Even in these cases, where the business user cares a lot, I've still found shield views to be of limited utility versus just letting the schema change hit the downstream system and letting them handle it as they see fit, as long as they're prepared for it.
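As a rough illustration of what "prepared for it" could look like on the consumer side, here's a sketch of a schema check that a CI job or alert might run. It is an assumption about how such a check might be written, not something from a specific tool. The function `schema_drift` and the column names are hypothetical, and in practice `actual` would be read from the warehouse's catalog rather than hard-coded.

```python
def schema_drift(expected, actual):
    """Return human-readable problems; an empty list means the schema still fits."""
    problems = []
    for column, expected_type in expected.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != expected_type:
            problems.append(
                f"type changed: {column} {expected_type} -> {actual[column]}"
            )
    # Extra columns in `actual` are ignored on purpose: additive changes
    # do not break a consumer that selects columns by name.
    return problems

# Columns this consumer depends on (hypothetical).
expected = {"order_id": "bigint", "total_cents": "bigint"}
# What the table looks like after an upstream schema change (hypothetical).
actual = {"order_id": "bigint", "net_cents": "bigint", "tax_cents": "bigint"}

problems = schema_drift(expected, actual)
```

A consumer that cares runs something like this and gets paged; one that doesn't care never writes it, which is roughly the filter being described.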

> it's hard to prevent erosion of the architecture as high-urgency projects start taking direct dependencies to save time.

IME, it feels wrong, but it mostly does end up saving time with few consequences. Worse is better.