Strange that the article pitches itself as "Enterprise" yet makes no mention of Google's Zanzibar or how it compares to the other approaches. AFAIK it doesn't use pre-computed values but just queries really fast (it runs on Spanner, so there's that).
Google's Zanzibar actually does both: for the vast majority of queries, it uses significant levels of caching and a permitted amount of staleness [1], allowing Spanner to return a (somewhat stale) copy of the relationship data from local nodes, rather than having to wait or coordinate with the other nodes.
However, some deeply recursive or wide relations can still be slow, so Zanzibar also has a pre-computation cache called Leopard that is used for a very specific subset of these relations [2]. For SpiceDB, we called our version of this cache Materialize and it is designed expressly for handling "Enterprise" levels of scale in a similar fashion, as sometimes it is simply too slow to walk these deep graphs in real-time.
Interesting article, but I would say it mixes up two concerns. One is storing trees in the DB and retrieving them, which can be annoying but has nothing to do with permissions. The other is "hiding" unpermitted nodes/branches from the viewer (if that is what applying permissions is about; it can also cover read-only access, for instance). If these two concerns are separated and it is not a big deal to "overfetch" for the current user before filtering, things become way easier. Once the tree is reconstructed, you can do a breadth-first traversal and compute permissions for every item in it, or retrieve the permissions for all items at a given level if you are doing ACL stuff. From there, if the current viewer has no permission on a node, you exclude it from further scans and do not add its children to further traversals as you go down. Max. number of scans = tree depth. With some PG prowess you could even fold this into sophisticated SQL.
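A minimal sketch of the level-by-level filtering described above, assuming the tree has already been overfetched into memory. The names `visible_nodes`, `children`, and `allowed_in` are hypothetical; `allowed_in` stands in for whatever per-level permission lookup (an ACL query, for instance) the application actually uses.

```python
def visible_nodes(root, children, allowed_in):
    """Walk the tree one level at a time, pruning denied nodes and their subtrees.

    children:   dict mapping a node to a list of its child nodes (hypothetical shape)
    allowed_in: callable taking a list of nodes and returning the set the
                viewer may see; called once per tree level
    """
    visible = []
    level = [root]
    while level:
        permitted = allowed_in(level)  # one permission scan per level
        kept = [node for node in level if node in permitted]
        visible.extend(kept)
        # Children of a denied node are never enqueued, so the whole
        # subtree disappears without being checked.
        level = [child for node in kept for child in children.get(node, [])]
    return visible


# Illustrative data only.
tree = {
    "root": ["docs", "hr"],
    "docs": ["spec", "notes"],
    "hr": ["salaries"],
}
denied = {"hr"}
scans = []


def allowed_in(nodes):
    scans.append(list(nodes))  # record each scan to count them
    return {node for node in nodes if node not in denied}


result = visible_nodes("root", tree, allowed_in)
print(result)      # ['root', 'docs', 'spec', 'notes']
print(len(scans))  # 3, one scan per level of the tree
```

Note that "salaries" is dropped even though it is not denied itself, because its parent was pruned before its level was built.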
>We added a point of failure, as the permissions table can get out of sync with the actual data.
>The main risk with pre-computed permissions is data getting out of sync.
It would make sense to have permissions be a first class concept for databases and to ensure such a desync could never happen. Data being readable or writable only by specific users is a very common requirement, so it would be worth having first class support for it.
I'm struggling to understand what issue the author is getting at. The point of a database is that it's ACID compliant: wrap the inserts/updates/deletes in a transaction and no such drift would occur. What am I missing?
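As a rough illustration of the transactional approach, here is a sketch using Python's built-in `sqlite3` module. The schema and the `create_document` helper are assumptions made for the example, and it only covers the case where the data and the pre-computed permissions live in the same database and are written through the same code path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one table for the data, one for pre-computed grants.
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, owner TEXT NOT NULL);
    CREATE TABLE permissions (
        doc_id  INTEGER NOT NULL,
        grantee TEXT NOT NULL,
        PRIMARY KEY (doc_id, grantee)
    );
""")


def create_document(conn, doc_id, owner, viewers):
    """Insert a document and its permission rows as one atomic unit."""
    grants = [(doc_id, user) for user in [owner, *viewers]]
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "INSERT INTO documents (id, owner) VALUES (?, ?)", (doc_id, owner)
        )
        conn.executemany(
            "INSERT INTO permissions (doc_id, grantee) VALUES (?, ?)", grants
        )


create_document(conn, 1, "alice", ["bob"])

# The duplicate grantee violates the primary key after the document row
# was already inserted; the rollback removes that document row as well.
try:
    create_document(conn, 2, "bob", ["carol", "carol"])
except sqlite3.IntegrityError:
    pass

doc_count = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
grant_count = conn.execute("SELECT COUNT(*) FROM permissions").fetchone()[0]
print(doc_count, grant_count)  # 1 2
```

The failed call leaves neither a document without grants nor grants without a document, which is the "no drift" property the comment refers to.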
Why is it a useful property that everything is always "in sync"? I propose this is not possible anyway. These systems are always asynchronous, and the time of check is always before the time of use, and it is always possible that a revocation occurs between them, and this problem cannot be eliminated.
Another approach to complex requirements without spending a lot of time querying databases is to use bitmaps. A set of permissions can be expressed through a bitmap and all you need to do in code is to "decode" that to what you actually let the user do.
The downside to this approach is that it requires some planning, and you have to maintain in code which mask maps to which permission(s).
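A small sketch of the bitmap idea, assuming Python's standard `enum.IntFlag`. The permission names and bit assignments are made up for illustration; the class is exactly the "which mask maps to which permission" table that has to be maintained in code.

```python
from enum import IntFlag


class Perm(IntFlag):
    # Hypothetical permissions; each one owns a single bit.
    READ = 1
    WRITE = 2
    DELETE = 4
    SHARE = 8


def decode(mask: int) -> list[str]:
    """Turn a stored integer bitmap into the names of the permissions it grants."""
    return [perm.name for perm in Perm if mask & perm]


stored = int(Perm.READ | Perm.DELETE)  # what would sit in a single integer column
print(stored)          # 5
print(decode(stored))  # ['READ', 'DELETE']

# Checking a single permission is one bitwise AND, with no extra query.
can_write = bool(stored & Perm.WRITE)
print(can_write)       # False
```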
I only did a quick read of permit.io's offering, but IIRC they don't focus on hierarchical data. If having access to a resource cannot grant access to an unbounded number of other independent resources (e.g. sharing a folder), then almost all of the issues in the article disappear.
tekkk|2 months ago
jschorr|2 months ago
[1]: https://zanzibar.tech/24uQOiQnVi:1T:4S
[2]: https://zanzibar.tech/21tieegnDR:0.H1AowI3SG:2O
eliocs|2 months ago
svaha1728|2 months ago
smarx007|2 months ago
Xmd5a|2 months ago
Fine-grained authorization as an incremental computation problem
eliocs|2 months ago
gneray|2 months ago
bencyoung|2 months ago
calderwoodra|2 months ago
casper14|2 months ago
nh2|2 months ago
julik|2 months ago
Trees with RDBMSes do stay a pain, though :-)
charcircuit|2 months ago
eliocs|2 months ago
valiant55|2 months ago
jeffbee|2 months ago
the_arun|2 months ago
samarthr1|2 months ago
And that it is an internal Google system?
ExoticPearTree|2 months ago
bitweis|2 months ago
Scales on both the tech side and the human side, e.g. your product manager can add roles (with CI approval) without requiring engineering involvement.
(I'm biased but still true)
afiori|2 months ago