Atree: A simple and efficient pointer-free tree implementation

[+] tlack|2 years ago|reply

Always surprising to click through a link on HN and discover it is one's own work. For a time I was very interested in lightweight array-based implementations of common data structures and this one seemed particularly handy.

[+] spenczar5|2 years ago|reply

It sounds a little like it didn’t work out as well as you hoped. How did it fare?

I am interested because I have some scientific software (astrodynamics - propagating orbits in space) that would really benefit from a cache-friendly tree, and this seemed promising.

[+] 8372049|2 years ago|reply

Friendly heads up, "pseudo" is frequently misspelled "psuedo" in the readme.

[+] edflsafoiewq|2 years ago|reply

This is the representation I usually see used for the tree of nodes for skinning 3D models. There each node has a transform, and the most common operation is to recompute the world transform for all nodes, formed by composing the transform for each node with the one for all of its parents. If the array is sorted so that parents always precede their children, that's just a straight loop over the arrays

   for i = 0, num_nodes do
     if parents[i] == -1 then
       world_xforms[i] = xforms[i]
     else
       world_xforms[i] = world_xforms[parents[i]] * xforms[i]

[+] ninepoints|2 years ago|reply

I would hope your joint hierarchy is at least depth sorted or the loop as shown would not work. The proposed ATree has no order invariants...

[+] rav|2 years ago|reply

Lots of comments about iterating over children being O(N) for this style of trees. It's actually easy to generalize the atree design by e.g. adding pointers for "first child" and "next sibling" and potentially removing the parent pointers, if that's what you need in your application. I think the "Operations in psuedocode" section should simply state that there's no O(1) way to access children of a node - instead of recommending the O(N) approach, you should recommend changing the data structure to support the operations you need.

Storing nodes in arrays and using indices for pointers is a must whenever you're implementing algorithms on trees. I typically prefer using an array of structs, putting the key and the parent index next to each other, instead of putting them in separate arrays. If you need to access the children of a node, then be sure to consider if you can save memory by having a separate structure for leaves - remember that over half of the nodes will be leaves, so using space to store a child pointer on both internal nodes and leaves can be wasteful use of memory.

[+] akoboldfrying|2 years ago|reply

One way to maintain leaves efficiently (amortised O(1) time and space to add/remove) is to keep 2 extra vectors that "point at each other": L[] is a vector containing the indices of all leaves, and LP[] is a vector such that LP[i] stores the index within L[] of the value i if i is a leaf, and -1 otherwise. That is, for i a non-leaf, LP[i] == -1, while for i a leaf, L[LP[i]] == i.

To find the indices of all leaves: That's just L[].

To add a leaf i: Append i to L[], and set LP[i] to L.length - 1.

To remove a leaf i in constant time:

    j = L[L.length - 1]
    L[LP[i]] = j
    LP[j] = i
    LP[i] = -1.
    DropLastElement(L[])

[+] bicsi|2 years ago|reply

I find it a bit odd how this tree representation as a parents array (which is, by the way, I think the most basic representation in any CS course), got so much traction on HN. I think this goes to show how far a good presentation can drive a trivial idea. On top of that, it just casually presents suboptimal procedures for a lot of essential operations, without diving into too much details about the impact of the suboptimality. Good PR I guess…

[+] matt3210|2 years ago|reply

An integer which references another data location is a pointer. This project only replaces system pointers with home grown pointers

[+] 8372049|2 years ago|reply

I think in this context, "pointer-free" is meant to imply (spatial) locality of reference, no additional memory allocations, address-independent and presumably memory safety (but I didn't read the code and may be wrong about the last one).

[+] nine_k|2 years ago|reply

It's not exactly so right on the hardware level. Many architectures, including the ubiquitous x64, have an ability to add a direct offset to a memory-access operation (like mov), or efficiently compute the address with an offset and element size (like lea). This removes an extra pointer deteference, which may be hugely expensive.

With a cache-friendly structure like array, the one dereference you may need for the array access has a high chance to be served from L2 or L1, saving you a lot of clocks, because RAM has huge latency.

[+] naasking|2 years ago|reply

This is typically referred to as a handle rather than a pointer. Handles can permit more flexibility than pointers.

[+] nilamo|2 years ago|reply

> An integer which references another data location is a pointer.

A more useful, storage/serializable friendly, pointer perhaps.

[+] huhtenberg|2 years ago|reply

Oh, please.

There are technically correct nitpicks with some merit and there are trivial remarks like yours. It is perfectly clear what "pointer-free" was referring to in the post title.

[+] zeroCalories|2 years ago|reply

Looks nice. That said, if you want to reduce mallocs and encourage data locality, maybe you could try a more traditional tree implementation using a pool of nodes similar to thread pools. I've found that most of my problems related to those were solved by such techniques.

[+] OnlyMortal|2 years ago|reply

A custom pool for objects is often a good idea. Pre-allocate “enough” memory and dish them out as appropriate, releasing back to the pool when the reference count goes to zero.

You can also use the pool to bound the maximum size you’re able to allow.

[+] icsa|2 years ago|reply

See Aaron Hsu's excellent descriptions (using APL):

https://www.youtube.com/watch?v=hzPd3umu78g https://www.youtube.com/watch?v=X5_5MtOYNos

[+] mihaic|2 years ago|reply

One change I'd make to this structure would be to pack all the children of a node together. That way you could find out the list of all children of a node with a binary search, and they'd have some cache-friendly properties.

Inserting a child would need a memmove in O(N), but if edits are rare after an initial build it wouldn't be that bad.

[+] snovv_crash|2 years ago|reply

You could also have extra space allocated for where extra data should go. By aggressively rebalancing you should only need to do a full reallocation when the tree is actually getting pretty full.

[+] jiggawatts|2 years ago|reply

I was about to comment that this reminds me of Aaron Hsu’s Dyalog compiler, but it’s mentioned right there in the README as one of the inspirations.

IMHO, compilers are about an order of magnitude slower than they could be, and this type of tree representation could be one key element for fixing that.

One interesting approach would be to combine egg[1] with this style of vectorised trees for efficient optimisation passes.

[1] https://arxiv.org/pdf/2004.03082.pdf

[+] sinuhe69|2 years ago|reply

What is a array index if not a “pointer”? In fact, access an array member involves a pointer arithmetic: array_p + index * data_size

The fact that traditional tree implementation requires malloc is solely based on the wish to dynamically provide and remove of nodes. If the tree can have a limited size with no frequent delete operation, an array implementation is fine. Otherwise, expand an array can be a very costly or even an impossible operation in some circumstances.

And full scanning is not efficient.

[+] kleiba|2 years ago|reply

The difference lies in the CPU cache.

[+] emmanueloga_|2 years ago|reply

A lot of people commenting about "an index into an array is also a pointer", I thought people commonly referred to integer indexes of this kind as handles, or index-handles? (like in this article [1]).

This way of representing trees reminds me of two classic data structures: heaps [2] and disjoint-sets [3].

--

1: https://floooh.github.io/2018/06/17/handles-vs-pointers.html

2: https://en.wikipedia.org/wiki/Heap_(data_structure)

3: https://en.wikipedia.org/wiki/Disjoint-set_data_structure

[+] amluto|2 years ago|reply

> Pointers are annoying anyway.

But the parent indices are pointers. Not in the index-into-all-memory sense but in the offset-into-array sense. They’re smaller, type safe, inherently well packed, and can’t point outside the instance of the data structure in question if they’re bounds checked. But I would still think of them as a sort of pointer.

So this is just a tree, with only parent pointer, in SOA form, and iterable (in no particular order). Which is maybe useful if you want that specific data structure.

And it’s utterly useless if you want, say, a tree with children and need better-than-O(n) access to children. Which you probably do need. Of course, you can add child pointers in SOA form if you want.

(SOA is Struct Of Arrays)

[+] alexchamberlain|2 years ago|reply

I think you could achieve something close to (I haven't sat down and thought about it) O(logN) with sorted arrays? You'd have the downside of inserting things in the middle, but I think that would be lost in cache efficiency elsewhere for many use cases.

[+] unknown|2 years ago|reply

[deleted]

[+] catlifeonmars|2 years ago|reply

I feel like atree in TFA is just giving a name to something I do all the time when it makes sense to so. For example, I used something similar recently when building a ECS entity tree for a game engine.

[+] irreducible|2 years ago|reply

For anyone interested, I sketched out a basic implementation of this in Rust. https://github.com/irreducible-io/apter-tree

[+] conradludgate|2 years ago|reply

Petgraph uses a similar design (two vecs) for storing graphs https://docs.rs/petgraph/latest/petgraph/graph/struct.Graph....

[+] readthenotes1|2 years ago|reply

This reminds me of data structures used to represent data on the old tape devices

[+] mamcx|2 years ago|reply

Oh, cool.

I made my own attempt at the same kind of idea at https://elmalabarista.com/blog/2022-flat-tree/.

Is pretty simple actually. What I have observed (and my tree exploit) is that most tree are "too much" and I only have cared by pre-order tree and "sequentially" scan is the major op.

So, the major takeaway is that if your case is simple, Go! Go! Vec!

[+] gumby|2 years ago|reply

A pointer is simply an index into an array which is RAM.

[+] lelandbatey|2 years ago|reply

Conceptually, yes. There are a lot of finer points to this concept though. One of the points most relevanthere is that accessing "RAM" is not _actually_ an O(1) operation, due to CPU caches which can speed up sequential access of "close together" data. Thus, if you manually ensure that data is "close together" (by putting it all in an array together), you can massively speed up your program in the real world because now all your keys/data can be squeezed into the cache at once.

[+] senderista|2 years ago|reply

It's all just names, all the way down.

https://www.dropbox.com/s/mmi46lxk8ipoci0/Principles%20of%20...

[+] mrkeen|2 years ago|reply

But they point into your RAM, which means you can't store them on disk or pass them over the network.

[+] zappb|2 years ago|reply

Pointers are a bit more abstract than that. Since each program has a virtual address space, and said address space is made up of memory pages, even raw pointers need to be translated to the physical address via TLBs. In that sense, indices into arrays are more like pointers than the other way around.

[+] stevage|2 years ago|reply

Interesting the casual references to languages K, J and Q. I have never heard of them, and they are hard to google. Anyone know?

[+] spenczar5|2 years ago|reply

Oh boy, they are weird. They are all related to APL. Extremely terse, dense languges. Got some traction in actuarial and financial circles.

Arthur Whitney is the origin of much of that family of languages. Bryan Cantrill’s interview with him is good: https://dl.acm.org/doi/pdf/10.1145/1515964.1531242

[+] mgaunard|2 years ago|reply

K and Q are the languages behind KDB.

K is the lower-level language and is a variant of APL. Q is the higher-level one, bringing in elements of SQL.

J is another APL-like programming language, unrelated to KDB, by the actual creator of APL.

All are easy to google by adding "programming language" to your query, and each has a wikipedia page.

[+] laszlokorte|2 years ago|reply

They are array languages. If you have never heart of array languages check out the code_report youtube channel.

[+] ufo|2 years ago|reply

I'm curious about the performance of separating the data from the pointers. On one hand, homogenous arrays pack better. On the other, the traditional version that packages the data and child pointers next to each other might have better cache locality.

[+] timrobinson333|2 years ago|reply

I'm surprised nobody has pointed out what seems to be a fairy obvious limitation: the children within a parent cannot be ordered (unless you want to splice them in the middle of the arrays when inserting which would be pretty inefficient)

[+] alpaca128|2 years ago|reply

If all you care about is the order and not that children are neighbours in the array it should be fine.

105 comments