Hi, thank you for the in depth response. I really needed to hear these things. I have gone ahead and addressed almost all of the issues that you pointed out. I updated the root to be the only point for compare and swap and on mutations I do a clone instead of mutating the shared pointer. I updated set bit from xor to set and have an explicit clearbit fn for deletions. I also updated k/v aliasing to copy slices on write so that old copies do not share ref to same slice. The node pool has been removed completely for the time being as well. I added in additional tests to test for these cases and added in more concurrent tests as well. this was huge feedback and I really appreciate it.
kiitos|8 months ago
For example this code
https://github.com/sirgallo/cmapv2/blob/6bcaa0253b1b0b261e8a...
and in particular its use of this code
https://github.com/sirgallo/cmapv2/blob/6bcaa0253b1b0b261e8a...
is still completely unsound.
Looking at *only this code path* -- and there are *many more* --
---
Put
- Snapshots the current root pointer with atomic.LoadPointer
- Makes an updated root pointer via putRecursive, given the snapshotted root pointer
- Spins on a CAS of the root ptr and the updated ptr with runtime.Gosched() between attempts
---
atomic.LoadPointer isn't a real snapshot
- It's atomic only over the root ptr, not any interior field
- Those interior fields are mutated in-place via e.g. setBitmap, setChild, etc.
- Any goroutine can see partial data, violating the memory model, etc.
---
putRecursive is unsound
- copyNode performs a shallow copy, child pointers are shared, subsequent setChild, extendTable, etc. mutate nodes other goroutines can still hold -- this is a fundamental bug that seems to remain un-addressed from the previous review
- Those mutations use plain writes (no atomics/locks/etc.) -- data race, memory model violation, etc.
- Get later returns the internal []byte slice directly -- data race, memory model violation, etc.
- Newly created nodes are cast to unsafe.Pointer without an atomic store, bypassing the write barrier required by the GC
---
That compareAndSwap is unsound
- It compares only the root pointer, a shallow copy
- After a successful CAS other writers can still mutate any shared children (see above), so readers following any shared path see data races, memory model violations, etc.
- The retry loop can livelock, details elided
---
The implementation still seems to confuse "atomic pointer swap" with "atomic update of a complex, shared value", misunderstands the requirements of the Go memory model, and consistently mis-uses unsafe.Pointer.
tl;dr here is probably to just stop using package unsafe altogether, until you have some time to properly understand its semantics, requirements, and limitations...
sirgallo|8 months ago
sirgallo|8 months ago