(no title)
exacube | 2 years ago
There is a bug in the compaction method:
def compact(sstables, output_sstable):
# Ordered by ascending key. pop() results in the item of the smallest key.
heap = heapq.heapify([(sstable.next(), sstable) for sstable in sstables])
while (item, sstable) := heap.pop()
if not item.is_tombstone():
output_sstable.write(item)
You should only skip tombstones when you are compacting the final (i.e., largest) level, instead of between every level. Otherwise, an entry in a lower level will unmask itself because the tombstone in the upper level was compacted away.It's one of the properties of LSM-based DBs that deletions/tombstones records linger for a long time, though some databases (eg RocksDB) put in some optimizations to get around this.
tontinton|2 years ago
exacube|2 years ago
vlovich123|2 years ago