hyc_symas | 2 years ago
And of course, none of this is working in a realtime environment, so a couple nanoseconds here or there just doesn't matter.
vlovich123|2 years ago
I really don’t understand this defensiveness and doubling down on something that’s pretty well studied and documented. Keep in mind that the original paper looked at raw mmap vs raw io api issues regardless of the surrounding DB: https://www.cidrdb.org/cidr2022/papers/p13-crotty.pdf
It’s a really well written and accessible paper and this is not the hill to die on if you’re disagreeing. I agree that mmap for the read path only is a weakness in their paper, and that idea has some merit. But cross-TLB shootdowns are a real performance downside of this approach.
hyc_symas|2 years ago
We tell no lies. Those tests were run on an HP DL585 G5 server with 128GB of RAM and 4 quad core AMD Opteron 8354 CPUs. If cross-TLB shootdowns were actually as big a problem as they claim, there should be massive latency spikes due to cross-socket communication, far worse than for inter-core single socket.
You're definitely misreading the graphs, the Y axis is clearly labeled "msec". You're going to have to pay better attention if you want to discuss this further. Otherwise we're just wasting our time.
> Keep in mind that the original paper looked at raw mmap vs raw io api issues regardless of the surrounding DB
Yes, but that's exactly why the paper is utterly worthless and never should have made it past peer review. Their initial claim is this:
> Unfortunately, mmap has a hidden dark side with many sordid problems that make it undesirable for file I/O in a DBMS. As we describe in this paper, these problems involve both data safety and system performance concerns. We contend that the engineering steps required to overcome them negate the purported simplicity of working with mmap. For these reasons, we believe that mmap adds too much complexity with no commensurate performance benefit and strongly urge DBMS developers to avoid using mmap as a replacement for a traditional buffer pool.
To prove their claims they would have to have demonstrated that a DBMS using mmap was more complex and less reliable than one using a traditional buffer pool. They also had to demonstrate that a DBMS using mmap had no performance benefit vs a DBMS using a traditional buffer pool. They didn't demonstrate any of these things, and LMDB is concrete proof that all of these claims are wrong: at only 7KLOCs it is simpler than every other traditional DBMS, its reliability is literally perfect, and its read performance is always orders of magnitude better than all traditional designs.
The reviewers should have known just from reading the abstract that this paper was a dud:
> Such problems make it difficult, if not impossible, to use mmap correctly and efficiently in a modern DBMS.
Their claim required them to prove a negative, which obviously defies the rules of logic. Everything after that was just noise.
arandomusername|2 years ago
Him doubling down makes sense, given that LMDB continues to offer the best read performance in any benchmark, regardless of who performed it.
If TLB shootdown times are such a problem, why aren't we seeing that reflected in any DB benchmark?