nvanbenschoten's comments

nvanbenschoten | 1 month ago | on: ANN v3: 200ms p99 query latency over 100B vectors

(author here) The 92% mentioned in this post shows recall@10 across all 100B vectors, calculated by comparing against the global top_k.

turbopuffer will also continuously monitor production recall at the per-shard level (or on-demand with https://turbopuffer.com/docs/recall). Perhaps counterintuitively, the global recall will actually be better than the per-shard recall if each shard is asked for its own, local top_k!
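To make the recall math concrete, here's a toy sketch (illustrative only, not turbopuffer's implementation; the 4-way shard split and random scores are assumptions) of recall@k and the shard-merge step. It shows why global recall can only benefit from each shard returning its own local top_k: the exact global top_k is always contained in the union of the per-shard local top_k sets, so with exact per-shard search the merged recall is 1.0.

```python
import random

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that a candidate result recovered."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

random.seed(0)
# Toy "similarity to the query" score for each of 10,000 vector ids.
scores = {i: random.random() for i in range(10_000)}

def top_k(ids, k=10):
    return sorted(ids, key=lambda i: -scores[i])[:k]

exact = top_k(range(10_000))  # ground truth: exact global top-10

# Four hash-style shards, each asked for its own local top-10.
shards = [range(s, 10_000, 4) for s in range(4)]
candidates = [i for shard in shards for i in top_k(shard)]

merged = top_k(candidates)  # re-rank the merged candidate pool
print(recall_at_k(merged, exact, 10))
```

With approximate per-shard search, a shard's local misses often aren't in the global top_k at all, which is why the merged global recall ends up above the average per-shard recall.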

nvanbenschoten | 2 years ago | on: Jepsen: MySQL 8.0.34

> Cockroach only allow SERIALIZABLE transactions so the unviability of SERIALIZABLE seems questionable.

We've actually been hard at work on adding Read Committed and Repeatable Read isolation into CockroachDB. The risks of weak isolation levels are real, but they do have a role in SQL databases. We did our best to avoid the pitfalls and inconsistencies of MySQL and even PostgreSQL by defining clear read snapshot scopes (statement vs. transaction).
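As a toy illustration of what "statement vs. transaction" read snapshot scopes mean (a minimal MVCC model, not CockroachDB's actual implementation), the sketch below pins one snapshot for the whole transaction under Repeatable Read, while Read Committed takes a fresh snapshot per statement:

```python
# Toy MVCC store: key -> list of (commit_ts, value) versions.
store = {"x": [(1, "a")]}
clock = 1

def commit(key, value):
    global clock
    clock += 1
    store[key].append((clock, value))

def read(key, snapshot_ts):
    """Return the latest committed version at or below snapshot_ts."""
    return max(v for v in store[key] if v[0] <= snapshot_ts)[1]

# REPEATABLE READ: one snapshot scoped to the whole transaction.
rr_snap = clock
first = read("x", rr_snap)
commit("x", "b")             # a concurrent writer commits
second = read("x", rr_snap)  # still sees "a"

# READ COMMITTED: a fresh snapshot scoped to each statement.
rc_first = read("x", clock)   # snapshot taken now -> sees "b"
commit("x", "c")
rc_second = read("x", clock)  # new statement, new snapshot -> sees "c"
print(first, second, rc_first, rc_second)  # a a b c
```

Defining the snapshot scope explicitly like this is what avoids the per-engine surprises in how weaker isolation levels behave across statements.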

The preview release for both will be dropping in Jan. Some links if you're interested: - RFC: https://github.com/cockroachdb/cockroach/blob/master/docs/RF... - Hermitage test: https://github.com/ept/hermitage/blob/master/cockroachdb.md

nvanbenschoten | 3 years ago | on: Back from the Future: Global Tables in CockroachDB

I don't think this scheme provides the "monotonic reads" property discussed in the blog post. Specifically, it would be possible for a reader to observe a new value from r2 (who received a timely heartbeat), then to later observe an older value from r3 (who received a delayed heartbeat). This would be a violation of linearizability, which mandates that operations appear to take place atomically, regardless of which replica is consulted behind the scenes. This is important because linearizability is compositional, so users of CockroachDB and internal systems within CockroachDB can both use global tables as a building block without needing to design around subtle race conditions.
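The interleaving described above can be made concrete with a toy model (illustrative only; the timestamps and heartbeat mechanics are assumptions) in which each replica serves reads as of its last received heartbeat timestamp:

```python
# Version history of one key: (commit_ts, value) pairs.
versions = [(10, "old"), (20, "new")]

def read(replica_heartbeat_ts):
    """A replica answers with the latest value at or below its heartbeat."""
    return max(v for v in versions if v[0] <= replica_heartbeat_ts)[1]

r2 = read(replica_heartbeat_ts=25)  # timely heartbeat   -> sees "new"
r3 = read(replica_heartbeat_ts=15)  # delayed heartbeat  -> sees "old"
print(r2, r3)  # the later read observes an older value
```

The second read going backwards in time is exactly the monotonic-reads (and hence linearizability) violation: which value you see depends on which replica's heartbeat happened to arrive on time.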

However, for the sake of discussion, this is an interesting point on the design spectrum! A scheme that provides read-your-writes but not monotonic reads is essentially what you would get if you took global tables as described in this blog post, but then never had read-only transactions commit-wait. It's a trade-off we considered early in the design of this work and one that we may consider exposing in the future for select use cases. Here's the relevant section of the original design proposal, if you're interested: https://github.com/cockroachdb/cockroach/blob/master/docs/RF....
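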

nvanbenschoten | 5 years ago | on: Postgres scaling advice

Hi cuu508, CockroachDB engineer here. You are correct that row-level partitioning is not supported in the OSS version of CRDB. However, it sounds like there's a bit of confusion about where manual table partitioning is and is not needed. The primary use-case for row-level partitioning is to control the geographic location of various data in a multi-region cluster. Imagine a "users" table where EU users are stored on European servers and NA users are stored on North American servers.

If you are only looking to scale write throughput, then manual partitioning is not needed. This is because CRDB transparently performs range partitioning under the hood on all tables, so all tables scale automatically in response to data size and load. If you are interested in learning more, https://www.cockroachlabs.com/docs/stable/architecture/distr... discusses these concepts in depth.
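A toy sketch of that kind of automatic range partitioning (illustrative only, not CRDB's implementation; the size threshold and in-memory lists are assumptions): a sorted key space splits itself into more ranges as data grows, with no manual partitioning step.

```python
from bisect import bisect_right

MAX_RANGE = 4      # toy split threshold (keys per range)
ranges = [[]]      # each range is a sorted list of keys
boundaries = []    # range i holds keys < boundaries[i]

def insert(key):
    """Route the key to its range; split the range if it grows too big."""
    i = bisect_right(boundaries, key)
    r = ranges[i]
    r.append(key)
    r.sort()
    if len(r) > MAX_RANGE:
        mid = len(r) // 2
        ranges[i:i + 1] = [r[:mid], r[mid:]]  # split range in half
        boundaries.insert(i, r[mid])          # new split key

for k in ["a", "b", "c", "d", "e", "f", "g"]:
    insert(k)
print(len(ranges), boundaries)  # splits happened automatically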

nvanbenschoten | 6 years ago | on: Parallel Commits: A New Atomic Commit Protocol for Distributed Transactions

> If it's PENDING, reader just ignores it and skips its data, since they use MVCC. There's no waiting here.

I think this is where the confusion is coming from. You're correct that a read can simply ignore writes, even pending ones, at higher timestamps due to MVCC. This improves transaction concurrency.

However, if a read finds a provisional write (an intent) at a lower timestamp, it can't just ignore it. It needs to know whether to observe the write or not. So it looks up the write's transaction record and may have to wait. If the write transaction is not finalized then it needs to either wait on the transaction to finish or force the transaction's timestamp up above its read timestamp. This is true regardless of parallel commits or not.
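The decision procedure above can be sketched as a toy function (a simplified model, not CockroachDB's code; the status names and the push/wait choice are condensed for illustration):

```python
def handle_intent(intent_ts, txn_status, read_ts, can_push):
    """What a read at read_ts does when it encounters an intent."""
    if intent_ts > read_ts:
        return "ignore"       # future write: MVCC lets the read skip it
    if txn_status == "COMMITTED":
        return "observe"      # committed below read_ts: value is visible
    if txn_status == "ABORTED":
        return "skip"         # intent will be cleaned up; not visible
    # PENDING intent below read_ts: either push the writer's timestamp
    # above read_ts so the write no longer conflicts, or wait for it.
    return "push" if can_push else "wait"

print(handle_intent(5, "PENDING", 3, True))   # ignore
print(handle_intent(2, "PENDING", 3, True))   # push
print(handle_intent(2, "PENDING", 3, False))  # wait
```

The key asymmetry is the first branch versus the last: only intents at or below the read's timestamp force the read to find out how the writing transaction turned out.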

What parallel commits gets us is a faster path to transaction commit, as irfansharif pointed out below. So the write can not only be committed faster with parallel commits, but it can also be resolved faster to get out of other reads' way. In that way, it improves both the synchronous latency profile and the contention footprint of transactions, assuming no coordinator failures.
