databrecht's comments
databrecht | 4 years ago | on: Prisma – ORM for Node.js and TypeScript
databrecht | 5 years ago | on: Latency Comparison: DynamoDB vs. FaunaDB vs. Redis
databrecht | 5 years ago | on: Ask HN: SQL or NoSQL?
Mixing paradigms in one database is probably going to become more common. Just as Postgres offers a 'document' style, some document databases are offering documents with relations. It wouldn't surprise me to see document databases offer optional schemas. I think the future is a mix of options and tools in one database (of which JSONB columns are a first step). Depending on the situation, we'll simply model differently. The best database might become the one that lets us use these different tools most elegantly together. The difference between a document and a table is only a 'flatten' and a 'schema' away.
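As a toy illustration of that last sentence (a sketch in Python, all names mine): a nested document becomes a flat row once you flatten it, and the "schema" is then just the agreed-upon set of column names.

```python
# Toy illustration: a document is a 'flatten' and a 'schema' away from a row.
# All names here are made up for the example.

def flatten(doc, parent_key="", sep="."):
    """Flatten a nested document into one flat row with dot-separated keys."""
    row = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            row.update(flatten(value, full_key, sep))
        else:
            row[full_key] = value
    return row

doc = {"id": 1, "author": {"name": "ada", "country": "UK"}, "likes": 3}
row = flatten(doc)
print(row)  # {'id': 1, 'author.name': 'ada', 'author.country': 'UK', 'likes': 3}

# The 'schema' is just the agreed-upon column names:
print(sorted(row))
```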
databrecht | 5 years ago | on: Ask HN: SQL or NoSQL?
SQL will (maybe sadly?.. maybe not?) not go away. Many so-called 'NoSQL' databases are looking into providing SQL, or already provide it (with or without limitations), because their users just want to use what they know. I would be stoked for an SQL v2 standard!
databrecht | 5 years ago | on: Ask HN: SQL or NoSQL?
Graph databases are considered 'NoSQL', yet they have relations and transactions. Being schemaless is also often listed as a NoSQL property, but that is a bit strange as an attribute too: some traditional databases offer schemaless options, and a database like Cassandra has a schema yet is considered NoSQL. I work at Fauna, which has relations and stronger consistency than many traditional databases. It is schemaless at this point, but that might change in the future. Since it doesn't offer SQL, it's thrown into the NoSQL bucket with all the assumptions that come along with it.
None of these one-liners in computer science make sense IMHO, and we listen way too often to colleagues who use them. Similarly, "Use SQL for an enforced schema" might be accurate in many cases, but in essence it depends on your situation, and we need to research what we use instead of following one-liners ;)
databrecht | 5 years ago | on: Ask HN: SQL or NoSQL?
The type of join shouldn't be a problem; SQL engines should in most cases be able to determine the best join. In the cases where they can't, you can start tweaking (tricky to get right, especially if your data evolves, but possible; you probably want to pin your query plan). B is, however, tricky and a performance loss, since it's a bit silly that data is flattened into a set each time, only to then (probably) be put into a nested (object-oriented or JSON) format to deliver to the client. This is closely related to C: in a social graph you might have nodes (popular people or tweets) with a much higher number of links than others. That means if you do a regular join on tweets and comments and sort it, you might not get beyond the first tweet. Instead, you probably only want the first x comments per tweet, which might result in a number of nested groups. So it might look more like the following SQL (written from memory, probably not correct):
SELECT tweet.*, jsonb_agg(to_jsonb(comment)) ->> 0 AS comments
FROM tweet JOIN comment ON tweet.id = comment.tweet_id
GROUP BY tweet.id HAVING COUNT(comment.tweet_id) < 64 LIMIT 64
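A small runnable sketch of the same "first N comments per tweet" idea, using Python's bundled SQLite (the Postgres query above uses jsonb_agg; here GROUP_CONCAT stands in for it, and table/column names are made up):

```python
# Sketch: limit each tweet to its first N comments, then aggregate the
# survivors back into one row per tweet. SQLite stands in for Postgres.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tweet (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, tweet_id INTEGER, body TEXT);
    INSERT INTO tweet VALUES (1, 'hello'), (2, 'world');
    INSERT INTO comment VALUES (1, 1, 'first'), (2, 1, 'second'),
                               (3, 1, 'third'), (4, 2, 'only');
""")

# Rank comments within each tweet, keep at most 2 per tweet, aggregate.
rows = conn.execute("""
    WITH ranked AS (
        SELECT c.*, ROW_NUMBER() OVER (PARTITION BY c.tweet_id
                                       ORDER BY c.id) AS rn
        FROM comment c
    )
    SELECT t.id, t.body, GROUP_CONCAT(r.body, ', ') AS comments
    FROM tweet t
    JOIN ranked r ON r.tweet_id = t.id
    WHERE r.rn <= 2
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()
print(rows)  # tweet 1 keeps only its first two comments
```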
That obviously becomes increasingly complex if you want a feed with comments, likes, retweets, people, etc. all in one. There are reasons why two engineers who helped scale Twitter created a new database (https://fauna.com/), where I work. Although relational, the relations are done very differently: instead of flattening sets, you essentially walk the tree and join at each level. I made an attempt to explain that for the GraphQL case here: https://www.infoworld.com/article/3575530/understanding-grap...
TLDR: in my opinion you can definitely use a traditional relational database, but it might not be the most efficient choice due to the impedance mismatch. Relational applies to more than traditional SQL databases, though; a graph database or something like Fauna is also relational and would be a better match (Fauna is similar in the sense that its joins work much like a graph database's). Obviously I'm biased, though, since I work for Fauna.
databrecht | 5 years ago | on: Comparing Fauna and DynamoDB: Architecture and Pricing
We realize it's not the perfect solution and doesn't deliver the best developer experience at this point, though :)
databrecht | 5 years ago | on: Comparing Fauna and DynamoDB: Architecture and Pricing
In Fauna we have temporality as a first-class citizen. You can get efficient and cheap changesets by leveraging temporality, since you can just ask: "what has been added/removed in this collection or index match after timestamp X?", and you can combine that with an index that answers "what are the updated documents in a collection after a certain timestamp?". That gives you very cheap pull-based CDC.
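The pull-based pattern can be sketched generically (this is not Fauna's actual API, just the cursor idea in Python, with an in-memory event list standing in for the temporal log):

```python
# Generic sketch of pull-based CDC with a timestamp cursor.
# Not Fauna's API: the 'store' is an in-memory, ordered list of events.

class Store:
    def __init__(self):
        self.events = []  # (ts, action, doc_id), strongly ordered by ts
        self.clock = 0

    def write(self, action, doc_id):
        self.clock += 1
        self.events.append((self.clock, action, doc_id))

    def changes_after(self, cursor):
        """'What has been added/removed after timestamp X?'"""
        return [e for e in self.events if e[0] > cursor]

store = Store()
store.write("add", "doc1")
store.write("add", "doc2")

cursor = 0
batch = store.changes_after(cursor)     # first pull sees both writes
cursor = max(ts for ts, _, _ in batch)  # advance the cursor

store.write("remove", "doc1")
print(store.changes_after(cursor))      # [(3, 'remove', 'doc1')]
```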
We recently introduced a second possibility that allows for push-based streaming of documents. Document-based streaming allows you to open separate streams for multiple existing documents to get updates on them. This is only the first phase; sets (such as index matches or whole collections) are coming up. Streaming becomes cheaper if you want your data to be really live (<1s), which is excellent for UI redraws but could potentially also be used for CDC (probably in combination with the temporal features if you need to restart streams). Both push-based and pull-based are strongly ordered.
I describe how to get the query for the pull-based approach here: https://forums.fauna.com/t/example-custom-subscription-funct... And a blog on the streaming API can be found here: https://fauna.com/blog/live-ui-updates-with-faunas-real-time...
Once set streaming (next to document streaming) is out as well, we'll have two strong solutions, and you can choose what suits you best and is most efficient for your use case. Do you want instant UI redraws? Use push-based. Are you pulling a changeset every hour? Use pull-based.
databrecht | 5 years ago | on: Comparing Fauna and DynamoDB: Architecture and Pricing
I respectfully disagree :). I don't think that the combination of relations, strong consistency, flexible/powerful indexing, and a language that allows you to do complex conditional transactions or reads in one query is a subtle differentiator. Especially when you can maintain all those things while being multi-region and scalable (and you also get a flexible security system, can query back in time and/or query/alter history, and can get changesets cheaply). Of course, this post didn't go in depth on all of these since that's not its topic.
Many databases have limitations on the former and present workarounds that require you either to do a lot of work or to build something in such an inflexible, hard-coded way that it would be very hard to change. The mere fact that they present workarounds (which is essentially what a single-table design is, to me) indicates that there is a need for their users to work around it.
> So you go after people who aren't using DynamoDB at massive scale. Say, early-stage startup founders who want to be on DynamoDB from day 1 because someday their product will be Web Scale. But don't have a lot of time to carefully evaluate claims like this. They just say "10x cost reduction? Wow, Fauna is the new best DB!" Most of these guys fail, but a few of them are a runaway success (and would have been equally so if they'd used DynamoDB), are now stuck with Fauna whether they like it or not (but let's assume they like it as least as well as DynamoDB, maybe even slightly more), and are now listed as large scale users of Fauna on their website. You too could be a unicorn startup! Start using Fauna today!
I think you just described the life of a developer selecting <insert random new technology>. Technological advances are accelerating, and we don't have enough time to research them all, so we skim through the posts/docs and look at what other companies have done. I understand what you mean, and it's an everyday source of frustration to me as well that many chase new technologies based on one article. That's how many startups ended up with microservices they didn't need, and how a new SPA technology takes the world by storm every two years.
> Basically, I think the makers of Fauna are trying to con you with this article. It's not that their product is bad, it's that they're trying to get you to buy it for reasons other than that it's good.
The last sentence is quite unfair imho, though. Fauna is one of the databases that tries to be correct in its messaging and respects other products deeply. In my personal opinion, Dynamo and Fauna are very different products with a different focus. Dynamo focuses on a use case where you need scalability and sheer speed and are less interested in relations, many access patterns, or consistency across many collections. At the same time, they do appeal to people who need those features by presenting workarounds. Maybe someone from a relational background sees these workarounds, doesn't think it through, and then gets stuck in their inflexibility? Is Dynamo to blame? No, they are just helping their users with questions that often come back.

Similarly, 'how is Fauna different from Dynamo', and more importantly for this article 'how does pricing compare', are questions that often come back here. They are hard to answer, since the answer depends entirely on how you use the product and on many subtleties that are not visible at first glance. Do you need relations? A single-table approach would help, but it will also blow up your table with redundant data and therefore increase pricing, even though Dynamo looked cheap at first; it depends on the use case. If you do not research a product thoroughly, chances are you will run into a wall and be stuck, no matter whether the product is Dynamo, Fauna, Spanner, Firebase, etc. All we can do is provide as much detail as possible on what we can and can't do, and I think the Fauna docs and forums do quite a good job of that.
I am a developer advocate at Fauna, this reply is however entirely my personal opinion.
databrecht | 5 years ago | on: Comparing Fauna and DynamoDB: Architecture and Pricing
Latencies can be found here: https://status.fauna.com/ As a long-time fan of Fauna (who now works as a developer advocate for them after following them for two years), what was most attractive to me is the combination of features without compromises: scalability/distribution without losing consistency, relations, and powerful indexing (i.e. the best of NoSQL and traditional databases combined). I was personally also attracted by the temporality aspect.
databrecht | 5 years ago | on: Please stop calling databases CP or AP (2015)
databrecht | 5 years ago | on: Prisma Raises $12M Series A
databrecht | 5 years ago | on: Supabase (YC S20) – An open source Firebase alternative
If there is something missing in our GraphQL feature set, feedback is very welcome :).
databrecht | 5 years ago | on: Supabase (YC S20) – An open source Firebase alternative
Note: the performance impact depends heavily on whether your database maps well to ORMs. Traditional databases have an impedance mismatch when it comes to translating tables to an object format. Graph databases, or the way Fauna works (documents with relations and map/reduce-like joins), map well to ORMs, so the performance impact would be small.
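That mismatch can be seen in a few lines (a sketch in Python, names mine): a flat joined result set has to be regrouped into the nested objects the application actually wants, which is extra work a document/graph model avoids.

```python
# Sketch of the impedance mismatch: a flat join result repeats the tweet
# on every comment row and must be regrouped into nested objects.

flat_rows = [  # what a traditional SQL join returns
    {"tweet_id": 1, "tweet": "hello", "comment": "first"},
    {"tweet_id": 1, "tweet": "hello", "comment": "second"},
    {"tweet_id": 2, "tweet": "world", "comment": "only"},
]

def nest(rows):
    """Regroup flattened join rows into the nested shape the app wants."""
    tweets = {}
    for r in rows:
        t = tweets.setdefault(r["tweet_id"],
                              {"tweet": r["tweet"], "comments": []})
        t["comments"].append(r["comment"])
    return list(tweets.values())

print(nest(flat_rows))
# [{'tweet': 'hello', 'comments': ['first', 'second']},
#  {'tweet': 'world', 'comments': ['only']}]
```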