erikwitt's comments
erikwitt | 9 years ago | on: The AWS and MongoDB Infrastructure of Parse
That said, there are a lot of upsides to having a company work full-time on your proprietary cloud solution and ensure its quality and availability. If an open source project dies or becomes poorly maintained, you are in trouble too. Your team might not have the capacity to maintain such a complex project on top of their actual tasks.
Also, open-sourcing your platform is a big risk for a company. Take RethinkDB, for example: a great database and an outstanding team, but without a working business model, and most recently without a team working full-time, it is doomed to die eventually.
Nevertheless, we try to make migrating to and from Baqend as smooth as possible. You can import and export all your data and schemas, and your custom business logic is written in Node.js and can be executed anywhere. You can also download a community server edition (single-server setup) to host it yourself.
Still, a lot of users require proprietary solutions and the maintenance and support that come with them. And they often have good reasons, ranging from needing a maintenance-free platform to warranties or licensing issues. After all, a lot of people are happy to lock themselves into AWS even though solutions based on OpenStack, Eucalyptus, etc. are available.
erikwitt | 9 years ago | on: The AWS and MongoDB Infrastructure of Parse
- The first thing is that we do not read from slaves. Replicas are only used for fault tolerance, as is the default in MongoDB. This means you always get the newest object version from the server.
- Our default update operation compares object versions and rejects writes if the object was updated concurrently. This ensures consistency for single-object read-modify-write use cases. There is also an operation called "optimisticSave" that retries your update until no concurrent modification gets in the way. This approach is called optimistic concurrency control. With forced updates, however, you can override whatever version is in the database; in this case, the last writer wins.
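A minimal sketch of the version-compare-and-retry idea in plain Node.js; the in-memory store and the read, save, and optimisticSave names here are illustrative stand-ins, not Baqend's actual SDK API:

```javascript
// Hypothetical in-memory store: id -> { version, data }.
const store = new Map();
store.set('counter', { version: 1, data: { value: 0 } });

function read(id) {
  const entry = store.get(id);
  return { id, version: entry.version, data: { ...entry.data } };
}

// Reject the write if the stored version no longer matches the one read.
function save(obj) {
  const entry = store.get(obj.id);
  if (entry.version !== obj.version) {
    throw new Error('concurrent modification');
  }
  store.set(obj.id, { version: obj.version + 1, data: obj.data });
  obj.version += 1;
  return obj;
}

// Retry loop: re-read and re-apply the change until the save succeeds.
function optimisticSave(id, modify) {
  for (;;) {
    const obj = read(id);
    modify(obj.data);
    try {
      return save(obj);
    } catch (e) {
      // Lost the race; retry with the fresh version.
    }
  }
}
```

A forced update would simply skip the version check and overwrite whatever is stored, i.e. last writer wins.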
- We also expose MongoDB's partial update operators to our clients (https://docs.mongodb.com/manual/reference/operator/update/). With these, one can increment counters, push items into arrays, add elements to sets, and let MongoDB handle concurrent updates. With these operations, we do not have to rely on optimistic retries.
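For illustration, here are such partial updates as plain update documents; with the official MongoDB Node.js driver they would be passed as the update argument to collection.updateOne(). The field names (likes, comments, tags) are made up:

```javascript
// Update documents in the shape MongoDB's update operators expect.
const incLikes = { $inc: { likes: 1 } };            // atomically increment a counter
const pushItem = { $push: { comments: 'nice!' } };  // append an item to an array
const addToSet = { $addToSet: { tags: 'nosql' } };  // add an element only if absent

// e.g. await posts.updateOne({ _id: postId }, incLikes);
```

Because the operator is applied server-side, two clients incrementing the same counter concurrently both succeed without any retry loop.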
- The last and most powerful tool we are currently working on is a mechanism for full ACID transactions on top of MongoDB. I've been working on this at Baqend for the last two years and also wrote my master's thesis on it. It works roughly like this:
1. The client starts the transaction, reads objects from the server (or even from the cache using our Bloom filter strategy) and buffers all writes locally.
2. On transaction commit, all read versions and updated objects are sent to the server to be validated.
3. The server validates the transaction and ensures isolation using optimistic concurrency control. In essence, if there were concurrent updates, the transaction is aborted.
4. Once the transaction is successfully validated, updates are persisted in MongoDB.
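The four steps above can be sketched roughly as follows in Node.js; this is a simplified, single-threaded illustration of optimistic validation, not Baqend's actual implementation (names and data are made up):

```javascript
// "Server-side" state: committed versions and values per object id.
const versions = new Map([['a', 1], ['b', 1]]);
const data = new Map([['a', 10], ['b', 20]]);

// Step 1: the client remembers versions read and buffers writes locally.
function beginTransaction() {
  return { readSet: new Map(), writeSet: new Map() };
}

function txRead(tx, id) {
  tx.readSet.set(id, versions.get(id)); // remember version for validation
  return data.get(id);
}

function txWrite(tx, id, value) {
  tx.writeSet.set(id, value); // buffered; nothing hits the server yet
}

// Steps 2-4: send read versions and writes, validate, then persist.
function commit(tx) {
  for (const [id, readVersion] of tx.readSet) {
    if (versions.get(id) !== readVersion) {
      return false; // concurrent update observed -> abort (step 3)
    }
  }
  for (const [id, value] of tx.writeSet) {
    data.set(id, value); // persist only after validation (step 4)
    versions.set(id, versions.get(id) + 1);
  }
  return true;
}
```

In the real system the validation and the writes must happen atomically on the server; this sketch only shows the control flow of the optimistic scheme.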
There is a lot more in the details to ensure isolation, recovery, and scalability, and to make it work with our caching infrastructure. The implementation is currently in our testing stage. If you are interested in the technical details, this is my master's thesis: https://vsis-www.informatik.uni-hamburg.de/getDoc.php/thesis...
erikwitt | 9 years ago | on: The AWS and MongoDB Infrastructure of Parse
I guess 5 indexes might be a little short for some apps. On the other hand, too many or too large indexes can become a bottleneck too. In essence, you want to be quite careful when choosing indexes for large applications.
Also, some queries tend to get complicated, and choosing the best indexes to speed them up can be extremely difficult, especially if you want your algorithms to choose them automatically.
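As a toy illustration of automatic index selection, the sketch below orders fields for a compound index by the common "equality, sort, range" rule of thumb; suggestIndex is a hypothetical helper, not a real tool, and real query planners weigh far more factors:

```javascript
// Suggest a compound-index field order for a query: equality-matched
// fields first, then sort fields, then range-filtered fields last.
function suggestIndex(query) {
  const equality = [];
  const range = [];
  for (const [field, cond] of Object.entries(query.filter)) {
    if (cond !== null && typeof cond === 'object') {
      range.push(field);    // operator condition, e.g. { $gt: ... }
    } else {
      equality.push(field); // exact-match condition
    }
  }
  return [...equality, ...(query.sort || []), ...range];
}
```

For a query filtering on status equality and an age range while sorting by created, this yields the field order ['status', 'created', 'age'].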
erikwitt | 9 years ago | on: Building a Shop with Sub-Second Page Loads: Lessons Learned
If you compare using a distributed database to building sharding yourself for, say, a MySQL-backed architecture, NoSQL will most certainly be the better choice.
I'll admit, though, that dealing with NoSQL when you come from a SQL background isn't easy. Even finding the database that fits your needs is tough. We have a blog post dedicated to this challenge: https://medium.baqend.com/nosql-databases-a-survey-and-decis...
erikwitt | 9 years ago | on: Building a Shop with Sub-Second Page Loads: Lessons Learned
Fortunately, only www.baqend.com has this bug; it's not in our framework.
erikwitt | 9 years ago | on: Building a Shop with Sub-Second Page Loads: Lessons Learned
Which tools would you use to generate this amount of traffic?
erikwitt | 9 years ago | on: Building a Shop with Sub-Second Page Loads: Lessons Learned
What would, in retrospect, be your preferred approach to prevent users from executing inefficient queries?
erikwitt | 9 years ago | on: Building a Shop with Sub-Second Page Loads: Lessons Learned
We are currently investigating whether deep reinforcement learning is a good approach for detecting slow queries and making them more efficient by trying different combinations of indices.