top | item 33067962

Postgres WASM

887 points| kiwicopple | 3 years ago |supabase.com

185 comments

order
[+] pistoriusp|3 years ago|reply
Peter from Snaplet here. A month ago I saw the CrunchyData post and wanted to play around with the code that made it happen, it wasn't OSS so I asked for help:

> If anyone out there wants to work on an open source version of this full-time please reach out to me. [0]

Paul reached out and we started working on it almost immediately. Check out the repo here: https://github.com/snaplet/postgres-wasm

We have a blog post about some of the interesting technical challenges that we faced whilst building this: https://www.snaplet.dev/post/postgresql-in-the-browser

Like most things, this is built on-top of the amazing open-source projects that made this possible, but special mention goes to v86.js and buildroot. We just glued it together.

My hope is that we as a community can own this project and make PostgresQL, and the software that runs on it, accessible to a larger audience.

---

[0] Request for collaboration: https://news.ycombinator.com/item?id=32500526

[+] stuaxo|3 years ago|reply
Nice, I wonder if anyone has got Django with postgres working in this environment?
[+] sgt|3 years ago|reply
Didn't realize you guys are in SA. Hi from Stellenbosch. Cool to see more South Africans here and in the startup community in general.
[+] kiwicopple|3 years ago|reply
Hey HN, we’re excited about this launch. This was a collaborative effort with the team at Snaplet [0].

postgres-wasm is an embeddable Linux VM with Postgres installed, which runs inside a browser. It provides some neat features: persisting state to browser, restoring from pg_dump, logical replication from a remote database, etc.

The idea was inspired by CrunchyData’s HN post about a month ago [1]. We love the possibilities of Postgres+WASM, and so Supabase & Snaplet teamed up to create an open source version. The linked blog post explains the technical difficulties we encountered, and the architecture decisions we made.

We’re still working hard on this, but it’s at a good “MVP” stage where you can run it yourself. Snaplet are working on a feature where you can drag-and-drop a snapshot into your browser to restore the state from any backup. Supabase are exploring ways we can run the entire Supabase stack inside the browser. You can find the Snaplet repo here [2], and the Supabase fork here [3]. There’s very little difference between these two, we just have a different browser UI.

Both Supabase team and the Snaplet team will be in here commenting if you want to know anything else about the technical details.

[0] Snaplet: https://www.snaplet.dev/

[1] Crunchy post: https://news.ycombinator.com/item?id=32498435

[2] Snaplet repo: https://github.com/snaplet/postgres-wasm

[3] Supabase fork: https://github.com/supabase-community/postgres-wasm

[+] vineyardmike|3 years ago|reply
> an embeddable Linux VM with Postgres installed, which runs inside a browser.

Wow! I feel like this is the lede. How much work was done supporting the VM and OS privatives (eg networking) vs PG specific work? I feel like a minimal Linux in the browser opens up a LOT more opportunities than just a database.

When figma got bought out, a lot of articles were written about “where’s the wasm applications”, and I feel like throwing Linux into a browser really shows potential. One commenter already wondered if it could be used to compile microcontrollers (so creative, i now want that too), I wonder if it can be used similar to Repl.it, with packaging test environments.

To be very, very clear, I would LOVE a write up about just the linux portion of this interesting project.

[+] boltzmann-brain|3 years ago|reply
Do you have a benchmark of load times after the includes are cached? Depending on whether it's < 100 ms or > 1s it will make the applications for this very different.
[+] gurjeet|3 years ago|reply
The links at the bottom are broken; presumably the post was written in Markdown, HN doesn’t understand that.
[+] angelmm|3 years ago|reply
One of the things I love from this post is the collaboration on this kind of OSS projects.

Making Postgres Wasm helped:

- v86[0] to find a new bug

- Providing a great deep-dive article that will trigger new ideas in the future

- Showcase the possibilities of Wasm and how you can overcome the current challenges

I really appreciate these projects are OSS :)

Congratulations for the project!

[0] https://github.com/copy/v86

[+] MuffinFlavored|3 years ago|reply
Could I hypothetically pass USB through to v86? Like libusb on the host into v86 on the guest? Or do USB over IP or something?
[+] bearjaws|3 years ago|reply
The use cases here are going to be really wide spread in my opinion, just a few ideas off the cuff. Obviously the 30mb size means it won't really be for regular consumer apps, but for enterprise or specific tasks it can make a lot sense.

1. Training websites

2. Interview challenges involving SQL

3. Client side tooling that loads data into your local machine and displays into a SaaS web app without the SaaS app ever having your data

Appreciate the hard work from Supabase and Snaplet on this!

[+] burggraf|3 years ago|reply
Supabase developer here.

I've used this to move data from a live Supabase database down to the browser for testing and playing around with things in a "sandbox" environment. Then I save snapshots along the way in case I mess things up.

To move a table over from my Supabase-hosted postgres instance to the browser, I just exit out of psql and run something like this:

pg_dump --clean --if-exists --quote-all-identifiers -t my_table -h db.xxxxx.supabase.co -U postgres | psql -U postgres

Keep in mind if you try something like this, our proxy is rate limited for now to prevent abuse, so it might not be super fast. It's easy to remove rate limiting at the proxy, though.

[+] mrtksn|3 years ago|reply
IMHO the prime use case for all these WASM stuff is going to be platform independence. Web browsers are not that interesting because for regular use they already have ballooning resource use issues and making web apps even more resource intensive is not exactly inspiring, HOWEVER the web technologies are the only true multi-platform solution we have and it makes sense to use it to make everything with it and everything instantly becomes multi-platform.

What I suspect may happen is, the rise of web browsers of a 3rd kind where these are not really for browsing the web but running code written for native domains. So instead of browsing web of linked text, we can have a web of algorithms to process data and requests.

[+] alex_duf|3 years ago|reply
I can really appreciate the fun and technical challenge of running postgres in a browser. However the use cases are extremely far fetched.

1. training website: you can use a hosted PG, or use a sqlite wasm

2. same as above

3. if the use case is being offline, then the web browser isn't very relevant. If the use case is to avoid a load on the server, the sqlite in wasm will be just fine.

It's only if you go into triggers and such that it might start being relevant, but then I'd start seriously questioning what on earth are you trying to do :D

All of that to say: well done to the team that has done it, really fun and interesting work, I just can't see the use from where I stand.

[+] burggraf|3 years ago|reply
Also, in case you're curious, PostgreSQL logical replication to the browser wasm instance DOES work. I've done it. :)
[+] NoGravitas|3 years ago|reply
> Obviously the 30mb size means it won't really be for regular consumer apps

You know that it will end up being used for regular consumer apps. And once everyone is doing it, regular web pages being over 30MB and including an enterprise-grade SQL server engine will simply be accepted as normal, and everyone not doing it is a luddite.

[+] member_mini|3 years ago|reply
> 3. Client side tooling that loads data into your local machine and displays into a SaaS web app without the SaaS app ever having your data

Member mini mongo?

Supabase will be Meteor in no time.

[+] ASalazarMX|3 years ago|reply
Without seeing the marketing, I think running a full RDBMS inside your browser is not a great idea. Just idling it becomes my most CPU intensive Firefox tab, out of dozens, according to about:performance.

I shudder what performance a full-fledged application would demand. I know some people will embed this on an Electron app, for double the fun.

[+] mytherin|3 years ago|reply
Looks really nice, but I'm really curious about what hurdles you encountered in making a native WASM version of Postgres. It seems that both the performance and binary size suffer immensely from the VM running underneath. For example, if I compare to DuckDB-WASM [1] there is an immense difference in load time and query execution speed.

Are there future plans at creating a native WASM version of Postgres? Making it run many times faster would certainly open up a lot more use cases.

(Disclaimer: I work on DuckDB, but have not worked on the WASM version myself)

[1] https://shell.duckdb.org

[+] pistoriusp|3 years ago|reply
Snaplet person here: Having a native WASM version would be a huge win for everyone, especially is storage is decoupled from compute!

At the moment the CPU and memory snapshot of the VM (with Postgres) is 12 MB, and subsequent reloads are cached. So yeah, not the worst, but not great.

An optimization is that we're using 9P filesystem. So accessing anything on disk is lazily loaded over the network.

> Are there future plans at creating a native WASM version of Postgres?

Yup! I think that should be the goal, and we (Supabase & Snaplet) would be very happy to work with anyone that wants to build towards that.

[+] kiwicopple|3 years ago|reply
congrats to the DuckDB team for their WASM version. I've been following it from afar and it's very impressive.

> what hurdles you encountered in making a native WASM

I'm sure Mark & Peter can jump in with specifics but mostly it was due to complexity - there it probably can be done it's just that we took the path of least resistance.

> Are there future plans at creating a native WASM version of Postgres

We'd like that. If anyone would like to collaborate with Supabase + Snaplet to create a more "native WASM" version then please reach out

[+] mockery|3 years ago|reply
The first thing to point out is that our implementation isn't pure WASM. We attempted to compile Postgres for WASM directly from source, but it was more complicated that we anticipated.

Crunchy's HN post provided some hints about the approach they took, which was to virtualize a machine in the browser. We pursued this strategy too, settling on v86 which emulates an x86-compatible CPU and hardware in the browser.

I’m out-of-domain but very curious about this part - it seems like a pretty extreme solution with a lot of possible downsides. Does this mean the “just compile native code to WASM” goal is still far off?

[+] fdr|3 years ago|reply
The problem is you need operating system features to run Postgres as-is, e.g. mapping memory, forking processes, manipulating files. What is missing is a WASM kernel that skips the x86 emulation but implements enough of the other stuff.

For example, for just one of many hairy problems, consider that Postgres uses global variables in each backend for backend-local state (global state as such is in shared memory). How does this look in assembly, accounting for both the kernel and userspace components? This is the problem.

A general way to convey this is: the more system calls a piece of software uses, the more difficult a WASM target without architecture emulation becomes. And Postgres doesn't even obligate that many obscure ones.

[+] burggraf|3 years ago|reply
Supabase developer here. I've tried compiling directly to WASM, but it did not go well. As I recall, there were features used by PostgreSQL that WASM didn't support yet. This is definitely something we'll revisit though, especially as WASM matures!
[+] jhd3|3 years ago|reply
I was wondering if anyone had thought about using this to experiment with the planner.

The engineering and support teams at Greenplum, a fork of Postgres, have a tool (minirepro[0]) which, given a sql query, can grab a minimal set of DDLs and the associated statistics for the tables involved in the query that can then be loaded into a "local" GPDB instance. Having the DDL and the statistics meant the team was able to debug issues in the optimizer (example [1]), without having access to a full set of data. This approach, if my understanding is correct, could be enabled in the browser with this Postgres WASM capability.

[0] https://github.com/greenplum-db/gpdb/blob/6X_STABLE/gpMgmt/b...

[1] https://github.com/greenplum-db/gpdb/issues/5740#issuecommen... (has an example output)

[+] xani_|3 years ago|reply
Huh, thought for a second that was a WASM support for stored procedures or something and was excited for a second
[+] supamichelp|3 years ago|reply
Well it's not what you were expecting, but your idea sounds interesting too! Maybe you could muster the same open source forces to make it a reality?
[+] michaelmior|3 years ago|reply
This was also what I was expecting. Another reply to your comment mentioned Wasmer, which I found after a quick Google search. Unfortunately it looks like the last commit was almost 2 years ago and there's an issue where people are inquiring about the status of the project that hasn't seen a reply from the maintainers in about the same amount of time.
[+] pashkinelfe|3 years ago|reply
For me it's interesting can this work without the need of external proxy server which seems to be needed only to overcome browser connections limitations and nothing else. May there be some more "internal" way to implement/allow these connections?

Overall this seems an inspiring thing. Thanks!

[+] pistoriusp|3 years ago|reply
Snaplet person here: Totally. It cannot, and it very likely will never be able to do that. Raw sockets were added, and subsequently removed, from Chromium.

Do you mean, connect from 1 browser tab to another?

[+] burggraf|3 years ago|reply
Supabase developer here.

At this point, the proxy is necessary because all the major browsers block direct TCP/IP traffic. They allow websocket connections so that's how we're getting around it.

There have been proposals to open up TCP/IP traffic but they've all been shot down so for the security implications.

[+] BrandiATMuhkuh|3 years ago|reply
This is a really exciting development. It's super awesome to see again what the supabase team does. Thanks for your great work.

Here is my use case: We use firestore + PG. PG has a copy of all firestore data. And PG is used for search, aggregation, etc. (Everything firestore can't do). Sadly, the developer flow breaks during local development. Because each developer would need to have a local PG server running. My dream is to simply add PG via NPM and use the wasm version during local development. That way, everything simply works via NPM/node

I think an in-memory node only version of PG might even be simpler to achieve than the already developed approach. As it doesn't need the websocket workaround.

[+] tmpz22|3 years ago|reply
I expect to see WASM tooling expose that Google's stewardship of Chromium has left many, many, features to be desired. You can blame it on the bureaucracy and legacy cruft of other companies like Microsoft and Apple, but at the end of the day there's just so much stuff you can't do with web apps (and oh wow just happens to align with Google's Ad and Cloud businesses).
[+] robertlagrant|3 years ago|reply
Can you list the features you mean that Firefox and Safari have that the Chromium project has left out because of Google's business interests? Be interesting to see.
[+] burggraf|3 years ago|reply
So I've been using this for a couple weeks now on my mobile phone. There's really no reason to do this, other than I'm sitting in a car waiting for my wife to come out of the grocery store and I just feel like writing some some SQL code -- because I can. If you noticed, there's a little keyboard icon at the top of the demo that toggles on a mini keyboard with ^c, ^d and cursor keys. That's because the iPhone doesn't have those keys, and I'd start pinging a server and realize I can't break out of it and now I need to reset the whole emulator. Anyhow, that's why that little toggle is in there. :)
[+] jcuenod|3 years ago|reply
Hint: Don't use ctrl-w in their terminal. Turns out that has another meaning in Firefox...

[EDIT]: I'm zero for three thus far. I give up.

[+] rkagerer|3 years ago|reply
Aside from folks who already have Postgres infrastructure they want to replicate, what sort of applications would find this a better fit than using SQLite on the client side?
[+] Existenceblinks|3 years ago|reply
The most important piece of jigsaw left is writable local file access api. Yet, no energy put to do it! Security shouldn't be a stopper but a thing to be solved, probably only allow installed pwa app write with little cute settings widget like on mobile to see which apps have accessed to local machine's storage, showing path, usage size, etc. It's like conspiracy to leave Chrome do it alone. Strange!
[+] mathiasrw|3 years ago|reply
WASI will blow your mind
[+] mradek|3 years ago|reply
I haven't messed around with wasm but I like postgres a lot. Just curious what are some use cases for this? Also not a supabase user even tho it looks cool.

In my case, I use postgres along with postGIS for some of my services. Could this allow me to have some parity where the client can have a 1:1 table but populated and kept up-to-date with their own data to cut down on making network requests?

[+] spiderice|3 years ago|reply
There is a "why" section in the blog post that gives a number of examples. Namely:

- Documentation: for tutorials and demos.

- Offline data: running it in the browser for an offline cache, similar to sql.js or absurd-sql.

- Offline data analysis: using it in a dashboard for offline data analysis and charts.

- Testing: testing PostgresSQL functions, triggers, data modeling, logical replication, etc.

- Dev environments: use it as a development environment — pull data from production or push new data, functions, triggers, views up to production.

- Snapshots: create a test version of your database with sample data, then take a snapshot to send to other developers.

- Support: send snapshots of your database to support personnel to demonstrate an issue you're having.

edit: formatting

[+] statico|3 years ago|reply
Somewhat related, I've been working on a tool to help do ad hoc work on CSV files. My current solution is embedding SQLite in the browser: https://csvhacker.statico.io/

I feel like Postgres would make this even more powerful.

[+] danielvaughn|3 years ago|reply
Very impressive technically, but what are the practical use cases for this? We have IndexedDB and WebSQL, so apart from the other comments mentioning using it in tutorials, I'm not sure what the value add is.

edit: as always, I should read the whole article first. The idea of using it as a dev environment is very cool.

[+] jdmdmdmdmd|3 years ago|reply
>We have [...] WebSQL

Not any more - iirc it was depreciated.

In the case of IndexedDB, I haven't looked into it, but Mozilla has the following to say about it:

>Note: IndexedDB API is powerful, but may seem too complicated for simple cases. If you'd prefer a simple API, try libraries in See also section that make IndexedDB more programmer-friendly.[0]

I suppose this project could make development easier by allowing developers to share server-side code? And it has the benefit of already having a large userbase.

[0]https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_A...

[+] smallerfish|3 years ago|reply
There is room for serverless applications with a sane and reasonably complete data storage interface (which localstorage and indexeddb are not.) You could offer users the ability to backup data locally using the filesystem api (which is reasonably usable), or perhaps sync data using p2p using webrtc (which is not as usable as it should be).
[+] kiwicopple|3 years ago|reply
There are some things that are uniquely beneficial for Postgres beyond the data storage - the main one I'm excited about is using PostGIS inside the browswer.

OSM + PostGIS in the browser has the potential to do for Maps, what Figma's WASM approach did for design.