This is great news. Now I can utilize any CDN provider that supports S3. Like bunny.net [1] which has image optimization, just like Supabase does but with better pricing and features.
I have been developing with Supabase for the past two months. I would say there are still some rough corners in general and some basic features missing. For example, Supabase Storage has no direct support for metadata [2][3].
Overall I like the launch week and development they are doing. But more attention to basic features and little details would be needed because implementing workarounds for basic stuff is not ideal.
> I can utilize any CDN provider that supports S3. Like bunny.net
Bunny is a great product. I'm glad this release makes that possible for you and I imagine this was one of the reasons the rest of the community wanted it too
> But more attention to basic features and little details
This is what we spend most of our time doing, but you won't hear about it because they aren't HN-worthy.
> no direct support for metadata
Fabrizio tells me this is next on the list. I understand it's frustrating, but there is a workaround - store metadata in the postgres database (I know, not ideal but still usable). We're getting through requests as fast as we can.
I don't. S3-compatible storage is usually significantly cheaper and lets you offload HTTP requests. Also, huge databases make backups and recoveries slow.
The only upside of storing blobs in the database is transactional semantics. But if you're fine with some theoretical trash in S3, that's trivially implemented with proper ordering.
Lol. The most PG blob storage I've used in prod was a couple hundred GB. It was a hack and the performance wasn't ideal, but the alternatives were more complicated. Simple is good.
This looks great! How easy is it to self host Supabase? Is it more like "we're open-source, but good luck getting this deployed!", or can someone really build on Supabase and if things get a little too expensive it's easy enough to self-host the whole thing and just switch over? I wonder if people are doing that.
Anyone with basic server management skills will have no problem self-hosting. Every tool in the Supabase stack[0] is a Docker image and works in isolation. If you just want to use this Storage Engine, it's on Docker Hub (supabase/storage-api). Example with MinIO: https://github.com/supabase/storage/blob/master/docker-compo...
Some may disagree but in my experience Supabase was definitely challenging to selfhost. Don't get me wrong; I'm pretty confident with selfhosting but Supabase was definitely on the hard side.
Pocketbase being literally a single binary doesn't make Supabase look good either, although the functionality differs.
Supporting an existing API provides interoperability which is beneficial for the users. So that way if there is a better storage service it’s easier to adopt it. However, the S3 API compatibility can be a hindrance when you want to innovate and provide additional features and functionality. In our case, providing additional features [1] [2] while continuing to be S3 API compatible has forced us to rely on custom headers.
I might be misremembering this but I was under the impression that Ceph offered the same or very similar object storage API prior to Amazon building S3.
Supabase also announced this week Oriole (the team not just the table storage plugin) is joining them so I guess this is part of the same story. Anyway it's nice timing I was thinking about a hookup to Cloudflare R2 for something and this may be the way.
Oriole are joining to work on the OrioleDB postgres extension. That's slightly different to this release:
- This: for managing large files in s3 (videos, images, etc).
- Oriole: a postgres extension that's a "drop-in replacement" for the default storage engine
We also hope that the team can help develop Pluggable Storage in Postgres with the rest of the community. From the blog post[0]:
> Pluggable Storage gives developers the ability to use different storage engines for different tables within the same database. This system is available in MySQL, which uses the InnoDB as the default storage engine since MySQL 5.5 (replacing MyISAM). Oriole aims to be a drop-in replacement for Postgres' default storage engine and supports similar use-cases with improved performance. Other storage engines, to name a few possibilities, could implement columnar storage for OLAP workloads, highly compressed timeseries storage for event data, or compressed storage for minimizing disk usage.
Tangentially: we have a working prototype for decoupled storage and compute using the Oriole extension (also in the blog post). This stores Postgres data in s3 and there could be some inter-play with this release in the future
This is my biggest reservation towards Supabase. Google bought Firebase in 2014. I've seen Vercel run Nextjs into the ground and fuck up their pricing for some short-term gains. And Figma almost got bought by Adobe. I have a hard time trusting products with heavy VC backing.
Never used Supabase before but I'm very much comfortable with their underlying stack. I use a combination of postgres, PostgREST, PLv8 and Auth0 to achieve nearly the same thing.
A question about implementation, is the data really stored in a Postgres database? Do you support transactional updates like atomically updating two files at once?
Is there a Postgres storage backend optimized for storing large files?
We do not store the files in Postgres, the files are stored in a managed S3 bucket.
We store the metadata of the objects and buckets in Postgres so that you can easily query it with SQL. You can also implement access control with RLS to allow access to certain resources.
It is not currently possible to guarantee atomicity across 2 different file uploads, since each file is uploaded in its own request. This seems like higher-level functionality that could be implemented at the application level
What are the chances of Supabase doing a license change? Seems to be fashionable these days so always a little wary of building on these sort of platforms
Seeing as Neon, Nile, Citus, etc. are all open source, I highly doubt it. But who knows. In the end, most license changes are blown out of proportion by a vocal minority and largely have zero effect on 99.9% of users.
Unfortunately there isn't a way to prove our intentions to remain open source
That said, I think Supabase is much more de-risked from this happening because we aim to support existing tools, with a strong preference for tools that are controlled by foundations rather than commercial entities. For example, 2 of the main tools:
- Postgres (PostgreSQL license)
- PostgREST (MIT license)
Every tool/library/extension that we develop and release ourselves is either MIT, Apache2, or PostgreSQL
This is really nice to see! I asked about that feature almost 2 years ago as we wanted to use Supabase for everything. Unfortunately there were no plans back then to support it, so we had to use another provider for object storage.
The S3 API isn't unknown or anything, the client library SDKs all being open source. So I'd imagine a software developer writing a tool that aims to be S3 API-compliant would use one of the open source SDKs, and write their API while making requests to it locally through a client from one of the Amazon SDKs. Not trivial effort, but also pretty easy to imagine how you'd start off. If a client function from the SDK doesn't work with your API, you write your API handler to handle the HTTP request that function makes until it is supported.
I have wondered if Amazon has some additional tooling for other software providers to make their own S3-compliant APIs, but I don't know what Amazon's motivation would be to help make it easier for people to switch between other vendors. Whereas the incentive is much more obvious for other software vendors to make their own APIs S3-compliant. So I've so far imagined it is a similar process to how I described above, instead.
You specifically say "for large files". What's your bandwidth and latency like for small files (e.g. 20-20480 bytes), and how does it compare to raw S3's bandwidth and latency for small files?
hey, supabase engineer here; we didn’t check that out with files that small, but thanks for the idea, i will try it out
the only thing i can say related to the topic is that s3 multipart significantly outperforms other methods for files larger than 50mb, but tends to have similar or slightly slower speeds compared to a regular s3 upload via supabase or the simplest supabase storage upload for files of about 50mb or less.
s3-multipart is indeed the fastest way to upload a file to supabase, with speeds up to 100mb/s (even 115) for files >500mb. But for files of about 5mb or less you won't need to change anything in your upload logic just for performance, because you probably won't notice any difference
You can think of the Storage product as an upload server that sits in front of S3.
Generally, you would want an upload server to accept uploads from your customers, because you want to do some sort of file validation, access control, or anything else once the file is uploaded. The nice thing is that we run Storage within the same AWS network, so the upload latency is as small as it can be.
In terms of serving files, we provide a CDN out-of-the-box for any files that you upload to Storage, minimising latencies geographically
I tried to migrate from Firebase once, but it wasn't really straightforward and I decided against it. I think you guys (if you haven't already) should make migration plugins that "just work" a first-class priority, as the number of real revenue-generating production projects on Firebase and similar is much higher. It's a no-brainer that many of them may want to switch if it were safe and simple to do so.
Yes, both can be fine :) after all, a Protocol can be interpreted as a Standardised API which the client and server interact with, it can be low-level or high-level.
I hope you like the addition and we have the implementation all open-source on the Supabase Storage server
I think that protocol is appropriate here since s3 resources are often represented by a s3:// url where the scheme part of the url is often used to represent the protocol.
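To make the scheme point concrete, here is a minimal Python sketch using only the standard library. The bucket and key names are made up for illustration, and `split_s3_url` is a hypothetical helper, not part of any SDK.

```python
from urllib.parse import urlparse


def split_s3_url(url: str) -> tuple[str, str]:
    """Split an s3:// URL into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"not an s3:// URL: {url!r}")
    # The "host" slot of the URL carries the bucket; the path carries the key.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical bucket and object key, purely for illustration.
bucket, key = split_s3_url("s3://example-bucket/exports/users.parquet")
```

The scheme tells a tool which client to use, much like `https://` would, which is probably why "protocol" feels natural here.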
At the same time, I worry about them drifting too far from their core mission. I think Vercel and Netlify kinda went that way and when you look at their suite of features, you just have to ask why would I not just use AWS directly.
Supabase is great and I’ve used it for a number of projects over the years both with a backend alongside it or direct from the client with RLS.
There are some weird edges (well really just faff) around auth with the JS library but if nothing else they are by far the cheapest hosted SQL offering I can find so any faff you don’t want to deal with there’s an excellent database right there to allow you to roll your own (assuming you have a backend server alongside it)
Is there any request pricing? (I could not find a mention of it on the pricing page.) It could be quite compelling for some use-cases if requests are free.
One of the big wins we get from AWS is that you can do things like load structured data files (csv, parquet) from S3 directly in Redshift using SQL queries.
This is indeed pretty cool. They also have the `aws_s3` extension [1] for doing the same thing inside Postgres. Unfortunately, the extension isn't open source.
Hi, a question, but first some background. I've been looking at solutions to store columnar data with versioning, essentially Parquet. But I'd also like to store PDFs, CSVs, images, and such for our ML workflows. Now that Supabase is getting better for the data science / DuckDB crowd, I wonder: could Supabase be the one solution for all of this?
> Parquet. But, I'd also like to store PDFs, CSVs, images
yes, you can store all of these in Supabase Storage and it will probably "just work" with the tools that you already use (since most tools are s3-compatible)
It's more of an "accessibility layer" on top of S3 or any other s3-compatible backend (which means that it also works with MinIO out-of-the-box [0])
I don't think we'll ever build the underlying storage layer. I'm a big fan of what the Tigris[1] team have built if you're looking for other good s3 alternatives
Just commenting to say I really appreciate your business model. Whereas most businesses actively seek to build moats to maintain competitive advantage and locking people in, actions like this paint a different picture of Supabase. I'll be swapping out the API my app uses for Supabase storage to switch it to an S3 API this weekend in case I ever need to switch it.
My only real qualm at this point is mapping JS entities using the JS DB API makes it hard to use camelcase field names due to PG reasons I can't recall. I'm not sure what a fix for that would look like.
> mapping JS entities using the JS DB API makes it hard to use camelcase field names due to PG reasons I can't recall
Postgres requires you to "quoteCamelCase". There are some good JS libs to map between snake case and camel case. FWIW, I like a mix of both in my code: snake_case indicates it's a database property, and camelCase indicates it's a JS property. Try it out - it might grow on you :)
Do you manage your functions and triggers through source code? What framework do you use to do that? I like Supabase but its desire to default to native pg stuff for a lot of that has kind of steered me away from using it for more complex projects where you need to use sprocs to retrieve data and pgtap to test them, because hiding away business logic in the db like that is viewed as an anti-pattern in a lot of organizations. I love it for simple CRUD apps though, the kind where the default postgrest functionality is mostly enough and having to drop into a sproc or build a view is rarely necessary.
I think if there was a tightly integrated framework for managing the state of all of these various triggers, views, functions and sproc through source and integrating them into the normal SDLC it would be a more appealing sell for complex projects
We are hosted on aws and are just passing the cost over to our users. We make no margin on egress fees. Deploying storage on other clouds including Fly.io is planned.
We don't support S3 event notifications directly, but you can achieve similar functionality by using Database Webhooks [1]. You can trigger any HTTP endpoint or a Supabase Edge function by adding a trigger to the objects table [3] in the Storage schema.
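As a rough sketch of what a receiving endpoint could do with such a trigger, the Python below reshapes a row-change payload into something that looks like an S3 event notification. Everything about the shapes is an assumption for illustration: the payload fields (`type`, `schema`, `table`, `record`, `old_record`), the `bucket_id` / `name` / `metadata.size` columns, and the event-name mapping should all be checked against the webhook and Storage schema docs before relying on them.

```python
from typing import Any, Optional

# Assumed mapping from row operations to S3-style event names.
EVENT_NAMES = {
    "INSERT": "ObjectCreated:Put",
    "UPDATE": "ObjectCreated:Put",
    "DELETE": "ObjectRemoved:Delete",
}


def to_s3_style_event(payload: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Translate a row-change webhook payload into an S3-notification-like dict.

    Returns None for payloads that are not about the storage objects table.
    """
    if payload.get("schema") != "storage" or payload.get("table") != "objects":
        return None
    # Assumption: deletes carry the row in old_record, other operations in record.
    row = payload.get("record") or payload.get("old_record") or {}
    metadata = row.get("metadata") or {}
    return {
        "Records": [{
            "eventName": EVENT_NAMES[payload["type"]],
            "s3": {
                "bucket": {"name": row.get("bucket_id")},
                "object": {"key": row.get("name"), "size": metadata.get("size")},
            },
        }]
    }


# Hypothetical payload, not captured from a real webhook.
example = {
    "type": "INSERT",
    "schema": "storage",
    "table": "objects",
    "record": {"bucket_id": "avatars", "name": "user-1/photo.png",
               "metadata": {"size": 2048}},
    "old_record": None,
}
event = to_s3_style_event(example)
```

A wrapper like this only matters if downstream code already expects S3-shaped events; otherwise consuming the webhook payload directly is simpler.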
"S3 protocol" typically refers to object storage read/write/delete, not additional service APIs. Support in other S3-compatible vendors varies, often with a different payload (though a translation wrapper shouldn't be too difficult to implement)
(supabase team member) Firebase is an amazing tool for building fast. I want Supabase to be a "tool for everyone", but ultimately giving developers choices between various technologies is a good thing for developers. I think it's great that Flutterflow supports both Firebase & Supabase.
I know Flutterflow's Firebase integration is a bit more polished so hopefully we can work closer with the FF team to make our integration more seamless
Friendly reminder that Supabase is really cool, and if you haven't tried it out you should do it (everything can be self hosted and they have generous free tiers!)
Plus, with all the inflated vc money fueled hype on vector databases, they seem to have the only offering in this space that actually makes sense to me. With them you can store your embeddings close to all the rest of your data, in a single postgres db.
Cool to see that Supabase is adding S3 protocol! Nice to see more and more storage solutions available.
We at Milvus have integrated S3, Parquet, and others to make it possible for developers to use their data no matter what they use.
For those who have used both, how do you find the performance and ease of integration compares between Supabase and other solutions like Milvus that have had these features for some time?
kiwicopple|1 year ago
For background: we have a storage product for large files (like photos, videos, etc). The storage paths are mapped into your Postgres database so that you can create per-user access rules (using Postgres RLS)
This update adds S3 compatibility, which means that you can use it with thousands of tools that already support the protocol.
I'm also pretty excited about the possibilities for data scientists/engineers. We can do neat things like dump postgres tables into Storage (parquet) and you can connect DuckDB/Clickhouse directly to them. We have a few ideas that we'll experiment with to make this easy
Let us know if you have any questions - the engineers will also monitor the discussion
devjume|1 year ago
I have been developing with Supabase for the past two months. I would say there are still some rough corners in general and some basic features missing. For example, Supabase Storage has no direct support for metadata [2][3].
Overall I like the launch week and development they are doing. But more attention to basic features and little details would be needed because implementing workarounds for basic stuff is not ideal.
[1] https://bunny.net/ [2] https://github.com/orgs/supabase/discussions/5479 [3] https://github.com/supabase/storage/issues/439
kiwicopple|1 year ago
Bunny is a great product. I'm glad this release makes that possible for you and I imagine this was one of the reasons the rest of the community wanted it too
> But more attention to basic features and little details
This is what we spend most of our time doing, but you won't hear about it because they aren't HN-worthy.
> no direct support for metadata
Fabrizio tells me this is next on the list. I understand it's frustrating, but there is a workaround - store metadata in the postgres database (I know, not ideal but still usable). We're getting through requests as fast as we can.
giancarlostoro|1 year ago
I'm more used to Azure Blob Storage than anything, so I'm OOL on what people do other than store files on S3.
inian|1 year ago
https://www.youtube.com/watch?v=diL00ZZ-q50
cmollis|1 year ago
Rapzid|1 year ago
Large object crew, who's with me?!
vbezhenar|1 year ago
The only upside of storing blobs in the database is transactional semantics. But if you're fine with some theoretical trash in S3, that's trivially implemented with proper ordering.
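One way to read "proper ordering", sketched in Python: write the blob before committing the row that references it, and delete in the opposite order. This is an illustration rather than anyone's production code; a plain dict stands in for the bucket, SQLite stands in for Postgres, and the `files` table and helper names are hypothetical.

```python
import sqlite3


def save_file(db: sqlite3.Connection, store: dict, key: str, data: bytes) -> None:
    """Write the blob first, then record it in the database.

    If the database step fails, the blob is left behind as an orphan,
    but the database never points at a blob that does not exist.
    """
    store[key] = data  # step 1: upload (the dict stands in for the bucket)
    with db:  # step 2: commit the reference; rolls back on exception
        db.execute("INSERT INTO files (key, size) VALUES (?, ?)", (key, len(data)))


def delete_file(db: sqlite3.Connection, store: dict, key: str) -> None:
    """Delete in the opposite order: drop the row first, then the blob."""
    with db:
        db.execute("DELETE FROM files WHERE key = ?", (key,))
    store.pop(key, None)


def find_orphans(db: sqlite3.Connection, store: dict) -> set:
    """Blobs with no matching row: the 'trash' a periodic cleanup job can remove."""
    known = {row[0] for row in db.execute("SELECT key FROM files")}
    return set(store) - known


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (key TEXT PRIMARY KEY, size INTEGER NOT NULL)")
bucket = {}
save_file(db, bucket, "a.txt", b"hello")
bucket["stray.bin"] = b"\x00"  # simulate an upload whose DB insert never happened
```

The trade-off is the one named above: a crash between the two steps can leave unreferenced objects in the bucket, but never a row pointing at a missing object.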
dymk|1 year ago
kiwicopple|1 year ago
code_biologist|1 year ago
yoavm|1 year ago
kiwicopple|1 year ago
And a 5-min demo video with Digital Ocean: https://www.youtube.com/watch?v=FqiQKRKsfZE&embeds_referring...
Anyone with basic server management skills will have no problem self-hosting. Every tool in the Supabase stack[0] is a Docker image and works in isolation. If you just want to use this Storage Engine, it's on Docker Hub (supabase/storage-api). Example with MinIO: https://github.com/supabase/storage/blob/master/docker-compo...
[0] architecture: https://supabase.com/docs/guides/getting-started/architectur...
zipping1549|1 year ago
Pocketbase being literally a single binary doesn't make Supabase look good either, although the functionality differs.
brap|1 year ago
bdcravens|1 year ago
The idea of a proprietary API becoming the industry de facto standard isn't uncommon. The same thing happened with Microsoft's XMLHttpRequest.
ovaistariq|1 year ago
[1] https://www.tigrisdata.com/docs/objects/conditionals/ [2] https://www.tigrisdata.com/docs/objects/caching/#caching-on-...
garbanz0|1 year ago
mmcwilliams|1 year ago
mousetree|1 year ago
moduspol|1 year ago
jimmySixDOF|1 year ago
kiwicopple|1 year ago
- This: for managing large files in s3 (videos, images, etc).
- Oriole: a postgres extension that's a "drop-in replacement" for the default storage engine
We also hope that the team can help develop Pluggable Storage in Postgres with the rest of the community. From the blog post[0]:
> Pluggable Storage gives developers the ability to use different storage engines for different tables within the same database. This system is available in MySQL, which uses the InnoDB as the default storage engine since MySQL 5.5 (replacing MyISAM). Oriole aims to be a drop-in replacement for Postgres' default storage engine and supports similar use-cases with improved performance. Other storage engines, to name a few possibilities, could implement columnar storage for OLAP workloads, highly compressed timeseries storage for event data, or compressed storage for minimizing disk usage.
Tangentially: we have a working prototype for decoupled storage and compute using the Oriole extension (also in the blog post). This stores Postgres data in s3 and there could be some inter-play with this release in the future
[0] https://supabase.com/blog/supabase-aquires-oriole
jonplackett|1 year ago
kiwicopple|1 year ago
we only have plans to keep pushing open standards/tools - hopefully we have enough of a track record here that it doesn't feel like lip service
gherkinnn|1 year ago
codextremist|1 year ago
foerster|1 year ago
nextaccountic|1 year ago
Is there a Postgres storage backend optimized for storing large files?
fenos|1 year ago
We store the metadata of the objects and buckets in Postgres so that you can easily query it with SQL. You can also implement access control with RLS to allow access to certain resources.
It is not currently possible to guarantee atomicity across 2 different file uploads, since each file is uploaded in its own request. This seems like higher-level functionality that could be implemented at the application level
unknown|1 year ago
[deleted]
Havoc|1 year ago
ezekg|1 year ago
kiwicopple|1 year ago
That said, I think Supabase is much more de-risked from this happening because we aim to support existing tools, with a strong preference for tools that are controlled by foundations rather than commercial entities. For example, 2 of the main tools:
- Postgres (PostgreSQL license)
- PostgREST (MIT license)
Every tool/library/extension that we develop and release ourselves is either MIT, Apache2, or PostgreSQL
madsbuch|1 year ago
Congrats on the release!
pull_my_finger|1 year ago
starttoaster|1 year ago
I have wondered if Amazon has some additional tooling for other software providers to make their own S3-compliant APIs, but I don't know what Amazon's motivation would be to help make it easier for people to switch between other vendors. Whereas the incentive is much more obvious for other software vendors to make their own APIs S3-compliant. So I've so far imagined it is a similar process to how I described above, instead.
inian|1 year ago
[1]: https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operatio...
avodonosov|1 year ago
https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-...
fenos|1 year ago
JoshTriplett|1 year ago
egorr|1 year ago
the only thing i can say related to the topic is that s3 multipart significantly outperforms other methods for files larger than 50mb, but tends to have similar or slightly slower speeds compared to a regular s3 upload via supabase or the simplest supabase storage upload for files of about 50mb or less.
s3-multipart is indeed the fastest way to upload a file to supabase, with speeds up to 100mb/s (even 115) for files >500mb. But for files of about 5mb or less you won't need to change anything in your upload logic just for performance, because you probably won't notice any difference
everything mentioned here is for upload only
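A hedged Python sketch of how client code might act on those numbers: use a single upload below a threshold and split into parts above it. The 50 MiB threshold just mirrors the rough figure above and is not a measured constant, and the 8 MiB default part size is arbitrary. The two limits encoded here (5 MiB minimum for every part except the last, at most 10,000 parts) are the usual S3 multipart constraints. `plan_upload` is a hypothetical helper that only computes byte ranges and does not talk to any service.

```python
import math

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB          # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts per upload
MULTIPART_THRESHOLD = 50 * MIB   # rough figure from the comment above (assumption)


def plan_upload(file_size: int, part_size: int = 8 * MIB) -> list[tuple[int, int]]:
    """Return (offset, length) pairs; a single pair means a regular upload."""
    if file_size <= MULTIPART_THRESHOLD:
        return [(0, file_size)]
    # Grow the part size if the file would otherwise need more than MAX_PARTS.
    part_size = max(part_size, MIN_PART_SIZE, math.ceil(file_size / MAX_PARTS))
    return [
        (offset, min(part_size, file_size - offset))
        for offset in range(0, file_size, part_size)
    ]


small = plan_upload(5 * MIB)     # below the threshold: one regular upload
large = plan_upload(100 * MIB)   # above the threshold: split into parts
```

Each `(offset, length)` pair would then map to one part of a multipart upload, which is also what lets parts go up in parallel.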
fenos|1 year ago
Generally, you would want an upload server to accept uploads from your customers, because you want to do some sort of file validation, access control, or anything else once the file is uploaded. The nice thing is that we run Storage within the same AWS network, so the upload latency is as small as it can be.
In terms of serving files, we provide a CDN out-of-the-box for any files that you upload to Storage, minimising latencies geographically
ovaistariq|1 year ago
kaliqt|1 year ago
kiwicopple|1 year ago
also some community tools: https://github.com/supabase-community/firebase-to-supabase
we often help companies migrating from firebase to supabase - usually they want to take advantage of Postgres with similar tooling.
gime_tree_fiddy|1 year ago
Also my sympathies for having to support the so-called "S3 standard/protocol".
fenos|1 year ago
I hope you like the addition and we have the implementation all open-source on the Supabase Storage server
preommr|1 year ago
spacebanana7|1 year ago
I'd happily pay a 50% markup for the sake of having everything in one place.
tootie|1 year ago
kiwicopple|1 year ago
we have a built-in CDN[0] and we have some existing integrations for transactional emails [1]
[0] Smart CDN: https://supabase.com/docs/guides/storage/cdn/smart-cdn
[1] Integrations: https://supabase.com/partners/integrations
simonbarker87|1 year ago
There are some weird edges (well really just faff) around auth with the JS library but if nothing else they are by far the cheapest hosted SQL offering I can find so any faff you don’t want to deal with there’s an excellent database right there to allow you to roll your own (assuming you have a backend server alongside it)
animeshjain|1 year ago
inian|1 year ago
fswd|1 year ago
tootie|1 year ago
https://docs.aws.amazon.com/redshift/latest/dg/t_loading-tab...
inian|1 year ago
[1]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_...
mritchie712|1 year ago
0 - https://www.youtube.com/watch?v=yrrCQnfKEig
1 - https://www.definite.app/
unknown|1 year ago
[deleted]
jarpineh|1 year ago
kiwicopple|1 year ago
yes, you can store all of these in Supabase Storage and it will probably "just work" with the tools that you already use (since most tools are s3-compatible)
Here is an example of one of our Data Engineers querying parquet with DuckDB: https://www.youtube.com/watch?v=diL00ZZ-q50
We're very open to feedback here - if you find any rough edges let us know and we can work on it (github issues are easiest)
Bnjoroge|1 year ago
filleokus|1 year ago
kiwicopple|1 year ago
I don't think we'll ever build the underlying storage layer. I'm a big fan of what the Tigris[1] team have built if you're looking for other good s3 alternatives
[0] https://github.com/supabase/storage/blob/master/docker-compo...
[1] Tigris: https://tigrisdata.com
mattgreenrocks|1 year ago
My only real qualm at this point is mapping JS entities using the JS DB API makes it hard to use camelcase field names due to PG reasons I can't recall. I'm not sure what a fix for that would look like.
Keep up the good work.
kiwicopple|1 year ago
Postgres requires you to "quoteCamelCase". There are some good JS libs to map between snake case and camel case. FWIW, I like a mix of both in my code: snake_case indicates it's a database property, and camelCase indicates it's a JS property. Try it out - it might grow on you :)
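For context, Postgres folds unquoted identifiers to lowercase, which is why camelCase column names need quoting. The mapping those libraries perform is small enough to sketch; the version below is in Python rather than JS purely for illustration, and the function names and sample row are made up.

```python
import re


def snake_to_camel(name: str) -> str:
    """created_at -> createdAt"""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camel_to_snake(name: str) -> str:
    """createdAt -> created_at"""
    # Insert an underscore before every capital letter except a leading one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def map_keys(row: dict, convert) -> dict:
    """Apply a key-conversion function to every key of a row."""
    return {convert(key): value for key, value in row.items()}


# Hypothetical row as it might come back from the database.
db_row = {"user_id": 7, "created_at": "2024-04-18", "name": "Ada"}
js_style = map_keys(db_row, snake_to_camel)
```

Converting at the boundary like this keeps snake_case in the database and camelCase in application code, so nothing has to be quoted in SQL.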
ntry01|1 year ago
Does this mean that Supabase (via S3 protocol) supports file download streaming using an API now?
As far as I know, it was not achievable before and the only solution was to create a signed URL and stream using HTTP.
fenos|1 year ago
The good news is that the Standard API also supports streaming!
foerster|1 year ago
I was hesitant to use triggers and PG functions initially, but after I got my migrations sorted out, it's been pretty awesome.
SOLAR_FIELDS|1 year ago
I think if there was a tightly integrated framework for managing the state of all of these various triggers, views, functions and sproc through source and integrating them into the normal SDLC it would be a more appealing sell for complex projects
pier25|1 year ago
I wish they would offer a plan with just the pg database.
Any news on pricing of Fly PG?
inian|1 year ago
We are actively working on our Fly integration. At the start, the pricing is going to be exactly the same as our hosted platform on aws - https://supabase.com/docs/guides/platform/fly-postgres#prici...
sgt|1 year ago
fenos|1 year ago
iamcreasy|1 year ago
WhitneyLand|1 year ago
kiwicopple|1 year ago
poxrud|1 year ago
inian|1 year ago
[1]: https://supabase.com/docs/guides/database/webhooks [2]: https://supabase.com/docs/guides/functions [3]: https://supabase.com/docs/guides/storage/schema/design
bdcravens|1 year ago
https://developers.cloudflare.com/r2/buckets/event-notificat...
https://docs.digitalocean.com/reference/api/spaces-api/
withinboredom|1 year ago
kiwicopple|1 year ago
I know Flutterflow's Firebase integration is a bit more polished so hopefully we can work closer with the FF team to make our integration more seamless
denysvitali|1 year ago
isoprophlex|1 year ago
kiwicopple|1 year ago
joshxyz|1 year ago
stephen37|1 year ago
We at Milvus have integrated S3, Parquet, and others to make it possible for developers to use their data no matter what they use.
For those who have used both, how do you find the performance and ease of integration compares between Supabase and other solutions like Milvus that have had these features for some time?
teaearlgraycold|1 year ago
jongjong|1 year ago