
Supabase Storage now supports the S3 protocol

501 points | inian | 1 year ago | supabase.com

187 comments


kiwicopple|1 year ago

hey hn, supabase ceo here

For background: we have a storage product for large files (like photos, videos, etc). The storage paths are mapped into your Postgres database so that you can create per-user access rules (using Postgres RLS)
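To make the per-user rules concrete: the check typically reduces to "does the first folder in the object path match the requesting user's id?". A minimal sketch of just that logic (illustrative only; in Supabase this lives in a Postgres RLS policy, and the names here are made up):

```python
# Hypothetical illustration of a per-user storage rule: allow access only
# when the object's top-level folder matches the requesting user's id.
# (Supabase expresses this as a Postgres RLS policy; this is just the logic.)

def first_folder(path: str) -> str:
    """Return the first path segment of a storage object key."""
    return path.split("/", 1)[0]

def can_access(user_id: str, object_path: str) -> bool:
    """Grant access iff the object lives under the user's own folder."""
    return first_folder(object_path) == user_id

print(can_access("user-123", "user-123/avatar.png"))  # True
print(can_access("user-123", "user-456/avatar.png"))  # False
```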

This update adds S3 compatibility, which means that you can use it with thousands of tools that already support the protocol.
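For most tools, using an S3-compatible store is just an endpoint override. A sketch with placeholder values (the exact endpoint path, region, and keys come from your project's Storage settings, so treat everything in angle brackets as an assumption):

```shell
# Placeholder values; take the real endpoint, region, and keys from your
# project's Storage settings.
export AWS_ACCESS_KEY_ID=<project-access-key>
export AWS_SECRET_ACCESS_KEY=<project-secret-key>

# Any S3-capable tool with an endpoint override should then work,
# e.g. the AWS CLI:
aws s3 ls s3://my-bucket \
  --endpoint-url https://<project-ref>.supabase.co/storage/v1/s3 \
  --region <project-region>
```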

I'm also pretty excited about the possibilities for data scientists/engineers. We can do neat things like dump Postgres tables into Storage (as Parquet), and you can connect DuckDB/ClickHouse directly to them. We have a few ideas that we'll experiment with to make this easy.

Let us know if you have any questions - the engineers will also monitor the discussion

devjume|1 year ago

This is great news. Now I can utilize any CDN provider that supports S3. Like bunny.net [1] which has image optimization, just like Supabase does but with better pricing and features.

I have been developing with Supabase for the past two months. I would say there are still some rough corners in general and some basic features missing. For example, Supabase Storage has no direct support for metadata [2][3].

Overall I like the launch week and the development they are doing. But more attention to basic features and little details is needed, because implementing workarounds for basic stuff is not ideal.

[1] https://bunny.net/ [2] https://github.com/orgs/supabase/discussions/5479 [3] https://github.com/supabase/storage/issues/439

kiwicopple|1 year ago

> I can utilize any CDN provider that supports S3. Like bunny.net

Bunny is a great product. I'm glad this release makes that possible for you, and I imagine this was one of the reasons the rest of the community wanted it too.

> But more attention to basic features and little details

This is what we spend most of our time doing, but you won't hear about it because those changes aren't HN-worthy.

> no direct support for metadata

Fabrizio tells me this is next on the list. I understand it's frustrating, but there is a workaround: store the metadata in your Postgres database (I know, not ideal, but still usable). We're getting through requests as fast as we can.
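To sketch that workaround (illustrative names only; a dict stands in here for a table you'd create in your Postgres database, keyed by bucket and object path):

```python
# Sketch of the metadata workaround: keep custom metadata in your own
# table keyed by (bucket, object path), alongside the stored file.
# A dict stands in for the Postgres table; all names are made up.

metadata_table: dict[tuple[str, str], dict] = {}

def set_metadata(bucket: str, path: str, meta: dict) -> None:
    """Upsert custom metadata for a stored object."""
    existing = metadata_table.get((bucket, path), {})
    metadata_table[(bucket, path)] = {**existing, **meta}

def get_metadata(bucket: str, path: str) -> dict:
    """Fetch metadata, returning an empty dict for unknown objects."""
    return metadata_table.get((bucket, path), {})

set_metadata("avatars", "user-123/pic.png", {"alt_text": "profile photo"})
set_metadata("avatars", "user-123/pic.png", {"width": 512})
print(get_metadata("avatars", "user-123/pic.png"))
# {'alt_text': 'profile photo', 'width': 512}
```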

giancarlostoro|1 year ago

I've not done a whole lot with S3, but is this popular because it's easy to sync between storage providers that support S3, or something?

I'm more used to Azure Blob Storage than anything, so I'm out of the loop on what people do with S3 other than store files.

Rapzid|1 year ago

I like to Lob my BLOBs into PG's storage. You need that 1-2TB of RDS storage for the IOPS anyway; might as well fill it up.

Large object crew, who's with me?!

vbezhenar|1 year ago

I don't. S3-compatible storage is usually significantly cheaper and lets you offload HTTP requests. Huge databases also make backups and recoveries slow.

The only upside of storing blobs in the database is transactional semantics. But if you're fine with some theoretical trash in S3, that's trivially implemented with proper ordering.

dymk|1 year ago

38TB of large objects stored in Postgres right here

kiwicopple|1 year ago

that's not how this works. files are stored in s3, metadata in postgres

code_biologist|1 year ago

Lol. The most PG blob storage I've used in prod was a couple hundred GB. It was a hack and the performance wasn't ideal, but the alternatives were more complicated. Simple is good.

yoavm|1 year ago

This looks great! How easy is it to self host Supabase? Is it more like "we're open-source, but good luck getting this deployed!", or can someone really build on Supabase and if things get a little too expensive it's easy enough to self-host the whole thing and just switch over? I wonder if people are doing that.

kiwicopple|1 year ago

self-hosting docs are here: https://supabase.com/docs/guides/self-hosting/docker

And a 5-min demo video with Digital Ocean: https://www.youtube.com/watch?v=FqiQKRKsfZE&embeds_referring...

Anyone with basic server management skills will have no problem self-hosting. Every tool in the supabase stack[0] is a Docker image and works in isolation. If you just want to use the Storage Engine, it's on Docker Hub (supabase/storage-api). Example with MinIO: https://github.com/supabase/storage/blob/master/docker-compo...

[0] architecture: https://supabase.com/docs/guides/getting-started/architectur...

zipping1549|1 year ago

Some may disagree, but in my experience Supabase was challenging to self-host. Don't get me wrong; I'm pretty confident with self-hosting, but Supabase was definitely on the hard side.

Pocketbase being literally a single binary doesn't make Supabase look good either, although the functionalities differ.

brap|1 year ago

Always thought it’s kind of odd how the proprietary API of AWS S3 became sort of the de-facto industry standard

bdcravens|1 year ago

S3 is one of the original AWS services (SQS predates it), and has been around for 18 years.

The idea of a proprietary API becoming the industry's de facto standard isn't uncommon. The same thing happened with Microsoft's XMLHttpRequest.

ovaistariq|1 year ago

Supporting an existing API provides interoperability which is beneficial for the users. So that way if there is a better storage service it’s easier to adopt it. However, the S3 API compatibility can be a hindrance when you want to innovate and provide additional features and functionality. In our case, providing additional features [1] [2] while continuing to be S3 API compatible has forced us to rely on custom headers.

[1] https://www.tigrisdata.com/docs/objects/conditionals/ [2] https://www.tigrisdata.com/docs/objects/caching/#caching-on-...

garbanz0|1 year ago

Same thing seems to be happening with openai api

mmcwilliams|1 year ago

I might be misremembering this but I was under the impression that Ceph offered the same or very similar object storage API prior to Amazon building S3.

mousetree|1 year ago

Because that's where most of the industry store their data.

moduspol|1 year ago

Yeah--though I guess kudos to AWS for not being litigious about it.

jimmySixDOF|1 year ago

Supabase also announced this week that Oriole (the team, not just the table storage extension) is joining them, so I guess this is part of the same story. Anyway, it's nice timing: I was thinking about hooking up Cloudflare R2 for something, and this may be the way.

kiwicopple|1 year ago

Oriole are joining to work on the OrioleDB postgres extension. That's slightly different to this release:

- This: for managing large files in s3 (videos, images, etc).

- Oriole: a postgres extension that's a "drop-in replacement" for the default storage engine

We also hope that the team can help develop Pluggable Storage in Postgres with the rest of the community. From the blog post[0]:

> Pluggable Storage gives developers the ability to use different storage engines for different tables within the same database. This system is available in MySQL, which uses the InnoDB as the default storage engine since MySQL 5.5 (replacing MyISAM). Oriole aims to be a drop-in replacement for Postgres' default storage engine and supports similar use-cases with improved performance. Other storage engines, to name a few possibilities, could implement columnar storage for OLAP workloads, highly compressed timeseries storage for event data, or compressed storage for minimizing disk usage.

Tangentially: we have a working prototype for decoupled storage and compute using the Oriole extension (also in the blog post). This stores Postgres data in S3, and there could be some interplay with this release in the future.

[0] https://supabase.com/blog/supabase-aquires-oriole

jonplackett|1 year ago

Dear supabase. Please don’t get bought out by anyone and ruined. I’ve built too many websites with a supabase backend now to go back.

kiwicopple|1 year ago

we don't have any plans to get bought.

we only have plans to keep pushing open standards/tools - hopefully we have enough of a track record here that it doesn't feel like lip service

gherkinnn|1 year ago

This is my biggest reservation towards Supabase. Google bought Firebase in 2014. I've seen Vercel run Next.js into the ground and fuck up their pricing for some short-term gains. And Figma almost got bought by Adobe. I have a hard time trusting products with heavy VC backing.

codextremist|1 year ago

Never used Supabase before but I'm very much comfortable with their underlying stack. I use a combination of postgres, PostgREST, PLv8 and Auth0 to achieve nearly the same thing.

foerster|1 year ago

I'm a bit terrified of this as well. I have built a profitable product on the platform, and if it were to drastically change or go away, I'd be hosed.

nextaccountic|1 year ago

A question about implementation, is the data really stored in a Postgres database? Do you support transactional updates like atomically updating two files at once?

Is there a Postgres storage backend optimized for storing large files?

fenos|1 year ago

We do not store the files in Postgres, the files are stored in a managed S3 bucket.

We store the metadata of the objects and buckets in Postgres so that you can easily query it with SQL. You can also implement access control with RLS to allow access to certain resources.

It is not currently possible to guarantee atomicity across two different file uploads, since each file is uploaded in a single request. This seems like higher-level functionality that could be implemented at the application level.
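One application-level sketch of that pattern, assuming you can tolerate a few orphaned objects: upload both files under fresh keys first, then flip a single pointer record last. Readers that follow the pointer never see a half-updated pair (dicts stand in for the object store and a Postgres row; all names are made up):

```python
# Sketch of an application-level "atomic" two-file update: write both new
# objects under brand-new keys, then switch one pointer record as the
# final step. In Postgres the flip is a single UPDATE, so it either
# happens entirely or not at all; failed attempts only leave orphaned
# objects to garbage-collect later.

object_store: dict[str, bytes] = {}
pointer_db = {"current_version": "v1"}
object_store["doc-v1"] = b"old doc"
object_store["thumb-v1"] = b"old thumb"

def update_pair(doc: bytes, thumb: bytes, version: str) -> None:
    # Step 1: upload both files under new keys (safe to retry or abandon).
    object_store[f"doc-{version}"] = doc
    object_store[f"thumb-{version}"] = thumb
    # Step 2: single-key flip -- the only step readers can observe.
    pointer_db["current_version"] = version

update_pair(b"new doc", b"new thumb", "v2")
v = pointer_db["current_version"]
print(object_store[f"doc-{v}"], object_store[f"thumb-{v}"])
# b'new doc' b'new thumb'
```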

Havoc|1 year ago

What are the chances of Supabase doing a license change? It seems to be fashionable these days, so I'm always a little wary of building on these sorts of platforms.

ezekg|1 year ago

Seeing as Neon, Nile, Citus, etc. are all open source, I highly doubt it. But who knows. In the end, most license changes are blown out of proportion by a vocal minority and largely have zero effect on 99.9% of users.

kiwicopple|1 year ago

Unfortunately there isn't a way to prove our intentions to remain open source.

That said, I think Supabase is much more de-risked from this happening because we aim to support existing tools, with a strong preference for tools that are controlled by foundations rather than commercial entities. For example, 2 of the main tools:

- Postgres (PostgreSQL license)

- PostgREST (MIT license)

Every tool/library/extension that we develop and release ourselves is either MIT, Apache2, or PostgreSQL

madsbuch|1 year ago

This is really nice to see! I asked about that feature almost 2 years ago as we wanted to use Supabase for everything. Unfortunately there were no plans back then to support it, so we had to use another provider for object storage.

Congrats on the release!

pull_my_finger|1 year ago

Is there a formal s3 protocol spec or do these companies try to reverse engineer/feature match what AWS provides?

starttoaster|1 year ago

The S3 API isn't unknown or anything; the client SDKs are all open source. So I'd imagine a developer writing a tool that aims to be S3 API-compliant would take one of the open-source SDKs and build their API while making requests to it locally through an SDK client. Not trivial effort, but pretty easy to imagine how you'd start: if a client function from the SDK doesn't work with your API, you write your handler to deal with the HTTP request that function makes until it is supported.

I have wondered if Amazon has some additional tooling for other software providers to make their own S3-compliant APIs, but I don't know what Amazon's motivation would be to help make it easier for people to switch between other vendors. Whereas the incentive is much more obvious for other software vendors to make their own APIs S3-compliant. So I've so far imagined it is a similar process to how I described above, instead.
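For a flavour of what "S3-compliant" entails beyond matching request shapes, a server also has to reproduce AWS Signature Version 4 authentication to accept requests signed by the official SDKs. The signing-key derivation step, for instance, is a fixed HMAC-SHA256 chain, sketched here with a throwaway example secret:

```python
# AWS Signature Version 4 signing-key derivation: a fixed chain of
# HMAC-SHA256 steps that any S3-compatible server must reproduce to
# validate SDK-signed requests. (Only the key-derivation step is shown;
# the full scheme also canonicalizes the request and hashes it.)
import hashlib
import hmac

def hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the per-day, per-region, per-service signing key."""
    k_date = hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = hmac_sha256(k_date, region)
    k_service = hmac_sha256(k_region, service)
    return hmac_sha256(k_service, "aws4_request")

# Throwaway example credentials, not real ones.
key = sigv4_signing_key("example-secret", "20240418", "us-east-1", "s3")
print(len(key))  # 32 (an HMAC-SHA256 digest)
```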

JoshTriplett|1 year ago

You specifically say "for large files". What's your bandwidth and latency like for small files (e.g. 20-20480 bytes), and how does it compare to raw S3's bandwidth and latency for small files?

egorr|1 year ago

Hey, Supabase engineer here. We didn't benchmark files that small, but thanks for the idea; I will try it out.

The one thing I can say on the topic is that S3 multipart significantly outperforms other methods for files larger than 50 MB, but tends to have similar or slightly slower speeds than a regular S3 upload via Supabase (or the simplest Supabase Storage upload) for files around 50 MB or less.

S3 multipart is indeed the fastest way to upload a file to Supabase, with speeds up to 100 MB/s (115 even) for files over 500 MB. But for files around 5 MB or less you probably won't notice any difference, so there's no need to change your upload logic just for performance.

Everything mentioned here is for uploads only.
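For reference, the client-side choice is mostly a threshold plus part-size arithmetic. A sketch under S3's documented multipart limits (5 MiB minimum part size, 10,000 parts maximum), using the ~50 MB figure mentioned above as an illustrative cut-off:

```python
# Sketch of client-side multipart planning under S3's documented limits:
# every part except the last must be at least 5 MiB, and an upload can
# have at most 10,000 parts. The 50 MB threshold mirrors the comment
# above and is illustrative, not an official number.
import math

MIN_PART = 5 * 1024 * 1024               # 5 MiB minimum part size
MAX_PARTS = 10_000                       # maximum parts per upload
MULTIPART_THRESHOLD = 50 * 1024 * 1024   # illustrative switch-over point

def plan_upload(size: int) -> dict:
    """Decide single PUT vs multipart, and pick a legal part size."""
    if size <= MULTIPART_THRESHOLD:
        return {"multipart": False, "part_size": size, "parts": 1}
    # Grow the part size until the object fits in <= 10,000 parts.
    part_size = max(MIN_PART, math.ceil(size / MAX_PARTS))
    return {"multipart": True, "part_size": part_size,
            "parts": math.ceil(size / part_size)}

print(plan_upload(5 * 1024 * 1024))     # small file: single PUT
print(plan_upload(500 * 1024 * 1024))   # large file: 100 parts of 5 MiB
```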

fenos|1 year ago

You can think of the Storage product as an upload server that sits in front of S3.

Generally, you want an upload server in front to accept uploads from your customers, because you'll want to do some sort of file validation, access control, or other processing once the file is uploaded. The nice thing is that we run Storage within the same AWS network, so the upload latency is as small as it can be.

In terms of serving files, we provide a CDN out of the box for any files that you upload to Storage, minimising latencies geographically.

ovaistariq|1 year ago

S3 is not known for performing well with such small files. Is this primarily what your dataset on S3 looks like?

kaliqt|1 year ago

I tried to migrate from Firebase once; it wasn't really straightforward, and I decided against it. I think you guys (if you haven't already) should make migration plugins that "just work" a first-class priority, since the number of real revenue-generating production projects on Firebase and similar platforms is much higher. It's a no-brainer that many of them would want to switch if it were safe and simple to do so.

gime_tree_fiddy|1 year ago

Shouldn't it be API rather than protocol?

Also my sympathies for having to support the so-called "S3 standard/protocol".

fenos|1 year ago

Yes, both are fine :) After all, a protocol can be interpreted as a standardised API through which the client and server interact; it can be low-level or high-level.

I hope you like the addition. The implementation is all open source on the Supabase Storage server.

preommr|1 year ago

I think protocol is appropriate here, since S3 resources are often represented by an s3:// URL, where the scheme part of the URL conventionally names the protocol.

spacebanana7|1 year ago

I wish supabase had more default integrations with CDNs, transactional email services and domain registrars.

I'd happily pay a 50% markup for the sake of having everything in one place.

tootie|1 year ago

At the same time, I worry about them drifting too far from their core mission. I think Vercel and Netlify kinda went that way, and when you look at their suite of features, you just have to ask why you wouldn't just use AWS directly.

simonbarker87|1 year ago

Supabase is great and I’ve used it for a number of projects over the years both with a backend alongside it or direct from the client with RLS.

There are some weird edges (well, really just faff) around auth with the JS library, but if nothing else they are by far the cheapest hosted SQL offering I can find. So for any faff you don't want to deal with, there's an excellent database right there to let you roll your own (assuming you have a backend server alongside it).

animeshjain|1 year ago

Is there any request pricing? (I could not find a mention of it on the pricing page.) It could be quite compelling for some use cases if requests are free.

inian|1 year ago

There is no per request pricing.

fswd|1 year ago

I just finished implementing S3 file upload in nextjs to cloudflare R2 with a supabase backend. Wish I had been lazy and waited a day!

tootie|1 year ago

One of the big wins we get from AWS is that you can do things like load structured data files (csv, parquet) from S3 directly in Redshift using SQL queries.

https://docs.aws.amazon.com/redshift/latest/dg/t_loading-tab...

jarpineh|1 year ago

Hi, a question, but first some background. I've been looking at solutions to store columnar data with versioning, essentially Parquet. But I'd also like to store PDFs, CSVs, images, and such for our ML workflows. I wonder, now that Supabase is getting better for the data science/DuckDB crowd, could it be the one solution for all of this?

kiwicopple|1 year ago

> Parquet. But, I'd also like to store PDFs, CSVs, images

yes, you can store all of these in Supabase Storage and it will probably "just work" with the tools that you already use (since most tools are s3-compatible)

Here is an example of one of our Data Engineers querying parquet with DuckDB: https://www.youtube.com/watch?v=diL00ZZ-q50

We're very open to feedback here - if you find any rough edges let us know and we can work on it (github issues are easiest)

filleokus|1 year ago

Do you think Supabase Storage (now or in the future) could be an attractive standalone S3 provider as an alternative to e.g MinIO?

mattgreenrocks|1 year ago

Just commenting to say I really appreciate your business model. Whereas most businesses actively seek to build moats to maintain competitive advantage and lock people in, actions like this paint a different picture of Supabase. I'll be swapping the API my app uses for Supabase storage over to the S3 API this weekend, in case I ever need to switch providers.

My only real qualm at this point is that mapping JS entities using the JS DB API makes it hard to use camelCase field names, due to PG reasons I can't recall. I'm not sure what a fix for that would look like.

Keep up the good work.

kiwicopple|1 year ago

> mapping JS entities using the JS DB API makes it hard to use camelcase field names due to PG reasons I can't recall

Postgres requires you to "quoteCamelCase". There are some good JS libs to map between snake case and camel case. FWIW, I like a mix of both in my code: snake_case indicates it's a database property, and camelCase indicates it's a JS property. Try it out - it might grow on you :)
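The mapping itself is small; a sketch of it (shown in Python for illustration; real libraries also handle acronyms and nested objects):

```python
# Minimal snake_case <-> camelCase mappers for moving rows between
# Postgres results (snake_case columns) and JS-style objects.
import re

def to_camel(s: str) -> str:
    """user_id -> userId"""
    head, *rest = s.split("_")
    return head + "".join(w.capitalize() for w in rest)

def to_snake(s: str) -> str:
    """createdAt -> created_at"""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", s).lower()

row = {"user_id": 7, "created_at": "2024-04-18"}
js_obj = {to_camel(k): v for k, v in row.items()}
print(js_obj)  # {'userId': 7, 'createdAt': '2024-04-18'}
print(to_snake("createdAt"))  # created_at
```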

ntry01|1 year ago

This is great news, and I agree with everyone in the thread - Supabase is a great product.

Does this mean that Supabase (via S3 protocol) supports file download streaming using an API now?

As far as I know, it was not achievable before and the only solution was to create a signed URL and stream using HTTP.

fenos|1 year ago

Yes, absolutely! You can download files as streams and make use of Range requests too.

The good news is that the standard API also supports streaming!
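To illustrate what Range support buys you when streaming: a server handling the single-range `bytes=...` forms just slices the object. A sketch (simplified from the HTTP spec; real servers also validate and reject unsatisfiable ranges):

```python
# Sketch of HTTP single-range handling ("bytes=start-end", "bytes=start-",
# "bytes=-suffix"), as used when streaming a stored file in chunks.

def apply_range(data: bytes, range_header: str) -> tuple[bytes, str]:
    """Return the sliced bytes and a Content-Range header value."""
    spec = range_header.removeprefix("bytes=")
    start_s, _, end_s = spec.partition("-")
    n = len(data)
    if start_s == "":                  # suffix form: last N bytes
        start = max(0, n - int(end_s))
        end = n - 1
    else:
        start = int(start_s)
        end = int(end_s) if end_s else n - 1   # open-ended: to the end
    chunk = data[start:end + 1]
    return chunk, f"bytes {start}-{end}/{n}"

body = b"0123456789"
print(apply_range(body, "bytes=0-3"))  # (b'0123', 'bytes 0-3/10')
print(apply_range(body, "bytes=-2"))   # (b'89', 'bytes 8-9/10')
```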

foerster|1 year ago

no feedback on this in particular, but I love supabase. I use it for several projects and it's been great.

I was hesitant to use triggers and PG functions initially, but after I got my migrations sorted out, it's been pretty awesome.

SOLAR_FIELDS|1 year ago

Do you manage your functions and triggers through source code? What framework do you use for that? I like Supabase, but its tendency to default to native PG features has steered me away from using it for more complex projects, where you need sprocs to retrieve data and pgTAP to test them; hiding business logic in the database like that is viewed as an anti-pattern in a lot of organizations. I love it for simple CRUD apps, though, the kind where the default PostgREST functionality is mostly enough and having to drop into a sproc or build a view is rarely necessary.

I think if there were a tightly integrated framework for managing the state of all these triggers, views, functions, and sprocs through source, integrating them into the normal SDLC, it would be a more appealing sell for complex projects.

pier25|1 year ago

At $0.1/GB of egress it’s not super attractive compared to B2 or R2 for anything but trivial projects.

I wish they would offer a plan with just the pg database.

Any news on pricing of Fly PG?

inian|1 year ago

We are hosted on AWS and just pass the cost through to our users. We make no margin on egress fees. Deploying Storage on other clouds, including Fly.io, is planned.

We are actively working on our Fly integration. At the start, the pricing is going to be exactly the same as our hosted platform on aws - https://supabase.com/docs/guides/platform/fly-postgres#prici...

sgt|1 year ago

Can Supabase host static content yet (in a decent way)?

fenos|1 year ago

We don’t support static website hosting just yet - might happen in the future :)

iamcreasy|1 year ago

In this setup, can PostgreSQL query data stored in object storage, i.e. a Hive/Iceberg table?

poxrud|1 year ago

Do you support S3 event notifications?

withinboredom|1 year ago

Now we just need flutterflow to get off the firebase bandwagon.

kiwicopple|1 year ago

(Supabase team member.) Firebase is an amazing tool for building fast. I want Supabase to be a "tool for everyone", but ultimately giving developers choices between various technologies is a good thing. I think it's great that Flutterflow supports both Firebase & Supabase.

I know Flutterflow's Firebase integration is a bit more polished, so hopefully we can work more closely with the FF team to make our integration more seamless.

denysvitali|1 year ago

Friendly reminder that Supabase is really cool, and if you haven't tried it out you should (everything can be self-hosted, and they have generous free tiers!)

isoprophlex|1 year ago

Plus, with all the inflated VC-money-fueled hype around vector databases, they seem to have the only offering in this space that actually makes sense to me. With them you can store your embeddings close to all the rest of your data, in a single Postgres DB.

kiwicopple|1 year ago

thanks for taking the time to leave a comment. it's nice to get some kind words between the feedback (which we also like)

joshxyz|1 year ago

their team is crazy good

stephen37|1 year ago

Cool to see Supabase adding the S3 protocol! Nice to see more and more storage solutions become available.

At Milvus, we've integrated S3, Parquet, and others to make it possible for developers to use their data no matter what they use.

For those who have used both: how does the performance and ease of integration compare between Supabase and solutions like Milvus that have had these features for some time?

teaearlgraycold|1 year ago

Just make your own post if you’re selling your product

jongjong|1 year ago

Nice. Will check out Milvus.