
ttiurani | 6 months ago

> You can run one (and it costs about $300/mo to run a Bluesky AppView ingesting all data currently on the network in real time if you want to do that).

A clarifying question: in the blog post [0] I found about zeppelin.social, which I think is a full AppView, the author said this:

"The cost to run this is about US $200/mo, primarily due to the 16 terabytes of storage it currently uses"

Last I heard, the amount of storage was just a couple of terabytes, so the growth seems to be very fast.

If and when the primary cost is the storage, IMO the crucial question is: what's the expected future cost of running community AppViews?

Because unless storage cost drops as fast as the Bluesky data grows (unlikely?), to me this architecture looks like it will very soon kick out smaller players and leave only Bluesky with enough money to keep an AppView running.
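
To make the question concrete, here is a rough back-of-the-envelope sketch. Only the 16 TB and $200/mo figures come from the quoted post; the growth and price-drop rates are made-up assumptions, and the whole bill is treated as storage for simplicity.

```python
def projected_monthly_cost(
    current_tb: float,
    cost_per_tb: float,
    data_growth_per_year: float,
    price_drop_per_year: float,
    years: int,
) -> list[float]:
    """Return the projected monthly storage bill for year 0..years."""
    costs = []
    for year in range(years + 1):
        # Data compounds upward, price per TB compounds downward.
        size_tb = current_tb * (1 + data_growth_per_year) ** year
        price = cost_per_tb * (1 - price_drop_per_year) ** year
        costs.append(size_tb * price)
    return costs


# From the quoted post: about 16 TB for about $200/mo.
CURRENT_TB = 16.0
COST_PER_TB = 200.0 / CURRENT_TB  # 12.5 USD per TB per month

# Hypothetical rates, chosen only for illustration:
# data doubles every year, price per TB falls 20% every year.
costs = projected_monthly_cost(CURRENT_TB, COST_PER_TB, 1.0, 0.2, 3)
for year, cost in enumerate(costs):
    print(f"year {year}: ${cost:,.0f}/mo")
```

The bill only stays flat when (1 + growth) * (1 - price drop) equals 1, for example data doubling while the price per TB halves every year. Any faster growth than that compounds into a rising bill.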

[0] https://whtwnd.com/futur.blue/3ls7sbvpsqc2w


danabramov | 6 months ago

I can’t speak to how fast it grows and what it was, but I mean — if what you want is to keep the entire data of the network (similar to having all tweets on Twitter) ready to be queried then you have to store them. That’s just unavoidable in any technological solution. Alternatively you could hydrate and query posts on-demand from their sources (PDS), and people have done that as an experiment, but you need at least some aggregation to happen somewhere (for reconstructing reply lists or like lists etc). If more collectively-run network caches are available, this becomes more feasible without storing everything yourself.

In any case, if you’re okay with a partial snapshot of the network (e.g. all posts during some window, or something even more partial) then you can narrow that down arbitrarily. In Mastodon, having a “full” archive is downright impossible, which is why we’re not having the same conversation about Mastodon. Whereas ATProto makes it possible, with the cost being the floor of what you’d expect storing that data to cost. How could it be better?

ttiurani | 6 months ago

> if what you want is to keep the entire data of the network (similar to having all tweets on Twitter) ready to be queried then you have to store them.

They need to be stored, but do they technically have to be stored by just one AppView? I get that it's 100x easier to implement it like that, but I don't think a distributed search would've been technically impossible (although, granted, it would necessarily have had worse UX).
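
A toy sketch of what such a distributed search could look like, as a plain scatter-gather: each node holds only a slice of the network and a query fans out to all of them. The `Shard` class, its `search` method, and the sample posts are all hypothetical and in-memory; none of this is an ATProto API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Post:
    uri: str
    text: str
    created_at: int  # Unix timestamp


class Shard:
    """Hypothetical node that stores only a slice of all posts."""

    def __init__(self, posts: list[Post]) -> None:
        self.posts = posts

    def search(self, term: str, limit: int) -> list[Post]:
        # Naive substring match; a real node would use a proper index.
        hits = [p for p in self.posts if term.lower() in p.text.lower()]
        return heapq.nlargest(limit, hits, key=lambda p: p.created_at)


def scatter_gather(shards: list[Shard], term: str, limit: int) -> list[Post]:
    """Query every shard in parallel and merge the newest results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = pool.map(lambda s: s.search(term, limit), shards)
        merged = [post for partial in partials for post in partial]
    return heapq.nlargest(limit, merged, key=lambda p: p.created_at)


shards = [
    Shard([Post("at://a/1", "hello atproto", 100), Post("at://a/2", "lunch", 300)]),
    Shard([Post("at://b/1", "ATProto appview costs", 200)]),
]
results = scatter_gather(shards, "atproto", limit=10)
print([p.uri for p in results])
```

Every query touches every node and is only as fast as the slowest one, which is where the worse UX would come from.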

Choosing this feature and then implementing it the way they did was a technical choice. Technical choices have consequences, and this, I think, is the one that will prevent Bluesky from reaching any meaningful decentralization.

And saying "you can create an inferior UX with affordable costs" is not a real answer. Any meaningful decentralization IMO can only happen if it's affordable to create feature-identical nodes. That can only happen if you refuse to implement features in ways that need centralization to scale.