> But we’ve hit the ceiling for SSE. That terrible Claude UI refresh gif is state of the art for SSE. And it sucks.
This has nothing to do with SSE. It's trivial to persist state over disconnects and refreshes with SSE. You can do all the same pub/sub tricks.
None of these companies are even using brotli on their SSE connections, which can give 40-400x compression.
It's just bad engineering, and it's going to be much worse with WebSockets: you have to rebuild what HTTP gives you from scratch, compression is nowhere near as good, bidirectional traffic nukes your mobile battery because of the duplex antenna, etc.
Just to add: the main value of WebSockets was faster upstream events pre-HTTP/2. With multiplexing in HTTP/2, that's no longer the case.
So the only thing you get from WebSockets is bidirectional events (at the cost of all the production challenges WebSockets bring). In practice, most problems don't need that feature.
Dunno, in my Go+HTMX project, it was pretty trivial to add SSE streaming. When you open a new chat tab, we load existing data from the DB and then HTMX initiates SSE streaming with a single tag.

When the server receives an SSE request from HTMX, it registers a goroutine and a new Go channel for this tab. The goroutine blocks and waits for new events on the channel. When something triggers a new message, a dispatcher saves the event to the DB and then iterates over the registered Go channels, sending the event to each. On a new event in the tab's channel, the tab's goroutine unblocks and passes the event from the channel to the SSE stream. HTMX handles inserting the new data into the DOM.

When a tab closes, the goroutine receives the notification via the request's context (another Go primitive), deregisters the channel and exits. If the server restarts, HTMX automatically reopens the SSE stream. It took probably one evening to implement.
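The architecture described above can be sketched roughly like this. This is a minimal, illustrative Go version: the names, buffer sizes, and query-param tab ID are my own assumptions, not from the original project:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// hub tracks one channel per open tab.
type hub struct {
	mu   sync.Mutex
	tabs map[string]chan string // tabID -> event channel
}

func newHub() *hub { return &hub{tabs: make(map[string]chan string)} }

func (h *hub) register(tabID string) chan string {
	h.mu.Lock()
	defer h.mu.Unlock()
	ch := make(chan string, 16)
	h.tabs[tabID] = ch
	return ch
}

func (h *hub) deregister(tabID string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.tabs, tabID)
}

// dispatch fans an event out to every registered tab; in the real system
// this is also where the event would be saved to the DB first.
func (h *hub) dispatch(event string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, ch := range h.tabs {
		select {
		case ch <- event:
		default: // drop rather than block the dispatcher on a full buffer
		}
	}
}

// sseHandler blocks on the tab's channel and writes SSE frames until the
// request context is cancelled (i.e. the tab closes).
func (h *hub) sseHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	tabID := r.URL.Query().Get("tab")
	ch := h.register(tabID)
	defer h.deregister(tabID)
	flusher, _ := w.(http.Flusher)
	for {
		select {
		case <-r.Context().Done(): // tab closed
			return
		case msg := <-ch:
			fmt.Fprintf(w, "data: %s\n\n", msg)
			if flusher != nil {
				flusher.Flush()
			}
		}
	}
}
```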
We resolved this by creating a separate context for the lifecycle of a chat/turn so if the user leaves the page, the process continues on the server. UI calls an RPC to fetch in progress turn, which allows it to resume, or if it's done, simply render the full turn.
Assuming the traditional stateless routing of requests (say, round-robin from load balancers): how do you make sure the returning UI client ends up on the same backend server replica that's hosting the conversation?
Or is it that all your tokens go through a DB anyway?
It's fairly easy to keep an agent alive when a client goes away. It's a lot harder to attach the client back to that agent's output when the client returns, without stuffing every token through the database.
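One way to reattach without pushing every token through the database is to buffer the in-flight turn in memory and replay from the client's `Last-Event-ID` on reconnect. A minimal Go sketch, assuming the reconnect reaches the same replica (sticky sessions) or the buffer lives in shared storage such as Redis; the names are illustrative:

```go
package main

import (
	"strconv"
	"sync"
)

// tokenLog buffers an in-flight turn's tokens in memory. The slice index
// doubles as the SSE event id, so a reconnecting client's Last-Event-ID
// header tells us exactly where to resume.
type tokenLog struct {
	mu     sync.Mutex
	tokens []string
}

// append stores a token and returns its event id.
func (l *tokenLog) append(tok string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.tokens = append(l.tokens, tok)
	return len(l.tokens) - 1
}

// since returns every token after the given Last-Event-ID value.
// An empty or malformed id means "replay from the start".
func (l *tokenLog) since(lastEventID string) []string {
	last, err := strconv.Atoi(lastEventID)
	if err != nil {
		last = -1
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	if last+1 >= len(l.tokens) {
		return nil // client is already caught up
	}
	out := make([]string, len(l.tokens)-(last+1))
	copy(out, l.tokens[last+1:])
	return out
}
```

The turn still gets persisted once when it finishes; the DB just isn't in the hot path of every token.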
The SSE thing is a symptom of something bigger, imo. These models are stateless, but we often act like context windows are memory. Nothing around them actually remembers anything, and vector search doesn't fix it. I went down this rabbit hole recently: https://philippdubach.com/posts/beyond-vector-search-why-llm...
This is a feature of the web. Browser refreshes SHOULD dump state. Otherwise it can be difficult to recover from system errors. Of course, if you can build a system that is guaranteed to never have bugs, then go ahead and disable this feature. But users may still be confused as to why refreshing hasn't restarted their window.
It's interesting because this is a solved problem with collaborative docs.
CRDT or OT will work great here, and they're arguably even overkill. But so many of the edge cases you'd usually need to think about just disappear.
(I've built an agent / chat that used CRDT to represent the chat. You can have an arbitrary number of tabs, closing/opening at any time. All real time, in sync.)
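For intuition, a chat modeled as a grow-only set of uniquely-keyed messages is already a trivial CRDT: merge is set union, so every tab converges no matter when it connects or disconnects. A minimal sketch of the idea; a production system would more likely use a library such as Yjs or Automerge:

```go
package main

import "sort"

// Msg is one chat message; the ID must be globally unique (e.g. a ULID).
type Msg struct {
	ID   string
	TS   int64 // timestamp used only for display ordering
	Text string
}

// ChatReplica is one tab's copy of the chat: a grow-only set keyed by ID.
type ChatReplica struct{ msgs map[string]Msg }

func NewChatReplica() *ChatReplica { return &ChatReplica{msgs: map[string]Msg{}} }

func (c *ChatReplica) Add(m Msg) { c.msgs[m.ID] = m }

// Merge is set union: commutative, associative, and idempotent, which is
// exactly what makes "arbitrary tabs, opening and closing at any time" safe.
func (c *ChatReplica) Merge(other *ChatReplica) {
	for id, m := range other.msgs {
		c.msgs[id] = m
	}
}

// Render returns the messages in a deterministic order, so every replica
// displays the same history once they've merged.
func (c *ChatReplica) Render() []Msg {
	out := make([]Msg, 0, len(c.msgs))
	for _, m := range c.msgs {
		out = append(out, m)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].TS != out[j].TS {
			return out[i].TS < out[j].TS
		}
		return out[i].ID < out[j].ID // tiebreak so the order is total
	})
	return out
}
```

Edits and deletions need a richer CRDT (hence the libraries), but append-only chat really is this simple.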
t3.chat solves this pretty well. I believe they use Convex DB. I think it works something like this: a backend server process holds the true connection and state of the chat, and the front end syncs and receives updates from it.
Gemini saves your chats, which are presented in a pane on the left that you can expand and search. You can jump back into any chat and continue it, or delete individual chats.
This history is attached to your Google account, not to the chat window. You can pick up an existing chat in another browser on another device where you are authenticated with the same Google identity.
Now, about the specific usage scenario in this article (hitting refresh immediately after submitting a prompt, while the response is coming in): not sure why that would be important?
I just tried it a couple of times. Both times, it initially appeared as if the Gemini interface had lost the chats, since they didn't appear in the chat history section of the left pane. But after another refresh, they appeared. So there is just some delay.
Anyway, in this regard it goes beyond just giving a damn.
Lmao sorry but you completely missed the point of the article.
Yes, of course all chat providers store your chats, and they will be available eventually, once the response has finished streaming and has been dumped to a DB.
This is about live streaming getting lost and not being reconnected (and restreamed) when you refresh the page.
And since chatting with AI and seeing the responses streamed is a major use case, the author was right to question why, e.g., Anthropic wouldn't invest some of the $30B in fixing this glaring problem.
Especially since it looks like your initial message was not received by the backend server at all!
It may not be super critical, but it's like saying "my Ferrari sometimes shows the wrong speed. It's still driving, but the speedometer is stuck. It does get back to the correct speed eventually though, so no biggie".
mrieck|9 days ago
Go to ChatGPT.com while logged in and start typing right away: about 8 words in, it clears the text in the form. Why?
hglaser|9 days ago
Very weird that the foundational LLM companies' own chat pages don't do this.
luxurytent|9 days ago
Wasn't that complex!
kazinator|9 days ago
Some are using Google Gemini.