top | item 46258299

atum47 | 2 months ago

I usually have these technical hypothetical discussions with ChatGPT (I can share if you like), with me asking him things like: aren't LLMs just huge Markov chains?! And now I see your project... Funny.

pavel_lishin|2 months ago

> I can share if you like

Respectfully, absolutely nobody wants to read a copy-and-paste of a chat session with ChatGPT.

atum47|2 months ago

When you say nobody you mean you, right? You can't possibly be answering for every single person in the world.

I was having a discussion about similarities between Markov chains and LLMs, and shortly after I found this topic on HN. When I wrote "I can share if you like", it was meant as proof of the coincidence.

matusp|2 months ago

LLMs are indeed Markov chains. The breakthrough is that we are able to efficiently compute well performing probabilities for many states using ML.

famouswaffles|2 months ago

LLMs are not Markov Chains unless you contort the meaning of a Markov Model State so much you could even include the human brain.

arboles|2 months ago

Markov models with more than 3 words as "context window" produce very unoriginal text in my experience (corpus used had almost 200k sentences, almost 3 million words), matching the OP's experience. These are by no means large corpuses, but I know it isn't going away with a larger corpus.[1] The Markov chain will wander into "valleys" of reproducing paragraphs of its corpus one for one because it will stumble upon 4-word sequences that it has only seen once. This is because 4 words form a token, not a context window. Markov chains don't have what LLMs have.

If you use a syllable-level token in Markov models the model can't form real words much beyond the second syllable, and you have no way of making it make more sense other than increasing the token size, which exponentially decreases originality. This is the simplest way I can explain it, though I had to address why scaling doesn't work.

[1] There are 400000^4 possible 4-word sequences in English (barring grammar), meaning only a corpus with 8 times that many words and no repetition could offer two ways to chain each possible 4-word sequence.
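The "valleys" effect described above can be checked on a toy scale. The following is a minimal sketch, not anyone's actual implementation: the tiny corpus and the helper names `build_chain`, `forced_fraction`, and `generate` are all made up for illustration. It builds a word-level Markov chain for a given order and measures what share of contexts have exactly one distinct successor, i.e. where the chain has no choice but to copy the corpus.

```python
import random
from collections import defaultdict


def build_chain(words, order):
    """Map each `order`-word context to the list of words seen right after it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        chain[context].append(words[i + order])
    return chain


def forced_fraction(chain):
    """Fraction of contexts with exactly one distinct successor (no real choice)."""
    forced = sum(1 for successors in chain.values() if len(set(successors)) == 1)
    return forced / len(chain)


def generate(chain, order, length, seed=0):
    """Random walk over the chain, starting from a randomly chosen context."""
    rng = random.Random(seed)
    output = list(rng.choice(list(chain)))
    for _ in range(length):
        successors = chain.get(tuple(output[-order:]))
        if not successors:  # context only occurs at the very end of the corpus
            break
        output.append(rng.choice(successors))
    return " ".join(output)


# Hypothetical toy corpus; a real experiment would use far more text.
corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat saw the dog . the dog saw the cat on the mat ."
).split()

for order in (1, 2, 4):
    chain = build_chain(corpus, order)
    print(f"order={order} forced contexts: {forced_fraction(chain):.2f}")
    print("  sample:", generate(chain, order, 12))
```

On a corpus this small the numbers are only suggestive, but the same measurement on a larger corpus is one way to see how quickly longer contexts stop offering any branching.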

cwyers|2 months ago

Yeah, there's only two differences between using Markov chains to predict words and LLMs:

* LLMs don't use Markov chains,
* LLMs don't predict words.

srean|2 months ago

They are definitely not Markov chains; they may, however, be Markov models. There's a difference between MC and MM.

unknown|2 months ago

[deleted]

atum47|2 months ago

Don't know what happened. I stumbled onto a funny coincidence (me talking to an LLM about its similarities with MC) and decided to share it on a post about using MC to generate text. Got some nasty comments and a lot of downvotes, even though my comment sparked a pretty interesting discussion.

Hate to be that guy, but I remember this place being nicer.

roarcher|2 months ago

Ever since LLMs became popular, there's been an epidemic of people pasting ChatGPT output onto forums (or in your case, offering to). These posts are always received similarly to yours, so I'm skeptical that you're genuinely surprised by the reaction.

Everyone has access to ChatGPT. If we wanted its "opinion" we could ask it ourselves. Your offer is akin to "Hey everyone, want me to Google this and paste the results page here?". You would never offer to do that. Ask yourself why.

These posts are low-effort and add nothing to the conversation, yet the people who write them seem to expect everyone to be impressed by their contribution. If you can't understand why people find this irritating, I'm not sure what to tell you.

pavel_lishin|2 months ago

Nobody was being nasty. roarcher explained why people reacted the way they did.

roarcher|2 months ago

...are you under the impression that you have an exclusive relationship with "him"? Everyone else has access to ChatGPT too.

atum47|2 months ago

Yes. Yes I was. Thank you for the wake up call. I was under the impression that he was talking only to me.