top | item 44518903

(no title)

falleng0d | 7 months ago

[flagged]

It's not word salad, Grok was literally posting unironic praise for Hitler two days ago.

Levitz|7 months ago

It was also stating that the life of a single Jew is worth more than that of two million non-Jews.

LLMs can occasionally say crazy stuff; that is not surprising, and I think we should do better than leaning into the outrage machine.

The opposite is how we end up with ridiculous guardrails, like having ChatGPT say that it would rather allow all of humanity to perish than to say the N word, a statement which is orders of magnitude worse, only more publicly palatable.

mgoetzke|7 months ago

LLMs can be baited; small changes to system prompts can cause this quite unexpectedly, as many big companies found out by accident.

We fix it and move on.

Treegarden|7 months ago

It was, but so were other models before it. OP said the Twitter-to-Grok feature is a good use case, and I agree. It's great for fact-checking. For example, it will debunk conspiracy theories and misinformation tweets in general. I even asked it about its own Hitler meltdown and it rejected its own words (so I must have asked it after they fixed it).