(no title)
daoboy | 5 months ago
I recently started an audio dream journal and want to keep it private. Set up whisper to transcribe the .wav file and dump it in an Obsidian folder.
The plan was to put a local llm step in to clean up the punctuation and paragraphs. I entered instructions to clean the transcript without changing or adding anything else.
Hermes responded by inventing an intereview with Sun Tzu about why he wrote the Art of War. When I stopped the process it apologized and advised it misunderstood when I talked about Sun Tzu. I never mentioned Sun Tzu or even provided a transcript. Just instructions.
We went around with this for a while before I could even get it to admit the mistake, and it refused to identify why it occurred in the first place.
Having to meticulously check for weird hallucinations will be far more time consuming than just doing the editing myself. This same logic applies to a lot of the areas I'd like to have a local llm for. Hopefully they'll get there soon.
simonh|5 months ago
I suppose we shouldn’t be surprised in hindsight. We trained them on human communicative behaviour after all. Maybe using Reddit as a source wasn’t the smartest move. Reddit in, Reddit out.
smallmancontrov|5 months ago
root_axis|5 months ago
More fundamental than the training data is the fact that the generative outputs are statistical, not logical. This is why they can produce a sequence of logical steps but still come to incorrect or contradictory conclusions. This is also why they tackle creativity more easily since the acceptable boundaries of creative output is less rigid. A photorealistic video of someone sawing a cloud in half can still be entertaining art despite the logical inconsistencies in the idea.
HankStallone|5 months ago
dragonwriter|5 months ago
It is easy, comparatively. Accuracy and correctness is what computers have been doing for decades, except when people have deliberately compromised that for performance or other priorities (or used underlying tools where someone else had done that, perhaps unwittingly.)
> Yet here we are, the actual problem is inventing new heavy enough training sticks to beat our AIs out of constantly making stuff up and lying about it.
LLMs and related AI technologies are very much an instance of extreme deliberate compromise of accuracy, correctness, and controllability to get some useful performance in areas where we have no idea how to analytically model the expected behavior but have lots of more or less accurate examples.
unknown|5 months ago
[deleted]