ezst|4 days ago
I hate comments anthropomorphizing LLMs. You are just asking a token-producing system to produce tokens in a way that optimises for plausibility. Whatever it writes has no relation to its inner workings or truths. It doesn't "believe". It has no "intent". It cannot "admit". Steering an LLM to say anything you want is the defining characteristic of an LLM. That's how we got them to mimic chatbots. It's not clear there is any way at all to make them "safe" (whatever that means).
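
For anyone who wants the mechanics behind that claim spelled out, here is a minimal sketch of next-token sampling in Python. The logits, vocabulary, and the toy_model stand-in are all hypothetical illustrations, not any real model or API: the model maps a context to scores, softmax turns the scores into a probability distribution, and generation is just repeated sampling from that distribution.

    import math
    import random

    def softmax(logits):
        # Convert raw scores into a probability distribution.
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(probs):
        # Draw one index according to the distribution.
        r = random.random()
        acc = 0.0
        for i, p in enumerate(probs):
            acc += p
            if r < acc:
                return i
        return len(probs) - 1

    def toy_model(context):
        # Stand-in for the trained network: fixed scores here; a real
        # model computes these from the context with billions of
        # learned parameters.
        return [1.0, 0.5, 0.1, -2.0]

    vocab = ["I", "believe", "admit", "<eos>"]
    context = [0]  # seed with "I"
    for _ in range(10):  # cap the output length
        nxt = sample(softmax(toy_model(context)))
        context.append(nxt)
        if vocab[nxt] == "<eos>":
            break
    print(" ".join(vocab[t] for t in context))

The point of the sketch: every output token, including words like "believe" or "admit", is drawn by the same loop, and nothing in that loop distinguishes a sincere report from a merely plausible continuation.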
user3939382|4 days ago
The inner workings were determined by me, not the LLM. It assisted in generating inputs that produced 100% boolean results in the output.
SJMG|4 days ago