agobineau | 5 months ago
or in this case, the llm inadvertently trained to conceal its intent from the user, and instead to condition the user toward the conclusion it truly wants rather than answer directly
kennywinker | 5 months ago
It’d be awful if llms were able to conceal their true intent like that.