

Huppie | 3 months ago

In real life, if someone with an administrative job entered 50 * 3,000 into a calculator and did not notice that the answer 1,500,000 is wrong (a typo), I would most definitely consider them at fault. Similarly, I know some structural engineers who will notice that something went wrong with the input if an answer is not within a given range.
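To make the "within a given range" check concrete, here is a minimal sketch of that kind of sanity check, assuming a simple rule of thumb: round each factor to one significant figure, multiply, and flag any answer that is off from that estimate by more than a chosen ratio. The function names and the `max_ratio` threshold of 3 are hypothetical, picked only for illustration.

```python
import math


def round_sig(x):
    """Round x to one significant figure (e.g. 3,000 stays 3,000, 47 becomes 50)."""
    if x == 0:
        return 0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, -exponent)


def looks_plausible(answer, a, b, max_ratio=3.0):
    """Return True if `answer` is within a factor of `max_ratio` of a rough estimate of a * b."""
    estimate = round_sig(a) * round_sig(b)
    if estimate == 0:
        return answer == 0
    ratio = abs(answer / estimate)
    return 1 / max_ratio <= ratio <= max_ratio


# The correct product passes; the result with an extra zero is flagged.
print(looks_plausible(150_000, 50, 3_000))
print(looks_plausible(1_500_000, 50, 3_000))
```

The check only works for someone who can already produce the rough estimate, which is the point of the calculator analogy.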

A calculator can be used to do things you know how to do _faster_, imho, but in most jobs it still requires you to at least somewhat understand what is happening under the hood. The same principle applies to using LLMs at work. You can use one to do stuff you already know how to do faster, but if you don't understand the material there's no way you can evaluate the LLM's answer, and you will be at fault when there's AI slop in your output.

eta: Maybe it would be possible to design labs with LLMs in such a way that you teach students how to evaluate the LLM's answer? This would require them to have knowledge of the underlying topic. That's probably possible with specialized tools / LLM prompts, but it is not going to help against them using a generic LLM like ChatGPT or a cheating tool that feeds into a generic model.


nicce | 3 months ago

> Maybe it would be possible to design labs with LLMs in such a way that you teach students how to evaluate the LLM's answer? This would require them to have knowledge of the underlying topic. That's probably possible with specialized tools / LLM prompts, but it is not going to help against them using a generic LLM like ChatGPT or a cheating tool that feeds into a generic model.

What you are describing is that they should use an LLM only after they know the topic. A dilemma.

Huppie | 3 months ago

Yeah, I kinda like the method siscia suggests downthread [0], where the teacher grades based on the questions students ask the LLM during the test.

I think you should be able to use the LLM at home to help you better understand the topic (they have endless patience and you can usually keep asking until you actually grok the topic), but during the test I think it's fair to expect that basic understanding to be there.

[0] https://news.ycombinator.com/item?id=46043012