neuronic | 5 months ago

How are you getting these results? Even with grounding in sources, careful context engineering, and whatever other technique comes to mind, we are just getting sloppy junk out of every model we have tried.

The sketchy part is that LLMs are super good at faking confidence and expertise while randomly injecting subtle but critical hallucinations. This ruins basically all significant output. Double-checking and babysitting the results is a huge time and energy sink; human post-processing negates nearly all of the benefits.

It's not like there is zero benefit, but I am genuinely curious how you get consistently correct output for a "complicated subject matter like insurance".

bdangubic|5 months ago

I genuinely think that the biggest issue with LLM tools is that most people expect magic, because first attempts at some simple things feel magical. however, the tools take an insane amount of time to build expertise in. what is confusing is that SWEs in general have spent immense amounts of time learning the tools of the trade, but this seems to escape a lot of people when it comes to LLMs. on my team, every developer is using LLMs all day, every day. on average, based on sprint retros, each developer spends no less than an hour each day experimenting/learning/reading… how to make them work. the realization we made early is that when it comes to LLMs there are two large groups:

- the group that sees them as invaluable tools capable of being an immense productivity multiplier

- the group that tried things here and there and gave up

we collectively decided that we want to be in the first group and were willing to put in the time to get there.

danpalmer|5 months ago

I'm persisting: I've been using LLMs quite a bit for the last year, and they're now where I start with any new project. Throughout that time I've been experimenting constantly and have made significant workflow improvements.

I've found that they're a moderate productivity increase, i.e. on a par with, say, using a different language, using a faster CI system, or breaking down some bureaucracy. Noticeable, worth it, but not entirely transformational.

I only really get useful output from them when I'm holding _most_ of the context that I'd be holding if writing the code, and that's a limiting factor on how useful they can be. I can delegate things that are easy, but I'm hand-holding enough that I can't realistically parallelise my work that much more than I already do (I'm fairly good at context switching already).

lomase|5 months ago

I have been in teams that do this and in teams that don't.

I have not seen any tangible difference in the output between the two.

vivzkestrel|5 months ago

don't you think we would be better off getting that expertise in actual system design, software engineering, and all the programming-related fields? by relying on ChatGPT to write code, we'll eventually lose the skill to sit down and craft code like we have all these years. after all, the brain's neural pathways only retain what you put to work daily.

caseyf7|5 months ago

Where are you finding the best material for reading/learning?

oblio|5 months ago

> It's not like there is zero benefit, but I am genuinely curious how you get consistently correct output for a "complicated subject matter like insurance".

Most likely by trying to get a promotion or bonus now and getting the hell out of Dodge before anyone notices those subtle landmines left behind :-)

fn-mote|5 months ago

Cynical, but maybe not wrong. We are plenty familiar with ignoring technical debt and letting it pile up. Dodgy LLM code seems like more of that.

Just like tech debt, there's a time for rushing. And if you're really getting good results from LLMs, that's fabulous.

I don't have a final position on LLMs, but it has only been two days since I worked with a colleague who definitely had no idea how to proceed once they were off the "happy path" of LLM use, so I'm sure there are plenty of people getting left behind.

gamblor956|5 months ago

A lot of programmers who say LLMs are awesome tend to be inexperienced or not very good programmers, or they gloss over the significant amount of extra work that using LLMs requires.

Programmers tend to overestimate their knowledge of non-programming domains, so the OP probably just isn't seeing that there are serious issues with the LLM's output for a complicated subject matter like insurance.

cjbarber|5 months ago

What are you trying to use LLMs for and what model are you using?

0000000000100|5 months ago

It depends a lot. I use it for one-off scripts, particularly for anything Microsoft 365-related (expanding SharePoint drives, analyzing AWS usage, general IT stuff). Where there is a lot of heavy, context-dependent business logic it will fail, since there's too much context for it to be successful.
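
For a concrete example, the "analyzing AWS usage" case is usually a throwaway script along these lines (a rough sketch rather than anything we actually shipped; it assumes boto3 is installed and the credentials can call ce:GetCostAndUsage):

    # One-off: summarize last month's AWS spend by service via Cost Explorer.
    import datetime

    import boto3

    first_of_this_month = datetime.date.today().replace(day=1)
    first_of_last_month = (first_of_this_month - datetime.timedelta(days=1)).replace(day=1)

    ce = boto3.client("ce")
    resp = ce.get_cost_and_usage(
        TimePeriod={
            "Start": first_of_last_month.isoformat(),
            "End": first_of_this_month.isoformat(),  # End date is exclusive
        },
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )

    # A single month was requested, so there is one ResultsByTime bucket.
    for group in resp["ResultsByTime"][0]["Groups"]:
        service = group["Keys"][0]
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        if amount >= 0.01:  # skip sub-cent noise
            print(f"{service:<45} ${amount:>10.2f}")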

I work in custom software, where the gap between non-LLM users and those who at least roughly know how to use it is huge.

It largely depends on the prompt, though. Our ChatGPT account is shared, so I get to take a gander at the other usage, and it's pretty easy to see: "okay, this person is asking the wrong thing." The prompt and the context have a major impact on the quality of the response.

In my particular line of work, it's much more useful than not. But I've been focusing on helping build the right prompts with the right context, which makes many tasks actually feasible that would previously have been way out of scope for our clients' budgets.

kace91|5 months ago

Could you give an example of a prompt?