top | item 35300012

gateorade | 2 years ago

This has been my experience. I’m really impressed by how well GPT-4 seems to be able to interpolate between problems heavily represented in the training data to create what feels like novelty, e.g. creating a combination of Pong and Conway’s Game of Life, but it doesn’t seem to be good at extrapolation.

The type of work I do is highly niche. I’ve recently been working on a specific problem for which there are probably at most a hundred implementations running on production systems, all of them highly proprietary. I would be surprised if there were any implementations in GPT’s training set. With that said, this problem is not actually that complicated. A rudimentary implementation can be done in ~100 lines of code.

I asked GPT-4 to write me an implementation. It knew a decent amount about the problem (probably from Wikipedia). If it were actually capable of something close to reasoning, it should have been able to write an implementation, but when it actually started writing code it was reluctant to write more than a skeleton. When I pushed it to implement specific details it completely fell apart and started hallucinating. When I gave it specific information about what it was doing wrong, it acknowledged that it had made a mistake and simply gave me a new, equally wrong hallucination.

The experience calmed my existential fears about my job being taken by AI.

softfalcon|2 years ago

This exact scenario is what I described to a friend of mine who is an AI researcher.

He was convinced that if we trained the AI on enough data, GPT-x would become sentient.

My opinion was similar to yours. I felt that the hallucinating the AI does falls short of true extrapolative thought.

I said this because humans don’t have access to infinite knowledge, and even when they have access to vast amounts of it, they can’t process all of it. Adding endless information for the AI to feed on doesn’t seem like the solution to figuring out true intelligence. It’s just more of the same hallucinating.

Yet despite lacking knowledge, we humans still come up with consistently original thoughts and expressions of our intelligence daily. With limited information, our minds create new representations of understanding. This seems to be impossible for ChatGPT.

I could be completely wrong, but that discussion solidified for me that my role as a dev still has at least a couple more decades of shelf life left.

It’s nice to hear that others are reaching similar conclusions.

visarga|2 years ago

Current LLMs decode in a greedy manner, token by token. In some cases this is good enough, namely for continuous tasks; in other cases, reaching the right end result requires the model to backtrack and try another approach, or to edit its response. This doesn't work well with the way we are using LLMs now, but could be fixed. Then you'd get a model that can do discontinuous tasks as well.

>> Write a response that includes the number of words in your response.

> This response contains exactly sixteen words, including the number of words in the sentence itself.

It contains 15 words.

The model would have to plan everything before outputting the first token if it were to solve the task correctly. Works if you follow up with "Explicitly count the words", let it reply, then "Rewrite the answer".
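The word-count task is really a fixed-point problem: the sentence has to be planned in full before its first word is emitted. As a toy illustration (plain Python, not a claim about any model's internals; the template string here is made up), a program can find a self-consistent sentence by searching instead of generating left to right:

```python
def find_fixed_point(template):
    """Find an n such that the sentence claiming n words really has n words."""
    for n in range(1, 100):
        sentence = template.format(n=n)
        if len(sentence.split()) == n:
            return sentence
    return None  # no self-consistent count in range

print(find_fixed_point(
    "This response contains exactly {n} words, including the number itself."))
```

Greedy left-to-right decoding can't do this kind of search, which is arguably why the follow-up prompts ("Explicitly count the words", then "Rewrite the answer") help: they let the model do the counting in visible tokens before committing to a final sentence.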

andsoitis|2 years ago

> This exact scenario is what I described to a friend of mine who is an AI researcher. He was convinced that if we trained the AI on enough data, GPT-x would become sentient. My opinion was similar to yours. I felt like the hallucinating the AI does was insufficient in performing true extrapolating thought.

It turns out it isn’t just AIs that hallucinate; AI researchers do as well.

Majromax|2 years ago

> He was convinced that if we trained the AI on enough data, GPT-x would become sentient.

Is there enough data?

As I understand it, the latest large language models are trained on almost every piece of available text. GPT-4 is multimodal in part because there isn't an easy way to increase its dataset with more text. In the meantime, text is already quite information dense.

I'm not sure that future models will be able to train on an order of magnitude more information, even if the size of their training sets has a few more zeroes added to the end.

psychphysic|2 years ago

The threshold for sentience is continually falling.

So he might be right but due to time and not due to improved performance.

I believe in the UK all vertebrates are considered sentient (by law, not science). That includes goldfish.

And good luck even getting a goldfish to reverse a linked list. Even after 1000 implementations are provided.

wslh|2 years ago

> He was convinced that if we trained the AI on enough data, GPT-x would become sentient.

Not saying your friend is right or wrong, but imagine if civilization gives more information, in realtime, to an AI system through sensors: would it be at least as sentient as the civilization? Seems like a sci-fi story, a competitor to G-d.

kaba0|2 years ago

I’m not at all an expert on the topic, but from what I gathered LLMs are fundamentally limited in the kind of problems they can approximate. They can approximate any integrable function quite well, but we can only come up with limits on a case-by-case basis for non-integrable ones, and I believe most interesting problems are of this latter kind.

Correct me if I’m wrong, but doesn’t it mean that they can’t recursively “think”, on a fundamental basis? And sure I know that you can pass “show your thinking” to GPT, but that’s not general recursion, just “hard-coded to N iterations” basically, isn’t it? And thus no matter how much hardware we throw at it, it won’t be able to surpass this fundamental limit (and without proof, I firmly believe that for a GAI we do need the ability to basically follow through a train of thought)

aiphex|2 years ago

If they aren't already, AIs will be posting content on social media apps. These apps measure the amount of attention you pay to each thing presented to you. If it's more than a picture or a video, but something interactive, then it could also learn how we interact with things in more complex ways. It also gets feedback from us through the comments section. Like biological mutations, AIs will learn which of its (at first) random novel creations we find utility in. It will then better learn what drives us and will learn to create and extrapolate at a much faster pace than us.

dmichulke|2 years ago

> Yet despite lacking knowledge, us humans still come up with consistently original thoughts and expressions of our intelligence daily.

I think there is some sampling bias in your observation ;-)

oliveiracwb|2 years ago

More data will only mean more inference. But at some unexpected moment, the newly created "senseBERT" breaks the barrier between intelligence and consciousness.

antonvs|2 years ago

> He was convinced that if we trained the AI on enough data, GPT-x would become sentient.

It sounds like he doesn't even understand the basics of what GPT is, or what sentience is. GPT is an impressive manipulator/predictor of language, but we have evidence from all sorts of directions that there's more to sentience or consciousness than that.

braindead_in|2 years ago

I would like to propose a thought experiment concerning the realm of knowledge acquisition. Given that the scope of human imagination is inherently limited, it is inevitable that certain information will remain beyond our grasp; these are the so-called "known unknowns." In the event that an individual generates a piece of knowledge from this inaccessible domain, how might it manifest in our perception? It is likely that such knowledge would appear incomprehensible to us. Consequently, it is worth considering the possibility that the GPT model is not, in fact, experiencing hallucinations; rather, our human understanding is simply insufficient to fully grasp its output.

vasco|2 years ago

> The experience calmed my existential fears about my job being taken by AI.

The issue is that among all the 100k+ software engineers, many don't really do anything novel. How many startups are employing dozens of engineers to create online accessible CRUDs to replace a spreadsheet?

In the company I work for I'd say we have about 15 developers, or about 3 teams, doing interesting work, and everyone else builds integrations, CRUDs, moves a button there and back in "an experiment", adds a new upsell, etc. All these last parts could be done by a PM or good UX person alone, given good enough tools.

The other parts I'm not worried about either.

yohannesk|2 years ago

For the type of engineers you describe, the hard part I think is communication with other devs, communication with product owners, understanding the problem, suggesting different ways of solving the problem, figuring out which department personnel (outside other devs) to talk to about a little detail you're missing... it's not writing the code which is hard, at least in my experience.

oblio|2 years ago

The question is... writing the code is a very small part of the job.

Figuring out what code to write is one of the big parts.

Fixing it when it breaks in many creative ways is the other big part.

How good is ChatGPT at fixing bugs? Security bugs or otherwise?

sterlind|2 years ago

I had a similar experience. I wanted it to write code to draw arcs on a world map, with different bends rather than going on a straight bearing. I did all the tricks, told it to explain its chain of thought, gave it a list of APIs to use (with d3-geo), simplified and simplified and spent a couple hours trying to reframe it.

It just spit out garbage. Because (afaict) there aren't really examples of that specific thing on the Internet. And it's just been weirdly bad at all the cartography-related programming problems I've thrown at it, in general.

And yeah, I'm much less worried about it replacing me now. It's just not.. lucid, yet.

laurels-marts|2 years ago

GPT-4 is reasonably good at D3, and drawing arcs on a projection (e.g. orthographic) is not that unique; you’ll find examples of it on Observable. However, I wonder if you broke the problem down into a small enough task. It performs best if you provide a clear but brief problem description with a code snippet that already kind of does what you want (e.g. using straight lines) and then just ask it to modify your code to calculate arcs instead. The combination of clear description + code, I've found, decreases the likelihood of it getting confused about what you’re asking and hallucinating. If you give it a very long-winded request with no code as a basis, then good luck.
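For what it's worth, the math behind such an arc is just spherical linear interpolation between the two endpoints, which is essentially what d3-geo's geoInterpolate computes. A rough, self-contained sketch (hypothetical helper names, and it ignores the identical/antipodal edge case where sin(ω) = 0):

```python
import math

def to_cartesian(lon, lat):
    """Degrees lon/lat -> 3D unit vector on the sphere."""
    lam, phi = math.radians(lon), math.radians(lat)
    return (math.cos(phi) * math.cos(lam),
            math.cos(phi) * math.sin(lam),
            math.sin(phi))

def to_lonlat(v):
    """3D unit vector -> (lon, lat) in degrees."""
    x, y, z = v
    return (math.degrees(math.atan2(y, x)),
            math.degrees(math.asin(max(-1.0, min(1.0, z)))))

def great_circle_arc(a, b, n=16):
    """Sample n+1 points along the great circle from a to b.

    a and b are (lon, lat) in degrees. Assumes the points are neither
    identical nor antipodal.
    """
    p0, p1 = to_cartesian(*a), to_cartesian(*b)
    dot = max(-1.0, min(1.0, sum(u * v for u, v in zip(p0, p1))))
    omega = math.acos(dot)  # central angle between the points
    points = []
    for i in range(n + 1):
        t = i / n
        # spherical linear interpolation (slerp)
        s0 = math.sin((1 - t) * omega) / math.sin(omega)
        s1 = math.sin(t * omega) / math.sin(omega)
        points.append(to_lonlat(tuple(s0 * u + s1 * v
                                      for u, v in zip(p0, p1))))
    return points
```

To get a visible "bend" rather than the straight great-circle bearing, one common trick is to nudge each sampled point perpendicular to the path, scaled by something like sin(πt) so the endpoints stay fixed.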

noduerme|2 years ago

I imagine that creative approaches to spatial problem solving would be one of the harder areas for it - not just because there are by definition fewer public examples of one-off or original solutions, but also because one has to visualize things in space before figuring out how to code them. These bots don't have a concept of space. I'm thinking of DALL-E (et al.) having problems with "an X above Y, behind Z".

v4dok|2 years ago

GPT-4 has its hands tied behind its back. It does not have active learning, and it does not have a robust system of memory or a reward/punishment mechanism. We are only now starting to see work on this front [1]

It might not know more than you about your niche. Neither do I. I would search and I would try to reason, but if I were forced to give a token-by-token output answering the question as truthfully as possible, I might have started saying bullshit as well.

I don't think that the fact that gpt doesn't know things or does some things wrong is sufficient to save dev work from automation.

[1]: https://github.com/noahshinn024/reflexion-human-eval

blablabla123|2 years ago

> The experience calmed my existential fears about my job being taken by AI.

Same for me. I haven't tried GPT-4 yet, and not on code from work anyway, but GPT-3 seems borderline useless at this point. The hallucinations are quite significant. I also tried to get it to produce advice for Agile development with references, and as stated in other articles, the links were either 404s or pointed to completely unrelated articles.

Still, I'm taking this seriously. Just consider the leaps that happened with AlphaGo/AlphaZero or autonomous driving, which were considered unthinkable in their respective domains before.

zeroonetwothree|2 years ago

Even if AI only takes over “easy” programming jobs, it might still create a huge downward pressure on compensation.

After all, just look at manufacturing. Compared to 1970 we produce 5x the real output but employ only 50% the people. The same will likely happen to fields like programming as AI improves.

olivermuty|2 years ago

For the crap devs maybe, but high-skill devs and architects will be able to charge more than ever to oversee all of this «productivity» from the AIs.

nimbix|2 years ago

I asked it to write a trivial C#/dotnet example of two actors where one sends a ping message and the other responds with pong. It couldn't get the setup stage right, called several methods that don't exist, and had a cyclic dependency between the actors that would probably take some work to resolve.

Even after several iterations of giving it error messages and writing explanations of what's not working, it didn't even get past the first issue. Sometimes it would agree that it needed to fix something, but would then print back code with exactly the same problem.

toss1|2 years ago

Yes, exactly this.

I wrote some questions in the specialist legal field of someone in my household, then started to get into more specialist questions, and then specifically asked about a paper that she wrote innovating a new technique in the field.

The general question answers were very impressive to the attorney. The specialist questions started turning up errors and getting concepts backwards - bad answers.

When I got to summarizing the paper with the new technique, it could not have been more wrong. It got the entire concept backwards and wrong, barfing generic and wrong phrases, and completely ignored the long list of citations.

Worse yet, to the point of hilariously bad, when asked for the author, date, and employer of the paper, it was entirely hallucinating. Literally, the line under the title was the date, and after that was "Author: [name], [employer]". It just randomly put up dates and names (or combinations of real names) of mostly real authors and law firms in the region. Even when the errors were pointed out, it would apologize and then confidently spout a new error. Eventually it got the date correct, and that stuck, but even when prompted with "Look at where it says 'Author: [fname]' and tell me the full name and employer", it would hallucinate a last name and employer. Always with the complete confidence of a drunken bullshit artist.

Similar for my field of expertise.

So, yes, for anything real, we really need to keep it in the middle-of-the-road zone of maximum training. Otherwise, it will provide BS (of course if it is BS we want, it'll produce it on an industrial scale!).

willbudd|2 years ago

Yeah, in that sense I think one of the next logical steps will be providing on-demand lightweight learning/finetuning of LLM versions/forks (maybe as LoRAs?) as an API and integrated UX based on user chat feedback, while abstracting away all the technical hyperparameter and deployment details involved in a DIY setup. With a lucrative price tag of course.

funstuff007|2 years ago

> but it doesn’t seem to be good at extrapolation.

This is true to varying degrees for every statistical model ever.

gateorade|2 years ago

Yeah, that’s basically my point. The hype on HN/Twitter/etc. forgets this.

lostmsu|2 years ago

What would you be able to write with similar requests, if you'd only ever be allowed to use Notepad, and never run compiler/linter/tests, and not allowed to use Internet?

SketchySeaBeast|2 years ago

Given I don't have petabytes of information accessible for instant retrieval (including perfect copies of my language of choice's entire API), I don't think that's comparable. I wouldn't need the internet if I'd memorized a large portion of it.

mannykannot|2 years ago

Unlike current LLMs, your typical competent programmer would not hallucinate.

m3kw9|2 years ago

Quants' jobs are safe, because if it’s public there’s no edge