

amayne | 2 years ago

Hello!

I’m the “shark-diving science journalist” in question.

First of all, you can run the experiments like I did and test this yourself. I’m not asking anyone to take my word for it. Just do what I did: read the original paper and test the claims for yourself.
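To make “test the claims for yourself” concrete, here’s a minimal sketch of the kind of spot check involved. This is an illustration rather than my exact setup: it assumes the openai Python package (v1.x) with an API key in your environment, and it uses the celebrity parent/child pair the paper is best known for.

    # A rough sketch, assuming the openai Python package (v1.x) and an
    # OPENAI_API_KEY set in the environment. The Tom Cruise / Mary Lee
    # Pfeiffer pair is the example the paper is known for; substitute
    # any "A is B" fact you want to probe.
    from openai import OpenAI

    client = OpenAI()

    def ask(question: str) -> str:
        # Send a single question to the model and return its reply.
        reply = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": question}],
        )
        return reply.choices[0].message.content

    # Forward direction (A -> B): models usually answer this correctly.
    print(ask("Who is Tom Cruise's mother?"))

    # Reverse direction (B -> A): the paper claims answers here are
    # essentially random. Run it a few times and judge for yourself.
    print(ask("Who is Mary Lee Pfeiffer's son?"))

The point is the comparison between the two directions over a handful of runs, not any single answer.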

And to clarify a couple things:

1. The shark-diving part is true.

2. I’ve never been a journalist of any kind that I’m aware of, unless you count writing for Skeptic Magazine. I’ve had many, many jobs, though.

3. I started at OpenAI as a software engineer and member of technical staff. When I started, there were just over a hundred people there. The lines between engineering and research were and are blurry.

4. I was the original prompt engineer at OpenAI: I discovered many of the examples for using GPT-3 and wrote a lot of the original documentation. Internally my title was “prompt whisperer.”

5. I’m in the GPT-4 research paper for my contributions to model capability. I helped find capabilities for long text, vision, etc.

6. I was given the title Science Communicator when I started doing background briefings for media, etc., but still worked on model capability and other things.

7. I left OpenAI two months ago to work on a startup.

Best,

Andrew Mayne


user_named | 2 years ago

1. What do you think the reversal curse implies about LLMs?

2. Do you believe that LLMs are capable of logic?

3. Do you believe that LLMs are intelligent?

4. Do you believe that your blog post shows 3 or 4? If not, what is it about?

amayne | 2 years ago

1. I don't think the original research paper demonstrated the reversal curse. They claimed that you'd get only random answers from their example prompt; I showed that wasn't the case. I also pointed out what I believe to be a flaw in how they trained their model, which, when corrected for, gave non-random results.

2. That depends on what you mean by logic. What would be an example of logical reasoning that would settle this?

3. Alan Turing created the Imitation Game thought experiment to show the futility of this question. If intelligence is something that can be observed and tested, then we should be able to describe what to test for.

4. I don't make any specific claims about LLMs' logic or intelligence. I just wanted to put to the test their claim that LLMs can't generalize from "A is B" to "B is A".