top | item 35473881

fdgsdfogijq | 2 years ago

I work on a research team at a FAANG. What it really feels like is that one company made everyone else obsolete. And we are going to keep working on NLP models that underperform ChatGPT by a huge margin. Twiddling my thumbs and keeping quiet while no one wants to acknowledge the elephant in the room.

Also, there is no "working in AI", a few thousand people are doing real AI at most. The rest of us are calling an API.

lacker|2 years ago

This reminds me of the mid-2000s, when there were a lot of smart people working on search algorithms at different companies. But eventually you'd talk to someone smart working on Yahoo Search, and they would just be kind of beaten down by the frustration of working on a competing search engine when Google was considered by far the best. It got harder for them to recruit, and eventually they just gave up.

So... I don't know where you're working. But don't twiddle your thumbs for too long! It's no fun to be in the last half of people to leave the sinking ship.

UncleOxidant|2 years ago

Why do you think OpenAI is so far out in front? It's not like there's a lot of secret sauce here - most of this stuff (transformers, etc.) is all out there in papers. And places like Google & Meta must have a lot more computing resources to train on than OpenAI does, so they should be able to train faster. Do you think OpenAI has discovered something they haven't been open about?

JumpCrisscross|2 years ago

> Why do you think OpenAI is so far out in front?

There is a network effect forming around its models. The strengths of its kit speak for themselves. (It also cannot be overstated how big a first-mover advantage OpenAI gained by making ChatGPT public, something its competitors were too feeble, incompetent and behind the curve to do.)

But as others note, other models are in the ballpark. Where OpenAI is different is in the ecosystem of marketing literature, contracts, code and e.g. prompt engineers being written and trained with GPT in mind. That introduces a subtle switching cost, and not-so-subtle platform advantage, that–barring a Google-scale bout of incompetence–OpenAI is set to retain for some time.

fdgsdfogijq|2 years ago

I don't work at Google; I think the other FAANGs underinvested in this area because they didn't think it was promising. But I will admit, I am suspicious that Google is incompetent. They can probably come back given how much money they will be forced to throw at it. But Bard is clearly behind, and I don't believe their "abundance of caution" arguments for why Bard can't even code.

redox99|2 years ago

I don't know about GPT4, but I'd bet GPT3.5 is pretty traditional and boring. Its power comes from a really good, properly curated dataset (including the RLHF).

GPT3.5 turbo is probably much more interesting, because they seem to have figured out how to make it much more efficient (some kind of distillation?).

GPT4, if I had to make a very rough guess: probably flash attention, 100% of the (useful) internet/books for its dataset, and highly optimized hyperparameters.

I'd say with GPT4 they probably reached the limit of how big the dataset can be, because they are already using all the data that exists. Thus for GPT5 they'll have to scale in other ways.
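The distillation idea speculated about above roughly means training a small "student" model to imitate a large "teacher" model's output distribution instead of (or in addition to) the raw training labels. A minimal sketch of the loss, in pure Python with made-up logits - this is the generic textbook technique, not anything known about OpenAI's actual pipeline:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution.
    Higher temperature softens the distribution, exposing more of
    the teacher's 'dark knowledge' about near-miss tokens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened
    distribution. Minimizing this pushes the student to reproduce
    the teacher's predictions token by token."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(t, s))

# A student that tracks the teacher incurs a lower loss than one that doesn't.
teacher = [2.0, 1.0, 0.1]
close_student = [2.1, 0.9, 0.2]
far_student = [0.1, 1.0, 2.0]
assert distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student)
```

In a real training loop this term is averaged over every token position and usually mixed with the ordinary next-token loss; the payoff is a much smaller, cheaper-to-serve model, which would fit the "turbo" pricing.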

black3r|2 years ago

For a little more than a year I worked at an AI startup doing basically everything other than AI (APIs, webapps, devops...), but from what I've seen there, the "secret sauce" of AI success is the training process (dataset, parameters, fine-tuning steps, ...). And OpenAI hasn't been open about theirs from the beginning.

sangnoir|2 years ago

> Do you think OpenAI has discovered something they haven't been open about?

They have not, which makes me curious about which company gp works for because the "F" and "G" in FAANG are publicly known to already have LLMs. Not sure about Amazon, but I'm guessing they do too.

As an outsider, the amazing thing about ML/AI research is that you get a revolutionary discovery of a technique or refinement that changes everything, and a few months later another seminal paper is published[0]. My bet is ChatGPT is not the last word in AI, and OpenAI will not have a monopoly on upcoming discoveries that will improve the state of the art. They will have to contend with the fact that Google, Meta & Amazon own their datacenters and can likely train models for cheaper[1] than what Microsoft is paying itself via their investment in OpenAI.

0. In no particular order: Deep learning, GANs, Transformers, transfer learning, Style Transfer, auto-encoders, BERT, LLMs. Betting the farm on LLMs doesn't sound like a reasonable thing to do - not saying that's what OpenAI is doing, but there are a lot of folk on HN who are treating LLMs as the holy grail.

1. OpenAI may get a discount, but my prediction is that once they burn through Microsoft's investment, they'll end up "owned" by Microsoft for all intents and purposes.

majormajor|2 years ago

A lot of FAANG data folks aren't on the teams that were doing research into this stuff and weren't using the latest fruits of that research.

OpenAI has released a ton more easy-to-use-for-everyone stuff that has really leapfrogged what a lot of "applied" folks everywhere else were trying to build themselves, despite being on-the-face-of-it more "general."

rqtwteye|2 years ago

I think it’s the way things go usually. The big players have a business to run so they can’t focus much on innovation. OpenAI has the only purpose right now to push AI and nothing else. Once they have a real business they will also slow down.

letitgo12345|2 years ago

They have been collecting human feedback data for 2 years + probably have a lot of data from Copilot + are training with large context models + have invested a ridiculous amount in curating pretraining data -- the kind of stuff that won't get you a ton of pubs (so you won't see Google researchers having focused on it a lot) but apparently turns out to be super important for a good LLM
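The human-feedback data mentioned here is typically used to train a reward model on pairwise comparisons: raters pick which of two responses is better, and the model learns to score the preferred one higher. A rough sketch of the standard Bradley-Terry preference loss used in RLHF reward modeling (the scores are hypothetical; this is the published technique, not OpenAI's actual code):

```python
import math

def preference_loss(score_preferred, score_rejected):
    """Bradley-Terry / logistic loss for RLHF reward modeling:
    -log(sigmoid(r_preferred - r_rejected)).
    The loss shrinks as the reward model ranks the human-preferred
    response further above the rejected one."""
    margin = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A correctly ordered pair is penalized less than the inverted pair.
assert preference_loss(2.0, 0.5) < preference_loss(0.5, 2.0)
```

Each thumbs-up/thumbs-down or A-vs-B comparison collected from users becomes one training pair here, which is why two years of ChatGPT and Copilot traffic is such a hard-to-replicate asset.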

logicallee|2 years ago

All of the neural network architecture for human-level thinking and processing (vision, speech, emotion, abstract thought, balance and fine motor skills, everything) was publicly released in April 2003, twenty years ago this month. It's a 700 megabyte tarball and sets up an 80b-parameter neural network.

What? Huh? Yes the human genome encodes all human level thought.[1] Clearly it does because the only difference between humans that have abstract thought as well as language capabilities and primates that don't is slightly different DNA.

In other words: those slight differences matter.

To anyone who has used GPT since ChatGPT's public release in November and who pays to use GPT 4 now, it is clear that GPT 4 is a lot smarter than 3 was.

However, to the select few who see an ocean in a drop of water, the November release already showed glimmers of abstract thought; many other people dismiss it as an illusion.

To a select few, it is apparent that OpenAI have found the magic parameters. Everything after that is just fine tuning.

Is it any surprise that without OpenAI releasing their weights, models, or training data, Google can't just come up with its own? Why should they when without turning it into weights and models, the human neural network architecture itself is still unmatched (even by OpenAI) despite being digitized twenty years ago?

No, it's no surprise. OpenAI performed what amounts to a miracle, ten years ahead of schedule, and didn't tell anyone how they did it.

If you work for another company, such as Google, don't be surprised that you are ten years behind. After all, the magic formula had been gathering dust on a CD-ROM for 20 years (human DNA, which encodes the human neural network architecture), and nobody made the slightest tangible progress toward it until OpenAI brute-forced a solution using $1 billion of Azure GPUs that Microsoft poured into OpenAI in 2019.

Is your team using $1 billion of GPUs for 3 years? If not, don't expect to catch up with OpenAI's November miracle.

p.s. two months after the November miracle, Microsoft closed a $10 billion follow-on investment in OpenAI.

[1] https://en.m.wikipedia.org/wiki/Human_Genome_Project

andsoitis|2 years ago

Having a model does not a platform or a product make. You also need users and mindshare.

OpenAI is enjoying first mover advantage around the platformication and product-ification of LLMs.

For instance, why has G not yet exposed some next-level capabilities in mail, in docs, and many of their other properties?

Why do Google Assistant and Amazon Alexa and Apple Siri still suck?

nicpottier|2 years ago

Until we see otherwise, don't we have to assume there's some secret sauce? Bard doesn't match GPT4, and it isn't for lack of trying. (Though perhaps that will change; so far, that's the case.)

hackerlight|2 years ago

It probably is the secret sauce which remains undisclosed. Differences that seem small can lead to large differences in model quality.

diego|2 years ago

If you try Bard or Claude or character.ai they are not far behind GPT4. They might even be on par in terms of raw LLM capabilities. ChatGPT has better marketing and in some cases better UX. A lot of this is self-fulfilling. We think it's far ahead, so it appears to be far ahead.

qqtt|2 years ago

ChatGPT is cool and novel, but FAANG's requirements for ML/AI go far beyond what ChatGPT provides as a product. ChatGPT is good at answering questions based on an older data set. FAANG typically requires up-to-date, real-time inference over huge, rapidly changing data sets.

Working on the practical side of ML/AI at FAANG, you will probably be working with some combination of feature stores, training platforms, inference engines, and so on, all attempting to optimize inference and models for specific use cases. That largely means ranking: which ads to show which customers based on feature-store attributes, which shows to recommend to which customers. All of these ranking problems are orthogonal to ChatGPT, which uses relatively stale datasets to answer knowledge-based questions.

Productionizing these ranking models, from training to inference, is a huge scaling problem. ChatGPT hasn't really come close to solving it in a general way (and also solves a different class of problems).
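For concreteness, the real-time ranking loop described above looks roughly like this: fetch fresh features from a feature store, score each candidate with a lightweight model, and sort. Every name below is illustrative (a toy in-memory store and a trivial linear model), not any FAANG's actual stack:

```python
# Toy feature store: in production this is a low-latency service whose
# values (e.g. 7-day click-through rate) are refreshed in near-real-time.
FEATURE_STORE = {
    "ad_1": {"ctr_7d": 0.05, "freshness": 0.9},
    "ad_2": {"ctr_7d": 0.15, "freshness": 0.4},
    "ad_3": {"ctr_7d": 0.08, "freshness": 0.8},
}

# A trivial linear "model"; real systems continuously retrain these weights.
WEIGHTS = {"ctr_7d": 10.0, "freshness": 1.0}

def score(item_id):
    """Dot product of an item's features with the model weights."""
    feats = FEATURE_STORE[item_id]
    return sum(WEIGHTS[name] * value for name, value in feats.items())

def rank(candidates):
    """Return candidate items ordered best-first by model score."""
    return sorted(candidates, key=score, reverse=True)

assert rank(["ad_1", "ad_2", "ad_3"])[0] == "ad_2"  # highest CTR wins here
```

The hard parts that make this a scaling problem are invisible in the sketch: keeping the feature store consistent and fresh at millions of QPS, and retraining/deploying the model without taking serving down.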

yanderekko|2 years ago

Agreed. For my job maintaining real-time models with high business value to be disrupted by a chatbot, an LLM would have to be able to plug into our entire data ecosystem and yield insights in realtime. The backend engineering work required to facilitate this will be immense, and if the answer to that is "an LLM will create a new backend data architecture required to support the front-end prompt systems", then... well, suffice to say I can't see that happening overnight. It will require several major iterative and unpredictable pivots to re-envisage what exactly engineers are doing at the company.

For the time being, I expect LLMs to start creeping their tendrils into various workflows where the underlying engineering work is light but the rate of this will be limited by the slow adaptability of the humans that are not yet completely disposable. The "low hanging fruit" is obvious, but EVPs who are asking "why can't we just replace our whole web experience with a chatbot interface?" may end up causing weird overcorrections among their subordinates.

fdgsdfogijq|2 years ago

I can tell you that we have applied teams working on open problems which can be solved out of the box with ChatGPT. It's a huge deal.

alfor|2 years ago

ChatGPT is human-level intelligence; it's not just novel and cool, it's the thing. Remember, GPT-4 training finished 6 months ago. Listen to people at OpenAI: their concerns are disruption to the world, UBI, getting people used to superintelligence as part of our world. I think they have quite a few things in the pipeline.

So yes ads optimisation/recommendations still need to be reliable for the time being, but for how long?

zone411|2 years ago

I'm quite surprised at how little progress FAANG companies have made in recent years, as I believe much of what's happening now with ChatGPT was predictable. Here's a slide from a deck for a product I was developing in 2017: https://twitter.com/LechMazur/status/1644093407357202434/pho.... Its main function was to serve as a writing assistant.

Analog24|2 years ago

Scaling up an LM from 2017 would not achieve what GPT-4 does. It's nowhere near that simple. Of course companies saw the potential of natural language interfaces, there has been billions spent on it over the years and a lot of progress was made prior to ChatGPT coming along.

BulgarianIdiot|2 years ago

Calling an API doesn't mean no value is captured. There are vastly complex integrations of LLM as a small component in larger systems, with their own programming, memory, task models and so on.

If you think GPT is just about chat, you've misunderstood LLMs.

blazespin|2 years ago

Folks need to start getting over themselves. It's pretty trivial to get GPT4 to explain how transformers work, where the bottlenecks are in both performance and learning, and to start modifying PyTorch.

It's really not that complicated. Gatekeeping is so over.
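In the spirit of the comment above, the core transformer operation really is compact. Here is single-head scaled dot-product attention, softmax(QK^T / sqrt(d)) V, written out in pure Python (no batching, no learned projections) just to show how little is behind the curtain:

```python
import math

def scaled_dot_product_attention(queries, keys, values):
    """For each query: dot it with every key, scale by sqrt(d),
    softmax the scores into weights, and return the weighted
    average of the value vectors."""
    d = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# A query aligned with the first key pulls mostly from the first value.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[10.0, 0.0], [0.0, 10.0]]
(result,) = scaled_dot_product_attention(q, k, v)
assert result[0] > result[1]
```

The gap between this toy and GPT-4 is, of course, everything the thread is arguing about: data curation, scale, and training infrastructure, not the architecture diagram.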

xnx|2 years ago

Not sure why LLMs would make Facebook (ads), Apple (hardware), Amazon (hosting, retail), or Netflix (TV) obsolete. It's definitely something Google needs to think about, but there's no reason to think they won't be the leader again soon.

atonse|2 years ago

I actually think Apple is in a unique position here again with the hardware/software integration.

Once again, their ability to do computation on device, and to optimize silicon for it, is unparalleled.

A huge Achilles heel of current models like GPT-4 is that they can’t be run locally. And there are tons of use cases where we don’t necessarily want to share what we’re doing with OpenAI.

That’s why if Apple wasn’t so behind on the actual models (Siri is still a joke a decade later), they’d be in great shape hardware-wise.

letitgo12345|2 years ago

Imagine Walmart launching a ChatGPT-interfaced shopping bot that customers take a liking to. Walmart starts acquiring both new customers and high-quality data they can use for RLHF for shopping. Eventually Walmart's data moat becomes so big that Amazon retail cannot catch up, and customers start leaving Amazon.

For AWS: if MS starts giving discounts on OAI model usage to regular Azure customers, that's going to be a strong incentive to switch.

For Apple: a Windows integrated with GPT tech may become a tough beast to beat.

anon7725|2 years ago

Can confirm. People are scrambling to remain relevant.

TechnicolorByte|2 years ago

How does that manifest specifically?

VirusNewbie|2 years ago

I work at a FAANG and our unreleased models are fantastic. Now, there might be panic about how to productize it all, but tech wise i'm pretty surprised how good they are.

Robotbeat|2 years ago

Not releasing the models may be the same as the models never existing in the end.

letitgo12345|2 years ago

Hope so. Don't want a single corporate entity (OAI/MS) dominating the entire economy. This sector desperately needs competition

tayo42|2 years ago

Sounds like the same thing that happened with datacenters? No one has ops or hardware sysadmins anymore; no one sets up large networks except a few people at the centralized cloud companies and a couple of other niche shops. The website-ops job changed.

carabiner|2 years ago

Not just NLP, even 3D art: https://www.youtube.com/watch?v=SzGEfYh9ITQ

Top comment: I love seeing my job get transformed from 3D artist into prompt writer into jobless in a year or less, yay!

Washuu|2 years ago

I do a lot of stylized 3D art: I still have time before AI figures that out!~

rlt|2 years ago

> Also, there is no "working in AI", a few thousand people are doing real AI at most. The rest of us are calling an API.

I would call that “applied AI” and there’s no shame in figuring out novel ways to apply a new technology.
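Most of that "applied AI" work is, mechanically, just this: wrap a prompt in a JSON payload and POST it to a hosted model endpoint, with the real effort going into prompts, data plumbing, and evaluation around the call. The field names and model name below are generic placeholders, not any specific vendor's API:

```python
import json

def build_completion_request(prompt, model="some-hosted-llm", temperature=0.2):
    """Assemble the JSON body a typical hosted-LLM HTTP call sends.
    (The actual POST, auth headers, and endpoint URL vary by vendor
    and are omitted here.)"""
    return json.dumps({
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    })

body = build_completion_request("Summarize this support ticket: ...")
assert json.loads(body)["messages"][0]["role"] == "user"
```

The point of the parent comment stands: the value is in choosing where such a call slots into a product, not in the call itself.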

2-718-281-828|2 years ago

come on, it's not that bad, at least you're doing linear regression