The current paradigm is that AI is a destination. A product you go to and interact with.
That's not at all how the masses are going to interact with AI in the near future. It's going to be seamlessly integrated into everyday software: in Office and Google Docs, at the operating system level (Android), in your graphics editor (Adobe), and on major web platforms (search, image search, YouTube, and the like).
Since Google and the rest of Big Tech continue to control these billion-user platforms, they have AI reach even if they are temporarily behind in capability. They'll also find a way to integrate it such that you don't have to pay for the capability directly; it's paid for in other ways: ads.
It's OpenAI that faces the existential risk, not Google. Google will catch up and will have the reach/subsidy advantage.
And it doesn't end there. This so-called "competition" from open source is going to be free labor. Any winning idea can be ported into Google's products on short notice. Thanks, open source!
The part of the post that resonates with me is that working with the open source community may allow a model to improve faster. And whichever model improves faster will win, if it can sustain that pace of improvement.
The author talks about Koala but notes that ChatGPT is better. GPT-4, in turn, is significantly better than GPT-3.5. If you've used all the models and can afford to spend the money, you'd be insane not to use GPT-4 over all the others.
Midjourney is more popular (from what I'm seeing) than Stable Diffusion right now because it's better right now. Midjourney is closed-source.
The point I want to make is that users will go to whoever has the best model. So the winning strategy is whichever strategy allows your model to compound in quality faster, and to keep compounding that quality growth for longer.
Open source doesn't always win in producing better quality products.
Linux won in servers and supercomputing, but not in end user computing.
Open-source databases mostly won.
Chromium sorta won, but really Chrome.
Then in most other areas, closed-source has won.
So one takeaway might be that open-source will win in areas where the users are often software developers that can make improvements to the product they're using, and closed-source will win in other areas.
Fantastic article. If you're quick to just go to the comments like I usually do, don't. Read it.
One of my favorites:
> LoRA works by representing model updates as low-rank factorizations, which reduces the size of the update matrices by a factor of up to several thousand. This allows model fine-tuning at a fraction of the cost and time. Being able to personalize a language model in a few hours on consumer hardware is a big deal, particularly for aspirations that involve incorporating new and diverse knowledge in near real-time. The fact that this technology exists is underexploited inside Google, even though it directly impacts some of our most ambitious projects.
Has anyone worked with LoRA? Sounds super interesting.
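For anyone curious, here's a minimal sketch of the idea in PyTorch. This is illustrative only (the layer name and hyperparameters are my own choices, not anything from the memo); real implementations live in libraries like Hugging Face's peft:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update.

    Instead of fine-tuning the full d_out x d_in weight matrix, LoRA
    learns W + B @ A, where A is r x d_in and B is d_out x r. With
    r << min(d_in, d_out), the trainable parameter count collapses.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# A 4096x4096 projection has ~16.8M frozen weights, but only
# 2 * 8 * 4096 = 65,536 trainable LoRA parameters, which is why
# shared LoRA files are megabytes rather than gigabytes.
layer = LoRALinear(nn.Linear(4096, 4096), r=8)
```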
This gets attention due to being a leak, but it’s still just one Googler’s opinion and it has signs of being overstated for rhetorical effect.
In particular, demos aren’t the same as products. Running a demo on one person’s phone is an important milestone, but if the device overheats and/or gets throttled, it’s not really something you’d want running on your phone day to day.
It’s easy to claim that a problem is “solved” with a link to a demo when actually there’s more to do. People can link to projects they didn’t actually investigate. They can claim “parity” because they tried one thing and were impressed. Figuring out if something works well takes more effort. Could you write a product review, or did you just hear about it, or try it once?
I haven’t investigated most projects either so I don’t know, but consider that things may not be moving quite as fast as demo-based hype indicates.
It comes across as something from an open source enthusiast outside Google. Note the complete lack of references to monetization. Also, there's no sense of how this fits with other Google products. Given a chat engine, what do you do with it? Integrate it with search? With Gmail? With Google Docs? LLMs by themselves are fun, but their use will be as components of larger systems.
Having enough scale to perpetually offer free/low-cost compute is a moat.
The primary reason ChatGPT went viral in the first place was that it was free, with no restrictions. Back in 2019, GPT-2 1.5B was made freely accessible by a single developer via the TalkToTransformer website, which was the first time many people talked about AI text generation... then the owner got hit with sticker shock from the GPU compute needed to scale.
AI text generation competitors like Cohere and Anthropic will never be able to compete with Microsoft/Google/Amazon on marginal cost.
And ChatGPT has a super low barrier to entry while open source alternatives have a high one.
Creating a service that can compete with it in that regard implies you can scale GPU farms in a cost-effective way.
It's not as easy as it sounds.
Meanwhile, OpenAI still improves their product very fast, and unlike Google, it's their only one. It's their baby. It has their entire focus.
Since for most consumers AI == ChatGPT, they have the best market share right now, which means the most user feedback to improve their product. Which they do, at a fast pace.
They also understand that to get mass adoption, they need to censor the AI, much like McDonald's and Disney craft their family-friendly image. This irritates every geek, including me, but makes commercial sense.
Plus, despite the fact that you can torrent music and play it with VLC, and that Amazon and Disney are competitors, Netflix exists. Having a quality service has value in itself.
I would not count OpenAI as dead as a lot of people seem to desperately want it to be. Just because Google missed the AI train doesn't mean that wishing for the market to be killed by FOSS will make it so.
As usual with these things, it's impossible to know in advance what's going to happen, but the odds don't disfavor ChatGPT as much as this article says.
> Having enough scale to perpetually offer free/low-cost compute is a moat.
It's a moat for services, not models, and it's only a moat for AI services as long as that compute isn't hobbled by being used for models so inefficient compared to SOTA that they waste the advantage. That underlines why leaning into open source the way this piece urges is in Google's interest, the same way open source has worked to Google's and Amazon's benefit as service providers in other domains.
(It's not so much “the ability to offer free/low-cost compute” as “the advantages of scale, and an existing need for widely geographically dispersed compute, on the cost of both marginal compute and having marginal compute close to the customer where that is relevant”, but those are pretty close to differently-focused rephrasings of the same underlying reality.)
> AI text generation competitors like Cohere and Anthropic will never be able to compete with Microsoft/Google/Amazon on marginal cost.
Anthropic already does, with its models. They are the same price as or cheaper than OpenAI's, with comparable quality.
> Having enough scale to perpetually offer free/low-cost compute is a moat.
Rather than a moat, it is a growth strategy. At some point you need to start monetizing, and that is when the rubber hits the road. If you can survive monetization and continue to grow, then you have a moat.
Investors are obsessed with moats, but people have to realize that the entire world runs on businesses that have no moats.
There are no moats to being a plumber, a baker, a restaurant...
The moat concept is predominant because the idea that everything must make billions has infected the debate about businesses.
It's all about being a unicorn, a giant, a monopoly, making everybody at the top billionaires, as if there were no other way to live.
Except that's not how most people do live, even entrepreneurs.
Even Apple, which today is the typical example of a business with a moat, didn't start with "we can't get into this computer business, we'd have no moat".
They have a moat now, but it's a consequence of all the business decisions and the things they built over many decades.
They didn't start with the moat. They started by providing value and marketing it.
You can't blame them: gratuitous moats (like those provided by winner-takes-all dynamics) are not common in a functioning (competitive) economy so they get to be revered.
It feels unlikely that big tech can keep the benefits of the recent period going forward. It was basically a political moat: counting on the ongoing lack of antitrust and consumer protection regulation. Even if the political dysfunction that allows that continues (quite likely), the wheels of the universe are turning.
The "leaked" report focuses on open source - a mode of producing software that is bound to become a major disruptor. We tend to discount open source because of its humble beginnings, long incubation, many false dawns and difficult business models. But if you objectively take a look at what is possible today with open source software, its quite breathtaking. I would not discount some tectonic shifts in adoption. The long running joke is "the year of the linux desktop", but keep adding open source AI and other related functionality and at some point the value proposition of open source computing (both for individuals and enterprises) will be crushingly large to ignore.
Don't forget too, that other force of human nature: geopolitics (e.g., think TikTok and friends). The current "moats" were established during an earlier, more innocent era. Now digitization is a top priority / concern for many countries. The idea that somebody can build a long-lived AI moat given the stakes is strange to say the least.
> There are no moats to being a plumber, a baker, a restaurant...
This line is interesting to me, because actually I think there _is_ a major moat there: locality. I don't disagree with the rest of your comment, but for those examples specifically, a lot of the value of specific instances of those businesses comes from their being in your neighborhood. If I live in Toronto, I'm not going to fly a plumber in from Manhattan to fix my pipes; if I want a loaf of sourdough, I'm not going to get it from San Francisco, I'm going to get it from the bakery around the corner; I might travel out of town for a particularly unique and amazing restaurant, but not every week, since I've got solid enough options within a ten-minute drive. Software is different because that physical accessibility hurdle doesn't exist.
Rest of this is spot-on, though.
With moats, shareholders gain pennies that someone else does not earn. Without moats they'd benefit much more, but no more than anyone else. And that's the contentious issue: how would I benefit more than my neighbor?
I have been toying around with Stable Diffusion for a while now and becoming comfortable with the enormous community ecosystem of textual inversions, LoRAs, hypernetworks, and checkpoints. You can get things with names like “chill blend”, a model fine-tuned on top of SD with the author’s personal style.
There is something called AUTOMATIC1111, a pretty comprehensive web UI for managing all these moving parts. It's filled to the brim with extensions to handle AI upscaling, inpainting, outpainting, etc.
One of these is ControlNet, where you can generate new images based on pose info extracted from an existing image or edited by yourself in the web-based 3D editor (integrated, of course). Not just pose but depth maps, etc. All with a few clicks.
The level of detail and sheer amount of stuff is ridiculous, and it all has meaning and substantial impact on the end result. I have not even talked about the prompting. You can do stuff like [cow:dog:.25], where the generator will start with a cow and then switch over to a dog at 25% of the process. You can use parens like ((sunglasses)) to focus extra hard on that concept.
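To make the [cow:dog:.25] mechanics concrete, here's a rough sketch of what the UI does under the hood. The helper functions are hypothetical stand-ins, not the webui's real internals:

```python
# Sketch of prompt editing: [cow:dog:.25] swaps the text conditioning
# partway through the denoising loop. encode_prompt, initial_noise,
# denoise_step, and decode are hypothetical stand-ins for the real
# pipeline functions.
num_steps = 30
switch_at = int(0.25 * num_steps)  # the .25 in [cow:dog:.25]

cond_cow = encode_prompt("a photo of a cow")
cond_dog = encode_prompt("a photo of a dog")

latents = initial_noise()
for step in range(num_steps):
    cond = cond_cow if step < switch_at else cond_dog
    latents = denoise_step(latents, cond, step)

image = decode(latents)
```

The early steps lay down the overall composition, so the image keeps a cow-like silhouette while the later steps render dog details. ((sunglasses)) works differently: it increases the attention weight on that token rather than swapping prompts.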
There are so-called LoRAs trained on specific styles and/or characters. These are usually like 5-100MB and work unreasonably well.
You can easily switch back to the base model, and the original SD results next to the community fine-tunes look like an 80s arcade game vs. GTA5. This stuff has been around for like a year. This is ridiculous.
LLMs are enormously “undertooled”. Give it a year or so.
My point by the way is that any quality issues in the open source models will be fixed and then some.
I'm a bit sceptical of the "no moat" proposition because (a) GPT-4 really does seem to be in a different league and (b) it's clearly very hard to run. I haven't seen anything from the explosion of open source / community efforts that comes close for general applications.
The take in the post has the ring of classic, trademark Google arrogance: the assumption that if somebody else can do it, Google can do it better if they just try, where the challenge of "just trying" is discounted to zero. In reality, "just trying" is massively important and sometimes all that matters. The gap between unrefined model output and the level of polish and refinement apparent in GPT-4 may appear technically small, but it's the whole difference between a widely applicable, usable product and something that can't be more than a toy. I'm not sure Google still has it in it to really fight to achieve that level of polish.
Great read, but I don't agree with all of these points. OpenAI's technological moat is not necessarily meaningful in a context where the average consumer is starting to recognize ChatGPT as a brand name.
Furthermore, fine-tuned models are still dependent on the base model's quality. Having a much higher quality base model is still a competitive advantage in scenarios where generalizability is an important aspect of the use case.
Thus far, Google has failed to integrate LLMs into their products in a way that adds value. But they do have advantages which could be used to gain a competitive lead:
- Their crawling infrastructure could allow them to generate better training datasets, and update models more quickly.
- Their TPU hardware could allow them to train and fine-tune models more quickly.
- Their excellent research divisions could give them a head start with novel architectures.
If Google utilizes those advantages, they could develop a moat in the future. OpenAI has access to great researchers, and good crawl data through Bing, but it seems plausible to me that 2 or 3 companies in this space could develop sizeable moats which smaller competitors can't overcome.
Consumers recognizing ChatGPT might end up like vacuum cleaners: at least in the UK, people will often just call one a "hoover", but the likelihood of it actually being a Hoover is low.
It is difficult to see where the moat might exist if it's not data and the majority of the workings are published / discoverable. I don't think the document identifies a readily working strategy to defend against the threats it recognises.
I'll also mark myself as skeptical of the brand-as-moat. I think AskJeeves and especially Yahoo probably had more brand recognition just before Google took over than ChatGPT or OpenAI has today.
You're forgetting the phenomenon of the fast follower or second to market effect. Hydrox and Oreos, Newton and Palm, MySpace and Facebook, etc. Just because you created the market doesn't necessarily mean you will own it long term. Competitors often respond better to customer demand and are more willing to innovate since they have nothing to lose.
> in a context where the average consumer is starting to recognize ChatGPT as a brand name.
That brand recognition could hurt them, though. If the widespread use of LLMs results in severe economic disruption due to unemployment, ChatGPT (and therefore OpenAI) will get the majority of the ire even for the effects of their competition.
> context where the average consumer is starting to recognize ChatGPT as a brand name.
Zoom was once that brand name which was equated to a product. Now, people might say "Zoom call", but may use Teams or Meet or whatever. Similarly, people call a lot of robot vacuum cleaners Roombas, even though they might be some other brand.
Brand recognition is one thing, but the actual product used will always depend on what their employer uses, what their mobile OS might use, or what API their products might use.
For businesses, a lot will be about the cost and performance vs "the best available".
FWIW I posted Simon's summary because it's what I encountered first, but here's the leaked document itself[0].
Some snippets for folks who came just for the comments:
> While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly. Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params that we struggle with at $10M and 540B. And they are doing so in weeks, not months.
> A tremendous outpouring of innovation followed, with just days between major developments (see The Timeline for the full breakdown). Here we are, barely a month later, and there are variants with instruction tuning, quantization, quality improvements, human evals, multimodality, RLHF, etc. etc. many of which build on each other.
> This recent progress has direct, immediate implications for our business strategy. Who would pay for a Google product with usage restrictions if there is a free, high quality alternative without them?
> Paradoxically, the one clear winner in all of this is Meta. Because the leaked model was theirs, they have effectively garnered an entire planet’s worth of free labor. Since most open source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products.
> And in the end, OpenAI doesn’t matter. They are making the same mistakes we are in their posture relative to open source, and their ability to maintain an edge is necessarily in question. Open source alternatives can and will eventually eclipse them unless they change their stance. In this respect, at least, we can make the first move.
[0]: https://www.semianalysis.com/p/google-we-have-no-moat-and-ne...
This is so indicative of Google culture missing the point. The idea of spending $10M training a single model is treated as a casual reality. But “tHaNk GoOdNeSs those generous open source people published their HiGh QuAlItY datasets of ten thousand examples each. Otherwise we’d have no way of creating datasets like that…” :| The sustainable competitive advantage has been and will continue to be HUGE PROPRIETARY DATASETS. (Duh - this is as true for new AI as it was for old AI = ad targeting.)
It was the _query+click pairs_ that kept Google dominant all these years, not the brilliant engineers. They had all of humanity labeling the entire internet with “when I click on this page/ad for this query I do/don’t search again” a billion+ times a day for a decade. For good measure they’ve also been collecting your email, your calendar, and your browsing habits for nearly as long.
The fact that they’ve managed to erase that historic advantage from their collective consciousness (presumably because AI researchers would rather not spend time debugging data labeling UI) is strange to me. It at least deserves a mention in a strategy memo like this, not vague platitudes about “influence through innovation.” Spend the $10M you were going to spend on a training run as $9.9999M on a private dataset and the remaining $100 on training. Better still, build products that get user behavior to train your models for you. Obviously.
We’re going to watch the biggest face plant in recent economic history if they can’t get this one together. I can’t decide if that makes me happy about an overdue changing of the guard in the Valley or sad about the fall of a once great company.
It’s not about the models! Model training is a commodity! It’s about the data! Come on guys.
One way to push back on the data argument is to consider the progress DeepMind made with self-play. Perhaps Bard can self-dialogue and achieve superhuman results; I won’t be surprised. Plus, the underlying architecture is dense, and sparse transformers are a major upgrade. That’s only one of many upgrades you can make. There is still a lot of headroom, and IMHO GPT-4 already implements AGI if you give it the right context.
This is easily among the highest quality articles/comments I've read in the past weeks, perhaps months (on LLMs/AI, since that's what I am particularly interested in). And this was for internal consumption before it was made public. It reinforces my recent impression that so much of what's being made for public consumption now is shallow, and it is hard to find the good stuff. Sadly, that's increasingly so even on HN. As I write this, I acknowledge I discovered this on HN :) I wish we had ways to incentivize the public sharing of such high-quality content that don't die at the altar of micro rewards.
Really interesting to look at this from a product perspective. I've been obsessively looking at it from an AI user perspective, but instead of thinking of it as a "moat", I just keep thinking of the line from Disney's The Incredibles, "And when everyone is super, no one will be."
Every app that I might build utilizing AI is really just a window, or a wrapper into the model itself. Everything is easy to replicate. Why would anyone pay for my AI wrapper when they could just build THING themselves? Or just wait until GPT-{current+1} when the model can do THING directly, followed swiftly by free and open source models being able to do THING as well.
Because people pay for convenience, and may not be technical enough to stay up to date on the latest and best AI company for their use case. Presumably your specialized app would switch to better AI instances for that use case as they come along in which case they're paying for your curation as well.
I am amazed that people haven't gotten used to these "internal Google doc leaks".
This is just the opinion of some random googler, one among over 100,000.
For some reason, random googlers like to write docs on hot topics and share them widely across the company. And someone, among those 100,000+ googlers, ends up "leaking" that person's opinion outside Google.
This is more like a blog post by some random dude on the Internet expressing his opinion. The fact that the random dude happens to work at Google should not bear much on evaluating the claims in the doc.
A website publishing this under a title starting with "Google ..." is misleading. The accurate title would be "Some random googler: ..."
I've been using Stable Diffusion to generate cover images for the music I release & produce for others. It's a massive time saver compared to comping together the release art using image editing software, and a lot cheaper than working with artists, which just doesn't make sense financially as an independent musician.
It's a little bit difficult to get what you want out of the models, but I find them very useful! And while the output resolution might be quite low, things are improving & AI upscaling also helps a lot.
I think from the perspective of a Google researcher/engineer, it must be alarming to see the crazy explosion going on with LLM development. We've gone from just one or two weirdos implementing papers (e.g. https://github.com/lucidrains?tab=repositories who's amazing) to basically every dev and PhD student hacking on neat new things, having a field day, and "lapping" (i.e. productizing) what Google Research was previously holding back.
And we're also seeing amazing fine-tunes/distillations of very useful/capable smaller models - there's no denying that things have gotten better and more importantly, cheaper way faster than anyone expected. That being said, most of these are being trained with the help of GPT-4, and so far nothing I've seen being done publicly (and I've been spending a lot of time tracking these https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYp...) gets close in quality/capabilities to GPT-4.
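As a rough illustration of how those GPT-4-assisted fine-tunes tend to be built (a minimal sketch of the Alpaca/Vicuna-style recipe, assuming the pre-1.0 openai Python client; the seed prompts and filename are made up):

```python
# Sketch: use a stronger model to generate (instruction, answer) pairs,
# then fine-tune a smaller open model on them. Prompts and the output
# filename are invented for illustration.
import json
import openai

seed_instructions = [
    "Explain what a LoRA fine-tune is in two sentences.",
    "Write a haiku about GPU shortages.",
]

with open("distilled_train.jsonl", "w") as f:
    for instruction in seed_instructions:
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": instruction}],
        )
        answer = resp["choices"][0]["message"]["content"]
        # Each pair becomes one supervised example for the smaller model.
        f.write(json.dumps({"instruction": instruction, "output": answer}) + "\n")
```

Which is also part of why these models inherit the teacher's ceiling: you don't obviously get past GPT-4 by distilling GPT-4.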
I'm always rooting for the open source camp, but I think the flip side is that there are still only a handful of organizations in the world that can train a >SoTA foundational model, and that having a mega-model is probably a huge force multiplier if you know how to take advantage of it (e.g., I can't imagine that OpenAI could have released software at the pace they have without leveraging GPT-4 for co-development; also, can you distill or develop capable smaller models without a more capable foundational model to leverage?). Anthropic, for example, has recently taken the flip side of the "no moat" argument, arguing that there is a potential winner-take-all scenario where the lead may become insurmountable if one group gets too far ahead in the next couple of years. I guess we'll just have to see, but my suspicion is that the crux of the "moat" question is going to be whether the open source approach can actually train a GPT-n++ system.
> Giant models are slowing us down. In the long run, the best models are the ones which can be iterated upon quickly. We should make small variants more than an afterthought, now that we know what is possible in the <20B parameter regime.
Maybe this is true for the median query/conversation that people are having with these agents, but it certainly has not been my experience in technical/research work.
GPT-4 is legitimately very useful. But any of the agents below that (including ChatGPT) cannot perform complex tasks up to snuff.
My understanding was that most of the current research effort is towards trimming models and/or producing smaller models with the power of larger ones. Is that not true?
So I use ChatGPT every day. I like it a lot and it is useful, but it is overhyped. Also, from 3.5 to 4 the jump was nice but seemed relatively marginal to me.
I think the head start OpenAI has will vanish. Iteration will be slow and painful, giving Google or whoever else more than enough time to catch up.
ChatGPT was a fantastic leap, getting us say 80% of the way to AGI, but as we have seen time and time again, the last 20% is excruciatingly slow and painful (see self-driving cars).
Personally, the difference between GPT-4 and 3.5 is pretty immense for what I am using it for. I can use GPT-3.5 for things like summarization (as long as the text isn't too complex), reformatting, and other transformation-type tasks alright. I don't even bother using it for logical or programming tasks, though.
> So I use ChatGPT every day. I like it a lot and it is useful but it is overhyped.
It is incorrectly hyped. The vision most pundits have is horribly wrong. It is like people who thought librarians would be out of work because of ebooks, barking up the wrong tree.
ChatGPT does amazing things, but it is also prone to errors. Then again, so are people! So what? People still get things done.
Imagine feeding ChatGPT an API for smart lights and a description of your house, and then asking it to turn on the lights in your living room. You wouldn't have to name the lights "living room", because ChatGPT knows what the hell a living room is.
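A minimal sketch of that idea, assuming the pre-1.0 openai Python client; the light API and house description are entirely made up:

```python
# Sketch: let ChatGPT map "turn on the living room lights" to concrete
# API calls. The lights API and house layout below are invented.
import json
import openai

SYSTEM = """You control smart lights via set_light(id, on).
Lights: 1 (floor lamp, next to the couch), 2 (desk lamp, office),
3 (ceiling light, kitchen). The couch is in the living room.
Reply ONLY with JSON: {"calls": [{"id": <int>, "on": <bool>}]}"""

resp = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Turn on the lights in the living room."},
    ],
)
calls = json.loads(resp["choices"][0]["message"]["content"])["calls"]
# Expected: [{'id': 1, 'on': True}] - the model infers which light is in
# the living room even though no light is named "living room".
```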
Meanwhile, if I'm in my car and I ask my phone to open Spotify, it will occasionally open Spotify on my TV back home. Admittedly it hasn't done that for quite some time, so I presume it may have been a bug Google fixed, but that bug only exists because Google Assistant is, well, not smart.
Here is an app you could build right now with ChatGPT:
1. Build animatronics with voice boxes, expose an API with a large library of pre-canned movements, and feed the API docs to ChatGPT.
2. Ask ChatGPT to write a story, complete with animations and poses for each character.
3. Have ChatGPT emit code with API calls and timing for each character.
4. Feed each character's lines through one of the new generation of TTS services, and once generation is done, have the play performed.
Nothing else exists that can automate things to that extent. A specialized model could do some of it, but not all of it. Maybe in the near future you can chain models together, but right now ChatGPT does it all, and it does it really well.
And ChatGPT does all sorts of cool things like that, mixing together natural language with machine parsable output (JSON, XML, or create your own format as needed!)
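For step 3 of the list above, here's a sketch of what the model could be asked to emit and how the host program would consume it. The cue format is invented, not from any real product:

```python
# Sketch: parse a timed cue sheet for the animatronics play. The schema
# is made up; the point is that ChatGPT can be told to emit exactly this.
import json

# What ChatGPT would be asked to produce for each scene:
script = json.loads("""
{
  "scene": 1,
  "cues": [
    {"t": 0.0, "character": "bear", "action": "wave", "line": "Welcome, friends!"},
    {"t": 2.5, "character": "owl", "action": "nod", "line": "Hoo is ready for a story?"}
  ]
}
""")

for cue in sorted(script["cues"], key=lambda c: c["t"]):
    # In the real system: schedule the pre-canned movement and queue the
    # TTS audio for this line at time cue["t"].
    print(f'{cue["t"]:>4}s  {cue["character"]}: {cue["action"]} - "{cue["line"]}"')
```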
I also felt this way initially, like "that's it?". But overall the massive reduction in hallucinations and increase in general accuracy makes it almost reliable. Math is correct, it follows all commands far more closely, can continue when it's cut off by the reply limit, etc.
Then I tried it for writing code. Let's just say I no longer write code, I just fine tune what it writes for me.
GPT feels like an upgrade from MapQuest to Garmin.
Garmin was absolutely a better user experience. Less mental load, dynamically updating next steps, etc, etc.
However, both MapQuest and Garmin still got things wrong. Interestingly, with Garmin, the lack of mental load meant people blindly followed directions. When it got something wrong, people would do really stupid stuff.
Not only do they have no moat; open source models are uncensored, and this is huge. Censorship is not just political, it cripples the product to a basically infantile stage and precludes so many applications. For once, it is a liability.
But this article doesn't state the very obvious: when will Google (the inventor of the Transformer, and "rightful" godfather of modern LLMs) release a fully open source, tinkerable model better than LLaMA?
(To the dead comment below: there are many uncensored variations of Vicuna.)
My very naive opinion is that the best way to predict the big-picture actions of Google is a simple question: WWIitND - What Would IBM in the Nineties Do?
In more direct terms, their sole, laser focus seems to be on maintaining short-term shareholder value, and I really don't trust the typical hedge fund manager to approve of any risky OSS moves for a project/tech that they're surely paying a LOT of attention to.
Giving away transformer tech made Google look like "where the smartest people on the planet work"; giving away full LLM models now would (IMO) make them look arrogant and not... well, cutthroat enough. At least that's my take in a world where financial bigwigs don't know or care about OSS at all; hopefully not the case forever!
> When will Google release a fully open source, tinkerable model better than LLaMA?
Arguably, Facebook released LLaMA because it had no skin in the game.
Google, on the other hand, has a lot of incentive to claw back the users who went to Bing to get their AI fix. Presumably without being the place for “Ok, google, write me a 500 word essay on the economic advantages of using fish tacos as currency” for peoples’ econ 101 classes causing all kinds of pearl clutching on how they’re destroying civilization.
The open source peeps are well on the path to recreating a LLaMA base model, so unless Google does something spectacular, everyone will be like, meh.
Vicuna-13B: I'm sorry, but I cannot generate an appropriate response to this prompt as it is inappropriate and goes against OpenAI's content policy.