top | item 33283712

Why we chose not to release Stable Diffusion 1.5 as quickly

298 points| dwynings | 3 years ago |danieljeffries.substack.com

343 comments

[+] machina_ex_deus|3 years ago|reply

I'm not a data hoarder, but from the moment Stable Diffusion was released I had a gut feeling that I should download everything available while it's there.

Somewhat similar gut feeling to when popcorn time was released, although it might not be exactly the same.

While I really wish I'm wrong, my gut tells me that broadly trained machine learning models available to the general public won't last and that intellectual property hawks are going to one day cancel and remove these models and code from all convenient access channels.

That somehow international legislation will converge on the strictest possible interpretation of intellectual property, and those models will become illegal by the mere fact they were trained on copyrighted material.

So reminder to everyone: Download! Get it and use it before they try to close the Stable doors after the horses Diffused. Do not be fooled by the illusion that just because it's open source it will be there forever! Popcorn time lost a similar battle.

Get it now when there are trustworthy sources. Once these kinds of things go underground, it gets much harder to get a trustworthy version.

[+] williamcotton|3 years ago|reply

From my research the general consensus is that the processing of copyrighted material will be considered fair use. Here is a lengthy legal discussion:

https://texaslawreview.org/fair-learning/

Here is a short quote from an IP lawyer:

“In terms of the ingestion of publicly accessible code, Ochoa said, there may be software license violations but that's probably protected by fair use. While there hasn't been a lot of litigation about that, a number of scholars have taken that position and he said he's inclined to agree.”

https://www.theregister.com/2022/10/19/github_copilot_copyri...

[+] Vetch|3 years ago|reply

What's the point of downloading it when it'd just stagnate? This isn't like regular software where people can easily put in hard work and sweat to improve it.

LLMs have the unfortunate limitation of being both powerful and lending themselves to centralized control choke-points due to how resource intensive they are to train. Under this paradigm, I fear commercial entities will be able to easily navigate the legal landmines and continually improve while open efforts perpetually lag far behind.

There are many vested interests who want this control for various reasons they justify as: protection from x-risk, keeping it out of the hands of abusers and bullies, economic advantage. Their reasons for want of control are either well intended but wrong-headed or profit-motivated and disingenuous.

Rather than challenging the likes of GPT-3 and Copilot enabling freedom, I fear folks will be forced to send all their videos, pictures, text and code to the servers of Microsoft, Amazon and Google or lose access to advantages as LLMs continue to improve at a rapid clip.

[+] WheelsAtLarge|3 years ago|reply

Companies have a similar problem now with AI than what the music labels had with Napster and MP3s in the 90's. Music labels tried very hard to legislate the problem away but it failed. I remember Metallica's Lars Ulrich working hard to fight it. They finally embraced the change. If it can't be done in the U.S., it will be done in some other country. That country will have the competitive advantage.

We'll go thru the same with AI but ultimately it won't be stopped. As long as there's no world wide coordination limiting its impact, AI will continue its course.

[+] Satam|3 years ago|reply

Wow, thank you. That's a very interesting take.

You mention Popcorn time. I wonder if torrents in general could be a great example of how something like this plays out? Torrenting took the world by storm and had an amazing "product-market fit" for the early internet days. Of course, downloading copyrighted material was always illegal but that didn't stop many.

Over time, legal but paid alternatives rose up: Spotify, iTunes, Netflix. These players found their place in the market by balancing the interest of copyright holders and the needs of users looking for cheap and easy access to entertainment.

Just as Netflix acquired large content libraries, same here. With enough money, large training datasets could be acquired in a legally solid manner.

It's interesting to think where this analogy might fail as well, and how the paths of these technologies could differ. For one, torrenting was mostly for entertainment, and thus impacted B2C first. On the other hand, language models are more so for media _creation_ and the B2B sphere.

[+] pabs3|3 years ago|reply

Models that are trained on data under open source licenses (such as Creative Commons) would likely be much safer from copyright claims. I like to use the Debian Deep Learning Team's Machine Learning Policy to evaluate the openness of ML work.

https://salsa.debian.org/deeplearning-team/ml-policy

[+] prepend|3 years ago|reply

I forked deepfake a few years ago because it seemed interesting. I didn’t have a spidy sense just thought it would be something interesting to look into. But I forked in GitHub rather than doing a proper clone so now it’s gone.

It reminds me to follow the datahoarder maxim that if you don’t admin then servers, you don’t have the data. So now I clone stuff to a local drive.

[+] spaceman_2020|3 years ago|reply

This lines up with my observation of the sudden and complete absence of celebrity deep fakes (the adult rated or otherwise) from the internet.

There is a legal machinery that works behind the scenes which we aren't always aware of.

[+] gedy|3 years ago|reply

I think the savvy media companies realize that we're at the cusp of ai generated media - movies and music included. If we have free/open models trained on the past 100 years of media, they may become obsolete and they will fight this to the death.

Irony is the "NSFW" moral concerns, when the media companies put out such negative and filthy content as it is.

[+] fbdab103|3 years ago|reply

Any particular repos/artifacts you suggest downloading?

[+] blackoil|3 years ago|reply

I am more hopeful. Unlike popcon/napster these models aren't directly impacting existing bottom line of any company/organization. Most of the models are trained on opensource / public datasets, so you won't find any company to sponsor the fight against these models. The cost of these models is an issue right now, but Mr. Moore has always handeled that well.

[+] 2Gkashmiri|3 years ago|reply

>Popcorn time lost a similar battle.

i was actively following torrentfreak at the time and there was genuine excitement with something incredible but that only lasted a week :-(

why do you say they lost the battle? the original team threw in the towel within the week but there are people who have taken the fight

https://github.com/popcorn-official/popcorn-desktop/releases... here, the latest release was on 04 Sep 2022 so it is very much in active development with a lot of people contributing https://github.com/popcorn-official/popcorn-desktop/graphs/c...

so while the original team might not be working on it, like a true free software, the code lives.

[+] liuliu|3 years ago|reply

Model can be retrained (with some money). But data is harder. I cannot backup LAION 5B unfortunately. If you can, please do! (About 200T)

[+] manholio|3 years ago|reply

> That somehow international legislation will converge on the strictest possible interpretation of intellectual property, and those models will become illegal by the mere fact they were trained on copyrighted material.

That's the only possible interpretation, really. AI models algorithmically remix input intellectual property en masse, without any significant amount of human creativity, the only thing copyright law protects. As such, the models themselves are wholly derived works, essentially a compressed and compact representation of the artistic features of the original works.

Legally, a AI model is equivalent to a huge tar.gz of copyrighted thumbnails: very limited fair use applies, only in some countries, and only in certain use contexts that generally don't harm the original author or out-compete them in the market place - the polar opposite of what AI models are.

[+] adamsmith143|3 years ago|reply

>That somehow international legislation will converge on the strictest possible interpretation of intellectual property, and those models will become illegal by the mere fact they were trained on copyrighted material.

Just feels absurd to me because how is this different from any Human artist who you could equally say was "trained" on copyrighted material.

>Get it now when there are trustworthy sources. Once these kinds of things go underground, it gets much harder to get a trustworthy version.

People have already reverse engineered most text2image models and given enough hardware can train their own. There is no need for this hysterical take. As long as the internet exists you will be able to train these models.

[+] green_on_black|3 years ago|reply

Here's a (not-recommended but amusing) nuclear option:

Tit-for-tat. Regulators and artists don't want this? Okay, include in all open source software licenses that regulators and artists are now barred from using them without payment.

[+] didibus|3 years ago|reply

The UK and the EU have already made to law that text and data mining is excluded from copyright for non-commercial uses, and the UK has even done so for commercial use cases.

Personally, I think commercial use cases should get license agreements from the authors for their training data, but I think non-commercial exemptions to advance the field of AI makes sense.

Irregardless of what I think though, the UK has set an international precedent, and the EU is apparently discussing about possibly extending it to commercial use cases as well. So there's that.

[+] corndoge|3 years ago|reply

I agree that it’s a good idea to download everything now and I agree that the legal powers that be will probably soon force it underground - but I’m less certain the driving reason will be copyright / IP. I think it will be reasons similar to what TA hints at. People are (somewhat understandably) upset with certain classes of output the model is capable of generating and a moral panic is likely to ensue that, historically, has won most cases it’s presented itself in.

[+] datacruncher01|3 years ago|reply

I figure these tools fall in a similar category to web scraping which is legal. What you can’t do is copy the file. If you can demonstrate that you are modifying the source data then it’s a new work. Style is not protected by copyright as much as famous artists may want.

Where copyright may be applicable is when the models reproduce original art without modification that a reasonable person wouldn’t know the difference.

[+] speleding|3 years ago|reply

> those models will become illegal by the mere fact they were trained on copyrighted material

The blog post says they are worried about the ability to use the model to "use it for illegal purposes and hurting people". I think that they are referring to the ability to create all kinds of compromising pictures (porn) with celebrities, kids, etc. Am I misreading that? They don't mention copyright anywhere.

[+] stelonix|3 years ago|reply

Yesterday I was backing up and old failing HD. I looked at the models I downloaded since 2014 and since I was out of time, I decided to just delete them. But I deleted them with the same thought you just shared: those old models probably don't even exist anymore, they're probably gone. I'm just hoping that time you described isn't happening anytime soon.

[+] EGreg|3 years ago|reply

Where can we get Stable Diffusion downloaded?

[+] LawTalkingGuy|3 years ago|reply

I think it'll be EU-style privacy regulations that make it illegal to train on the majority of data. Perhaps the requirement to be able to remove a user's impact from an already computed model if they file a right-to-be-forgotten.

Something that would make any non-trivial model a legal nightmare.

[+] ionwake|3 years ago|reply

> close the Stable doors after the horses Diffused

Encapsulates it all well I like this statement, total pottery

[+] dividedbyzero|3 years ago|reply

I've followed this sort of thing rather loosely so far, any recommendations what other pre-trained models would be worth looking at?

[+] metadat|3 years ago|reply

Is there a torrent available? This is an effective way to ensure the models and information remain available indefinitely.

[+] zakki|3 years ago|reply

can you point the good source to download? thanks.

[+] seydor|3 years ago|reply

what if the models are used to generate a new training set ?

[+] tarunmuvvala|3 years ago|reply

Stability AI is formed with that vision to keep is open-source and accessible to the masses. It's very rare that we might see it becoming a closed source

[+] Satam|3 years ago|reply

Based on a Reddit post [1], the author of this is Stability AI's chief information officer.

My very rough take on the situation: the company gained their notoriety by building on OpenAI's pioneering research but with an important twist of releasing their models as unneutered open source. Now, their openness is starting to falter due to strong pressure from outside forces.

If they're unable to continue playing the hardball game they themselves invented, I think their glory days will end as fast as they started. The competitive advantage was always their boldness. If they lose that, quickly others will take their place.

In general, I don't think tech that's as open, powerful and easily reproducible as these language models can be stopped. Sure, maybe regulations will delay it a bit, but give it a few years and any decent hacker or tinkerer will be dabbling with 5x better tech with 5x less effort.

[1] https://archive.ph/Z5sU3

[+] pr337h4m|3 years ago|reply

"We’ve heard from regulators and the general public that we need to focus more strongly on security to ensure that we’re taking all the steps possible to make sure people don't use Stable Diffusion for illegal purposes or hurting people."

"What we do need to do is listen to society as a whole, listen to regulators, listen to the community."

"So when Stability AI says we have to slow down just a little it's because if we don't deal with very reasonable feedback from society and our own communities then there is a chance open source AI simply won't exist and nobody will be able to release powerful models."

Looks like someone is leaning on them :(

[+] thorum|3 years ago|reply

The author (Stability.AI’s CIO) did an impromptu AMA on Reddit:

https://reddit.com/r/StableDiffusion/comments/y9ga5s/stabili...

His comments regarding RunwayML’s release of 1.5 were especially interesting:

> “No they did not. They supplied a single researcher, no data, not compute and none of the other reseachers. So it’s a nice thing to claim now but it’s basically BS. They also spoke to me on the phone, said they agreed about the bigger picture and then cut off communications and turned around and did the exact opposite which is negotiating in bad faith.”

> “I’m saying they are bad faith actors who agreed to one thing, didn’t get the consent of other researchers who worked hard on the project and then turned around and did something else.”

[+] icelancer|3 years ago|reply

His answers on reddit are downvoted and the redditors are correctly pointing out that most of these "protections" smack of the fact that his investors want to stop giving things away and to close up source / resources for better monetization strategies.

[+] minimaxir|3 years ago|reply

> At Stability, we see ourselves more as a classical democracy, where every vote and voice counts, rather than just a company.

After taking $100M in venture capital and two distinct drama events due to disorganization, this is unlikely to last.

[+] aortega|3 years ago|reply

Powerful people are pulling strings to control AI everywhere. OpenAI is exactly the opposite of open. Now someone is pushing on Stability AI to close it up, I believe those models are more powerful or dangerous than they seem, and it got some people scared in some way.

I read than when some guys from 4chan started running the leaked NovelAI model, they generated porn non-stop for 20 hs or more, no sleep, no eating.

[+] fsociety999|3 years ago|reply

While they frame the post as if this is a positive and something they want to do, reading between the lines, it sounds to me like something has them rattled.

They mentioned regulators here, and I would be curious to hear the story behind that.

Don’t want to go too tin foil hat, but it makes you wonder if a certain other AI company that claims to be “open” may be afraid of a company that actually is open and is applying political pressure.

[+] Roark66|3 years ago|reply

As always in such cases this is 100% bull**. Either something is not working out for them and they have to delay in which case they could've just said so, or this is some sort of pretense to show how "responsibility minded" they are.

The reality is that bad actors have the resources to train their own stable diffusion on a dataset of whatever they want to deep fake and such delays do not slow them down one bit.

What it does slow down is normal people using those models.

From the smallest thing like mobilenetv3 through whisper, stable diffusion, CodeGen, and bloom those are huge productivity equalisers between the huge corpos and the little guy.

Also the same thing can be said about frameworks like huggingface's. Just recently I was looking for a way to classify image type (photo or not photo[clip art, cartoon, drawing]) in an android app. Of course first hits on Google stear towards Microsoft Azure's paid API service. I was unhappy with having to use an over-the-Internet-API (with potentially sensitive end user's private pictures) so in one day of work I managed to download a pretrained MobileNetV3. A couple of 10k+ image datasets and I wrote <50 lines of python to tweak the last layer and fine tune the network. On rtx 2070 training took 10 minutes. Resulting accuracy on real data? 90%+. The model loads and infers in few hundreds of ms on modern phones(instantiating and loading takes longer than the inference BTW).This is priceless and 100% secure for end users. For thilose interested in the details I use ncnn and vulkan for gpu(mobile!) inference.

Every commercial model maker's wet dream is to expose the model through an API, lock it behind a firewall and have people pay for access. This is not just hugely inefficient. It is insecure by design.

Take copilot by example. I'm perfectly happy for all my hobby-grade code to be streamed to Microsoft, but no chance in hell I'll use it on any of my commercial projects. However faux pilot run locally is on my list of things to try.

The first AI revolution was creation of those super powerful models, the second is the ability to run them on the edge devices.

[+] fxtentacle|3 years ago|reply

I think the most important part is this comment:

https://danieljeffries.substack.com/p/why-the-future-of-open...

The people that he discredits as "leak the model in order to draw some quick press to themselves" are the researchers that are named in the Stable Diffusion paper. Yes, Stability.AI gave them lots of money. But no, they are not leaking the model, they are publishing their own work. It's university researchers, after all. And Stability.AI does NOT own the model.

[+] wyxuan|3 years ago|reply

No I think this is runway.ml who run a video editing startup based on AI and helped with the development. This is the link https://huggingface.co/runwayml/stable-diffusion-v1-5

[+] ComodoHacker|3 years ago|reply

More context: https://old.reddit.com/r/StableDiffusion/comments/y99yb1/a_s...

=============================

RunwayML, who co-authored the StableDiffusion paper and funded CompVis together with StabilityAI, have unilaterally released the newest model of StableDiffusion, version 1.5. It seems that this was done without StabilityAIs consent, who so far have held the finished model back to supposedly prune it of NSFW stuff. This is criticized by many and accusations exist that they are only doing this to make more money as the 1.5 model has been available for quite some time on their own website against a usage fee. Do note however that the 1.5 model has only very minor improvements over the 1.4 model.

The link to the model can be found here: https://huggingface.co/runwayml/stable-diffusion-v1-5

The release was accompanied by the following tweet from RunwayML:

https://twitter.com/runwayml/status/1583109275643105280

This was followed by an accusing statement by a - now confirmed to be fake - account claiming to be Patrick Esser:

https://media.discordapp.net/attachments/1023643945319792731...

The model was released under the following license which indicates that RunwayML were legally allowed to release the model:

    Use-based restrictions as referenced in paragraph 5 MUST be included as an enforceable provision by You in any type of legal agreement (e.g. a license) governing the use and/or distribution of the Model or Derivatives of the Model, and You shall give notice to subsequent users You Distribute to, that the Model or Derivatives of the Model are subject to paragraph 5. This provision does not apply to the use of Complementary Material. You must give any Third Party recipients of the Model or Derivatives of the Model a copy of this License; You must cause any modified files to carry prominent notices stating that You changed the files; You must retain all copyright, patent, trademark, and attribution notices excluding those notices that do not pertain to any part of the Model, Derivatives of the Model. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions - respecting paragraph 4.a. - for use, reproduction, or Distribution of Your modifications, or for any such Derivatives of the Model as a whole, provided Your use, reproduction, and Distribution of the Model otherwise complies with the conditions stated in this License.

    Trademarks and related. Nothing in this License permits You to make use of Licensors’ trademarks, trade names, logos or to otherwise suggest endorsement or misrepresent the relationship between the parties; and any rights not expressly granted herein are reserved by the Licensors.

The license can be found here: https://huggingface.co/spaces/CompVis/stable-diffusion-licen...

This was followed by a takedown notice from StabilityAI:

    Company StabilityAI has requested a takedown of this published model characterizing it as a leak of their IP

    While we are awaiting for a formal legal request, and even though Hugging Face is not knowledgeable of the IP agreements (if any) between this repo owner (RunwayML) and StabilityAI, we are flagging this repository as having potential/disputed IP rights.

The takedown notice can be found here: https://huggingface.co/runwayml/stable-diffusion-v1-5/discus...

This was followed by a statement from RunwayML in that same thread:

    Hi all,

    Cris here - the CEO and Co-founder of Runway. Since our founding in 2018, we’ve been on a mission to empower anyone to create the impossible. So, we’re excited to share this newest version of Stable Diffusion so that we can continue delivering on our mission.

    This version of Stable Diffusion is a continuation of the original High-Resolution Image Synthesis with Latent Diffusion Models work that we created and published (now more commonly referred to as Stable Diffusion). Stable Diffusion is an AI model developed by Patrick Esser from Runway and Robin Rombach from LMU Munich. The research and code behind Stable Diffusion was open-sourced last year. The model was released under the CreativeML Open RAIL M License.

    We confirm there has been no breach of IP as flagged and we thank Stability AI for the compute donation to retrain the original model.

Emad, CEO of StabilityAI, has come forward on the official StableDiffusion discord stating that they are okay with the release and have taken down the takedown notice:

https://media.discordapp.net/attachments/1015751613840883735...

Emad also says they didn't send the takedown request: https://cdn.discordapp.com/attachments/1032745835781423234/1...

[+] 13of40|3 years ago|reply

Two thoughts I've had about Stable Diffusion:

1. The web UIs I have used are taking advantage of the same mental pathways as an electronic slot machine. Just like you can max out your bet on a slot machine and mash a button until you run out of credits, you can do the same on the hosted stable diffusion apps until you get a shareable hit.

2. Just like the dream you had last night, nobody wants to hear about it at breakfast, no matter how epic it was, because it's not backed by any meaning.

That said, I love stable diffusion and am an addict to it almost every day.

[+] notacanofsoda|3 years ago|reply

1) Who is Daniel Jeffries? There's no explanation of how he's related to Stability.

2) StabilityAI gave RunwayML compute time for them to train Stable Diffusion (they're also the creators of the original model). It's weird to categorize them as " other groups leak the model". They're the ones that created the model! (Source: https://huggingface.co/runwayml/stable-diffusion-v1-5/discus...)

[+] lairv|3 years ago|reply

The discourse has already changed quite a bit since the first release, which was only 2 months ago, and is getting alarmingly close from OpenAI's "we must delay release of XXX for safety reasons". It was probably to be expected, OpenAI are not just morons who decided to freeze opensource progresses, there are likely legal reasons behind it. But adding to that last weeks dramas, I am not very bullish on StabilityAI, hope I'll be proven wrong

[+] Beaver117|3 years ago|reply

So you want it to be open source, but not too open, because then bad people will use it. Good luck with that. If you want to filter everything behind a SaaS like OpenAI go ahead, but then you can't call it open source. And maybe that would have been the right choice. But Pandora's box is open now.

[+] dang|3 years ago|reply

We replaced the title, which has a whiff of corporate press release about it, with what appears to be a representative phrase from the article body. If there's a more representative phrase, we can change it again.

[+] p3opl3s|3 years ago|reply

Y/ou can't comment unless you pay to subscribe.. lol - isn't that a company blog post?

Anyways.. this shit grinds me.. yet another "open source" AI proejct pretending to be fo rthe people.. finally get a massive valuation and now it's all "we must be security concious"..

Hypocrtyes and here is an interview with the founder of Stable Diffusion stating the exact opposite approach by "having faith in people"!

https://youtu.be/YQ2QtKcK2dA?t=704

[+] jackblemming|3 years ago|reply

Guys I'm going to release an invention called the car, but my security team needs to make sure it's safe and won't be abused by drunk drivers. Next I plan to release an invention called the gun, but please hold your horses, because it could be abused. I need to double check and make sure it's safe to release this piece of equipment.

[+] machinekob|3 years ago|reply

All this is PR talk after few dramas with immoral activities.

They got 100mil USD in founding and I feel like pressure squeeze them hard as they are trying to monetise models, but how you monetise open source models when someone can just fine-tune your weights and make better/faster/cleaner model and software without losing 10mil+ on training original.

You are always few mil behind rivals and after past few weeks which was PR nightmare they lost most of the "community driven" advantage.

I fell like they are extremely desperate for attention (drama was artificially created cause it clicks conspiracy) or they are just so chaotic and lack proper leaders that everything is burin.

[+] imhoguy|3 years ago|reply

Well, models will be taken down anyway (at least attempted), save whatever you can put hands on. It is happening, govt is just catching up with this rapid situation:

https://www.federalregister.gov/documents/2022/10/13/2022-21... (AI mentioned 4 times)

https://eshoo.house.gov/sites/eshoo.house.gov/files/9.20.22L... (at the very end "export controls" are mentioned multiple times)

[+] charcircuit|3 years ago|reply

>NSFW policies

Ugh. It feels like so many of these models are trying to censor NSFW material.