Ask HN: DALL-E was trained on watermarked stock images?
266 points| whycombinetor | 3 years ago | reply
Prompt: "king of belgium giving a speech to an audience, but the audience members are cucumbers"
All 4 results (all no good as far as the prompt is concerned): https://ibb.co/gz5RDkB
Fullsize of the one with the watermark https://ibb.co/DzGR063
[+] [-] dlg|3 years ago|reply
In the United States, there are two pieces of case law that are widely cited and relevant: Kelly v. Arriba Soft Corp. (9th Cir.) found that making thumbnails of images for use in a search engine was sufficiently "transformative" that it was OK, and Perfect 10 (9th Cir.) found that thumbnails for image search and cached pages were also transformative.
OTOH, cases like Infinity Broad. Corp. v. Kirkwood found that retransmission of a radio broadcast over telephone lines is not transformative.
If I understand correctly, there are four factors in the US courts' fair-use test, within which transformativeness is weighed: (1) the purpose and character of the use, (2) the nature of the copyrighted work, (3) the amount or substantiality of the copying, and (4) market harm.
I'd think that training a neural network on artwork--including copyrighted stock photos--is almost certainly transformative. However, as you show, a neural network might be overtrained on a specific image and reproduce it too perfectly--that image probably wouldn't fall under fair use.
There are also questions of whether they violated the CFAA or some agreement while crawling the images (though hiQ v. LinkedIn makes it seem very possible to do legally), and whether they reproduced Getty's logo in a way that infringes its trademark (though are they using it in trade in a way that could cause confusion?).
[+] [-] chrismorgan|3 years ago|reply
When that is finally tried in court, if it fails to any meaningful extent at all (including going all the way up to supreme courts, as it doubtless will), then Copilot is dead, DALL·E is dead, GPT-3 is dead; all of these things will be immediately discontinued in at least the affected jurisdictions, at least until they get the laws changed or the judgements overturned.
[+] [-] ehsankia|3 years ago|reply
Here's an example: Stressful Shapes
Dall-E: https://i.imgur.com/JBkSh0y.png
Midjourney: https://i.imgur.com/C02Zq3i.png
On the other hand, here's a specific prompt: "nerdy yellow duck reading a magical book full of spells"
Dall-E: https://i.imgur.com/FMKZ8zc.png
Midjourney: https://i.imgur.com/lpsg6af.png
[+] [-] yreg|3 years ago|reply
Let's not confuse the AI with "buts"; just say that he is giving the speech to cucumbers.
Lastly, specify some style, because this would probably not work out as a photo.
My single try is not bad at all and it could definitely be improved.
https://labs.openai.com/s/3OUmUxKefJCeLhAk4hkeKX4V
[+] [-] NoMoreBro|3 years ago|reply
It's a bit risky to invest too much time, because every generator is different and they change the underlying models frequently (see the MidJourney beta yesterday), but if you're doing it out of passion or curiosity, there's no problem.
Now I'm experimenting with a local installation of Stable Diffusion (well, not really "local", because I have an old computer), and the prompt is only one of the things you can tweak: there are num_inference_steps, guidance_scale, and other parameters.
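To illustrate what guidance_scale does: in classifier-free guidance, at each of the num_inference_steps denoising steps the model makes two noise predictions, one conditioned on the prompt and one unconditional, and blends them. Here's a minimal sketch of just that blending step in plain Python (the numbers are made up for illustration; a real pipeline does this on latent tensors):

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional output, toward the prompt-conditioned one."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]

# Illustrative noise predictions for a tiny 4-element "latent".
uncond = [0.1, 0.2, 0.3, 0.4]   # prediction ignoring the prompt
cond   = [0.2, 0.1, 0.5, 0.4]   # prediction given the prompt

# guidance_scale = 1.0 reproduces the conditional prediction (up to
# float rounding); 7.5 (a common default) exaggerates the prompt's pull.
print(apply_guidance(uncond, cond, 1.0))
print(apply_guidance(uncond, cond, 7.5))
```

Values around 7–8 are a common default; cranking guidance_scale higher follows the prompt more literally at the cost of variety, which is why it's worth tweaking alongside the prompt itself.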
[+] [-] smlacy|3 years ago|reply
For simple prompts with little additional guidance, all the diffusion image generators I've seen or used will, most of the time, produce output much like what the author linked. There are always a few gems, and homing in via prompt engineering helps immensely.
[+] [-] grumbel|3 years ago|reply
On top of that, DALL-E 2 generally has issues with anything involving multiple objects. A single person will render fine; groups of people will generally produce artifacts. Attributes will also be spread across all objects in the scene, not just the ones you specified in your prompt, so doing anything more complex will require manual uncropping and inpainting, not just a single prompt.
Anyway, if you avoid the obvious weak spots and holes in the training set, DALL-E 2 output is for the most part pretty amazing out of the box. It's really more a top 50% than a top 1%.
The biggest bias in published DALL-E 2 images is the prompts. Most prompts you see online are not the actual prompts but funny descriptions made by a human after the fact. The actual prompts are often much longer and sometimes completely different.
[+] [-] pdntspa|3 years ago|reply
And from my experience, getting high-quality output from AIs takes a bit of finesse, not unlike crafting a good Google query.
so... yes
[+] [-] grungegun|3 years ago|reply
A twitter user figured out which words they were using by generating a lot of images with the starting prompt "A sign being held that says "
[+] [-] tkgally|3 years ago|reply
[1] https://news.ycombinator.com/item?id=32433821
[+] [-] JimDabell|3 years ago|reply
Yes, people tend to share the best of the best. However, these results seem especially bad, like bottom-10% bad.
[+] [-] neurostimulant|3 years ago|reply
Example: https://news.ycombinator.com/item?id=32088718
[+] [-] gojomo|3 years ago|reply
Also: my sense is that getting the best results often requires a lot of extra coaching with style/detail words. As we can't see the prompt here, we don't know what sort of style/details were requested. GIGO.
[+] [-] BrainVirus|3 years ago|reply
The dynamics in play are highly questionable. Countless artists and photographers put effort into creating their works. They put their work online to get some attention and recognition. A company comes along, scrapes all of it, and starts selling access to a model that generates something that looks highly derivative. The original cohort of artists and photographers not only get zero money or attention from this new endeavor, they are now in competition with the resulting model.
In short, someone whose work was essential to building a thing gets no benefits and possibly even gets (financially) harmed by that thing. Just because this gets verbally labeled "fair use" doesn't make it fair.
Additional point:
Just a few years ago a bunch of tech companies were talking about "data dignity". Somehow, magically, this (marketing) term is no longer used anywhere.
[+] [-] xg15|3 years ago|reply
Considering how strict and heavy-handed copyright enforcement has been otherwise, this has added to my belief that copyright in practice is really just the enforcement of the interests of whichever industry has the most power at a given time. When entertainment and content creation was the biggest revenue generator, copyright couldn't be strict enough; now all the money is in AI, and suddenly loopholes the size of barn doors pop up.
[+] [-] rich_sasha|3 years ago|reply
It's a bit of an exaggeration but maybe not too much.
[+] [-] Karunamon|3 years ago|reply
At least, I can't see a substantial difference in the result.
[+] [-] ShamelessC|3 years ago|reply
They aren't hosting the infringing content. Training on the data is probably covered under fair use. Generations are of _learned_ representations of the dataset, not the dataset itself. This makes it closer to outputting original works (probably owned by the person who used the model).
The players involved here are known for being litigious, however. I wouldn't be surprised if OpenAI did in fact pay some hefty fee upfront to get full permission to use these images.
[+] [-] BeefWellington|3 years ago|reply
"Probably" is doing a lot of heavy lifting in that sentence.
As for "_learned_", that's pretty debatable considering it's reproducing recognizable trademark infringement.
> The players involved here are known for being litigious, however. I wouldn't be surprised if OpenAI did in fact pay some hefty fee upfront to get full permission to use these images.
I have no idea why anyone would assume the "move fast and break things" disruption mindset that pervades tech companies these days, especially in spaces like ML/"AI", would mean they considered the legality, ethics, or good business sense of their training dataset.
As with Copilot, I suspect the DALL-E terms of use puts the onus on the user to avoid using infringing items.
[+] [-] gricardo99|3 years ago|reply
I’m guessing they assumed fair use and there will be lawsuits.
[+] [-] StillLrning123|3 years ago|reply
https://www.reddit.com/r/KidsAreFuckingStupid/comments/8tgxs...
[+] [-] cercatrova|3 years ago|reply
[0] https://cdn.ca9.uscourts.gov/datastore/opinions/2022/04/18/1...
[+] [-] resoluteteeth|3 years ago|reply
The ruling you are linking to is about whether scraping violates the Computer Fraud and Abuse Act.
This isn't really applicable here. First of all, that's a separate issue from copyright: just because scraping publicly accessible data doesn't violate the CFAA doesn't mean that all images posted on the internet are suddenly public domain, or that you can use copyrighted images from websites however you want.
Furthermore, how copyright applies to training neural networks on copyrighted works is an open question right now.
[+] [-] otoburb|3 years ago|reply
Until somebody tries to float a trial balloon (case) in court.
[+] [-] petesergeant|3 years ago|reply
Some would argue that technically these people _discover_[0] the law, but it amounts to the same thing
[0] https://www.jstor.org/stable/3143421
[+] [-] jcims|3 years ago|reply
BTW you can add 'royalty free' to the prompt to get rid of those most of the time (lol?).
[+] [-] trention|3 years ago|reply
That being said, arguments about copyright are just a fig leaf as far as I am concerned. The outcome of whether this is allowed or not will depend on the net impact of using those models on the job market and whether society will be willing to tolerate it.
[+] [-] gojomo|3 years ago|reply
You'll get a public link, at `labs.openai.com` rather than some random image-sharing site, which will show the image & the prompt used to generate it (including a credit to "your-first-name × DALL·E").
[+] [-] RcouF1uZ4gsC|3 years ago|reply
Say you were an artist who went to every art show and museum and studied all the art there.
If you produced a work of art solely from memory that contained large portions of other people's copyrighted art, would that still fall under copyright/require licensing?
[+] [-] egypturnash|3 years ago|reply
There is a comics creator named Keith Giffen. He's done a lot of solid work over the years for DC and Marvel; there's a playful love of the medium and its history that flows through a lot of his work. At first his style was pretty middling: nothing terrible, nothing that really stood out from the pack. Then one day his work changed dramatically. He got a lot more daring in spotting his blacks, inking with a heavier brush, and doing a lot of panels that were a closeup of a backlit head with rim lighting, eyes and teeth standing out in white. It was grounded in observation but had a lot of fresh ways to abstract a scene in the service of story. It was like nothing else on the racks, and really striking.
It was also completely swiped from the work of an Argentinian artist named José Muñoz. Pick up one of Muñoz's shadow-drenched crime stories, put it next to one of Giffen's superhero tales, and you could clearly see the influence. And not just the influence, influence is okay - Giffen had started entirely cloning Muñoz's style, completely dropping all his other influences in the process. Muñoz was not happy when he heard about this, and neither were other artists in the field of comics. Influence is one thing, everyone's influenced by other artists, and if you're familiar with an artist's influences you can tell. But dropping all your other influences to start drawing almost exactly like a new one? That's just not done.
Giffen got a lot of shit for this. Giffen quit comics for a couple of years after this, and when he came back he had a new look. He still does the Shadowy Muñoz Face now and then but it's more along the lines of one of the many things he's borrowed from his multiple influences rather than one of the ways he was wholesale ripping off Muñoz.
"Style theft" is completely legal in the eyes of the court. There was nothing legally actionable going on here. But in the court of his fellow artists, Giffen was judged, and found guilty.
There's a range here. Nobody's going to care if you pick up a collection of Winsor McCay's pioneering 19xx comic strip "Little Nemo" and do a dream-themed story that borrows his distinctive panel composition, lettering, and inking choices. Nobody's going to care if you do one drawing that precisely lifts Mike Mignola's heavy use of black and thin, clear lines. If you do superheroes long enough then you're pretty much obligated to do at least one story that emulates Jack Kirby as closely as you can. If you worked as someone's assistant for half a decade then you are very much allowed to bust out a perfect rendition of their style at any point in your entire life. But there is definitely a line you can cross where every artist (and a lot of non-artists) who sees a side-by-side view of what you're doing and what you're swiping from will say "dude, not cool, stop swiping their style".
These image generators actively encourage adding the names of prominent, living artists to your prompts to get the results you want. Is this crossing the same line Keith Giffen did?
[+] [-] surfacedetail|3 years ago|reply
We can't assume there's no licensing behind closed doors; my guess is that OpenAI has an agreement with Getty. Take a look at the image licensing in this Observer piece: it's been licensed from Getty, which would indicate that Getty are happy with the scraping.
https://www.theguardian.com/commentisfree/2022/aug/20/ai-art...
Besides, this is not infringement in principle; the AI has simply been trained to think that high-quality news images have watermarks.
[+] [-] registeredcorn|3 years ago|reply
If a company reverse engineers a competitors product, they still buy the product to tear it apart and figure out how it works.
If a student learns from their teacher, then goes on to sell a similar kind of work as what their teacher makes, at least the student paid for the classes.
This arrangement offers none of that. As long as theft is illegal, this should be too. I'd call it parasitic, but it's worse than that: this is a parasite whose sole intent is to kill the host.