qz_kb|3 years ago
I have to wonder how much releasing these models will "poison the well" and fill the internet with AI-generated images that make training an improved model difficult. After all, if 9 out of 10 "oil painted" images online start coming from these generative models, it'll become increasingly difficult to scrape the web and learn from real-world data in a variety of domains. Essentially, once these things are widely available, the internet will become harder to scrape for good data, and models will start training on their own output. The internet will also probably get worse for humans, since search results will be completely polluted with these "sort of realistic" images, which can be spit out at breakneck speed by smashing words from a dictionary together...
rhacker|3 years ago
I can see the future as being devoid of any humanity.
slimsag|3 years ago
I guess the concern would be: If one of these recipe websites _was_ generated by an AI, the ingredients _look_ correct to an AI but are otherwise wrong - then what do you do? Baking soda swapped with baking powder. Tablespoons instead of teaspoons. Add 2tbsp of flour to the caramel macchiato. Whoops! Meant sugar.
[0] http://slimsag.com/best-apache-chef-recipe/1438731.htm
rmbyrro|3 years ago
As AI advances, a lot of people will look to experience life outside the digital world.
Even digital communication will not be trustworthy anymore, with deepfakes and everything else, so people will want to get together more often.
Edit: for the lazy ones, yeah, digital will be a sad and heartless environment...
kimi|3 years ago
Considering how many of the readers of said blog will be scrapers and bots, who will use the results to generate more spammy "content", I think you are right.
walt74|3 years ago
I can see a past where this already happened, to paraphrase Douglas Adams ;)
rg111|3 years ago
Unless you assume there are bad actors who will crop out the tags. Not many people now have access to Dall-E2 or will have access to Imagen.
As someone working in Vision, I am also thinking about whether to include such images deliberately. Using image augmentation techniques is ubiquitous in the field. Thus we introduce many examples for training the model that are not in the distribution over input images. They improve model generality by huge margins. Whether generated images improve generality of future models is a thing to try.
Damn I just got an idea for a paper writing this comment.
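The augmentation idea above can be sketched in a few lines. This is a minimal, illustrative example using plain nested lists of grayscale pixel values as stand-in "images" (real pipelines would use a library like torchvision or albumentations); the function names are mine, not from any particular framework:

```python
# Minimal sketch of two common image augmentations, applied to a
# nested-list "image" (rows of grayscale pixel values 0-255).

def hflip(img):
    """Horizontal flip: reverse each row of pixels."""
    return [row[::-1] for row in img]

def brightness(img, delta):
    """Shift every pixel by `delta`, clamped to the 0-255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]

def augment(img):
    """Yield the original image plus flipped/brightened variants,
    turning one training example into several slightly
    out-of-distribution ones."""
    yield img
    yield hflip(img)
    yield brightness(img, 30)
    yield brightness(hflip(img), -30)

if __name__ == "__main__":
    img = [[10, 200], [50, 120]]
    for variant in augment(img):
        print(variant)
```

The point of the commenter's idea is that generator outputs could be mixed in the same way these synthetic variants are: extra examples that don't come from the true data distribution but may still improve generality.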
viraptor|3 years ago
I don't know why people do that, but lots of randoms on the internet do, and they're not even bad actors per se. Removing signatures from art posted online has become a kind of meme in itself, especially when comic strips are reposted on Reddit. So yeah, we'll see lots of them.
JayStavis|3 years ago
The irony is that if you had a great discriminator to separate the wheat from the chaff, it would probably make its way into the next model and would no longer be useful.
My only recommendation is that OpenAI et al. should be tagging all generated images as synthetic. That would be a really interesting tag for media file formats (native support in the format would be better than strippable metadata, though) and probably useful across a lot of domains.
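To make the tagging suggestion concrete, here is one way a "synthetic" flag could be embedded in a PNG today: a stdlib-only sketch that inserts a `tEXt` chunk after the IHDR chunk. This is purely illustrative — `tEXt` is trivially strippable, which is exactly why the comment argues native format support would be better:

```python
import struct
import zlib

def add_text_chunk(png_bytes: bytes, keyword: str, value: str) -> bytes:
    """Insert a PNG tEXt chunk (e.g. a 'synthetic' flag) right after
    the IHDR chunk. Each PNG chunk is: 4-byte big-endian length,
    4-byte type, data, then a CRC32 over type + data."""
    sig = png_bytes[:8]            # 8-byte PNG signature
    body = png_bytes[8:]
    ihdr_len = struct.unpack(">I", body[:4])[0]
    ihdr_end = 4 + 4 + ihdr_len + 4  # length + type + data + CRC
    data = keyword.encode() + b"\x00" + value.encode()
    chunk = (struct.pack(">I", len(data)) + b"tEXt" + data +
             struct.pack(">I", zlib.crc32(b"tEXt" + data)))
    return sig + body[:ihdr_end] + chunk + body[ihdr_end:]
```

A bad actor can of course delete the chunk just as easily, which is the thread's whole point: metadata tags only help against accidental, not adversarial, reuse.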
agar|3 years ago
Neal Stephenson covered this briefly in "Fall; or, Dodge in Hell." So much 'net content was garbage, AI-generated, and/or spam that it could only be consumed via "editors" (either AI or AI+human, depending on your income level) that separated the interesting sliver of content from...everything else.
jillesvangurp|3 years ago
A bit far out there in terms of plot but the notion of authenticating based on a multitude of factors and fingerprints is not that strange. We've already started doing that. It's just that we currently still consume a lot of unsigned content from all sorts of unreliable/untrustworthy sources.
Fake news stops being a thing as soon as you stop doing that. Having people sign off on and vouch for content needs to become a thing. I might see Joe Biden saying stuff in a video on YouTube. But how do I know whether that's real or not?
With deep fakes already happening, that's no longer an academic question. The answer is that you can't know, unless people sign the content: Joe Biden, any journalists involved, etc. You might still not know 100% that it's real, but you can know whether the relevant people signed off on it, and then simply ignore any unsigned content from non-reputable sources. Reputations are something we can track using signatures, blockchains, and other solutions.
It's interesting that Neal Stephenson presents both a problem and a possible solution in that book.
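The sign/verify/ignore-unsigned policy described above can be sketched as follows. A real scheme would use public-key signatures (e.g. Ed25519), so anyone can verify without holding a secret; stdlib HMAC stands in here only to show the flow, and the function names are mine:

```python
import hashlib
import hmac
from typing import Optional

# Stand-in for content signing. Real content authentication would use
# asymmetric signatures (e.g. Ed25519) tied to the signer's identity.

def sign(content: bytes, key: bytes) -> bytes:
    """Produce a signature (here: an HMAC-SHA256 tag) over the content."""
    return hmac.new(key, content, hashlib.sha256).digest()

def verify(content: bytes, signature: bytes, key: bytes) -> bool:
    """Constant-time check that the signature matches the content."""
    return hmac.compare_digest(sign(content, key), signature)

def accept(content: bytes, signature: Optional[bytes], key: bytes) -> bool:
    """The policy from the comment: simply ignore unsigned or
    badly signed content."""
    return signature is not None and verify(content, signature, key)
```

A tampered video, or one with no signature at all, is rejected by `accept`; the open problem the comment gestures at is distributing and trusting the keys, not the cryptography itself.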
afro88|3 years ago
If the AI models can't consume it, it can't be commoditised and, well, ruined.
joshspankit|3 years ago
I think you’re right, and it’s unlikely that we (society) will convince people to label their AI content as such so that scraping is still feasible.
It’s far more likely that companies will be formed to provide “pristine training sets of human-created content”, and quite likely they will be subscription based.
trhway|3 years ago
well, we do have organic/farmed/handcrafted/etc. food. One can imagine information nutrition label - "contains 70% AI generated content, triggers 25% of the daily dopamine release target".
qz_kb|3 years ago
I think this will introduce unavoidable background noise that will be super hard to fully eliminate in future large-scale data sets scraped from the web. There will always be more and more photorealistic pictures of "cats", "chairs", etc. in the data that are close to looking real but not quite, and we can never really go back to a world where there are only "real" pictures, or "authentic human art", on the internet.
abel_|3 years ago
Less common opinion: this is also how you end up with models that understand the concept of themselves, which has high economic value.
Even less common opinion: that's really dangerous.
rajnathani|3 years ago
[0] https://creativecloud.adobe.com/discover/article/how-to-use-...
VMG|3 years ago
Cheap books, cheap TV and cheap music will be generated.
richrichardsson|3 years ago
A digital picture of an oil painting != an actual oil painting
Of course once someone trains an AI with a robotic arm to do the actual painting, then your worry holds firm.