Nothing that Watson learned from the Urban Dictionary could possibly be any dirtier than what I hear from enterprise people all the time:
"We use our deep subject matter expertise to deliver value through actionable advice that enables our clients to harness the power of best practices in order to shift their paradigms and achieve 10X deltas against competitive industry metrics."
The worst thing is that, translated into normal speak, that's actually a reasonable-sounding proposition:
"We use our experience in this field to provide practical advice to our clients which helps them improve their way of doing business and ROFL-stomp their competition 10x over."
What an interesting reflection on who we are as a species.
We build systems to organize who we are (urbandictionary) but hate it when the systems use that information to tell us who we are (watson).
It feels so much like the emperor isn't wearing any clothes.
Perhaps an appropriate response would be for the computer to measure the tension in the human voice response to its queries and optimize for lower tension.
So it can pick three words:
Bullshit -> 80% confident;
Sham -> 70% confident;
Fallacy -> 50% confident;
Within limits, it will sometimes pick the less optimal word, measure the tension in the response, and find a way of adjusting its confidence based on those responses.
Think multi-armed bandit problem but with social situations. I mean, to be honest, isn't that what we all did when we were in middle school? We used as many bad words as possible, measuring the response we got from others. None of us were born with a binary understanding of when to use certain words; it was more trial and error.
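To make the bandit idea concrete, here is a minimal epsilon-greedy sketch. Everything in it is an assumption for illustration: the `WordBandit` class, the idea that tension arrives as a number between 0 and 1, the choice of `1 - tension` as the reward, and the made-up tension values. Nothing here reflects how Watson actually works.

```python
import random


class WordBandit:
    """Epsilon-greedy word picker (hypothetical, for illustration only)."""

    def __init__(self, confidences, epsilon=0.1, seed=None):
        # Start from the stated confidences, treating each as one prior observation.
        self.estimates = dict(confidences)
        self.counts = {word: 1 for word in confidences}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def pick(self):
        # "Within limits": with probability epsilon, try a word at random.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(sorted(self.estimates))
        # Otherwise use the word with the highest current estimate.
        return max(self.estimates, key=self.estimates.get)

    def update(self, word, tension):
        # Assumed reward: low tension in the listener's voice is good.
        reward = 1.0 - tension
        self.counts[word] += 1
        # Incremental running mean of the rewards seen for this word.
        self.estimates[word] += (reward - self.estimates[word]) / self.counts[word]


def simulate(bandit, tension_of, rounds):
    """Run the pick/measure/update loop and return the currently preferred word."""
    for _ in range(rounds):
        word = bandit.pick()
        bandit.update(word, tension_of[word])
    return max(bandit.estimates, key=bandit.estimates.get)


bandit = WordBandit({"bullshit": 0.8, "sham": 0.7, "fallacy": 0.5}, seed=0)
made_up_tension = {"bullshit": 0.9, "sham": 0.3, "fallacy": 0.2}  # invented numbers
print(simulate(bandit, made_up_tension, rounds=500))
```

The exploration rate `epsilon` is the "within limits" knob: set it to 0 and the bot never risks a less optimal word, set it higher and it behaves more like the middle schooler.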
I'm not an "angel" by any means, but any time I visit Urban Dictionary, I come away feeling filthy. I typically go there to look up some abbreviation I heard on reddit or IRC or a blog post somewhere and what I look up usually ends up being middling-dirty, but the stuff that I see there "on my way" to the word I'm looking for makes me cringe and gives me a bleak vision of what the next generation is going to be like should Urban Dictionary actually be representing the majority of the population (I firmly hold that it does not).
It's more like they couldn't have Watson using them on national television. Many a company, even if it weren't showcasing a machine like Watson, would want to avoid that language on a huge television event like the Jeopardy challenge. I think you might be looking a little too deeply at this.
Or as a culture. It seems to me that the gap between what language is acceptable in informal vs formal settings is fairly large in the US, while in other countries, using words like 'bullshit' in more formal settings is less taboo.
They could have, but then they would have had to go through the Urban Dictionary and try to classify its terms by register. Like it says in the article, the problem wasn't that all of Urban Dictionary was obscene, it was that they couldn't tell the computer which parts were and which parts weren't.
I saw the headline and assumed the story was going to be that management decided that computer memorization was copyright infringement. Glad it was just a computer acting like a teenager and cursing at the dinner table.
"Watson couldn't distinguish between polite language and profanity ...
Ultimately, Brown's 35-person team developed a filter to keep Watson from swearing ..."
Sounds just like what happens when you raise kids. "Daddy why is XXX a good word but YYY a bad word?"
The refrigerator was old and the shelf brackets worn to the point where from time to time they would detach themselves from the door. I arrived home late - I was working long hours, and was fetching my dinner.
Opened the door. Jars and cans and bottles spilled out on the floor.
"Shit!"
From the bathtub I hear my two year old son admonish, "Don't use that word."
It's a great memory, but I still wonder why he learned that lesson so thoroughly at daycare.
Watson couldn't distinguish between polite language and profanity ... Ultimately, Brown's 35-person team developed a filter to keep Watson from swearing ...
It's too bad Urban Dictionary doesn't let users vote on how vulgar they think the dictionary entries are. I've got that feature on The Online Slang Dictionary, but since UD is so much more popular, they could collect that much more data.
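If vulgarity votes like that were available, a swear filter could be sketched roughly as below. This is only an illustration under stated assumptions: the vote format (ratings from 0.0 for polite to 1.0 for vulgar), the function names `build_blocklist` and `filter_answer`, the thresholds, and the sample votes are all hypothetical, and this is not a description of the filter Brown's team built.

```python
from statistics import mean


def build_blocklist(votes, threshold=0.5, min_votes=3):
    """Return the terms whose average vulgarity rating meets the threshold.

    `votes` maps a lowercase term to a list of ratings in [0.0, 1.0].
    Terms with fewer than `min_votes` ratings are left out as too uncertain.
    """
    return {
        term
        for term, ratings in votes.items()
        if len(ratings) >= min_votes and mean(ratings) >= threshold
    }


def filter_answer(candidates, blocklist):
    """Return the first candidate (best first) that is not blocked, else None."""
    for word in candidates:
        if word.lower() not in blocklist:
            return word
    return None


# Invented sample votes, for illustration only.
sample_votes = {
    "bullshit": [1.0, 1.0, 0.8],
    "sham": [0.1, 0.2, 0.0],
    "fallacy": [0.0],
}
blocklist = build_blocklist(sample_votes)
print(filter_answer(["Bullshit", "sham", "fallacy"], blocklist))
```

The `min_votes` cutoff is where popularity matters: the more votes a site collects, the fewer terms fall into the "not enough data" bucket.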
We have tried building educated gentlebots capable of playing chess and other noble pursuits. It didn't lead to GAI.
Maybe an uneducated scumbot would be better? Swearing and cursing because its peers do. Full of prejudice and bigotry because of weak anecdotal evidence. Vengeful. Impulsive. Using questionable grammar. Easily addicted. Cognitively biased. Wishfully thinking. Superstitious. Believing in fallacious logic. Thinking with the little head. Anti-intellectual and believing in conspiracy theories. Gossiping, slandering. Enjoying tv-shop.
"In tests it even used the word "bullshit" in an answer to a researcher's query" — Has to be the funniest thing I've heard all week. Sounds like something straight out of an Adam Sandler movie. This reminds me of an AI chat program I used to have called Billy. He would learn from your words and sentences, actually quite smart and I remember adding in slang words so whenever one of my friends would use it, it would most likely swear and insult them without realising it. The Billy program can be downloaded from here, still works quite well: http://www.leedberg.com/glsoft/billyproject.shtml
I like how someone commented on the main article that the time is getting close to where AI can step up to the plate of creativeness and how widespread and easy this will make our lives. Watson is a giant server farm, not a single PC; this stuff won't make a huge impact until IBM can shrink it or until computers get much, much faster and smaller. Not that it won't happen, it's just not "around the corner" in any way.
I think "around the corner" type predictions generally fall into 2 camps:
1. Problems that we don't know how to solve yet, but we think we are close to based on "similar" problems we have solved.
2. Problems that we have a solution for, but it currently takes an unreasonable amount of time to use these solutions in practice.
Problems in class 1 are like AI in the 1960s-70s: everybody thought we were super close to amazing AI based on discoveries we'd had, but these estimates were very wrong.
Problems in class 2 are like NLP and ML work in the 90s and 00s. A rather large chunk of the 'wow' ML/NLP we have in applications today was pretty much solved 20 years ago, but there was no sane way to run it, certainly not on your cell phone.
Problems in class 2 are safer bets, since there do seem to be consistent increases in processing power, memory, etc. Problems in class 1 are harder to guess because, as history has shown, just because a problem seems similar doesn't mean that it actually is (shortest path is solved, longest path is NP-hard, and shortest path touching all points once, i.e. TSP, is NP-hard).
I think it's safe to say having Watson on our smart phones is right "around the corner" (20-30 years?); saying that we'll create "creative" AI, not so much.
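The shortest-path versus TSP contrast can be seen in a few lines. This is a sketch on a made-up four-node graph: `dijkstra` is the standard polynomial-time algorithm, while `brute_force_tsp` simply tries every ordering, which is only feasible for tiny inputs (the number of orderings grows factorially). The graph and function names are my own, for illustration.

```python
import heapq
from itertools import permutations


def dijkstra(graph, source):
    """Shortest distance from source to every reachable node (non-negative weights)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def brute_force_tsp(graph, start):
    """Cheapest round trip visiting every node once, by trying all (n-1)! orders."""
    others = [node for node in graph if node != start]
    best = None
    for order in permutations(others):
        tour = (start, *order, start)
        cost = sum(graph[a][b] for a, b in zip(tour, tour[1:]))
        if best is None or cost < best[0]:
            best = (cost, tour)
    return best


# Made-up complete graph with symmetric edge weights.
graph = {
    "A": {"B": 1, "C": 4, "D": 3},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"A": 3, "B": 5, "C": 1},
}
print(dijkstra(graph, "A"))
print(brute_force_tsp(graph, "A"))
```

On four nodes both finish instantly. The difference shows up as the graph grows: Dijkstra keeps scaling gracefully, while the exhaustive tour search becomes hopeless after a couple of dozen nodes.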
Wireless communication is widespread enough that I don't think it matters too much where Watson "lives". The inputs and outputs required from "him" (for questioning, anyway, not for training) are tiny, so bandwidth isn't much of a concern. Assuming the architecture is parallel enough that it can respond to lots of people at once, how much it is distributed vs hosted on one system isn't particularly relevant to its usefulness, IMO.
I don't think robots will ever be completely autonomous. There will always be Skynet or a central data center which feeds information and controls each machine. Otherwise, things could potentially get out of hand if machines are intelligent enough, and all signs indicate that they very well will be within the next 100 years, if not less.
As an aside, the best treatment of taboo I've ever read is law professor Christopher Fairman's paper, "Fuck".
It explores that word through the lens of jurisprudence, which I think is a fascinating and unusual approach to taboo. It's exceptionally well-written and manages to be witty, absurdist, informative, and thought-provoking in equal measure.
At issue are the First Amendment, self-censorship, sexual harassment, education, and broadcasting.
[+] [-] edw519|13 years ago|reply
[+] [-] callmeed|13 years ago|reply
http://www.urbandictionary.com/define.php?term=synergasm
;)
[+] [-] benohear|13 years ago|reply
[+] [-] bguthrie|13 years ago|reply
[+] [-] anonymous|13 years ago|reply
[+] [-] mbesto|13 years ago|reply
[+] [-] IheartApplesDix|13 years ago|reply
[+] [-] nnnnni|13 years ago|reply
[+] [-] ljd|13 years ago|reply
[+] [-] ComputerGuru|13 years ago|reply
[+] [-] rtkwe|13 years ago|reply
[+] [-] skreech|13 years ago|reply
Wonder if there's any research on this.
[+] [-] brudgers|13 years ago|reply
I'm not sure which is worse: the singularity with a bullshit detector or without.
[+] [-] sp332|13 years ago|reply
[+] [-] yassim|13 years ago|reply
Sure, it could/should/must not be used to advertise the tech, but if I had a pocket Watson, I'd have no problems with it calling it like it parses it.
[+] [-] dhughes|13 years ago|reply
[+] [-] nsns|13 years ago|reply
[0] http://en.wikipedia.org/wiki/Register_%28sociolinguistics%29
[1] http://en.wikipedia.org/wiki/Code_switching
[+] [-] azernik|13 years ago|reply
[+] [-] NoPiece|13 years ago|reply
[+] [-] ChuckMcM|13 years ago|reply
[+] [-] RyanMcGreal|13 years ago|reply
[+] [-] plg|13 years ago|reply
Sounds just like what happens when you raise kids. "Daddy why is XXX a good word but YYY a bad word?"
"It just IS. Don't say that word again."
"Ok Daddy" (kid adds word to internal blacklist)
[+] [-] brudgers|13 years ago|reply
[+] [-] WalterGR|13 years ago|reply
[+] [-] im3w1l|13 years ago|reply
[+] [-] mxfh|13 years ago|reply
[+] [-] 3am_hackernews|13 years ago|reply
[+] [-] icodestuff|13 years ago|reply
[+] [-] sethbannon|13 years ago|reply
[+] [-] DigitalSea|13 years ago|reply
[+] [-] mitchi|13 years ago|reply
[+] [-] jeremyarussell|13 years ago|reply
[+] [-] Homunculiheaded|13 years ago|reply
[+] [-] georgemcbay|13 years ago|reply
[+] [-] tlb|13 years ago|reply
Can you explain why your argument is valid, but the one above isn't?
[+] [-] sakopov|13 years ago|reply
[+] [-] egypturnash|13 years ago|reply
The big question is just when is Moore's Law going to quit applying.
[+] [-] hhuio|13 years ago|reply
[+] [-] edj|13 years ago|reply
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=896790
[+] [-] phogster|13 years ago|reply
[+] [-] yxhuvud|13 years ago|reply