Here's why I have absolutely no sympathy for Google in this situation.
They hired Gebru as a professional thorn in their side. "Come up in here and be a pain in the ass! Tell us what we're doing wrong!", they said. "We're Enlightened Corporate America, after all!" She is a chess piece in the game of Wokeness Street Cred.
She then proceeded to do the job she was hired for, and now they're all "Hey lady, Here At Google that's not how we do things".
That said, as a manager, I would have "accepted her resignation" and/or fired her without hesitation.
Agreed; but this isn't just a Google problem. Seems to me like a lot of SF (and SF-inspired) "big tech" wants to be known for its "wokeness"[1], which leads to hires like Timnit and other "politically-outspoken" people, which in turn leads to situations like this, James Damore, and other individuals/situations that amount to workplace political activism.
I am wholly uncomfortable with any discussion of politics in a workplace environment like a mailing list. I am more than fine with employees choosing to associate politically outside the workplace and workplace spaces, no matter how radical I find their views. This is in itself political - it supports the status quo - but anything else is inviting dissent and combativeness, and situations like these will keep happening.
That said, I've not seen the email outlining her conditions "or else", but I feel I'd have very much taken the same stance as you, given the surrounding coverage of what was in that email. Ultimatums to your employer don't often go well. And perhaps this is a good thing for her, because she may leave for a place that better suits her.
-
[1] my derisive use of this term is not aimed at actual efforts at inclusiveness (those are good), but at surface-level attempts that end up feeling performative, at best.
I don't really have a perspective on this particular case. But yeah. "Companies" (really senior people within those companies) hire people who are known in their field, bring an outside perspective, have an independent voice, etc. And then those people (or perhaps different senior people) get unhappy when the individual who was hired doesn't go through channels or otherwise toe the party line. Either both sides find a middle ground by some combination of putting certain opinions in a box and choosing to ignore the use of channels or messages outside of official guidelines--or you part ways.
I have some experience with this. Not to nearly the same degree. But I had something of a tiff with an exec my second day on the job in a prior role.
I agree with this in part, but I think there is a difference between "tell us what we're doing wrong" and "send emails to mailing lists telling our employees to lobby congress against us". Plus the whole setting an ultimatum thing.
Would you feel differently if, say, Apple hired a "Privacy Watchdog", with a long history of activism on the topic? Someone you could trust to speak out if something was amiss.
If the person is later fired, that's a sign that something is wrong at the company. But if they stay, and have generally good things to say, that's a sign the company can be trusted.
I do think this is a good system!
Her manager, however, apparently would not have: https://m.facebook.com/story.php?story_fbid=3469738016467233...
Something is seriously wrong, BTW, when a manager has to find out from their own ex-employee that that employee has been fired by the manager’s manager, who, while not informing the manager, has sent a message about it to the fired employee’s direct reports.
Irrespective of the merits of the firing itself.
Real change takes hard work and time, and patience, and also playing politics with powerful people you don't like.
Sounds like she basically made a one-sided hit piece against the work of some teams at Google Brain, and took the "my way or the highway" approach. That wasn't so smart of her either...
The problem is she made some higher-up angry and they said to fire her posthaste. I think this is all being blown out of proportion. If she had been professional she wouldn't have issued an ultimatum; if Google had been professional, Google would have said "hol' up, no firing, let's communicate on this". So as usual both sides were wrong and unwilling to be adults.
She had been doing that job for a while. IMO the reason this particular paper got the treatment it did is subtle but simple, based on the following observations:
* Many jurisdictions impose strict controls on the use of racially biased methods (e.g. redlining laws like the Fair Housing Act in the US).
* Different advertising companies have access to different targeting methods (e.g. Facebook has your social graph, Apple has your app use, Google has your search history).
* Language models are a key technology for Google to maintain advertising relevance & thus keep AdWords competitive.
"Models trained on modern language reflect societal biases" may seem like an obvious fact, but once Google says it publicly they no longer have a basis to deny it in court.
> Here's why I have absolutely no sympathy for Google in this situation.
Abstracting away from the particular goal here, the problem itself is difficult and interesting. How can the leader of a large organization change the organization’s culture?
Such an excellent point. Hiring activists is fine, unless your ACTUAL mission is profit vs whatever utopian vision (sometimes worthy, usually delusional) the activist is going for.
I've always found it funny how we treat companies like Google as imaginary people with no single individual bearing responsibility.
To be fair, some people are aware of the people involved like Jeff Dean, but I look forward to a shift in the future where individuals within companies are as popular and held to the same standards as individuals within politics. Some of these corporate individuals already seem to have just as much power so hopefully it doesn't stand to be too much of a stretch in an ideological vision for the future.
With respect to this particular case of Timnit Gebru, it sounds like she was already on her way to being let go. From reading her Twitter, it seems like she has a flair for the dramatic which could make her critiques of people potentially come off as needlessly harsh and unconstructive. Whether that's good grounds to fire her may only be known to those within the company who interacted with her most I guess.
Are these numbers the energy to train a model? The whole point of these new NLP models is transfer learning, meaning you train the big model once and fine-tune it for each use case with a lot less training data.
5 cars worth of carbon emissions is not a lot given that it is a fixed cost. Very few are retraining BERT from scratch.
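For context on why the pretraining cost is (roughly) paid once: downstream users typically load the published checkpoint and fine-tune it on a small labelled set. A minimal sketch of that workflow, assuming the HuggingFace transformers library, PyTorch, and a made-up two-example dataset:

    # Illustrative only: fine-tuning a published BERT checkpoint on a toy task.
    # The expensive part (pretraining "bert-base-uncased") was done once, upstream.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    texts = ["great movie", "terrible movie"]      # toy labelled data for the downstream task
    labels = torch.tensor([1, 0])
    batch = tokenizer(texts, padding=True, return_tensors="pt")

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for _ in range(3):                             # a few passes is typical for fine-tuning
        out = model(**batch, labels=labels)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()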
EDIT:
The other two points are also disingenuous.
* "[AI models] will also fail to capture the language and the norms of countries and peoples that have less access to the internet and thus a smaller linguistic footprint online. "
NLP in "low resource" languages is a major area of research, especially because that's where the "next billion users" are for Big Tech. Facebook especially is financially motivated to solve machine translation to/from such languages. https://ai.facebook.com/blog/recent-advances-in-low-resource...
* "Not as much effort goes into working on AI models that might achieve understanding, or that achieve good results with smaller, more carefully curated datasets (and thus also use less energy)."
This is also a major area of research. Achieving understanding falls under the purview of AGI, which itself carries ethical and safety concerns. There are certainly research groups working toward this. And reducing parameter sizes of big networks like GPT-3 is the next big race. See https://news.ycombinator.com/item?id=24704952
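As a rough illustration of the parameter-reduction direction (my own example; the model names are public HuggingFace checkpoints, and the counts printed are approximate):

    # Compare parameter counts of a full BERT checkpoint and its distilled counterpart.
    from transformers import AutoModel

    bert = AutoModel.from_pretrained("bert-base-uncased")
    distil = AutoModel.from_pretrained("distilbert-base-uncased")

    count = lambda m: sum(p.numel() for p in m.parameters())
    print(count(bert))    # roughly 110M parameters
    print(count(distil))  # roughly 66M parameters, trained to mimic the larger model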
Carbon emissions arguments tend to ignore the value of what's being done as well. BERT and other transformers were meaningful experiments that were valuable in furthering a major research direction and enabling more effective consumer and business applications. In that sense, it's like any other company doing R&D - of course energy will be used and of course there will be some inefficiencies.
I think it's quite misleading to compare the energy usage of an industry-wide research effort to individual consumption. The graphs look bad - "wow, 626,000 lbs! that's 284 metric tons of CO2! a plane flight is way less!" - but there's a fundamental difference between "progress on a problem being worked on by thousands of highly-paid researchers" and "I bought a car".
Meanwhile, the worst power plants are generating on the order of 10+ million tons of CO2 every year. There are at least a dozen of these in the US alone. Car factories are emitting hundreds of thousands of tons of CO2 (Tesla is somewhere around 150,000 tons a year, apparently, and it's designed to be efficient). Perhaps activism around CO2 emissions in ML training might be better focused on improving the efficiency of those things instead, seeing as a 1% improvement would outweigh the entirety of the NLP model training industry. It's certainly good to keep in mind the energy costs of training in case things balloon out of control, but right now the costs relative to the results seem small and not worth highlighting as some forgotten sin.
The Strubell paper, which is the origin of this "5 cars" number, isn't even in the right ballpark for this stuff. What they did was take desktop GPU power consumption running the model in fp32, then extrapolate to a 240-GPU (P100) setup that would run for a year straight at 100% power consumption.
Yes, if you do run 240 P100s at literally 100% 24/7 for a year, you get the power consumption of 5 cars. That run never happened, though; this all ran on TPUs at lower precision, lower power consumption, and much lower time to converge.
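For a sense of scale, here is my own back-of-envelope version of that extrapolation; the TDP, grid carbon intensity, and per-car lifetime figures are rough assumptions, not numbers from the paper:

    # Back-of-envelope only; every constant below is an assumed round number.
    gpus = 240
    tdp_kw = 0.25                               # ~250 W per P100 at full load (assumed)
    hours = 24 * 365
    energy_mwh = gpus * tdp_kw * hours / 1000   # ~526 MWh for a year at 100%
    co2_tonnes = energy_mwh * 0.45              # ~0.45 t CO2 per MWh, rough US grid average
    cars = co2_tonnes / 57                      # ~57 t CO2 per car lifetime incl. fuel (assumed)
    print(energy_mwh, co2_tonnes, cars)         # ~526 MWh, ~237 t, ~4 cars

That lands in the same ballpark as the "5 cars" figure, even before accounting for the point above that the real runs were on TPUs and far shorter.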
If anything this tells you that electronics are ridiculously green even when operating at 100%. I've never profiled world-wide carbon production but something tells me if you wanted to carbon optimise you'd be better served trying to take cars off the road and planes out of the sky.
> 5 cars worth of carbon emissions is not a lot given that it is a fixed cost. Very few are retraining BERT from scratch.
It's not a fixed cost though, it's just how much was spent on this year's iteration of the model. The overall point being made is that model training costs are growing unbounded. Next year it could be 30 cars' worth or whatever for BERT-2, then 600 cars' worth for BERT-3 the year after. That's what it's warning against. At some point it isn't worth it.
I'm more interested in the related nugget, which is the carbon footprint of a Google Search. Some estimates from a decade back put it at 7g (boil a cup of water), but since then it's probably only gotten larger. However, if Google's datacenters are truly carbon neutral, does it even matter?
I did not read the paper (just like most people here), but by the title — “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” — it does not look like the CO2 emissions thing is the main topic of this research.
BTW, "Stochastic Parrots" is a very descriptive name for the problem
> Moreover, because the training datasets are so large, it’s hard to audit them to check for these embedded biases. “A methodology that relies on datasets too large to document is therefore inherently risky,” the researchers conclude. “While documentation allows for potential accountability, [...] undocumented training data perpetuates harm without recourse.”
Since these models are being applied in a lot of fields that directly affect the lives of millions of people, this is a very important and underdiscussed problem.
I really want to read the paper.
> Since these models are being applied in a lot of fields that directly affect the lives of millions of people
In particular, it is being applied right now to rank Google search results, and probably responsible for lots and lots of Google's profit. You should be skeptical of Google's appraisal of the paper that is material to Google's profit.
The first page was leaked. The environmental angle was a significant part of it, particularly the claim that environmental and financial costs ‘doubly punishes marginalized communities’.
> BTW, "Stochastic Parrots" is a very descriptive name for the problem
Meh, both parrots and language models are inferior to humans in producing language, but one is more useful than the other. And real parrots are also stochastic, like all living things.
Why? It sounds like ideologically driven circular reasoning. If you train an AI on the largest dataset it's possible to obtain then you have, almost by definition, done the most you can to avoid bias of any sort: the model will learn the most accurate representation of reality it can given the data available.
Gebru is the type of person who defines "bias" as anything that isn't sufficiently positive towards people who look like herself, not the usual definition of a deviation from reality as it exists. Having encountered AI "fairness" and "bias" papers (words quoted because the words aren't used with their dictionary definitions), it's not even clear to me they should count as research at all, let alone be worth reading. They take as the starting premise that anything a model learns about the world that is politically incorrect is a bug, and go downhill from there.
I feel very disheartened that whatever good Timnit Gebru is doing in her work seems to be undermined by her constant antagonism and hostility. Based on this article, and additional context from other Google employees, I think she is in the right, but she herself is such an obnoxious person that it's hard for a lot of us here to separate the issue from her personality.
My first introduction to Gebru (and probably a lot of others here) was through her fight with Yann LeCun. Whatever everyone thinks about that debate, one thing was clear: LeCun consistently asked his supporters not to attack Gebru, and tried to engage with the discourse, whereas she repeatedly encouraged her supporters to attack LeCun, and avoided his points.
FWIW, I myself am a minority. I am very sympathetic to the investigation of AI ethics and Gebru's general area of work. I also completely understand and support the need for activists to disrupt norms: we wouldn't have civil liberties if activists weren't willing to disrupt middle-class norms.
But I think for exactly that reason activists have a responsibility (a greater responsibility even) to be selective about which norms or civilities they choose to disrupt. They do such enormous harm when their actions can be picked apart and dismissed for completely unrelated issues - in this case because Timnit doesn't seem to engage in good faith discussions with other experts in her field. It sucks, I really hate that this is what ethics in tech is going to be associated with.
> whereas she repeatedly encouraged her supporters to attack LeCun, and avoided his points.
This is historical revisionism, colored by your misremembering of the events. If you go back and look, you'll be unable to find Gebru encouraging anyone to attack LeCun.
Nor will you find LeCun making an effort to engage with Gebru's research. He never so much as acknowledges the existence of her publications when asked to read or comment on them.
Edit: as an amusing example of the importance of this kind of research, my phone decided that when I typed "Gebru", I meant "Henry".
AI ethics research is funny. It's obviously important but it's also kind of ... obvious. I am surprised they get paid so much.
* Lots of processing uses more energy.
* Large amounts of data might contain bad data (garbage in, garbage out)
* Wealthy countries/communities have the most 'content' online so less wealthy countries/communities will be underrepresented.
here are some more:
* AI being forced to choose between two "bad" scenarios will result in an unfavorable outcome for one party
* AI could reveal truths people don't want to hear e.g. it might say the best team for a project is an all white male team between 25 - 30 rather than a more diverse team. It might say that a 'perfect' society needs a homogeneous population.
* AI could disrupt a lot of lower paid jobs first without governments having proper supports and retraining structures in place
Playing the devil's advocate here, I'm with you that half of AI ethics is obvious and the other half is wrong, but isn't it the goal of the field to try and give meaning to things that aren't obviously well defined?
To take an example that's in vogue right now: AI explainability. Nobody even has a definition of what it means for a model to be explainable (is a linear regression "more explainable" than ML? isn't Google search far less explainable than any model of anything ever?), but a reasonable framework for that concept could certainly be interesting.
Obviously, serious frameworks are done with definitions and math, not with words and storytelling, but all the air around those things seems to me more the fault of politicians, crap journalists and freaking idiots on social media (the current term is 'influencer', I reckon) rather than an issue with the field itself.
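To make the ambiguity concrete, here is a toy contrast (my own example, assuming scikit-learn): a linear model hands you coefficients you can read off directly, a boosted ensemble gives you feature importances instead, and whether either counts as an "explanation" is exactly the part nobody has pinned down:

    # Toy contrast between readable coefficients and a black-box ensemble's importances.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

    linear = LinearRegression().fit(X, y)
    boosted = GradientBoostingRegressor().fit(X, y)

    print(linear.coef_)                  # per-feature weights: a direct, global statement about the model
    print(boosted.feature_importances_)  # relative importances: useful, but a different kind of claim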
It's always obvious in hindsight, much like what can be said of something like Dijkstra's algorithm, but the fact of the matter is not everyone can spend the energy to both understand the context of the problem and direct their attention towards the evaluation of ethics within that context. Some people even find it hard to understand the situations of others enough to identify where their technologies can be used for harm.
The problems that AI ethics research addresses are ethical problems that executives and employees aren't paying attention to. They may seem obvious when stated explicitly, because the concepts are easy to grasp (and the cause-and-effect relationships seem simple to derive), but I assure you they are not obvious to a lot of people I know in the field, at least.
There's also a clear misunderstanding of what ethics research should entail:
* AI being forced to choose between two "bad" scenarios will result in an unfavorable outcome for one party
This is a trivial result that doesn't hold much value as a standalone observation and probably wouldn't be touted as a research point in a respectable publication of AI ethics.
* AI could reveal truths people don't want to hear e.g. it might say the best team for a project is an all white male team between 25 - 30 rather than a more diverse team. It might say that a 'perfect' society needs a homogeneous population.
The fact that you made this comment may be a cause for an ethics discussion in itself. You used "truths" to describe the statement "the best team for a project is an all white male team between 25 - 30 rather than a more diverse team." This shows a disregard for the reality that most data is contextualized and biased. Using terms like "best team" and "more diverse team" puts statements like the one you made at risk of drawing a misguided conclusion from the data.
Maybe the following revised statement would be closer to what we can call a contextualized "truth" generated by an ML model:
"Teams that comprise of white males between 25 - 30 have a statistically larger chance of meeting milestones set by leadership rather than teams with one or more non-white male."
Even statements like that aren't complete as my definition of "meeting milestones" could be sourced from self-reported data (in which case it could mean that white males just self-report more milestone completions).
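A tiny simulation of that confound (entirely made-up numbers, just to show the mechanism): if two groups complete milestones at the same true rate but one self-reports more reliably, the logged data alone makes the groups look different:

    # Hypothetical: identical true performance, different self-reporting rates.
    import numpy as np

    rng = np.random.default_rng(1)
    true_rate = 0.5                                   # both groups complete milestones equally often
    report_rate = {"group_a": 0.9, "group_b": 0.6}    # group_a self-reports completions more often

    for group, r in report_rate.items():
        completed = rng.random(10_000) < true_rate
        reported = completed & (rng.random(10_000) < r)
        print(group, reported.mean())                 # observed "completion rate" differs purely from reporting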
* AI could disrupt a lot of lower paid jobs first without governments having proper supports and retraining structures in place
A problem to consider, sure, but this is one of the more popular observations and has been echoed over time within the context of technological advancements in general.
Last comment about this since I've spent too much time on this story. As a researcher in the field I feel bad people don't give Google more credit (I have no affiliation with Google). They created a research environment where researchers have the freedom to work on their own interests and publish papers (they publish more than any other company). You don't find many environments like that outside of academia. I still remember Microsoft Research closing down their research lab on the West Coast and sending a huge number of researchers home. I can tell you I always apply when there is an opening. So far without luck. If you're a Google researcher, don't forget how lucky you are.
This is a poor argument assuming discriminatory practices are happening there. This is like giving Weinstein a free pass because his movies are really good.
Note, I’m not saying Google discriminates, but if they do then your statement doesn’t contribute to their defense.
Anyone else find the narrative on Twitter to be so much different than on Reddit and Hacker News?
On Reddit and Hacker News I have never seen any unconditional support for Timnit. Even amongst those anti-Google and broadly in support of her, there's no wholesale agreement with all of what Timnit says (the assertions of her story, the reasons why, and the conclusions): a. that she was fired, b. that she was fired because of sexism and racism, and c. that all research at Google is pointless and should stop until reform.
On Twitter there is more vocal support, but here and on Reddit, even among those most critical of Google and behind Timnit, there's no entire agreement with her. There are comments supporting some parts but none supporting all, none agreeing with the Google Walkout document, and comments pointing out holes in the official explanations but not supporting the conclusions.
Why is this? I find it odd; I would expect more unequivocal solidarity where there is less public visibility.
This doesn't strike me as anything that needs to be buried. The energy argument is tenuous and at this point models use relatively little compute power. The language bias is a better tack but I don't think this is earth shattering to anyone. Most of the internet is from Western sources and some fraction of that is racist/prejudiced. I think this is obvious to anyone who has ever used the internet.
While I will not dive into the extremes here, this is yet another battle in the fight over language, right? The most important point of criticism seems to be that the language models are "unethical", i.e., not subject to control by a particular political standard.
Am I the only one that finds this line of argumentation highly troubling? The idea that once you do something with language there should be someone proactively controlling you? Should it not always be the output that will be judged by the public?
Is this paper on arxiv? This overview doesn't answer any critical questions. For example, it's easy to fill up 128 references and a reader shouldn't blindly trust a claim that, "The version of the paper we saw does also nod to several research efforts on reducing the size and computational costs of large language models, and on measuring the embedded bias of models."
If a key part of Google's claim is that the paper omits relevant research, an author should have simply posted their 128 references and openly asked what work was missing. This whole saga could be easily solved instead of being dragged out for clicks.
See the other thread where someone mentions that with reCAPTCHA Google learns that all cabs are yellow (and that traffic lights hang over streets; they don't here). But I guess everyone only sees the blind spots of other people.
> A version of Google’s language model, BERT, which underpins the company’s search engine, produced 1,438 pounds of CO2 equivalent in Strubell’s estimate
So... they're saying it used about $100 worth of electricity.
[ https://www.eia.gov/tools/faqs/faq.php?id=74&t=11 ]
[ https://www.statista.com/statistics/190680/us-industrial-con... ]
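Rough arithmetic behind that reading (my own grid-average assumptions, not figures from the article):

    # Sanity check of the "$100 of electricity" estimate; both constants are assumed averages.
    co2_lbs = 1438
    co2_kg = co2_lbs * 0.4536              # ~652 kg CO2
    kwh = co2_kg / 0.45                    # assuming ~0.45 kg CO2 per kWh (rough US grid average)
    cost = kwh * 0.07                      # assuming ~$0.07 per kWh industrial electricity rate
    print(round(kwh), round(cost))         # ~1450 kWh, ~$101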
It's nice to finally get to see some of the content of the paper and it's awesome that Bender was willing to step up and give some context to world about what the hell it was about.
"...is known for coauthoring a groundbreaking paper that showed facial recognition to be less accurate at identifying women and people of color, which means its use can end up discriminating against them"
Wait, isn't that the other way around? If it can't recognize people of some category then it can't be used to discriminate against them, e.g. the police can't use it to identify peaceful protesters with those characteristics. I wonder what would have happened if these networks were much better at recognizing women and people of color; would the paper then be about Google designing technology to detect minorities?
> An AI model trained on vast swaths of the internet won’t be attuned to the nuances of this vocabulary and won’t produce or interpret language in line with these new cultural norms.
As those become pervasive cultural norms, why would the model not adapt to include them?
> An AI model taught to view racist language as normal is obviously bad. The researchers, though, point out a couple of more subtle problems. One is that shifts in language play an important role in social change; the MeToo and Black Lives Matter movements, for example, have tried to establish a new anti-sexist and anti-racist vocabulary. An AI model trained on vast swaths of the internet won’t be attuned to the nuances of this vocabulary and won’t produce or interpret language in line with these new cultural norms.
This is a pretty superficial take on what is an extremely interesting sociological topic. (To be clear, I’m referring to the article, not the underlying paper which we don’t have.) Obviously just because social movements “have tried to establish ... vocabulary” doesn’t mean that vocabulary has become a “new cultural norm.” Plenty of such efforts end up being cultural dead-ends.
Take for example a term like “LatinX.” This term has been proposed and is used by certain people, but is extremely unfamiliar and often alienating to Latinos themselves: https://www.vox.com/2020/11/5/21548677/trump-hispanic-vote-l... (“[O]nly 3 percent of US Hispanics actually use it themselves.... The message of the term, however, is that the entire grammatical system of the Spanish language is problematic, which in any other context progressives would recognize as an alienating and insensitive message.”).
The article hand-waves away a deeply interesting question: What should an AI do here? Should AI reflect society, or be a vehicle for accelerating change? It seems at least reasonable to say that the AI should reflect what people actually say, in which case a big training dataset is appropriate, instead of what some experts decide that people should say. In some contexts, for example with “LatinX,” researchers seeking to enhance inclusivity could instead end up imposing a kind of racist elitism. (People without college educations—which disproportionately comprises immigrants and people of color—tend to be less knowledgeable about and slower to adopt these changes in vocabulary.)
The paper seems to imply that AIs should not reflect “social norms” but that training data should be selected to accentuate “attempt[ed]” shifts in such norms. Maybe that’s true, but it doesn’t seem obviously true. To return to the example above, is some Google AI generating the phrase “LatinX” (which 3/4 of Latinos have never even heard of: https://www.pewresearch.org/hispanic/2020/08/11/about-one-in...) in preference to “Latino” or “Hispanic” actually the desired result?
Good grief - this terrible clickbaity headline writing: "We read the paper that forced Timnit Gebru out of Google. Here’s what it says"
The paper DID NOT force her out of Google. Her subsequent behaviour - submitting without approval, rant, ultimatum, and resignation - did. And she wasn't "forced out": she resigned of her own volition. She could have chosen to make improvements to the paper based on the feedback she was given, resubmit it for approval, and then get on with her life, but she went the other way.
The headline from the last discussion on Timnit's exit[0] was awful as well: "The withering email that got an ethical AI researcher fired at Google". So bad in fact that it was changed on HN to more accurately reflect what actually happened: "AI researcher Timnit Gebru resigns from Google" (much more neutral and factual in tone).
Seriously, what happened to journalistic standards and integrity? Why are the actual events being so forcefully twisted to fit a particular narrative? No wonder the general population struggle to figure out what's true and what's not, and fall victim to all kinds of nonsense, BS theories, and conspiracies.
I wish I had a good idea on how to change this behaviour by journalists and publications.
(Clearly this is a problem that goes far beyond Timnit's story.)
[0] https://news.ycombinator.com/item?id=25292386