I know nothing about this particular company. I also see no reason to assume it’s particularly bad (it could very well be above average). But I do see a lot of data around company engagement. There’s basically no chance this level of positivity is ubiquitous for them, especially when the target workforce appears to be people otherwise below the poverty line.
Although pointless tedium is not the most important factor in rating a job, I think data labeling could well be among the most pointlessly tedious jobs ever created.
Mental tedium is a relative luxury: it's still clean indoor work with a predictable income, and thus far preferable compared to many other options like planting rice (literally backbreaking), working in a factory repeating exactly the same physical motion over and over, scavenging through garbage, etc.
The broader point, though, is that while people in the West bemoan outsourcing to poor countries as exploitative etc. (and it can be!), even these "shit" jobs can be transformational. BPO has propelled literally millions of people into the middle class in both India and the Philippines.
I'm a software engineer at Scale and the author of this blog post.
Let's talk about Venezuela for a second. ~75% of the population lost >19lb in body weight in a year according to this survey: https://www.upi.com/Top_News/World-News/2017/02/19/Venezuela... It's unbelievable that we haven't figured out how to prevent people from living like that in the 21st century. I think it's totally deplorable and honestly an affront to humanity.
You know what I think the most effective way to combat large scale poverty is? Not by working for an aid agency (we've all heard the horror stories) - instead, how about making those people economically valuable? The internet is an amazing way to reach those people - and guess what - Scale is actually doing that (evidence: these stories). If we continue to grow, we'll be doing that even more.
To be clear, this isn't the main mission of the company - but I don't see how anyone could think it isn't a great side effect. Hence the title of the blog post - positive externalities.
One interesting side to this is citizen science; websites like Zooniverse where you can get random members of the public to do your labelling. Technically there's little difference to what Scale does, just that it might be pictures of Pelicans or galaxies rather than cars on the road. In my experience what you're paying for with Scale is consistency. Zooniverse relies on the public being... Gaussian in quality - we needed something like twenty annotations per image I think.
People spend hours going through data and they don't get paid. On Zooniverse most of the data is boring, but the fun is the prospect of finding an unusual image. Some of the datasets are genuinely interesting (like the old ship logs). Many people say they enjoy the idea that they're contributing to science.
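To make the "twenty annotations per image" point concrete, here is a rough sketch of why stacking many noisy volunteer labels can approach the consistency of a paid labeler. It assumes binary labels, independent annotators, and an invented per-annotator accuracy of 0.7; none of these numbers come from Zooniverse or Scale, and the function names are hypothetical.

```python
import random
from collections import Counter


def majority_vote(labels):
    """Return the most common label among the annotations for one image."""
    return Counter(labels).most_common(1)[0][0]


def simulate_accuracy(n_annotators, annotator_accuracy=0.7, n_images=2000, seed=0):
    """Estimate how often a majority vote recovers the true label.

    annotator_accuracy is an assumed value chosen for illustration only.
    Use an odd n_annotators so that a two-way vote cannot tie.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_images):
        # The true label is always "galaxy"; each annotator is right
        # with probability annotator_accuracy, independently of the others.
        votes = [
            "galaxy" if rng.random() < annotator_accuracy else "star"
            for _ in range(n_annotators)
        ]
        if majority_vote(votes) == "galaxy":
            correct += 1
    return correct / n_images


if __name__ == "__main__":
    for n in (1, 5, 21):
        print(n, simulate_accuracy(n))
```

Under these assumptions the aggregated label becomes noticeably more reliable as annotators are added, which is the trade-off described above: many free but noisy votes versus fewer, more consistent paid ones.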
> Although pointless tedium is not the most important factor in rating a job, I think data labeling could well be among the most pointlessly tedious jobs ever created.
I don't want to pass judgement either way, but it's an interesting debate. These jobs have been around forever - data entry, for example, used to be (still is?) a fairly common unskilled job in first world countries. And tedium aside, you can at least work from home if you have a computer.
But where is the line? A lot of work I've seen on MTurk is transcription, for example. What about translation/subtitling services? These are labelling tasks in some way or another.
Are you saying that the tedium itself is engineered to be present? I mean, the work is definitely not pointless. It is as pointed as it gets, as significant as a miner in a mine, a farmer in the field. Monotonous, tedious, yes. Pointless? No.
Now, the idea that the tedium is engineered in would be interesting. That would suggest it could be engineered out. Which is very interesting. You solve that problem, I submit there are some HR folks who would like your contact info.
I don’t feel that the pat on the back is necessary when highlighting these realities of global economic inequality. The whole article could have been published without the intense PR focus.
To clarify, there are no hard numbers on how many people perform this task, how long they are employed, their average working hours per day, the median satisfaction rating, the median pay per employee, etc. It feels as if only some happy examples were picked.
If they don't like it, why are they doing it? For every rational labeller, it must be better than every other option available to them. Perhaps you're saying the workers are irrational and need some outside force to stop them doing this job? Like it's an unhealthy addiction or is limiting their future opportunities?
I'm not familiar with the process of this particular company but in general that's unlikely to have that much impact. While labelling itself might have an effect, lots of biases happen earlier, namely in the selection of samples to be labelled in the first place.
If you think about bias in ML models that we've seen in recent years for example: if you let people label 980 images of old white people and 20 diverse ones that's still a heavy bias, even if they spot and label the 2% outliers correctly.
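The arithmetic behind that example can be sketched in a few lines. The group names and the `group_shares` helper are hypothetical, and every label here is assumed to be perfectly correct, so the skew that remains comes only from which samples were selected.

```python
from collections import Counter


def group_shares(samples):
    """Return each group's share of the dataset, ignoring the labels."""
    counts = Counter(group for group, _label in samples)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


# 980 images from one group and 20 from everyone else, all labelled correctly.
dataset = [("majority", "face")] * 980 + [("minority", "face")] * 20
shares = group_shares(dataset)

if __name__ == "__main__":
    print(shares)
```

Even with flawless labelling, the minority group makes up only 2% of what a model would be trained on.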
Many laboratories hire part-time data labelers to build their own verified datasets. In fact, many global companies create value and earn money from their own datasets. I think the value will come from the data itself, with less focus on data-processing skills. Of course, modeling skills will still be important in the future, but data processing should become much easier. That's why self-service BI tools like Tableau and Elasticsearch are becoming more and more popular. I personally recommend Metatron Discovery, an open-source big data analytics platform for citizen scientists.
Link: https://github.com/metatron-app/metatron-discovery
"Life of a data labeler" is a wildly inaccurate description for "we asked some of our labelers for times when working for Scale made a positive difference in their lives."
What's the median hourly pay for a Scale data labeler? What fraction of their employees enjoy working for Scale? What's their six month retention rate? Amazon's Mechanical Turk has a reputation for grinding people up and burning them out; what makes Scale better?
Specifics are intentionally hidden, but it seems like the median wage is somewhere around $1-$2.50 an hour.
The labeler side of Scale is hosted at "remotasks.com" and they run IP checks to try to ensure that only people from certain countries can sign up to label data. Also, a Facebook account is required for signup...
A similar opportunity (and maybe a step up) would be 'crowd tester'; if a Dollar or Euro goes a little farther in your country than in the west, then 1€ for a bug reproduction up to 10€ for a major bug could be an attractive additional income ...
... while of course not providing a stable regular job, and not necessarily doing much to strengthen the local economy in a fundamental sense.
And how am I supposed to know if the attractive, well-educated, contented folks chosen by this company's PR department are actually representative of its workforce?
Check out my other comment on this thread. I interviewed these folks and wrote the blog post (got lots of feedback from friends and design help from our awesome designers) - I'm a software engineer, not a PR department, lol :P
I tried to pick the answers which were most well written, not the ones which were most positive/negative. Personally I do think this is representative of the labelers who've stuck with us, but at this point I don't have the energy to argue this; hopefully my other comment is convincing (i.e., there isn't a high bar to making people happy when their next best alternative is a lot worse).
[+] [-] b_tterc_p|6 years ago|reply
[+] [-] 9nGQluzmnq3M|6 years ago|reply
[+] [-] cardigan|6 years ago|reply
[+] [-] joshvm|6 years ago|reply
[+] [-] killjoywashere|6 years ago|reply
[+] [-] DevKoala|6 years ago|reply
[+] [-] lopmotr|6 years ago|reply
[+] [-] xkcd-sucks|6 years ago|reply
[+] [-] tastroder|6 years ago|reply
[+] [-] lawrenceyan|6 years ago|reply
https://medium.com/kiwicampus/how-kiwi-empowers-students-in-...
[+] [-] redstone08|6 years ago|reply
[+] [-] Causality1|6 years ago|reply
[+] [-] biofunsf|6 years ago|reply
[+] [-] UweSchmidt|6 years ago|reply
[+] [-] sars1996|6 years ago|reply
[+] [-] cardigan|6 years ago|reply
[+] [-] AFascistWorld|6 years ago|reply
[+] [-] _1tan|6 years ago|reply
[+] [-] ddeokbokki|6 years ago|reply