I'm waiting for journalists to walk around with a Google Glass-type device to do this on the fly. Bonus: it could record what they see and hear for later use.
I always think the future of journalism will be something like Warren Ellis's 'Transmetropolitan', where there are cameras (drones) everywhere, watching everything, and an escalating tension (and maybe a technical arms race) between those hiding or burying the signal and those trying to bring it to light.
Consider this a recommendation for anyone looking for an inspiring (though somewhat adult) tale of near-ish future sci-fi.
I was hoping to read an article about the NYTimes setting up video cameras outside popular restaurants in DC and using ML to perform facial recognition on everyone to try to find members of Congress and well-known lobbyists. Oh well... it would be like TMZ-4-DC.
The author says that training their own model would have been too hard due to lack of training data, but evidently Rekognition had sufficient training data to make it work? Why can't NYT use the same training set Rekognition uses? Does Amazon somehow have a secret non-public collection of celebrity photos?
It shouldn't take an intern too long to collect a representative set of Congress people and other high officials for training. Maintaining it would not be an undue burden. That would eliminate the false positive matches for all the unwanted celebs. Clearly Amazon's models aren't that great to begin with so there's little reason to stick with them.
Wrap it up into a simple native app and you can bypass the MMS BS. Even better, a sufficiently capable dev could integrate an opensource recognition library [1] to have it entirely implemented on the device.
Rekognition crawled and annotated millions of images of different celebrities to train their face recognition model. Once you have an accurate model for a lot of classes it's much easier to add new ones with just a few samples.
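To make the "few samples" point concrete, here is a minimal sketch of how new people are often added on top of an already-trained face model: the model is only used to turn a face into an embedding vector, and a new person is enrolled by averaging a handful of their embeddings. This is an illustration under assumptions, not a description of how Rekognition works internally. The `FaceGallery` class, the 0.8 threshold, and the toy 3-dimensional vectors are all hypothetical; real embeddings would come from a pretrained network and have far more dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


class FaceGallery:
    """Hypothetical gallery: one averaged embedding per enrolled person."""

    def __init__(self, threshold=0.8):
        # Assumed cut-off; in practice it would be tuned on held-out photos.
        self.threshold = threshold
        self.centroids = {}

    def enroll(self, name, embeddings):
        # Adding a new class needs only a few embeddings, no retraining.
        self.centroids[name] = centroid(embeddings)

    def identify(self, embedding):
        # Return the most similar enrolled name, or None if nothing is close enough.
        best_name, best_score = None, -1.0
        for name, center in self.centroids.items():
            score = cosine(embedding, center)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None


gallery = FaceGallery()
gallery.enroll("member_a", [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]])
gallery.enroll("member_b", [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]])
```

With this setup, `gallery.identify([0.95, 0.05, 0.0])` resolves to `"member_a"`, while a vector unlike anyone enrolled falls below the threshold and returns `None` instead of a forced match.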
I have wanted for a while to build a site that trained a machine learning system on the various data available about Congresspeople, including information on members who were eventually found guilty of adultery or similar crimes, and then produced a score for every member of Congress rating how likely it is that they are cheating on their spouse, taking bribes, or similar. Give them a sneak preview of the types of systems they are aiding and abetting in the creation of. I am uncertain whether it could be considered defamation to have a brainless machine learning system decide there's an 85% chance some random member of Congress is an adulterer. I don't actually believe that any such system could ever reach any reasonable level of actual effectiveness due to the fundamental complexities of human behavior and circumstance, but that's not stopping the law enforcement side of things from moving forward, so I don't see why it ought to stop the side trying to point out fundamental flaws in the strategy.
I've considered something like that, but instead of trying to figure out crimes, it would produce a score for bills.
A corruption score for bills, almost like a facebook for bills "This bill is friends with Exxon". It would figure out who spent the most getting the bill passed, and who they bought off to get it.
Just a simple thing for people to point to when they say things are corrupt. Granted, in today's environment that score would be 100% most of the time, but it would be interesting to have some idea of just who bought the bill.
I’d take it a step further and ingest all public record data including using FOIA requests to find any behavior that could have a representative charged with a crime (fraud, bribery, etc).
As sibling comment said, don’t generate an adultery score. That’s not productive or decent. Find actual evidence of wrongdoing, not draconian scoring systems.
It's disturbing to me that you're so focused on adultery, which isn't a crime in most places and is a personal matter for the couple involved. More than 70% of people cheat on a significant other at some point, so you'd be casting a wide net.
Why not instead look at real crimes like pay-for-play, fraud, sexual assault, etc.?
> I don't actually believe that any such system could ever reach any reasonable level of actual effectiveness due to the fundamental complexities of human behavior and circumstance...
Absolutely it could - that would all be factored into the percentage. Human behavior and chance encounters are the exact reason you could never say 0% or 100%, however.
This is an embarrassingly bad approach to face recognition for a small set of frequently photographed people.
Several comments from the article give me concern:
- They seem to think Rekognition is a panacea for their problem, but there are many known issues with Rekognition celebrity detection. Not to mention that the cost-per-request is often highly unfavorable compared with building a higher-accuracy, situation-specific solution with extensions to pre-trained models.
- They say some interns took a “novel approach” by creating a hard coded look-up table for disambiguating similar politician-celebrity pairs. This creates awful tech debt and failure cases. I’m not knocking it too hard because it’s pragmatic, which is a good sign about those interns, but this should be seen as a necessary wart to be improved, not a point of pride.
- As others have pointed out, even considering turnover in Congress, it seems like people who report on Congress for their full time job should recognize them. It truly seems like a silly, wasteful use of resources to solve this with computer vision.
This is all consistent with what I’ve heard from colleagues at NYT data science, as well as from people I’ve known in data science bootcamps around New York, like Insight, who heard recruiting pitches.
Their department seems self-aggrandizing, using highly overwrought personalization models and seeming to have 538-envy in how they want their data science work to come off, despite 538 leaving, along with other important figures like Mike Bostock.
It just comes off as a place that wants to do status signalling to seem like a machine learning or data science thought-leader, but they don’t pay competitively or do what’s needed to retain good people, and would rather do patchwork stuff like this with interns than take the work a little more seriously.
I don’t get the impression it’s a place serious ML practitioners would want to go.
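For readers wondering what the hard-coded look-up table criticized above might look like, here is a rough sketch. It is not the NYT code: the names, the `CONFUSABLE` table, the `MEMBERS` set, and the 80.0 confidence cut-off are all placeholders, and the input is assumed to be a (name, confidence) pair taken from whatever a celebrity-recognition service returns.

```python
# Hypothetical roster of people the tool is supposed to report.
MEMBERS = {"Rep. Example One", "Sen. Example Two"}

# Hypothetical table: celebrity the service tends to return -> member it usually means.
CONFUSABLE = {
    "Some Famous Actor": "Rep. Example One",
}


def resolve(name, confidence, min_confidence=80.0):
    """Map a raw (name, confidence) match to a roster member, or None."""
    if confidence < min_confidence:
        return None  # too uncertain to report at all
    if name in MEMBERS:
        return name  # the service already returned a roster member
    # Otherwise fall back to the manual table; unknown celebrities are dropped.
    return CONFUSABLE.get(name)


print(resolve("Some Famous Actor", 91.5))
```

The tech-debt concern shows up directly in the code: every new misidentification needs another manual entry in `CONFUSABLE`, and an entry silently becomes wrong if the celebrity really is in the photo.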
Isn't this the same technology that would allow surveillance on every private citizen?
> Most recently, Rachel Shorey found members of Congress at an event hosted by a SuperPAC by trawling through images found on social media and finding matches.
I bet nothing in the technology says "member of Congress" or depends on the target being member of Congress. So anybody can mine social media and collect surveillance data on people. And that is probably already happening.
There's a difference between "difficult" and "can't be done". Yes, facial recognition has come a long way, but it's still non-trivial to set up a custom facial recognition service for your particular needs.
The obvious next step would be to build a mobile app with a built-in model to recognize everyone deemed important using live video from the camera.
Hmmm ... your job is to cover the actions of 540 people elected to DC, many of whom you already recognize, and you can't remember what they look like? I'm not a journalist, but that seems like an essential thing to memorize, along with some minor metadata (locale, party, a bit of bio). Spend a weekend and do it.
Every profession has things you can look up and things you just have to memorize. 540 people isn't much - can sports journalists recognize 540 athletes? Otherwise you'll be in situations where you don't have an opportunity to look them up (e.g., can't get a photo, no time, etc.), and you'll have many false negatives: If you don't know what they look like, you won't realize it's a member of Congress at the party with the coke.
As the article states up top, there's decent churn in Congress, making this more than a one-time or annual thing. Also, it's not just members of Congress who are important to cover in a beat, but their senior staff members and aides.
Spending a significant amount of time developing a process for face memorization and undertaking it would be an example of needless/premature optimization, especially for people who may be covering Congress tangentially. Most of a Congress reporter's job does not depend on having random encounters with members of Congress.
Well... I don't know if that's a fair comparison. Members of Congress don't generally walk around with their names embroidered on their shirts (but, hey, that might be a good idea!)
[+] [-] davidkuhta|7 years ago|reply
On that note, they could utilize the box color to match the party affiliation.
[+] [-] danschumann|7 years ago|reply
[+] [-] kingbirdy|7 years ago|reply
[+] [-] gordon_freeman|7 years ago|reply
[+] [-] nkassis|7 years ago|reply
[+] [-] baldeagle|7 years ago|reply
[+] [-] SurrealSoul|7 years ago|reply
[+] [-] IncRnd|7 years ago|reply
[+] [-] rhacker|7 years ago|reply
[+] [-] Maxious|7 years ago|reply
> Rachel Shorey found members of Congress at an event hosted by a SuperPAC by trawling through images found on social media and finding matches.
[+] [-] ericsoderstrom|7 years ago|reply
[+] [-] kevin_thibedeau|7 years ago|reply
[1] https://github.com/rudybrian/tuFace
[+] [-] m_ke|7 years ago|reply
[+] [-] AdmiralAsshat|7 years ago|reply
(And no one else)
[+] [-] 2RTZZSro|7 years ago|reply
[+] [-] Isamu|7 years ago|reply
[+] [-] reaperducer|7 years ago|reply
(See previous HN discussion)
[+] [-] jonknee|7 years ago|reply
[+] [-] jeremyjbowers|7 years ago|reply
[+] [-] otakucode|7 years ago|reply
[+] [-] cachemiss|7 years ago|reply
[+] [-] toomuchtodo|7 years ago|reply
[+] [-] smacktoward|7 years ago|reply
Adultery is not a crime.
(You can argue that it's an indicator of a person's character, or lack thereof, sure. But that's something different.)
[+] [-] smt88|7 years ago|reply
[+] [-] deaps|7 years ago|reply
[+] [-] danso|7 years ago|reply
[+] [-] sbarker|7 years ago|reply
[+] [-] dominotw|7 years ago|reply
Sounds like a really mean spirited thing to do. They are people too.
[+] [-] mlthoughts2018|7 years ago|reply
[+] [-] smsm42|7 years ago|reply
[+] [-] asdsa5325|7 years ago|reply
[+] [-] djhworld|7 years ago|reply
[+] [-] DINKDINK|7 years ago|reply
Someone is woefully ignorant of how good facial-recognition surveillance is.
[+] [-] SmooL|7 years ago|reply
[+] [-] evan_|7 years ago|reply
[+] [-] dqpb|7 years ago|reply
[+] [-] rootsudo|7 years ago|reply
[+] [-] forapurpose|7 years ago|reply
[+] [-] danso|7 years ago|reply
[+] [-] jonas21|7 years ago|reply
[+] [-] nlawalker|7 years ago|reply
[+] [-] ThrustVectoring|7 years ago|reply