Face detection – An overview and comparison of different solutions

78 points| metaodi | 7 years ago |liip.ch

17 comments

Note that if you use open-source models, it’s orders of magnitude cheaper. In my own tests, an off-the-shelf DNN-based face detector [1] ran at 20 FPS on a 16 vCPU machine (Google Cloud). With our hand-rolled distributed execution engine [2], we processed around 40 million images for about $2000, which is 14x cheaper than the cheapest figure cited in OP. I don’t know the accuracy difference, but it sounds like OP will cover this in the next post.

[1] https://github.com/davidsandberg/facenet/tree/master/src/ali...

[2] https://github.com/scanner-research/scanner
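As a rough back-of-the-envelope check on the figures quoted above, the sketch below derives a cost per 1000 images and the machine time implied by 20 FPS. It assumes the 20 FPS rate held for the whole 40 million image run on a single machine type, which the comment does not state; the function names are made up for illustration.

```python
# Figures quoted in the comment above; everything else is derived arithmetic.
IMAGES = 40_000_000       # images processed ("around 40 million")
TOTAL_COST_USD = 2000.0   # approximate total spend ("about $2000")
FPS = 20.0                # detector throughput on one 16 vCPU machine


def cost_per_thousand(total_cost: float, images: int) -> float:
    """Average cost in USD for every 1000 images processed."""
    return total_cost / images * 1000


def machine_hours(images: int, fps: float) -> float:
    """Hours one machine would need at a constant frames-per-second rate.

    Assumption: throughput stays constant, which real workloads rarely do.
    """
    return images / fps / 3600


per_1k = cost_per_thousand(TOTAL_COST_USD, IMAGES)
hours = machine_hours(IMAGES, FPS)

print(f"${per_1k:.2f} per 1000 images")
print(f"{hours:.0f} machine-hours at {FPS:.0f} FPS")
```

Under these assumptions that works out to about $0.05 per 1000 images and roughly 556 machine-hours in total, spread across however many machines the engine used.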

bsenftner|7 years ago

It's not open source, but our FR software runs on hardware as modest as a $100 Intel Compute Stick while still providing 10+ million FR compares per second, at accuracy rates in the 90s. On a 3.4 GHz i7 we provide 24M compares per second. The input can be video or stills, acquired from live cameras, media files, or still images, via local/network storage, RTSP and related IP streams, or a REST API. Exports are fully customizable and can include video of tracked heads, pose-corrected heads, and even full 3D mesh reconstructions of each tracked face per frame. If one wanted, video "face swaps" are a piece of cake with our software. Although it is not open source, if the expenses above are typical, we cost far less than using open source: https://cyberextruder.com/facial-recognition-software/

bsenftner|7 years ago

This is marketing misdirection; comparing these companies as if they were leaders in facial recognition is suspicious, nearly fraudulent. They are far from it. These companies are merely brand names that laypeople recognize. If one wants to know the leading contenders in facial recognition, formal tests are performed and published by NIST every year: https://www.nist.gov/programs-projects/face-recognition-gran....

Not to mention that the article does not address any of the key real-world issues in using FR. Their usage ranges are a joke; real-world scenarios hit 100K in far less than a month, and volumes like these are typically measured per minute or per hour. Real-world use of FR at these services' rates would break the bank.

pintxo|7 years ago

Nice overview. Too bad the test set was rather tiny (33 images). Detection rates in this test:

  Amazon    52.66 % 
  Google    40.43 %
  IBM       39.36 % 
  Microsoft 17.55 %

Barrin92|7 years ago

Just for comparison, I ran this open-source library (https://github.com/ageitgey/face_recognition) on the dataset, and even with the CPU model (according to the GitHub page the CNN model is more accurate, but I couldn't test it right now) it detected 59 of 188 faces.

I'm wondering what's up with the Microsoft result. Some sort of error in processing the images?
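To put the 59/188 figure on the same scale as the percentages listed earlier in the thread, here is a small sketch that converts in both directions. It assumes the reported percentages were computed over the same 188 faces, which the thread does not state explicitly; the helper names are illustrative only.

```python
TOTAL_FACES = 188  # denominator taken from the 59/188 figure above

# Detection rates as listed earlier in the thread, in percent.
reported_pct = {
    "Amazon": 52.66,
    "Google": 40.43,
    "IBM": 39.36,
    "Microsoft": 17.55,
}


def faces_from_pct(pct: float, total: int) -> int:
    """Convert a detection rate in percent back to a whole number of faces."""
    return round(pct / 100 * total)


def pct_from_faces(found: int, total: int) -> float:
    """Convert a count of detected faces to a percentage with two decimals."""
    return round(found / total * 100, 2)


# Assumption: every vendor was scored against the same 188 faces.
counts = {name: faces_from_pct(pct, TOTAL_FACES) for name, pct in reported_pct.items()}
open_source_pct = pct_from_faces(59, TOTAL_FACES)

for name, found in counts.items():
    print(f"{name:<10} {found:>3} / {TOTAL_FACES}")
print(f"{'face_recognition':<10} 59 / {TOTAL_FACES} = {open_source_pct} %")
```

On that assumption the listed rates correspond to 99, 76, 74 and 33 faces, and 59 of 188 is 31.38 %, which would place the open-source CPU model between IBM and Microsoft.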

kaivi|7 years ago

I've been tinkering with face recognition over the past few days, and am now waiting for a response to my GCE quota request for a V100 GPU. The $8/h rate for a preemptible instance seems really cheap, yet I am not sure I will ever be able to process my dataset.

Does anyone have an idea how much it would cost to detect faces and extract 128-d encodings from ~100M 200x200 photos?

symisc_devel|7 years ago

PixLab offers quota-based monthly plans for this kind of task, including standard face detection, shape extraction, and gender, age and emotion pattern extraction. They charge $0.9 per 1000 requests once you exceed your monthly quota (1.1M API calls).

https://pixlab.io/cmd?id=facedetect
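For a sense of scale, the overage pricing quoted above can be applied to the ~100M photos asked about upthread. This is only a sketch: it assumes one API call per photo, that everything runs within a single monthly quota, and it leaves out the base price of the plan, which is not given here. The function name is hypothetical.

```python
QUOTA_CALLS = 1_100_000   # monthly quota quoted above (1.1M API calls)
OVERAGE_PER_1000 = 0.9    # USD per 1000 requests beyond the quota


def overage_cost(calls: int,
                 quota: int = QUOTA_CALLS,
                 rate_per_1000: float = OVERAGE_PER_1000) -> float:
    """Overage charge in USD for `calls` requests within one billing month.

    Excludes the plan's base fee, which is unknown here.
    """
    extra = max(0, calls - quota)  # only calls beyond the quota are billed
    return extra / 1000 * rate_per_1000


# Assumption: one request per photo for a ~100M photo dataset.
cost_100m = overage_cost(100_000_000)
print(f"Overage for 100M calls: ${cost_100m:,.0f}")
```

With these inputs the overage alone comes to about $89,010, so at this volume the per-request rate dominates whatever the base plan costs.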

epberry|7 years ago

I have found Rekognition to be quite good. Maybe the most interesting machine learning service result for me from the past year was IBM being much better than Google at speech to text. Specifically, they offered some key features, like speaker attribution, that made their offering stand out. I also found their accuracy rates to be very good.

When I first started exploring these APIs I just assumed Google would be amazing, but that is _not the case at all_. I suspect they save their best stuff for their own products and the API solutions always lag a little behind. IBM may be better because their APIs are such a core offering. Microsoft's stuff is a clear afterthought, and the only Amazon service I've had success with is Rekognition. That said, Transcribe just launched, so it may/will improve with time.

AndrewKemendo|7 years ago

> Before we compare the different face detection API’s, let's scan the images first by ourselves! How many faces would a human be able to detect?

This is great! Honestly, most researchers need to start doing this step. Baselining around accessible human capabilities (even if rough) is a super great way to show the benefits or drawbacks of using ML, especially in image processing applications, where it's more directly comparable.

jeffhollan|7 years ago

Interesting, but to me the most appealing aspect of Face APIs is detecting identity or emotions. I don’t often care whether a face is present, but I want to know things ABOUT the face, which this test didn’t even go into.