I used Haar cascade detection to find faces in the video feed. Once a face is found, its distance from the camera is computed using the camera's apparent focal length and overlaid on the video feed. The computed distance is fairly accurate; I cross-checked it, though not exhaustively. It also works well when there are multiple faces in the video feed.
About the Haar cascade classifier, from the OpenCV website:
"Object Detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their paper, "Rapid Object Detection using a Boosted Cascade of Simple Features" in 2001. It is a machine learning based approach where a cascade function is trained from a lot of positive and negative images. It is then used to detect objects in other images."
One limitation I found with the built-in classifier is that it is not great at finding faces that have a slight roll or yaw (see Fig. 1). It does pretty well on faces with a large roll, though.
In the video, from 0:12 to 0:14, you can see the limitation I am referring to.
Calculation of the focal length of the camera.
In order to determine the distance from our camera to a known object or marker, we are going to utilize triangle similarity.
The triangle similarity goes something like this: Let’s say we have a marker or object with a known width W. We then place this marker some distance D from our camera. We take a picture of our object using our camera and then measure the apparent width in pixels P. This allows us to derive the perceived focal length F of our camera:
F = (P x D) / W
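As a worked example of this calibration step, here is a small sketch. The numbers (a 15 cm wide face, 60 cm from the camera, appearing 200 pixels wide) are made up for illustration, and the function name is hypothetical.

```python
def focal_length_px(pixel_width, known_distance, known_width):
    """Perceived focal length from one calibration shot: F = (P x D) / W."""
    if known_width <= 0:
        raise ValueError("known_width must be positive")
    return pixel_width * known_distance / known_width


# Hypothetical calibration values, not measurements from the video.
P = 200     # apparent width in pixels
D = 60.0    # distance from the camera in cm
W = 15.0    # real width of the object in cm

F = focal_length_px(P, D, W)
print(F)  # 800.0
```

D and W must be in the same unit; F then comes out in pixels.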
As I continue to move my camera both closer and farther away from the object/marker, I can apply the triangle similarity to determine the distance of the object to the camera:
From a known P, D and W we can calculate F, then use it to find the distance D' of an object whose apparent width is now P' pixels:
D' = (W x F) / P'
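The distance step can be sketched in the same way. The focal length of 800 pixels and the face width of 15 cm are assumed values for illustration, and the function name is hypothetical. Several measured pixel widths are used to mimic several faces in one frame.

```python
def distance_to_camera(known_width, focal_length, pixel_width):
    """Distance from triangle similarity: (W x F) / measured pixel width."""
    if pixel_width <= 0:
        raise ValueError("pixel_width must be positive")
    return known_width * focal_length / pixel_width


# Assumed values, not measurements from the video.
F = 800.0   # perceived focal length in pixels, from a calibration shot
W = 15.0    # real width of a face in cm

# Hypothetical pixel widths of three detected faces in one frame.
measured_widths = [400, 200, 100]
distances = [distance_to_camera(W, F, p) for p in measured_widths]
print(distances)  # [30.0, 60.0, 120.0]
```

Halving the apparent width doubles the estimated distance, which is what the triangle similarity predicts.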