int torsoWidth = bodyWidth - leftHand - rightHand;
Now that we have each hand's length and the torso's width, we can determine whether a hand is raised or not. For each hand, the algorithm tries to detect one of four positions: not raised, raised diagonally down, raised straight, or raised diagonally up. All 4 possible positions are demonstrated on the image below, in the order they were listed above:
To check whether a hand is raised or not, we are again going to use some statistical assumptions about body proportions. If the hand is not raised, its width on the horizontal histogram will not exceed a certain fraction of the torso's width (30%, for example). Otherwise, the hand is raised in some way.
// process left hand
if ( ( (double) leftHand / torsoWidth ) >= handsMinProportion )
{
    // hand is raised
}
else
{
    // hand is not raised
}
So far we are able to recognize one hand position: when the hand is not raised. Now we need to complete the algorithm by recognizing the exact position of a raised hand. To do this, we'll use the VerticalIntensityStatistics class, which was mentioned before. This time, however, the class will be applied not to the entire object's image, but only to the hand's image:
// extract left hand's image
Crop cropFilter = new Crop( new Rectangle( 0, 0, leftHand, bodyHeight ) );
Bitmap leftHandImage = cropFilter.Apply( bodyImageData );
// get left hand's position
gesture.LeftHand = GetHandPosition( leftHandImage );
The image above contains quite good samples, and with histograms like these it is quite easy to recognize the gesture. In some cases, however, we may not get such clean histograms, but noisy ones instead, caused by lighting conditions and shadows. So, before making any final decision about the raised hand, let's perform two small preprocessing steps on the vertical histogram. These two additional steps are quite simple, so their code is not provided here; it can be found in the source code attached to the article.
1) First of all, we need to remove low values from the histogram, for example those lower than 10% of the histogram's maximum value. The image below demonstrates a hand image containing some artifacts caused by shadows. Such artifacts can be easily removed by filtering out low values on the histogram, which is also demonstrated in the image below (the histogram shown is already filtered).
2) Another type of issue we need to take care of is a "twin" hand, which is actually a shadow. This can also be solved easily by walking through the histogram and removing all peaks that are not the highest peak.
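Both preprocessing steps are plain array operations over the histogram values. The sketch below shows one way they might look; it is not the attached source code — the class and method names (HistogramPreprocessing, FilterLowValues, KeepHighestPeak) are hypothetical, and the histogram is represented as a plain int[] for simplicity:

```csharp
using System;

public static class HistogramPreprocessing
{
    // Step 1: zero out histogram values below a fraction of the maximum
    // (e.g. threshold = 0.1 for 10%), removing small shadow artifacts.
    public static void FilterLowValues( int[] histogram, double threshold )
    {
        int max = 0;
        foreach ( int value in histogram )
        {
            if ( value > max )
                max = value;
        }

        int cutOff = (int) ( max * threshold );

        for ( int i = 0; i < histogram.Length; i++ )
        {
            if ( histogram[i] < cutOff )
                histogram[i] = 0;
        }
    }

    // Step 2: keep only the highest peak (non-zero run) of the histogram,
    // zeroing all other peaks (e.g. a "twin" hand produced by a shadow).
    public static void KeepHighestPeak( int[] histogram )
    {
        int bestStart = -1, bestEnd = -1, bestMax = -1;
        int start = -1;

        for ( int i = 0; i <= histogram.Length; i++ )
        {
            bool nonZero = ( i < histogram.Length ) && ( histogram[i] != 0 );

            if ( nonZero && ( start == -1 ) )
            {
                // a non-zero run begins
                start = i;
            }
            else if ( !nonZero && ( start != -1 ) )
            {
                // the run [start, i) ended - check if it holds the global maximum
                int runMax = 0;
                for ( int j = start; j < i; j++ )
                    runMax = Math.Max( runMax, histogram[j] );

                if ( runMax > bestMax )
                {
                    bestMax  = runMax;
                    bestStart = start;
                    bestEnd   = i;
                }
                start = -1;
            }
        }

        // zero everything outside the strongest run
        for ( int i = 0; i < histogram.Length; i++ )
        {
            if ( ( i < bestStart ) || ( i >= bestEnd ) )
                histogram[i] = 0;
        }
    }
}
```

The attached source operates on AForge's histogram objects instead, but the logic is the same: clamp small values to zero, then zero out every non-zero run except the one containing the global maximum.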
At this point we should have quite clean vertical histograms, like the ones we've seen before, so now we are only a few steps away from recognizing the hand gesture. Let's start with recognizing a straight raised hand first. If we take a look at the image of a straight hand, we may make one more assumption about body proportions: the length of the hand is much bigger than its width. In the case of a straight raised hand, its histogram should have a quite high but thin peak. So, let's use these properties to check if the hand is raised straight:
if ( ( (double) handImage.Width / ( histogram.Max - histogram.Min + 1 ) ) >
     minStraightHandProportion )
{
    handPosition = HandPosition.RaisedStraight;
}
else
{
    // processing of diagonally raised hand
}
(Note: the Min and Max properties of the Histogram class return the minimum and maximum values with non-zero probability. In the sample code above, these values are used to calculate the width of the histogram area occupied by the hand. See the documentation for the AForge.Math namespace.)
Now we need to make the last check, to determine whether the hand is raised diagonally up or diagonally down. As we can see from the histograms of hands raised diagonally up and down, the peak for the diagonally up hand is shifted towards the beginning of the histogram (towards the top, in the case of a vertical histogram), while the peak of the diagonally down hand is shifted more towards the center. Again, we can use this property to check the exact type of the raised hand:
if ( ( (double) histogram.Min / ( histogram.Max - histogram.Min + 1 ) ) <
     maxRaisedUpHandProportion )
{
    handPosition = HandPosition.RaisedDiagonallyUp;
}
else
{
    handPosition = HandPosition.RaisedDiagonallyDown;
}
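The checks above can be folded into a single decision helper. The following is a minimal, self-contained sketch, not the article's actual GetHandPosition method: it works on plain numbers rather than AForge's Histogram class, the NotRaised enum value and the default threshold values are illustrative assumptions, and the helper name Classify is hypothetical:

```csharp
using System;

public enum HandPosition
{
    NotRaised,
    RaisedDiagonallyDown,
    RaisedStraight,
    RaisedDiagonallyUp
}

public static class HandClassifier
{
    // handImageWidth: width of the cropped hand image;
    // histMin / histMax: Min and Max of the preprocessed vertical histogram.
    // Threshold defaults are illustrative assumptions, not from the article.
    public static HandPosition Classify(
        int handImageWidth, int histMin, int histMax,
        double minStraightHandProportion = 1.33,
        double maxRaisedUpHandProportion = 0.30 )
    {
        // vertical extent of the hand in the image
        int handHeight = histMax - histMin + 1;

        // straight hand: width much bigger than vertical extent
        if ( ( (double) handImageWidth / handHeight ) > minStraightHandProportion )
            return HandPosition.RaisedStraight;

        // diagonally up: peak shifted towards the top of the histogram
        if ( ( (double) histMin / handHeight ) < maxRaisedUpHandProportion )
            return HandPosition.RaisedDiagonallyUp;

        return HandPosition.RaisedDiagonallyDown;
    }
}
```

In the real implementation, handImageWidth would be handImage.Width and histMin/histMax would come from the Min and Max properties of the preprocessed vertical histogram; the not-raised case is decided earlier, from the horizontal histogram, before this helper would be called.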
We are done! Now our algorithm is able to recognize 4 positions of each hand. Applying the same procedure to the second hand, the algorithm will produce the following results for the 4 hand gestures demonstrated above:
- Left hand is not raised; right hand is not raised;
- Left hand is raised diagonally down; right hand is not raised;
- Left hand is raised straight; right hand is not raised;
- Left hand is raised diagonally up; right hand is not raised.
If two not-raised hands are not considered a gesture, then the algorithm can recognize 15 hand gestures, which are combinations of the different hand positions (4 positions per hand give 4 × 4 = 16 combinations; excluding the one where both hands are down leaves 15).
Conclusion
As we can see from the above, we have obtained algorithms which, first, allow us to extract a moving object from a video feed and, second, to successfully recognize hand gestures demonstrated by that object. The recognition algorithm is simple and easy both to implement and to understand. Also, since it is based only on information from histograms, it is quite efficient and does not require a lot of computational resources, which is important when we need to process many frames per second.
To make the algorithms easy to understand, we've used generic image processing routines from the AForge.Imaging library, which is part of the AForge.NET framework. This means that by moving from generic routines to specialized ones (routines which may combine several steps into one), it is possible to improve the performance of these algorithms even further.
Concerning possible areas of improvement for these algorithms, we may identify the following:
- More robust recognition in the case of hand shadows on walls;
- Handling of dynamic scenes, where different kinds of motion may occur behind the main object.