In August this year, Lei Feng's Network (search for the "Lei Feng's Network" public account to follow us) will hold a global artificial intelligence and robotics innovation conference in Shenzhen, on an unprecedented scale. At the conference, Lei Feng's Network will publish its "Top 25 Innovative Companies in Artificial Intelligence and Robotics" list. We are currently visiting companies in the artificial intelligence and robotics fields and will select the final list from among them. If your company would like to be considered for the list, please contact: 2020@leiphone.com.
In 2013, the startup Leap released the Leap Motion for the PC, bringing consumer-grade gesture recognition to the market for the first time. So far, however, gesture recognition has shown no sign of taking off on the PC. It is VR, by comparison, that has driven the technology's development.
For this installment of our hardware innovation open course, we invited two guests from Fish Technology to answer questions about gesture recognition: Fang Wenxin, founder and CEO of Fish Technology and former co-founder of 360 intelligent cameras; and Arron, algorithm lead at Fish Technology, who holds a master's degree in algorithms from China University of Mining and Technology, is a computer vision expert, and formerly headed virtual reality algorithms.
What does gesture recognition mean for VR? What are the application scenarios?
VR and AR are recognized as the third generation of computing platforms, and each generation requires its own complete mode of interaction: the mouse for the PC, the touchscreen for the iPhone, and gestures for VR and AR.
Needless to say, gestures are the most natural way to interact: with VR glasses on, the natural thing to do is reach out with your hands.
Without a broadly applicable innovation in human-computer interaction, VR and AR are unlikely to become the next-generation computing platform. Only by moving beyond the controller and beyond games (a market of 10 billion) into people's work and daily life (a market of 100 billion) can they replace the computer and the phone as the most powerful portable information terminal, the one closest to everyone.
Take Nokia's failure and Apple's success as an example. The difference was that Apple's human-computer interaction was more natural and the experience better: one used a resistive touchscreen, the other a capacitive touchscreen, and the former was overturned. Of course, Apple also had a complete set of UI design, interaction design, and app and game support.
What overturned the PC was not a variant of the PC, and what overturns the iPhone will not be a variant of the iPhone. The next-generation computing platform must be VR, AR, and MR, which sit even closer to us.
For the next-generation computing platform, we think gesture recognition, supplemented by speech recognition and combined with an artificial intelligence voice assistant, is the best human-computer interaction scheme.
In theory, wherever VR/AR can be applied, gesture recognition can be applied too: video, gaming, social networking, architecture, design, experiments, military, education, tourism, holographic interactive control, and so on.
Gesture recognition, pose recognition, face recognition, object recognition: what is the difference?
In hardware (for example, the types of sensors used), these recognition schemes are generally the same.
From a technical perspective they also have several things in common: all require steps such as target extraction, feature detection and recognition, and reconstruction. Of course, to improve recognition quality, gesture recognition must integrate machine learning algorithms, so that the recognition features can be optimized both offline and online, improving the efficiency and accuracy of recognition.
Gesture recognition
The differences among gesture recognition, pose recognition, face recognition, and object recognition lie mainly in their application scenarios. Gesture recognition is currently used mostly for human-computer interaction. Face recognition can be used to reconstruct animated facial expressions in films, and is also widely used in security. Pose recognition is used mainly in motion-sensing games, for example with Kinect. Object recognition has the widest range of applications, for example capturing products and furniture models in real time for online shopping.
What are the approaches to gesture recognition? What does the technology look like?
There are currently four main types of gesture recognition scheme. The first is mechanical, for example Dexmo. The second is based on inertial sensors, for example Noitom's nine-axis AHRS motion-capture glove. The third is based on bend sensors. The last, and the most natural, is vision-based gesture recognition, such as Leap Motion, ThisVR, and Kinect.
Here we mainly discuss vision-based schemes.
By structure and data source, these fall into four categories: RGB cameras, binocular infrared cameras with IR fill light, infrared structured light (light coding), and ToF depth cameras.
The binocular infrared camera with IR fill light is a popular scheme. Its strengths are good image quality, a target that is easy to extract, and a clean background; from a pair of images it can achieve very good 3D reconstruction of the hand's edges. Take Leap Motion's 3D reconstruction principle as an example:
Principle of the binocular camera scheme
It uses an infrared fill light in a specific band, and the infrared cameras are fitted with a narrow-band bandpass filter for that band. The target is extracted first; then, using the calibration of the two cameras, corresponding feature points in the left and right views are matched.
Because the two cameras are calibrated, the disparity between the left and right views can be determined precisely, which is a great help for the subsequent matching and 3D reconstruction.
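The reconstruction step described above can be illustrated with the standard triangulation formula for a rectified stereo pair, Z = f * B / d. This is a minimal sketch under stated assumptions, not Leap Motion's actual algorithm: it assumes an ideal pinhole model and rectified images, and the function names, focal length, baseline, and principal point are hypothetical values chosen for illustration.

```python
def depth_from_disparity(x_left_px, x_right_px, focal_px, baseline_m):
    """Depth in metres of one feature matched in a rectified stereo pair.

    Assumes both cameras share the same focal length (in pixels) and that
    the images are rectified, so a match differs only in its x coordinate.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def triangulate(x_left_px, x_right_px, y_px, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in the left camera frame from one matched feature.

    (cx, cy) is the principal point of the left camera in pixels.
    """
    z = depth_from_disparity(x_left_px, x_right_px, focal_px, baseline_m)
    x = (x_left_px - cx) * z / focal_px
    y = (y_px - cy) * z / focal_px
    return x, y, z


if __name__ == "__main__":
    # Hypothetical numbers: 400 px focal length, 4 cm baseline, 640x480 image.
    print(triangulate(340.0, 300.0, 260.0, 400.0, 0.04, 320.0, 240.0))
```

With these illustrative numbers, a feature at x = 340 in the left image and x = 300 in the right has a disparity of 40 pixels, so its depth is 400 × 0.04 / 40 = 0.4 m. The formula also shows why accurate calibration matters: at small disparities, an error of a single pixel changes the depth estimate noticeably.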
In addition, binocular cameras now use technologically mature CMOS sensors, so both resolution and frame rate (easily up to 100 frames per second) can reach very high levels.
The shortcoming of the binocular camera is that 3D information is only available after algorithmic processing, although with today's high frame rates it can already achieve good tracking. Its infrared fill light, however, means the scheme cannot be used under strong light or under light in the same band. Because sunlight covers the full spectrum, the binocular camera scheme is basically unusable outdoors in the daytime.
Principle of ToF
Infrared structured light (light coding) faces the same problem. By contrast, the ToF depth camera makes up for exactly this shortcoming. You can think of it as a laser front end that computes the depth value directly from the phase difference between the emitted and received light signals. Such schemes resist ambient light well and can be used both indoors and outdoors.
Comparison of light coding and ToF
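The phase-difference calculation can be sketched as follows. This is a minimal illustration assuming a continuous-wave ToF sensor with a single modulation frequency, where depth = c * Δφ / (4π * f_mod). The function names and the 30 MHz modulation frequency are assumptions for the example, not parameters of any particular camera.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_depth(phase_shift_rad, modulation_hz):
    """Depth in metres from the phase shift between emitted and received light.

    The light travels to the target and back, so the round trip covers
    2 * depth; this gives depth = c * phase / (4 * pi * f_mod).
    """
    phase = phase_shift_rad % (2.0 * math.pi)  # phase is only known modulo 2*pi
    return SPEED_OF_LIGHT * phase / (4.0 * math.pi * modulation_hz)


def unambiguous_range(modulation_hz):
    """Largest depth measurable before the phase wraps around."""
    return SPEED_OF_LIGHT / (2.0 * modulation_hz)


if __name__ == "__main__":
    f_mod = 30e6  # hypothetical 30 MHz modulation frequency
    print(unambiguous_range(f_mod))
    print(tof_depth(math.pi, f_mod))
```

At the assumed 30 MHz, the phase wraps at roughly 5 m, and a phase shift of π corresponds to about 2.5 m, which is ample for hand tracking at arm's length. Because depth comes directly from the phase measurement, no stereo matching step is needed.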
In fact, whichever scheme is used, gesture recognition is broken down into sub-problems for analysis and algorithm implementation: distinguishing the left hand from the right, segmenting the wrist from the palm, recognizing the front, side, and back of the hand, and finally identifying individual fingers.
Does gesture recognition serve the same function as wearing gloves?
Gesture recognition and gloves are actually complementary. For playing games, gloves or handheld controllers are more suitable because they give strong feedback, although in 30 years controllers have only ever served the games industry.
In the future, games will not be the main scenario for gesture recognition. Coming back to VR/AR: if they are to become the next generation of computing platform and reach deeper into ordinary people's work and lives, they need a universal form of human-computer interaction, one not limited to games or video. Imagine what it would look like to do all of that with a controller or a glove... The point of gesture recognition is to free the hands: with no equipment on them at all, you get the most natural human-computer interaction.
In terms of market size, the gaming industry is only about $10 billion, whereas reaching into every corner of work and life (office, home, daily life, education, tourism, and so on) is a trillion-dollar market.
Therefore, we believe gesture recognition, supplemented by voice recognition, is the third generation of human-computer interaction.
How far is gesture recognition from widespread adoption?
Of course, gesture recognition technology is not yet mature.
Take the problems we have run into ourselves. At this stage the accumulated library of gesture models is still relatively small: we have tens of thousands of manually collected samples and millions of computer-modeled ones, but that is not enough. To become fully usable, the library needs to grow by at least 10 to 100 times, and at that point computation and bandwidth become issues.
Strictly speaking, the larger the model library, and the better the feature selection and feature dimensionality reduction that accompany it, the more complete the deep learning system and the more efficient its learning. The recognition matrix a vendor trains will then be better, and the vendor's recognition will be correspondingly more accurate and more robust, and better suited to people of different ages and builds.
So widespread adoption of gesture recognition depends on first solving the problems above.