Video Analytics is based on four basic concepts:
- Physical awareness refers to what the objects in a scene are, how the scene can be segmented and how each object moves over time.
- Social awareness refers to how people behave, their facial expressions, where they're looking, their articulated pose and how they interact with each other.
- Inference takes into account our physical and social awareness and refers to the things we deduce without direct observation. For example, based on the fact that people are purposefully avoiding a given location, we might infer the presence of a pothole.
- Agency takes into account our physical awareness, social awareness and inferences and refers to certain actions we take in order to achieve a desired outcome. Such actions may be in the form of activating certain devices, issuing a command to an automaton or having an avatar perform a certain action.
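The pothole example above can be made concrete with a small sketch. It is not the GE Research method. It is a minimal illustration of inferring something unobserved from how people move. It assumes tracked pedestrian paths have already been quantized into grid cells. The function name `infer_avoided_cells` and the parameter `min_neighbor_visits` are hypothetical.

```python
from collections import Counter


def infer_avoided_cells(trajectories, min_neighbor_visits=3):
    """Return grid cells that nobody enters although all four neighbours are busy.

    trajectories: list of paths, each a list of (x, y) integer grid cells.
    This is an illustrative heuristic, not a published algorithm.
    """
    # Count how many distinct paths touch each cell.
    visits = Counter(cell for path in trajectories for cell in set(path))

    # Only cells adjacent to some visited cell can qualify.
    offsets = ((1, 0), (-1, 0), (0, 1), (0, -1))
    candidates = {(x + dx, y + dy) for (x, y) in visits for dx, dy in offsets}

    avoided = set()
    for (x, y) in candidates:
        if visits[(x, y)] > 0:
            continue  # someone walked here, so it is not avoided
        neighbours = [(x + dx, y + dy) for dx, dy in offsets]
        if all(visits[n] >= min_neighbor_visits for n in neighbours):
            avoided.add((x, y))
    return avoided
```

A cell flagged this way is only a hypothesis that something such as a pothole is there. A real system would weigh it against other evidence before acting on it.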
The team at GE Research is actively working on video analytics for hospital room monitoring, activity analysis for streets and the inference of social behavior, such as rapport and trust, for groups of individuals. The end solution is based on a number of C++ video analytics modules that can be composed into a wide variety of applications. The project draws on skills from across GE Research, including computer vision, machine learning, statistics, software design and optical system engineering.
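The idea of composing analytics modules into applications can be sketched as follows. The actual modules are written in C++ and their interfaces are not described here, so this Python sketch is purely illustrative. Every name in it (`compose`, `detect_people`, `count_occupancy`, `raise_alert`, the `state` dictionary and its keys) is an assumption made for the example.

```python
from functools import reduce
from typing import Any, Callable, Dict

# A module is modelled as a function that takes a state and returns a new state.
Module = Callable[[Dict[str, Any]], Dict[str, Any]]


def compose(*modules: Module) -> Module:
    """Chain modules left to right into a single pipeline."""
    def pipeline(state: Dict[str, Any]) -> Dict[str, Any]:
        return reduce(lambda s, m: m(s), modules, dict(state))
    return pipeline


def detect_people(state):
    # Stand-in for a detector: filters detections that are already labelled.
    people = [d for d in state["detections"] if d["label"] == "person"]
    return {**state, "people": people}


def count_occupancy(state):
    return {**state, "occupancy": len(state["people"])}


def raise_alert(state):
    # Hypothetical rule: alert when occupancy exceeds a configured limit.
    limit = state.get("max_occupancy", 2)
    return {**state, "alert": state["occupancy"] > limit}


# The same modules can be recombined into different applications.
room_monitor = compose(detect_people, count_occupancy, raise_alert)
people_counter = compose(detect_people, count_occupancy)
```

Because each module only reads and extends a shared state, an application such as a room monitor becomes a choice of which modules to chain and in what order.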
The team at GE Research has had a number of breakthroughs in video analytics. We are the first group to demonstrate the ability to infer group social states such as trust and rapport, using the DARPA-funded GE Sherlock system. We developed video analytics for person and vehicle detection that allowed for city-wide deployment at a level never seen before. We are also the first team to combine RGB, range, multispectral imaging, thermal and chemical sensing into a handheld probe for estimating healing trends for pressure ulcers, in work sponsored by the US Department of Veterans Affairs.
Other results of this technology include the ReFace system, which reconstructs faces from skulls found at crime scenes; the first demonstration of iris recognition over a cubic meter with a single camera; the first demonstration of full 3D reconstruction using a single Helmholtz stereo pair; and the first demonstration of reliable real-time recognition of more than 500 surgical instruments.
Our team's success is predicated on the breadth of our technical vision, our mathematical rigor and our ability to construct real computer vision algorithms for the real world.
Capabilities utilized for the Video Analytics project
Developing intelligent robotic systems by integrating robot perception, learning, and motion planning
Enhancing fundamental and applied research to mimic human visualization and interpretation
- Chen, J., Chang, M.C. and Tu, P., 2015, May. A live video analytic system for affect analysis in public space. In 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) (Vol. 1, pp. 1-1). IEEE.
- Chen, J., Liu, X., Tu, P. and Aragones, A., 2013. Learning person-specific models for facial expression and action unit recognition. Pattern Recognition Letters, 34(15), pp.1964-1970.
- Chen, J., Liu, X., Tu, P. and Aragones, A., 2012, September. Person-specific expression recognition with transfer learning. In 2012 19th IEEE International Conference on Image Processing (pp. 2621-2624). IEEE.
- Chen, J., Chang, M.C., Tian, T.P., Yu, T. and Tu, P., 2015, September. Bridging computer vision and social science: a multi-camera vision system for social interaction training analysis. In 2015 IEEE International Conference on Image Processing (ICIP) (pp. 823-826). IEEE.
- Yang, Y., Chang, M.C., Tu, P. and Lyu, S., 2015, September. Seeing as it happens: Real time 3D video event visualization. In 2015 IEEE International Conference on Image Processing (ICIP) (pp. 2875-2879). IEEE.
- Tu, P., Chang, M.C. and Gao, T., 2016, December. Crowd analytics via one shot learning and agent based inference. In 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP) (pp. 1181-1185). IEEE.