Close your eyes. The world is a very different place. The old cliché that a picture is worth a thousand words is true for humans, if not an understatement. Vision provides so much context. What if a picture were worth a thousand words to your smartphone and other devices? Right now, the camera on your smartphone captures a high-quality image, but what if it could understand the image and not just see it as millions of individual pixels? What if it could recognize objects, figure out context, draw conclusions, and then use that information to take an action (like your camera automatically changing settings to take the perfect picture for a given scene)? That’s where we are heading as Qualcomm brings cognitive technologies to life. And visual perception is a key ingredient.
Visual perception will unlock many new possibilities for our devices. Your smartphone will make travel easier through augmented reality by describing landmarks around you and translating street signs written in a foreign language. Drones will be able to autonomously explore and inspect unsafe areas so that humans don’t have to. Robots will be able to assist us in our daily tasks, whether that is cleaning the house or organizing the garage. And autonomous cars will be able to drive themselves, identifying hazards, signs, traffic signals, and pedestrians, making our commutes easier and more productive.
So what’s the best approach for visual perception? Deep learning-based approaches are repeatedly demonstrating state-of-the-art results in visual perception tasks. Deep learning networks have primarily been run in the cloud, where server farms with almost unlimited compute resources have the luxury of being plugged into the wall. However, cloud-based visual perception simply cannot meet the requirements of some of the mobile device applications I described above. Would you really want your car relying on the cloud when it needs to make a split-second decision to avoid an accident? On-device deep learning is required for low latency and reliability. The big question is how to run compute-intensive deep learning networks within the power, thermal, memory bandwidth, and compute constraints of the mobile environment.
To find out the answer, be sure to check out the upcoming presentation by Jeff Gehlhaar (Vice President, Qualcomm Research, a division of Qualcomm Technologies, Inc.), "Deep-learning-based Visual Perception in Mobile and Embedded Devices: Opportunities and Challenges," at the Embedded Vision Summit on May 12th. He’ll discuss innovative approaches Qualcomm Research has taken to make it possible to efficiently run deep neural networks on Qualcomm Snapdragon processors. We’ll also be showing live demos of on-device deep learning in the exhibit area.
Also, watch for future blog posts and sign up for our newsletter to receive the latest information about mobile computing.