OnQ Blog

How advances in computer vision are moving AI forward

In a previous article, we touched on the unexpected ways in which trained computer vision models perceive reality when they’re not fed the correct data. Neural networks (i.e., AI models loosely built to operate like a human brain) are only as good as the data they learn from. Here, in part two, we’ll discuss the more serious implications that computer vision has for society and how it can complement our day-to-day lives.

Improving computer vision’s accuracy

Deep neural networks keep getting better at image recognition, but they are by no means perfect. Researchers from Cornell University and the University of Wyoming tried to gain a deeper understanding of the difference between human and computer vision by attempting to trick image-classifying algorithms into seeing things that weren’t there.

The group used a trained neural network called AlexNet, which had previously achieved quite accurate results in image recognition. They asked a version of the software to create an image of a guitar by generating random pixels across an image. This version had no previous knowledge of guitars. They then asked a second version of the network, which had already been trained to recognize guitars, to rate the image made by the first network. This confidence rating was then used to refine the first network’s next attempt at creating a guitar image. After thousands of rounds between the two networks, the first network made an image that the second network recognized as a guitar with 99 percent confidence. This shows the great potential of neural networks to improve continually and produce results using less data than they previously needed. This will be especially important for fields where an organized data set may not be available.
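The feedback loop described above, in which a generator proposes an image and a scorer's confidence rating guides the next attempt, can be sketched as a simple hill climb. This is only an illustration under stated assumptions: the `confidence` function is a hypothetical stand-in for a trained classifier (it is not AlexNet and does not reproduce the researchers' method), the "image" is a list of eight pixel values, and `TEMPLATE` is invented data representing whatever the scorer has learned to respond to.

```python
import random

def confidence(image, template):
    """Hypothetical stand-in for a trained classifier.

    Returns a score in [0, 1]; higher means the image looks more like
    what the scorer expects. A real system would call a trained network.
    """
    error = sum((p - t) ** 2 for p, t in zip(image, template)) / len(image)
    return 1.0 - error

def refine(template, rounds=5000, step=0.1, seed=0):
    """Start from random pixels and keep only changes that raise the score."""
    rng = random.Random(seed)
    image = [rng.random() for _ in template]  # generator knows nothing yet
    best = confidence(image, template)
    for _ in range(rounds):
        candidate = list(image)
        i = rng.randrange(len(candidate))
        # Nudge one pixel and clamp it to the valid range [0, 1].
        candidate[i] = min(1.0, max(0.0, candidate[i] + rng.uniform(-step, step)))
        score = confidence(candidate, template)
        if score > best:  # the confidence rating steers the next attempt
            image, best = candidate, score
    return image, best

# Invented data: the pattern the stand-in scorer responds to.
TEMPLATE = [0.0, 1.0, 1.0, 0.0, 0.5, 0.5, 1.0, 0.0]

if __name__ == "__main__":
    image, best = refine(TEMPLATE)
    print(f"final confidence: {best:.4f}")
```

Note that the generator never sees the template directly. It only sees the score, which is the point of the loop: the scorer's feedback alone is enough to drive the image toward something the scorer accepts.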

Making security visual

With sufficient training data and a better overall understanding of exactly how devices and machines see the world, future applications of computer vision could be immense. What might be possible?

If we train computers to recognize our physical traits, this could transform the way we think about our security and privacy. Computer vision could facilitate a shift toward using iris and fingerprint scans to manage access to restricted areas and buildings, as well as to retrieve our medical or criminal records. Access to our records would no longer depend on information that we remember or a key that we carry. This would make it easier to access our records, minimize scenarios where patients are misidentified, and simplify access control for high-security areas. It would be a major step forward in making us safer. Think of how this could improve security at the airport or at large events like music festivals. The shift does bring potential data privacy issues. That’s why Qualcomm is focusing its efforts on on-device AI, which significantly reduces risk by not sending user data to the cloud.

Computer vision is already playing a role in keeping us safe online. Facebook, for example, updated its policies last year to explain how it’s using AI, specifically computer vision, to combat the spread of terrorist messages. And it’s not just Facebook. Other major players in social media, like YouTube and Twitter, are working together to use image matching technology as part of their strategy to reduce the spread of terrorist messages. Whenever a person attempts to upload content, AI systems analyze whether it matches a known terrorist photo or video. So, if a terrorist image has been removed previously, other accounts will be prevented from uploading the same image to Facebook. Thus, in many cases, terrorist content intended for upload simply never reaches the platform. In the future, computer vision will play an increasingly important role in our security, both in public and online.
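To make the idea of image matching concrete, here is a minimal sketch of one common approach: comparing compact fingerprints of images rather than the images themselves. The article does not say which technique the platforms use, so the average-hash scheme below is an assumption chosen for simplicity. The function names, the tiny 2x2 grayscale grids, and the `max_distance` threshold are all hypothetical, and the sketch assumes every image has already been resized to the same grid.

```python
def average_hash(pixels):
    """Fingerprint a grayscale image (2D list): 1 where a pixel is above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)

def hamming(a, b):
    """Count the positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))

def is_blocked(pixels, blocked_hashes, max_distance=1):
    """True if the upload's fingerprint is close to any previously removed image."""
    fingerprint = average_hash(pixels)
    return any(hamming(fingerprint, known) <= max_distance for known in blocked_hashes)

# Invented data: one previously removed image and two attempted uploads.
REMOVED = [[10, 200], [220, 30]]
BRIGHTENED_COPY = [[20, 210], [230, 40]]  # same picture, slightly lighter
UNRELATED = [[200, 10], [30, 220]]

BLOCKED_HASHES = {average_hash(REMOVED)}

if __name__ == "__main__":
    print(is_blocked(BRIGHTENED_COPY, BLOCKED_HASHES))
    print(is_blocked(UNRELATED, BLOCKED_HASHES))
```

Because the fingerprint depends on each pixel's brightness relative to the image's own mean, a uniformly brightened copy produces the same fingerprint as the original, while a different picture does not.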

Making sure that the visual data that we share with our devices is kept private is of utmost importance to Qualcomm. That’s why Qualcomm Research Netherlands developed a Computer Vision Module, which mitigates privacy concerns in smart toys for kids.

Optimizing infrastructure

Computer vision could also have a significant impact on the way our cities are planned and built, making them more efficient and safe. Construction workers could benefit from augmented reality blueprints and plans throughout the building process. This would make the building experience more visual, allowing construction workers to work with greater accuracy without having to consult multiple plans after each step. The building process would become faster and more efficient, lessening the potential for errors that could cause problems further down the line. This, in turn, would improve the quality of buildings, which would be safer and sturdier as a result.

The quality assessment of surfaces, such as roads and pavements, could also be enhanced with computer vision. Through a combination of human expertise and the superhuman seeing power of a machine, the quality inspection of these surfaces would become more accurate, as human error would be minimized. Computer vision would be able to detect complex defects in surfaces faster and more accurately and thus improve the quality of our roads.

In addition, computer vision can generate data that could change the way we manage traffic in major metropolitan areas. Traditionally, dangerous spots could only be identified after accidents had already occurred. However, because computer vision can also identify and classify near-miss situations, we can take a more proactive approach to avoiding collisions. The algorithm will be able to pinpoint the most dangerous intersections and produce other information that helps prevent crashes, such as the days and hours that carry the most risk. Authorities can then use this data to prioritize the most hazardous locations. This knowledge can also inform decision-makers as they plan roadways and other infrastructure, allowing them to avoid building particularly hazardous intersections. The way our cities are planned and built could be shaped by computer vision. In achieving this future, imagine how far we’ll have come from the Not hotdog app.
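Once a vision system has logged near-miss events, finding the riskiest intersections and time slots is a counting problem. The sketch below assumes a hypothetical event format of (intersection, weekday, hour) tuples. The sample events are invented for illustration, and detecting the near misses themselves is outside its scope.

```python
from collections import Counter

def rank_hotspots(events, top=3):
    """Rank intersections and (weekday, hour) slots by near-miss count."""
    events = list(events)  # allow any iterable, since we pass over it twice
    by_place = Counter(place for place, _, _ in events)
    by_time = Counter((day, hour) for _, day, hour in events)
    return by_place.most_common(top), by_time.most_common(top)

# Invented sample data: (intersection, weekday, hour of day).
SAMPLE_EVENTS = [
    ("5th & Main", "Fri", 17),
    ("5th & Main", "Fri", 17),
    ("5th & Main", "Mon", 8),
    ("Oak & 2nd", "Fri", 17),
    ("Oak & 2nd", "Tue", 9),
]

if __name__ == "__main__":
    places, times = rank_hotspots(SAMPLE_EVENTS)
    print("riskiest intersections:", places)
    print("riskiest time slots:", times)
```

A real deployment would likely weight events by severity and normalize by traffic volume, since a busy intersection will log more near misses simply because more vehicles pass through it.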

Toward complementing human expertise

As it advances, computer vision has the power to take away many hassles in our lives so that we can focus on the things that matter most.

But realizing this future requires a greater understanding of how devices and machines see the world. We need properly trained models and relevant data to be able to achieve greater levels of accuracy. So, while the funnier applications and hiccups of computer vision tend to make the news, this field of research has the potential to optimize many aspects of our lives. That’s why Qualcomm is committed to making on-device AI ubiquitous and is researching more efficient hardware for AI. If you’re a developer interested in building the next big computer vision application, check out our Snapdragon Neural Processing Engine SDK.


This article was written in collaboration with former Qualcomm intern Megan Smith.
