Dec 3, 2020
Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.
The Neural Information Processing Systems (NeurIPS) conference, which is the largest annual gathering of AI researchers and engineers, is a time to share new discoveries, collaborate, and push the AI industry forward. Although NeurIPS is virtual this year due to COVID-19, the AI community has continued to drive the industry forward with novel work that will ultimately transform industries and enhance lives. Since NeurIPS 2019, intelligent devices and services have become increasingly integrated in our daily lives to offer significant benefits like enhanced cameras, voice UI, and personalization, creating an ongoing demand for AI research and products. Whether you’re virtually attending this year’s NeurIPS conference or just curious about what Qualcomm AI Research has in store, read on to learn about our latest papers, demos, workshops, talks, and other AI highlights.
Our accepted papers
At academic conferences like NeurIPS, novel papers are a primary way to contribute innovative and impactful AI research to the rest of the community. I’d like to highlight three accepted papers that are advancing research in power efficiency and machine learning fundamentals.
- “Bayesian Bits: Unifying Quantization and Pruning” presents a practical method for joint mixed-precision quantization and pruning through gradient-based optimization. Advancing quantization and pruning without sacrificing model accuracy is essential for reducing compute demand and increasing power efficiency. Our results show that we can learn pruned, mixed-precision networks that provide a better trade-off between accuracy and efficiency than their static-bit-width equivalents.
- “Structured Convolutions for Efficient Neural Network Design” also tackles model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks (CNNs). Structured convolutions allow the convolution operation to be decomposed into a sum-pooling operation followed by a convolution with significantly lower complexity and fewer weights. By applying our method to a wide range of CNN architectures, we demonstrate significant improvements, such as 'structured' versions of ResNets that are up to 2x smaller.
- “Natural Graph Networks” explores fundamental research on neural network algorithms for graph-structured data. Instead of applying conventional graph convolutional networks or message passing networks, where the aggregation of updates is permutation invariant, this work studies a concept of naturality, which removes this limitation. It shows that naturality is sufficient for a graph network to be well-defined, opening up a larger class of graph networks. The paper also gives a practical instantiation of a natural graph network that uses an equivariant message network parameterization, yielding good performance on several benchmarks.
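The key idea behind Bayesian Bits is that quantization can be decomposed into a coarse quantizer plus gated residual-error quantizers at successively doubled bit widths; pruning then corresponds to gating off even the coarsest term. A minimal numeric sketch of the residual decomposition, ignoring clipping and the learned stochastic gates, with step sizes chosen purely for illustration (the paper's scale relations differ):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 1000)

s4 = 2 / (2**4 - 1)   # step of a 4-bit grid over [-1, 1]
s2 = 5 * s4           # step of a coarse 2-bit grid, an exact multiple of s4

q2 = s2 * np.round(x / s2)          # coarse 2-bit quantization
r4 = s4 * np.round((x - q2) / s4)   # residual error quantized on the finer grid
q4 = q2 + r4                        # coarse value + (gateable) refinement

# adding the residual term recovers direct 4-bit quantization exactly
assert np.allclose(q4, s4 * np.round(x / s4))
```

Learning a gate on each residual term (and on the coarse term itself) is what turns this decomposition into joint mixed-precision quantization and pruning.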
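To see why the decomposition behind structured convolutions works, consider a kernel that is itself the convolution of a small kernel with an all-ones mask. By associativity of convolution, applying that composite kernel is the same as sum-pooling first and then convolving with the small kernel, which needs fewer weights and fewer multiplies. A 1D NumPy sketch of the identity (the sizes here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)     # input signal
w = rng.standard_normal(3)      # small 3-tap kernel
m = 3                           # sum-pooling window

# composite "structured" 5-tap kernel: small kernel convolved with ones
W = np.convolve(w, np.ones(m))

direct = np.convolve(x, W, mode="valid")

pooled = np.convolve(x, np.ones(m), mode="valid")   # sum-pooling (no weights)
decomposed = np.convolve(pooled, w, mode="valid")   # cheap 3-tap convolution

assert np.allclose(direct, decomposed)
```

Since sum-pooling involves no multiplications, the expensive part of the composite operation shrinks from 5 taps to 3 in this toy case; in 2D the savings compound across both kernel dimensions.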
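For context on the third paper: in conventional message passing, neighbor messages are combined with a permutation-invariant aggregation such as a sum, which makes the layer equivariant to node relabelings but also restricts what it can express; this is the limitation that naturality lifts. A minimal sketch of that standard setup (the graph, features, and activation are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
h = rng.standard_normal((4, 3))   # node features, 4 nodes x 3 dims
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # adjacency matrix
W = rng.standard_normal((3, 3))   # shared weight matrix

def mp_layer(A, h, W):
    # conventional message passing: sum neighbors' messages,
    # a permutation-invariant aggregation
    return np.tanh(A @ h @ W)

# relabeling the nodes just relabels the output (equivariance)
P = np.eye(4)[[2, 0, 3, 1]]
assert np.allclose(mp_layer(P @ A @ P.T, P @ h, W), P @ mp_layer(A, h, W))
```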
Our research is not just theory: we bring it to life through working demonstrations. Below, I'd like to share our demos and their corresponding videos, showcasing advances in power efficiency and compelling applications of AI.
On-device group-equivariant CNN demo
Traditional CNNs are not robust to certain symmetry transformations, such as rotation. Group-equivariant CNNs (G-CNNs) generalize well to a variety of transformations, including arbitrary rotations, and are both compute and data efficient, but until now they have been too complex to implement on a phone. For the first time, we are demonstrating G-CNNs running in real time on a phone. Our demo showcases how a G-CNN outperforms a traditional CNN on a segmentation task that classifies each pixel of lymph node tissue scans as benign or malignant. Since tissue scans have no inherent orientation, we slowly rotate the scan images in 10-degree increments while running the segmentation task. The G-CNN provides a more accurate, stable, and robust classification.
NeurIPS 2020 Demo: On-device group-equivariant CNNs
Dec 3, 2020 | 4:11
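The core mechanism of a G-CNN can be sketched in a few lines: correlate the input with every 90-degree rotation of a filter, then pool over the orientation axis, so that the resulting feature map rotates along with the input rather than changing unpredictably. A toy NumPy illustration for the four-fold rotation group (handling arbitrary rotations, as in the demo, requires a richer group, and real G-CNNs stack many such layers):

```python
import numpy as np

def corr2d(x, f):
    """Plain 'valid'-mode 2D cross-correlation."""
    kh, kw = f.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * f)
    return out

def p4_features(x, f):
    # lift to the rotation group: one response per filter orientation,
    # then max-pool over the orientation axis
    stack = np.stack([corr2d(x, np.rot90(f, r)) for r in range(4)])
    return stack.max(axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
f = rng.standard_normal((3, 3))

# rotating the input only rotates the feature map (equivariance)
assert np.allclose(p4_features(np.rot90(x), f), np.rot90(p4_features(x, f)))
```

An ordinary CNN has no such guarantee: `corr2d(np.rot90(x), f)` generally bears no simple relation to `corr2d(x, f)`, which is why its segmentation output drifts as the tissue scan rotates.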
Neural network quantization with AdaRound demo
Qualcomm AI Research has been developing state-of-the-art quantization techniques that enable power-efficient fixed-point inference while preserving model accuracy. Last year we showed a demo of Data-Free Quantization, and this year we go a step further with AdaRound to achieve improved accuracy and make 4-bit quantization practical. AdaRound, short for Adaptive Rounding, is a post-training quantization technique: it requires only a small amount of unlabeled data and no model fine-tuning. Rather than simply rounding each weight to the nearest quantized value, AdaRound learns the per-weight rounding choice that best preserves model accuracy. Our demo shows two side-by-side examples of AdaRound versus a baseline quantization:
- 8-bit weight and 8-bit activation quantization on an object detection model
- 4-bit weight and 8-bit activation quantization on a semantic segmentation model
In both examples, the AdaRound-quantized model is much more accurate than the baseline quantized model. It is also much more power efficient than the 32-bit floating-point model: 4-bit weight quantization, for example, provides a greater-than-8x increase in performance per watt. Stay tuned for AdaRound to be added to the AI Model Efficiency Toolkit (AIMET), the Qualcomm Innovation Center's (QuIC) open-source GitHub project for state-of-the-art model efficiency (see more about AIMET later in this blog post).
NeurIPS 2020 Demo: Neural network quantization with AdaRound
Dec 12, 2020 | 4:04
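The intuition for why nearest rounding is suboptimal: the best rounding direction for each weight depends on how the errors interact through the layer's output, not on each weight in isolation. In this toy sketch, an exhaustive search over up/down rounding on a tiny made-up layer stands in for AdaRound's learned relaxation (the sizes, step, and data are all illustrative):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(4)            # weights of a tiny linear layer
X = rng.standard_normal((256, 4))     # unlabeled calibration inputs
s = 0.5                               # quantization step size

def output_mse(wq):
    # error measured on the layer *output*, not on the weights themselves
    return np.mean((X @ w - X @ wq) ** 2)

nearest = s * np.round(w / s)

# try every per-weight choice of rounding down vs. up
floor = np.floor(w / s)
best = min((s * (floor + np.array(up))
            for up in itertools.product([0, 1], repeat=w.size)),
           key=output_mse)

# the adaptive choice is never worse, and usually strictly better
assert output_mse(best) <= output_mse(nearest)
```

Exhaustive search is hopeless at real layer sizes, which is why AdaRound instead optimizes a continuous relaxation of the per-weight up/down decision.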
Efficient semantic segmentation of high-resolution video demo
Video now accounts for a large share of the data generated every day. Analyzing video with AI can provide valuable insights and capabilities for many applications, such as autonomous driving, smart cameras, and extended reality. However, as video resolution and frame rate increase and AI models become more complex, running these workloads in real time becomes ever more challenging. In this side-by-side demo, we showcase our efficient video semantic segmentation approach, which processes high-resolution video streams on mobile devices in real time. Compared to the original model, our approach reduces the compute complexity from 78 giga multiply-accumulate operations (GMACs) down to 17 GMACs for 2048 x 1024 RGB video. In addition, the inference time per frame drops from 74 milliseconds to 26 milliseconds, which is crucial for latency-sensitive applications.
NeurIPS 2020 Demo: Efficient semantic segmentation of high-resolution video
Dec 3, 2020 | 5:11
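To put these GMAC figures in perspective, the multiply-accumulate count of a single convolution layer scales with output resolution, channel counts, and kernel area. A quick back-of-the-envelope helper (the example layer is hypothetical, not a layer from our model):

```python
def conv_gmacs(h_out, w_out, c_in, c_out, k):
    """MAC count of one k x k convolution layer, in GMACs."""
    return h_out * w_out * c_in * c_out * k * k / 1e9

# a single hypothetical 3x3 conv with 64 channels in and out,
# applied at full 2048 x 1024 resolution
print(f"{conv_gmacs(2048, 1024, 64, 64, 3):.1f} GMACs")  # prints "77.3 GMACs"
```

One such layer at full resolution already costs tens of GMACs, which is why efficient video models avoid running every layer of a deep network on every full-resolution frame.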
With our recently announced Qualcomm Snapdragon 888 Mobile Platform, our 6th-generation Qualcomm AI Engine offers more capabilities and higher processing performance than before, delivering up to 26 trillion operations per second (TOPS). We are demoing customer applications, such as on-device AI video upscaling and audio noise suppression.
Workshops, socials, talks, and poster presentations
If you are attending NeurIPS, be sure to check our sponsor webpage for our up-to-date agenda and activities. We hope that you can join us at the following activities:
- Differential Geometry meets Deep Learning Workshop on Dec 11th, 5:45 a.m. - 2 p.m. PT. See our “Gauge Theory in Geometric Deep Learning” talk.
- AIMET Overview Talk at the sponsor webpage on Dec 6th, 10:30 - 11 a.m. PT (Expo Day).
- “Bayesian Bits: Unifying Quantization and Pruning” poster session on Dec 9th, 9 - 11 a.m. PT. Interview with the authors at the sponsor webpage on Dec 10th, 10 - 10:30 a.m. PT.
- “Structured Convolutions for Efficient Neural Network Design” poster session on Dec 9th, 9 - 11 p.m. PT. Interview with the author at the sponsor webpage on Dec 10th, 8:40 - 9:10 a.m. PT.
- “Natural Graph Networks” poster session on Dec 10th, 9 - 11 p.m. PT. Interview with the author at the sponsor webpage on Dec 9th, 11 - 11:30 a.m. PT.
- “Starting/transitioning your career in ML during a pandemic” social event on Dec 10th, 12 - 1 p.m. PT, discussing the challenges and opportunities of starting a machine learning career virtually.
AIMET progress and use in the real world
In the short time since being open-sourced, AIMET is already helping the wider AI ecosystem and having real-world impact across industry verticals. For smartphones, AIMET is enabling multiple OEMs and ISVs to speed up camera applications for augmented reality use cases like style transfer and image beautification filters. For automotive, AIMET model quantization is being used on object detection models for advanced driver-assistance systems (ADAS) use cases similar to our demo video. The ability to increase performance per watt without sacrificing much accuracy is crucial for automotive, where many high-resolution cameras run simultaneously. In the data center, customers can use AIMET's model compression techniques to boost inferences per second for popular models like ResNet-50, which serve as backbones for vision tasks. In the future, we look forward to enabling other use cases, such as audio, speech, and video, by bringing cutting-edge AI model efficiency research to AIMET.
We hope to (virtually) meet you at NeurIPS or future AI conferences to share our impact on AI. At Qualcomm Technologies, we make breakthroughs in fundamental research and scale them across devices and industries. Qualcomm AI Research works hand-in-hand with the rest of the company to integrate the latest AI developments and technology into our products — shortening the time between research in the lab and delivering advances in AI that enrich lives.
If you’re excited about solving big problems with cutting-edge AI research — and improving the lives of billions of people — we’d like to hear from you. We’re recruiting for several machine learning openings, such as Deep Learning Research Engineer in San Diego or Amsterdam. Join us to help create what’s next in AI.