NeurIPS 2022: Qualcomm showcases cutting-edge advancements in machine learning
Collaboration and sharing are key reasons for the astounding pace and multiple vectors of innovation happening in AI. At the 2022 Conference on Neural Information Processing Systems (NeurIPS), AI researchers and engineers from around the world are congregating in New Orleans to share new discoveries, collaborate, and push the industry forward. It’s great to be back in person, and we’re proud to be sponsoring this year’s event. Be sure to see us at booth 408.
Since NeurIPS 2021, we’ve seen amazing work and progress in transformers, large language models, and self-supervised learning. In addition, devices and services powered by edge AI continue to grow and become increasingly integrated into our daily lives. AI experiences that once seemed impossible to run on edge devices are now possible, thanks to full-stack AI optimizations and research. If you’re curious about what Qualcomm AI Research has in store at NeurIPS, read on to learn about our latest papers, demos, workshops, talks, and other AI highlights.
Our accepted NeurIPS papers
At academic conferences like NeurIPS, novel papers are a primary way to contribute innovative and impactful AI research to the rest of the machine learning community. I’d like to highlight several accepted papers that span research topics from platform innovations using quantization and combinatorial optimization, to more fundamental research in causality and equivariance.
First, 8-bit floating-point formats have recently received a lot of attention, as lower-bit representations dramatically reduce the computation required for machine learning and lead to better latency and power efficiency. Our paper, “FP8 Quantization: The Power of the Exponent,” compares the performance of floating-point and integer formats. We find that the optimal number of floating-point exponent bits can be quite variable, while overall integer formats can be as effective as floating point when using quantization-aware training. Therefore, low-bit integers remain the go-to format when quantizing neural networks for efficient inference at the edge.
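To make the comparison concrete, here is a minimal sketch that simulates rounding values to an integer grid and to a simplified 8-bit floating-point grid, then measures the mean squared error of each. It is an illustration only, not the method or the setup from the paper: the function names, the Gaussian test data, the choice to scale each format so its largest value matches the largest input, and the simplification that no codes are reserved for infinity or NaN are all assumptions made for this example.

```python
import math
import random

def quantize_int(x, scale, bits=8):
    """Symmetric uniform (integer) quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax, min(qmax, round(x / scale)))
    return q * scale

def fp_max(exp_bits, man_bits):
    """Largest magnitude of the simplified format (assumes no inf/NaN codes)."""
    bias = 2 ** (exp_bits - 1) - 1
    e_max = (2 ** exp_bits - 1) - bias
    return (2.0 - 2.0 ** -man_bits) * 2.0 ** e_max

def quantize_fp(x, scale, exp_bits, man_bits):
    """Round x / scale to the nearest value of a simplified float grid."""
    bias = 2 ** (exp_bits - 1) - 1
    e_min = 1 - bias                      # smallest normal exponent
    top = fp_max(exp_bits, man_bits)
    a = min(abs(x) / scale, top)          # clamp to the representable range
    if a == 0.0:
        return 0.0
    _, e = math.frexp(a)                  # a = m * 2**e with 0.5 <= m < 1
    exponent = max(e - 1, e_min)          # subnormals share the smallest exponent
    step = 2.0 ** (exponent - man_bits)   # grid spacing inside this binade
    q = min(round(a / step) * step, top)
    return math.copysign(q, x) * scale

def mse(values, quantizer):
    """Mean squared error between values and their quantized versions."""
    return sum((v - quantizer(v)) ** 2 for v in values) / len(values)

if __name__ == "__main__":
    random.seed(0)
    weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # toy data
    peak = max(abs(w) for w in weights)
    int_scale = peak / 127
    print("INT8    ", mse(weights, lambda w: quantize_int(w, int_scale)))
    # Sign bit + exponent bits + mantissa bits = 8 in every case.
    for e, m in [(2, 5), (3, 4), (4, 3), (5, 2)]:
        fp_scale = peak / fp_max(e, m)
        print(f"FP8 E{e}M{m}", mse(weights, lambda w: quantize_fp(w, fp_scale, e, m)))
```

Running the script prints one error figure per format. Swapping the Gaussian samples for a distribution with heavier tails is a quick way to see how the preferred number of exponent bits can shift with the data.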
Second, we explore combinatorial optimization problems, which stem from the memory minimization challenges faced by compilers and placement challenges faced in chip design. Our paper, “Neural Topological Ordering for Computation Graphs,” introduces a novel attention-based graph neural network architecture called “Topoformer,” which can find an optimal topological order on a directed acyclic graph. This model outperforms, or is on par with, several well-known topological ordering baselines, while being significantly faster when demonstrated on a set of synthetically generated graphs of up to 2k nodes. We also train and test our model on a set of real-world computation graphs, showing performance improvements.
In another paper, “Batch Bayesian Optimization on Permutations using Acquisition Weighted Kernels,” we propose a batch Bayesian optimization method for combinatorial problems on permutations, which is well suited for expensive cost functions. We evaluate the method on several standard combinatorial problems involving permutations, such as quadratic assignment, flowshop scheduling, and the traveling salesman problem, as well as on a structure learning task.
Third, in moving toward more fundamental research, our team asks whether it is possible to identify high-level concepts and the causal relations between them just from images, without explicit labels. The paper, “Weakly Supervised Causal Representation Learning,” shows that this is indeed possible when we have images before and after unknown interventions (and no further labels). We introduce a new type of variational autoencoder (VAE) that represents causal structure implicitly, and we demonstrate this causal representation learning in a case of simple image data for simulated robotic manipulation.
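To illustrate why the choice of topological order matters for memory, the sketch below computes the peak memory of a small computation graph under two valid execution orders. It does not implement Topoformer. The ordering routine is the classic Kahn algorithm, used here only as a simple baseline, and the toy graph, the tensor sizes, and the memory model (an output stays live until all of its consumers have run, and final outputs stay live) are assumptions made for this example.

```python
from collections import deque

def kahn_order(succ):
    """Return one valid topological order of a DAG given as node -> successors."""
    indegree = {node: 0 for node in succ}
    for node in succ:
        for nxt in succ[node]:
            indegree[nxt] += 1
    ready = deque(node for node in succ if indegree[node] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in succ[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(succ):
        raise ValueError("graph contains a cycle")
    return order

def peak_memory(order, succ, size):
    """Peak sum of live output sizes when nodes execute in the given order."""
    pred = {node: [] for node in succ}
    for node in succ:
        for nxt in succ[node]:
            pred[nxt].append(node)
    pending = {node: len(succ[node]) for node in succ}  # consumers not yet run
    live = peak = 0
    for node in order:
        live += size[node]            # allocate this node's output
        peak = max(peak, live)
        for p in pred[node]:          # free inputs that are no longer needed
            pending[p] -= 1
            if pending[p] == 0:
                live -= size[p]
    return peak

# Hypothetical graph: one source feeding two branches that merge at the end.
succ = {"s": ["a1", "b1"], "a1": ["a2"], "b1": ["b2"],
        "a2": ["out"], "b2": ["out"], "out": []}
size = {"s": 1, "a1": 10, "b1": 10, "a2": 1, "b2": 1, "out": 1}

breadth_first = kahn_order(succ)
depth_first = ["s", "a1", "a2", "b1", "b2", "out"]
print(breadth_first, peak_memory(breadth_first, succ, size))
print(depth_first, peak_memory(depth_first, succ, size))
```

Both orders are valid, yet finishing one branch before starting the other keeps only one large intermediate tensor alive at a time. Searching over such orders on large graphs is the combinatorial problem that a learned model aims to solve quickly.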
Additionally, we continue to study how symmetries can be exploited by models for more efficient deep learning of tasks. For instance, equivariant networks capture the inductive bias about the geometry of a learning task by building the relevant symmetries into the model. Our paper, “A PAC-Bayesian Generalization Bound for Equivariant Networks,” studies how equivariance relates to generalization error.
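As a small illustration of what building a symmetry into a model means, the sketch below checks shift equivariance: shifting the input and then applying the layer gives the same result as applying the layer and then shifting the output. The circular convolution, the counterexample layer, and all function names are assumptions chosen for this example and are not taken from the paper.

```python
def shift(signal, s):
    """Cyclically shift a sequence s positions to the right."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]

def circular_conv(signal, kernel):
    """1D convolution with wrap-around boundaries."""
    n = len(signal)
    return [sum(w * signal[(i - k) % n] for k, w in enumerate(kernel))
            for i in range(n)]

def position_weighted(signal):
    """A layer that depends on absolute position, so it breaks the symmetry."""
    return [i * v for i, v in enumerate(signal)]

def is_shift_equivariant(layer, signal):
    """Check layer(shift(x)) == shift(layer(x)) for every cyclic shift."""
    return all(layer(shift(signal, s)) == shift(layer(signal), s)
               for s in range(len(signal)))

x = [1, 2, 3, 4, 5]
print(is_shift_equivariant(lambda v: circular_conv(v, [1, -2, 1]), x))
print(is_shift_equivariant(position_weighted, x))
```

The convolution passes the check for any kernel because it applies the same weights at every position, while the position-weighted layer fails it.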
Cryo-Electron Microscopy (Cryo-EM) is an important imaging method that allows high-resolution reconstruction of the 3D structures of biomolecules from 2D images. Our paper, “On the symmetries of the synchronization problem in Cryo-EM: Multi-Frequency Vector Diffusion Maps on the Projective Plane,” shows how to tackle this task under a group synchronization framework, whereby we propose to first estimate the relative poses of the images and show that this information is sufficient to determine the absolute poses. We validate the recovery capabilities and robustness of our method on randomly generated synchronization graphs and a synthetic Cryo-EM dataset.
Finally, we have the following papers accepted to NeurIPS workshops: “Deconfounded Imitation Learning,” which will be presented at the Deep Reinforcement Learning workshop; “Neural DAG Scheduling via One-Shot Priority Sampling,” which will be presented at the Optimization for ML workshop; “Robust scheduling with GFlowNets,” which will be presented at the MLSys workshop; and “Decentralized Learning with Random Walks & Communication-Efficient Adaptive Optimization,” which will be presented at the Federated Learning workshop.
Our NeurIPS demos
This year, we continue to bring our AI research to life through demos, often running on mobile devices thanks to full-stack optimization, to show its applicability in the real world. At our NeurIPS booth, we are showcasing four new demos, three of which were accepted as NeurIPS Expo demos (out of eight total acceptances).
Conditional compute for on-device video understanding (Expo-accepted demo)
Recognizing actions in videos is useful in many industries such as media, security, and health. Traditional deep learning models designed for action recognition process video sequences frame by frame, layer by layer. This leads to accurate results but comes with heavy compute, high latency, and poor power efficiency. Our FrameExit model automatically learns to process fewer frames for simpler videos and more frames for complex ones, saving power and improving performance. Beyond our model architecture innovation, our full-stack optimizations include state-of-the-art quantization techniques with the AI Model Efficiency Toolkit (AIMET) and a novel compiler stack. We demonstrate this in real time on a mobile device that visitors can hold in their hand, with up to a 5x reduction in compute and latency (on average) compared to other methods on commonly used action recognition benchmarks.
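The general idea of conditional compute can be sketched as an early-exit loop over frames. This is not the FrameExit implementation: in FrameExit the decision to stop is learned, whereas the sketch below substitutes a fixed confidence threshold on averaged per-frame scores. The function names, the threshold value, and the toy logits are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_early_exit(frame_logits, threshold=0.9):
    """Average per-frame logits and stop once the top class is confident.

    Returns (predicted_class, frames_used). The fixed threshold stands in
    for a learned exit decision.
    """
    if not frame_logits:
        raise ValueError("need at least one frame")
    running = [0.0] * len(frame_logits[0])
    for used, logits in enumerate(frame_logits, start=1):
        running = [r + v for r, v in zip(running, logits)]
        probs = softmax([r / used for r in running])
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            break                      # confident enough: skip remaining frames
    return best, used

easy_video = [[5.0, 0.0, 0.0]] * 4     # first frame is already decisive
hard_video = [[1.0, 0.5, 0.0]] * 4     # never confident, all frames are used
print(classify_with_early_exit(easy_video))
print(classify_with_early_exit(hard_video))
```

The easy clip exits after a single frame while the ambiguous one consumes every frame, which is where the average savings in compute and latency come from.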

Efficient real-time INT4 4K super-resolution on mobile (Expo-accepted demo)
Super-resolution clarifies, sharpens, and upscales an image to higher resolution for applications like gaming and video playback on high-resolution screens. Although AI-based super-resolution achieves impressive results in terms of visual quality compared to traditional approaches, enabling it in real time on mobile devices has been challenging. To address these challenges, we optimized across the full AI stack, including the algorithm with our Q-SRNet model, the software with AIMET 4-bit quantization, and the Snapdragon 8 Gen 2 hardware with INT4 acceleration. We achieved the world’s first on-device demo of real-time super-resolution using a 4-bit integer model. Moving to 4-bit integers dramatically improves not only latency but also power consumption. Compared to INT8, INT4 improves performance and power efficiency by 1.5 to 2 times.

Real-time on-device 3D reconstruction (Expo-accepted demo)
3D reconstruction is a popular computer vision task that captures the 3D shape and appearance of real objects and scenes from 2D images. In AR, for example, 3D reconstruction enables interaction with the environment, virtual object insertion, and collision warning. However, enabling real-time and accurate 3D reconstruction on consumer-friendly AR and VR headsets is challenging. Beyond developing an accurate and efficient self-supervised monocular depth estimation neural network, we once again applied full-stack optimization, through tight software integration across the entire 3D reconstruction pipeline and through 8-bit integer quantization to utilize Snapdragon hardware acceleration. In terms of performance, the relative error of the depth map in the office environment is low, at 10 to 12 percent, and the depth estimation latency is less than 9 milliseconds. The end-to-end on-device reconstruction runs at an interactive rate.

Teach your AI demo through multi-modal few-shot on-device learning
The need for intelligent, personalized AI experiences, such as personalized gesture commands for a digital assistant, is ever-growing. Although cloud training can provide this personalization, on-device learning enables it without the need to share personal data and sacrifice privacy. However, teaching these personalized gestures on a mobile device is challenging due to power and compute constraints. To address this, our full-stack AI optimizations included developing an accurate and small gesture recognition model, quantizing the model with AIMET, and utilizing the AI hardware acceleration on Snapdragon. When the user pairs a voice command with an arbitrary gesture a few times, the digital assistant learns the new gesture through few-shot on-device learning. We achieve real-time processing at low power while maintaining a high gesture detection accuracy of 95 percent.

Other NeurIPS activities
We are also sponsoring the Women in Machine Learning Workshop (WiML 2022), which is co-located with NeurIPS, to support more diversity in the machine learning community. We hope that you can join us there as well.
At Qualcomm Technologies, we make breakthroughs in fundamental research and scale them across devices and industries to power the connected intelligent edge. Qualcomm AI Research works together with the rest of the company to integrate the latest AI developments and technology into our products — shortening the time between research in the lab and delivering advances in AI that enrich our lives.

