Driving photorealistic 3D avatars in real time with on-device 3D Gaussian splatting

Our breakthrough research and optimizations make immersive experiences with realistic 3D avatars possible at 60 FPS on phones and XR devices using 3D Gaussian splatting.

The engineering community thought that rendering realistic 3D avatars with 3D Gaussian splatting on a battery-operated device would be too computationally expensive for the foreseeable future. At NeurIPS 2024, we showed that this is not the case: 3D Gaussian splatting can run in real time on edge devices.

3D Gaussian splatting enables realistic digital twins

Gaussian splatting is an emerging technique for 3D representation, valued for the realism it provides. The original 3D Gaussian splatting (3DGS) method captures a set of images, aligns them using COLMAP, and estimates the splat parameters through optimization [1].

Recently, several papers [2,3] have presented Gaussian splatting avatars that take this representation to the next level. The key idea is to train a 3DGS neural network decoder that estimates the Gaussian splatting parameters of an avatar, conditioned on the expression vector and the ID of the avatar. This idea was extended in [4] with the ability to enroll an avatar from a commodity mobile device, which is a very compelling path to follow.

To represent an avatar, we assume that a retopologized mesh exists for skinning and tracking. The mesh is also assumed to be aligned with a UV map. In a sense, each texel of the UV map is a container that stores all the splat parameters, as shown below.

Figure 1: Avatar UV map and corresponding mesh

Under this concept, the number of splats would correspond to the size of the UV map. For example, a 512×512 UV map would have 262,144 splats. This representation with a large number of splats would provide great quality but is problematic for edge devices since it also requires high compute and data bandwidth. How can we efficiently enable this concept to run Gaussian splatting on edge devices?
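As a rough illustration of the compute and bandwidth concern, the sketch below counts the splats for a square UV map and estimates how much splat data would move per frame and per second. The per-splat layout (position, rotation quaternion, scale, opacity, RGB color, all stored as float32) is an assumption made for this example; the actual parameter layout is not specified here, and the function names are hypothetical.

```python
# Assumed per-splat layout (not specified in the post): position (3),
# rotation quaternion (4), scale (3), opacity (1), RGB color (3).
FLOATS_PER_SPLAT = 3 + 4 + 3 + 1 + 3
BYTES_PER_FLOAT = 4  # assumes float32 storage


def splat_count(uv_size: int) -> int:
    """One splat per texel of a square uv_size x uv_size UV map."""
    return uv_size * uv_size


def bytes_per_frame(uv_size: int) -> int:
    """Bytes of splat parameters produced for a single frame."""
    return splat_count(uv_size) * FLOATS_PER_SPLAT * BYTES_PER_FLOAT


def bandwidth_mib_per_s(uv_size: int, fps: int = 60) -> float:
    """Splat data moved per second, in MiB, at the given frame rate."""
    return bytes_per_frame(uv_size) * fps / (1024 ** 2)


for size in (128, 256, 512):
    print(
        f"{size}x{size}: {splat_count(size)} splats, "
        f"{bytes_per_frame(size) / 1024 ** 2:.2f} MiB/frame, "
        f"{bandwidth_mib_per_s(size):.1f} MiB/s at 60 FPS"
    )
```

Under these assumptions, halving the UV map resolution cuts both the splat count and the data volume by a factor of four, which is why the map size is such a strong lever on edge devices.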

Our optimizations to 3D Gaussian splatting

To drive an avatar with facial expressions on device, we developed the flow below. We need a high-fidelity expression encoder to map the image to an expression vector, such as blendshapes and gaze vectors. We chose these because they can easily be supported with standards like OpenXR [6]. The decoder takes the expression vector as well as the avatar assets and generates the splats.

To utilize the processing available across the Snapdragon platform, we split the computation into separate blocks. We run the expression encoder as well as the avatar decoder on the Neural Processing Unit (NPU) of a device powered by Snapdragon. The 3DGS rendering runs on the Graphics Processing Unit (GPU). This way we benefit from the different processors running concurrently. The data flow from NPU to GPU can be managed with shared memory. To reduce the data bandwidth between NPU and GPU, one can use existing techniques [7-9].

Figure 2: Comparison diagram of processing across Snapdragon platform
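The concurrent NPU/GPU arrangement described above can be sketched as a two-stage producer/consumer pipeline. The example below is only an illustration in plain Python threads: the bounded queue stands in for shared buffers between the two processors, and `fake_decode` and `fake_render` are hypothetical placeholders, not Snapdragon or Qualcomm AI Engine Direct APIs.

```python
import queue
import threading


def run_pipeline(num_frames, decode, render, depth=2):
    """Run decode and render as two overlapping stages joined by a bounded queue."""
    handoff = queue.Queue(maxsize=depth)  # stands in for shared NPU/GPU buffers
    rendered = []

    def producer():
        for frame_id in range(num_frames):
            # put() blocks when the queue is full, i.e. the renderer is behind
            handoff.put(decode(frame_id))
        handoff.put(None)  # sentinel: no more frames

    def consumer():
        while True:
            splats = handoff.get()
            if splats is None:
                break
            rendered.append(render(splats))

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rendered


def fake_decode(frame_id):
    """Placeholder for the decoder stage: returns dummy splat data."""
    return {"frame": frame_id, "splats": [frame_id] * 4}


def fake_render(splats):
    """Placeholder for the renderer stage: returns the frame it drew."""
    return splats["frame"]


frames = run_pipeline(5, fake_decode, fake_render)
print(frames)
```

A small queue depth keeps latency low while still letting the decoder work on the next frame as the renderer draws the current one.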

For the encoder and decoder to run on the NPU, we additionally need to make the AI model compatible with the Qualcomm AI Engine Direct SDK (e.g., by quantizing it) [10]. To quantize while retaining model accuracy, we use the AI Model Efficiency Toolkit (AIMET) [11]. As shown in the diagram below, one can first use any ML library to train the 3DGS decoder. Once the quality is satisfactory, quantization-aware training (QAT) follows using AIMET [11] to generate a quantized model that can run efficiently on the NPU of an edge device powered by Snapdragon [10].

Figure 3: Using any ML library to train the 3DGS decoder
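To give a feel for what quantization does to model weights, here is a minimal pure-Python sketch of 8-bit uniform affine quantization followed by dequantization, the kind of quantize-dequantize step that QAT simulates during training so the model learns to tolerate rounding. It does not use the AIMET or Qualcomm AI Engine Direct APIs; the function names, the min/max calibration, and the sample weights are all assumptions made for illustration.

```python
def quant_params(values, num_bits=8):
    """Derive scale and zero point from the observed min/max (assumed calibration)."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0  # avoid division by zero
    zero_point = round(-lo / scale)
    return scale, zero_point


def fake_quantize(values, num_bits=8):
    """Quantize to integer levels, then map back to floats (quantize-dequantize)."""
    scale, zero_point = quant_params(values, num_bits)
    levels = 2 ** num_bits - 1
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(0, min(levels, q))  # clamp to the representable integer range
        out.append((q - zero_point) * scale)
    return out, scale


weights = [-0.82, -0.1, 0.0, 0.37, 1.25]  # hypothetical sample weights
dequantized, step = fake_quantize(weights)
worst_error = max(abs(a - b) for a, b in zip(weights, dequantized))
print(f"step size: {step:.6f}, worst rounding error: {worst_error:.6f}")
```

The rounding error per value stays within one quantization step, which is the perturbation the network is exposed to during QAT.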

The world’s first demonstration of a real-time 3D Gaussian splatting avatar running on device

With the concept and optimizations described above, the image and profiling table below show how the overall system runs in real time at 60 FPS on edge devices powered by Snapdragon XR2 Gen 2 and Snapdragon 8 Elite. These numbers correspond to a 512×512 UV map.

Platform                      Snapdragon XR2 Gen 2    Snapdragon 8 Elite
Encoder latency (ms)          3.905                   1.196
Decoder latency (ms)          13.534                  7.58
3DGS renderer latency (ms)    8.85                    7.04

Figure 4: The world’s first demonstration of a real-time 3D Gaussian splatting avatar running on device
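The profiling numbers above can be checked against the 60 FPS frame budget with a few lines of arithmetic. How the three stages are actually scheduled is not stated here, so the sketch reports two views, both of which are assumptions: a pipelined view, where stages overlap across frames and the slowest stage bounds throughput, and a fully serial view, where the latencies simply add up. The function names are hypothetical.

```python
FRAME_BUDGET_MS = 1000.0 / 60.0  # about 16.67 ms per frame at 60 FPS

# Latencies in milliseconds, copied from the profiling table (512x512 UV map).
LATENCIES_MS = {
    "Snapdragon XR2 Gen 2": {"encoder": 3.905, "decoder": 13.534, "renderer": 8.85},
    "Snapdragon 8 Elite": {"encoder": 1.196, "decoder": 7.58, "renderer": 7.04},
}


def slowest_stage_ms(stages):
    """Bottleneck latency if the stages overlap across frames (assumption)."""
    return max(stages.values())


def serial_total_ms(stages):
    """End-to-end latency if the stages run strictly one after another."""
    return sum(stages.values())


def max_pipelined_fps(stages):
    """Upper bound on frame rate under the pipelined assumption."""
    return 1000.0 / slowest_stage_ms(stages)


for platform, stages in LATENCIES_MS.items():
    print(
        f"{platform}: slowest stage {slowest_stage_ms(stages):.3f} ms, "
        f"serial total {serial_total_ms(stages):.3f} ms, "
        f"budget {FRAME_BUDGET_MS:.2f} ms, "
        f"pipelined bound {max_pipelined_fps(stages):.1f} FPS"
    )
```

On both platforms every individual stage fits within the frame budget, which is the condition that matters when the processors run concurrently.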

We also show a live video demonstration of the overall system running on a phone powered by the Snapdragon 8 Elite platform, where a user can drive various avatars. The models featured in this demonstration gave permission to Qualcomm Technologies to use their images and corresponding meshes for the purpose of the 3D avatar demonstration.

On-device 3D Gaussian splatting demo

Nov 26, 2024 | 2:18


What’s next?

Our intent is to make this research a commercial reality. We envision people having truly immersive conversations on XR devices, where lifelike facial avatars make it feel like everyone is in the same room even when they are countries apart.

Let us know what you think! Join our developer community on Developer Discord and sign up for our AI newsletter, What’s next in AI and computing.

-------------------------------------------------------------------------

[1] B. Kerbl, G. Kopanas, T. Leimkühler, G. Drettakis, “3D Gaussian Splatting for Real-Time Radiance Field Rendering”, in SIGGRAPH, July 2023

[2] S. Saito, G. Schwartz, T. Simon, J. Li, G. Nam, “Relightable Gaussian Codec Avatars”, in CVPR, June 2024

[3] S. Giebenhain, T. Kirschstein, M. Rünz, L. Agapito, M. Nießner, “NPGA: Neural Parametric Gaussian Avatars”, in SIGGRAPH Asia, Dec. 2024

[4] J. Li, C. Cao, G. Schwartz, R. Khirodkar, C. Richardt, T. Simon, Y. Sheikh, S. Saito, “URAvatar: Universal Relightable Gaussian Codec Avatars”, in SIGGRAPH Asia, Dec. 2024

[5] B. Egger, W. Smith, A. Tewari, S. Wuhrer, M. Zollhoeffer, T. Beeler, F. Bernard, T. Bolkart, A. Kortylewski, S. Romdhani, C. Theobalt, V. Blanz, T. Vetter, “3D Morphable Face Models—Past, Present, and Future”, ACM Transactions on Graphics, vol. 39, no. 5, June 2020

[6] The OpenXR 1.1.42 Specification, https://registry.khronos.org/OpenXR/specs/1.1/html/xrspec.html#XR_FB_face_tracking, last accessed Nov. 2024

[7] M. Sarkis, W. Zia, K. Diepold, “Fast Depth Map Compression and Meshing with Compressed Tritree”, in ACCV, Nov. 2009

[8] M. G. Kim, S. Jeong, S. Park, J. Han, “Superpixel-guided Sampling for Compact 3D Gaussian Splatting”, in ACM Symposium on Virtual Reality Software and Technology, Oct. 2024

[9] J. C. Lee, D. Rho, X. Sun, J. H. Ko, E. Park, “Compact 3D Gaussian Representation for Radiance Field”, in CVPR, June 2024

[10] Qualcomm® AI Engine Direct SDK, https://www.qualcomm.com/developer/software/qualcomm-ai-engine-direct-sdk, last accessed Nov. 2024

[11] Qualcomm® AI Model Efficiency Toolkit (AIMET), https://quic.github.io/aimet-pages/, last accessed Nov. 2024

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries. AIMET is a product of Qualcomm Innovation Center, Inc.

About the Author
Michel Sarkis, Principal Engineer/Manager
Qualcomm relentlessly innovates to deliver intelligent computing everywhere, helping the world tackle some of its most important challenges. Our leading-edge AI, high performance, low-power computing, and unrivaled connectivity deliver proven solutions that transform major industries. At Qualcomm, we are engineering human progress.
