Scene-based Audio for MPEG-H

Scene-based audio technology for MPEG-H helps content creators, content hosts, consumer electronics manufacturers and broadcasters create, capture and render true-to-life immersive 3D audio experiences. It is designed to overcome the key limitations of traditional audio formats.

The difference is in the details.

The Scene-based audio format is designed to represent the audio scene as a field of pressure values at all points in a space over time. It is engineered to be an absolute and true representation of the 3D soundscape, and it is independent of the equipment used to record and play back the sound.

See for yourself all the ways MPEG-H and Scene-based audio (also known as higher order ambisonics or HOA) are revolutionizing the future of immersive audio.

  • True 3D Sound: Scene-based audio captures and creates 3D sound scenes with a true-to-life re-creation of the 3D audio experience including proximity and height.

  • Compatible Workflows: Scene-based audio for MPEG-H works with existing TV broadcasting and broadcast streaming workflows.

  • Universal Format: Audio production in multiple mix formats (such as stereo, 5.1, 7.1 or 7.1.4 surround sound) is no longer necessary because of Scene-based audio’s comprehensive sound field representation. For maximum flexibility, the MPEG-H system also allows Scene-, Object- and Channel-based audio formats to be transmitted simultaneously.

  • Flexible Application: Standardized by the same group (MPEG) that is responsible for the MP3 and AAC audio codecs, as well as the H.264 and HEVC video codecs, MPEG-H Scene-based audio delivers vivid 3D soundscapes to a variety of devices such as smartphones, tablets, set-top boxes (STBs), speaker bars, smart TVs, AV receivers and more.

  • Flexible Rendering: Scene-based audio is a loudspeaker-agnostic format that adapts to the local loudspeaker geometry and acoustic landscape to offer optimal immersive sound playback in any location.

  • Compatible Infrastructure: MPEG-H works for both live capture and recorded content, so it fits into existing infrastructure for audio broadcast and streaming.

  • Interactivity: The Scene-based audio format allows listeners to interact with and personalize their audio experience. Listeners can combine audio objects such as dialogue and multilingual commentary with Scene-based audio, or manipulate the audio point of view.

  • Efficient Representation: Using the MPEG-H compression engines, captured 3D sound scenes compress to just about any bitrate from 96 kb/s to 1.2 Mb/s.

  • Live Recording: Scene-based audio works well for live recording because it captures a representation of the entire sound scene without requiring a human sound mixer. It’s a perfect format for sports, news, and other live applications.
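To put the bitrate range quoted under Efficient Representation in perspective, the sketch below estimates the uncompressed size of a higher order ambisonics scene and compares it with the two quoted endpoints. A full 3D ambisonic scene of order N carries (N + 1)² coefficient signals. The 48 kHz sample rate, 16-bit samples, the chosen orders, and the function names are illustrative assumptions, not values taken from this page or from the MPEG-H specification.

```python
def hoa_channel_count(order: int) -> int:
    """Number of coefficient signals in a full 3D ambisonic scene of this order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def pcm_bitrate_bps(order: int, sample_rate_hz: int = 48_000,
                    bits_per_sample: int = 16) -> int:
    """Uncompressed PCM bitrate in bits per second (assumed 48 kHz, 16-bit)."""
    return hoa_channel_count(order) * sample_rate_hz * bits_per_sample


def compression_ratio(order: int, coded_bitrate_bps: int) -> float:
    """How many times smaller the coded stream is than the assumed raw PCM."""
    return pcm_bitrate_bps(order) / coded_bitrate_bps


if __name__ == "__main__":
    # Endpoints quoted in the text: 96 kb/s and 1.2 Mb/s (decimal units assumed).
    low_bps, high_bps = 96_000, 1_200_000
    for order in (1, 2, 3, 4):
        raw = pcm_bitrate_bps(order)
        print(
            f"order {order}: {hoa_channel_count(order):2d} signals, "
            f"raw {raw / 1e6:5.2f} Mb/s, "
            f"ratio at 1.2 Mb/s {compression_ratio(order, high_bps):5.1f}x, "
            f"ratio at 96 kb/s {compression_ratio(order, low_bps):6.1f}x"
        )
```

Under these assumptions, a fourth-order scene has 25 signals, so the raw stream is many times larger than even the top of the quoted range.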

Sophistication meets simplicity.

Although MPEG-H is one of the most sophisticated 3D audio coding technologies, it’s also one of the easiest to implement. MPEG-H is a comprehensive processing chain that covers acoustic capture, efficient representation, transmission and flexible rendering. It also enables a single format for all rendering scenarios: there’s no need for content creators to encode audio separately for playback on 2.0, 5.1, 7.1, 7.1.4, 11.1, 22.2 and many other surround sound systems.
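The "single format for all rendering scenarios" idea can be illustrated with a toy first-order ambisonics example: the scene is encoded once, then decoded to whatever loudspeaker layout is present. This is a minimal sketch, not the MPEG-H renderer. It assumes the AmbiX convention (ACN channel order W, Y, Z, X with SN3D normalization, azimuth positive to the left), horizontal-only loudspeakers, and a crude projection decoder; the function names and layouts are hypothetical.

```python
import math


def encode_first_order(sample: float, azimuth_deg: float,
                       elevation_deg: float = 0.0) -> tuple:
    """Encode one mono sample arriving from a direction into (W, Y, Z, X)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)    # left/right
    z = sample * math.sin(el)                   # up/down
    x = sample * math.cos(az) * math.cos(el)    # front/back
    return (w, y, z, x)


def decode_projection(scene: tuple, speaker_azimuths_deg: list) -> list:
    """Project the scene onto each horizontal loudspeaker direction.

    Z is ignored because every loudspeaker is assumed to sit at ear height.
    """
    w, y, _z, x = scene
    count = len(speaker_azimuths_deg)
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        feeds.append((w + x * math.cos(az) + y * math.sin(az)) / count)
    return feeds


if __name__ == "__main__":
    # One encoded scene: a source 30 degrees to the listener's left.
    scene = encode_first_order(1.0, azimuth_deg=30.0)

    # Two hypothetical playback layouts fed from the same scene.
    layouts = {
        "stereo": [30.0, -30.0],
        "quad": [45.0, 135.0, -135.0, -45.0],
    }
    for name, azimuths in layouts.items():
        feeds = decode_projection(scene, azimuths)
        print(name, [round(feed, 3) for feed in feeds])
```

In both layouts the loudspeaker nearest the source direction receives the strongest feed, even though the scene was encoded without any knowledge of the layout. A production renderer would use higher orders and a decoder matched to the actual loudspeaker geometry.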

The model is progressive, and implementation is designed to be easy. MPEG-H Scene-based audio will revolutionize audio experiences for content creators, broadcasters, streaming service providers and consumer electronics manufacturers everywhere.

A new era of immersive experiences.

Advancements in sound technology are just one way that Qualcomm Technologies is making experiences more immersive. Fully immersive experiences are achieved by simultaneously improving the broader dimensions of visual quality, sound quality, and intuitive interactions.