Virtual Reality


Reality check: creating VR content and achieving 6-DOF are no easy tasks.

Many technical challenges must be addressed for generating great Virtual Reality (VR) content and for achieving a 6 degrees of freedom (6-DOF) Head Mounted Display (HMD) experience.

VR video content creation is extremely challenging, and multiple technology dimensions must be addressed to generate high-quality VR content for smartphones. Imagery from multiple cameras must be synchronized so that the 360-degree field of view of a scene is captured at the same instant in time. Next, images from the different cameras must be equalized for color and brightness and then stitched together into a single high-resolution frame, continuously at a high frame rate, to create a seamless 360-degree video. Finally, to stream the video to HMDs effectively, high-efficiency compression of the content is key, and very high resolution and frame rate are needed for the pixel density an immersive experience demands.
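As a rough illustration of how these stages chain together per frame, here is a minimal Python sketch; the frame sizes, the gain-based equalization, and the side-by-side "stitching" are simplifying assumptions standing in for the real algorithms:

```python
import numpy as np

# Minimal sketch of the capture-to-stream pipeline described above.
# Frame shapes and the simple gain-based equalization are illustrative
# assumptions, not the production algorithms.

def capture_synchronized(num_cameras=2, h=1080, w=1920):
    """Stand-in for synchronized capture: one frame per camera, same instant."""
    return [np.random.randint(0, 256, (h, w, 3), dtype=np.uint8)
            for _ in range(num_cameras)]

def equalize(frames):
    """Match per-channel brightness of every frame to the first camera."""
    target = frames[0].astype(np.float32).mean(axis=(0, 1))
    out = []
    for f in frames:
        gain = target / (f.astype(np.float32).mean(axis=(0, 1)) + 1e-6)
        out.append(np.clip(f.astype(np.float32) * gain, 0, 255).astype(np.uint8))
    return out

def stitch(frames):
    """Place frames side by side on one canvas (real stitching warps and blends)."""
    return np.concatenate(frames, axis=1)

# One iteration of the per-frame loop; a real system runs this continuously
# at 30+ FPS and hands the canvas to a hardware video encoder.
canvas = stitch(equalize(capture_synchronized()))
print(canvas.shape)  # (1080, 3840, 3) -- a single 360-degree frame
```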

6-DOF HMDs have existed for several years, delivering highly accurate head motion tracking while tethered to a large, high-powered PC. Currently, no solution allows users to experience that level of power and performance on an affordable, compact, mobile VR HMD system. A critical challenge associated with 6-DOF is localization: the user's movement in the real world must be tracked with high accuracy so that a consistent view of the virtual world can be rendered on the display in real time. This can only be realized through highly accurate fusion of camera sensor and Inertial Measurement Unit (IMU) data.
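To illustrate why camera data is indispensable, the following simplified sketch integrates noise-only accelerometer samples (the sample rate and noise level are assumed figures): even with a perfectly stationary device, a position estimate based on the IMU alone drifts within seconds.

```python
import numpy as np

# Illustrative sketch: IMU-only dead reckoning. Integrating noisy
# accelerometer data twice makes position error accumulate, which is
# why camera-based corrections are essential for 6-DOF localization.

dt = 1.0 / 1000.0            # assumed 1 kHz IMU sample rate
accel_noise = 0.02           # m/s^2 noise on a stationary device (assumed)

vel = np.zeros(3)
pos = np.zeros(3)
for step in range(30_000):   # 30 seconds of samples
    accel = np.random.normal(0.0, accel_noise, 3)  # true acceleration is zero
    vel += accel * dt
    pos += vel * dt

print(np.linalg.norm(pos))   # several centimetres of drift from noise alone,
                             # before sensor bias is even considered
```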

Key Research Areas:

Pioneering VR content R&D.

We have developed real-time image processing of two camera sensors in a single System on Chip (SoC) solution. The camera sensors are designed to operate in close coordination, capturing 360-degree imagery in smooth synchronization with each other, where timing, color, and sensor orientation are all critical.
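The sketch below illustrates one simple software-side notion of frame synchronization: pairing frames from two sensors whose capture timestamps fall within a tolerance. The queue layout and skew threshold are illustrative assumptions; in practice the sensors are synchronized at the hardware level, so the residual skew is tiny.

```python
# Minimal sketch of pairing frames from two camera sensors by timestamp.

def pair_frames(queue_a, queue_b, max_skew_s=0.001):
    """Yield (frame_a, frame_b) pairs whose timestamps differ by <= max_skew_s."""
    pairs = []
    i = j = 0
    while i < len(queue_a) and j < len(queue_b):
        ta, _ = queue_a[i]
        tb, _ = queue_b[j]
        if abs(ta - tb) <= max_skew_s:
            pairs.append((queue_a[i][1], queue_b[j][1]))
            i += 1
            j += 1
        elif ta < tb:
            i += 1   # frame from camera A has no close-enough partner; drop it
        else:
            j += 1
    return pairs

# Each queue entry is (timestamp_seconds, frame); frames here are just labels.
a = [(0.000, "A0"), (0.033, "A1"), (0.066, "A2")]
b = [(0.0002, "B0"), (0.034, "B1"), (0.099, "B2")]
print(pair_frames(a, b))  # [('A0', 'B0'), ('A1', 'B1')]
```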

After the cameras capture a 360-degree scene, their imagery is processed through our SoC's powerful Image Signal Processor (ISP), which performs image-processing operations including demosaicing, denoising, and color and brightness correction. Powerful Central Processing Units (CPUs) and Graphics Processing Units (GPUs) enable real-time analysis and correction of color mismatches between cameras, as well as of the ghosting artifacts that unavoidably appear when stitching content from multiple cameras. The two input fisheye images are un-warped in real time and stitched together to form a continuous 360-degree image canvas.
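As one concrete illustration of the un-warping step, here is a sketch that maps a single fisheye image onto half of an equirectangular canvas, assuming the common equidistant lens model and a 180-degree field of view; real devices use measured per-unit calibration rather than these nominal parameters.

```python
import numpy as np
import cv2  # OpenCV, for fast per-pixel remapping

def fisheye_to_equirect(fisheye, out_h=960, out_w=960, fov_deg=180.0):
    """Un-warp an equidistant-model fisheye image to an equirectangular patch."""
    in_h, in_w = fisheye.shape[:2]
    cx, cy = in_w / 2.0, in_h / 2.0
    focal = (in_w / 2.0) / np.radians(fov_deg / 2.0)  # equidistant: r = f * theta

    # Longitude/latitude grid for this camera's hemisphere of the canvas.
    lon = np.linspace(-np.pi / 2, np.pi / 2, out_w)
    lat = np.linspace(-np.pi / 2, np.pi / 2, out_h)
    lon, lat = np.meshgrid(lon, lat)

    # Unit ray for each output pixel (camera looks along +z).
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    theta = np.arccos(np.clip(z, -1.0, 1.0))   # angle from the optical axis
    rho = np.sqrt(x * x + y * y) + 1e-9        # in-plane direction length
    r = focal * theta                          # equidistant projection radius
    map_x = (cx + r * x / rho).astype(np.float32)
    map_y = (cy + r * y / rho).astype(np.float32)
    return cv2.remap(fisheye, map_x, map_y, cv2.INTER_LINEAR)

# The two un-warped hemispheres are then blended along their shared seam.
```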

“Ghosting” and color mismatches reveal visible seams in the stitched content.

Our dynamic stitching and color equalization approach greatly reduces ghosting artifacts and matches colors across the two images, avoiding visible seams.
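A simplified version of these two ideas, matching colors across the overlap region and then blending through it, looks like the following sketch; the fixed 64-pixel overlap and the global per-channel gain are simplifying assumptions, whereas dynamic stitching additionally adjusts the seam location per frame to dodge ghosting.

```python
import numpy as np

def equalize_and_blend(left, right, overlap=64):
    """Color-match the right image to the left, then blend across the overlap."""
    left = left.astype(np.float32)
    right = right.astype(np.float32)

    # Per-channel gain that maps the right overlap onto the left overlap.
    gain = (left[:, -overlap:].mean(axis=(0, 1)) /
            (right[:, :overlap].mean(axis=(0, 1)) + 1e-6))
    right = np.clip(right * gain, 0, 255)

    # Linear ramp: weight slides from the left image to the right image.
    w = np.linspace(1.0, 0.0, overlap)[None, :, None]
    seam = left[:, -overlap:] * w + right[:, :overlap] * (1.0 - w)

    canvas = np.concatenate([left[:, :-overlap], seam, right[:, overlap:]], axis=1)
    return canvas.astype(np.uint8)

a = np.full((4, 100, 3), 120, dtype=np.uint8)
b = np.full((4, 100, 3), 150, dtype=np.uint8)   # slightly brighter camera
print(equalize_and_blend(a, b).mean())           # uniform 120.0 after matching
```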

Achieving high quality, with 4K resolution at a 30 FPS frame rate, is key to a good user experience, because the content is viewed on an HMD that magnifies the scene and presents it very close to the user's eyes.

Next, the stitched video frames are encoded, and the output bitstream is either streamed directly to a server or HMD screen or stored locally on the smartphone itself.
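For illustration, a minimal encode-and-store step might look like the following; the codec, container, output path, and resolution are assumptions, and a real pipeline drives the SoC's hardware video encoder and a streaming stack rather than a software codec.

```python
import numpy as np
import cv2

fps, w, h = 30, 3840, 1920                     # 4K-class equirectangular frames
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
writer = cv2.VideoWriter("pano_360.mp4", fourcc, fps, (w, h))

for _ in range(fps):                            # one second of dummy frames
    frame = np.random.randint(0, 256, (h, w, 3), dtype=np.uint8)
    writer.write(frame)                         # encode one stitched frame
writer.release()                                # flush the bitstream to disk
```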

Camera frames are processed by numerous algorithms before being output to the user's HMD device.

6-DOF and the future of VR R&D.

Current mobile VR HMD systems support a 3 degrees of freedom (3-DOF) architecture, where the HMD detects the user’s rotational movement, providing the main benefit of looking at the world from a fixed point. We are developing a 6-DOF system that delivers a more immersive VR experience, where user motion is not confined to rotations around a single viewpoint. This will enable highly accurate 6-DOF motion tracking of head movements on our mobile platform.
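The difference is easy to state in terms of pose: a 3-DOF pose is a rotation only, while a 6-DOF pose adds a translation. A toy sketch of the two as 4x4 transforms (the specific angles and offsets are arbitrary examples):

```python
import numpy as np

def pose_matrix(rotation_3x3, translation_xyz):
    """Build a 4x4 rigid-body transform from a rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = rotation_3x3
    T[:3, 3] = translation_xyz
    return T

yaw = np.radians(30.0)                           # head turned 30 degrees
R = np.array([[ np.cos(yaw), 0.0, np.sin(yaw)],
              [ 0.0,         1.0, 0.0        ],
              [-np.sin(yaw), 0.0, np.cos(yaw)]])

pose_3dof = pose_matrix(R, [0.0, 0.0, 0.0])      # rotation about a fixed point
pose_6dof = pose_matrix(R, [0.1, 0.0, 0.2])      # same rotation, plus a step:
                                                 # 10 cm right, 20 cm forward
```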

6-DOF allows the HMD to detect both rotational and translational movement, letting the user move freely in the virtual world and “look around corners”. We are pioneers in Visual-Inertial Odometry (VIO), performing feature processing on monocular camera data alongside inertial data processing (using accelerometer and gyroscope data), where camera frames arrive at 30 FPS and are fused with IMU data sampled at 800-1000 Hz. The result is camera and inertial sensor data fusion, continuous localization, and accurate, high-rate 6-DOF “pose” (position and orientation) generation and prediction.
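A toy sketch of that fusion cadence, with a constant-gain correction standing in for the actual estimator (the gain, the noise level, and the stationary "vision" measurement are all illustrative assumptions; real VIO also tracks orientation, velocity, and sensor biases):

```python
import numpy as np

IMU_HZ, CAM_HZ = 1000, 30
dt = 1.0 / IMU_HZ
gain = 0.3                                  # how strongly to trust the camera

pos = np.zeros(3)
vel = np.zeros(3)
for k in range(IMU_HZ):                     # one second of IMU samples
    accel = np.random.normal(0.0, 0.02, 3)  # gravity-compensated (assumed)
    vel += accel * dt                       # high-rate IMU propagation
    pos += vel * dt
    if k % (IMU_HZ // CAM_HZ) == 0:         # a camera frame has arrived
        cam_pos = np.zeros(3)               # vision says: still at the origin
        pos += gain * (cam_pos - pos)       # nudge the estimate toward vision

print(np.linalg.norm(pos))                  # bounded error, unlike IMU alone
```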

Contact Us

If you find the work we’re doing in virtual reality exciting, and you have a technical background in VR, sensor fusion, or visual/signal processing, we’d love to hear from you. Please visit us at www.qualcomm.com/company/careers to submit your resume.