Content is the lifeblood of a vibrant ecosystem, as history has shown on numerous occasions from smartphones to social media. Virtual Reality (VR) will be no different, with content in the form of video, games, and other applications driving adoption. In fact, the potential of VR video is clear based on existing premium video content, such as sports, concerts, and nature. However, a thriving content ecosystem also needs long-tail video that resonates with individuals in a personal way, which is perfect for VR since it is such an immersive medium. As VR video content improves, VR consumption will go from snackable and sporadic usage to extended and daily usage. Increased consumer adoption will drive VR ecosystem investment and competition, leading to lighter, sleeker, and more immersive VR headsets, which will in turn lead to further investment in content. This is the virtuous cycle we want, but it needs to be kick-started. User-Generated Content (UGC) is just the thing to provide that kick start.
Although premium VR video content has been captured primarily with expensive camera rigs, consumer VR cameras are essential for the mass creation of UGC. VR cameras differ from traditional cameras in that they require 360°/180° and/or stereoscopic capture, which introduces new challenges.
Consider 360° capture, which generally uses at least two fisheye lenses to capture the full 360° spherical view. Pixelation is one issue, since the same number of pixels is spread across a much wider field of view (for reference, a typical smartphone camera has a field of view of approximately 60°). Adding more cameras would help increase the number of pixels per degree, but complexity (and cost) also increases with each additional camera. The increase in complexity is primarily due to stitching the individual images together. To stitch the images together properly, all the cameras need to be synchronized so they capture their images at the same time. The individual settings of each camera, such as 3A (auto-focus, auto-exposure, auto-white-balance), also need to be synchronized; otherwise the stitching boundaries will be visible and the colors will look inconsistent. Accurate pose estimation of each camera must also be synchronized with each individual image frame. And don't forget audio: 3D audio also needs to be properly recorded and synchronized with the images and camera pose.
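To see why pixelation matters, a quick back-of-the-envelope calculation helps. The sensor width and exact field-of-view figures below are illustrative assumptions, not specifications of any particular camera:

```python
# Rough angular-resolution comparison: the same horizontal pixel count
# spread over a wider field of view yields fewer pixels per degree.

def pixels_per_degree(horizontal_pixels: int, fov_degrees: float) -> float:
    """Average number of horizontal pixels per degree of field of view."""
    return horizontal_pixels / fov_degrees

# Hypothetical 4000-pixel-wide sensor behind a ~60° smartphone lens:
phone_ppd = pixels_per_degree(4000, 60)      # ~66.7 px/deg

# The same sensor behind one ~180° fisheye of a two-lens 360° rig:
fisheye_ppd = pixels_per_degree(4000, 180)   # ~22.2 px/deg

print(f"smartphone: {phone_ppd:.1f} px/deg")
print(f"fisheye:    {fisheye_ppd:.1f} px/deg")
```

Under these assumptions, each fisheye delivers roughly a third of the angular resolution of the smartphone lens, which is why adding cameras (at the cost of stitching complexity) is tempting.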
Stereoscopic capture, which requires at least two cameras to replicate the different viewpoints our two eyes see, is used to create depth perception during video playback and provide a more immersive VR experience. Although the images from the two cameras may not be stitched together, they still face the same synchronization, settings, color-equalization, and pose-calibration issues described for 360° video capture.
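As one concrete (and deliberately naive) illustration of the color-equalization problem, the two cameras' exposures could be aligned by computing a single gain that matches mean luminance. Real pipelines do far more, such as per-channel white balance and vignetting correction; this toy function and its sample values are purely illustrative:

```python
from statistics import mean

def exposure_gain(left_luma, right_luma):
    """Naive exposure equalization: a single multiplicative gain that
    makes the right camera's mean luminance match the left camera's.
    Inputs are flat lists of pixel luminance values."""
    return mean(left_luma) / mean(right_luma)

# If the right camera metered darker, scale it up to match:
gain = exposure_gain([100, 110, 105], [50, 55, 52.5])
print(f"apply gain {gain:.2f}x to the right image")
```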
And what about sharing these experiences live through social media so others can join you for special moments like a wedding, a birthday party, or a sporting event? This requires doing all the processing above as well as encoding, packaging, and wirelessly transmitting the video in real time. That’s a lot of processing and connectivity to fit within the power and thermal constraints of a sleek mobile device, such as a standalone camera or a smartphone. It’s these types of challenges that Qualcomm Technologies, Inc. (QTI) is well positioned to solve.
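The scale of the real-time encoding problem is easy to see with back-of-the-envelope numbers. The resolution, frame rate, chroma subsampling, and encoded-bitrate target below are illustrative assumptions, not figures from any specific device:

```python
# Why real-time encoding is mandatory before wireless transmission:
# raw 4K video is orders of magnitude larger than any wireless link.

def raw_bitrate_mbps(width, height, fps, bits_per_pixel=12):
    """Raw video bitrate in Mbit/s (4:2:0 sampling -> 12 bits/pixel)."""
    return width * height * fps * bits_per_pixel / 1e6

raw = raw_bitrate_mbps(3840, 2160, 30)  # ~2986 Mbit/s uncompressed
encoded_target = 15                     # a plausible encoded target, Mbit/s

print(f"raw: {raw:.0f} Mbit/s")
print(f"compression needed: ~{raw / encoded_target:.0f}x")
```

Roughly a 200x reduction is needed under these assumptions, all of which must happen on-device, in real time, within a mobile power budget.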
QTI is enabling high-quality, on-device, real-time VR capture and streaming for the masses. Qualcomm Snapdragon SoCs are engineered to provide an efficient heterogeneous computing solution that is optimized end-to-end for VR capture and streaming. They include an integrated global high-frequency clock that applies highly accurate, uniform timestamps to camera and sensor data, which is critical for synchronized stitching. Plus, QTI develops intelligent stitching algorithms that attempt to stitch around objects rather than along a predefined straight line, so the image boundaries are seamless. Snapdragon processors' diverse processing engines, such as the CPU, GPU, DSP, and VPU, provide high-throughput, efficient processing for stitching, image processing, and video encoding. Snapdragon processors also support a variety of connectivity solutions, such as LTE, Wi-Fi, and Bluetooth, for real-time streaming. At CES 2017, we demonstrated on-device, real-time VR capture and sharing on a drone prototype based on Snapdragon Flight by stitching, encoding, and live streaming 4K 360° video at 30 frames per second to VR headsets.
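To illustrate why uniform timestamps matter for synchronized stitching, here is a hypothetical sketch (not QTI's implementation) of pairing a frame from one camera with the nearest-in-time frame from another, rejecting matches outside a tolerance:

```python
from bisect import bisect_left

def nearest_frame(timestamps_us, target_us, tolerance_us=500):
    """Given a sorted list of frame timestamps (microseconds) from one
    camera, return the index of the frame closest in time to target_us,
    or None if no frame lies within tolerance_us. Illustrative only."""
    i = bisect_left(timestamps_us, target_us)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps_us)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(timestamps_us[j] - target_us))
    return best if abs(timestamps_us[best] - target_us) <= tolerance_us else None

# Frames at ~30 fps from camera B, matched against a camera A frame:
cam_b = [0, 33333, 66666]
print(nearest_frame(cam_b, 33000))   # matches the second frame
print(nearest_frame(cam_b, 20000))   # no frame close enough -> None
```

With a shared global clock, both cameras' timestamps live on the same timeline, so this nearest-match lookup is meaningful; with independent clocks it would not be.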
Consumer-class VR cameras that capture VR images and video with the simplicity of a traditional camera are already becoming available. The form factors come in a variety of shapes, ranging from small candy-bar and sphere designs to smartphone attachments, and the trend is toward sleeker and more convenient designs. For example, the Essential 360 camera is a miniature magnetic attachment (powered by Snapdragon 625) for the Essential Phone (powered by Snapdragon 835) that captures 360° images and videos. One day, we could see smartphones or augmented reality glasses integrating multiple cameras for 360° and/or stereoscopic capture.
The same technologies that enable 360° stereoscopic capture are also being used in other industries, such as action cameras, drones, and security products. For example, a 360° security camera does not require mechanical parts since it can view the entire scene without turning. On-device processing also allows for real-time analytics through machine learning, so the camera can raise alerts and send video to the cloud only when something important is happening, like a real security issue. Overall, we are excited about all the possibilities that affordable 360° stereoscopic cameras will open up. They will not only kick-start the VR video content ecosystem, but also disrupt or enable other industries.