Why is the 360 video technical challenge not solved yet?
TL;DR
The perceived resolution is roughly 16 times lower than the original video resolution. This pushes services to use higher resolutions for 360 content, which in turn gives the GPU a hard time during decoding because of its limited capabilities.
This article aims to dive into some of the technical complexities behind 360 videos, to identify the issues generated by this new trendy format, and to show why 360 pictures don’t have these kinds of problems.
To understand the limitation imposed by GPUs, we first need to understand why video compression is a requirement that cannot be bypassed.
Compression is essential
We can think of a video as a succession of images, usually called frames. But if you saved each frame the way you save a single image on your disk, you would end up with an enormous video file!
Let’s do some math to check that:
- We pick a frame resolution of 1920x1080 pixels (the equivalent of a 1080p video). Let’s assume that such a frame, saved as a JPEG, weighs around 2 MB.
- We pick a frame rate of 30 fps.
So for an average movie of 2 hours, we will have 216,000 distinct frames, which would result in a 432 GB movie file!
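As a quick check, the arithmetic above can be reproduced in a few lines of Python. This is only a sketch of the naive estimate: the 2 MB per frame figure is the assumption made in the list above, 1 GB is taken as 1000 MB, and the function name is made up for illustration.

```python
# Back-of-the-envelope size of a movie stored as one JPEG per frame.
FRAME_SIZE_MB = 2        # assumed size of one 1920x1080 JPEG frame
FRAMES_PER_SECOND = 30
DURATION_HOURS = 2


def naive_movie_size(frame_size_mb, fps, hours):
    """Return (frame_count, size_in_gb) when every frame is saved as its own image."""
    frame_count = int(hours * 3600 * fps)
    size_gb = frame_count * frame_size_mb / 1000  # 1 GB = 1000 MB
    return frame_count, size_gb


frame_count, size_gb = naive_movie_size(FRAME_SIZE_MB, FRAMES_PER_SECOND, DURATION_HOURS)
print(f"{frame_count:,} frames, about {size_gb:.0f} GB")
# prints: 216,000 frames, about 432 GB
```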
This naive calculation shows how effective video compression algorithms are: they bring movie files down to practical sizes instead of a crazy, impossible-to-share 432 GB file. How the compression magic works, though, is another fascinating topic for another day.
Now that you know why video compression exists, and that it is an absolute necessity if you want to share videos across a bandwidth-limited medium like the Internet, let’s move on to the limited capabilities of the GPU.
GPUs have limitations
Compressing video frames implies that a decompression step exists too, and this step is usually called the decoding process. It is just like opening a zip archive: you need to decompress the files, also known as unzipping the archive. The main difference with unzipping is that video decoding happens in real time, frame by frame, and your hardware puts physical limits on this real-time decoding process.
For example, a mobile device GPU might be limited to decoding frames with a maximum size of 1920x1080 pixels. This means that this mobile device will be able to decode at best a 1080p 360 video, and we will see in the next section why this is an issue.
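To make the constraint concrete, here is a minimal sketch of the two quantities involved: the largest frame a decoder accepts, and the number of pixels it must produce every second. The 1920x1080 limit is the hypothetical one from the example above, not a measured value for any specific device, and the function names are invented for illustration.

```python
# Hypothetical decoder limit, taken from the example above (not a real device spec).
MAX_DECODE_SIZE = (1920, 1080)


def fits_decoder(width, height, max_size=MAX_DECODE_SIZE):
    """True if a frame of this size is within the decoder's maximum frame size."""
    return width <= max_size[0] and height <= max_size[1]


def decoded_pixels_per_second(width, height, fps):
    """Number of pixels the decoder has to output each second, in real time."""
    return width * height * fps


for name, (w, h) in {"1080p": (1920, 1080), "4K": (3840, 2160)}.items():
    rate = decoded_pixels_per_second(w, h, fps=30)
    print(f"{name}: fits={fits_decoder(w, h)}, {rate:,} pixels/s")
```

At the same frame rate, a 3840x2160 stream asks the decoder for four times as many pixels per second as a 1920x1080 stream.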
1080p is a good quality for a video, right?
Well, it is for regular videos, but not for 360 videos. Here we will briefly see why 360 image warping increases the need for resolutions higher than 1080p.
A typical 360 video frame
Notice the video frame on the left and how the elements of the room are distorted, just like when you see a flat map of the world. Distortions happen because we use a projection to map a 3D volume onto a 2D surface. There is a great variety of projections out there. In this example we used the well-known equirectangular projection, also sometimes called the spherical projection.
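For readers who want to see the mapping itself, here is a minimal sketch of the standard equirectangular relation between a viewing direction and a pixel position. The function names and the angle conventions (yaw 0 at the frame center, pitch +90 at the top edge) are assumptions chosen for this illustration. Players may use different conventions.

```python
import math


def direction_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to (x, y) in an equirectangular frame.

    Assumed conventions: yaw in [-180, 180] with 0 at the frame center,
    pitch in [-90, 90] with +90 at the top edge of the frame.
    """
    x = (yaw_deg / 360.0 + 0.5) * width
    y = (0.5 - pitch_deg / 180.0) * height
    return x, y


def horizontal_stretch(pitch_deg):
    """Factor by which content is stretched horizontally at a given pitch."""
    return 1.0 / math.cos(math.radians(pitch_deg))


print(direction_to_pixel(0, 0, 1920, 1080))  # straight ahead lands at the frame center
print(horizontal_stretch(60))                # content at 60 degrees of pitch is about twice as wide
```

The stretch factor grows without bound toward the poles, which is the same effect that inflates polar regions on a flat world map.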
The player you use to watch your 360 video maps that 360 frame onto a sphere (the spherical projection) so that you see only a portion of the image, with a field of view of roughly 50°.
I projected such a 50° field of view onto a 360 frame grid so that you can see how many pixels your player will display (the green shape below), compared to the overall image that the GPU has to decode (the entire grid). The shape corresponds to a mobile device screen held in portrait mode.
Spherical projection of the player frame (green shape) on the full-size 360 map
Because the player is only displaying a small portion of the full 360 image, as shown in the projection above, you are never watching a 1080p video. Let’s define a new measurement that will better reflect the final quality that a user will experience while playing a 360 video. We can call this new measurement the perceived resolution, and compare it with the original video resolution.
Here are rough estimates of the perceived resolution for a 50° field of view player, depending on the original video resolution.
Computation of the perceived resolution from the initial video resolution
This table shows us how bad the perceived experience can get, unless you use a very high definition video of around 4K or higher. And this answers why a 1080p resolution is not enough for a good 360 video experience.
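One rough way to approximate such estimates is to take the share of the frame covered by the field of view: the horizontal angle over 360° and the vertical angle over 180°. The sketch below is an approximation under stated assumptions, not the exact method behind the figures above. It assumes the 50° is the horizontal field of view, a 9:16 portrait screen, and a view centered on the horizon where the equirectangular distortion is smallest. All function names are made up for illustration.

```python
import math


def vertical_fov(horizontal_fov_deg, aspect_w, aspect_h):
    """Vertical field of view of a flat screen, given its horizontal one."""
    half_h = math.radians(horizontal_fov_deg) / 2
    return math.degrees(2 * math.atan(math.tan(half_h) * aspect_h / aspect_w))


def perceived_resolution(video_w, video_h, hfov_deg=50.0, aspect=(9, 16)):
    """Approximate count of source pixels visible in the player viewport.

    Assumes an equirectangular frame covering 360 x 180 degrees and a view
    centered on the horizon; distortion away from the horizon is ignored.
    """
    vfov_deg = vertical_fov(hfov_deg, *aspect)
    return round(video_w * hfov_deg / 360), round(video_h * vfov_deg / 180)


for name, (w, h) in {"1080p": (1920, 1080), "4K": (3840, 2160)}.items():
    pw, ph = perceived_resolution(w, h)
    print(f"{name}: about {pw}x{ph} visible, {(w * h) / (pw * ph):.1f}x fewer pixels")
```

Under these assumptions the viewport covers about one sixteenth of the pixels in the frame, in line with the ratio given in the TL;DR.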
This also explains why Facebook 360 and YouTube 360 seem to stream low-resolution 360 videos, whereas they are actually streaming very high-resolution videos. In fact, they wish they could use even higher resolutions, but they are reaching the limits of our graphics processors.
Is 360 content useless then?
Far from it. At 360player.io we believe in, and bet on, 360 content as a big component of the future, but not on 360 videos, simply because 360 content doesn’t have to be limited to videos.
As a matter of fact, there are no such resolution limitations for 360 photos, because decoding an image is a one-time process that usually happens on the CPU and is not constrained by real-time requirements.
360 picture formats have a lot of very interesting advantages and, best of all, they are ready and mature enough for any type of company to use right away.