AV1 Encoder Optimization from the Decoder's Point of View
Demuxed 2020 was a “don’t miss” event. More than 1,000 video engineers participated, and 28 video engineers, codec developers, and image scientists from Netflix, Disney, Facebook, Apple, LinkedIn, Akamai, Fastly, Cloudflare, VideoLAN, and Visionular shared best practices, open-source projects, and the latest industry trends in video.
Our very own Zoe Liu presented a talk about optimizing an encoder from the perspective of the decoder. Video engineers understand the importance of ensuring playback stability and the role that the encoder plays. AV1 adds more than 100 coding tools relative to its predecessor, VP9, and this is the key to AV1’s significantly improved coding efficiency. However, every new codec standard brings added complexity, and AV1 is no exception.
Though AV1 is a compute-intensive codec, the open-source libaom implementation became ten times faster between January 2019 and April 2020, even as its coding efficiency improved by more than ten percent.
Visionular’s own Aurora1 AV1 codec implementation can now achieve real-time performance for live encoding at up to 4K resolution on a single machine. This is made possible only by the hard work and dedication of hundreds of codec engineers at Visionular and around the world who are singularly focused on improving the performance of AV1.
As of this writing, the AV1 standard has been available for 2 years and 5 months. Though the hardware decoding ecosystem is expanding, software decoding support is still the main focus for most streaming video distributors wishing to use AV1.
The goal of considering AV1 optimization from the perspective of the decoder is to make AV1 a dominant video coding standard. To do this, we must enable AV1 to enter the practical deployment stage as soon as possible using software as the main decoding solution. This will open up the standard to be deployed even as the hardware playback ecosystem is developing.
Developers typically optimize and improve encoders in two respects:
1- improving coding efficiency,
2- accelerating coding speed.
There is a third aspect to consider if we want practical software decoding to proliferate, and that is the optimization of the encoder for reduced decoding complexity. This process involves analyzing the decoding complexity to optimize the encoder implementation for as low a complexity score as possible while retaining all the features and tools needed to deliver on the coding gains of the standard.
A best-in-class AV1 encoder must balance not only encoding efficiency and encoding speed but also decoding complexity. The best-known open-source AV1 software decoder is dav1d, which is funded by the Alliance for Open Media (AOM) and jointly developed by VideoLAN and FFmpeg. Current experiments show that dav1d has clear advantages over all other open-source AV1 software decoders in terms of decoding speed and multithreading.
To evaluate the decoding performance of AV1 on mobile devices, we paid special attention to the power consumption of dav1d on mobile. Beyond the limited computing resources of mobile devices, power consumption is an important indicator for judging the quality of a decoder. Excessive power consumption seriously shortens battery life. It also makes the phone heat up, which in turn causes the CPU to reduce its frequency and degrades the user experience.
We tested dav1d, ffmpeg-h264, and openhevc software decoders on several common mobile devices. The evaluation indicators include CPU usage (%), memory usage (MB), current (mA), power consumption (mW), voltage (mV), and temperature (℃).
The results show that, in terms of power consumption, dav1d falls between the other two decoders: it consumes more power than ffmpeg-h264 but less than openhevc.
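The electrical indicators listed above are related to one another: power is the product of current and voltage. The following is a minimal sketch of that conversion, assuming readings are logged as (current in mA, voltage in mV) pairs. The function names and sample values are hypothetical and are not measurements from our tests.

```python
def power_mw(current_ma: float, voltage_mv: float) -> float:
    """Instantaneous power in mW from current (mA) and voltage (mV)."""
    # mA * mV gives microwatts, so divide by 1000 to get milliwatts.
    return current_ma * voltage_mv / 1000.0


def average_power_mw(samples):
    """Mean power in mW over a list of (current_mA, voltage_mV) samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(power_mw(i, v) for i, v in samples) / len(samples)


# Hypothetical readings, for illustration only.
samples = [(250.0, 4000.0), (300.0, 3900.0), (275.0, 3950.0)]
print(round(average_power_mw(samples), 2))
```

Averaging per-sample power, rather than multiplying average current by average voltage, keeps the result correct when voltage sags under load.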
In addition, we collected dav1d decoding performance data with our partners and focused on the performance of dav1d on low-end mobile devices. The test video set includes multiple resolutions such as 720×536 and 960×480. Experiments showed that real-time decoding is a challenge on low-end mobile devices when the bit rate is high. How to reduce decoding complexity by optimizing the encoder is an issue that has always been considered, but until now codec engineers were unable to focus on it. Today, that all changes!
AV1 provides a wealth of coding tools, which makes it especially attractive not only because it is royalty-free but also because of the advanced coding technology it encompasses. The adoption of these tools is precisely what has increased complexity, and it has had a major impact on the decoder. For example, with Warped Motion, AV1 uses an affine transform for the first time to model complex motion, going beyond the traditional concept of two-dimensional motion vectors.
The parameters of the affine transformation are derived from the motion vectors of three surrounding blocks. A decoder is normally far less complex than the corresponding encoder, but once tools like Warped Motion are used, decoder complexity increases noticeably. This creates a new challenge for both building encoders and controlling decoding complexity. We’ve optimized our AV1 encoder to increase speed while retaining the advantages of the AV1 standard.
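To illustrate how three motion vectors can determine an affine model, here is a simplified floating-point sketch. It fits x' = a*x + b*y + c and y' = d*x + e*y + f exactly through three block centers and their motion vectors. This is not the normative AV1 derivation, which differs in detail. All function names, block positions, and motion vectors below are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, rhs):
    """Solve the 3x3 linear system a * x = rhs using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("block centers are collinear; model is undetermined")
    solution = []
    for col in range(3):
        # Replace one column of the matrix with the right-hand side.
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    return solution


def affine_from_motion_vectors(centers, mvs):
    """Return [a, b, c, d, e, f] fitted to three (center, motion vector) pairs."""
    system = [[x, y, 1.0] for x, y in centers]
    # Each block center moves to center + motion vector.
    targets_x = [x + mvx for (x, _), (mvx, _) in zip(centers, mvs)]
    targets_y = [y + mvy for (_, y), (_, mvy) in zip(centers, mvs)]
    return solve3(system, targets_x) + solve3(system, targets_y)


def warp(params, x, y):
    """Apply the affine model to one pixel position."""
    a, b, c, d, e, f = params
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical neighbors: a translation of (2, 1) combined with a slight zoom.
centers = [(0.0, 0.0), (16.0, 0.0), (0.0, 16.0)]
mvs = [(2.0, 1.0), (3.0, 1.0), (2.0, 2.0)]
params = affine_from_motion_vectors(centers, mvs)
print(params)
print(warp(params, 8.0, 8.0))
```

Unlike a single motion vector, the fitted model moves every pixel in the block by a different amount, which is what lets it follow zoom, rotation, and shear. It also means the decoder has to do more work per pixel.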
Building on this, we have jointly optimized image quality, bit rate, encoding speed, and decoding speed, and proposed the concept of DCA (Decoder Complexity Aware) encoding. With a joint CAE (Content-Adaptive Encoding) + DCA optimization strategy, AV1 can be deployed in practical scenarios at a time when it is supported only by software decoder solutions, so that end users can enjoy the advantages of AV1 as soon as possible. As a simple example, block sizes in AV1 range from 4×4 to 128×128. By avoiding partitions into blocks that are too large or too small, we can maintain sufficient coding efficiency while greatly reducing decoding complexity, so that the decoder can run in real time on low-end devices.
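The block-size example can be sketched in a few lines. This is a toy illustration and not the Aurora1 implementation. It assumes square block sizes only, and it assumes the encoder has a rate-distortion cost and an estimated decoding cost for each candidate size. The cost tables, the weight, and the function names are all hypothetical.

```python
# Square block sizes available in AV1, from 4x4 up to 128x128.
AV1_SQUARE_SIZES = [4, 8, 16, 32, 64, 128]


def allowed_sizes(min_size, max_size):
    """Block sizes the encoder may search under a DCA-style restriction."""
    return [s for s in AV1_SQUARE_SIZES if min_size <= s <= max_size]


def pick_block_size(candidates, rd_cost, decode_cost, weight):
    """Pick the size minimizing RD cost plus a weighted decoding-cost penalty."""
    return min(candidates, key=lambda s: rd_cost[s] + weight * decode_cost[s])


# Hypothetical per-size costs for one region of a frame, for illustration only.
rd_cost = {4: 90.0, 8: 95.0, 16: 100.0, 32: 110.0, 64: 130.0, 128: 170.0}
decode_cost = {4: 40.0, 8: 20.0, 16: 10.0, 32: 6.0, 64: 4.0, 128: 3.0}

# Unrestricted search that ignores the decoder entirely.
print(pick_block_size(AV1_SQUARE_SIZES, rd_cost, decode_cost, weight=0.0))

# Restricted, decoder-aware search over sizes 8x8 through 64x64.
print(pick_block_size(allowed_sizes(8, 64), rd_cost, decode_cost, weight=1.0))
```

Restricting the candidate list also shrinks the encoder's search space, so a rule like this can help encoding speed as well as decoding complexity.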
AV1 is a newer coding standard, and its ecosystem is still evolving and being established. We are happy to see that more and more video manufacturers and hardware manufacturers are announcing their support for the AV1 coding standard as the application of AV1 becomes more common. Visionular is the world’s leading AV1 solution provider. We are continuously innovating on our real-time AV1 implementation and our VOD-optimized codec, which is designed with the most advanced film-grain handling and premium video coding features.
If you have a Demuxed 2020 ticket, you can watch the full presentation along with the live Q&A here.
author:

Zoe Liu
President & Co-Founder, Visionular