AV1 Screen Content Coding

Zoe Liu
President & Co-Founder, Visionular
The application of AV1 in WebRTC is driving some of the largest-scale commercial implementations today. WebRTC supports four encoders: VP8 and VP9 (both via libvpx), H.264 (via OpenH264), and AV1 (via libaom RT, the real-time version of libaom). Note that VP9 is the direct predecessor of AV1, and libaom grew out of libvpx. A notable addition to the WebRTC encoder implementations is our Aurora1 AV1 encoder.
Because libaom RT inherits features directly from libvpx VP8 and VP9, both of which are already deployed for WebRTC use cases, it’s easy to see how AV1 fits the RTC application, starting with screen content coding (SCC). AV1’s SCC tools, notably the IntraBC and Palette modes, yield a much larger bitrate reduction at better quality. Cisco WebEx was among the first to announce AV1 support, replacing the aging H.264 codec.
Screen content differs from standard video in the RTC use case. For example, almost every pixel changes when switching from one slide to the next. However, once a slide is on screen, practically nothing changes until the following slide change.
Screen content also features a limited number of colors, graphics with sharp edges, and highly repetitive patterns. For example, alphabetical and alphanumeric characters may appear multiple times on the same frame (slide) with the same font and size. As a result, a frame rate as low as 5 FPS may be adequate to represent the shared screen, in contrast to standard webcam video that requires 30 FPS or gameplay content that requires 60 FPS.
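To illustrate why a limited color count helps, here is a minimal sketch of the idea behind palette coding: a block with only a few distinct pixel values can be stored as a short color table plus a map of small indices. This is not Aurora1's or libaom's implementation. The function names are hypothetical, and the default limit of 8 colors is an assumption chosen to mirror AV1's palette size limit.

```python
from collections import Counter


def palette_encode(block, max_colors=8):
    """Return (palette, index_map) for a block with few distinct values.

    Returns None when the block has more than max_colors distinct values,
    in which case a real encoder would fall back to another coding mode.
    """
    counts = Counter(px for row in block for px in row)
    if len(counts) > max_colors:
        return None
    # Most frequent colors first, so common colors get the smallest indices.
    palette = [color for color, _ in counts.most_common()]
    lookup = {color: i for i, color in enumerate(palette)}
    index_map = [[lookup[px] for px in row] for row in block]
    return palette, index_map


def palette_decode(palette, index_map):
    """Rebuild the original block from the palette and the index map."""
    return [[palette[i] for i in row] for row in index_map]


if __name__ == "__main__":
    # Hypothetical 2x4 block of black (0) and white (255) text pixels.
    text_block = [[0, 0, 255, 255],
                  [0, 0, 255, 0]]
    palette, index_map = palette_encode(text_block)
    print(palette, index_map)
```

A two-color text block needs only one bit per pixel in the index map, which is where the savings for sharp-edged text and graphics come from.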
AV1 is the first video codec standard that natively includes Screen Content Coding (SCC) tools, with over 100 tools to address multiple encoding situations. These tools are included in its main body, meaning that every AV1 decoder must support the SCC features to be compliant. Other codecs specify SCC coding tools but only in their extensions, meaning that not all decoders support them.
The quality impact of AV1 SCC tools is visible in a side-by-side comparison: x264 requires 800kbps, while Aurora1 needs just half the bits (400kbps) to produce demonstrably sharper image quality.


Combining AV1’s core coding tools with our intelligent algorithms delivers screen content coding with fifty percent fewer bits and noticeably higher quality. Applying the algorithms designed for standard video content to screen content is not effective, because the trade-off between quality and encoding efficiency is different for screen content.
Our team has worked hard to ensure that the Aurora1 AV1 encoder dynamically applies the best tools to screen content through a combination of the following techniques:
- Adaptive early terminations for screen content,
- Categorization and identification of screen content specific motions such as sliding and scrolling,
- Optimized IntraBC motion search,
- Palette coding speedup,
- SCC specific motion estimation optimization,
- Optimized hash matching.
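As a rough illustration of the hash matching idea behind IntraBC search, the sketch below finds blocks whose pixels already appeared earlier in the same frame, such as a repeated character in the same font and size. It is a simplified assumption-laden example and not Aurora1's algorithm: the function names are hypothetical, and it only checks blocks on a fixed grid, whereas a production encoder would typically hash candidate blocks at every pixel position of the already-coded area.

```python
def block_at(frame, y, x, size):
    """Extract a size x size block as a hashable tuple of tuples."""
    return tuple(tuple(frame[y + dy][x:x + size]) for dy in range(size))


def find_repeated_blocks(frame, size=4):
    """Map each grid block to an earlier grid block with identical pixels.

    Blocks are visited in raster order, so a match always points to a
    position that would already have been coded.
    """
    seen = {}     # block hash -> list of (y, x) positions already visited
    matches = {}  # (y, x) -> (ref_y, ref_x)
    height, width = len(frame), len(frame[0])
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            block = block_at(frame, y, x, size)
            key = hash(block)
            for ref in seen.get(key, []):
                # Compare pixels to guard against hash collisions.
                if block_at(frame, ref[0], ref[1], size) == block:
                    matches[(y, x)] = ref
                    break
            seen.setdefault(key, []).append((y, x))
    return matches


if __name__ == "__main__":
    # Hypothetical 4x4 "glyph" that appears twice side by side.
    glyph = [[1, 0, 0, 1],
             [0, 1, 1, 0],
             [0, 1, 1, 0],
             [1, 0, 0, 1]]
    frame = [row + row for row in glyph]
    print(find_repeated_blocks(frame))
```

The hash lookup replaces an exhaustive pixel-by-pixel search, which is why hash matching matters for keeping screen content encoding fast.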
Since AV1 grew out of VP9, let’s compare their abilities to partition a frame.
VP9 offers four partition types (a block can be left whole, split in two horizontally or vertically, or split into four squares that are analyzed separately). In comparison, AV1 has ten partition types, allowing the encoder to process different parts of the image more granularly.
The Aurora1 AV1 encoder also leverages AV1’s single and compound motion compensation modes. The new single motion mode, called “Warped Motion”, allows you to detect motion using four parameters. Counting compound mode, there are altogether 128 different methods to detect motion, which enables Aurora1 to stabilize an image and reduce unnecessary movement, lowering the number of constantly changing pixels.
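The partition comparison can be made concrete with a small sketch that lists how each of AV1's ten partition types divides a square block. The first four entries correspond to the types VP9 also offers. The function name and the (x, y, width, height) tuple layout are illustrative assumptions, and the short type names follow the usual AV1 naming with the PARTITION_ prefix dropped.

```python
def av1_partitions(size):
    """Map each partition type to its sub-blocks as (x, y, width, height)."""
    h = size // 2  # half of the block size
    q = size // 4  # quarter of the block size
    return {
        # The four types shared with VP9.
        "NONE":   [(0, 0, size, size)],
        "HORZ":   [(0, 0, size, h), (0, h, size, h)],
        "VERT":   [(0, 0, h, size), (h, 0, h, size)],
        "SPLIT":  [(0, 0, h, h), (h, 0, h, h), (0, h, h, h), (h, h, h, h)],
        # T-shaped types: one half is split again into two squares.
        "HORZ_A": [(0, 0, h, h), (h, 0, h, h), (0, h, size, h)],
        "HORZ_B": [(0, 0, size, h), (0, h, h, h), (h, h, h, h)],
        "VERT_A": [(0, 0, h, h), (0, h, h, h), (h, 0, h, size)],
        "VERT_B": [(0, 0, h, size), (h, 0, h, h), (h, h, h, h)],
        # Four equal strips.
        "HORZ_4": [(0, i * q, size, q) for i in range(4)],
        "VERT_4": [(i * q, 0, q, size) for i in range(4)],
    }


if __name__ == "__main__":
    for name, parts in av1_partitions(64).items():
        print(f"{name:7s} {len(parts)} sub-blocks: {parts}")
```

The extra T-shaped and strip types let an encoder follow thin horizontal or vertical structures, such as lines of text, without splitting all the way down to small squares.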
Take a look at Figure 1 below where Aurora1 is compared to x264 with 1080p30 screen content. Testing was performed on an Intel Core i7 processor with both encoders configured to use a single thread at 100% utilization.
As seen in Figure 1, across four videos, Aurora1 achieves a BD-rate (Overall PSNR) savings of 81.25% with a significant quality improvement (BD-PSNR of 13.95dB) while maintaining a constant 46 FPS, or 41 FPS with SVC turned on. This is more than 8x faster than screen content video requires, as many platforms encode screen content at as low as 5 FPS.
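The speed margin follows from simple arithmetic on the figures quoted above. The snippet below is only a worked calculation, and the function name is hypothetical.

```python
def realtime_headroom(encode_fps, required_fps):
    """Return how many times faster the encoder runs than the content needs."""
    if required_fps <= 0:
        raise ValueError("required_fps must be positive")
    return encode_fps / required_fps


if __name__ == "__main__":
    # 46 FPS without SVC and 41 FPS with SVC, against a 5 FPS requirement.
    for label, fps in (("no SVC", 46), ("SVC on", 41)):
        print(f"{label}: {realtime_headroom(fps, 5):.1f}x")
```

Both configurations stay above an 8x margin relative to a 5 FPS screen share.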
NOTE: In all comparisons, we used the following command options for x264: ffmpeg -threads 1 -r 30 -s 1920x1080 -c:v libx264 -x264-params bframes=0 -tune zerolatency -preset superfast -threads 1
In Figure 2 below, Aurora1 is compared to OpenH264. Across a wide test set of videos, Aurora1 achieved a greater than 50% bitrate reduction while operating just 38% slower. This shows that video engineers can confidently switch from the aging and less efficient OpenH264 encoder to AV1 and enjoy better quality and a tremendous reduction in bandwidth.
Let’s take a look at how Aurora1 compares to other AV1 implementations.
In Figure 3 below, Aurora1 is compared to libaom RT. Across a wide test set of videos, Aurora1 achieved a greater than 50% bitrate reduction while operating 14% faster than libaom RT.
For WebRTC applications that currently leverage VP8 or VP9, and where AV1 is on the roadmap, engineers now have a solution that operates with greater speed while providing a significant efficiency advantage over H.264 and libvpx implementations.
In Conclusion
Enabling SCC in Aurora1 can reduce screen content bitrates by more than 50%, or up to 500kbps, which is impossible with any other video codec standard or AV1 encoder implementation, including libaom RT.
What about speed? A common concern with AV1 and any next-gen codec standard is speed. While this argument has been around for some time, it’s somewhat outdated. Read this post to learn about how AV1 speed has improved. WebRTC testing across various platforms, including cloud (data center), desktop, and mobile, was conducted using the following settings and operational conditions:
- Video camera output encoded at 24 FPS with screen content at 12 FPS. Resolutions between 720p and 1080p.
- With standard screen content, Aurora1 preserved the original quality at 1080p and 100kbps. During intense screen content motion, the bitrate rarely exceeded 500kbps.
- CPU usage was reasonable, enabling smooth playback even on entry-level i5 PCs.
- With Aurora1’s Scalable Video Coding (SVC), the bitrate was 35% of that of OpenH264 or VP9 at the same visual quality.
AV1 offers an exciting set of tools for optimizing content for real-time delivery using WebRTC. With the Aurora1 encoder, you can achieve lower bitrates at the same quality while requiring less processing power, making AV1 a realistic option for any WebRTC application.