Learning-Based Multi-Frame Video Quality Enhancement

This paper, “Learning-Based Multi-Frame Video Quality Enhancement” by Junchao Tong, Xilin Wu, Dandan Ding, Zheng Zhu, and Zoe Liu, was presented at the IEEE International Conference on Image Processing (ICIP), September 22–25, 2019, in Taipei, Taiwan, and appears in the conference proceedings.
Convolutional neural networks (CNNs) have shown great success in video quality enhancement. Existing methods mainly operate in the spatial domain, exploiting pixel correlations within a single frame. Taking advantage of the similarity across successive frames, this paper presents a learning-based multi-frame approach that also leverages temporal correlation to push video quality enhancement further.
A high-level overview of how LMVE works:
LMVE is a novel approach that jointly leverages the spatial-temporal correlations among frames for better enhancement of compressed video. It categorizes the frames within a video into three quality levels and uses the high-quality and moderate-quality frames to enhance the low-quality ones in between. First, a learning-based optical flow is applied to compensate for the temporal motion across neighboring frames.
Next, a deep CNN, structured in an early-fusion manner, discovers the joint spatial-temporal correlations within the video. FlowNet is first adopted to estimate the optical flow between adjacent frames and generate motion-compensated frames. These compensated frames are then fed into the early-fusion CNN together with the original low-quality frames.
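As an illustrative sketch (not the authors' implementation), the motion-compensation and early-fusion steps can be expressed as warping each neighboring frame toward the target with a dense flow field, then stacking the results with the low-quality frame along the channel axis. The flow arrays here are hypothetical inputs standing in for FlowNet's output.

```python
import numpy as np

def warp(frame, flow):
    """Backward-warp a grayscale frame toward the target.

    frame: (H, W) image; flow: (H, W, 2) per-pixel (dy, dx) displacements.
    Nearest-neighbor sampling keeps the sketch short; a real pipeline
    would use bilinear interpolation on the estimated flow.
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip((ys + flow[..., 0]).round().astype(int), 0, h - 1)
    src_x = np.clip((xs + flow[..., 1]).round().astype(int), 0, w - 1)
    return frame[src_y, src_x]

def early_fusion_input(low_q, prev_ref, next_ref, flow_prev, flow_next):
    """Stack motion-compensated neighbors with the low-quality frame
    along a new channel axis -- the early-fusion input to the CNN."""
    comp_prev = warp(prev_ref, flow_prev)
    comp_next = warp(next_ref, flow_next)
    return np.stack([comp_prev, low_q, comp_next], axis=0)  # (3, H, W)
```

The stacked tensor would then be fed to the enhancement network; the actual CNN architecture and training details are described in the paper.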
To ensure the generality of our CNN model, we further propose a robust training strategy. One high-quality frame and one moderate-quality frame are paired to enhance the low-quality frames between them, balancing frame distance against reference-frame quality.
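A minimal sketch of the reference-selection idea, under the assumption that each frame carries a quality label ('H', 'M', or 'L'); the paper's actual pairing rule weighs distance and quality more carefully, so this is illustrative only.

```python
def pick_references(labels, t):
    """For the low-quality frame at index t, return the indices of the
    nearest high-quality ('H') and nearest moderate-quality ('M') frames.
    labels: sequence of 'H' / 'M' / 'L' quality tags, one per frame.
    """
    def nearest(tag):
        candidates = [i for i, q in enumerate(labels) if q == tag]
        return min(candidates, key=lambda i: abs(i - t))
    return nearest('H'), nearest('M')
```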
Experimental results demonstrate that LMVE delivers consistently superior results, outperforming prior work by 0.23 dB in PSNR on average.
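For reference, PSNR, the metric behind the reported 0.23 dB average gain, can be computed as follows:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and an
    enhanced/compressed frame; higher is better."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```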
The code and model for the LMVE approach are published on GitHub at https://github.com/IVC-Projects/LMVE.




At Foothill Ventures, we believe in startup companies that ride the transformative power of major technology shifts such as deep learning in computer vision. Visionular’s founders are world-class technologists in their field of video codec and AI-driven optimization. We feel privileged to support their adventure with our resources and experience.

Dr. Xuhui Shao
Managing Partner, Foothill Ventures

I invested in Visionular because the team is at the forefront of innovations in video encoding and image processing for real-time low latency video communications and premium video streaming applications.

Tony Zhao
Co-Founder & CEO, Agora.io
