Unravelling the Power of Single-Pass Look-Ahead in Modern Codecs for Optimized Transcoding Deployment


Event Time

Originally Aired - Monday, April 15   |   11:30 AM - 11:50 AM PT

Event Location

Pass Required: Core Education Collection Pass

Abstract

The ever-increasing demand for online video content has led to the emergence of technologies aimed at reducing transcoding costs in both on-premise and cloud-based environments. In a typical video workflow, which includes transcoding, metadata parsing, and streaming playback, transcoding consumes a significant share of available resources. For optimal video streaming we need fast encoding algorithms that produce the highest quality at the lowest possible bitrate.

At the moment it is accepted industry lore that the best quality/bitrate tradeoff is only achievable through multipass encoding [1, 2, 3]. Multipass encoding incurs substantial computational cost and is poorly suited to live streaming. Because of the increasing complexity of modern codecs (AV1, HEVC, VVC) and the demand for encoders in live broadcast applications [4], more effort has been put into developing optimal single-pass encoding schemes [5, 6]. In fact, these schemes, which involve "lookahead", have evolved significantly: they leverage metadata extracted from upcoming frames (e.g., motion information, rate-distortion tradeoff, luma plane histogram) to inform encoding decisions and the coding process [6]. This results in lower computational cost compared to multipass encoding. It is now widely suspected that these single-pass schemes may be competitive with multipass encoding, but no quantitative work has yet been conducted to explore this.
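The difference between the two approaches can be illustrated with a toy bit-allocation model. This is a minimal sketch and not how SVT-AV1, x265, or any other production encoder implements rate control: the function names, the per-frame `complexity` values, and the proportional allocation rule are all assumptions made for illustration. A two-pass scheme sees the complexity of every frame before allocating bits, while a single-pass scheme with lookahead sees only the next `window` frames.

```python
from typing import List


def two_pass_allocation(complexity: List[float], total_bits: float) -> List[float]:
    """Hypothetical two-pass view: all frame complexities are known up front,
    so bits are shared in proportion to complexity over the whole sequence."""
    total = sum(complexity)
    return [total_bits * c / total for c in complexity]


def lookahead_allocation(
    complexity: List[float], total_bits: float, window: int
) -> List[float]:
    """Hypothetical single-pass view: at frame i only frames i..i+window-1
    are visible. Complexities are assumed to be strictly positive."""
    n = len(complexity)
    remaining = total_bits
    allocation = []
    for i, c in enumerate(complexity):
        visible = complexity[i : i + window]
        frames_left = n - i
        # Give the visible window its proportional share of the remaining budget,
        # then split that share by complexity inside the window.
        window_budget = remaining * len(visible) / frames_left
        bits = window_budget * c / sum(visible)
        allocation.append(bits)
        remaining -= bits
    return allocation


if __name__ == "__main__":
    # Placeholder complexities (not measurements): one hard frame among easy ones.
    frames = [1.0, 1.0, 4.0, 1.0, 1.0]
    print("two-pass    :", two_pass_allocation(frames, 800.0))
    for w in (1, 2, 5):
        print(f"lookahead {w}:", lookahead_allocation(frames, 800.0, w))
```

In this toy model a window of one frame degenerates to a uniform allocation, and a window covering the whole sequence reproduces the two-pass allocation; intermediate windows fall in between, which is the intuition behind asking how much lookahead is enough.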

In this paper we make three contributions:

  a)  We provide a technical overview of single-pass encoding systems, highlighting the key features that provide the most coding efficiency. We consider production-ready implementations such as SVT-AV1 for AV1 and x265 for HEVC.

  b)  Using a practical dataset (containing HD and 4K content at 25/60 fps), we test performance under different lookahead (single-pass) and multi-pass settings at different speed presets. The key idea here is to evaluate production-ready systems with their available parameterisation. We therefore consider AWS MediaConvert and offline encoding solutions (hardware and software).

  c)  We evaluate performance using both visual quality and computational load, thereby showing the different compromise regions possible in the quality/rate/compute volume.

By examining production-ready systems, we are able to make recommendations for choosing and parameterising production-ready single-pass and multi-pass workflows. We expect that this work will help to further the development of single-pass encoders.
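One way to read "compromise regions in the quality/rate/compute volume" is as a set of non-dominated operating points. The following is a minimal sketch under that assumption; the `OperatingPoint` type, its field names, and the choice of a Pareto filter are hypothetical and are not taken from the paper. Each point is an encoder configuration summarised by bitrate, a quality score, and encode time.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical summary of one encoder configuration."""
    label: str
    bitrate_kbps: float    # lower is better
    quality: float         # higher is better (any objective quality score)
    encode_seconds: float  # lower is better


def dominates(a: OperatingPoint, b: OperatingPoint) -> bool:
    """True if `a` is at least as good as `b` on every axis and strictly
    better on at least one."""
    no_worse = (
        a.bitrate_kbps <= b.bitrate_kbps
        and a.quality >= b.quality
        and a.encode_seconds <= b.encode_seconds
    )
    strictly_better = (
        a.bitrate_kbps < b.bitrate_kbps
        or a.quality > b.quality
        or a.encode_seconds < b.encode_seconds
    )
    return no_worse and strictly_better


def pareto_front(points: List[OperatingPoint]) -> List[OperatingPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Placeholder values for illustration only; these are not measured results.
    candidates = [
        OperatingPoint("config-A", 1000.0, 90.0, 10.0),
        OperatingPoint("config-B", 1000.0, 90.0, 20.0),
        OperatingPoint("config-C", 800.0, 85.0, 30.0),
        OperatingPoint("config-D", 1200.0, 95.0, 5.0),
    ]
    for point in pareto_front(candidates):
        print(point.label)
```

A filter like this only identifies which configurations are worth considering; choosing among the surviving points still depends on the relative cost of bitrate, quality, and compute in a given deployment.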

References

[1]: P. H. Westerink, R. Rajagopalan and C. A. Gonzales, "Two-pass MPEG-2 variable-bit-rate encoding," in IBM Journal of Research and Development, vol. 43, no. 4, pp. 471-488, July 1999, doi: 10.1147/rd.434.0471.
[2]: Y. -C. Lin, H. Denman and A. Kokaram, "Multipass encoding for reducing pulsing artifacts in cloud based video transcoding," 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 2015, pp. 907-911, doi: 10.1109/ICIP.2015.7350931.
[3]: I. Zupancic, E. Izquierdo, M. Naccari and M. Mrak, "Two-pass rate control for UHDTV delivery with HEVC," 2016 Picture Coding Symposium (PCS), Nuremberg, Germany, 2016, pp. 1-5, doi: 10.1109/PCS.2016.7906322.
[4]: Y. Reznik, J. Cenzano and B. Zhang, "Transitioning Broadcast to Cloud," in SMPTE Motion Imaging Journal, vol. 130, no. 9, pp. 18-32, Oct. 2021, doi: 10.5594/JMI.2021.3106162.
[5]: G. Kim, K. Yi and C. -M. Kyung, "A Content-Aware Video Encoding Scheme Based on Single-Pass Consistent Quality Control," in IEEE Transactions on Broadcasting, vol. 62, no. 4, pp. 800-816, Dec. 2016, doi: 10.1109/TBC.2016.2569999.
[6]: F. Kossentini, H. Guermazi, N. Mahdi, C. Nouira, A. Naghdinezhad, H. Tmar, O. Khlif, P. Worth and F. Ben Amara, "The SVT-AV1 encoder: overview, features and speed-quality tradeoffs," Proc. SPIE 11510, Applications of Digital Image Processing XLIII, 1151021 (21 August 2020).

Biography

Vibhoothi is a PhD student and Research Assistant with the Sigmedia Group at Trinity College Dublin. He has been an active member of and collaborator with various open-standards bodies and organisations, including the Alliance for Open Media (AOM), VideoLAN, and Xiph.Org. He has chaired the IEEE Student Branch of Trinity College Dublin since 2022.

François Pitié is an Assistant Professor in Media Signal Processing at Trinity College Dublin. He has made several key contributions in the fields of Video Processing and Computer Vision, with more than 1500 citations. He is a reviewer for first-tier conferences and journals. He holds multiple patents and has developed algorithms that are used by companies such as Google, Disney, The Foundry and Weta Digital, and by post-production artists globally.

Julien Zouein is a research assistant and a PhD student under Prof. Anil Kokaram. He was an ML Research Engineer for Shadow, a cloud computing company based in Paris. He co-founded Kyber, a company developing an open-source real-time video streamer for remote interactions with machines. Julien is also part of VideoLAN.

Anil Kokaram is Chair of Electronics at Trinity College Dublin, Ireland, where he founded sigmedia.tv in 1998. From 2011 to 2017 he led the Media Algorithms Team at YouTube/Google. In 2011 Google acquired his startup GreenParrotPictures, which was producing video enhancement algorithms. In 2007 he was honoured with a Science and Engineering Academy Award from the American Academy of Motion Picture Arts and Sciences for work in post-production technology development.


Presented as part of:

Generative AI Uses and Video Transcoding


Speakers

Vibhoothi Vibhoothi
Research Assistant
Trinity College Dublin