Automatic loudness control in short-form content for broadcasting

2017, The Journal of the Acoustical Society of America

https://doi.org/10.1121/1.4978023

Abstract

During the early years of the International Telecommunication Union (ITU) loudness calculation standard for sound broadcasting [ITU-R (2006), Rec. BS Series, 1770], the need for additional loudness descriptors to evaluate short-form content, such as commercials and live inserts, was identified. This work proposes a loudness control scheme that prevents loudness jumps, which can annoy audiences. The scheme combines audio detection of short-form content with dynamic range processing designed to meet a maximum loudness level criterion. Detection is achieved by combining principal component analysis for dimensionality reduction with support vector machines for binary classification. The subsequent processing is based on short-term loudness integrators and Hilbert transformers. Performance was assessed using classification quality metrics and demonstrated through a loudness control example.
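
As a rough illustration of the short-term loudness integration step mentioned in the abstract, the following minimal sketch computes a sliding-window level and the gain reduction needed to stay under a maximum level. It is not the authors' implementation. Several things here are assumptions: the function names `short_term_loudness` and `limiting_gain_db` are hypothetical, the signal is mono, the K-weighting pre-filter of ITU-R BS.1770 is omitted for brevity (so the values are only LUFS-like), and the 3 s window and the -18 default ceiling are illustrative parameters rather than values stated in the abstract.

```python
import math


def short_term_loudness(samples, sample_rate, window_s=3.0, hop_s=0.1):
    """Sliding-window level in dB for a mono signal (LUFS-like).

    ASSUMPTION: the K-weighting filter of ITU-R BS.1770 is NOT applied,
    so this is only a simplified stand-in for a real loudness meter.
    """
    win = int(round(window_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    if win <= 0 or hop <= 0:
        raise ValueError("window and hop must span at least one sample")

    levels = []
    for start in range(0, len(samples) - win + 1, hop):
        block = samples[start:start + win]
        mean_square = sum(x * x for x in block) / win
        if mean_square > 0.0:
            # Same form as the BS.1770 loudness formula for one channel.
            levels.append(-0.691 + 10.0 * math.log10(mean_square))
        else:
            levels.append(float("-inf"))  # digital silence
    return levels


def limiting_gain_db(levels, max_level=-18.0):
    """Gain (dB, never positive) that brings each window down to max_level.

    ASSUMPTION: max_level is an illustrative ceiling chosen for this sketch.
    """
    return [min(0.0, max_level - level) for level in levels]
```

For example, a full-scale sine analysed this way gives a level near -3.7 dB, so `limiting_gain_db` would request roughly 14.3 dB of attenuation. A practical controller would also smooth the gain over time (attack and release) rather than apply it per window.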

References (16)

  1. Abe, S. (2005). Support Vector Machines for Pattern Classification (Springer, London, UK).
  2. Dietterich, T. G. (1998). "Approximate statistical tests for comparing supervised classification learning algorithms," Neural Comp. 10, 1895-1923.
  3. Duda, R. O., Hart, P. E., and Stork, D. G. (2012). Pattern Classification (John Wiley & Sons, Hoboken, NJ).
  4. EBU (2010). Tech 3341, "'EBU Mode' metering to supplement loudness normalisation" (European Broadcasting Union, Geneva).
  5. EBU (2014). R128-s1-2016, "Loudness parameters for short-form content (advertisements, promos, etc.)" (European Broadcasting Union, Geneva).
  6. Giannoulis, D., Massberg, M., and Reiss, J. D. (2012). "Digital dynamic range compressor design-A tutorial and analysis," J. Audio Eng. Soc. 60, 399-408.
  7. ITU-R (2006). BS. 1770, "Algorithms to measure audio programme loudness and true-peak audio level" (International Telecommunication Union, Geneva).
  8. Moore, B. C. J., and Glasberg, B. R. (2002). "A model of loudness applicable to time-varying sounds," J. Audio Eng. Soc. 50, 331-342.
  9. Oppenheim, A. V., and Schafer, R. W. (2010). Discrete-time Signal Processing, 3rd ed. (Pearson Higher Education, New York).
  10. Schuller, B. W. (2013). Intelligent Audio Analysis (Springer, New York).
  11. Skovenborg, E., and Nielsen, S. H. (2004). "Evaluation of different loudness models with music and speech material," in Audio Engineering Society Convention 117, San Francisco, CA.
  12. Soulodre, G. A. (2004). "Evaluation of objective loudness meters," in Audio Engineering Society Convention 116, Berlin, Germany.
  13. Vickers, E. (2010). "The loudness war: Background, speculation, and recommendations," in Audio Engineering Society Convention 129, San Francisco, CA.
  14. Vyas, A., Kannao, R., Bhargava, V., and Guha, P. (2014). "Commercial block detection in broadcast news videos," in Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing, ACM, p. 63.
  15. Zölzer, U., ed. (2002). DAFX: Digital Audio Effects, Vol. 1 (John Wiley & Sons, Hoboken, NJ).
  16. Zwicker, E. (1977). "Procedure for calculating loudness of temporally variable sounds," J. Acoust. Soc. Am. 62, 675-682.