Perceptual Coding of Digital Audio
TED PAINTER, STUDENT MEMBER, IEEE AND ANDREAS SPANIAS, SENIOR MEMBER, IEEE
Abstract
During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network,
wireless, and multimedia computing systems face a series of
constraints such as reduced channel bandwidth, limited storage capacity,
and low cost. These new applications have created a demand
for high-quality digital audio delivery at low bit rates. In
response to this need, considerable research has been devoted to
the development of algorithms for perceptually transparent coding
of high-fidelity (CD-quality) digital audio. As a result, many algorithms
have been proposed, and several have now become international
and/or commercial product standards. This paper reviews
algorithms for perceptually transparent coding of CD-quality digital
audio, including both research and standardization activities.
This paper is organized as follows. First, psychoacoustic principles
are described, with the MPEG psychoacoustic signal analysis
model 1 discussed in some detail. Next, filter bank design issues
and algorithms are addressed, with a particular emphasis placed
on the modified discrete cosine transform, a perfect reconstruction
cosine-modulated filter bank that has become of central importance
in perceptual audio coding. Then, we review methodologies that
achieve perceptually transparent coding of FM- and CD-quality
audio signals, including algorithms that manipulate transform
components, subband signal decompositions, sinusoidal signal
components, and linear prediction parameters, as well as hybrid
algorithms that make use of more than one signal model. These
discussions concentrate on architectures and applications of those
techniques that utilize psychoacoustic models to exploit efficiently
masking characteristics of the human receiver. Several algorithms
that have become international and/or commercial standards
receive in-depth treatment, including the ISO/IEC MPEG family
(1, 2, 4), the Lucent Technologies PAC/EPAC/MPAC, the
Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Then,
we describe subjective evaluation methodologies in some detail,
including the ITU-R BS.1116 recommendation on subjective
measurements of small impairments. This paper concludes with a
discussion of future research directions.
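Summary of the abstract:
Over the past decade, CD-quality digital audio has largely replaced analog audio. Emerging digital audio applications in network, wireless, and multimedia computing systems face a series of constraints, such as limited channel bandwidth, limited storage capacity, and low cost. These new applications call for high-quality digital audio at low bit rates. To meet this need, a great deal of research has gone into perceptual coding of high-quality digital audio. As a result, many algorithms have been proposed, and some have become commercial product standards. This paper reviews perceptual coding of CD-quality digital audio, covering both research and standardization work.
This paper is organized as follows. First, it describes psychoacoustic principles and discusses MPEG psychoacoustic model 1. Next, it discusses filter bank design issues and algorithms, with particular emphasis on the modified discrete cosine transform. Then, it reviews methods that achieve perceptually transparent coding of FM- and CD-quality audio. It also discusses several current international standards. Finally, it outlines directions for future research.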
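The abstract singles out the modified discrete cosine transform (MDCT) as a perfect reconstruction filter bank. As a rough illustration of what "perfect reconstruction" means here, the following is a minimal sketch and not code from the paper. It assumes a sine window, 50% overlapped frames, and a 2/N scaling on the inverse transform. The function names (sine_window, cosine_basis, mdct, imdct) are made up for this example, and the direct O(N^2) sums stand in for the fast algorithms a real coder would use.

```python
import math


def sine_window(N):
    # Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1,
    # the condition needed for overlap-add reconstruction.
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def cosine_basis(N):
    # N basis functions of length 2N: cos(pi/N * (n + 1/2 + N/2) * (k + 1/2)).
    return [[math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
             for n in range(2 * N)]
            for k in range(N)]


def mdct(x, N):
    # Forward transform: each hop of N new samples yields N coefficients.
    # The signal is padded with N zeros on both sides so that every input
    # sample is covered by exactly two overlapping frames.
    if len(x) % N != 0:
        raise ValueError("signal length must be a multiple of N")
    w = sine_window(N)
    C = cosine_basis(N)
    padded = [0.0] * N + list(x) + [0.0] * N
    coeffs = []
    for start in range(0, len(padded) - N, N):
        frame = [padded[start + n] * w[n] for n in range(2 * N)]
        coeffs.append([sum(C[k][n] * frame[n] for n in range(2 * N))
                       for k in range(N)])
    return coeffs


def imdct(coeffs, N):
    # Inverse transform with windowed overlap-add. The time-domain aliasing
    # in each frame is cancelled by the neighbouring frame.
    w = sine_window(N)
    C = cosine_basis(N)
    out = [0.0] * ((len(coeffs) + 1) * N)
    for i, X in enumerate(coeffs):
        for n in range(2 * N):
            s = sum(C[k][n] * X[k] for k in range(N))
            out[i * N + n] += (2.0 / N) * w[n] * s
    return out[N:-N]  # drop the zero padding added by mdct


# Round trip on a short test signal (arbitrary values chosen for this sketch).
signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
coeffs = mdct(signal, 8)
restored = imdct(coeffs, 8)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

A single frame cannot be inverted on its own, since 2N windowed samples are mapped to only N coefficients. The original samples reappear only after adjacent inverse frames are overlapped and added, at which point max_error should be at the level of floating-point rounding.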