sophiazhang08's personal blog http://blog.sciencenet.cn/u/sophiazhang08

Perceptual Coding of Digital Audio

Posted 2014-10-9 01:14 | Category: Paper Exchange

Perceptual Coding of Digital Audio

TED PAINTER, STUDENT MEMBER, IEEE AND ANDREAS SPANIAS, SENIOR MEMBER, IEEE


Abstract



During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities.

This paper is organized as follows. First, psychoacoustic principles are described, with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the modified discrete cosine transform, a perfect reconstruction cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of those techniques that utilize psychoacoustic models to exploit efficiently masking characteristics of the human receiver. Several algorithms that have become international and/or commercial standards receive in-depth treatment, including the ISO/IEC MPEG family (1, 2, 4), the Lucent Technologies PAC/EPAC/MPAC, the Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Then, we describe subjective evaluation methodologies in some detail, including the ITU-R BS.1116 recommendation on subjective measurements of small impairments. This paper concludes with a discussion of future research directions.



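Summary of the abstract:

Over the last decade, CD-quality digital audio has largely replaced analog audio. Emerging digital audio applications in network, wireless, and multimedia computing systems face a series of constraints, such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications call for high-quality digital audio at low bit rates. To meet this need, a great deal of research has gone into perceptual coding of high-quality digital audio. As a result, many algorithms have been proposed, and some have become international or commercial product standards. The paper reviews perceptual coding of CD-quality digital audio, covering both research and standardization work.


The paper is organized as follows. First, it describes psychoacoustic principles and discusses MPEG psychoacoustic model 1. Next, it addresses filter bank design issues and algorithms, with particular emphasis on the modified discrete cosine transform. It then reviews methods that achieve perceptually transparent coding of FM- and CD-quality audio, and covers several current international standards. Finally, it discusses future research directions.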
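
The abstract calls the modified discrete cosine transform a perfect reconstruction filter bank. As a reader's note, here is a minimal sketch of what that property means in practice. It is not code from the paper: the sine window, the 2/N synthesis scaling, the zero padding, and all function names are my own assumptions for illustration. Each frame of 2N samples yields only N coefficients, and the time-domain aliasing this introduces cancels when neighbouring frames are overlap-added.

```python
import math


def sine_window(n_coeffs):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    length = 2 * n_coeffs
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(frame, window):
    """Forward MDCT: 2N windowed samples in, N coefficients out."""
    n_coeffs = len(frame) // 2
    phase = 0.5 + n_coeffs / 2.0
    return [
        sum(
            window[n] * frame[n]
            * math.cos(math.pi / n_coeffs * (n + phase) * (k + 0.5))
            for n in range(2 * n_coeffs)
        )
        for k in range(n_coeffs)
    ]


def imdct(coeffs, window):
    """Inverse MDCT: N coefficients in, 2N windowed (still aliased) samples out."""
    n_coeffs = len(coeffs)
    phase = 0.5 + n_coeffs / 2.0
    scale = 2.0 / n_coeffs  # assumed scaling for a window applied twice
    return [
        window[n] * scale * sum(
            coeffs[k] * math.cos(math.pi / n_coeffs * (n + phase) * (k + 0.5))
            for k in range(n_coeffs)
        )
        for n in range(2 * n_coeffs)
    ]


def analyze_and_resynthesize(signal, n_coeffs):
    """Run MDCT then IMDCT on 50%-overlapped frames and overlap-add the result."""
    window = sine_window(n_coeffs)
    # Pad with one block of zeros on each side so every signal block
    # is covered by exactly two overlapping frames.
    tail = (-len(signal)) % n_coeffs
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * (tail + n_coeffs)
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - n_coeffs, n_coeffs):
        frame = padded[start:start + 2 * n_coeffs]
        block = imdct(mdct(frame, window), window)
        for n, value in enumerate(block):
            output[start + n] += value  # aliasing cancels in the overlap
    return output[n_coeffs:n_coeffs + len(signal)]
```

Calling `analyze_and_resynthesize(signal, 8)` on a short list of floats should return the input up to floating-point rounding. A real coder would quantize the coefficients between the two transforms, which is where the psychoacoustic model comes in; this sketch skips that step entirely.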



perceptual_coding.pdf



 



https://blog.sciencenet.cn/blog-819588-834118.html
