This is a doctoral thesis from the University of Erlangen-Nuremberg, Germany, 184 pages in total.
Music signals are complex. When musicians play together, the sounds of their instruments superimpose and form a single complex sound mixture. Furthermore, even the sound of a single instrument may already comprise sound components of harmonic, percussive, noise-like, and transient nature, among others. The complexity of music signal processing tasks is therefore often directly derived from the complexity of music signals themselves. Two examples of such tasks are time-scale modification (stretching or compressing the duration of a music signal) and music source separation (separating a music recording into signals that correspond to the individual instruments).

In this thesis, our goal is to explore novel ways of approaching music signal processing tasks. One of our core ideas is to reduce a task’s complexity by decomposing a given music signal into a set of two or more mid-level components and then processing these components individually. Depending on the audio decomposition technique, a mid-level component may reflect certain aspects of the music signal, such as its harmonic or percussive sounds. This explicit interpretation often allows us to apply more specialized methods for processing the mid-level components. In a last step, the processed component signals are recombined to form a global result.

As part of our contributions, we propose various novel audio decomposition techniques for splitting a music signal into mid-level components. For example, we present a method for decomposing a signal into three components that contain the signal’s harmonic, percussive, and noise-like sounds, respectively. Furthermore, we apply the general strategy described previously to approach different tasks in the fields of digital signal processing and music information retrieval. In particular, we propose novel procedures for time-scale modification, singing voice separation, vibrato analysis, and audio mosaicing. Built upon these methods, we additionally present various prototype user interfaces and tools for analyzing, modifying, editing, and synthesizing music signals.
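The decompose, process, and recombine strategy described in the abstract can be illustrated with a small sketch. The abstract does not specify the thesis's own algorithms, so this example swaps in a well-known stand-in: harmonic-percussive separation by median filtering a spectrogram. All function names, the window length, the filter size, and the gains used as the "processing" step are assumptions made for illustration only.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import stft, istft


def decompose_harmonic_percussive(x, fs, nperseg=1024, kernel=17):
    """Split x into a harmonic and a percussive component (illustrative)."""
    _, _, X = stft(x, fs=fs, nperseg=nperseg)  # X has shape (freq, time)
    mag = np.abs(X)
    # Harmonic sounds are stable over time (horizontal spectrogram ridges),
    # percussive sounds are short and broadband (vertical ridges).
    harm = median_filter(mag, size=(1, kernel))  # median along time
    perc = median_filter(mag, size=(kernel, 1))  # median along frequency
    eps = np.finfo(float).eps
    mask_h = harm / (harm + perc + eps)  # soft mask in [0, 1]
    mask_p = 1.0 - mask_h                # the two masks sum to one
    _, x_h = istft(mask_h * X, fs=fs, nperseg=nperseg)
    _, x_p = istft(mask_p * X, fs=fs, nperseg=nperseg)
    # The inverse transform may be zero-padded, so trim to the input length.
    return x_h[: len(x)], x_p[: len(x)]


def process_and_recombine(x, fs, process_h, process_p):
    """Process each mid-level component separately, then sum the results."""
    x_h, x_p = decompose_harmonic_percussive(x, fs)
    return process_h(x_h) + process_p(x_p)


# Synthetic test signal: a steady 440 Hz tone plus a few clicks.
fs = 22050
t = np.arange(fs) / fs
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
clicks = np.zeros_like(tone)
clicks[:: fs // 4] = 1.0
x = tone + clicks

x_h, x_p = decompose_harmonic_percussive(x, fs)

# Placeholder processing step: keep the harmonic part, attenuate the clicks.
y = process_and_recombine(x, fs, lambda h: h, lambda p: 0.5 * p)
```

In a real application, the two lambdas would be replaced by component-specific methods, for instance a different time-scale modification procedure for each component. Because the two soft masks sum to one, the components add back up to the input signal when no processing is applied.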