Very low bit rate parametric audio coding [Electronic resource] / by Heiko Purnhagen

gottfried_wilhelm_leibniz_universitat_hannover - Dipl.-Ing. Heiko Purnhagen, University of Hannover, <Purnhage@Tnt.Uni-Hannover.De>

153 pages

English

Subjects

Computer science

Information

Published by	gottfried_wilhelm_leibniz_universitat_hannover
Published on	1 January 2008
Number of reads	7
Language	English
File size	1 MB

Excerpt

Very Low Bit Rate
Parametric Audio Coding

Dissertation approved by the Faculty of Electrical Engineering and Computer Science
of Gottfried Wilhelm Leibniz Universität Hannover
for the academic degree of
Doktor-Ingenieur

by
Dipl.-Ing. Heiko Purnhagen
born on 2 April 1969 in Bremen

2008

First examiner: Prof. Dr.-Ing. H. G. Musmann
Second examiner: Prof. Dr.-Ing. U. Zölzer
Date of doctorate: 28 November 2008
Acknowledgments

This thesis originates from the work I did as a member of the research staff at the Information Technology Laboratory of the University of Hannover.

First of all, I would like to thank my supervisor, Professor Musmann, for the opportunity to work in the inspiring environment of his institute and the Information Technology Laboratory, for enabling me to participate in the MPEG standardization activities, for the freedom he gave me to pursue my own ideas, and for everything I learned during these years. I would also like to thank Professor Zölzer and Professor Ostermann for being on my committee.

I’m very grateful for the fruitful interactions I had with my colleagues and students. In particular, I would like to thank Bernd Edler, Nikolaus Meine, Charalampos Ferekidis, Andree Buschmann, and Gabriel Gaus, who all contributed, in their own way, to the success of this work. I would also like to thank Frank Feige and Bernhard Feiten (Deutsche Telekom, Berlin) and Torsten Mlasko (Bosch, Hildesheim) for good cooperation and research funding. My work was closely related to the MPEG-4 standardization activities, and I would like to thank Schuyler Quackenbush, Masayuki Nishiguchi, and Jürgen Herre for their support in MPEG.

Many thanks go to Lars “Stockis” Liljeryd, Martin Dietz, Lars Gillner, and all my colleagues at Coding Technologies (now Dolby) for their patience, confidence, and support during the time I needed to finalize this thesis.

Furthermore, I would also like to thank all those people who continue to create and to craft sounds that help me believe that there are still audio signals around that are worthwhile to deal with. These are people like Joachim Deicke, Paul E. Pop, Bugge Wesseltoft, Sofia Jernberg, Fredrik Ljungkvist, Joel Grip, Paal Nilssen-Love, and many other musicians and artists.

Last but not least, I would like to thank my parents and friends for all their support.

Stockholm, December 2008
There is one way to understand another culture. To live it. To move into it, to ask to be tolerated as a guest, to learn the language. At some point, understanding may then come. It will always be wordless. The moment one grasps what is foreign, one loses the urge to explain it. To explain a phenomenon is to move away from it.
Peter Høeg: Frøken Smillas fornemmelse for sne (1992)
Summary (Kurzfassung)

This thesis presents a parametric audio coding method for very low bit rates. It is based on a generalized approach that unites different source models in a hybrid model and thus enables the flexible use of a broad range of source and perceptual models. The parametric audio coding method developed here permits efficient coding of arbitrary audio signals at bit rates in the range of about 6 to 16 kbit/s.

Using a hybrid source model requires the audio signal to be decomposed into components, each of which can be adequately modeled by one of the available source models. Each component is described by a set of model parameters of its source model. The parameters of all components are quantized and coded and then transmitted as a bit stream from the encoder to the decoder. In the decoder, the component signals are synthesized again according to the transmitted parameters and then combined to obtain the output signal.

The hybrid source model developed here combines sinusoids, harmonic tones, and noise components and has an extension for describing fast signal transients. The encoder uses robust algorithms for automatically decomposing the input signal into components and for estimating the model parameters of these components. A perceptual model in the encoder controls the signal decomposition and selects the perceptually most important components for transmission. Special coding techniques exploit the statistical dependencies and properties of the quantized parameters for efficient transmission.

The parametric approach makes it possible to extend the coding method with additional functions. Signal synthesis in the decoder allows playback speed and pitch to be changed independently of each other. Bit rate scalability is achieved by transmitting the most important components in a base bit stream and further components in enhancement bit streams. Robustness against error-prone transmission channels is achieved by unequal error protection and by techniques for minimizing error propagation and for error concealment.

The resulting coding method was standardized as the Harmonic and Individual Lines plus Noise (HILN) parametric audio coder in the international MPEG-4 Audio standard. Listening tests show that, at 6 and 16 kbit/s, HILN achieves an audio quality comparable to that of established transform-based audio coders.

Keywords (Schlagworte): parametric audio coding, signal decomposition, parameter estimation, source model, perceptual model, MPEG-4 HILN
Abstract

In this thesis, a parametric audio coding system for very low bit rates is presented. It is based on a generalized framework that combines different source models into a hybrid model and thereby permits flexible utilization of a broad range of source and perceptual models. The developed parametric audio coding system allows efficient coding of arbitrary audio signals at bit rates in the range of approximately 6 to 16 kbit/s.

The use of a hybrid source model requires that the audio signal be decomposed into a set of components, each of which can be adequately modeled by one of the available source models. Each component is described by a set of model parameters of its source model. The parameters of all components are quantized and coded and then conveyed as a bit stream from the encoder to the decoder. In the decoder, the component signals are resynthesized according to the transmitted parameters. By combining these signals, the output signal of the parametric audio coding system is obtained.
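The parameter path described in this paragraph (describe, quantize, transmit, resynthesize, combine) can be illustrated with a minimal Python sketch. It is not the coder developed in the thesis: the `Sinusoid` class, the function names, the 10 Hz and 1.5 dB quantizer steps, and the 16 kHz sampling rate are all assumptions chosen for illustration, and the "bit stream" is simply a list of integer index pairs without entropy coding.

```python
import math
from dataclasses import dataclass


@dataclass
class Sinusoid:
    """One component, described by the parameters of its source model."""
    freq_hz: float
    amplitude: float


def quantize(component, freq_step_hz=10.0, amp_step_db=1.5):
    """Map parameters to integer indices (uniform in Hz and in dB).

    The step sizes are illustrative assumptions, not values from the thesis.
    """
    freq_index = round(component.freq_hz / freq_step_hz)
    amp_index = round(20.0 * math.log10(component.amplitude) / amp_step_db)
    return freq_index, amp_index


def dequantize(indices, freq_step_hz=10.0, amp_step_db=1.5):
    """Reconstruct component parameters from the transmitted indices."""
    freq_index, amp_index = indices
    amplitude = 10.0 ** (amp_index * amp_step_db / 20.0)
    return Sinusoid(freq_index * freq_step_hz, amplitude)


def synthesize(components, num_samples, sample_rate_hz=16000.0):
    """Resynthesize every component and sum them into the output signal."""
    out = [0.0] * num_samples
    for c in components:
        for n in range(num_samples):
            out[n] += c.amplitude * math.sin(
                2.0 * math.pi * c.freq_hz * n / sample_rate_hz)
    return out


# Encoder side: components found by some signal analysis (assumed given).
components = [Sinusoid(440.0, 0.5), Sinusoid(883.0, 0.25)]
bit_stream = [quantize(c) for c in components]

# Decoder side: recover the parameters and rebuild the signal.
decoded = [dequantize(indices) for indices in bit_stream]
output = synthesize(decoded, num_samples=160)
```

The decoder never sees the waveform, only the indices, which is why the decoded frequency of the second component lands on the quantizer grid rather than on its original value.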
The hybrid source model developed here combines sinusoidal trajectories, harmonic tones, and noise components and includes an extension to support fast signal transients. The encoder employs robust algorithms for the automatic decomposition of the input signal into components and for the estimation of the model parameters of these components. A perceptual model in the encoder guides signal decomposition and selects the perceptually most relevant components for transmission. Advanced coding schemes exploit the statistical dependencies and properties of the quantized parameters for efficient transmission.
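As a rough illustration of how the three component types of such a hybrid model add up to one signal, the following Python sketch synthesizes a single frame from a harmonic tone, one individual sinusoid, and a noise component. All names and parameter values are hypothetical, the transient extension is left out, and the noise is plain white noise with a single gain, which is a deliberate simplification for this sketch.

```python
import math
import random

SAMPLE_RATE_HZ = 16000.0  # assumed sampling rate for this sketch


def sinusoid(freq_hz, amplitude, num_samples):
    """Individual sinusoid with constant frequency and amplitude."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE_HZ)
            for n in range(num_samples)]


def harmonic_tone(fundamental_hz, partial_amplitudes, num_samples):
    """Harmonic tone: partials at integer multiples of one fundamental."""
    out = [0.0] * num_samples
    for k, amplitude in enumerate(partial_amplitudes, start=1):
        partial = sinusoid(k * fundamental_hz, amplitude, num_samples)
        out = [a + b for a, b in zip(out, partial)]
    return out


def noise(gain, num_samples, seed=0):
    """White noise scaled by a gain (no spectral shaping in this sketch)."""
    rng = random.Random(seed)
    return [gain * rng.uniform(-1.0, 1.0) for _ in range(num_samples)]


def hybrid_frame(num_samples):
    """Sum the component signals of all three source models."""
    parts = [
        harmonic_tone(220.0, [0.4, 0.2, 0.1], num_samples),
        sinusoid(1530.0, 0.15, num_samples),
        noise(0.05, num_samples),
    ]
    return [sum(samples) for samples in zip(*parts)]


frame = hybrid_frame(320)
```

The harmonic tone needs only one frequency parameter for all of its partials, which hints at why grouping partials into a harmonic tone can be cheaper than coding each of them as an individual sinusoid.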
The parametric approach facilitates extensions of the coding system that provide additional functionalities. Independent time scaling and pitch shifting are supported by the signal synthesis in the decoder. Bit rate scalability is achieved by transmitting the perceptually most important components in a base layer bit stream and further components in one or more enhancement layers. Error robustness for operation over error-prone transmission channels is achieved by unequal error protection and by techniques to minimize error propagation and to provide error concealment.
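Why a parametric decoder can change speed and pitch independently is easiest to see in code: the frame duration and the frequency parameters enter the synthesis separately. The Python sketch below is a simplified illustration under stated assumptions, not the synthesis used in the thesis. It handles a single sinusoidal trajectory, holds the parameters constant within each frame instead of interpolating them, and uses hypothetical names (`synthesize`, `time_scale`, `pitch_scale`) and an assumed 16 kHz sampling rate.

```python
import math


def synthesize(frames, frame_len, sample_rate_hz=16000.0,
               time_scale=1.0, pitch_scale=1.0):
    """Synthesize one sinusoidal trajectory from per-frame parameters.

    frames: list of (freq_hz, amplitude) tuples, one per frame.
    time_scale stretches the duration of every frame; pitch_scale
    multiplies every frequency. The two factors are independent.
    """
    samples_per_frame = round(frame_len * time_scale)
    out = []
    phase = 0.0  # accumulated so the phase stays continuous across frames
    for freq_hz, amplitude in frames:
        phase_step = 2.0 * math.pi * freq_hz * pitch_scale / sample_rate_hz
        for _ in range(samples_per_frame):
            out.append(amplitude * math.sin(phase))
            phase += phase_step
    return out


frames = [(440.0, 0.5), (450.0, 0.4), (460.0, 0.3)]

original = synthesize(frames, frame_len=320)
slower = synthesize(frames, frame_len=320, time_scale=2.0)   # same pitch
higher = synthesize(frames, frame_len=320, pitch_scale=1.5)  # same duration
```

Here `slower` has twice as many samples as `original` with unchanged frequencies, while `higher` keeps the original length with all frequencies raised by a factor of 1.5.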
The resulting coding system was standardized as the Harmonic and Individual Lines plus Noise (HILN) parametric audio coder in the international MPEG-4 Audio standard. Listening tests show that HILN achieves an audio quality comparable to that of established transform-based audio coders at 6 and 16 kbit/s.

Keywords: parametric audio coding, signal decomposition, parameter estimation, source model, perceptual model, MPEG-4 HILN
Contents

1 Introduction
2 Fundamentals of Parametric Audio Coding
  2.1 Parametric Representations of Audio Signals
  2.2 Generalized Framework for Parametric Audio Coding
3 Signal Analysis by Decomposition and Parameter Estimation
  3.1 Design of a Hybrid Source Model for Very Low Bit Rate Audio Coding
    3.1.1 Modeling of Sinusoidal Trajectories
    3.1.2 Modeling of Harmonic Tones
    3.1.3 Modeling of Transient Components
    3.1.4 Modeling of Noise Components
  3.2 Parameter Estimation for Single Signal Components
    3.2.1 Estimation of Sinusoidal Trajectory Parameters
    3.2.2 Building Sinusoidal Trajectories
    3.2.3 Estimation of Harmonic Tone Parameters
    3.2.4 Estimation of Transient Component Parameters
    3.2.5 Estimation of Noise Parameters
  3.3 Signal Decomposition and Component Selection
    3.3.1 Signal Decomposition for Hybrid Models
    3.3.2 Perception-Based Decomposition and Component Selection
  3.4 Constrained Signal and Parameter Estimation