A semantic concept for the mapping of low-level analysis data to high-level scene descriptions [Elektronische Ressource] = Ein semantisches Konzept für die Abbildung von Low-level-Analyseergebnissen auf High-level-Szenenbeschreibungen / von Holger Neuhaus
143 pages


Published on 1 January 2008

A Semantic Concept for the Mapping of
Low-level Analysis Data to High-level Scene
Descriptions

Ein semantisches Konzept für die Abbildung von low-level
Analyseergebnissen auf high-level Szenenbeschreibungen

DISSERTATION

submitted for the academic degree of
Doktoringenieur (Dr.-Ing.)

submitted to the Faculty of Electrical Engineering and Information Technology
Technische Universität Ilmenau

by Dipl.-Ing. Holger Neuhaus
born on 31 August 1972 in Rotenburg (Wümme)

submitted on: 18 June 2007

Reviewers:

1. Prof. Dr.-Ing. Karlheinz Brandenburg
2. Prof. Dr.-Ing. Thomas Sikora
3. Dr.-Ing. Ulrich-Lorenz Benzler

Defense on: 24 October 2008

urn:nbn:de:gbv:ilm1-2008000197

Abstract

Along with the growing need for security, an increasing amount of surveillance content is created. It is indispensable to index the content in advance in order to enable quick and reliable searches on the output of hundreds or thousands of surveillance sensors installed at a single facility. For this purpose, the concept of Smart Indexing and Retrieval (SIR) enables cost-efficient searches by generating high-level meta data. The generation of this meta data has to be done automatically, based on the low-level features extracted by content analysis algorithms. Creating it manually becomes more and more difficult to handle in reasonable time at reasonable costs.
Whereas formerly proposed approaches have been strongly application dependent, this thesis presents a generic concept for mapping the results of the low-level content analysis data to semantic event descriptions, together with its application. The constituting elements of this approach and their underlying concepts as well as an introduction to their application are shown. The main contributions of the approach are its generality and the early stage at which the step from low-level to high-level representation is taken. This reasoning in the meta data domain is performed on small time frames, while the reasoning on more complex scenes is done in the semantic space. Even an unsupervised self-assessment is possible using the semantic approach.

Keywords: scene description, behavior recognition, event ontology, smart indexing, video retrieval, surveillance

Summary

Along with the growing demand for security, an increasing amount of surveillance content is being created. To enable a fast and reliable search in the recordings of hundreds or thousands of surveillance sensors installed at a single facility, indexing this content in advance is indispensable. For this purpose, the concept of Smart Indexing & Retrieval (SIR) enables cost-efficient searches by generating high-level meta data. Since it is becoming increasingly difficult to generate this data manually with acceptable time and cost, the meta data has to be generated automatically on the basis of low-level analysis data.
Whereas previous approaches are strongly domain dependent, this thesis presents a generic concept for mapping the results of low-level analysis data to semantic scene descriptions. The constituting elements of this approach and their underlying concepts are presented, and an introduction to their application is given. The main contributions of the presented approach are its generality and the early stage at which the step from the low-level to the high-level representation is taken. This reasoning in the meta data domain is performed in small time windows, while the reasoning on more complex scenes is carried out in the semantic domain. Using this approach, even an unsupervised self-assessment of the analysis results is possible.

Keywords: scene description, behavior recognition, event ontology, “smart” indexing, search in videos, surveillance systems

To my parents

Acknowledgments

I thank my supervisor Prof. Dr.-Ing. Karlheinz Brandenburg for having given me the opportunity to commence in this challenging and most interesting area. Above all, I thank him for his valuable clues and references, and for his guidance and support throughout this work. I thank Prof. Dr.-Ing. Thomas Sikora for his support.

I thank Dr. Ulrich-Lorenz Benzler for his guidance and mentoring, the unbounded patience and readiness, and the administrative support he gave me through these years.

I thank Robert Bosch GmbH for employing me in the “Doktorandenprogramm”, and namely Dr. Lars Placke, Dr. Andreas Engelsberg, Dr. Wolfgang Niem, Dr. Stefan Mueller-Schneiders, and Thomas Jäger at Robert Bosch GmbH Hildesheim for their administrative support, the mentoring, and the willingness for countless and endless discussions. Great gratitude to my proofreaders Ulrich and Marco. I thank Uwe Daniel, the head of the department CR/AEM, for his support in all formalities.

I thank all the other doctoral candidates at Robert Bosch Hildesheim for the great community and the many soirées to strengthen this community ;-)

My special thanks go to the Kern Team.

I thank Rob Wijnhoven for his cooperation and the productive discussions we had about the topic, and for his constructive literary hints.

Many thanks to Kerstin Lietz for providing the “filming location”.

I thank Simon Moser not only for his professional advice and support but also for fortifying me and being a good fellow student and friend.

I thank Andrea & Gert, Ulrike, Soeren, and Marco for their support, encouragement, and understanding during the “less successful” times. Furthermore, I thank my former fellow students Marcus Kern, Matthias Kautzner, and Helge-Hubertus Hundacker for the extremely productive collaboration throughout my studies, as well as Christian Weigel for the same, and him and Rossi for the kind accommodation in Ilmenau.

Cheers to the Australian Outback for inspiration.

My greatest gratitude, however, goes to my parents, who always supported and encouraged me with all their strength and thus made possible the long road I have traveled.

Contents

1 The Smart Indexing and Retrieval Problem
1.1 Introduction
1.2 Problem Statement
1.2.1 An overall system
1.2.2 Fields of application
1.3 Existing Solutions
1.3.1 Model-based Event Recognition
1.3.2 Blank Spots
1.4 Definition of a Semantic Concept
1.5 Organization of this work

2 Recent Advances in the Field of Analysis and Modeling of Object Motion and Behavior
2.1 Introduction
2.2 Overview on applied techniques
2.2.1 Probabilistic and Stochastic Techniques
2.2.2 Symbolic Techniques
2.2.3 Petri Net
2.3 Action-oriented motion analysis and tracking
2.4 Scenario analysis
2.5 Event detection and representation
2.6 Ontologies for Video Events

3 Employed Techniques and Tools
3.1 Introduction
3.2 Ontologies
3.2.1 Ontology design
3.2.2 Ontology Languages
3.2.3 VERL
3.3 The VCA Algorithm
3.4 Retrieval quality measures

3.4.1 Recall
3.4.2 Precision
3.5 The “Ground Truth”

4 A Semantic Concept for the Mapping of low-level Analysis Data to high-level Scene Descriptions
4.1 Introduction
4.2 Terminology
4.3 Event Morphemes
4.3.1 Meta Knowledge
4.3.2 Semantic Module
4.3.3 Event Morpheme Taxonomy
4.3.4 Semantic Interpolation
4.3.5 Reasoning-out false positives
4.3.6 Event Morpheme Path
4.4 Application of Event Morphemes
4.4.1 Separate list of objects
4.4.2 Event Morpheme Template
4.4.3 Event Morphemes’ sphere of responsibility for reasoning
4.5 Where is the index?
4.6 An example
4.7 Implementing Event Morphemes
4.7.1 An overall system
4.7.2 The Event Morpheme Ontology
4.7.3 The Event Morpheme detector modules

5 Results
5.1 Introduction
5.2 Realization of Use Cases
5.2.1 Fight
5.2.2 Shopping
5.2.3 Criminal Behavior at an ATM
5.2.4 Detect Beggar/Salesman on Street
5.3 Reasoning out false positives
5.4 Event Morpheme detection
5.4.1 Results for the test sequences
5.4.2 Error bounds
5.5 Comparison
5.6 Mapping other approaches’ events to Event Morphemes

6 Conclusion and outlook
6.1 Introduction
6.2 Assessing the results
6.3 Outlook

A Summary in German
A.1 Introduction
A.1.1 Problem statement
A.1.2 An overall system
A.1.3 Fields of application
A.2 Existing solutions
A.2.1 Model-based event recognition
A.2.2 Open issues
A.3 Means and methods of the solution
A.3.1 Introduction
A.3.2 Ontology
A.3.3 VERL
A.3.4 The VCA algorithm
A.3.5 Measures of retrieval quality
A.4 A semantic concept for the mapping of low-level analysis data to high-level scene descriptions
A.4.1 Introduction
A.4.2 Terminology
A.4.3 Event Morphemes
A.4.4 Where is the index?
A.5 Results
A.5.1 Introduction
A.5.2 Realization of the use cases
A.5.3 Ruling out false detections (“reasoning-out”)
A.5.4 Results for the test sequences
A.6 Evaluation and outlook
A.6.1 Introduction
A.6.2 Assessment of the results
A.6.3 Outlook

B Results for training sequences
C Theses
D Propositions
E Used abbreviations and symbols

List of Figures

1.1 The different levels of semantics in scene descriptions
2.1 General framework of visual surveillance (taken from [Hu et al., 2004])
2.2 The thirteen relationships to express any relationship that can hold between two intervals
4.1 A moving region representing two objects
4.2 An object consisting of two moving regions (schematic representation)
4.3 Two objects with one corresponding moving region each
4.4 Suitcase handover representation
4.5 The Event Morpheme’s 1-to-1 relation
4.6 From event to Semantic Module
4.7 Schematic representation of the context between VCA’s moving regions to higher-level scene description
4.8 Three phenotypes of take-out
4.9 New Indexing Approach: The Event Morphemes are the index
4.10 The suitcase handover scene
4.11 Mapping VCA output to Event Morphemes
5.1 A person falls caused by another
5.2 The shopping scenario
5.3 A person spying on another’s PIN
5.4 A person walking from one person to another
5.5 The post hoc analysis tool
A.1 The different levels of semantics in scene descriptions
A.2 The Event Morphemes are the index

List of Tables

4.1 The attributes of the instantiated Event Morpheme and the corresponding semantic labels
5.1 Results of the detection of events in the test sequences
5.2 Comparison of the systems
5.3 Mapping other approaches’ events to Event Morphemes
A.1 Results for the detection of the defined use cases
A.2 Comparison of the systems
B.1 Results of the detection of events in the training sequences

Chapter 1

The Smart Indexing and Retrieval Problem

1.1 Introduction

Along with the growing need for security, an increasing amount of surveillance content is created. It is indispensable to index the content in advance in order to enable quick and reliable searches on the output of hundreds or thousands of surveillance sensors installed at a single facility. The aim of Smart Indexing and Retrieval (SIR; also called Semantic-Based Video Retrieval (SBVR), [Hu et al., 2004], Content Based Video Indexing and Retrieval (CBVIR), [Bashir and Khokhar, 2003] or Automatic Forensic Video Retrieval (AFVR), [Hampapur et al., 2004]) is to generate meta data and thus enable efficient search on the content. The generation of this meta data has to be done automatically, based on the low-level features extracted by content analysis algorithms. Creating it manually becomes more and more difficult to handle in reasonable time at reasonable costs. Upon these outputs, on- or offline meta data processing is employed to extract higher-level information. The actual challenge is this semantic analysis of the content based on the extracted features. Although several approaches have been taken to address this topic, the search for an all-purpose solution continues:

If indeed such a useful set of generic actions can be defined, would it be possible to identify corresponding features and matching methods which are, to a large degree, application-independent? ([Gavrila, 1999])

1.2 Problem Statement
Intuitively, an index depends on the kind of retrieval that is expected to be performed. Unfortunately, it cannot be known in advance which scenarios will be searched for. When an index is queried for a scenario that wasn’t known to be relevant at indexing time, problems will inevitably arise. The index has to be smart or at least generic to enable such a query. This way, together with a smart retrieval, a query for any scenario would be possible. So, what is “Smart Indexing”, what is “Smart Retrieval”?
A smart way to index is to index what makes sense, in a way that makes sense. “Smart” in the information theory context refers to reduction of redundancy. “Smart Retrieval” exploits such a Smart Index in the most efficient way, i.e. the index is interpreted correctly. Specific domain characteristics, e.g. scene layout, as well as user restrictions are taken into account at retrieval time.
A set of other problems arises when it comes to video query by semantic keywords (or even more: free text queries). The greatest difficulty can be found in the mapping of the low-level (pixel) video representation to high-level (human) semantics. That is, while low-level features are extracted easily, the starting point of a retrieval process is usually the high-level query by the user. The problem is illustrated by the mapping of the low-level features the computer uses on the one hand to the question posed by a human on the other. This is commonly described as “bridging the semantic gap”. However, bridging the semantic gap is not only translating high-level queries to low-level features. The essence of a semantic query is understanding the meaning behind the query. This, of course, is also user dependent, as it involves the definition of terms depending on the domain the user searches in. This has to be considered when processing a query.
Taking into account the preceding, we arrange the data according to ascending semantic level from pixel level (in the visual domain) to semantic meta data. The first level comprises the image processing algorithms. In this thesis this level will be called level 0 as this is the beginning of the analysis and there is nothing “below” that level. The algorithms of level 0 are those to segment moving regions, extract features like color or texture, and perform object classification. Output of these algorithms and thus of level 0 is meta data, e.g. the position of the object in pixel coordinates of the bounding box, a defined label of the object’s class etc.
The meta data output of level 0 is the input for level 1, on which basic reasoning is performed. This includes detection of simple events, analysis of direction of movement or the detection of people carrying goods [Abdelkader and Davis, 2002], e.g. a backpack [Haritaoglu et al., 2001]: the reasoning whether or not goods are being carried is based on the shape, which is meta data output of level 0.
Level 2 finally is the highest semantic level. This level incorporates the output of level 1 together with semantic information about the scene to output a semantic description like “theft of a suitcase”. Level 2 represents how a user would describe a scene and is most likely the form in which an operator would perform a query to a retrieval system. See figure 1.1 for an overview on the different levels of semantics.

Figure1.1:Thedifferentlevelsofsemanticsinscenedescriptions.
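
The step from level 0 to level 1 can be made concrete with a small sketch. The Python below is illustrative only: the class names `Level0Observation` and `Level1Event`, their fields, the event labels, and the threshold `min_shift` are assumptions made for this example and are not structures defined in the thesis. It reduces level 0 meta data (bounding boxes and a class label) to a level 1 statement about the direction of movement.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Level0Observation:
    """Level 0 meta data for one object in one frame (hypothetical layout)."""
    frame: int
    object_id: int
    label: str      # class label assigned by object classification
    x: int          # bounding box: pixel coordinates of the top-left corner
    y: int
    width: int
    height: int

    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


@dataclass
class Level1Event:
    """Level 1 result: a simple event derived from level 0 meta data."""
    object_id: int
    name: str
    start_frame: int
    end_frame: int


def direction_of_movement(track: List[Level0Observation],
                          min_shift: float = 5.0) -> Level1Event:
    """Label the dominant horizontal direction of a track (toy level 1 reasoning)."""
    first, last = track[0], track[-1]
    dx = last.center()[0] - first.center()[0]
    if dx > min_shift:
        name = "moves-right"
    elif dx < -min_shift:
        name = "moves-left"
    else:
        name = "stays"
    return Level1Event(first.object_id, name, first.frame, last.frame)


track = [
    Level0Observation(frame=0, object_id=7, label="person", x=10, y=40, width=20, height=60),
    Level0Observation(frame=25, object_id=7, label="person", x=90, y=42, width=20, height=60),
]
event = direction_of_movement(track)
print(event)
```

A level 2 description such as “theft of a suitcase” would additionally need scene knowledge and is deliberately left out of this sketch.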

1.2.1 An overall system
In order to create a system capable of detecting events and combining those recognized events to a semantic scene description, we need
1. a low-level analysis (VCA, Video Content Analysis) algorithm,
2. an ontology to lay down a common description language,
3. an inference mechanism to map the VCA meta data to a semantic label,
4. a tool/mechanism for creating an adequate map of the scene, and
5. a retrieval interface.
These items constitute the system as follows. The VCA algorithm (item 1) analyzes the image sequence and outputs the objects’ shapes, tracks, and features like color, texture, etc. Thus, it provides the level 0 meta data. The inference mechanism (item 3) is the crux of the system: anything that isn’t processed soundly here will affect the consistency of the following stages (as will inadequacy of the VCA). This part of the system will have to be adapted to the preceding VCA algorithm(s) employed. It outputs level 1 meta data. The ontology for the common description language (item 2) is required on the one hand for the reasoning logic and on the other hand for the output to the user and the mapping of the user’s query, respectively. Furthermore, the reasoning logic exploiting the underlying ontology enables (together with item 4) the step to level 2. Both the ontology and the reasoning should be independent from the VCA algorithms used, as this adaption is already performed when stepping from level 0 to level 1. Finally, the retrieval interface (item 5) should be flexible enough to offer the user a variety of queries and map the query to the index.
The level 0 content analysis algorithms are an important requirement for the following two levels. Still, this area is not in focus of this thesis. This thesis introduces a semantic concept for detection, representation, indexing, and retrieval of events of human and non-human actions based on the output of a VCA algorithm. Thus, the generation of level 1 and level 2 is the matter of interest: a generic approach for behavior understanding together with a suitable concept for indexing and retrieval.

1.2.2 Fields of application
A Smart Indexing system should be able to automatically index video content to enable efficient searches. The index has to be of little redundancy but still rich in information. The Smart Retrieval must be able to exploit the Smart Index by combining the indices intelligently and also incorporate additional information such as user restrictions and scene layout.
A Smart Indexing & Retrieval system has to be measured by its retrieval performance. So, when evaluating such a system, specific events are queried and the results are rated.
Together with security systems experts (associates of the Bosch Security Systems business unit), a set of relevant use cases has been identified. At first, the following sub-events are to be detected: pick-up, put-down, and fall-down. The selected sub-events should be capable of composing the defined use cases. In the next step, the following use cases to be retrieved from the index were defined.

1. “Fight”: a person falls caused by a person passing by / a person falls down and doesn’t stand up again.
2. “Shopping”: did the customer pay or not?
3. “Criminal behavior at an ATM”: spying on another person’s PIN.
4. Detect beggar/salesman on street.

1.3 Existing Solutions
Research in the field of automated action recognition and scene description has been going on “for some time now” [Gavrila, 1997]. See chapter 2 for a complete overview. The following sections show systems and research work related to the presented work in the sense that a model is incorporated in the recognition process, and (to some extent) a semantic description (or label) of a scenario is generated.

1.3.1 Model-based Event Recognition
[Hongeng et al., 2004] present a concept to represent activities and recognition methods employing this representation. An activity is composed of action threads, each thread being executed by a single actor. A single-thread event represents the characteristics of trajectory and shape. A multi-thread event is composed of several action threads related by temporal constraints and represented by an event graph, similar to interval algebra networks [Allen and Ferguson, 1994]. Mostly, multi-thread events incorporate several actors, i.e. they are multi-agent events. Events are organized into several layers of abstraction. The approach is closely related to [Ivanov and Bobick, 2000] as external knowledge is incorporated into the expected structure of the activity model. This model is hierarchical but only models agents, not objects.
[Cupillard et al., 2004] present a system in the framework of the European project ADVISOR [Naylor and Attwood, 2003]. It is an approach for the online recognition of individual, group, or crowd behavior in a metro surveillance context employing multiple cameras. A hierarchy is used for the representation of scenarios. Inputs for the scenario recognition are

1. scenario models defined by experts,
2. geometric information of the observed environment, and
3. persons tracked by a vision module, which is supposed to do that correctly.

The formalism is based on three main ideas:

1. define various operators (software modules) for recognition,
2. have all knowledge needed in the corresponding operator, and
3. description of the operator shall be declarative to build an extensible library of operators.

The behavior representation is actor oriented, where an actor is any scene object involved in behavior. The recognition process uses only the knowledge represented by experts through scenario models. Still, scenarios are defined “as one”, thus variations have to be handled by different detector modules.
[Bourbakis et al., 2003] define a model for representing, recognizing, and interpreting human activity. The model is based on the hierarchical synergy of three other models: the Local/Global (L-G) graph, the Stochastic Petri Net (SPN) graph and a neural network (NN) model. The authors distinguish between structural knowledge (knowledge about physical state) and functional knowledge (knowledge about change and events), and they model the temporal relationship among a set of different object temporal events in the scene. They develop a Dynamically Multi-Linked Hidden Markov Model (DML-HMM) to interpret group activities involving multiple objects captured in an outdoor scene. The field of application is airport cargo activities with events like “moving Cargo”, which makes the approach completely domain dependent.
An approach based on force dynamics is presented by [Siskind, 2001], who builds a system for recognizing the occurrence of events described by simple spatial-motion verbs in short image sequences. The semantics of these verbs is specified with event-logic expressions that describe changes in the state of force-dynamic relations between the participants of the event.
[Fern, 2004] extends [Siskind, 2001] by developing a supervised learning algorithm for automatically acquiring high-level visual event definitions from low-level force dynamic interpretations of video. A temporal event-description language is introduced: AMA, “And’s of ‘Meet’s and And’s’ (MA timelines)”. An MA timeline is the succession of one state being true for a time interval and a second state being true for a second time interval meeting the first. AMA is the conjunction of MA timelines. Algorithms and complexity bounds are given for the AMA subsumption and generalization problems; a learning method is developed based on these algorithms and applied to learning event definitions from video.
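
The notion of an MA timeline can be illustrated with a deliberately simplified sketch. It is not the algorithm of [Fern, 2004]: each state is treated as a single named proposition, the observed intervals are given directly, and the state names in the example are hypothetical. The code only checks whether states hold on successive intervals that meet, and treats an AMA formula as the conjunction of such timelines.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]                  # (start, end) in frames
Observations = Dict[str, List[Interval]]    # state name -> intervals where it holds


def meets(a: Interval, b: Interval) -> bool:
    """Interval a meets b when a ends exactly where b starts."""
    return a[1] == b[0]


def holds_ma(timeline: List[str], obs: Observations) -> bool:
    """True if the states of the timeline hold on successive meeting intervals."""
    def extend(index: int, previous: Interval) -> bool:
        if index == len(timeline):
            return True
        return any(
            meets(previous, candidate) and extend(index + 1, candidate)
            for candidate in obs.get(timeline[index], [])
        )

    return any(extend(1, first) for first in obs.get(timeline[0], []))


def holds_ama(formula: List[List[str]], obs: Observations) -> bool:
    """An AMA formula is the conjunction of its MA timelines."""
    return all(holds_ma(timeline, obs) for timeline in formula)


# Hypothetical observations for a pick-up like event.
obs = {
    "supported(block)": [(0, 40)],
    "contacts(hand, block)": [(20, 40)],
    "attached(hand, block)": [(40, 90)],
}
pick_up = [
    ["supported(block)", "attached(hand, block)"],
    ["contacts(hand, block)", "attached(hand, block)"],
]
print(holds_ama(pick_up, obs))
```

Reversing the order of the states in a timeline makes the check fail for these observations, because the intervals no longer meet in that order.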
[Ghanem et al., 2004] represent and recognize events using Petri Nets to build an interactive system for querying surveillance video about events. The queries may not be known in advance and have to be composed from primitive events and previously defined queries. Petri Nets are used as both representation and recognition methods. A graphical user interface is used to compose queries, which are then mapped into a set of Petri Nets that represent the components of the query. [Ghanem et al., 2004] also use an event ontology defining states, events, composite events/scenarios, and relations. Objects are assumed to be provided by an intermediate vision layer.
[Xin and Tan, 2005] propose a system that integrates all related information in a hierarchical conceptual model (namely an ontology) as an approach for event modeling and analysis with semantic representations. This system defines events as significant changes and mappings of conceptual units in the model. Three basic event components form the concept: entities, words, and a set of attributes. The lower level of the framework extracts features (the words) from the content, while on the upper level semantic representations of events are obtained using these words. Events are treated as obvious feature changes; change is the trigger of an event. Different attributes of regions and moving objects are entities. The scene is divided into different regions that are labeled manually. To describe moving objects, the motion states move, halt, stop are used. Trajectories are characterized through ‘go straight’, ‘turn right’, ‘turn left’, ‘retrace’. Interactions between moving objects and special regions are described through their spatial relations: occupy, enter, transfer, appear. Interactions of two moving objects are close to, away from, encounter, follow. The hierarchy of ontologies consists of three levels. The first level is the layout of the scene and associated restraints: usable features of the region “Road” include that vehicles and other moving objects can move on it. The next level contains the moving objects’ ontology containing the states of motion and the concepts describing interactions mentioned above. On a final level, the semantic ontology represents what occurs in the scene. To measure the similarity of semantic concepts, a method using Conceptual Status Vector and Weighted Semantic Distance (WSD) is proposed. The WSD measures the distance between two conceptual status vectors. When the semantic distance is larger than a threshold learned from training data, an event happens, and a semantic representation of this event is made.
[Guler et al., 2003] present a video event detection and mining framework for event detection, annotation, and content browsing including a video analysis database. The event detection part regards events in a hierarchical three-level structure consisting of

1. the tracking data,
2. simple behaviors like “wait”, “enter”, or “pick-up”, and
3. higher-level activities constituted by these simple behaviors as “meeting”, “package drop-off”, or “exchange between people”.

The event analysis is based on split and merge information. The second and third level of event detection are performed by a two-level Hidden Markov Model, the first (hidden level) representing the second level of the event structure.
1.3.2 Blank Spots
To recognize events, three inputs aside from the video stream are necessary:

1. event domain knowledge, i.e. the knowledge about what an event “looks like” in the specific domain,
2. knowledge about the scene layout, and
3. user restrictions such as allowed duration of stay in a sensitive area.

The main challenge in semantic video retrieval is that it is not known at indexing time which events will be queried. Therefore, it wouldn’t be wise to directly index complex composite multi-thread multi-agent scenarios detected online. To be able to retrieve them nevertheless, their constituting sub-scenarios have to be indexed.
Reasonably, it’s only the scenario-independent and thus basic components of events that can be used to enable scenario and user independent indexing. Formerly proposed concepts for behavior understanding and description are application dependent and more or less real-time alert systems. Those approaches employing taxonomies/ontologies or being modular in another way incorporate scenario-related information, user restrictions, or both in their detection algorithms. Scenario representations enabling retrieval incorporating any user restrictions might sprawl when applied as in [Ghanem et al., 2004], since each variation needs a new representation. Scenario knowledge shouldn’t be held in the event detectors ([Cupillard et al., 2004]) as this makes them scenario dependent and limits the search with varying parameters.
So, what’s actually missing is a concept that is generic and can be applied to complex scene descriptions, and thus enables Smart Indexing. Upon this indexing, efficient retrieval can be performed, even if the events to be queried were not known at indexing time (Smart Retrieval). Rules and restrictions do not belong in the index but should be set up in the retrieval step. The same goes for contextual information: which scenario represents “stealing” and which represents “handover” can only be resolved by an operator retrieving the scenario due to a theft being reported.
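
The separation argued for here, a scenario-independent index with rules applied only at retrieval time, can be sketched as follows. The Python is a minimal illustration under stated assumptions: the `IndexedEvent` layout, the function `find_exchanges`, and the `max_gap` restriction are hypothetical and are not the index structure of the thesis. The index stores only basic sub-events; the user restriction enters as a query parameter.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IndexedEvent:
    """One scenario-independent index entry (hypothetical layout)."""
    name: str        # basic sub-event, e.g. "put-down" or "pick-up"
    actor_id: int
    object_id: int
    time: float      # seconds since start of recording


def find_exchanges(index: List[IndexedEvent],
                   max_gap: float) -> List[Tuple[IndexedEvent, IndexedEvent]]:
    """Retrieval-time rule: an object put down by one actor and picked up
    by a different actor within max_gap seconds."""
    put_downs = [e for e in index if e.name == "put-down"]
    pick_ups = [e for e in index if e.name == "pick-up"]
    matches = []
    for down in put_downs:
        for up in pick_ups:
            same_object = up.object_id == down.object_id
            other_actor = up.actor_id != down.actor_id
            in_window = 0 <= up.time - down.time <= max_gap
            if same_object and other_actor and in_window:
                matches.append((down, up))
    return matches


index = [
    IndexedEvent("put-down", actor_id=1, object_id=100, time=12.0),
    IndexedEvent("pick-up", actor_id=2, object_id=100, time=15.5),
    IndexedEvent("pick-up", actor_id=1, object_id=200, time=30.0),
]

# The same index answers queries with different user restrictions.
strict = find_exchanges(index, max_gap=2.0)
relaxed = find_exchanges(index, max_gap=10.0)
print(len(strict), len(relaxed))
```

Whether a returned match is a “handover” or “stealing” is not decided by this query; in line with the text, that interpretation is left to the operator.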

1.4 Definition of a Semantic Concept

This work shows the application of the semantic concept that has been introduced in [Neuhaus, 2005]: the mapping of low-level analysis data to semantic labels. The labels are then condensed into generic modules. Finally, a syntax is provided for arranging these modules to a semantic scene description which can cover simple to complex scenes. By using the modules as an index structure, it is shown how a retrieval system exploiting the possibility of modeling scenarios by arranging the elements can be built. This way, the presented system makes a contribution to the problematic nature of the SIR topic. It is shown how this concept can be applied to a variety of scenarios of scene recognition, description and retrieval. Not only the retrieval application as a “pull-application” of such an event recognition system can be built using the concept evolved here, but also the “push”-variant, a real-time-alert system. Most importantly, it is possible to do the advanced reasoning based on the smart index at retrieval time. The presented concept is event-oriented, not actor-oriented: a specific actor becomes of interest only once (s)he performs a specific action.
The goal of this work is to build a system for indexing and retrieval of video surveillance content. Such a system should be adaptable to any VCA algorithm and thus be independent of, or easily separable from, a specific algorithm. To illustrate this, see figure 1.1 for the functional distinction of the generation of enriching semantic information. This work follows the motion detection & tracking algorithm.

1.5 Organization of this work
The remainder of this thesis is organized as follows. Chapter 2 briefly summarizes related work addressing the problem of behavior recognition and scene description in natural language. In chapter 3, the principles and tools utilized in this thesis are listed and explained. Chapter 4 presents the approach to solve the SIR challenge. Section 4.2 introduces the terminology used in that chapter, section 4.3 introduces the constituting elements of the concept, the Event Morphemes. Section 4.6 shows examples for the usage of the presented concept. Section 4.4 shows how Event Morphemes are applied, and section 4.7 shows the realization of Event Morphemes. Chapter 5 presents the results of the concept when processing the use cases and the comparison with related work (sections 5.4.1 and 5.5, respectively). Chapter 6 summarizes the work and points out how the identified weak point of the implementation of the presented concept can be met.

Chapter 2

Recent Advances in the Field of Analysis and Modeling of Object Motion and Behavior

2.1 Introduction
According to [Hu et al., 2004], the process of analyzing, understanding, and describing the content of interest includes these stages: “modeling of environments, detection of motion, classification of moving objects, tracking, understanding and description of behaviors, human identification, and fusion of data from multiple cameras”. Figure 2.1 shows an overview on the general framework referred to in [Hu et al., 2004].
The hierarchical arrangement of levels of semantics suggests the order of processing the data to be analyzed using the stages of [Hu et al., 2004]: level 0 covers ‘detection of motion’, ‘classification of moving objects’, ‘tracking’, ‘human identification’, and ‘fusion of data from multiple cameras’, level 1 deals with the ‘understanding of behaviors’ and level 2 incorporates the preceding stages together with the ‘modeling of environments’ to the ‘description of behaviors’ and above. See figure 1.1 for an overview on the different levels of semantics.
This chapter is organized as follows. Section 2.2 introduces the techniques applied for event detection so far. Section 2.3 gives an introductory survey on systems deriving simple event or action information directly from motion detection and tracking algorithms, where the detection of events is rather a “side product”. Section 2.4 covers systems addressing the task of event and action detection without having an underlying model or semantic representation. Section 2.5 summarizes systems employing models of behavior and/or scenarios for event recognition, which are thus closely related to the topic of this thesis. Section 2.6 presents ontologies and ontology languages for video event descriptions.

Figure 2.1: General framework of visual surveillance (taken from [Hu et al., 2004]).

2.2 Overview on applied techniques

2.2.1 Probabilistic and Stochastic Techniques

The main characteristic of probabilistic and stochastic techniques is to model uncertainty explicitly using numbers. The section starts by describing Bayesian classifier techniques, then neural network techniques. Both techniques are adapted to model the uncertainty in the recognition of events depending on visual features at a given time. With Bayesian classifiers, the combination is inferred from the frequency of the observations of events as a function of visual features. With neural networks, the combination is stochastically adjusted by improving the recognition over a learning set of samples. Finally, Hidden Markov Model techniques applied to human activity recognition are described. These techniques are usually used to recognize sequences of events.

Bayesian Classifier

In Bayesian analysis, the state of knowledge about the parameters x associated with a model that describes the physical object being studied is summarized by the posterior, which is the probability density function p(x|d) of the parameters given the observed data d. Bayes' law gives the posterior as

$$p(x_j \mid d) = \frac{p(d \mid x_j)\, p(x_j)}{\sum_{k} p(d \mid x_k)\, p(x_k)} \tag{2.1}$$

The probability p(d|x), called the likelihood, comes from a comparison of the actual data to the data predicted on the basis of the model of the object. The predicted data are generated using a model for how the measurements are related to the object, which we call the measurement model. The prior p(x) expresses what is known about the object, exclusive of the present measurements, and may represent knowledge acquired from previous measurements, specific information regarding the object itself, or simply general knowledge about the parameters, e.g. that they are non-negative. Bayes' law says that for a given object model the posterior can be evaluated by combining the likelihood, which requires the data values predicted for that object model, with the numerical value of the prior. This calculation usually is straightforward. It involves calculating the predicted measurements for the given object model, which we refer to as the forward measurement calculation.

Dynamic scenes are an uncertain environment. Thus Bayesian classifiers are applicable to this problem. If the variables (i.e. object attributes) are conditionally independent, a naive classifier can be used and the Bayesian rule is used to infer the object class. Such a classifier needs to learn the parameters, e.g. p(ratio|car) and p(ratio|non-car). The main advantage of Bayesian classifiers is their capability to model the uncertainty of the recognition by using probabilities. However, there are two drawbacks. First, the a priori probability needs to be learned. Due to the construction of the learning sets this learning stage is time consuming. Secondly, the time when the visual features have to be computed needs to be indicated explicitly. Thus, they are not adapted to model temporal relations.
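
The use of Bayes' law for inferring an object class from a single attribute can be sketched as follows. This is a minimal illustration, not the classifier of any cited system: the Gaussian form of p(ratio|class), the priors, and all numeric values are assumptions chosen only for the example.

```python
import math


def gaussian_pdf(value, mean, std):
    """Density of a normal distribution, used here as the likelihood p(d | x_j)."""
    z = (value - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical parameters (illustration only): the prior p(x_j) and the
# Gaussian parameters of p(ratio | x_j) for each class.
CLASSES = {
    "car": {"prior": 0.3, "mean": 2.0, "std": 0.4},
    "non-car": {"prior": 0.7, "mean": 0.5, "std": 0.3},
}


def posterior(ratio):
    """Return p(x_j | ratio) for every class via Bayes' law (equation 2.1)."""
    joint = {
        name: gaussian_pdf(ratio, c["mean"], c["std"]) * c["prior"]
        for name, c in CLASSES.items()
    }
    evidence = sum(joint.values())  # denominator: sum over all classes
    return {name: value / evidence for name, value in joint.items()}


if __name__ == "__main__":
    for ratio in (0.6, 1.9):
        print(ratio, posterior(ratio))
```

In a real system the priors and likelihood parameters would have to be learned from a training set, which is the time-consuming stage mentioned above.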

Hidden Markov Model (HMM)

Like Bayesian classifiers, HMMs are also used to model uncertainty of the observed environment and, in particular, the uncertainty of temporal relations of events. The principle of this approach is to use the Markovian hypothesis: the probability of being in a given state only depends on the probability of being in the direct previous state. The advantage of HMMs compared to Bayesian classifiers and neural networks is the ability to recognize sequences of events. However, they are limited in the way they recognize sequences of events where several mobile objects are involved. The probability of being in a state for a mobile object has to be combined with the probability of being in another state for all other mobile objects.
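
How the Markovian hypothesis is exploited when scoring a sequence of observations can be sketched with the standard forward algorithm. The two states, the two observation symbols, and all probabilities below are hypothetical values for illustration; they are not taken from any of the cited systems.

```python
# Hypothetical two-state model (illustration only).
STATES = ("walking", "standing")
START_P = {"walking": 0.6, "standing": 0.4}
TRANS_P = {
    "walking": {"walking": 0.7, "standing": 0.3},
    "standing": {"walking": 0.4, "standing": 0.6},
}
EMIT_P = {
    "walking": {"moving": 0.9, "still": 0.1},
    "standing": {"moving": 0.2, "still": 0.8},
}


def forward(observations, states, start_p, trans_p, emit_p):
    """Return the probability of the observation sequence under the HMM."""
    # alpha[s]: probability of the observations so far, ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        # Markovian hypothesis: the new state depends only on the previous one.
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())


if __name__ == "__main__":
    print(forward(["moving", "moving", "still"], STATES, START_P, TRANS_P, EMIT_P))
```

For several interacting mobile objects the state space becomes the product of the per-object state sets, which is the limitation described above.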

Neural Networks

In essence, neural networks are mathematical constructs that emulate the processes people use to recognize patterns, learn tasks, and solve problems. Neural networks are usually characterized in terms of the number and types of connections between individual processing elements, called neurons, and the learning rules used when data is presented to the network. Every neuron has a transfer function, typically non-linear, that generates a single output value from all of the input values that are applied to the neuron. Every connection has a weight that is applied to the input value associated with the connection. A particular organization of neurons and connections is often referred to as a neural network architecture. The power of neural networks derives from their ability to learn from experience (that is, from historical data collected in some problem domain).

Human behaviors normally evolve in an uncertain environment, thus neural network techniques have been used to cope with this problem. However, they are not efficient at handling complex behaviors involving a large number of physical objects and complex temporal constraints (e.g. synchronized constraint) because this leads to a combinatorial explosion of possible behaviors corresponding to all combinations of physical objects detected in the scene.
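
The role of connection weights and the transfer function of a single neuron can be sketched as follows. The sigmoid is only one common choice of non-linear transfer function and is an assumption here, as are the function and parameter names.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Apply one weight per connection, sum, and pass through a sigmoid."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Non-linear transfer function producing a single output value in (0, 1).
    return 1.0 / (1.0 + math.exp(-activation))


if __name__ == "__main__":
    print(neuron_output([0.5, -1.0, 2.0], [0.8, 0.1, 0.4]))
```

Learning then amounts to adjusting the weights (and bias) so that the outputs over a set of training samples improve.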

2.2.2 Symbolic Techniques

This section presents symbolic techniques for human activity recognition. These techniques aim at transforming numerical observations into symbolic scenarios.

Automata

Recently, automata have been used to recognize human behaviors in video sequences. Several numerical techniques are used to recognize video events up to the event level (e.g. numerical calculations of basic properties of physical objects, comparison of states at two consecutive instants to recognize events). At the Scenario level, they use an automaton approach for recognizing pre-defined Scenarios. To recognize a Scenario M, the Scenario recognition process creates an automaton representing the Scenario M. The states of this automaton correspond to the states/events/sub-scenarios composing M. The transitions of this automaton correspond to the constraints defined between two states. This approach has the advantage of reusing the Scenarios partially recognized at previous instants instead of recalculating them at each instant. Moreover, it also shows the capacity of predicting which Scenarios will happen in the observed scenes. However, it has several drawbacks. For example, if a Scenario M is defined with several physical objects, the Scenario recognition process has to create all the automata corresponding to all combinations of physical objects defined within M for the recognition of M. Moreover, the number of states of a Scenario increases as a function of the number of physical objects involved in the Scenario, because these physical objects can evolve in many different Situations.
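
The two properties discussed above, reuse of a partially recognized Scenario and the growth in the number of automata with the number of physical objects, can be sketched as follows. The class, the event names, and the role names are hypothetical and serve only as an illustration; the sketch assumes a purely sequential Scenario.

```python
from itertools import permutations


class ScenarioAutomaton:
    """Linear automaton: state i means the first i events were seen in order."""

    def __init__(self, expected_events):
        self.expected = list(expected_events)
        self.position = 0  # the partially recognized prefix is kept between instants

    def step(self, event):
        """Consume one event; return True once the whole Scenario is recognized."""
        if self.position < len(self.expected) and event == self.expected[self.position]:
            self.position += 1
        return self.position == len(self.expected)


def bind_roles(roles, tracked_objects):
    """Enumerate every assignment of tracked objects to the roles of a Scenario."""
    return [dict(zip(roles, combo)) for combo in permutations(tracked_objects, len(roles))]


if __name__ == "__main__":
    automaton = ScenarioAutomaton(["enter_zone", "stop", "leave_zone"])
    for event in ["enter_zone", "move", "stop", "leave_zone"]:
        print(event, automaton.step(event))
    # One automaton per binding: 2 roles and 3 objects already give 6 automata.
    print(len(bind_roles(["driver", "vehicle"], ["o1", "o2", "o3"])))
```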

2.2.3 Petri Net

A Petri Net consists basically of four components: the places, the transitions, the arcs, and the tokens. The tokens are put in the places modeling states. The transitions are enabled when the preceding places are filled sufficiently. The transitions are used to model events that change states. Arcs are the connections between places and transitions.

The advantages of Petri nets are: (1) the capacity of sequencing, parallelism and synchronization, (2) Petri nets allow monitoring and prediction. However, this technique can lead the recognition process to a combinatorial problem when coping with temporal Scenarios defined with several physical objects and with scenes composed of a large number of mobile objects. Moreover, some temporal constraints (e.g. “person B arrives 1 minute after person A left”) are difficult to express using this formalism.
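
The interplay of places, tokens, arcs, and transitions can be sketched as follows. This is a minimal illustration with weighted arcs; the class and the place and transition names are hypothetical and not part of any cited system.

```python
class PetriNet:
    """Minimal place/transition net with weighted arcs."""

    def __init__(self, marking):
        self.marking = dict(marking)  # number of tokens currently in each place
        self.transitions = {}         # name -> (input arc weights, output arc weights)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (dict(inputs), dict(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds enough tokens."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) >= n for place, n in inputs.items())

    def fire(self, name):
        """Consume tokens from the input places and produce tokens in the output places."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place, n in inputs.items():
            self.marking[place] = self.marking.get(place, 0) - n
        for place, n in outputs.items():
            self.marking[place] = self.marking.get(place, 0) + n


if __name__ == "__main__":
    net = PetriNet({"person_outside": 1, "car_parked": 0})
    # Synchronization: the event needs a token in both input places.
    net.add_transition(
        "enter_car",
        inputs={"person_outside": 1, "car_parked": 1},
        outputs={"person_in_car": 1},
    )
    print(net.enabled("enter_car"))
    net.marking["car_parked"] = 1
    net.fire("enter_car")
    print(net.marking)
```

A quantitative constraint such as “1 minute after” has no direct counterpart in this basic formalism, which is the difficulty noted above.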

2.3 Action-oriented motion analysis and tracking

“The aim is to develop intelligent visual surveillance to replace the traditional passive video surveillance that is proving ineffective as the number of cameras exceeds the capability of human operators to monitor them. In short, the goal of visual surveillance is not only to put cameras in the place of human eyes, but also to accomplish the entire surveillance task as automatically as possible.” [Hu et al., 2004]

[Aggarwal and Cai, 1997] state that for human activity recognition the advantage of using the template matching technique was its inexpensive computational costs while being relatively sensitive to the variance of the movement duration. Three major areas related to interpreting human motion are defined:
1. motion analysis involving human body parts,
2. tracking of human motion using single or multiple cameras, and
3. recognizing human activities from image sequences.
[Cutler and Davis, 2000] present a system that analyzes periodic motion by segmenting the motion and tracking objects in the foreground. Objects are aligned along the temporal axis. The object’s self-similarity as it evolves in time is computed. The system also classifies objects using periodicity: “people”, “dogs” and “other” are the distinct classes.

In [Haritaoglu et al., 2000], the real-time visual surveillance system W4 is introduced. It uses a combination of shape analysis and tracking and constructs models of people’s appearances. By this, the system detects and tracks groups of people and watches their behaviors. The system handles occlusion and outdoor environments. A single camera with a grayscale sensor is used. W4 employs a background model to reduce the influence of changes in dynamic scenes derived from lighting etc.
Residual flow is used in [Lipton, 1999] to analyze rigidity and periodicity of moving objects. It is assumed that rigid objects present little residual flow and that a non-rigid moving object, e.g. a human being, has a higher average residual flow and additionally shows a periodic component. Based on this, human motion is distinguished from motion of other objects, such as vehicles.

To recognize human dynamics in video sequences, [Bregler, 1997] builds motion models of human limbs and joints, which are widely used in tracking ([Hu et al., 2004]). They are effective because the movements of the limbs are strongly constrained. These motion models are employed as a priori knowledge to interpret and recognize human behaviors. Human behavior is decomposed into multiple abstractions, and the high-level abstraction is represented by HMMs built from phases of simple movements. This representation is used for both tracking and recognition. [Zhao et al., 2002] use motion models of human limbs and joints, too. A highly structured motion model for ballet dancing is learned under the minimum description length (MDL) paradigm. This motion model resembles a finite-state machine (FSM).
In [Rao and Shah, 2001], a view-invariant representation of action consisting of dynamic instants and intervals is presented, which is computed using the spatio-temporal curvature of a trajectory. This representation is then used to learn human actions without training. The focus of the system is on human actions performed by a hand. The trajectory of a hand is represented by a sequence of dynamic instants and intervals. A dynamic instant is an instantaneous entity, which occurs for only one frame, and represents an important change in motion characteristics: speed, direction, acceleration, and curvature. An instant is detected by identifying maxima in spatio-temporal curvature. An interval represents the time period between two dynamic instants, during which the motion characteristics remain constant. Instants and intervals have physical meanings. Therefore, it is possible to explain an action as a sequence of meaningful instants and intervals. Dynamic instants include

“touching”, “twisting”, “loosening”.

Intervals include

“approaching”, “lifting”, “pushing”, “receding”.

Example scenarios are
“opening/closing overhead cabinet”,
“picking up/putting down a book/phone”, and
“erasing whiteboard”.

The subject of tracking in multiple cameras is addressed in [Javed et al., 2000]. The system presented uses spatial relationships between view fields of cameras to establish corresponding relationships of images. [Krumm et al., 2000] use color histograms to match regions. [Brand et al., 1997] use Coupled Hidden Markov models for the recognition of action.
[Bobick and Davis, 2001] use a temporal template for the recognition of human movement: a static vector-image where the vector value at each point is defined as a function of the motion properties at the corresponding spatial location in the image sequence. People carrying objects are detected in [Abdelkader and Davis, 2002] by looking at factors affecting gait perception, e.g., clothing, environments, distance, carried objects such as briefcases. [Haritaoglu et al., 2001] use silhouettes to determine whether people are carrying objects or moving unencumbered. The employed shape analysis algorithm determines whether a person is carrying an object and segments the object from the person so that it can be tracked, e.g. during an exchange of objects between two people. [Cunado et al., 1997] model gait as the movement of an articulated pendulum and use the dynamic Hough transform to extract the lines representing the thigh in each frame. The least squares method is used to smooth the inclination data of the thigh and to fill the missing points caused by self-occlusion of the legs. Phase-weighted magnitude spectra are used as gait features for recognition.

2.4 Scenario analysis

This section covers approaches addressing the scenario analysis problem without underlying models or semantic representations.

In [Ersoy et al., 2004], events are formulated using domain-independent event primitives represented by spatio-temporal relationships between objects. Complex events are expressible as combinations of simple events. The trajectories of objects serve as “atomic entities”. The event description in the application of a parking lot uses the syntax of [Allen, 1983]:

$$\mathrm{dropoff}(x, y) \equiv \mathrm{enterlot}(x) \wedge_{<} \bigl((\mathrm{stop}(x) \wedge_{o} \mathrm{exits}(y) \wedge_{o} (\mathrm{dist}(x, y) < d)) \wedge_{di} \mathrm{leavelot}(x)\bigr)$$
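
The subscripts on the conjunctions in the event definition above denote interval relations from Allen's interval algebra: “<” (before), “o” (overlaps), and “di” (the inverse of during, i.e. contains). A minimal sketch of these three relations follows; representing an interval as a (start, end) tuple and the function names are assumptions made for illustration.

```python
def before(a, b):
    """Allen's '<': interval a ends strictly before interval b starts."""
    return a[1] < b[0]


def overlaps(a, b):
    """Allen's 'o': a starts first, b starts before a ends, and a ends before b ends."""
    return a[0] < b[0] < a[1] < b[1]


def contains(a, b):
    """Allen's 'di' (inverse of 'during'): b lies strictly inside a."""
    return a[0] < b[0] and b[1] < a[1]


if __name__ == "__main__":
    enter_lot = (0, 5)    # hypothetical time stamps in seconds
    stop = (10, 40)
    exits = (20, 50)
    print(before(enter_lot, stop), overlaps(stop, exits), contains((0, 60), stop))
```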

[Stauffer and Grimson, 2000] employ real-time tracking to learn activity patterns. It is stated that because of the stability and completeness of the representation it was possible to do simple classification based on aspect ratio or size.

In [Piater et al., 2002], the described events are “New Target”, “Confirm Target”, “Move Target” etc., i.e. descriptions of tracker output, and not semantic expressions. From those events, simple scene descriptions are generated.
[Davis, 2004] addresses the problem of the time necessary to identify human actions. The actions in focus are “walking”, “running”, and “standing”. A reference framework using the “key feature” from [Jepson and Richards, 1991] is employed, and a continual verification process of the selected object likelihood once the action has been detected is proposed.

[Ayers and Shah, 2001] extensively use prior knowledge. This knowledge is employed in tracking, skin detection, and action recognition. As an example, “use terminal” is defined as “person sitting near terminal” and “scene change” is detected in mouse but not behind it. “In mouse” describes the region the (computer) mouse is in.
[Makris and Ellis, 2003] automatically train an activity-based semantic scene model for a surveillance region. The semantic scene model defines regions of activity in the camera view. Regions where particular types of motion-related activity are located have been trained from target trajectories generated from tracking objects through the environment. Those regions are zones of entry/exit, paths, routes, and junctions. A textual description of the activity based on this system is:
“At the time t1, pedestrian #440 enters the scene at entry point 1, moving along path A. At junction 2 he/she chooses path B and exits the scene at exit point 3”
[Dee and Hogg, 2005] present a system that analyzes behavior in a surveillance setting. The “goal” of an agent within the scene is evaluated by a sub-goal algorithm. If the agent moves towards an entry/exit it can “see”, the behavior is classified as being expected. The less the agent behaves like that, the higher the inexplicability score is. As an application, the system is used to highlight interesting actions to a surveillance operator.
[Toshev et al., 2006] present an algorithm that processes a set of primitive events such as simple spatial relations between objects obtained from a tracking system and outputs frequent event patterns. This work focuses on the problem of detecting frequent complex activities without a model. An event is a spatio-temporal property of an object in a time interval or a change of such a property. Events are formally defined in an event description language ([Brémond et al., 2004]) which enables the definition of complex events in terms of simpler ones and this way builds hierarchical structures of events. Such simple events are
“a vehicle on the road”,
“a vehicle on the parking road”,
“vehicle on a parking place”, and
“person coming out of the vehicle”,
building the complex scenario “parking manoeuvre”. Thus, primitive events are
“object in a zone” or
“object near another object”.
For the detection of complex events, the data mining Apriori algorithm is adapted, which uses the so-called Apriori property: the subpatterns of frequent patterns are also frequent. As this property does not hold in case of similarity because subpatterns of patterns can be less similar than the patterns themselves, a Weak-Apriori property is formulated. It decreases the frequency threshold for shorter patterns in order to prevent losing subpatterns of frequent patterns and thus to guarantee their detection in the merge step. The context knowledge is separated from the algorithm to make the approach applicable in different domains. The field of application is the parking lot monitoring domain.

[Porikli and Haga, 2004] introduce a set of time-wise and object-wise statistical features such as trajectories, histograms, and HMMs of speed, orientation, location, size, and aspect ratio to build an event detection framework. Predefined models are not mapped to events; instead, unusual events are detected by analyzing the conformity scores. Thus, usual and unusual are not predefined: usual is laid down as

“the high recurrence of events that are similar”.

(Simulated) scenarios are

“an object moving in opposite direction to the rest”,
“a waiting object where other objects move”, and
“a fast moving object”.

[Nascimento et al., 2005] describe an algorithm for segmenting and classifying human activities from video sequences of a shopping center as

“entering/exiting the shop”,
“passing, or browsing in front of a shop window”.

These activities are recognized by using a priori knowledge about the layout of the scene.

[Fuentes and Velastin, 2005] present an event detection algorithm based on motion trajectories. Position, speed and people density are used to create low-level representations of predefined events. A semantic description is associated to these events. The system then raises alarms to the surveillance operator. Events detected are

“unattended luggage”, “falls”, “people hiding”, “vandalism”, “fights”, “intrusion in forbidden areas”, and “attacks”,

the latter being pre-stages to fight, indicated by a person getting too close to another person by entering her/his social or personal zone.
[Lou et al., 2002] propose a framework for semantic interpretation of vehicle and pedestrian behaviors. The object trajectories are analyzed using dynamic clustering and classification on which the high level semantic is based. Spatial information is used to combine trajectories into clusters and then dynamic information is employed to arrange the trajectories in each cluster into classes. Those are
“Move Forward”,
“Turn Right”,
“Turn Left”, and
“Stop”.

The natural language description follows the rule
“(The Obj) (Action) in (The place name) [at (high/low/middle) speed]”.
The places’ names are hand labeled. The application is visual traffic surveillance.
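
The template rule quoted above can be sketched as a small sentence generator. The function name, its parameters, and the example values are hypothetical; only the slot structure and the optional speed clause follow the rule as quoted.

```python
def describe(obj, action, place, speed=None):
    """Fill the template '(The Obj) (Action) in (The place name) [at ... speed]'."""
    sentence = f"{obj} {action} in {place}"
    if speed is not None:
        # The rule allows exactly three speed levels in the optional clause.
        if speed not in ("high", "low", "middle"):
            raise ValueError("speed must be 'high', 'low' or 'middle'")
        sentence += f" at {speed} speed"
    return sentence + "."


if __name__ == "__main__":
    print(describe("The car", "turns left", "the crossing", speed="high"))
    print(describe("The pedestrian", "stops", "the parking lot"))
```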

2.5 Event detection and representation

In this section those systems employing an event model to some extent are presented.

[Pinhanez and Bobick, 1998] define a representation for the temporal structure inherent in human actions and a method for using that representation to detect occurrences of actions. The hierarchy developed starts at the “sensor information” which is followed by “events” constituting “sub-actions” and finally “actions”. The temporal structure, the PNF Propagation (past-now-fut, with “fut” standing for “future”), is employed to detect and remove inconsistent situations. To employ PNF, interval algebra constraint networks (IA-networks, [Allen, 1984]) are used. Those IA-networks are mapped into PNF-networks. A PNF network is a binary constraint satisfaction network where the domain of all variables is the set of symbols m = {past, now, fut}. The actions presented are

“pick-up bowl”, “wrapping chicken”, and “mixing ingredients”.

“pick-up-bowl”, for instance, is defined as
“reach-for-bowl” AND “grasp-bowl”.
The sensor information is e.g. coded as
“DET:hands-close-sta-bowl” or “DET:bowl-on-table”.

To recognize multi-agent action, [Hongeng and Nevatia, 2001] create a framework for representing and visually recognizing complex multi-agent action. “Complex” is an action that contains many components occurring in (typically) a partially ordered temporal relation to one another and that are subject to certain logical constraints. “Multi-agent” results in parallel event streams that interact in temporal (typically causal) ways. The task and the domain developed is recognizing American football plays. The approach is driven by the idea of [Grimson and Lozano-Pérez, 1985] that massive low order consistency typically implies correctness. The representational elements are
1. to define a temporal structure description of the global behavior (“individual”, “local goals”, “events”) with relations coded as temporal constraints,
2. to define for each basic element a visual network that detects the occurrence of goals or events, and
3. to construct a multi-agent belief network reflecting the temporal structure of the action.

Temporal relations are expressed with Allen’s interval algebra [Allen, 1983].
[Park and Aggarwal, 2004] use event hierarchy as a method to represent two-person interactions at a semantic level with a natural language description. Interactions consist of two single-person actions, which consist of torso and arm/leg movement. For the representation, those triplets are used:

<agent-motion-target>

The system performs its reasoning based on body-part gestures [Park and Aggarwal, 2003] as an elementary event of motion being composed of a sequence of instantaneous poses at each frame. The interaction hierarchy is defined as interaction - action - gesture (dynamic) - pose (static). The transformation rules are determined by domain-specific knowledge about human interactions. Examples of human interactions are “hugging” and “punching”.

[Kojima et al., 2002] describe activities by tracking skin regions (face and hands). By associating visual features of head and hand motion with natural language concepts, syntactic components such as verbs, objects, etc. are determined and translated into natural language. The position of the head implies not only the position where a person is but also a posture, i.e. whether (s)he is standing or sitting. The direction of the head implies what (s)he is looking at, and the positions of the hands imply gestures and interactions with objects. Other components necessary for a sentence are determined using knowledge about objects and equipment in a scene. A concept hierarchy of body actions is defined from “be” as the root, “move” and “not move”, specialized by “move slow/fast” and “stay high/low”, respectively.
[Gritai et al., 2004] propose an approach to matching human actions using semantic correspondence between human bodies. They make implicit use of the laws governing body proportions to derive geometric constraints for matching. Instead of using a single point for representation, the usage of several points on the actor for action recognition is being explored. Instead of two camera views, geometric constraints with respect to two actors performing an action are used. Each point represents the spatial coordinate for an anatomical landmark on the human body. Eight points are required in each frame; at least one must correspond to the body part directly involved in the action. Actions are recognized by measuring the similarity of posture at each corresponding time instant. (Staged) scenarios are

“walking”,
“bending down to grasp an object”,
“lifting the object”,
“walking away”, and
“the ‘Egyptian’ gait”.

In [Stern et al., 2003], a system using Fuzzy Expert System models to describe a scene is presented. The terms of description are the number of people and people groups in the scene, their actions as
“walking toward/away from the camera”,
“standing still”,
“departing from another person”,
“walking with another person”, or
“joining a group”.
The object classification is performed via a Static Expert Model, employing fuzzy rules such as
“if Area = very-small Then not-a-person”.
For action identification Dynamic Expert System models are used, e.g.
“if X-movement = slightly-right and Y-movement = almost-no-change Then Velocity = standing and Direction = none”.
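
How a fuzzy rule of the kind quoted above might be evaluated can be sketched as follows. The triangular membership functions, their ranges, and the use of the minimum as the fuzzy AND are assumptions made for illustration; the source does not state which membership functions or operators the cited system uses.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 1 at `peak`, falling to 0 at `left` and `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def standing_degree(x_movement, y_movement):
    """Degree to which 'Velocity = standing' holds for the rule quoted above."""
    # Hypothetical fuzzy sets over pixel displacement per frame.
    slightly_right = triangular(x_movement, 0.0, 1.0, 2.0)
    almost_no_change = triangular(y_movement, -1.0, 0.0, 1.0)
    # Fuzzy AND taken as the minimum of both membership degrees.
    return min(slightly_right, almost_no_change)


if __name__ == "__main__":
    print(standing_degree(1.0, 0.0))
    print(standing_degree(0.5, 0.5))
    print(standing_degree(3.0, 0.0))
```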
[Xiang et al., 2002] present an approach for modeling temporal events on the local intensity temporal history of pixels. Pixel-level events like
“car stopping”, “people browse”, or “object removal”
are detected by Pixel Change History with a background model. On the next hierarchic level, blob-level events are detected using clustering via the Expectation-Maximization algorithm. As use case, a shopping scenario is presented:
“take a can and exit without paying”.

The scenario is represented by the event classes

“can taken”, “entering and leaving”, “shopkeeper”, “browsing”, and “paying”.

The event classes are learned automatically and labeled manually.
In [Ivanov and Bobick, 2000] the recognition of activities and interactions between multiple agents is addressed. The recognition problem is hereby divided into two levels:

1. the independent probabilistic event detectors to propose candidate detections of low-level features and
2. taking the output of level 1 as input for a stochastic context-free grammar parsing mechanism.

The advantage of the second level employed is described as to provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of events in a given domain. Level 1 is for the recognition of primitives (statistical), while level 2 recognizes structure (syntactical) of the scene. The system is applied to a parking lot scenario:

“driving into the parking lot and leaving the parking lot on foot”,
“people being dropped off/picked up” etc.

The scenarios are constructed from the primitives detected by the tracking event generator:

“car-enter”, “person-enter”, “car-found”, “person-found”, “object-lost”, and “object-stopped”.

Thus, these symbols consist of tracking system output and contain little or no semantic labels.

The IBM Smart Surveillance System [Hampapur et al., 2004] (formerly PeopleVision) is designed with the intent of making currently developed surveillance systems “smart”. The system assumes largely static cameras. Upon the inputs of these cameras, real-time video based alerts are generated:

1. motion detection: movement of any object within a specified zone,
2. directional motion detection: specific direction of movement,
3. abandoned object alarm: objects which are abandoned,
4. object removal: movements of a user-specified object that is not expected to move, and
5. camera move/blind: when the camera has been tampered with.

These alerts are solely based on movement. In addition to real-time alerts, a viewable video index for Automatic Forensic Video Retrieval (AFVR) is generated, containing indices for the number of objects, classification (single person, group of people, vehicles), object properties (color, texture, shape, size), movement properties (position, velocity, trajectory), occlusion parameters (when objects are occluded), background changes due to changes in lighting and stopping of moving objects, and event information: any events that may be flagged by the engine.
[Guler et al., 2003] present a video event detection and mining framework for event detection, annotation, and content browsing including a video analysis database. The event detection part regards events in a hierarchical three-level structure consisting of

1. the tracking data,
2. simple behaviors like “enter”, “wait”, or “pick-up”, and
3. higher-level activities constituted by these simple behaviors as “meeting”, “package drop-off”, or “exchange between people”.

The event analysis is based on split and merge information. The second and third level of event detection are performed by a two-level Hidden Markov Model, the first (hidden level) representing the second level of the event structure.

[Foresti et al., 2004] present a system for event classification in parking lots. They distinguish simple and complex events. Simple events are “moving objects”, e.g.

“vehicle moving in an allowed area” or
“pedestrians walking with typical trajectories”.

Complex events are a “set of temporally consecutive simple events”. The object trajectories are approximated by Bézier curves. Objects are classified as “vehicles” or “pedestrians”. Event recognition is executed by taking into account that an event is characterized by a set of classified objects over a sequence of consecutive frames. Three types of alarms are generated:

normal events,
suspicious events: “pedestrians walking with trajectories not always rectilinear”, “pedestrians moving around a vehicle”, and
dangerous events: “pedestrians/vehicles moving/stopping in not allowed areas”, “pedestrians moving with atypical trajectories”.

An offline Event Database contains models of those types. The features are the object class and the parameters of the Bézier fitting. These are used as input for the training of an AHNT (adaptive high order neural tree). An Active Event Database is storing detected simple events. Old events are eliminated via an age counter. An automatic procedure checks if some of these events are spatially or temporally related. If so, a composite event is generated. Composite events are

“vehicle entering”, “vehicle moving”, “person exiting vehicle”, “person moving”, or “person exiting parking area”.

Composite events are defined by the operator.
[Hongeng et al., 2004] present a concept to represent activity and recognition methods employing this representation. An activity is composed of action threads, each thread being executed by a single actor. A single-thread event represents characteristics of trajectory and shape, e.g. “approaching a reference person” or “heading toward”. A multi-thread event is composed of several action threads related by temporal constraints and is represented by an event graph, similar to interval algebra networks [Allen and Ferguson, 1994]. Mostly, multi-thread events incorporate several actors, i.e. they are multi-agent events. Events are organized into several layers of abstraction (from [Medioni et al., 23]). The approach is closely related to [Ivanov and Bobick, 2000] as external knowledge is incorporated into the expected structure of the activity model. For the scenario recognition, three steps are employed:
1. detect and track moving objects,
2. compute object properties using “User Provided Context” (spatial and task context), and
3. in parallel match scenarios to a “Library of Event Models”.

Scenarios are defined from a set of properties or sub-scenarios building a hierarchical structure. The event representation at the scenario level maps to how a human would describe events. Single-thread events are recognized by a naive Bayes classifier, complex ones by Bayesian Networks, as are multi-thread events. Presented are the events

“stand” and “crouch”

as shape-based events,

“approach”, “stop at”, and “slow down”

based on trajectory, and

“moving along the path”

with regard to the geometrical zone.
[Cupillard et al., 2004] present a system performed in the framework of the European project ADVISOR [Naylor and Attwood, 2003]. It is an approach for the online recognition of individual, group of people, or crowd behavior in metro surveillance context employing multiple cameras. For the representation of scenarios a hierarchy is used, deriving from an “Entity” the “Scenario” with “State” and “Event” on the one branch and “Scene-Object” on the other, with “Static” (“Equipment”, “Zone”) and “Mobile” (“Person”) as its specializations. Input for the scenario recognition are

1. scenario models defined by experts,
2. geometric information of the observed environment, and
3. persons tracked by a vision module, which is supposed to do that correctly.

The system employs a “graph of solutions” for each person with nodes like

“close to”, “far from”,
“moves close to”,
“moves away from”,
“stays at”, and “vandalism”.

The formalism is based on three main ideas:

1. define various operators (software modules) for recognition,
2. have all knowledge needed in the corresponding operator, and
3. description of the operator shall be declarative to build an extensible library of operators.

The behavior representation is actor oriented, where an actor is any scene object involved in behavior: static objects, zones of interest, individuals, group of people, or crowd. For each tracked actor, the behavior recognition module performs three levels of reasoning: “states”, “events” and “scenarios”.

The recognition process employs four concepts:
basic properties (trajectories, speed, direction),
states (a situation characterizing an actor at a certain time as “an individual is walking” or “the trajectory is straight”),
events (change of state, e.g. “a group enters a zone of interest”), and
scenarios as the combination of states, events, or sub-scenarios.
A framework of operators is used: to recognize a scenario, the operators are arranged hierarchically: the bottom is composed of states while the top (after possible intermediate levels) corresponds to the scenario to be recognized. The output of the operators is boolean: the event is either detected or not detected. The scenarios presented are
“fraud” (jumping over the turnstiles),
“fighting”, “blocking”, “vandalism”, and “overcrowding”.
Behaviors are specific scenarios defined by the user. In [Vu et al., 2003], these scenarios are “pre-compiled” for better recognition. Input for the system is contextual a priori knowledge (scenario models, geometric & semantic information about the scene) and the video stream. The recognition process uses only the knowledge represented by experts through scenario models. For each model of a scenario instance, the set of actors with actor variables, the set of sub-scenario instances (elementary scenarios, composed scenarios), and the set of constraints are defined. The system is applied to
“Bank attack” and
“Vandalism against a ticket machine”.
[Bourbakis et al., 2003] define a model for representing, recognizing and interpreting human activity. The model is based on the hierarchical synergy of three other models: the Local/Global (L-G) graph, the Stochastic Petri Net (SPN) graph, and a neural network (NN) model. The authors make a distinction between structural knowledge (knowledge about physical state) and functional knowledge (knowledge about change and events) and model the temporal relationship among a set of different object temporal events in the scene. They develop a Dynamically Multi-Linked Hidden Markov Model (DML-HMM) to interpret group activities involving multiple objects captured in an outdoor scene. The field of application are airport cargo activities with events like

“movingTruck”, “movingCargo”, “movingCargoLift”, or “movingTruckCargo”.

An approach based on force dynamics is presented by [Siskind, 2001],
building a system for recognizing the occurrence of events described by
simple spatial-motion verbs in short image sequences. The semantics of these
verbs are specified with event-logic expressions that describe changes in the
states of force-dynamic relations between the participants of the event. The
example of “A hand is picking up a block” is described as follows.

A pick up event is characterized as a change from a state where
the patient is supported by a substantiality constraint with the
source to a state where the patient is supported by being attached
to the agent.

[Fern, 2004] extends [Siskind, 2001] by developing a supervised learning
algorithm for automatically acquiring high-level visual event definitions
from low-level force dynamic interpretations of video. A temporal
event-description language is introduced: AMA, “And’s of “Meet’s and And’s”
(MA timelines)”. An MA timeline is the succession of one state being true
for a time interval and a second state being true for a second time interval
meeting the first. AMA is the conjunction of MA timelines. Algorithms and
complexity bounds are given for the AMA subsumption and generalization
problems, a learning method is developed based on these algorithms and
applied to learning event definitions from video.
[Ghanem et al., 2004] represent and recognize events using Petri Nets
to build an interactive system for querying surveillance video about events.
The queries may not be known in advance and have to be composed from
primitive events and previously defined queries. Petri Nets are used as both
representation and recognition methods.

“For simple activities whose structure is known in advance and
can be easily learned from training data, stochastic inference can
be used. On the other hand, for higher level events that include
temporal combinations of other events, deterministic inference
seems preferable.”

A graphical user interface is used for formulating queries. These queries
are then mapped into a set of Petri Nets that represent components of the
query. [Ghanem et al., 2004] also use an event ontology defining states,
events, composite events/scenarios, and relations. Objects are assumed to
be provided by an intermediate vision layer. An event is represented by a
transition in the Petri Net; a primitive event by a conditioned transition
(with the condition that the primitive event has been detected), a composite
event by a hierarchical transition (a predefined Petri Net being used as one
block) whose structure is derived from the event structure. The system is
used on a parking lot, with events like
“counting cars” and “car exchange”.
[Hakeem et al., 2004] propose a representation of events in videos, based
on the CASE representation of natural languages. They point out the
importance of causal and temporal relationships between sub-events. In order
to capture multi-agent and multi-threaded events, a hierarchical CASE
representation of events, CASE^E, is developed. By mapping scenes to an
event-tree, events are recognized via sub-tree matching. The presented scenario
is railroad crossing.
[Hakeem and Shah, 2004] propose a framework for classification of meeting
videos. This framework is utilized to analyze human motion data to
perform automatic meeting classification. A rule-based system and a state
machine are employed to analyze the videos, utilizing three levels of context
hierarchy: movements and their attributes, events (= actions), and behavior.
By these, activities are identified and the meeting type is classified, based
on the meeting ontology. The rule-based system is the primary framework
manager, which recognizes behaviors based on the events detected by the
state machine. In the ontology, the relationships between movements form
events that have a relationship with each other to form behaviors. Different
behaviors form genres and the meeting ontology.
As an approach for event modeling and analysis with semantic
representations [Xin and Tan, 2005] propose a system that integrates all related
information in a hierarchical conceptual model (namely an ontology) and
defines events as significant changes and mappings of conceptual units in the
model. Three basic event components form the concept:
entities, words, and a set of attributes.
The lower level of the framework extracts features (the words) from the
content, while the upper level semantical representations of events are received
using these words. Events are treated as obvious feature changes; change is
the trigger of an event. Different attributes of regions and moving objects are
entities. The scene is divided into different regions that are labeled manually:
grassplot, road, sideway, intersection, crosswalk.
To describe moving objects, the motion states
“move”, “halt”, “stop” are used. Trajectories are characterized through
“go straight”, “turn right”, “turn left”, “retrace”.
Interactions between moving objects and special regions are described through
their spatial relations: “occupy”, “enter”, “transfer”, “appear”.
Interactions of two moving objects are
“close to”, “away from”, “encounter”, “follow”.
The hierarchy of ontologies consists of three levels. The first level is the
layout of the scene and associated restraints: usable features of the region
“Road” include that vehicles and other moving objects can move on it. The
next level contains the moving objects’ ontology containing the states of
motion and the concepts describing interactions mentioned above. In a final
level, the semantic ontology represents what occurs in the scene. To measure
the similarity of semantic concepts, a method that uses the Conceptual Status
Vector and the Weighted Semantic Distance (WSD) is proposed. The WSD
measures the distance between two conceptual status vectors. When the
semantic distance is larger than a threshold learned from training data, an
event happens and a semantic representation of this event is created.

2.6 Ontologies for Video Events
[Nevatia et al., 2004] define a formal language for describing an ontology of
events, VERL (Video Event Representation Language), and a language to
annotate instances of the events described in VERL: VEML (Video Event
Markup Language). Different types of composition of events are defined:
“if our hypothesis of decomposing complex events into simpler
events is valid and if the number of primitive events is limited,
we can overcome the complexity of representing the wide variety
of events seen in the real world.”
Objects have properties, attributes, and relations. States are defined by
the object’s properties, attributes, and relations at a given time. Events
can occur at a time instant or interval. The ontology defines types, derived
subtypes, expressions, and operators to describe events. As an example for
syntax and inference rules, “carry” is presented/defined as in equation 2.2:

PROCESS(carry(x, y, a, b, t),
        AND(hold(x, y, t), move(x, a, b, t)))    (2.2)

Primitive events of a mobile object can be

“speed-up”, “slow-down”, “start”, “stop”, “turn-right”, or “turn-left”.


VERL is a programming language. VEML encodes instances of VERL in
XML. See section 3.2.3 for the syntax of VERL.
[Brémond et al., 2004] introduce another ontology to represent video event
knowledge for automatic video interpretation. Two main concepts are
embodied:

1. physical objects and
2. video events.

A video event can be

a primitive state, a composite state, a primitive event, or
a composite event.

Primitive states are atoms to build other concepts. A primitive event is
a change of state. A physical object can be a static object, like a desk,
or a mobile object, like a person or a car. They have a class, attributes
and “liveliness”, the ability to either be moved and/or move by initiating
their own movement, then being mobile objects. Contextual objects are
walls, doors, chairs, suitcases, etc. The relationships between the presented
concepts are either vision based, spatial, or spatio-temporal. A syntax to
describe these concepts is also proposed.
[Allen, 1983] introduces an interval-based temporal logic and a reasoning
algorithm based on constraint propagation. To maintain temporal relations,
thirteen relationships to express any relationship which can hold between
two intervals are defined:

1. BEFORE
2. MEET
3. OVERLAP
4. DURING
5. START
6. FINISH

and their inverses, and additionally

13. EQUAL

See figure 2.2 for a visual representation of those relations.
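
The thirteen relations can be illustrated with a small classifier. The following Python sketch is not taken from [Allen, 1983]; it assumes intervals are given as numeric (start, end) pairs with start < end, and the "-INVERSE" suffix is a naming convention invented here for the six inverse relations.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), assumed to satisfy start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the single interval relation that holds between a and b."""
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must have positive length")
    if a0 == b0 and a1 == b1:
        return "EQUAL"
    if a1 < b0:
        return "BEFORE"       # a ends strictly before b starts
    if a1 == b0:
        return "MEET"         # a ends exactly where b starts
    if a0 == b0 and a1 < b1:
        return "START"        # same start, a ends first
    if a1 == b1 and a0 > b0:
        return "FINISH"       # same end, a starts later
    if a0 > b0 and a1 < b1:
        return "DURING"       # a lies strictly inside b
    if a0 < b0 < a1 < b1:
        return "OVERLAP"      # a starts first and ends inside b
    # No base relation holds for (a, b), so one holds for (b, a).
    return allen_relation(b, a) + "-INVERSE"
```

For any two proper intervals exactly one of the thirteen names is returned, which mirrors the claim that the relations cover every possible configuration.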


Figure 2.2: The thirteen relationships to express any relationship that can
hold between two intervals.

[Allen, 1983] also addresses the problem of ‘persistence’: a state is true
until discovered otherwise. This is modeled by applying constraints to
intervals.

Chapter 3

Employed Techniques and Tools

3.1 Introduction

This chapter presents the underlying theoretical fundamentals. It gives an
introduction to the principle of ontologies as a tool for knowledge
representation and reasoning. The syntax of the ontology language used to express
the rules of the semantic concept presented in this work is summarized
separately. The employed software is shown, and the utilized retrieval
quality measures are introduced.

3.2 Ontologies

Ontologies were developed in Artificial Intelligence to enable sharing and
reuse of knowledge. An ontology provides a shared and common understanding
of a domain that can be communicated between people and heterogeneous
and widely spread application systems. It is also an explicit conceptualization
(i.e. meta information) that describes the semantics of the data.

3.2.1 Ontology design

Reusability is one of the most important features. Research focusses on
building technologies enabling the large-scale reuse of ontologies. To achieve
the requirements set by reusability, an ontology must consist of small modules
with a high internal coherence and a limited amount of dependencies between
the modules. Gruber expressed design principles for ontologies
in 1995. There is a need for objective criteria to guide and evaluate the
designs. A preliminary set of design criteria for ontologies whose purpose is
knowledge sharing and interoperability among programs based on a shared
conceptualization is [Gruber, 1995]:
1. Clarity: An ontology should effectively communicate the intended
meaning of defined terms. Definitions should be objective. While the
motivation for defining a concept might arise from social situations or
computational requirements, the definition should be independent of
social or computational context. Formalism is a means to this end.
When possible a definition should be stated in logical axioms. Where
possible, a complete definition (a predicate defined by necessary and
sufficient conditions) is preferred over a partial definition (defined by
only necessary or sufficient conditions). All definitions should be
documented with natural language.
2. Coherence: An ontology should be coherent: that is, it should
sanction inferences that are consistent with the definitions. At the least,
the defining axioms should be logically consistent. Coherence should
also apply to the concepts that are defined informally, such as those
described in natural language documentation and examples. If a
sentence that can be inferred from the axioms contradicts a definition or
example given informally, then the ontology is incoherent.
3. Extendibility: An ontology should be designed to anticipate the uses
of the shared vocabulary. It should offer a conceptual foundation for
a range of anticipated tasks, and the representation should be crafted
so that one can extend and specialize the ontology monotonically. In
other words, one should be able to define new terms for special uses
based on the existing vocabulary, in a way that does not require the
revision of the existing definitions.
4. Minimal encoding bias: The conceptualization should be specified
at the knowledge level without depending on a particular symbol-level
encoding. An encoding bias results when representation choices are
made purely for the convenience of notation or implementation.
Encoding bias should be minimized, because knowledge-sharing agents
may be implemented in different representation systems and styles of
representation.
5. Minimal ontological commitment: An ontology should require the
minimal ontological commitment sufficient to support the intended
knowledge sharing activities. An ontology should make as few claims as
possible about the world being modeled, allowing the parties
committed to the ontology freedom to specialize and instantiate the ontology
as needed. Since ontological commitment is based on consistent use
of vocabulary, ontological commitment can be minimized by
specifying the weakest theory (allowing most models) and defining only those
terms that are essential to the communication of knowledge consistent
with that theory.

3.2.2 Ontology Languages

Recently, there has been a growing number of notations to describe the
structure and the semantics of the exchanged data in the World Wide Web. XML
(Extensible Markup Language) is the basis for several of these new standard
candidates. It is a simple, flexible text format derived from SGML (ISO
8879). XML is a W3C¹ recommendation designed to meet the challenges of
large-scale electronic publishing and is also playing an increasing role in the
exchange of a wide variety of data on the Web and elsewhere. It has been
adopted as a means of interchanging information between computer
programs. In particular it is widely seen as the best solution for the interchange
of metadata about stored objects and programs (e.g. the Open Software
Description) and for the interchange of commercial information (e.g. Open
Financial Exchange) [Hunter, 2001].
The Resource Description Framework (RDF) makes up a standard for
describing the semantics of information via metadata descriptions [Lassila
and Swick, 1999]. XML schemas give a standard for describing structure and
semantics of the data. The transformation language XSL provides a standard
for describing mappings between different terminologies [Clark, 1999]. RDF
is a foundation for processing metadata. It provides interoperability between
applications that exchange machine-readable information on the Web. RDF
emphasizes facilities to enable automated processing of Web resources. It is
also designed for representing data. RDF provides a basic
object-attribute-value data model for meta-data. Other than these intended semantics,
described only informally in the standard, RDF makes no data-modeling
commitments. In particular, no reserved terms are defined. Just like XML,
the RDF data model provides no mechanisms for declaring property names
that are to be used [Lassila and Swick, 1999].
The RDF Schema provides a representation formalism and basic ontological
modeling primitives. It defines classes, subclasses, subproperties, domain
and range restrictions of properties, etc. in a web-based context.
Furthermore, it provides information about the interpretation of the statements given
in the RDF data model, but it does not constrain the syntactical appearance
of an RDF description. The RDF Schema lets developers define a particular
vocabulary for RDF data and lets them specify the kinds of objects to which
these attributes can be applied. In other words, the RDF Schema mechanism
provides a basic type system for RDF models. This type system uses
some predefined terms, such as Class, subPropertyOf and subClassOf. RDF
Schema expressions are also valid RDF expressions. RDF objects can be
defined as instances of two or more classes using the type property [Davies
et al., 2002].

¹ World Wide Web Consortium

Ontology Inference Layer
The Ontology Inference Layer OIL is a proposal for a web-based
representation and inference layer for ontologies. It combines the widely used modeling
primitives from frame-based languages with the formal semantics and
reasoning services provided by description logic. It is compatible with the RDF
Schema and includes a precise semantics for describing term meanings and
thus also for describing implied information.
OIL presents a layered approach to a standard ontology language. Each
additional layer adds functionality and complexity to the previous layer. This
is done so that agents (humans or machines) who can only process
a lower layer can still partially understand ontologies that are expressed in
higher layers [Ont, 2000].

DARPA Agent Markup Language
The DARPA Agent Mark-Up Language (DAML) is designed as an XML-based
semantic language that links the information on a page to machine-readable
semantics. The goal of the DAML program is to create technologies
that enable software agents to dynamically identify and understand
information sources, and to provide interoperability between agents in a semantic
manner. The DAML research plan includes six tasks [DAM, 2000]:

1. To create an Agent Mark-Up Language (DAML) built upon XML that
allows users to provide machine-readable semantic annotations for
specific communities of interest.
2. To create tools that embed DAML markup onto web pages and other
information sources in a manner that is transparent and beneficial to
the users.
3. To use these tools to build up, instantiate, operate, and test different
sets of agent-based programs that markup and use DAML.
4. To measure, via empirical experimentation, the productivity
improvements provided by these tools.
5. To apply these tools to third party agent development, military-specific
problems, and support for the intelligence community so as to evolve
DAML technologies towards large-scale use.
6. To insert the DAML to the commercial and military markets via
partnerships with industrial and defense-related (C2 and intelligence)
organizations.

DAML+OIL
DAML+OIL is a semantic mark-up language built on earlier W3C standards
such as RDF and RDF Schema. It extends these languages with richer
modeling primitives. DAML+OIL provides modeling primitives commonly
found in frame-based languages [Genesereth and Fikes, 1987].
The language has a well defined semantics. A DAML+OIL knowledge base
is a collection of RDF triples. DAML+OIL prescribes a specific meaning for
triples that use the DAML+OIL vocabulary. The model-theoretic semantics
specify exactly which triples are assigned a specific meaning, and what this
meaning is. DAML+OIL only provides a semantic interpretation for those
parts of an RDF graph that instantiate the schema. Any additional RDF
statements, resulting in additional RDF triples, are perfectly allowed, but
DAML+OIL is silent on the semantic consequences (or lack thereof) of such
additional triples [Smith et al., 2002].

Web Ontology Language
The Web Ontology Language OWL is intended to provide a language that
can be used to describe the classes and the relations between them that are
inherent in Web documents and applications. The Web Ontology Language
OWL is a semantic markup language for publishing and sharing ontologies
on the World Wide Web. OWL is developed as a vocabulary extension of
RDF and is derived from the DAML+OIL Web Ontology Language. In OWL, an
ontology is a set of definitions of classes and properties, and constraints on
the way those classes and properties can be employed. An OWL ontology
may include the following elements:

• taxonomic relations between classes,
• datatype properties, descriptions of attributes of elements of classes,
• object properties, descriptions of relations between elements of classes,
and, to a lesser degree,
• instances of classes, and
• instances of properties.

Datatype properties and object properties are collectively the properties
of a class. OWL is a set of three increasingly complex languages: OWL Lite,
OWL DL and OWL Full.
OWL Lite has been defined with the intention of creating a simple
language that will satisfy users primarily needing a classification hierarchy and
simple constraint features. For example, while it supports cardinality
constraints, it only permits cardinality values of 0 or 1. For these reasons, it
should be simpler to provide tool support for OWL Lite than for its more
complex relatives.
OWL DL includes the complete OWL vocabulary, interpreted under a
number of simple constraints. Primary among these is type separation.
Class identifiers cannot simultaneously be properties or individuals.
Similarly, properties cannot be individuals. OWL DL is so named due to its
correspondence with description logic.
OWL Full includes the complete OWL vocabulary. It interprets this
vocabulary more broadly than OWL DL, with the freedom provided by RDF. In
OWL Full, a class can be treated simultaneously as a collection of individuals
(the class extension) and as an individual in its own right (the class intension).
Another significant difference from OWL DL is that a DatatypeProperty can
be marked as an InverseFunctionalProperty. These are differences that will
be of interest to the advanced user [Smith et al., 2002].

3.2.3 VERL

In the summer and fall of 2003 the Advanced Research and Development
Activity (ARDA) of the U.S. Government sponsored the “Challenge Project
on Video Event Taxonomy”. The result was a formal language for describing
an ontology of events, called VERL (Video Event Representation Language).
A companion language called VEML (Video Event Markup Language), used to
annotate instances of the events described in VERL, was also developed. This
section shows the syntax and application of VERL and VEML, as presented
in [Bolles and Nevatia, 2004].


Introduction
VERL is intended to be a language for representing events for the purpose of
designing an ontology of the domain of an application and for annotating data
with the categories in that ontology. The first version of the language, VERL
1.0, was defined as part of an ARDA 2003 challenge project and described in a
January 2004 report. This document describes modifications of the language
in response to feedback from users of the language in defining ontologies. It
corrects some errors, provides additional constructs, and clarifies semantics.
This language is VERL 2.0.

The Syntax of VERL
This section sketches how to use VERL in its two
primary tasks: annotating data and defining composite events. The former is
done first because it is easier to understand.

The language is very simple, but this simplicity is deceptive. Section 3.2.3
provides a careful development of the language, giving its semantics.

Annotating Data
An annotation is a pair consisting of a thing in a VERL
ontology and a designation of a location in the video data.

<thing, loc>

The thing describes a state or event, or an entity such as a physical
object. Nothing is said here about how the locations are to be specified.

Types
There are three basic types in the language. Everything is a thing.
There are two types of things. The type ent encompasses entities and
generally may be thought of as physical objects, although in some applications
it can be used more broadly. The type ev encompasses states and events. A
state is a property of something holding over a period of time. Normally, a
person would be of type ent, and his running would be of type ev. The type
hierarchy is

      thing
      /   \
    ent    ev


In specific applications it is possible to expand this hierarchy to more
specific types. For example, one might introduce person and vehicle as subtypes
of type ent.

VERL Expressions
Constants may be of any one of the three types. Variables
may range over any one of the three types. A VERL expression (vexpr)
is defined as follows:
A constant or variable is a vexpr.

vexpr → constant | variable

For example, “John,” “X1,” “Fire-1,” and “E1” may all be vexprs. The
type of the vexpr is the type of the constant or variable. Thus, “John” is an
entity constant, “E1” will be an event variable if, for example, it refers to
John’s running, and so on.
A function symbol applied to the appropriate number of vexprs as arguments
is a vexpr.

vexpr → fct([vexpr {, vexpr}*])

Square brackets [...] indicate something is optional; here the function
may have no arguments. Curly brackets {...} group elements together. The
Kleene star * means zero or more instances of the preceding group. The
arguments must be of the right type.
A predicate symbol applied to the appropriate number of arguments is a
vexpr.

vexpr → pred([vexpr {, vexpr}*])

The arguments must be of the right type. The result is always of type ev.
For example, if change is a predicate symbol relating two things of type ev
and E1 and E2 are event variables, then change(E1, E2) is a vexpr of type ev.


A logical operator applied to the appropriate number of vexprs of type ev is
a vexpr.

vexpr → AND(vexpr {, vexpr}*) |
        OR(vexpr {, vexpr}*) |
        IMPLY(vexpr, vexpr) | NOT(vexpr) |
        EQUIV(vexpr, vexpr)

AND and OR take one or more arguments. IMPLY and EQUIV take
two arguments. NOT takes one argument. The result is always of type ev.
For example,

AND(change(E1, E2), change(E3, E4))

is a vexpr, and the result is the event consisting of the aggregate of
change(E1, E2) and change(E3, E4). NOT(change(E1, E2)) is a vexpr
and the result is the state of absence of the event change(E1, E2).
A constant or variable can be used as a label on a vexpr.

vexpr → {constant | variable} : vexpr

The resulting vexpr refers to the same thing as its constituent vexpr and
of course is of the same type. The label can then be used elsewhere to refer
to that thing.
Variables occurring in annotations are assumed to be existentially
quantified with the outermost scope. Variables in annotations are thus equivalent
to constants, and no formal distinction between the two need be made.
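
The grammar and typing rules above can be made concrete with a small checker. The following Python sketch is only an illustration, not part of VERL or of [Bolles and Nevatia, 2004]: the class names `Atom` and `Apply`, the `PREDICATES` table, and the function `vexpr_type` are hypothetical, and function symbols and labels are left out for brevity.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    """A constant or variable with its declared type: 'thing', 'ent' or 'ev'."""
    name: str
    type: str


@dataclass(frozen=True)
class Apply:
    """A predicate or logical operator applied to argument vexprs."""
    symbol: str
    args: Tuple["VExpr", ...]


VExpr = Union[Atom, Apply]

# Number of arguments per logical operator; None means "one or more".
LOGICAL_ARITY = {"AND": None, "OR": None, "NOT": 1, "IMPLY": 2, "EQUIV": 2}
# Hypothetical predicate signature table (argument types), for illustration.
PREDICATES = {"change": ("ev", "ev"), "located-at": ("thing", "ent")}


def is_a(actual: str, required: str) -> bool:
    """ent and ev are both things; otherwise the types must match exactly."""
    return actual == required or required == "thing"


def vexpr_type(e: VExpr) -> str:
    """Return the type of a vexpr, raising TypeError if it is ill-typed."""
    if isinstance(e, Atom):
        return e.type
    actual = [vexpr_type(a) for a in e.args]
    if e.symbol in LOGICAL_ARITY:
        arity = LOGICAL_ARITY[e.symbol]
        if (arity is None and not actual) or (arity is not None and len(actual) != arity):
            raise TypeError(f"wrong number of arguments for {e.symbol}")
        required = ["ev"] * len(actual)
    elif e.symbol in PREDICATES:
        required = list(PREDICATES[e.symbol])
        if len(required) != len(actual):
            raise TypeError(f"wrong number of arguments for {e.symbol}")
    else:
        raise TypeError(f"unknown symbol {e.symbol}")
    for got, want in zip(actual, required):
        if not is_a(got, want):
            raise TypeError(f"{e.symbol}: expected {want}, got {got}")
    return "ev"  # predications and logical combinations are always of type ev
```

With `E1 = Atom("E1", "ev")` and `E2 = Atom("E2", "ev")`, `vexpr_type(Apply("change", (E1, E2)))` yields `"ev"`, whereas passing an entity constant such as `Atom("John", "ent")` to `change` raises a `TypeError`.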
This completes the basic syntax of VERL, as would be required for
annotating data.

Defining Composite Events
The basic operator for defining composite
events is PROCESS. It takes a predication and a vexpr as its two
arguments. The predication is a predicate applied to the appropriate number of
arguments, where the arguments have an optional type specification.

defn → PROCESS(pred([argspec {, argspec}*]) [, vexpr])
argspec → [type] variable

The second argument of PROCESS is optional, and if it is missing, it
is assumed the process is primitive, i.e., in this application it is directly
implemented in software. For example, if there is the predicate located-at
relating a thing to an entity, and a predicate change relating two things of
type ev, then the predicate move is defined as follows:

PROCESS(move(thing x, ent y, ent z),
        change(located-at(x, y), located-at(x, z)))

That is, for a thing x to move from entity y to entity z, there is a change
in x’s location from y to z.
PROCESS is equivalent to implication. The above statement could be
read as “If there is a change from x being located at y to x being located
at z, then there is a moving of x from y to z.” Variables occurring in the
antecedent (the second argument of PROCESS) are taken to be universally
quantified over the scope of the whole definition. Variables occurring only in
the consequent (the first argument of PROCESS) are existentially quantified.

Thus,

PROCESS(p(x, z), q(x, y))

will be interpreted as
(∀x, z)[p(x, z) → (∃y) q(x, y)]


Three other operators can be used optionally in place of PROCESS:
PRIMITIVE, SINGLE-THREAD, and MULTITHREAD. PRIMITIVE can
be used when there is no second argument to the PROCESS definition.
SINGLE-THREAD means that all constituent events in the definition
happen sequentially, without overlap. MULTITHREAD can be used when there
is no such constraint.
For example, assume located-at and change are primitive predicates.
Then a move event is a single-thread event.

PRIMITIVE(located-at(thing x, ent y))
PRIMITIVE(change(ev e1, ev e2))
SINGLE-THREAD(move(thing x, ent y, ent z),
              change(located-at(x, y), located-at(x, z)))
Defining Subtypes
The user may want to define subtypes of the types
ent and ev. This can be done with the SUBTYPE operator. It takes two
arguments, the name of the subtype and the name of the supertype.

SUBTYPE(type, type)

For example, to state that ent has subtypes mobile and immobile and
mobile has subtypes vehicle and human, one would write

SUBTYPE(mobile, ent)
SUBTYPE(immobile, ent)
SUBTYPE(vehicle, mobile)
SUBTYPE(human, mobile)
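
A type hierarchy built from such SUBTYPE declarations can be queried by walking up from a type to its supertypes. The following Python sketch shows one possible way to do this; the names `SUPERTYPE`, `subtype`, and `is_subtype` are hypothetical and not part of VERL.

```python
from typing import Dict, Optional

# Maps each type to its direct supertype; 'thing' is the root and has none.
SUPERTYPE: Dict[str, str] = {"ent": "thing", "ev": "thing"}


def subtype(sub: str, sup: str) -> None:
    """Record the declaration SUBTYPE(sub, sup)."""
    if sup != "thing" and sup not in SUPERTYPE:
        raise ValueError(f"unknown supertype {sup}")
    if sub == "thing" or sub in SUPERTYPE:
        raise ValueError(f"type {sub} is already defined")
    SUPERTYPE[sub] = sup


def is_subtype(t: str, ancestor: str) -> bool:
    """True if t equals ancestor or is a direct or indirect subtype of it."""
    current: Optional[str] = t
    while current is not None:
        if current == ancestor:
            return True
        current = SUPERTYPE.get(current)  # None once the root is passed
    return False


# The four declarations from the example above.
subtype("mobile", "ent")
subtype("immobile", "ent")
subtype("vehicle", "mobile")
subtype("human", "mobile")
```

Because every type has exactly one direct supertype in this sketch, sibling types such as mobile and immobile can never share an instance type, which corresponds to the requirement that siblings be mutually exclusive.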


Sibling types in the type hierarchy should be mutually exclusive. Thus,
if you were to specify mobile and container as subtypes of ent, you could not
have mobile containers. It is often better to treat such concepts as properties
rather than as types. In this case, the definitions would be

PRIMITIVE(container(ent x))
PRIMITIVE(mobile(ent x))

Inference Rules
In addition to annotations of specific events and
definitions of composite properties, relations, and events, one may also want to
specify inference rules that allow conclusions to be drawn from what is recognized
in the data. For this the operator RULE takes two vexprs of type ev as its
arguments.

RULE(vexpr, vexpr)

For example, suppose carry(x, y, a, b, t) (x carries y from a to b
during time interval t) is defined as x holds y during time interval t and x
moves from a to b during time interval t.

PROCESS(carry(x, y, a, b, t), AND(hold(x, y, t), move(x, a, b, t)))

Then to say that when x carries y from a to b during t, y also moves
from a to b during t:

RULE(carry(x, y, a, b, t), move(y, a, b, t))
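
How such a rule derives new facts from recognized ones can be sketched in a few lines. The following Python example is an illustration only and not an implementation from the VERL report: facts are represented as tuples, all arguments of the rule patterns are treated as variables, and every variable of the consequent is assumed to occur in the antecedent (existential variables are not handled). The names `match` and `apply_rule` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Fact = Tuple[str, ...]  # e.g. ("carry", "John", "Box", "A", "B", "T1")


def match(pattern: Fact, fact: Fact) -> Optional[Dict[str, str]]:
    """Bind the variables of pattern to the arguments of fact, if possible."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    binding: Dict[str, str] = {}
    for var, value in zip(pattern[1:], fact[1:]):
        # A variable that occurs twice must be bound to the same value.
        if binding.setdefault(var, value) != value:
            return None
    return binding


def apply_rule(antecedent: Fact, consequent: Fact, facts: Set[Fact]) -> Set[Fact]:
    """Return facts plus everything RULE(antecedent, consequent) adds in one pass."""
    derived = set(facts)
    for fact in facts:
        binding = match(antecedent, fact)
        if binding is not None:
            head = (consequent[0],) + tuple(binding[v] for v in consequent[1:])
            derived.add(head)
    return derived


facts = {("carry", "John", "Box", "A", "B", "T1")}
rule_antecedent = ("carry", "x", "y", "a", "b", "t")
rule_consequent = ("move", "y", "a", "b", "t")
result = apply_rule(rule_antecedent, rule_consequent, facts)
```

After the call, `result` contains the original carry fact together with the derived fact that the carried object moves from A to B during T1.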


Variables in the antecedent of the implication will be interpreted as
universally quantified; variables that occur in the consequent but not in the
antecedent will be interpreted as existentially quantified. Thus,

RULE(p(x, z), q(x, y))

will be interpreted as

(∀x, z)[p(x, z) → (∃y) q(x, y)]

Comments
Lines beginning with // are comments.

Control Structures
Just as in programming languages, control structures
are essential for building up complex composite events and processes; VERL
provides Sequence, Conditional, Repeat-Until, and While-Do constructs.
Sequence takes an arbitrary number of events as its arguments. It says
that these events occur in sequence, not overlapping. The order in the
sequence is the same as the order of the arguments. The resulting vexpr
describes the composite event consisting of all the argument events occurring
in sequence.

Sequence(e1, e2, ...)

Conditional takes two or three events as its arguments. It says that
if the first argument holds or obtains, then the second argument happens;
otherwise, the third argument happens, if there is a third argument.

Conditional(e1, e2)
Conditional(e1, e2, e3)


The resulting vexpr describes a piece of behavior that has been recognized
to operate in this fashion.
Repeat-Until takes two things of type ev as its arguments. The resulting
vexpr describes the composite event in which the first argument is repeated
until the second argument holds or obtains.

Repeat-Until(e1, e2)

Normally, e1 is the sort of event that changes the world in a way that
affects whether or not e2 holds or obtains. Both e1 and e2 are types of
events or states, rather than specific instances.
While-Do takes two things of type ev as its arguments. The resulting
vexpr describes the composite event in which the second argument is repeated
as long as the first argument holds or obtains.

While-Do(e1, e2)

While-Do can be defined in terms of other operators:

PROCESS(While-Do(ev e1, ev e2),
        Conditional(e1,
                    Repeat-Until(e2, NOT(e1))))
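
The reduction of While-Do to Conditional and Repeat-Until can be checked with an executable analogy. In VERL these constructs describe recognized behavior rather than executing it, so the following Python sketch is only an analogy under that caveat; it models events as callables, and the function names are hypothetical.

```python
from typing import Callable, Optional

Condition = Callable[[], bool]
Action = Callable[[], None]


def repeat_until(body: Action, condition: Condition) -> None:
    """Perform body, then repeat it until condition holds (at least once)."""
    body()
    while not condition():
        body()


def conditional(condition: Condition, then_branch: Action,
                else_branch: Optional[Action] = None) -> None:
    """Perform then_branch if condition holds, otherwise else_branch if given."""
    if condition():
        then_branch()
    elif else_branch is not None:
        else_branch()


def while_do(condition: Condition, body: Action) -> None:
    """While-Do(e1, e2) as Conditional(e1, Repeat-Until(e2, NOT(e1)))."""
    conditional(condition,
                lambda: repeat_until(body, lambda: not condition()))
```

The outer Conditional is what prevents the body from occurring when the condition does not hold initially, since Repeat-Until on its own always performs its first argument at least once.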

This completes the summary of the syntax of VERL. The following
paragraphs present several classes of predicates that are useful in building up and
relating complex structured events.

Equality
Equality is a useful relation for defining complex events which
specify, for example, cases in which the same participant plays more than
one role. Equality is represented with the predicate Equal that takes two
entities as its arguments and returns a value of “true” if they are identical.
More precisely, it returns an eventuality that exists when the two entities are
identical.

Equal(thing, thing)


Equality statements can be given labels, in which case the label refers to
the state of the two things being equal.

Temporal Relations
Representing the temporal relations among component
events is crucial in recognizing composite events. For perhaps most
applications, describing the relations among the temporal intervals occupied
by the component events, according to Allen’s interval algebra, is sufficient.
This is because agents are responding primarily to the actions of other agents
or the behavior of moving objects. Even in a case where one of the threads is
precisely timed, such as a conveyor belt in a factory, the worker is
responding primarily to the appearance of the part rather than to the passage of a
certain amount of time.
Times come in two varieties: instants and intervals. Thus,

SUBTYPE(temporalEntity, thing)
SUBTYPE(instant, temporalEntity)
SUBTYPE(interval, temporalEntity)

Of two distinct instants, one is before the other. The predicate after is the
inverse of before.

before(t1, t2), after(t1, t2), Equal(t1, t2)

An instant t and an interval T can be in several possible relations:

begins(t, T), inside(t, T), ends(t, T)

It may be that none of these is true.
There are six possible basic relations that can obtain between two intervals:

before(T1, T2), meets(T1, T2), overlaps(T1, T2), begins(T1, T2),
contains(T1, T2), ends(T1, T2)


These form the basis of Allen’s interval algebra. They can be defined in
terms of begins, inside, and ends relations between instants and intervals.
There are two possible relations between events and times: Some events
happen instantaneously. In this case,

atTime(e, t)

where t is an instant. Some events happen across intervals, with a duration.
In this case

during(e, T)

where T is an interval. The predication

timeSpan(T, e)

says that T is the entire temporal entity or sequence of temporal entities
during which e occurs.
The OWL-Time ontology [Hobbs and Pan, 2004] provides a rich
elaboration of these concepts and includes treatments of measures of duration,
clock and calendar terms, and temporal aggregates. In some cases, this richer
theory of time is required.
Many actions are rhythmic, and recognizing this is an important part of
recognizing the higher-level event. Tapping one’s foot to music is like this, as
is marching in unison. A rhythmic event can be characterized as an iterative
event in which the iterations occupy time intervals of equal duration. Rhythm
is often used to coordinate multithread iterative action.


Three Other Useful Predicates
The predicate change says that there
is a change from one thing of type ev to another thing of type ev.

change(e1, e2)

For example, as seen above, a change of location can be represented in this
way.

change(located-at(x, y, t1), located-at(x, z, t2))

The predicates cause and enable can also be used to link predicates together.

cause(e1, e2)
enable(e1, e2)

The second argument of cause and both arguments of enable are of type
ev. The first argument of cause is of type thing. Events can cause other
events. But there are also some entities that can function as a cause of an
event. For example, it is most convenient to view a dog as the cause of its
own movements.
Of course, cause and enable are not directly observable, but they are
often involved in what is to be inferred from observables.
Examples: To say that B is located at C:

located-at(B, C)

To say that B moves (changes location) from C to D:

change(located-at(B, C), located-at(B, D))

To say that A causes B to move (change locations) from C to D:

cause(A, change(located-at(B, C), located-at(B, D)))


Assume the predicates rise and fall have been defined. To say that A causes
B to rise and C to fall, where the rising starts before and overlaps with the
falling:

cause(A, AND(e1: rise(B), e2: fall(C), overlaps(e1, e2)))

This describes the situation where A causes not only B’s rising and C’s
falling but also the overlapping of the two events. If the overlapping of the
two events is accidental and not specifically caused by A, one would write

AND(cause(A, AND(e1: rise(B), e2: fall(C))), overlaps(e1, e2))

The Semantics of VERL

Types, Terms, and Expressions
Rather than dealing immediately with VERL expressions (vexprs), we start
with a more conventional treatment of terms and expressions.
The type system is as before. Everything is a thing. There are two kinds
of things – entities (ent) and events, states, and conditions (ev). As axioms:

(∀x)[thing(x) ↔ [ent(x) ∨ ev(x)]]
(∀x)[ent(x) ↔ ∼ev(x)]

That is, x is a thing if and only if it is an ent or an ev, and ents and
evs are mutually exclusive. Users can define subtypes of ent and ev, but
they should be mutually exclusive. Often it is safer to define something as a
property rather than as a type. A term can be a constant, a variable, or a
function symbol applied to the appropriate number of terms as arguments.

term→constant|variable|fct([term{,term}∗])

The constants can be of any one of the three types, and variables can
range over any one of the three types. The function determines the type of
the term that results from function application. Terms denote things in the
world being described in VERL. An expression can be a predicate applied to
the appropriate number of terms as arguments.

expr→pred([term{,term}∗])

Such an expression is true if the predicate is true for the things denoted
by the terms. A predicate applied to some set of arguments will be a
“predication.” In addition, an expression can be a logical operator applied to
the appropriate number of expressions as operands.

expr→AND([expr{,expr}∗])|
OR([expr{,expr}∗])|NOT(expr)|
IMPLY(expr,expr)|
EQUIV(expr,expr)

AND can be applied to an arbitrary number of expressions and is true if
all the expressions are true. OR can be applied to any number of expressions
and is true if any one of the expressions is true. NOT is applied to one
expression and is true if the expression is false. IMPLY is applied to two
expressions and is true if the first expression is false or the second expression
is true. EQUIV is applied to two expressions and is true if they are both
true or both false.
When annotating video data, saying that an entity or event is at
a particular location in the data – at−video−loc(e,l) – asserts a
predication about that entity or event and a location in the data. A set of
annotations for a video source is a conjunction of such predications. [Bolles
and Nevatia, 2004]
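The truth conditions of the logical operators described above can be illustrated with a small evaluator. This is only a sketch and not part of VERL: the nested-tuple encoding of expressions, the function name `evaluate`, and the dictionary of atomic predications are assumptions made for illustration.

```python
# Illustrative sketch (not VERL itself): an expression is either a string
# naming an atomic predication, or a tuple (operator, operand, ...).
def evaluate(expr, facts):
    """Evaluate an expression against a truth assignment for predications."""
    if isinstance(expr, str):
        return facts[expr]
    op, *args = expr
    vals = [evaluate(a, facts) for a in args]
    if op == "AND":      # true if all operands are true
        return all(vals)
    if op == "OR":       # true if any operand is true
        return any(vals)
    if op == "NOT":      # true if the single operand is false
        return not vals[0]
    if op == "IMPLY":    # true if the first is false or the second is true
        return (not vals[0]) or vals[1]
    if op == "EQUIV":    # true if both are true or both are false
        return vals[0] == vals[1]
    raise ValueError(f"unknown operator: {op}")
```

A set of annotations, being a conjunction of predications, would in this encoding simply be one `("AND", ...)` tuple over all annotated predications.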

3.3 The VCA Algorithm

It is not in focus of this thesis to develop an image processing algorithm
of its own. The VCA input is rather provided by the system presented by
[Müller-Schneiders et al., 2005]. The detection and description of moving
objects is based on an object-oriented, statistical multi-feature analysis of video
sequences. This analysis is self-adapting to an observed scene ([Meyer et al.,
1996]). The system is able to identify split and merge behaviors, where a single
object splits into two or more objects and two or more objects merge into
one object. Additionally, objects that are or become idle are detected: when
the motion vector of the object becomes too small, the object is predicated
to be idle. The system can also “discriminate between removed versus
abandoned objects. It does this by analyzing the change in the amount of edge
energy associated with the boundaries of the foreground region [...]. Barring
extremely cluttered environments, if there are significantly more edges then
an object has been added. Conversely, less associated edge energy suggests
that an object has been removed” ([Connell et al., 2004]).
Thus, the system provides the following outputs: 1) object regions, 2)
the objects' tracks, 3) an “Idle” flag for non-moving objects, and 4) distinction
between abandoned/removed objects. Those are - through an MPEG-7
document - the inputs for the system presented in this thesis.

3.4 Retrieval quality measures

Precision and recall are the basic measures used in evaluating search
strategies. These measures assume:

1. There is a set of records in the database which is relevant to the search
topic.

2. Records are assumed to be either relevant or irrelevant (these measures
do not allow for degrees of relevancy).

3. The actual retrieval set may not perfectly match the set of relevant
records.

3.4.1 Recall
RECALL is the ratio of the number of relevant records retrieved to the total
number of relevant records in the database. It is usually expressed as a
percentage.
Recall = (Number Retrieved and Relevant) / (Number Possible Relevant).

3.4.2 Precision
PRECISION is the ratio of the number of relevant records retrieved to the
total number of irrelevant and relevant records retrieved. It is usually
expressed as a percentage.
Precision = (Number Retrieved and Relevant) / (Number Total Retrieved).
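The two ratios above can be computed directly from the set of retrieved records and the set of relevant records. The following is a minimal sketch; the function name `precision_recall` and the convention of returning 0.0 for an empty set are assumptions made here, not taken from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions in [0, 1].

    retrieved: identifiers of the records returned by the search
    relevant:  identifiers of the records in the Ground Truth
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # retrieved AND relevant
    # Assumed convention: an empty denominator yields 0.0.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Multiplying either value by 100 gives the percentage form mentioned in the text.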

3.5 The “Ground Truth”

There is no deterministic methodology for understanding what is relevant to
a user's search. Thus, for the evaluation of a retrieval system, the relevant
documents have to be selected by hand. In the case of an event detection
system, the Ground Truth consists of

• the type of event,
• the time window the event occurred in, and
• the ID the VCA system assigned to the object(s) performing the
events.

Chapter 4

A Semantic Concept for the Mapping of low-level Analysis
Data to high-level Scene Descriptions

Morpheme
From Wikipedia, the free encyclopedia.

In Linguistics, a morpheme is the smallest meaningful unit in a
given language. This is the definition established in 1933 by the
American linguist Leonard Bloomfield [1].

4.1 Introduction

This chapter presents a concept for event modeling and detection. The
concept has been introduced by [Neuhaus, 2005]. The development and
application of this approach as well as a proof of concept are now presented in this
and the following chapters.
The remainder of this chapter is organized as follows. In section 4.2
terms used (and introduced) are listed and explained. In section 4.3 and
its subsections the constituting elements of the concept, Event Morphemes,
along with the underlying concepts and ontology are presented. In section 4.6
an example of an Event Morpheme-based description of a scene is given.
Section 4.7 shows the implementation of the Event Morphemes that is used
to prove the concept in the subsequent chapters.

[1] English Example: The word “unbelievable” has three morphemes: “un-”, a
(negatory) bound morpheme, “-believe-”, a free morpheme, and “-able”. “un-” is
also a prefix, “-able” is a suffix. Both are affixes.

4.2 Terminology
At first there are some terms that need clear definition before usage. This is
mainly due to the blending of content analysis matters with high-level
concepts. Most of the content analysis' terms have their corresponding
counterpart on the semantics' side but it is not a 1-to-1 match.

Low-level and high-level: Low-level describes features that can be
extracted automatically by means of signal processing. The extraction and
segmentation of moving objects from an image sequence's pixels would be
low-level. High-level describes semantic meanings and concepts a user employs
when (s)he describes a scene. In terms of automated content analysis and
indexing, “high-level” describes the process of drawing conclusions from
further analysis of the low-level features. Thus, (post-hoc) analysis of low-level
features with semantic output will here be called “high-level”.

Features and attributes: The terms features and attributes, as used here,
are to describe the characteristics of the low-level data and the high-level
semantic concepts, respectively. Feature is commonly used in the field of
content analysis anyway. For the semantic concept presented here,
attribute describes the elements of the semantic representation. This can be
the number of objects, direction of movement, or idle time of an object.
Which these are and how they are used will be shown in section 4.4.2.

Moving region and object: The moving region is the output of level 0
(see section 1.2), the content analysis system. It is an arbitrarily shaped
cluster of pixels that - in theory - represents a moving (or recently moving; i.e.
idle) object. Reality, however, shows that current content analysis is not
always there (yet). Thus, an object in the “real-world” has to be distinguished
from a moving region. This is not only for reasons of semantics, but also
because one moving region may represent more than one moving object. A
simple example for this case would be a person carrying a suitcase, as
depicted in figure 4.1.

(a) real-world example (b) schematic representation

Figure 4.1: A moving region representing two objects.

On the other hand, an object may be represented by (or be part of) more
than one moving region. When a person e.g. passes behind a truck and
re-appears on the other side, some content analysis systems will assign a new
ID although it is the same object. The schematic representation of this case
is depicted in figure 4.2.

Figure 4.2: An object consisting of two moving regions (schematic
representation).

Another conceivable scenario would be the handover of a suitcase: At
first, the object ‘suitcase’ is represented by the same moving region as the
first person carrying it (see figure 4.1). Next, the suitcase is standing on the
floor, being represented by its own moving region, as depicted in figure 4.3.

When the suitcase is carried away by a second person, the corresponding
moving region will contain the suitcase and its porter. This is analogous to
figure 4.1. Figure 4.4 shows the resulting schematic representation of the
whole suitcase handover scenario. Temporal (or spatial) relations are not
represented; only the associations of the objects to the moving regions
representing them are depicted.

(a) real-world example (b) schematic representation

Figure 4.3: Two objects with one corresponding moving region each.

Figure 4.4: Suitcase handover representation.

Event: Generally, event represents what is actually happening at a specific
point in time. In this thesis, “event” represents only an elementary part of a
sequence of actions and is comparable to “primitive events” [Ghanem et al.,
2004] in literature.

Event Morpheme: Event Morphemes appear to have no correspondence
in literature. Basically, they represent an object's “point of view” of an event
and thus link the objects to the corresponding events. Their relation is
always a 1-to-1 relation, as depicted in figure 4.5.

(a) One object - one Event Morpheme - one event
(b) Two objects - two Event Morphemes - one event

Figure 4.5: The Event Morpheme's 1-to-1 relation.

The detailed concept and application of Event Morphemes follows in
section 4.3.

Meta Knowledge: The term Meta Knowledge as used in this thesis
incorporates what is commonly known as “a priori knowledge” together with
general event morphology and context-specific user restrictions. The a priori
knowledge could be a map of the scene (see section 4.3.1) etc. while the
general event morphology describes context independent knowledge about how
events piece together (see section 4.3.1) and the user restrictions (see section
4.3.1) describe e.g. location dependent allowed durations of stay.

Semantic Module: The Semantic Module is the output of the ontology-
based inference mechanism. It is comparable to related works' “composite
event” [Ghanem et al., 2004] or “process” [Nevatia et al., 2004]. In most cases,
it exploits the Meta Knowledge to reason about the events by incorporating
specific inference rules as described in section 4.3.2. In simple cases a
Semantic Module solely uses succession/combinations of events. The
Semantic Module is what the database may actually be queried for. A
Semantic Module may consist of one event, as depicted in figure 4.6
a), or it may incorporate two or more events (figure 4.6 b)), depending on
the complexity of the process.

(a) A Semantic Module consisting of one event.
(b) A Semantic Module consisting of two events.

Figure 4.6: From event to Semantic Module

The “amount” of Meta Knowledge used in the inference process cannot be
measured in number of pieces. Therefore, in figure 4.7 the Meta Knowledge
is represented by a bar across the Semantic Modules. A Semantic Module
may represent level 1 or level 2.

Scene: A Scene as used in this work consists of Semantic Modules
representing events in their context specific instance. One may think of it as a
“chain of events” (not to be confused with the Event Morpheme Path as
introduced in section 4.3.6) or of what is commonly known as “scenario”.
Figure 4.7 depicts the connection between moving regions, objects, Event
Morphemes, events, Meta Knowledge and Semantic Module and the scene.
A description of a scene is level 2.

Figure 4.7: Schematic representation of the context between VCA's moving
regions to higher-level scene description.

4.3 Event Morphemes
The idea behind Event Morphemes is the decomposition of a Scene into
semantically meaningful parts (small, but not atomic), each with a “main
object's” perspective. Such a part would not consist of more than the main
object, its attributes, and the object(s) the main object interacts with (if
any). The attributes are the representations of the features extracted by
the employed content analysis algorithm. An Event Morpheme represents
a time window in which these attributes are analyzed and interpreted (see
section 4.4.2). This inference process creates a semantic label describing the
represented time window.

The semantic concept “Event Morphemes” thus represents a
semantic intermediate layer. It maps (at an early stage) the low-level
meta data to a semantic label.

An Event Morpheme may have a predecessor, a successor, and one or
more companions. An example for a companion is a “hold while walk” which
represents “carry” (see section 4.3.2). The roles of the successor and
predecessor are explained in section 4.4.

4.3.1 Meta Knowledge
The Meta Knowledge represents auxiliary knowledge necessary for a more
distinctive event recognition. It includes not only the knowledge about what an
event is composed of (event morphology) and the layout of the scene (map
of the scene etc.), but also user restrictions for events which depend on the
user's interpretation of e.g. “loitering”, i.e. after what time does “stay” turn
into “loiter”. The roles of event morphology, the map of the scene and user
restrictions are pointed out in the following sections.

Event morphology
Event morphology describes the common knowledge about what an event is
composed of in a given domain. Actually, these are the underlying rules and
definitions for the inference mechanism described in section 4.3.2. Such a
rule would e.g. be
“walk” AND “hold” EQUALS “carry”.
Another example would be
“stay” IN FRONT OF “emergency exit” EQUALS “obstructing”.
“Loitering” is an example for which more input is needed. The definition
of this event could be
“stay” FOR “specific time” IN “specific area” EQUALS “loitering”.
How long “a specific time” is and where “a specific area” is, is not part of
the event morphology. This information comes into play as user restriction
(section 4.3.1), when the operator refines the query.

A map of the scene
A map of the scene is indispensable for location-dependent events. If e.g. a
car is being parked in its designated parking space, it is an allowed event. A
car being parked in the middle of a fire lane and thus obstructing it would be
unwanted. Another example of incorporating the map of the scene into the
event reasoning would be knowing that where object “X” is moving, there
is a river. From this, the system can conclude that the moving object is a
swimming (or floating) object.
Similar to a map of the scene is knowledge about the camera's perspective
and parameters, the average object size and actually everything commonly
described as “a priori knowledge”. In this thesis, it is the user-independent
knowledge concerning the location and setup of the scene. The scene map
and the semantic description of the locations therein are the requirement for
the step from level 1 to level 2.

User restrictions
User restrictions include all those parameters for the reasoning process that
depend on the specific scene. “Loitering”, as introduced in section 4.3.1, is
an example. The time a person is allowed to stay at a certain place depends
on the place and on the situation. The “specific time” may be shorter in
front of an ATM than it would be on the lawn outside the facility's fence.
The user restrictions will have to be defined at query time (in a retrieval
application) or have to be parametrized when setting up a real time alert
system.
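How a location-dependent user restriction could turn “stay” into “loiter” may be sketched as follows. The area names, the limits in seconds, and the function name `label_stay` are hypothetical; the thesis does not specify concrete values or an interface.

```python
# Hypothetical user restrictions: maximum allowed stay per area, in seconds.
# These values are illustrative only and would be set at query time.
MAX_STAY_SECONDS = {"atm": 60, "lawn": 900}

def label_stay(area, duration_seconds, limits=MAX_STAY_SECONDS):
    """Return 'loiter' once a 'stay' exceeds the limit defined for the area.

    Areas without a user restriction never produce 'loiter'.
    """
    limit = limits.get(area)
    if limit is not None and duration_seconds > limit:
        return "loiter"
    return "stay"
```

Because the limits are passed in as data, the same event morphology rule can be reused with different restrictions for a retrieval query and for a real time alert configuration.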

4.3.2 Semantic Module
The Semantic Module is the reasoning mechanism, where higher level scene
descriptions are generated. Picture a scene that – when described just using
events – would read like this:
A person is moving.
The person is holding a suitcase.
In the video event representation language, VERL (introduced in [Nevatia
et al., 2004]), complex events are composed from simple events. In this
thesis complex events correspond to Semantic Modules. When using the rule
in equation 4.1

PROCESS(carry(x,y,a,b,t),
AND(hold(x,y,t),move(x,a,b,t)))          (4.1)
the above description of a scene yields
A person is carrying a suitcase,
which is a much more compact and intuitive description. So, the Semantic
Module condenses events to statements the user would actually look for. The
basis of the Semantic Modules are inference rules as in equation 4.1.
The more extensive the set of Event Morpheme Classes (see section 4.3.3) and
the richer the inference rules are, the more and the finer scenes can be
indexed. These inference rules' inputs, of course, have to match the semantic
label of the Event Morpheme. The output should fit common ontologies and
thus be applicable for other conceivable semantics-driven applications. This
way it would exploit the concept of incorporation of different ontologies.
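The effect of the rule in equation 4.1 can be illustrated with a small sketch that joins hold and move events of the same object. The tuple layout and the function name `infer_carry` are assumptions; in particular, t is treated here as an identifier of a shared time window, which simplifies the temporal reasoning of VERL.

```python
def infer_carry(holds, moves):
    """Combine hold(x, y, t) and move(x, a, b, t) into carry(x, y, a, b, t).

    holds: iterable of (x, y, t) tuples
    moves: iterable of (x, a, b, t) tuples
    Assumption: two events coincide when they share the same value of t.
    """
    moves = list(moves)  # allow several passes over the move events
    return [
        ("carry", x, y, a, b, t)
        for (x, y, t) in holds
        for (mx, a, b, mt) in moves
        if mx == x and mt == t
    ]
```

For the two-sentence scene above, one hold event and one move event of the same person in the same time window would yield a single carry statement.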

4.3.3 Event Morpheme Taxonomy
The semantic labels of a meta data pattern represented by an Event
Morpheme are organized in hierarchical classes. “State” and “transition” are
the two types of classes. States are represented by Event Morphemes with
semantic labels such as “move” or “hold”, while transitions would be
“put-down” or “pick-up”. “Transition” is sub-dividable into “start” and “stop”
for those transitions that begin or end states, respectively. These properties
concern the principle of Semantic Interpolation (see section 4.3.4). An
exemplary implementation of such an approach as introduced in [Neuhaus, 2005],
in VERL notation, would be

SUBTYPE(state,ev)
SUBTYPE(transition,ev)

SUBTYPE(start,transition)
SUBTYPE(stop,transition)

SUBTYPE(pick−up,start)
SUBTYPE(put−down,stop)
SUBTYPE(move,state)
SUBTYPE(hold,state)

To benefit in full from the concept of the Event Morpheme Taxonomy, the
taxonomy has to be adapted to the VCA algorithm employed for the event
detection. The finer the Event Morpheme's predicate shall be, the less confident
the classification will be. The advantage of arranging the Event Morphemes'
semantic labels hierarchically is enabling a “maybe” statement. This way, it
is possible to fall back on a more general label when the underlying inference
mechanism produces a lower confidence for the more specific one.
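The fall-back to a more general label can be sketched with a parent map built from the SUBTYPE statements above. The confidence scores, the threshold of 0.5, and the function name `fall_back` are assumptions for illustration; the thesis does not define how confidence is computed.

```python
# Parent map derived from the SUBTYPE statements of the taxonomy above.
PARENT = {
    "pick-up": "start", "put-down": "stop",
    "start": "transition", "stop": "transition",
    "move": "state", "hold": "state",
    "state": "ev", "transition": "ev",
}

def fall_back(label, confidence, threshold=0.5, parent=PARENT):
    """Climb the taxonomy until a label is confident enough.

    confidence: dict label -> score in [0, 1]; missing labels count as 0.
    Returns the root label if no label reaches the (assumed) threshold.
    """
    while confidence.get(label, 0.0) < threshold and label in parent:
        label = parent[label]
    return label
```

With such a scheme, an uncertain “pick-up” would be indexed as the more general “transition” instead of being discarded.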
Another motivation for a taxonomical arrangement of Event Morphemes
are first experimental results. The “pick-up”-detector uses the “removed”
classification of the VCA algorithm (see section 3.3) as input. This is
combined with split and merge information: the moving regions the ‘removed’
object “split-off” from in tracking within a specified time interval from the
detection of the removal are those removing. If all these rules are met, the
Event Morpheme “pick-up−1” is generated for the removed object.
When looking at scenarios resulting in the detection of “pick-up,” it is to
be seen that not only actual events of e.g. a person picking up a suitcase
(fig 4.8 a) appear, but also a car pulling out of a parking space or a person
getting up after having sat for a while (figures 4.8 b and 4.8 c, respectively).

(a) pick-up (b) get−up (c) pull−out

Figure 4.8: Three phenotypes of take-out.

Thus, similar patterns in the VCA output have different semantics. It's
just something being removed from the scene, either by itself (car, person)
or another (the suitcase, by the person). This results in this initial approach
for the Event Morpheme Taxonomy:

SUBTYPE(move,state)
SUBTYPE(hold,state)
SUBTYPE(bring−in,transition)
SUBTYPE(take−out,transition)
SUBTYPE(put−down,bring−in)
SUBTYPE(pick−up,take−out)

“Pick-up” has been replaced by its superclass “take-out”, and “put-down”
by “bring-in”. Note that the classes “start” and “stop” as sub-classes
of “transition” are deleted as “pick-up” starts “hold”, but “pull-out” stops
“parking”. The transition's property of the implication of the beginning or
end of a state therefore cannot be modeled by means of taxonomy; it has to
be done via the definition of rules as introduced in section 4.3.4 and this way
the development of an ontology.

4.3.4 Semantic Interpolation
The principle of Semantic Interpolation describes the process of asserting
the presence of objects in the scene that the content analysis does not
explicitly detect at all times. An example would be a brought-in suitcase: the
content analysis detects the put-down of the suitcase, yet it doesn't detect
the suitcase while the person who brings it in is carrying it. Thus, it has to
be inserted by the inference module. The rules for this are as follows.

SUBTYPE(appear(ente,atTime(t)),transition)
SUBTYPE(hold−1(entepersonp),state)
PRIMITIVE(starts(transitione,states))
PRIMITIVE(stops(transitione,states))
RULE(begins(pick−up−1,hold−1))
RULE(ends(put−down−1,hold−1))
IMPLY(AND(hold−1(ente1,personp1),appear(p1,atTime(t1))),
appear(e1,atTime(t1)))

The suitcase appears together with the person carrying it. Practically,
this means that the moving region of the person before the put-down is
being “copied” and prepended to the suitcase's history. Another example for
the application of Semantic Interpolation is the detection of a pull−out:
the corresponding predecessor is parking. However, when the slot−in has
been detected also, this would be a case of persistence: “a state is true until
discovered otherwise” [Allen, 1983].
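The suitcase case of Semantic Interpolation can be sketched as follows: for every detected put-down, a hold−1 state is inferred for the object, reaching back to the appearance of the person who put it down. The tuple layout, the dictionary of appearance times, and the function name `interpolate_hold` are assumptions made for this sketch; it does not reproduce the VERL rule engine.

```python
def interpolate_hold(put_downs, appear_time):
    """Infer hold-1 states that the content analysis could not observe.

    put_downs:   list of (entity, person, t_put_down) tuples
    appear_time: dict person -> time the person first appeared
    Returns ('hold-1', entity, person, start, end) tuples, i.e. the entity
    is asserted to be present from the person's appearance onwards.
    """
    inferred = []
    for entity, person, t_down in put_downs:
        start = appear_time.get(person)
        # Only interpolate when the person is known and appeared in time.
        if start is not None and start <= t_down:
            inferred.append(("hold-1", entity, person, start, t_down))
    return inferred
```

The start time of the inferred state then also serves as the interpolated appearance time of the suitcase.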


4.3.5 Reasoning-out false positives
The principle of the ‘early’ step from the low-level representation to a
semantic label entails another advantage. It becomes possible to know which
results are false positives for the determined class of Event Morpheme. If
e.g. a presumed brought-in (bring-in−1) object existed before the inferred
put-down, it is quite obvious that this assertion is a false positive. It can be
caused either by segmentation failure or by a pattern in the VCA output
similar to that of the respective class of Event Morpheme. Another basis for
reasoning-out false positives is the object class: some actions are restricted
to certain classes.

4.3.6 Event Morpheme Path
All conceivable events and scenes can be modeled by Event Morphemes. To
achieve this, not only a rich vocabulary of Event Morphemes and a sound
reasoning concept (the Semantic Module, see section 4.3.2) are necessary, a
representation of the succession of the Event Morphemes is of need, too.
This representation is the Event Morpheme Path.
The Event Morpheme Path is not to be confused with “chain of events” as
this would be described more appropriately as “Semantic Module Path”. It is
the succession of Event Morphemes. As well as Event Morphemes themselves,
there can be parallel Event Morpheme Paths, representing multi-threaded
Semantic Modules. When using an indexing system based on Event Morphemes
and combining Event Morphemes to Event Morpheme Paths, it becomes
possible to query such an index and search for events (≙ Semantic Modules) that
haven't been thought of at indexing time.

4.4 Application of Event Morphemes
This section shows how Event Morphemes can be applied to model a scene
and thus create a description of it. Special attention is given to
implementational aspects, such as the separate central storage of a list of all objects or
how low-level features are mapped to a semantic label.

4.4.1 Separate list of objects
An Event Morpheme holds as one of its attributes the main object and the
object(s) the main object interacts with (if any). To avoid redundancy, all
object related data is stored in one separate Object List. This Object List
may (as in section 4.7) match with the VCA results. Links to the objects are
then stored in the Event Morphemes. The object data consists of the
low-level features such as the corresponding moving region, color and textural
features. When these features are chosen in a way that object recognition
is possible upon them (at least to some extent), the list will enable
multi-camera searches. This, of course, demands additional matching of the object
features when analyzing track data and by this the merging of corresponding
object tracks. The object classes are available in the Object List, too. They
are, on the one hand, another important piece for a level 2 description but
also necessary for the process of reasoning out false positives, as described in
section 4.3.5.

4.4.2 Event Morpheme Template
The Event Morpheme Template is, practically speaking, a matrix where the
entries are the output features of level 0. With these inputs as attributes,
the Event Morpheme is created. During that process, an inference algorithm
produces the semantic label. Table 4.1 shows the principle of the Event
Morpheme Template.

[Table 4.1. Attributes (columns): object split, object merge, covered distance,
change in orientation, deformation, duration of time frame, sensitive area.
Semantic labels (rows): stay (≈ 0), move (> 0), tip-over, put-down, meet, pick-up.]

Table 4.1: The attributes of the instantiated Event Morpheme and the
corresponding semantic labels
Event Morphemes are instantiated when visual changes in a combination
corresponding to table 4.1 take place. The duration of the Event Morpheme
is determined by the next change “undoing” the instantiating one. So, at
first a temporal segmentation is carried out. For the resulting time window,
the attributes of the Event Morpheme as shown in table 4.1 are analyzed.
Based on these, a first classification and - if necessary - a refinement of the
temporal segmentation is performed. “Classification” stands for attempting
to determine what class of action (see section 4.3.3) is taking place and
which objects are involved. This way, an Event Morpheme for each object
participating in an event is created.
In the second step, the succession of the Event Morphemes (the Event
Morpheme Path) is analyzed. This implies looking at the predecessor(s)
and successor(s) of each Event Morpheme to refine its classification or even
instantiate additional Event Morphemes. If, e.g. an Event Morpheme
“pick-up” is detected in the succession, a new Event Morpheme “hold” will be
created (Semantic Interpolation).

4.4.3 Event Morphemes' sphere of responsibility for
reasoning
Several points have to be borne in mind when considering at which stage
which reasoning should take place. One has to take into consideration what
features and attributes affect the decision, e.g., about the specialization of
the class of the Event Morphemes. Rules to reason out false positives in
the VCA algorithm results can be handled in the Event Morpheme detection
stage. Those assertions derived from “reliable” VCA output such as object
class are inferred in the Semantic Module. The advantage of the latter would
be that growing domain knowledge can be incorporated by simply adapting
the underlying ontology and not having to touch the program code of the
Event Morpheme detectors (and re-compile them). This, again, underlines
the generic nature of the Event Morphemes as an indexing structure.

4.5 Where is the index?
The entire concept presented so far is a concept for indexing and retrieval.
So, as a last step, the separation between the indexing and retrieval part
has to be done. In other words: where is the index? As the Retrieval will
need all the information held in the Event Morphemes, the index should map
exactly this information. Thus, the dividing line goes “through” the Event
Morphemes (see figure 4.9). When building an index like this, the generic
nature of the Event Morpheme approach can be exploited in total by the
Smart Retrieval without losing access to low-level features.

Figure 4.9: New Indexing Approach: The Event Morphemes are the index.

4.6 An example
Figure 4.10 shows the key frames of a scene that could be described as

A suitcase is passed from one person to another.

A person enters with a suitcase. He puts down the suitcase and walks
over to the chair. After the person sat for a while, a second person enters.
He picks up the suitcase and leaves.

The three corresponding Event Morpheme Paths would read as follows.

For the first person: “walk” WHILE “hold” - “put-down” - “walk” -
“sit-down” - “sit”
For the second person: “walk” - “pick-up” - “walk” WHILE “hold”
Finally, the suitcase: “being held” - “being put-down” - “stand” -
“being picked-up” - “being held”

Another interpretation of the scene could certainly be

A left-behind suitcase is stolen

Figure 4.10: The suitcase handover scene.

In cases like this, the advantage of breaking the events into small parts
emerges. Either of the two interpretations of the scene can be modeled, and
it is simple to query a database. Whether this sequence is showing a
handover or a theft may just turn out when the first person reports a burglary.

Other and more complex scenes can be modeled as well. It could even be
conceivable to model scenes spreading over multiple cameras. This, of course,
demands a sound list of object features for reliable object recognition.
If this was the case, a scenario like

Person shopping at a supermarket

would be modeled as (only person's point of view):

“exiting car”
“walk” (to shopping baskets)
“pick-up” (shopping basket)
“walk” WHILE “hold” (to shelf)
“grab” (goods)
“hold” (goods)
“put into” (goods into basket)

“walk” WHILE “hold” (into cashier area)
“walk” WHILE “hold” (into parking lot)
“put into” (goods in car)
“put down” (shopping basket)
“enter car”

4.7 Implementing Event Morphemes
4.7.1 An overall system
In order to create a system capable of detecting events and combining those
recognized events to compile a semantic scene description, we need
1. a low-level analysis (VCA, Video Content Analysis) algorithm,
2. an ontology to lay down a common description language,
3. an inference mechanism to map the VCA-meta data to a semantic
label,
4. a tool/mechanism for creating the map of the scene,
5. a retrieval interface.
These items constitute the system as follows. The VCA algorithm (item 1)
analyzes the image sequence and outputs the objects' shapes, tracks, and
features like color, texture, etc. Thus, it provides the level 0 meta data. The
utilized VCA algorithm is described in section 3.3. The inference mechanism
(item 3) is the crux of the system: anything that isn't processed soundly
here will affect the consistency of the following stages (as will inadequacy
of the VCA). This part of the system will have to be adapted to the
preceding VCA algorithm(s) employed. It outputs level 1 meta data. The ontology
for the common description language (2) is required on the one hand for the
reasoning logics and on the other hand for the output to the user and the
mapping of the user's query, respectively. Furthermore, the reasoning logic
exploiting the underlying ontology enables (together with item 4) the step to
level 2. Both the ontology and the reasoning should be independent from the
VCA algorithms used as this adaptation is already performed when stepping
from level 0 to level 1. The retrieval interface (5) finally should be flexible
enough to enable the user a variety of queries and map the query to the
index.

4.7.2 The Event Morpheme Ontology
Event Morphemes hold assessable patterns of the pixel-level or transformed
attributes of the semantic (event) concepts they represent. These attributes
play an important role in the reasoning process as they enable the
specialization of semantic concepts. Therefore, the underlying ontology has to provide
concepts and mechanisms to model, define, and interpret those values
correctly. The following ontology in VERL notation shows a basic vocabulary
of Event Morphemes.

SUBTYPE(person,ent)
SUBTYPE(object,ent)
SUBTYPE(state,ev)
SUBTYPE(move,state)
SUBTYPE(hold,state)
SUBTYPE(event,ev)
SUBTYPE(bring−in,event)
SUBTYPE(take−out,event)
SUBTYPE(put−down,bring−in)
SUBTYPE(pick−up,take−out)
PRIMITIVE(starts(evente,states))
PRIMITIVE(stops(evente,states))

4.7.3 The Event Morpheme detector modules
As stated in section 4.7.1, the Event Morpheme detection modules have to be
adapted to the employed VCA algorithm. So, to present a proof of concept,
the mapping of the VCA algorithm described in section 3.3 is presented here.
Note that the following algorithms only fit this particular software. Detector
modules for detection of

1. take-out
2. bring-in
3. fall
4. non-straight (movement)

were implemented. Figure 4.11 shows schematically how the mapping is
done. A detailed description follows in the subsequent sections.

Figure 4.11: Mapping VCA output to Event Morphemes.

The modules have been implemented using MATLAB(TM). The Event
Morphemes were detected in a post-hoc analysis step, after the VCA processed
the sequences.

take-out
For the detection of the Event Morpheme take-out two pieces of information
are needed: the removed-classification and the split lists of all objects. When an
object is classified by the VCA algorithm as “removed” and appears in the split
list of another object a specific time before the detection of the “removal”,
it is predicated to be taken out. Thus, the Event Morphemes take-out and
take-out−1 are generated for the object in whose split list the taken-out object
appears and for the taken-out object, respectively. Algorithm 4.1 shows the
algorithm in principle.
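The take-out rule described above can also be sketched in Python. The thesis modules were written in MATLAB; the data layout used here (a dictionary of removal times and a dictionary of split lists), the parameter `max_delay`, and the function name `detect_take_out` are assumptions made for illustration.

```python
def detect_take_out(removed, splits, max_delay):
    """Pair each 'removed' object with the object it split off from.

    removed:   dict object_id -> time the VCA classified it as 'removed'
    splits:    dict parent_id -> list of (child_id, split_time)
    max_delay: assumed upper bound between split and removal detection
    Returns (taken_out_id, taker_id) pairs, i.e. the objects receiving the
    Event Morphemes take-out-1 and take-out, respectively.
    """
    pairs = []
    for obj, t_removed in removed.items():
        for parent, children in splits.items():
            for child, t_split in children:
                # The split must precede the removal within the time window.
                if child == obj and t_split <= t_removed <= t_split + max_delay:
                    pairs.append((obj, parent))
    return pairs
```

A split that lies too far in the past is ignored, which corresponds to the “specific time” condition in the text.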

removedObjects ← getObjectsWithEvent('Removed');
for i ← 1 to numberOfRemovedObjects do
    occurrenceOfRemoval(i) ← getEventTime(removedObjects(i), 'Removed');
    objectsContainingRemovedObjectInSplitList(i) ←
        findObjectInSplitLists(removedObjects(i));
end
for i ← 1 to numberOfRemovedObjects do
    for j ← 1 to numberOfObjectsContainingRemovedObjectInSplitList do
        occurrenceOfSplit ← getSplitTime(j);
        if (occurrenceOfRemoval(i) >= occurrenceOfSplit) &&
           (occurrenceOfRemoval(i) <= occurrenceOfSplit + …) then
            Object i has been taken out by object j
        end
    end
end
Algorithm 4.1: Detection module for take-out

bring-in
The detection of the Event Morpheme bring-in is quite similar to the
detection of take-out. Instead of the “removed”-classification, the
“idle”-classification is used. When an object is classified by the VCA algorithm as
“idle” and appears in the split list of another object a specific time before,
it is predicated to be brought. Thus, the Event Morphemes bring-in and
bring-in−1 are generated for the object in whose split list the brought-in
object appears and for the brought-in object itself, respectively. Algorithm 4.2
shows the algorithm in principle.

idleObjects ← getObjectsWithEvent('Idle');
for i ← 1 to numberOfIdleObjects do
    occurrenceOfIdle(i) ← getEventTime(idleObjects(i), 'Idle');
    objectsContainingIdleObjectInSplitList(i) ←
        findObjectInSplitLists(idleObjects(i));
end
for i ← 1 to numberOfIdleObjects do
    for j ← 1 to numberOfObjectsContainingIdleObjectInSplitList do
        occurrenceOfSplit ← getSplitTime(j);
        if (occurrenceOfIdle(i) >= occurrenceOfSplit) then
            Object i has been brought in by object j
        end
    end
end
Algorithm 4.2: Detection module for bring-in

As the “idle”-classification also applies to objects that recently moved
and then stay still for a specific time, the results of algorithm 4.2 have to be
post-processed semantically: false positives are to be reasoned-out. This is
done by eliminating those objects presumed to be brought-in that existed in
the track lists of the VCA output or whose object trajectory gives evidence of
movement. Thus, algorithm 4.2 has to be extended as shown in algorithm 4.3.

idleObjects ← getObjectsWithEvent('Idle');
for i ← 1 to numberOfIdleObjects do
    occurrenceOfIdle(i) ← getEventTime(idleObjects(i), 'Idle');
    objectsContainingIdleObjectInSplitList(i) ←
        findObjectInSplitLists(idleObjects(i));
end
for i ← 1 to numberOfIdleObjects do
    for j ← 1 to numberOfObjectsContainingIdleObjectInSplitList do
        occurrenceOfSplit ← getSplitTime(j);
        if (occurrenceOfIdle(i) >= occurrenceOfSplit) then
            if (getFirstOccurrence(idleObjects(i)) == occurrenceOfIdle(i)) &&
               (getTrackLength(idleObjects(i)) < 1) then
                Object i has been brought in by object j
            end
        end
    end
end
Algorithm 4.3: Extended detection module for bring-in

The resulting improvement is shown in section 5.3. The same restraints
are put on the results of the detection of take-out. Thus, algorithm 4.1 is
adapted accordingly.
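The extended bring-in detection, including the reasoning-out of false positives, may be sketched in Python as follows. As before, the dictionaries and the function name `detect_bring_in` are assumptions for illustration and do not reproduce the interface of the original MATLAB modules.

```python
def detect_bring_in(idle, splits, first_seen, track_length):
    """Detect brought-in objects and reject implausible candidates.

    idle:         dict object_id -> time it was classified as 'idle'
    splits:       dict parent_id -> list of (child_id, split_time)
    first_seen:   dict object_id -> time of the object's first occurrence
    track_length: dict object_id -> length of the object's own track
    Returns (brought_in_id, bringer_id) pairs.
    """
    pairs = []
    for obj, t_idle in idle.items():
        # False positive: the object existed before it became idle,
        # or its own trajectory gives evidence of movement.
        if first_seen[obj] != t_idle or track_length[obj] >= 1:
            continue
        for parent, children in splits.items():
            for child, t_split in children:
                if child == obj and t_idle >= t_split:
                    pairs.append((obj, parent))
    return pairs
```

An object that walked into the scene on its own and then stood still is thereby excluded, although the VCA flags it as idle.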

lfalThedetectionoffallanalyzesthechangeoftheaspectratiooftheobject’s
boundingbox.Ifthischangeislargerthanaspecifiedthresholdandthe
objectisaperson,theobject(person)ispredicatedtohavefallen(seealgo-
rithm).4.4

setTimeWindow(Δt);
foreach timeWindow Δt do
    bboxAspectRatio ← calculateAspectRatio(BoundingBox)
end
if (bboxAspectRatio >= threshold) && (objectClass == ’Person’) then
    Person fell
end
Algorithm 4.4: Detection module for fall
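A minimal Python sketch of the fall detector is given below. It assumes that the aspect ratio is width divided by height and that the change is measured as the difference between the last and the first bounding box of a time window. The function names and the threshold value are hypothetical and are not taken from the thesis software.

```python
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height) of a bounding box in pixels


def aspect_ratio(box: Box) -> float:
    width, height = box
    return width / height


def person_fell(boxes: List[Box], object_class: str,
                threshold: float = 1.0) -> bool:
    """Flag a fall if the aspect ratio grows by at least `threshold`
    within one time window and the object is classified as a person."""
    if object_class != "Person" or len(boxes) < 2:
        return False
    change = aspect_ratio(boxes[-1]) - aspect_ratio(boxes[0])
    return change >= threshold


upright_then_lying = [(40.0, 120.0), (60.0, 100.0), (120.0, 40.0)]
print(person_fell(upright_then_lying, "Person"))  # True
print(person_fell(upright_then_lying, "Car"))     # False
```

The class check keeps objects such as cars or bags, whose bounding boxes may change shape for unrelated reasons, from triggering the event.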


non-straight
The Event Morpheme non-straight is detected by analyzing the object’s trajectory. First, the trajectory is normalized to a specific number of vertices within the analysis (time) window. Then, a pseudo-differentiation of the trajectory is calculated. Exploiting the fact that the second derivative of a straight line vanishes, a threshold is applied to the outcome. If it is exceeded, the object is predicated to move in a non-straight way.

1  setTimeWindow(Δt);
2  foreach timeWindow Δt_i do
3      foreach Trajectory of Δt_i do
4          normedTrajectory ← normTrajectory(TrajectoryOfΔt_i, numberOfVertices);
5          pseudoStraightness ← diff(normedTrajectory, 2) / numberOfVertices;
6          if pseudoStraightness > threshold then
7              Object is not moving straight
8          end
9      end
10 end
Algorithm 4.5: Detection module for non-straight

normTrajectory() in line 4 is taken from [ISO/MPEG, 2002] (see eq. 4.2), without correspondence determination.

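1. Set $V_i = c(i)$ for $0 \le i \le N_C - 1$ and $N_P = N_C$.
2. If $N_P = N$, terminate.
3. Find $i$ ($= i_{min}$) which minimizes $T_i$.
4. Set $V_{(i_{min}-1) \bmod N_P} = V_{(i_{min}-1) \bmod N_P} + \Delta(i)$ and $V_{(i_{min}+1) \bmod N_P} = V_{(i_{min}+1) \bmod N_P} + \Delta(i)$.
5. Set $V_{i_{min}} = V_{i_{min}+1}$, $V_{i_{min}+1} = V_{i_{min}+2}$, ..., $V_{N_C-2} = V_{N_C-1}$ and $N_P = N_P - 1$. Go to Step 2.
(4.2)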

diff(X,n) in line 5 of algorithm 4.5 is a MATLAB™ function that calculates differences between adjacent elements of X, applied recursively n times, resulting in the nth difference.
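The idea behind algorithm 4.5 can be illustrated with a short Python sketch. Several assumptions are made that are not taken from the thesis: the MPEG-7 vertex reduction behind normTrajectory() is replaced by a plain resampling at evenly spaced indices, the second difference is computed directly instead of calling MATLAB's diff(X, 2), and the vector of second differences is reduced to a scalar by summing its magnitudes. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample(points: List[Point], n: int) -> List[Point]:
    """Pick n points at evenly spaced indices (stand-in for normTrajectory)."""
    if len(points) < 2 or n < 2:
        raise ValueError("need at least two points and two vertices")
    step = (len(points) - 1) / (n - 1)
    return [points[round(k * step)] for k in range(n)]


def second_difference(points: List[Point]) -> List[Point]:
    """Second-order difference per coordinate, like diff(X, 2)."""
    return [(points[k + 2][0] - 2 * points[k + 1][0] + points[k][0],
             points[k + 2][1] - 2 * points[k + 1][1] + points[k][1])
            for k in range(len(points) - 2)]


def pseudo_straightness(points: List[Point], n_vertices: int = 8) -> float:
    normed = resample(points, n_vertices)
    total = sum(math.hypot(dx, dy) for dx, dy in second_difference(normed))
    return total / n_vertices


def is_non_straight(points: List[Point], threshold: float,
                    n_vertices: int = 8) -> bool:
    return pseudo_straightness(points, n_vertices) > threshold


straight = [(t, 2 * t) for t in range(15)]
zigzag = [(t, 10 if (t // 2) % 2 else 0) for t in range(15)]
print(pseudo_straightness(straight))           # 0.0
print(is_non_straight(zigzag, threshold=1.0))  # True
```

For the evenly sampled straight line all second differences vanish, so any positive threshold classifies it as straight, whereas the zigzag path yields a clearly non-zero value.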

Chapter 5

Results

5.1 Introduction

This chapter presents the results of an exemplary implementation of an Event Morpheme-based event detection system. It is shown how the detectors introduced in section 4.7.3 perform in different scenarios and their application in the recognition of complex scenarios. It is also demonstrated how detection results are improved by reasoning-out false positives automatically. The results that will be presented here as a proof of concept have been obtained in several steps. At first, a training set consisting of 304 sequences with various scenarios was processed. Next, the Event Morphemes detection has been developed on the output of the VCA algorithm processing these sequences. Finally, the detection has been performed on the test set containing the use case sequences. The remainder of this chapter is organized as follows. In section 5.2 the realization of the use cases identified in section 1.2.2 is presented. Section 5.3 shows how the technique of reasoning-out improves the precision of the results. The results of event detection in the sequences of the test set are presented in section 5.4.1. Finally, section 5.5 compares the Event Morpheme approach with competitive approaches.

5.2 Realization of Use Cases

As mentioned in section 1.2.2, a set of relevant use cases has been identified together with security systems experts. At first, the following sub-events are to be detected: pick-up, put-down, and falling down. The selected sub-events should be capable of composing the defined use cases. So, the following use cases, to be retrieved from the index, were defined.

1. “Fight”: a person falls, caused by a person passing by / a person falls down and doesn’t stand up again.
2. “Shopping”: did the customer pay or not?
3. “Criminal behavior at an ATM”: spying on another person’s PIN.
4. Detect beggar/salesman on street.

These scenarios were staged in different locations: 1) a newsstand, 2) an outdoor pathway, and 3) indoors. The following sections explain how the use cases were put into action.

5.2.1 Fight
The fight scenario was recorded indoors. Two persons approached each other or passed by one another. One of the persons fell. See figure 5.1 for a shot of the scene.

Figure 5.1: A person falls caused by another.

5.2.2 Shopping
The main issue in the shopping scenario was “Did the customer pay or not?”. To test this scenario, a set of sequences was shot at a convenience store. A person standing at a shelf with newspapers and magazines picks up a magazine, and then either puts it back, pays for it and leaves, or leaves without paying for it. Figure 5.2 shows a frame of the scene.

Figure 5.2: The shopping scenario.

5.2.3 Criminal Behavior at an ATM
The ATM scenario was recorded indoors. The issue of “spying on another person’s PIN” was staged by a person pretending to operate an ATM (actually a whiteboard). Another person approaches closely and (pretends to) spy on what person one is doing. See figure 5.3 for the setup of the scene.

Figure 5.3: A person spying on another’s PIN.

5.2.4 Detect Beggar/Salesman on Street
The scenario “beggar/salesman” was recorded outdoors. The setup of the scene was a line of people walking down a footpath. Another person comes toward those people and approaches each one of them. This results in a non-straight trajectory, whereas the others’ trajectories are straight. Figure 5.4 shows the scene.

Figure 5.4: A person walking from one person to another.

5.3 Reasoning out false positives
This section shows the results for the application of the reason-out strategy as introduced in section 4.7.3. If the object detected as taken-out/brought-in existed before the detection of the “removal”/“idle” event, or if the object’s trajectory gave proof of movement, the detector output was classified as a ‘false positive’.
The following improvements were achieved on the training set. The false positive rate decreased and, thus, the precision increased. For the detection of take-out, false positives decreased by 20,2%, resulting in an increase of 17,7% in precision. For the detection of bring-in, the results were even more significant: the false positive rate dropped by 35,8%, which improved the precision by 29,0%. See table B.1 in appendix B for the results on the training sequences. The reasoning-out strategy affects only the precision, as the recall is determined by the false negatives. Unfortunately, the latter cannot be reduced by post-hoc analysis.

5.4 Event Morpheme detection
For the detection of the scenarios described above, a demonstrator software called “postHocMPEG7Analysis” was developed. See figure 5.5 for a screenshot.

Figure 5.5: The postHoc analysis tool.

The software was developed using MATLAB™. The desired class of the Event Morpheme can be selected via the user interface. The implemented classes are

- fall (“Tip over”)
- take-out (“pick-up”)
- bring-in (“put-down”)
- non-straight (“move non-straight”)

Additionally, the following Semantic Modules are available.

1. In social area¹
2. Theft
3. Pay
4. Loitering
5. Speeding
6. Grab
7. Entering area
8. In & Out
9. Out & In

¹ term adopted from [Fuentes and Velastin, 2005]

Items 1 to 3 are those relevant to the test scenarios. “In social area” applies to the ATM scene: the social area is drawn in the video display. The second person entering this area triggers the event. “Theft” (item 2) is the succession of pick-up and disappear² without a put-down or an “entering the cashier zone” preceding; the cashier zone is drawn in the video display. “Pay” (item 3) is the succession of pick-up, “entering the cashier zone” and disappear, without the detection of put-down. Items 4 to 9 are not relevant for the detection of the use cases.
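How the Semantic Modules “Theft” and “Pay” combine Event Morphemes can be sketched as a check on the chronologically ordered events of one person track. The event labels (for instance `"enter-cashier-zone"`) and the function name are hypothetical; the sketch only mirrors the verbal definitions given above and is not the thesis implementation.

```python
from typing import List, Optional


def classify_shopping(events: List[str]) -> Optional[str]:
    """Classify one track's ordered events as 'Theft', 'Pay' or None."""
    if "pick-up" not in events or "disappear" not in events:
        return None
    start = events.index("pick-up")
    # Index of the last 'disappear', i.e. the end of the object track.
    end = len(events) - 1 - events[::-1].index("disappear")
    if end < start:
        return None
    between = events[start + 1:end]
    if "put-down" in between:
        return None  # the item was put back
    if "enter-cashier-zone" in between:
        return "Pay"
    return "Theft"


print(classify_shopping(["pick-up", "enter-cashier-zone", "disappear"]))  # Pay
print(classify_shopping(["pick-up", "disappear"]))                        # Theft
print(classify_shopping(["pick-up", "put-down", "disappear"]))            # None
```

Because the decision rests on the absence of a put-down, a missed put-down detection turns a harmless sequence into a reported theft.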

5.4.1 Results for the test sequences
This section presents the results for the detection of the identified use cases. These results were validated against a hand-annotated ground truth. The numbers of true positives, false positives and false negatives and the resulting values for precision and recall are shown in table 5.1.

Scenario          Ground truth  True positives  False positives  False negatives  Recall (%)  Precision (%)
pick-up           29            27              1                2                93,1        96,4
put-down          9             7               0                2                77,8        100
Shopping: pay     8             7               0                1                87,5        100
Shopping: theft   12            11              1                1                91,7        91,7
Beggar            2³            2               0                0                100         100
Fight             11            10              0                1                90,9        100
ATM               10            9               0                1                90          100

Table 5.1: Results of the detection of events in the test sequences

² disappear is the end of the object track.
³ tracks labeled manually

The detection of the Event Morpheme non-straight for the use case “detect Beggar/Salesman on street” failed due to the early overlap of the persons staging the scenario. So, the detector was tested with manually labeled tracks. With this input, the beggar was detected correctly. The one false positive for “theft” was caused by a missed “put-down”.
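The recall and precision values of table 5.1 follow from the counts as recall = TP / (TP + FN) and precision = TP / (TP + FP). The short Python sketch below recomputes the totals from the tabulated counts; the variable and function names are chosen for illustration only.

```python
# (scenario, true positives, false positives, false negatives) from table 5.1
TABLE_5_1 = [
    ("pick-up", 27, 1, 2),
    ("put-down", 7, 0, 2),
    ("pay", 7, 0, 1),
    ("theft", 11, 1, 1),
    ("beggar", 2, 0, 0),
    ("fight", 10, 0, 1),
    ("ATM", 9, 0, 1),
]


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)


total_tp = sum(row[1] for row in TABLE_5_1)
total_fp = sum(row[2] for row in TABLE_5_1)
total_fn = sum(row[3] for row in TABLE_5_1)

print(total_tp, total_fp, total_fn)                           # 73 2 8
print(f"total recall    = {recall(total_tp, total_fn):.3f}")     # 0.901
print(f"total precision = {precision(total_tp, total_fp):.3f}")  # 0.973
```

The totals give 73 true positives out of 81 ground-truth events and 73 correct detections out of 75 reported ones.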

5.4.2 Error bounds
This section approximates the Confidence Radii for the total recall and precision values calculated from the results above. The Confidence Radius determines a given interval, in which, with a certain confidence, the results actually may lie. As total recall, the presented system achieves 90%, and for total precision 97%. First, the Confidence Radius for recall is calculated [Papula, 1997].
Using a confidence level of 95% (γ = 0,95), from the constraint

$$P(-c \le U \le c) = \gamma = 0{,}95,$$ (5.1)

U being the actual value for recall, c calculates as

$$P(-c \le U \le c) = \Phi(c) - \Phi(-c) = \Phi(c) - [1 - \Phi(c)] = 2 \cdot \Phi(c) - 1 = 0{,}95 \;\rightarrow\; \Phi(c) = 0{,}975 \;\rightarrow\; c = u_{0,975} = 1{,}960.$$ (5.2)

Thus, the constant is the quantile $u_{0,975} = 1{,}960$ of the normal distribution. With n = 81 and k = 73 the unknown value of p estimates

$$\hat{p} = \frac{k}{n} = \frac{73}{81} = 0{,}901.$$ (5.3)

The constraint for an extensive sample (Δ > 9) is not totally met, yet close enough:

$$\Delta = n \cdot \hat{p} \cdot (1 - \hat{p}) = 81 \cdot 0{,}901 \cdot (1 - 0{,}901) = 7{,}21.$$ (5.4)

Now, the Confidence Radius Θ calculates as

$$\Theta = c \cdot \frac{\sqrt{\Delta}}{n} = 1{,}96 \cdot \frac{\sqrt{7{,}21}}{81} = 0{,}06 \approx 6\%.$$ (5.5)

Proceeding accordingly for the precision values (n = 75, k = 73), the confidence radius yields to ≈ 3%.
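The calculation of equations 5.3 to 5.5 can be reproduced with a few lines of Python. The function name is hypothetical; the constant c = 1,960 is the quantile derived above for a confidence level of 95%.

```python
import math


def confidence_radius(n: int, k: int, c: float = 1.960) -> float:
    """Confidence radius for an estimated proportion k/n (normal approximation)."""
    p_hat = k / n                     # eq. 5.3
    delta = n * p_hat * (1 - p_hat)   # eq. 5.4
    return c * math.sqrt(delta) / n   # eq. 5.5


print(f"recall:    {confidence_radius(81, 73):.3f}")  # 0.065
print(f"precision: {confidence_radius(75, 73):.3f}")  # 0.036
```

Evaluated without intermediate rounding, the radii come out at about 0,065 for recall and 0,036 for precision, in line with the roughly 6% and 3% quoted in the text.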
5.5 Comparison
In this section, the results presented in section 5.4.1 are compared to results published in literature. Where not available, the competitors’ results were transposed to precision/recall values using the results given. In the following, the results are shown as published.

[Brand et al., 1997] tested their system using 52 gestures of 3 types:

- “single whip”
- “cobra”
- “brush knee”

The classification accuracy for the proposed system was 94,2%.

[Cupillard et al., 2004] used the scenarios

- “fraud”
- “vandalism”
- “fighting”
- “blocking”
- “overcrowding”

They achieved the following detection rates: 6 of 6 for fraud, 4 of 4 for vandalism, 20 of 24 for fighting, 13 of 13 for blocking, and 2 of 2 for overcrowding. False alarms were 2 for fraud, 4 for fighting, and 1 for blocking.

[Guler et al., 2003] tested with “ten significant events such as package exchange over multiple views, people parking but not leaving their cars, people tailgating through access controlled check points”. The correct detection rate for the “Real Training Set”, real life video data, was 92,31%.

[Hakeem and Shah, 2004] classified staged meeting scenarios consisting of events

- “hand raised”
- “hand lowered”
- “pick up object”
- “put down object”
- “hand extended”
- “hand retraced”
- “hand waving”
- “moderator present”
- “hand pointing”
- “head shakes/nods”

Those ten events were used to detect three classes of meetings:

- “argument”
- “object passing”
- “voting”

[Rao and Shah, 2001] tested the proposed system on the events

- “pick up”
- “put down”

and on the detection of similar actions. The true positive rate for the detection of “pick up” was 9 with 6 false negatives and 1 false positive. “Put down” resulted in 7 true positives, 1 false negative and 1 false positive.

[Xiang et al., 2002] trained their system with the events

- “can taken”
- “enter & leave”
- “shopkeeper”
- “browsing”
- “paying”

The results on the test set for “can taken” consisted of 10 true positives with no false positive or negative (100% of 10, no false detection). Enter & leave resulted in 11 true positives, 3 false positives and 7 false negatives (61,1% of 18, 3 false detections). Shopkeeper achieved 4 true positives, 1 false positive and 8 false negatives (33,3% of 12, 1 false detection). The results for the detection of browsing are 5 true positives, 9 false positives and 3 false negatives (62,5% of 8, 9 false detections). Paying resulted in 6 true positives, 6 false positives and no false negative (100% of 6, 6 false detections). The comparison is summarized in table 5.2.

System                    Ground truth  Different scenarios  True positives  False positives  False negatives  Recall (%)  Precision (%)  Technology
[Brand et al., 1997]      52            3                    -               -                -                94,2        -              Coupled Hidden Markov Models
[Cupillard et al., 2004]  45            5                    42              7                3                93,3        85,7           three-stage formalism (operators-knowledge-description)
[Guler et al., 2003]      -             10                   -               -                -                92,3        92,3           HMM
[Hakeem and Shah, 2004]   122           3                    115             6                7                94,3        95,0           rule-based system and state machine
[Neuhaus, 2006]⁴          81            7                    73              2                8                90,1        97,3           (see chapter 4)
[Rao and Shah, 2001]      23            2                    16              2                7                69,6        88,9           HMM
[Xiang et al., 2002]      54            5                    36              19               18               66,7        65,5           EM for clustering, selection using Minimum Description Length (MDL)

Table 5.2: Comparison of the systems

⁴ This work, to be precise!

5.6 Mapping other approaches’ events to Event Morphemes
The concept of Event Morphemes is proposed to be a generic semantic framework for the mapping of low-level analysis data to high-level scene descriptions. This way, it must be possible to model any conceivable event. Table 5.3 exemplarily shows that this is the case for some scenarios used in literature.

[Hongeng et al., 2004]:
  “heading toward”                    heading toward is synonymous for approach
  “approaching a reference person”    approach is an Event Morpheme
[Bourbakis et al., 2003]:
  “moving Truck”                      an Event Morpheme with knowledge of the objects’ class
  “moving Cargo”, “moving CargoLift”, “moving Truck Cargo”    same as above
[Siskind, 2001]:
  “A hand is picking up a block”      two Event Morphemes with knowledge of the objects’ class
[Ghanem et al., 2004]:
  “car exchange”                      Semantic Module: the car is being driven away by a driver NOT being the one that brought it in
[Guler et al., 2003]:
  “wait”                              Event Morpheme stay
  “enter”                             Semantic Module: enter implies the knowledge of an entry (Meta Knowledge)
  “pick-up”                           Event Morpheme
  “package drop-off”, “exchange between people”    see the example in section 4.6

Table 5.3: Mapping other approaches’ events to Event Morphemes

Chapter 6

Conclusion and outlook

Wenn einer, der mit Mühe kaum
Geklettert ist auf einen Baum,
Schon meint, dass er ein Vogel wär,
So irrt sich der.¹
(Wilhelm Busch)

6.1 Introduction

In this thesis, the application of the semantic concept for mapping the results of low-level content analysis data to generic event descriptions is presented. The constituting elements of this approach and their underlying concepts as well as an introduction to their application are shown. A scenario or events are not regarded as a whole, but decomposed into small semantically meaningful parts (small but not atomic). By arranging these parts, the Event Morphemes, any scenario can be built.
The main contributions of the approach are the generality and the early stage at which the step from low-level to high-level representation is taken. This reasoning in the metadata domain is performed on small time frames. The reasoning on complex scenes is done in the semantic space. Even an unsupervised self-assessment is possible using the semantic approach. The results will be assessed in the following section, section 6.2. Section 6.3 points out future research directions.

¹ If someone climbs laboriously into the branches of a tree and thinks himself a bird to be: wrong is he.

6.2 Assessing the results

The proposed concept was validated by implementing an Event Morpheme-based event detection system. The overall detection result of the system comes to 97,3% and 90,1% for precision and recall, respectively. This shows that the concept of Event Morphemes as an index for large surveillance content databases is exceedingly suitable for the modelling of relevant surveillance scenarios.
The results show that the presented concept’s performance is at least equal to the state of the art. The difference to the other approaches in comparison is that the system presented here is a post-hoc metadata analysis system. It is to be expected that the incorporation of the Event Morpheme detectors within the VCA algorithm will increase the performance of the system. The information available directly in the VCA module allows incorporating more input to the detector modules and even confidence measures for the recognition. This would also exploit the concept of arranging Event Morpheme classes in a taxonomy with increasing semantic predicate (see section 4.3.3). There are VCA systems that perform long-term analysis such as background models, 2,5D maps or entry/exit detection and thus a (semi-)automated creation of a map of the scene. Such systems would further reduce the involvement of the user in the event recognition process and this way enable the development of a more integrated system.
The system clearly relies on the content analysis algorithm. This not only means that improvement of the event detection includes improving the VCA. It also entails that the assessment of the retrieval performance is a possible approach to evaluate the performance of VCA algorithms.
The drawback of other approaches is the lack of generality. This is resolved by splitting up the information necessary to recognize an event into three parts. General domain knowledge about what an event “looks like”, scenario-specific information such as the layout of the scene etc., and search-specific user restrictions are regarded separately. They are not incorporated until an actual query is posted.
An index for the search in large databases is best set up using Event Morphemes as a semantic intermediate layer rather than searching directly on the low-level metadata. Faster searches are possible on the pre-processed index, with no decrease in generality.
The concept of Event Morphemes is capable of modeling the other approaches’ events. Even the output of those systems based on tracking of limbs such as [Brand et al., 1997] and [Hakeem and Shah, 2004] can be mapped to Event Morphemes.

6.3 Outlook

An important question is how accurate semantic scene descriptions based on currently available content analysis algorithms can be. Another thing to be evaluated is the number of event classes that can be recognized in real time. This would provide a flexibly configurable system to alert the surveillance operator when actions of interest take place, a kind of semantic filter, so to speak.
A conceivable extension would be the incorporation of audio as a second modality. Multimodal analysis leads to better performance in multimedia analysis [Snoek and Worring, 2003]. For the field of surveillance, this might be a different story, as surveillance scenes are neither composed nor do they have a musical soundtrack. So, the modality “audio” might add new events rather than influence the confidence of the visual clues. But this, of course, has to be investigated further.
On the semantic side, the development of a complete ontology is a matter of future research. The potential output of content analysis algorithms has to evolve in diversity to make the ontology complete and not limit it to events based on trajectory analysis and change detection in the background (e.g. removal). Of course, the concept of the Event Morphemes is already applicable, but it still has more potential.

Bibliography

The DARPA Agent Markup Language Homepage, 2000. URL http://www.daml.org.
Description of OIL, 2000. URL http://oil.semanticweb.org/.
Chiraz Ben Abdelkader and Larry Davis. Detection of people carrying objects: a motion-based recognition approach. In 5th International Conference on Automatic Face and Gesture Recognition, May 2002.
J. K. Aggarwal and Q. Cai. Human motion analysis: A review. Computer Vision and Image Understanding, 73(3):428–440, March 1997.
J. F. Allen. Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11):832–843, 1983.
J. F. Allen and G. Ferguson. Actions and events in temporal intervals. Journal of Logic Computing, 4(5):531–579, 1994.
James F. Allen. Towards a general theory of action and time. Artificial Intelligence, 23:123–154, 1984.
D. Ayers and M. Shah. Monitoring human behavior from video taken in an office environment. Image and Vision Computing, 19(12):833–846, October 2001.
Faisal I. Bashir and Ashfaq A. Khokhar. Video content modeling techniques: An overview. Technical report, Dept. of CS/ECE, UIC, 2003.
A. F. Bobick and J. W. Davis. The recognition of human movement using temporal templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(3):257–267, March 2001.
Bob Bolles and Ram Nevatia. A hierarchical video event ontology in OWL. ARDA challenge project final report, 2004. Reporting Period: June 1, 2004 - September 30, 2004.

Nikolaos G. Bourbakis, James R. Gattiker, and George Bebis. A synergistic model for representing and interpreting human activities and events from video. International Journal on Artificial Intelligence Tools, 12(1):101–116, 2003.
M. Brand, N. Oliver, and A. Pentland. Coupled hidden Markov models for complex action recognition. In Computer Vision and Pattern Recognition, pages 994–999, June 1997.
C. Bregler. Learning and recognizing human dynamics in video sequences. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 568–574, June 1997.
François Brémond, Nicolas Maillot, Monique Thonnat, and Van-Thinh Vu. Ontologies for video events. RR-5189, INRIA Sophia-Antipolis Research Unit, May 2004.
James Clark. XSL Transformations (XSLT), November 1999. URL http://www.w3.org/TR/xslt.
J. Connell, A. W. Senior, A. Hampapur, Y.-L. Tian, L. Brown, and S. Pankanti. Detection and tracking in the IBM PeopleVision System. In IEEE International Conference on Multimedia and Expo (ICME ’04), pages 1403–1406 Vol. 2, June 2004.
D. Cunado, M. S. Nixon, and J. N. Carter. Using gait as a biometric, via phase-weighted magnitude spectra. In 1st Int. Conf. on Audio- and Video-Based Biometric Person Authentication, pages 95–102, 1997.
F. Cupillard, A. Avanzi, F. Brémond, and M. Thonnat. Video understanding for metro surveillance. In IEEE International Conference on Networking, Sensing and Control, Volume: 1, pages 186–191, March 2004.
R. Cutler and L. S. Davis. Robust real-time periodic motion detection, analysis, and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):781–796, August 2000.
John Davies, Dieter Fensel, and Frank van Harmelen, editors. Towards the Semantic Web. Wiley, 2002.
J. W. Davis. Sequential reliable-inference for rapid detection of human actions. In Computer Vision and Pattern Recognition Workshop, pages 111–118, June 2004.

H. M. Dee and D. C. Hogg. On the feasibility of using a cognitive model to filter surveillance data. In IEEE International Conference on Advanced Video and Signal-Based Surveillance, Como, Italy, 2005.
I. Ersoy, F. Bunyak, and S. R. Subramanya. A framework for trajectory based visual event retrieval. In International Conference on Information Technology: Coding and Computing (ITCC 2004), Volume: 2, pages 23–27, 2004.
Alan Fern. Learning Models and Formulas of a Temporal Event Logic. PhD thesis, Purdue University, August 2004.
G. L. Foresti, C. Micheloni, and L. Snidaro. Event classification for automatic visual-based surveillance of parking lots. In 17th International Conference on Pattern Recognition, Volume: 3, pages 314–317, August 2004.
Luis M. Fuentes and Sergio A. Velastin. Tracking-based event detection for CCTV systems. Pattern Analysis and Applications, 7(4), 2005.
D. M. Gavrila. The visual analysis of human movement: A survey. Computer Vision and Image Understanding, 73(1):82–98, 1999.
D. M. Gavrilla. The visual analysis of human movement: A survey. Computer Vision and Image Understanding: CVIU, 73(1):82–98, 1997.
M. R. Genesereth and R. E. Fikes. Logical Foundations of Artificial Intelligence. Morgan Kaufmann Publishers, 1987.
Nagia Ghanem, Daniel DeMenthon, David Doermann, and Larry Davis. Representation and recognition of events in surveillance video using Petri nets. In IEEE Workshop on Event Detection and Recognition, page 112, June 2004.
W. Eric L. Grimson and Tomás Lozano-Pérez. Recognition and localization of overlapping parts from sparse data. A.I. Memo 841, Massachusetts Institute of Technology Artificial Intelligence Laboratory, June 1985.
Alexei Gritai, Yaser Sheikh, and Mubarak Shah. On the use of anthropometry in the invariant analysis of human actions. In ICPR (2), pages 923–926, 2004.
T. R. Gruber. Toward principles for the design of ontologies used for knowledge sharing. In N. Guarino and R. Poli, editors, Formal Ontology in Conceptual Analysis and Knowledge Representation. Kluwer Academic Publishers, 1995.

Sadiye Guler, Winnie H. Liang, and Ian A. Pushee. A video event detection and mining framework. In Conference on Computer Vision and Pattern Recognition Workshop, volume 4, pages 42–49, 2003.
Asaad Hakeem and Mubarak Shah. Ontology and taxonomy collaborated framework for meeting classification. In 17th International Conference on Pattern Recognition (ICPR), 2004.
Asaad Hakeem, Yaser Sheikh, and Mubarak Shah. CASEE: A hierarchical event representation for the analysis of videos. In The Nineteenth National Conference on Artificial Intelligence (AAAI), pages 263–268, 2004.
A. Hampapur, L. Brown, J. Connell, M. Lu, H. Merkl, S. Pankanti, A. W. Senior, C. Shu, and Y.-L. Tian. The IBM Smart Surveillance System. In IEEE CVPR, Washington D.C., June 2004.
Ismail Haritaoglu, David Harwood, and Larry S. Davis. W4: Real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):809–830, August 2000.
Ismail Haritaoglu, Ross Cutler, David Harwood, and Larry Davis. Backpack: Detection of people carrying objects using silhouettes. Computer Vision and Image Understanding, 81(3):385–397, March 2001.
Jerry R. Hobbs and Feng Pan. An ontology of time for the semantic web. ACM Transactions on Asian Language Processing (TALIP): Special issue on Temporal Information Processing, 3(1):66–85, March 2004.
Somboon Hongeng and Ramakant Nevatia. Multi-agent event recognition. In International Conference on Computer Vision (ICCV ’01), volume 2, July 2001.
Somboon Hongeng, Ram Nevatia, and François Brémond. Video-based event recognition: activity representation and probabilistic recognition methods. Computer Vision and Image Understanding, 96(2):129–162, November 2004.
Weiming Hu, Tieniu Tan, Liang Wang, and Steve Maybank. A survey on visual surveillance of object motion and behaviors. In IEEE Transactions on Systems, Man and Cybernetics, Part C, pages 334–352, Aug. 2004.
Jane Hunter. Adding multimedia to the semantic web - building an MPEG-7 ontology. International Semantic Web Working Symposium (SWWS), Jul. 2001.

ISO/MPEG. N4579, Text of ISO/IEC Draft Technical Report 15398-8 Information Technology - Multimedia Content Description Interface - Part 8 Extraction and Use of MPEG-7 Descriptions, March 2002.
Yuri A. Ivanov and Aaron F. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):852–872, August 2000.
O. Javed, S. Khan, Z. Rasheed, and M. Shah. Camera handoff: Tracking in multiple uncalibrated stationary cameras. In Workshop on Human Motion, pages 113–118, December 2000.
A. Jepson and W. Richards. What makes a good feature? Spatial Vision in Humans and Robots, pages 89–125, 1991.
A. Kojima, T. Tamura, and K. Fukunaga. Textual description of human activities by tracking head and hand motions. In 16th International Conference of Pattern Recognition, Vol. 2, pages 1073–1077, August 2002.
J. Krumm, S. Harris, B. Meyers, B. Brumitt, M. Hale, and S. Shafer. Multi-camera multi-person tracking for EasyLiving. In Third IEEE International Workshop on Visual Surveillance, pages 3–10, July 2000.
O. Lassila and R. Swick. Resource description framework (RDF), January 1999. URL http://www.w3.org/TR/PR-rdf-syntax/.
Alan J. Lipton. Local application of optic flow to analyse rigid versus non-rigid motion. In ICCV99 Workshop on Frame-Rate Applications, September 1999.
Jiangung Lou, Qifeng Liu, Tieniu Tan, and Weiming Hu. Semantic interpretation of object activities in a surveillance system. In IEEE 16th Int. Conf. Pattern Recognition, volume 3, pages 777–780, 2002.
Dimitrios Makris and Tim Ellis. Automatic learning of an activity-based semantic scene model. In IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS ’03), pages 183–1488, July 2003.
Gérard Medioni, Isaac Cohen, Brémond, Somboon Hongeng, and Ramakant Nevatia. Event detection and analysis from video streams. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(8):873–889, August 2001.

M. Meyer, M. Hötter, and T. Ohmacht. A new system for video-based detection of moving objects and its integration into digital networks. In Security Technology, 30th Annual International Carnahan Conference, Oct. 1996.
Stefan Müller-Schneiders, Thomas Jäger, Hartmut S. Loos, and Wolfgang Niem. Performance evaluation of a real time video surveillance system. In 2nd Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, October 2005.
Jacinto C. Nascimento, Mário A. T. Figueiredo, and Jorge S. Marques. Motion segmentation for activity surveillance. In 1st ISR Workshop on Systems, Decision and Control Robotic Monitoring and Surveillance, Lisbon, June 2005.
M. Naylor and C. I. Attwood. ADVISOR Annotated Digital Video for Intelligent Surveillance and Optimised Retrieval. IST 1999-11287: Advisor, ADVISOR, May 2003.
Holger Neuhaus. A semantic concept for the mapping of low-level analysis data to high-level scene description. In Fourth International Workshop on Content-Based Multimedia Indexing (CBMI 2005), ISBN 952-15-1364-0, 21-23 June 2005. SuviSoft Oy Ltd. CD Rom.
Nevatia, Hobbs, and Bolles. An ontology for video event representation. In Conference on Computer Vision and Pattern Recognition Workshop, pages 119–128, June 2004.
Lothar Papula. Mathematik für Ingenieure und Naturwissenschaftler, volume 3. Vieweg, 1997.
Sangho Park and J. K. Aggarwal. Recognition of two-person interactions using a hierarchical Bayesian network. In First ACM SIGMM International Workshop on Video Surveillance, pages 65–76, 2003.
Sangho Park and Jake K. Aggarwal. Semantic-level understanding of human actions and interactions using event hierarchy. In Conference on Computer Vision and Pattern Recognition Workshop, pages 12–20, June 2004.
J. H. Piater, S. Richetto, and J. L. Crowley. Event-based activity analysis in live video using a generic object tracker. In 3rd IEEE Int. Workshop on PETS, pages 1–8, June 2002.
C. Pinhanez and A. Bobick. Human action detection using PNF propagation of temporal constraints. In IEEE Conference on Computer Vision and Pattern Recognition, pages 898–904, June 1998.

Fatih Porikli and Tetsuji Haga. Event detection by eigenvector decomposition using object and frame features. In Conference on Computer Vision and Pattern Recognition (CVPRW), volume 7, page 114, June 2004.
Cen Rao and Mubarak Shah. View-invariant representation and learning of human action. In IEEE Workshop on Detection and Recognition of Events in Video, July 2001.
J. M. Siskind. Grounding the lexical semantics of verbs in visual perception using force dynamics and event logic. Journal of Artificial Intelligence Research (JAIR), 15:31–90, August 2001.
Michael K. Smith, Deborah McGuinness, Raphael Volz, and Chris Welty. Web ontology language (OWL) guide, November 2002. URL http://www.w3.org/TR/2002/WD-owl-guide-20021104/.
C. G. M. Snoek and M. Worring. Multimodal video indexing: A review of the state-of-the-art. Multimedia Tools and Applications, 2003. URL http://citeseer.nj.nec.com/article/snoek03multimodal.html.
Chris Stauffer and W. Eric L. Grimson. Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):747–757, August 2000.
H. Stern, U. Kartoun, and A. Shmilovici. An expert system for surveillance picture understanding. In NATO ASI, Data Fusion for Situation Monitoring, Incident Detection, Alert and Response Management, 2003.
Alexander Toshev, François Brémond, and Monique Thonnat. An Apriori-based method for frequent composite event discovery in videos. In ICVS, page 10, 2006.
Van-Thinh Vu, François Brémond, and Monique Thonnat. Automatic video interpretation: A recognition algorithm for temporal scenarios based on pre-compiled scenario models. In 3rd International Conference on Vision System (ICVS ’03), pages 523–533, April 2003.
T. Xiang, S. Gong, and D. Parkinson. Autonomous visual events detection and classification without explicit object-centered segmentation and tracking. In British Machine Vision Conference (BMVC), Vol. 1, pages 233–242, September 2002.
Lun Xin and Tieniu Tan. Ontology-based hierarchical conceptual model for semantic representation of events in dynamic scenes. In 2nd Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, 15-16 Oct. 2005. Paper submission deadline: June 30th, 2005.
Tao Zhao, Tianshu Wang, and Heung-Yeung Shum. Learning a highly structured motion model for 3D human tracking. In The 5th Asian Conference on Computer Vision, January 2002.

Appendix A

Summary (English translation of the German summary)

A.1 Introduction

Along with the growing need for security, an increasing amount of surveillance content is created. Indexing the content in advance is indispensable in order to enable a quick and reliable search in the recordings of hundreds or thousands of surveillance sensors (cameras etc.) installed at a single facility. The goal of Smart Indexing & Retrieval (SIR; also Semantic-Based Video Retrieval (SBVR), [Hu et al., 2004], Content Based Video Indexing and Retrieval (CBVIR), [Bashir and Khokhar, 2003] or Automatic Forensic Video Retrieval (AFVR), [Hampapur et al., 2004]) is the generation of metadata and thus the enabling of efficient searches. The generation of this metadata has to be done automatically on the basis of the content analysis algorithms, as creating it manually in reasonable time and at reasonable costs becomes more and more difficult. The actual challenge here is the semantic analysis of the content. Although this topic has been addressed in several approaches, the search for an all-purpose solution continues:

If indeed such a useful set of generic actions can be defined, would it be possible to identify corresponding features and matching methods that are, to a large degree, application-independent? ([Gavrila, 1999])

A.1.1 Problem Statement
Intuitively, an index depends on the kind of retrieval query that is expected to be run against it. Unfortunately, it is not known in advance what will be searched for. If an index is searched for a scenario that was not known to be relevant when the index was created, problems are inevitable. The index has to be "smart", or at least generic, to make such a query possible. Together with "Smart Retrieval", a search for any scenario would then be possible. So what is "Smart Indexing", and what is "Smart Retrieval"?

A smart way of building an index is to index what makes sense, in a way that makes sense. "Smart" in the sense of information theory would mean reducing redundancy. "Smart Retrieval" exploits such a Smart Index in the most efficient way, i.e., the index is interpreted correctly. Specific domain properties, e.g., the scene layout, as well as user constraints are taken into account at retrieval time.

A number of other problems arise when a query is posed via semantic keywords (or even as free text). The greatest difficulty lies in mapping the low-level image representation (pixels) to high-level (human) semantics. That is, while low-level features are easy to extract, the starting point of a retrieval process is normally the user's semantic question. Mapping the low-level features used by the computer on the one hand to the query posed by a human on the other illustrates the problem commonly described as "bridging the semantic gap"[1]. However, bridging the semantic gap does not only mean mapping high-level queries to low-level features. The essence of a semantic question is understanding the meaning behind the question. This is of course also user dependent, since it includes the definition of terms that depend on the domain in which the user is searching. This has to be considered when a question is processed.

Taking the above into account, the information is arranged in ascending order of its semantic level, from the pixel level (in the visual domain) to semantic metadata. The first level comprises the image processing algorithms. In this thesis it is called Level 0, because it is the start of the analysis and there is nothing "below" it. The Level 0 algorithms are those that segment moving objects, extract features such as color or texture, and perform object classification. The result of these algorithms, and thus of Level 0, is metadata, e.g., the position of the object in pixel coordinates, a defined label for the object's class, etc.

The results of Level 0 are the input for Level 1, where basic reasoning is performed. This includes recognizing simple events, analyzing the direction of motion, or recognizing people who carry goods [Abdelkader and Davis, 2002], e.g., a backpack [Haritaoglu et al., 2001]: the reasoning about whether goods are being carried is based on the object contour, which is the metadata output of Level 0.

Finally, Level 2 is the highest semantic level. It combines the results of Level 1 with semantic information about the scene into a semantic description such as "theft of a suitcase". Figure A.1 gives an overview of the different semantic levels.

[1] German: Überbrücken der semantischen Lücke

Figure A.1: The different levels of semantics in scene descriptions.

Level 2 represents how a user would describe a scene, and thus the form in which a user would most likely pose a query to a retrieval system.

A.1.2 An Overall System
To create a system that is able to detect events and to combine these events into a semantic scene description, the following are required:
1. a low-level analysis (VCA[2]) algorithm,
2. an ontology for the definition of a general description language,
3. an inference mechanism that maps the VCA metadata to a semantic label,
4. a tool/mechanism for creating a corresponding map of the scene, and
5. a retrieval interface.

These parts form the system as follows. The VCA algorithm (1) analyzes the image sequence and outputs the shapes of objects, their trajectories, and features such as color, texture, etc. This produces the Level 0 metadata. The inference mechanism (3) is the crucial point of the system: anything that is not handled correctly here affects the consistency of the following stages (as do shortcomings of the VCA). This part of the system has to be adapted to the VCA in use. Its output is Level 1 metadata. The ontology for the general description language (2) is needed on the one hand for logical reasoning and on the other hand for the output to the user and for mapping the user's question to system-internal terms. In addition, the inference logic, which uses the underlying ontology, enables (together with 4) the step to Level 2. Both the ontology and the reasoning should be independent of the VCA algorithms in use, since this adaptation has already been made at the transition from Level 0 to Level 1. Finally, the retrieval interface (5) should be flexible enough to let the user pose a variety of questions and to map each question onto the index.

[2] Video Content Analysis
The VCA algorithms at Level 0 are an important prerequisite for the following stages. This area, however, is not the focus of this work. This thesis introduces a semantic concept for the detection, representation, indexing, and retrieval of events based on the output of a VCA algorithm. The points of interest are therefore the generation of Level 1 and Level 2 and a generic approach to recognizing behavior, together with a suitable concept for indexing and retrieval.

A.1.3 Fields of Application
A Smart Indexing system should be able to index video content automatically in order to enable efficient searches. The index must have low redundancy and yet be rich in information. Smart Retrieval must be able to exploit the Smart Index intelligently by combining the indices and by including additional information such as user constraints and the scene layout.

A Smart Indexing & Retrieval system has to be measured by its retrieval performance. When such a system is assessed, specific events are searched for and the results are evaluated.


Together with security system experts[3], a number of relevant use cases were identified. First, the following sub-events are to be detected: picking up, putting down, and falling down. The selected sub-events are meant to make it possible to compose the defined use cases. The cases to be retrieved were defined as follows:
1. "Fight": a person falls, caused by a person passing by / a person falls down and does not get up again.
2. "Shopping": does the customer pay or not?
3. "Criminal behavior at an ATM": spying out another person's PIN.
4. Detect beggars/vendors on the street.

A.2 Existing Solutions
Research in the field of automated action recognition and scene description has been going on "for some time already" [Gavrilla, 1997]. For a complete overview, see Chapter 2. The following sections show systems and research work that are related to the presented work in the sense that the recognition process is based on a model and that (to a certain degree) a semantic description of the scene is generated.

A.2.1 Model-based Event Recognition
[Hongeng et al., 2004] present a concept that represents actions, together with recognition methods that use this representation. An action is composed of action threads[4], each of which is executed by a single actor. A "single-thread event" represents the properties of the trajectory and the shape. A "multi-thread event" is composed of several action threads under temporal constraints and is represented by an event graph similar to interval algebra networks [Allen and Ferguson, 1994]. Events are arranged in several layers of abstraction. This approach is closely related to [Ivanov and Bobick, 2000], since external knowledge about the expected structure of the action model is included. The model is hierarchical but models only agents, not objects.
[3] To be more precise, employees of the Bosch Security Systems product division
[4] German: Aktionsstränge


[Cupillard et al., 2004] present a system developed within the European project ADVISOR [Naylor and Attwood, 2003]. It is an approach for the online recognition of the behavior of an individual, a group of people, or a crowd in the context of subway surveillance, using several cameras. A hierarchy is used to represent the scene. The inputs for scenario recognition are
1. scenario models defined by experts,
2. geometric information about the observed environment, and
3. persons tracked by a vision module (VCA), which is assumed to do this correctly.
The formalism is based on three main ideas:
1. define different kinds of operators (software modules) for the recognition,
2. keep all required knowledge in the corresponding operator, and
3. describe the operator declaratively in order to build an extensible library of operators.
Scenarios are defined "as a whole", so variations have to be handled by different detector modules.
[Bourbakis et al., 2003] define a model for the representation, recognition, and interpretation of human actions. The model is based on the hierarchical synergy of three other models: the Local/Global (L-G) graph, the Stochastic Petri Net (SPN) graph, and a neural network (NN) model. The authors distinguish between structural knowledge (knowledge about the physical state) and functional knowledge (knowledge about changes and events). They develop Dynamically Multi-Linked Hidden Markov Models (DML-HMM) to interpret group activities. The field of application is cargo loading activity at an airport, with events such as "movingTruckCargo", which makes this system entirely domain dependent.
An approach based on force dynamics is presented by [Siskind, 2001]. The author builds a system for recognizing events that are described by simple verbs of spatial motion in short image sequences. The semantics of these verbs is specified with event-logic expressions that describe changes in the force-dynamic relations between the participants of the event.


[Fern, 2004] extends [Siskind, 2001] by developing a supervised learning algorithm that automatically derives high-level visual events from a low-level force-dynamic representation of the video. A temporal event description language is introduced: AMA, "And's of Meet's of And's" ("MA timelines"). An MA timeline is the sequence of one state that holds for some time and a second state that holds for some time and meets the first. AMA is the conjunction of MA timelines. A learning method is developed on the basis of these algorithms and applied to learning event definitions from videos.
[Ghanem et al., 2004] represent and recognize events using Petri nets. From this they build an interactive system for searching surveillance videos for events. The queries are not known in advance and are composed of primitive events. Petri nets are used both for the representation and as the recognition method. A graphical user interface is used to formulate queries. These queries are then mapped onto a set of Petri nets. [Ghanem et al., 2004] also use an event ontology that defines states, events, composite events/scenes, and relations. The system is applied to a parking lot with events such as "counting cars" and "car exchange".
As an approach to modeling events and to analysis with semantic representations, Xin and Tan [2005] propose a system that integrates the contextual information into a hierarchical conceptual model (namely an ontology) and defines events as significant changes and mappings of conceptual units in the model. Three basic event components form the concept: entities, words, and a set of attributes. Events are represented as evident changes of properties. Various attributes of regions and moving objects are entities. The scene is divided into different areas, which are labelled manually. Interactions between moving objects and special areas are described by their spatial relations. The hierarchy of the ontology consists of three levels. The first level is the layout of the scene and the associated constraints. The next level contains the ontology of the moving objects, which describes the states of motion and the concepts of the interactions mentioned above. At the third level, the semantic ontology represents what happens in the scene.
[Guler et al., 2003] present a framework for event recognition in videos and for storing the events for detection, annotation, and browsing. The detector part considers events in a hierarchical structure consisting of
1. the tracking data,
2. simple behaviors such as "waiting", "walking", or "picking up", and
3. higher-level activities composed of these simple behaviors, such as "meeting", "package drop-off", or "exchange between persons".

A.2.2 Open Issues
To be able to recognize events, three inputs are required in addition to the video:
1. domain knowledge, i.e., the knowledge of "what an event looks like in the specific domain",
2. knowledge about the layout of the scenes, and
3. user constraints such as the permitted duration of stay in a sensitive area.

The main challenge of semantic video retrieval is the fact that at indexing time it cannot be known which events will be asked for. It would therefore not be wise to index multi-thread, multi-agent scenarios directly as they are detected online. To be able to search for them nonetheless, meaningful sub-events have to be indexed. It is appropriate here to use only scenario-independent, and thus basic, components of events. The approaches to event recognition proposed so far are more or less application-specific real-time systems. The approaches that use a taxonomy/ontology carry information about scenes or user constraints directly in their detection algorithms.

What is really missing, then, is a representation that is generic, can be applied to complex scene descriptions, and thus enables Smart Indexing. Retrieval can be performed efficiently on indices built in this way, even if the events searched for were not known at indexing time. Rules and constraints do not belong in the index; they should be set up in the retrieval step. The same holds for context information: which scenario represents "theft" and which represents "exchange" can only be resolved by the user to whom a stolen suitcase is reported.


A.3 Ways and Means of the Solution
A.3.1 Introduction
This section presents the theoretical foundations of the work. Ontologies are introduced as a tool for knowledge representation and inference. The syntax of the ontology language in which the rules of the semantic concept of this thesis are formulated is summarized separately. The software used is shown, as are the retrieval quality measures applied.

A.3.2 Ontology
Ontologies were developed in artificial intelligence to enable the sharing and reuse of knowledge. An ontology provides a shared and common understanding of a domain that can be communicated between people and heterogeneous, distributed application systems. It is also an explicit conceptualization (i.e., meta-information) that describes the semantics of the data.

Ontology Design
Reusability is one of the most important properties. Research aims at developing technologies that enable the reuse of ontologies. To meet this requirement, an ontology has to consist of small modules that have high internal cohesion and a limited number of dependencies between the modules. Gruber formulated design principles for ontologies in 1995. Objective criteria are needed to carry out and to evaluate designs. A preliminary set of design criteria for ontologies whose purpose is knowledge sharing and interoperation among programs based on a shared conceptualization is [Gruber, 1995]:
1. Clarity: The definitions described in an ontology should be objective and independent of the social or technical context. A definition should (where possible) be stated in logical axioms and documented in natural language.
2. Coherence: An ontology should be coherent. That is, the inferences must not contradict the definitions made. The axioms used for the definitions should be logically consistent.
3. Extendibility: It should be possible to define new terms on the basis of the existing vocabulary without having to re-examine and revise the existing definitions.
4. Minimal use of implementation details (minimal encoding bias): The conceptualization of an ontology should not rest on a partial encoding that depends on the later implementation.
5. Minimal ontological commitment: An ontology should make a minimal but sufficient set of claims about the domain it models.

A.3.3 VERL
In the summer and fall of 2003, the Advanced Research and Development Activity (ARDA) of the US government sponsored the challenge project "Video Event Taxonomy". The result was a formal language for describing ontologies of events, called VERL (Video Event Representation Language). A companion language, called VEML (Video Event Markup Language), for annotating instances of the events described in VERL was developed as well. The syntax and application of VERL and VEML, as presented in [Bolles and Nevatia, 2004], can be found in Section 3.2.3.

A.3.4 The VCA Algorithm
Developing an image processing algorithm is not the focus of this work. The VCA input is provided by the system presented by [Müller-Schneiders et al., 2005]. The detection and description of moving objects are based on an object-oriented, statistical multi-feature analysis of video sequences. This analysis adapts itself to the observed scene (Meyer et al. [1996]). The system is able to identify "splits" and "merges", where a single object splits into two or more objects, or two or more objects merge into one object. In addition, objects that are or have become idle are detected. The system can also distinguish between removed and left-behind objects.

The system thus provides the following outputs:
1. object regions,
2. the tracks of the objects,
3. an "idle" flag, and
4. the distinction between put-down and removed objects.
These are, in the form of an MPEG-7 document, the input for the system presented in this thesis.

A.3.5 Measures of Retrieval Quality
Precision and recall are the basic measures used in evaluating search strategies. They assume that:
1. there is a set of entries in a database that are relevant to a search,
2. entries are assumed to be either relevant or irrelevant, and
3. the actual retrieval set may not perfectly match the set of relevant entries.

Recall
RECALL is the ratio of the number of relevant entries retrieved to the total number of relevant entries in the database. It is usually expressed as a percentage.
Recall = (number of retrieved and relevant entries) / (total number of relevant entries).

Precision
PRECISION is the ratio of the number of relevant entries retrieved to the total number of irrelevant and relevant entries retrieved. It is usually expressed as a percentage.
Precision = (number of retrieved and relevant entries) / (total number of retrieved entries).
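The two measures can be illustrated with a short Python sketch. It is only an illustration: the function names are chosen here and are not part of the system described in this thesis, and the example counts (27 correct, 1 false, and 2 missed detections) are taken from the "pick up" row of Table A.1.

```python
def recall(correct: int, missed: int) -> float:
    """Percentage of relevant entries that were actually retrieved."""
    relevant = correct + missed          # all relevant entries in the database
    return 100.0 * correct / relevant if relevant else 0.0


def precision(correct: int, false_hits: int) -> float:
    """Percentage of retrieved entries that are relevant."""
    retrieved = correct + false_hits     # everything the system returned
    return 100.0 * correct / retrieved if retrieved else 0.0


# "pick up": 27 correctly detected, 1 falsely detected, 2 not detected
print(round(recall(27, 2), 1))     # 93.1
print(round(precision(27, 1), 1))  # 96.4
```

Returning 0.0 for an empty denominator is a convention assumed for this sketch; other conventions are possible.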

The "Ground Truth"
There is no deterministic method for understanding what is relevant to a user's search. To evaluate a retrieval system, the relevant documents therefore have to be selected by hand[5]. In the case of an event detection system, the ground truth consists of:
• the type of the event,
• the time window in which the event occurred, and
• the ID assigned by the VCA system to the object that performs the event.
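How detections could be scored against such a ground truth is sketched below in Python. This is a hypothetical sketch, not the evaluation code used for this thesis: the class and function names are invented for illustration, and the matching rule (same event type, same object ID, detection time inside the annotated time window, each ground-truth entry matched at most once) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroundTruthEntry:
    event_type: str   # type of the event
    start: float      # start of the time window (seconds)
    end: float        # end of the time window (seconds)
    object_id: int    # ID assigned by the VCA system


@dataclass(frozen=True)
class Detection:
    event_type: str
    time: float
    object_id: int


def matches(det: Detection, gt: GroundTruthEntry) -> bool:
    """Assumed rule: type and ID agree and the time lies in the window."""
    return (det.event_type == gt.event_type
            and det.object_id == gt.object_id
            and gt.start <= det.time <= gt.end)


def score(detections, ground_truth):
    """Return (correct, false_hits, missed) for a list of detections."""
    unmatched = list(ground_truth)
    correct = false_hits = 0
    for det in detections:
        hit = next((gt for gt in unmatched if matches(det, gt)), None)
        if hit is None:
            false_hits += 1
        else:
            unmatched.remove(hit)    # each annotation counts only once
            correct += 1
    return correct, false_hits, len(unmatched)


truth = [GroundTruthEntry("put_down", 10.0, 15.0, 3),
         GroundTruthEntry("pick_up", 40.0, 45.0, 7)]
found = [Detection("put_down", 12.0, 3), Detection("pick_up", 80.0, 7)]
print(score(found, truth))  # (1, 1, 1)
```

The three counts returned by `score` are exactly the quantities needed to compute recall and precision.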

A.4 A Semantic Concept for the Mapping of Low-level Analysis Data to High-level Scene Descriptions
Morpheme, from Wikipedia, the free encyclopedia.

A morpheme is the smallest meaning-bearing unit of a language at the level of content and form in the langue, the language system. It can also be described as the smallest semantically interpretable constituent of a word.

A.4.1 Introduction
This thesis presents a concept for the representation and detection of events. The concept was introduced in [Neuhaus, 2005]. The development and application of this approach as well as a "proof of concept" are presented.

A.4.2 Terminology
First, there are some terms that have to be clearly defined before they are used. Most terms of content analysis have a corresponding counterpart on the semantic side, but the correspondence is not one-to-one.

[5] the "gold standard", the "all-encompassing truth"


Low-level and high-level. Low-level describes the features that can be extracted automatically by means of signal processing (e.g., image processing). The extraction and segmentation of objects in image sequences is a low-level task. High-level describes the semantic meanings and concepts a user employs when describing a scene. With respect to automated content analysis and indexing, "high-level" describes the process of drawing conclusions from further analysis of low-level features.

Features and attributes. As used here, the terms features and attributes describe the properties of low-level data and of the high-level semantic concepts, respectively. "Feature" is already in use in the field of content analysis. For the semantic concept presented here, "attribute" describes the elements of the semantic representation. This can be the number of objects, the direction of motion, or the "idle" time of an object.

Moving region and object. The moving region is the output of Level 0, the VCA system. It is an arbitrarily shaped set of pixels that, in theory, represents one object. Reality shows, however, that current content analysis cannot yet achieve this. An object in the "real world" therefore has to be distinguished from a moving region. This is not only for reasons of semantics, but also because a moving region can represent more than one moving object. A simple example of this case would be a person carrying a suitcase.

Conversely, an object can be represented by more than one moving region. If, for example, a person walks behind a truck and reappears on the other side, some content analysis systems will assign a new ID although it is the same person.

Another conceivable scenario would be the delivery of a suitcase: at first, the object 'suitcase' is represented by the same moving region as the person carrying it. Then the suitcase stands on the floor, represented by its own region. When the suitcase is finally carried away by a second person, that person's moving region will also represent the suitcase.

Event. In general, an event describes what happens at a specific point in time. In this thesis, "event" represents only an elementary part of a sequence of actions and is comparable to the "primitive events" [Ghanem et al., 2004] of the literature.


Event Morpheme[6]. Event Morphemes do not seem to have a counterpart in the literature. Basically, they represent the "point of view" of an object in an event and thus link the objects to the corresponding events. Their relation is always one-to-one: one object, one Event Morpheme, one event.

Meta Knowledge[7]. The term Meta Knowledge, as used in this thesis, combines what is generally called "a priori knowledge" with general event morphology and context-specific user constraints. The a priori knowledge could be, e.g., a map of the scene, whereas the general event morphology is context-independent knowledge about how events are composed. User constraints describe, e.g., the location-dependent permitted duration of a stay.

Semantic Module. The Semantic Module is the output of the ontology-based inference mechanism. It is comparable to the "composite event" [Ghanem et al., 2004] or the "process" [Nevatia et al., 2004]. In most cases it uses the Meta Knowledge to reason about the events by combining specific inference rules. In fact, the Semantic Module is what the database will be queried for.

Scene. A scene, as the term is used in this thesis, consists of Semantic Modules that represent events in their context-specific instance. This can be understood as a "chain of events". A description of a scene is Level 2.

[6] German: Ereignismorphem
[7] German: Metawissen

A.4.3 Event Morphemes
The idea behind Event Morphemes is the decomposition of a scene into semantically meaningful parts (small, but not atomic), each from the perspective of a "main object". An Event Morpheme consists of the main object, its attributes, and the object(s) with which the main object interacts (if any). The attributes are the representations of the features extracted by the content analysis algorithm in use. An Event Morpheme represents a time window in which these attributes are analyzed and interpreted. This inference process produces a semantic label that represents such a time window.

The semantic concept "Event Morphemes" thus represents a semantic intermediate layer. At an early stage, it maps the low-level metadata to a semantic label.

A map of the scene. A map of the scene is indispensable for recognizing location-dependent events. If, for example, a car is parked in a designated parking space, this is a permitted event. A car that is parked in the middle of a fire lane and thus blocks it would be undesired. Another example of using the map of the scene would be to know that there is a river where object "X" is moving. From this the system can conclude that the moving object is floating.

Similar to a map of the scene is knowledge about the perspective of the camera and its parameters, the average object size, and everything that is generally described as "a priori knowledge". In this thesis, it is the user-independent knowledge about the location and the layout of the scene. The map of the scene and the semantic description of the locations are the prerequisite for the step from Level 1 to Level 2.
Event Morpheme Taxonomy
The semantic labels of a data pattern represented by Event Morphemes are arranged in hierarchical classes. "State" (Zustand) and "Transition" (Uebergang) are the two class types. States are represented by Event Morphemes with semantic labels such as "motion" (Bewegung) or "halt" (Halt), whereas transitions would be "put down" or "picked up". "Transition" can be subdivided into "Start" and "Stop" for those transitions that begin or end states, respectively. These properties relate to the principle of Semantic Interpolation (see Section A.4.3). An exemplary implementation of such an approach, as introduced in [Neuhaus, 2005], would be, in VERL notation:

SUBTYPE(Zustand,ev)
SUBTYPE(Uebergang,ev)

SUBTYPE(Start,Uebergang)
SUBTYPE(Stop,Uebergang)

SUBTYPE(Aufnehmen,Start)
SUBTYPE(Abstellen,Stop)

SUBTYPE(Bewegung,Zustand)
SUBTYPE(Halt,Zustand)


To benefit fully from the concept of the Event Morpheme taxonomy, the taxonomy has to be adapted to the VCA algorithm in use. The finer the Event Morpheme predicate is meant to be, the less certain the classification will be. The advantage of arranging the semantic labels of the Event Morphemes hierarchically is that it allows a "maybe statement". In this way it is possible to fall back on the more general label if the underlying inference mechanism yields a lower probability for the more specific one.

Another motivation for a taxonomic arrangement of Event Morphemes comes from the first experimental results. When looking at scenarios in which "picking up" (aufnehmen) is detected, it turns out that these include not only events in which something is actually picked up, but also cars pulling out of a parking bay or persons standing up after having been seated for a while. Similar patterns in the VCA output thus have different semantics. In each case, "something is removed from the scene". This results in the following initial approach to the Event Morpheme taxonomy:

SUBTYPE(Bewegung,Zustand)
SUBTYPE(Halt,Zustand)
SUBTYPE(einbringen,Uebergang)
SUBTYPE(entfernen,Uebergang)
SUBTYPE(abstellen,einbringen)
SUBTYPE(aufnehmen,entfernen)
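
The fallback to a more general label can be illustrated with a small Python sketch. It is only an illustration under assumptions: the parent table mirrors the SUBTYPE statements above, whereas the function names, the per-label confidence values, and the threshold of 0.7 are hypothetical and are not part of VERL or of the system described in this thesis.

```python
# Parent of each label, mirroring the SUBTYPE statements above.
PARENT = {
    "Bewegung": "Zustand",
    "Halt": "Zustand",
    "einbringen": "Uebergang",
    "entfernen": "Uebergang",
    "abstellen": "einbringen",
    "aufnehmen": "entfernen",
}


def ancestors(label):
    """Return the label followed by its ever more general super-types."""
    chain = [label]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def most_specific_label(label, confidence, threshold=0.7):
    """Walk from the specific label towards the root and return the first
    label whose (assumed) confidence reaches the threshold, else None."""
    for candidate in ancestors(label):
        if confidence.get(candidate, 0.0) >= threshold:
            return candidate
    return None


# The detector is unsure about "aufnehmen" but confident that something
# was removed from the scene, so the more general label is used.
print(most_specific_label("aufnehmen", {"aufnehmen": 0.4, "entfernen": 0.9}))
# entfernen
```

A car leaving a parking bay would thus still be indexed as "entfernen" even if it cannot be labelled "aufnehmen" with sufficient confidence.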


Semantic Interpolation
The principle of Semantic Interpolation describes the process of recognizing the presence of an object that is not explicitly detected by the content analysis at all times. An example is a suitcase that is carried in: the VCA algorithm does detect the put-down, but not the suitcase while it is being carried. It therefore has to be inserted by the inference module.
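
The suitcase example can be sketched in Python as follows. This is a hypothetical illustration, not the inference module of this thesis: the names are invented, and the sketch assumes that the object has been carried ever since the track of its carrier started.

```python
from dataclasses import dataclass


@dataclass
class Presence:
    object_id: int
    start: float      # seconds
    end: float        # seconds
    inferred: bool    # True if inserted by reasoning, not seen by the VCA


def interpolate_carried(carrier_track_start, put_down_time,
                        object_id, object_last_seen):
    """Return the presence intervals of an object that the VCA first
    detects at the moment it is put down."""
    return [
        # Assumption: carried, hence present, since the carrier appeared.
        Presence(object_id, carrier_track_start, put_down_time, inferred=True),
        # Explicitly detected by the VCA from the put-down onwards.
        Presence(object_id, put_down_time, object_last_seen, inferred=False),
    ]


for interval in interpolate_carried(3.0, 12.0, object_id=7,
                                    object_last_seen=30.0):
    print(interval)
```

Keeping the `inferred` flag separate lets a later retrieval step distinguish detected from interpolated presence.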

Logical Exclusion of False Detections (Reasoning Out)
The principle of taking the step from the low-level representation to a semantic label 'early' has a further advantage. It becomes possible to know which results are false detections for a particular class of Event Morpheme. If, for example, an object assumed to have been brought in existed before the put-down was detected, it is obvious that this assertion is false.
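
A minimal Python sketch of this rule is given below. It is an assumption-laden illustration rather than the rule set used in this thesis: the function name and the data layout (a list of bring-in detections and a table of first observation times per object ID) are hypothetical.

```python
def reason_out(bring_in_detections, first_seen):
    """Split bring-in detections into kept and rejected ones.

    bring_in_detections: list of (object_id, put_down_time) tuples
    first_seen: dict mapping object_id to the time it was first observed
    A detection is rejected if the object already existed before the
    put-down was detected.
    """
    kept, rejected = [], []
    for object_id, put_down_time in bring_in_detections:
        if first_seen.get(object_id, put_down_time) < put_down_time:
            rejected.append((object_id, put_down_time))
        else:
            kept.append((object_id, put_down_time))
    return kept, rejected


detections = [(1, 10.0), (2, 20.0)]
first_seen = {1: 10.0, 2: 5.0}   # object 2 was visible long before t = 20 s
print(reason_out(detections, first_seen))
# ([(1, 10.0)], [(2, 20.0)])
```

Rejecting detections in this way can only remove false hits; it cannot recover events that were missed.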

A.4.4 Where Is the Index?
The complete concept presented here is a concept for indexing and retrieval. As a last step, therefore, the index has to be separated from the retrieval part. In other words: where is the index? Since retrieval needs the information exactly as it is contained in the Event Morphemes, precisely that should be in the index. The line thus runs "through" the Event Morphemes (see Figure A.2). If an index is built on the generic principle of Event Morphemes, Smart Retrieval becomes possible without losing access to low-level features.

Figure A.2: The Event Morphemes are the index

A.5 Results

A.5.1 Introduction
The results of the exemplary implementation of an Event Morpheme based event detection system show that events of different complexity are recognized in various scenarios. It is also demonstrated how the results are improved by eliminating false hits through logical reasoning. The results presented here as a "proof of concept" were obtained in several steps. First, a training set consisting of 304 sequences with various scenarios was processed. Then the Event Morpheme detection was developed on the basis of the output of the VCA algorithm that processed these sequences. Finally, the detection was run on the test set, which contains the use cases.

A.5.2 Realization of the Use Cases
As mentioned at the beginning, a number of relevant use cases were identified together with security system experts. The following sub-events are to be detected: picking up, putting down, and falling down. The selected sub-events are meant to be suitable for composing the defined use cases. These were:

1. "Fight": a person falls, caused by a person passing by / a person falls down and does not get up again.
2. "Shopping": does the customer pay or not?
3. "Criminal behavior at an ATM": spying out another person's PIN.
4. Detect beggars/vendors on the street.


A.5.3 Ruling Out False Detections ("reasoning out")
This section shows the results of applying the reasoning-out strategy. The following improvements were achieved on the training set. The false detection rate decreased, and precision increased accordingly. For the detection of take-out (entfernen), false detections decreased by 20.2%, resulting in an increase of 17.7% in precision. For the detection of bring-in (einbringen), the results were even more significant: the false detection rate dropped by 35.8%, and precision improved by 29.0%. Ruling out false detections affects precision only; recall is determined by the events that are not detected. Unfortunately, the latter cannot be reduced by post-hoc analysis.

A.5.4 Results for the Test Sequences
These are the results for the detection of the identified use cases. The results were verified against a manually annotated ground truth. The detection rates and the resulting values for precision and recall are shown in Table A.1.

Scenario            Ground   Correctly   Falsely    Not        Recall   Precision
                    Truth    detected    detected   detected   (%)      (%)
pick up               29        27          1          2        93.1     96.4
put down               9         7          0          2        77.8    100
shopping: theft       12        11          1          1        91.7     91.7
shopping: paying       8         7          0          1        87.5    100
beggar[8]              2         2          0          0       100      100
fight                 11        10          0          1        90.9    100
ATM                   10         9          0          1        90      100

Table A.1: Results for the detection of the defined use cases

Comparison with Other Approaches
Table A.2 shows the presented system in comparison with other approaches from the literature:

[8] Objects labelled by hand

System                    Ground  Scenarios  Correctly  Falsely   Not       Recall  Precision  Method
                          Truth              detected   detected  detected  (%)     (%)
[Brand et al., 1997]        52       3          -          -         -       94.2     -        Coupled Hidden Markov Models
[Cupillard et al., 2004]    45       5         42          7         3       93.3    85.7      three-stage formalism (operators - knowledge - description)
[Guler et al., 2003]         -      10          -          -         -       92.3     -        HMM
[Hakeem and Shah, 2004]    122       3        115          6         7       94.3    95.0      rule-based system and state machine
Neuhaus [2006][9]           81       7         73          2         8       90.1    97.3      (see Chapter 4)
[Rao and Shah, 2001]        23       2         16          2         7       69.6    88.9      HMM
[Xiang et al., 2002]        54       5         36         19        18       66.7    65.5      EM for clustering, selection by Minimum Description Length (MDL)

Table A.2: Comparison of the systems

[9] This thesis, to be precise!


A.6 Evaluation and Outlook
A.6.1 Introduction
This thesis presents the application of the generic concept for the mapping of low-level analysis data to semantic scene descriptions, as introduced in [Neuhaus, 2005]. The constituent elements of this approach and their underlying notions are shown, together with an introduction to their application. A scenario or an event is not considered as a whole but is decomposed into small, semantically meaningful parts (small, but not atomic). By arranging these parts, the Event Morphemes, any scenario can be represented.

The main contribution of the approach is its generality and the early stage at which the step from low-level to high-level representations is taken. Reasoning in the metadata domain is carried out in small time windows. Reasoning about more complex scenes is carried out in the semantic domain.

ErgebnissederertungBewA.6.2DasgesamteDetektionsergebnisdesSystemskommtauf97,3%f¨urPrecision
und90,1%f¨urRecall.Daszeigt,dassdasKonzeptderEventMorphemeals
einIndexf¨urgroße¨Uberwachungsinhalt-Datenbankenpassendistf¨urdas
Modellierenderrelevanten¨Uberwachungsszenarien.
The results show that the performance of the presented concept is at least equal to the state of the art. The difference from the other approaches in the comparison is that the system here is a post-hoc metadata analysis system.

The system clearly depends on the VCA algorithm used. This means not only that improving event detection entails improving the VCA. It also means that assessing retrieval performance is a possible approach to evaluating the performance of VCA algorithms.

The drawback of other approaches is their lack of generality. This is solved by separating the information: general domain knowledge about "what an event looks like", specific information such as the layout of the scene, and specific user constraints are treated separately. They are merged when an actual query is issued. The concept of Event Morphemes is capable of modeling the events of the other approaches.
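The separation of the three kinds of information, and their merging at query time, can be sketched as follows. This is only an illustration under assumptions: the class names, the fields and the function `build_query` are hypothetical and do not reproduce the data model of the thesis.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class EventModel:
    """General domain knowledge: what an event looks like."""
    name: str
    morphemes: Tuple[str, ...]


@dataclass
class SceneLayout:
    """Scene-specific information: named regions of one camera view."""
    regions: Dict[str, Box]


@dataclass
class UserConstraints:
    """User-specific restrictions on where and when to search."""
    region: str
    first_frame: int
    last_frame: int


@dataclass
class Query:
    event: str
    morphemes: Tuple[str, ...]
    area: Box
    frames: range


def build_query(model: EventModel, layout: SceneLayout,
                constraints: UserConstraints) -> Query:
    """Merge the three separately kept sources only when a query is issued."""
    return Query(
        event=model.name,
        morphemes=model.morphemes,
        area=layout.regions[constraints.region],
        frames=range(constraints.first_frame, constraints.last_frame + 1),
    )


query = build_query(
    EventModel("take-out", ("enter", "split", "leave")),  # hypothetical labels
    SceneLayout({"entrance": (0, 0, 320, 240)}),
    UserConstraints("entrance", 100, 200),
)
print(query)
```

Keeping the event model free of scene and user details is what lets the same model be reused for another camera or another query in this sketch.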

A.6.3 Outlook

An important question is how accurate semantic scene descriptions can be on the basis of currently available VCA algorithms. Another matter to be evaluated is the number of event classes that can be detected in real time. This would provide a flexibly configurable system that alerts the surveillance operator when events of interest take place: a kind of semantic filter, so to speak.

A conceivable extension would be the inclusion of audio as a second modality. Multimodal analysis leads to better performance in multimedia analysis [Snoek and Worring, 2003]. For the field of surveillance this might be different, since surveillance scenarios are not composed "according to a script". Nevertheless, audio could add new events. This still has to be investigated.

On the semantic side, the development of a complete ontology is a matter for future research. The potential output of content analysis algorithms has to broaden in order to make the ontology more complete. The concept of Event Morphemes is of course already applicable now, but it has even more potential.

Appendix B

Results for training sequences

These are the results obtained during the development of the detector modules for bring-in and take-out. “fuzzy now” names the number of frames that have passed between the detection of “idle/removed” and the occurrence of the split (see section 4.3.3).

∞47es2560negativ7015848false851

274∞16425esositivpfalse854
11215133

2832394949

109744483

1827293031

199165469026681446230

2027293031

9372513933351927525

2932394949

9217722341-1912

5060708485

891854112338742628267


∞4413,8448,352317,4245,10179,3448,5721517,2442,8612256,4143,1434131,5445,05
252814,5831,821920,4337,2555,2615,630511,3615,6301990,4837,2502840,58
31,812estivisoptrue11,8117,6518,5220,412,866,455,416,4576,9220,4128,3017,65
151510202010015
811,821,1800012,133,23013,573,230000013,701,18
1000000000000000000000

fuzzy now; TOTAL, Precision, Recall; take-out, Precision, Recall; bring-in, Precision, Recall; reasoned-out bring-in, new Precision, Recall; reasoned-out take-out, new Precision, Recall; reasoned-out TOTAL, new Precision, Recall

Table B.1: Results of the detection of events in the training sequences

Appendix C

Theses

1. Events can be detected by not searching for the event as a whole, but by resolving the sought event into semantically meaningful parts and searching for the sequence of these parts.

2. False detection rates in content analysis systems can be lowered by logical reasoning on the semantic level.

3. A vocabulary for the synthesis of arbitrary events exists.

4. Such a vocabulary can be applied to the output of any content analysis system.


Appendix D

Propositions

1. It is possible to detect scenarios by resolving the sought-after scenario in semantically meaningful parts and search for the succession of these parts.

2. It is possible to reduce false positive rates in content analysis algorithms by reasoning on the semantic level.

3. A vocabulary for the synthesis of arbitrary scenarios does exist.

4. Such a vocabulary is adaptable to the output of any content analysis system.


Appendix E

Used symbols and abbreviations

Term     Meaning
SIR      Smart Indexing & Retrieval
VERL     Video Event Representation Language
GMM      Gaussian Mixture Model
HMM      Hidden Markov Model
DBN      Dynamic Bayesian Network
VCA      Video Content Analysis
www      World Wide Web
W3C      World Wide Web Consortium
MPEG     Moving Picture Experts Group
MPEG-7   Multimedia Content Description Interface
XML      Extensible Markup Language
HTML     Hypertext Markup Language
RDF      Resource Description Framework
OIL      Ontology Inference Layer
DARPA    Defense Advanced Research Projects Agency
DAML     DARPA Agent Markup Language
OWL      Web Ontology Language


Declaration

I declare that I have prepared the present thesis without inadmissible help from third parties and without using any aids other than those stated. Data and concepts taken directly or indirectly from other sources are marked with a reference to the source.

No other persons were involved in the preparation of the content of the present thesis. In particular, I have not made use of paid help from placement or consulting services (doctoral consultants or other persons). Nobody has received from me, directly or indirectly, any monetary benefits for work connected with the content of the submitted dissertation.

The thesis has not previously been submitted in the same or a similar form to any examination authority, either in Germany or abroad. I have been informed that any inaccuracy in the above declaration will be regarded as an attempt at deception and will result in the unsuccessful termination of the doctoral procedure.

5 November 2008

Holger Neuhaus