La lecture à portée de main
Description
Sujets
Informations
Publié par | karlsruher_institut_fur_technologie |
Publié le | 01 janvier 2011 |
Nombre de lectures | 59 |
Langue | English |
Poids de l'ouvrage | 4 Mo |
Extrait
agT
forithmsAlgorficientEf
AnalysisImageLarge-Scale
zurErlangungdesakademischenGradeseines
wissenschaftenIngenieurderDoktors
desderKarlsruherFakultätInstitutsfürInforfürTmatikechnologie
genehmigte
Dissertation
von
assenbergWJan
Koblenzaus
dermündlichenPrüfung:24.Oktober2011
ErsterGutachter:Prof.Dr.PeterSanders
ZweiterGutachter:Prof.Dr.-Ing.JürgenBeyerer
Abstract
andTheavpastailabilitydecadeofhasimagingseenmajorsensorimprosystems.vementsCommerinthecialcapabilitiessatellites
rAirboroutinelyneprlineovidescannerpanchrcamerasomaticyieldimageswithmulti-spectralsub-meterdatarwithesolution.a
grdataoundbringssamplewithitdistancetheof5challengecm.ofThertimelyesultingoanalysis.verabundanceFullyauto-of
mightmatedprinvolvocessingeastillcomputerappears-assistedinfeasible,searchbutforaninterinterestingmediateobjects.step
butThisrwemainsouldraeducechallengetheinamounttermsofofdataprforocessinganspeedanalystandtowexamine,orking
.ymemorhardwThisarwearorkbeginschitecturbyesthatdiscussingmightthebebrtrade-ofoughtfstoamongbeartheuponvariousthe
prentailoblem.longerFPGAdevandelopmentGPU-basedcycles,hencesolutionsthearechoicelessofuniversalcommodityand
deemedmulti-coretooCPUcostlyar.Wechitecturwilles.demonstrateDistributedtheprfeasibilityocessingonofaprclusterocessingis
2aerialhoursonimagesaofsingle100wkm×orkstation100kmwithareastwoatpr1mrocessorsesolutionandawithintotal
oftwelvecores.Becauseexistingapproachescannotcopewith
–suchfromdataamountsoaccessfdata,andeachsignalstageprofocessingthetoimageobjectprocessingextractionpipelineand
forfeaturemaximumcomputationperfor–mance.willhaWveetointrbeoducedesignednewfreffiomcientthegroundalgorithmsup
thatLetprousvidebeginusefulwithresultstheatmostfastertime-cspeedsriticalthantaskpre–theviouslyextractionpossible.
ofThisobjectstepiscandidatesnecessaryfrombecauseanimage,individualalsoknopixelswnasdonotprsegmentation.ovide
ableenoughmodelinforforthemationobjectsfortheinvscrolveseeninggroupingtask.Asimilarsimplepixelsbutrtogethereason-.
netwHigh-qualityorkflowandclusteringanisotropicalgorithmsdiffusionbasedareonfarmeantooshift,time-consuming.maximum
ix
prWeopertyintrofoduceavaoidingnewbothgraph-basedunder-andovalgorithmersegmentation.withtheItsimportantdistin-
guishingfeatureistheindependentparallelprocessingofimage
tileswithoutsplittingobjectsattheboundaries.Ourefficient
implementationtakesadvantageofSIMDinstructionsandout-
perforsimilarmsqualitymean.shiftbRecognizingyafactortheof50outstandingwhileprperforoducingmanceresultsofitsof
wemicrdeoarvelopchitecturitintoe-aawargenerale32-bitvirtual-memorintegerysortercounting,yieldingsortsubrthefastestoutine,
knownalgorithmforshared-memorymachines.
fultoBecausesuppresssegmentationsensornoise.groupsThetogetherBilateralsimilarFilterispixels,anitisadaptivhelp-e
smoothingkernelthatpreservesedgesbyexcludingpixelsthatare
distantinthespatialorradiometricsense.Severalfastapproxi-
highermation-dimensionalalgorithmsarespace.knoWwn,ee.g.accelerateconvthisolutionintechniqueadobyawnsampledfactor
ofimation14viaofthe3Dparallelization,Gaussvkernel.ectorizationTheandsoftwaareis73SIMD-friendlytimesasapprfastox-as
anexactcomputationonanFPGAandoutperformsaGPU-based
approximationbyafactorof1.8.
Physicallimitationsofsatellitesensorsconstituteanadditional
hurdle.Thenarrowmultispectralbandsrequirelargerdetectors
andFusingusuallybothhavdatasetsealoiswerterrmedesolutionpan-sharthanthepeningpanchrandimpromaticovesband.the
techniquessegmentationareduevulnerabletothetocoloradditionaldistortioncolorinforbecausemation.ofPremismatchesvious
efbetwfect,eenwethecomputebandsthespectraloptimalrsetesponseofbandwfunctions.eightsToforreacheduceinputthis
image.Ournewalgorithmoutperformsexistingapproachesbya
factorof100,improvesupontheircolorfidelityandalsoreduces
noiseinthepanchromaticband.
Becausethesemodulesachievethroughputsontheorderof
seTheveralubiquitoushundredMB/s,GDALthelibrarnextyisfarbottleneckslowertobethanaddrtheessedtheoriseticalI/O.
x
diskthroughput.Wedesignanimagerepresentationthatavoids
efficientunnecessaryasynchrcopying,onousandI/O.Thedescriberesultinglittle-knosoftwwnareistechniquesuptofor12
timesasfastasGDAL.Furtherimprovementsarepossibleby
compressingthedataifdecompressionthroughputisonparwith
theasymmetrictransferSIMDspeedscodecofathatdiskachiearravy.esaWecomprdevelopessionanoratiovelof0.5losslessfor
on16-bitasinglepixelscorande.rThiseachesisaboutdecompr100essiontimesthrasfastoughputsasof2lossless700MB/sJPEG-
2000Letandusnoonlywr20–60%eturntolargertheonextractedmultispectralobjects.satelliteAdditionaldatasets.steps
fordetectingandsimplifyingtheircontourswouldprovideuse-
fulinformation,e.g.forclassifyingthemasman-made.Toallow
softwannotatingarelarrasterizerge.imagesHigh-qualitywiththerantialiasingesultingpolyisachiegons,vwedebydevisederiv-a
ingoutperforthemsoptimalthepolynomialGupta-Sprlooullw-passalgorithmfilter.byOurafactorofimplementation24and
exceedsTheprtheeviouslyfillrateofadescribedmid-rangeprGPU.ocessingchainiseffective,but
electro-opticalsensorscannotpenetratecloudcover.Becausemuch
oftheearthssurfaceisshroudedincloudsatanygiventime,we
haveaddedaworkflowfor(nearly)weather-independentsynthetic
fromapertureuniforradarmly.Small,brightrhighly-regionsbeflectivyesubtractingobjectscaneachbedifferpixelsentiatedback-
grtheound,asymptoticestimatedfrcomplexityomtheofdarkestthisapprringoachsurrtoitsoundinglowit.erWeboundreduceby
AmeansofsophisticatedanewpipeliningalgorithminspirschemeedbensuryesRangethewMinimumorkingsetfitsQueries.in
cache,andthevectorizedandparallelizedsoftwareoutperforms
anFPGAimplementationbyafactorof100.
andTheseGPUresultssolutionschallengeenabletheconvsignificantentionalspeedupswisdomoverthatgeneral-FPGA
thepurloposewerCPUs.boundofBecausetheirallofcomplexitytheabo,vetheiralgorithmsusefulnesshaviserdecidedeached
xi
byconstantfactors.Itisthethesisofthisworkthatoptimized
softwarerunningongeneral-purposeCPUscancomparefavorably
inthisregard.Thekeyenablingfactorsarevectorization,paral-
lelization,andconsiderationofbasicmicroarchitecturalrealities
suchasthememoryhierarchy.Wehaveshownthesetechniquesto
beapplicabletowardsavarietyofimageprocessingtasks.How-
ever,itisnotsufficienttotunesoftwareinthefinalphasesofits
development.Instead,eachpartofthealgorithmengineeringcycle
–design,analysis,implementationandexperimentation–should
accountforthecomputerarchitecture.Forexample,noamount
ofsubsequenttuningwouldredeemanapproachtosegmentation
thatreliesonaglobalrankingofpixels,whichisfundamentally
lessamenabletoparallelizationthanagraph-basedmethod.The
algorithmsintroducedinthisworkspeedupsevenseparatetasks
byfactorsof10to100,thusdispellingthenotionthatsuchefforts
arenotworthwhile.Wearesurprisedtohaveimprovedupon
long-studiedtopicssuchaslosslessimagecompressionandline
rasterization.However,thetechniquesdescribedhereinmayallow
domains.otherinsuccessessimilar
xii
Acknowledgements
Iguidancesincerely–thankidentifyingmypradvisor,omisingProf.avPeterenuestoSanders,explorfore,proteachingviding
algorithmengineering,andsharingtheloreofcleveroptimizations.
Thankyou,Prof.Dr.-Ing.JürgenBeyerer,forreviewingthisthesis.
Lookingbackearlier,Ithankmyparentsfortheirloveandsup-
rport,esultingandforinteralloestinwingmecomputingaccesswtoasakindledTRS-80earlymicronatocomputerRandolph.The
Sconcerchool,ningaespeciallymodelbryDrocket.RobertsimulatorKir.chnerThanksstophysicsmysoccerassignmentcoach,
H.KillebrewBailey,forinstillingthespiritpracticehard,play
hard;noregrets!
Igratefullyacknowledgetheproductiveworkingenvironment
attheFGAN-FOMresearchinstitute,nowapartofFraunhofer
tianIOSB.WuttkeThanksfortointermyofestingficematesdiscussionsDominikoverPerlunchpeetandandSfruitfulebas-
administrativcollaboration;eRomymatters;PfeifmyfersuperandvisorAnjaDr.WBlancaniolfgangforhelpingMiddelmannwith
anddepartmentheadDr.KarstenSchulzforprovidingguidance
andthelatitudetoworkoninterestingproblems.
Thisthesisbuildsuponmachine-orientedgroundworklaidfor
thepleasur0eA.D.towstrategyorkwithgamethisprteamojectofstartingenthusiastic,in2002.Itself-motivhasbeenateda
volunteers,especiallyPhilipTaylor.
usefulIamtooltogratefulrtoead/writetheauthorsnearlyofanyGDALimageforfiledevforelopingmat.aThankstruly
toCharlesBloom,Prof.TanjaSchultzandDominikPerpeetfor
valuablefeedbackconcerningpartsofthiswork.
andImostapprofeciateall,themybelopatiencevedandSufen.Herunderstandingloveandofsupportfriends,meanfamilyso,
me.tomuch
xiii
Thisworkisdedicatedtothescientists/engineers/craftsmen
whobridgethegapbetweentheoryandpracticeofcomputing,
devisingsolutionsforpreviouslyinsurmountableproblemsand
teasingoutmaximumperformanceduetoadetailedunderstand-
ingoftheunderlyinghardware.Keeptheflameburning!
xiv
Contents
Contents
AppetizersI
Introduction11.1Fundamentals......................
1.2TheNeedforSpeed...................
1.3ImageProcessingChain................
ArchitectureComputer22.1BriefArchitectureDescriptions............
2.2DatasheetComparison.................
2.3OurChoice........................
2.4ConsequencesfortheAlgorithms...........
MemoryHierarchy...................
SIMD...........................
Parallelization......................
2.5Discussion........................
CourseMainII
Input/Output33.1ImageRepresentation.................
3.2EfficientI/O........