Robustness evaluation of operating systems [Elektronische Ressource] / vorgelegt von Andréas Johansson
191 pages

Description

Robustness Evaluation of Operating Systems. Dissertation approved by the Department of Computer Science (Fachbereich Informatik) of Technische Universität Darmstadt for the academic degree of Doktor-Ingenieur (Dr.-Ing.), submitted by Andréas Johansson, from Falkenberg, Sweden. Referees: Prof. Neeraj Suri, Ph.D., and Prof. Christof Fetzer, Ph.D. Date of submission: 19.11.2007. Date of oral examination: 14.01.2008. Darmstadt 2008, D17.

Summary: The premise behind this thesis is the observation that Operating Systems (OS), being the foundation behind operations of computing systems, are complex entities and also subject to failures. Consequently, when they do fail, the impact is the loss of system service and the applications running thereon. While a multitude of sources for OS failures exist, device drivers are often identified as a prominent cause behind failures.

In order to characterize the impact of driver failures, at both the OS and application levels, this thesis develops a framework for error propagation-based robustness profiling for an OS. The framework is first developed conceptually and then experimentally validated on a real OS, namely Windows CE .Net. The choice of Windows CE is driven by its representativeness for a multitude of OS's, as well as the ability to customize the OS components for particular needs.

Information

Published on 1 January 2008

Excerpt

Robustness Evaluation of Operating Systems

Dissertation approved by the Department of Computer Science (Fachbereich Informatik) of Technische Universität Darmstadt for the academic degree of Doktor-Ingenieur (Dr.-Ing.)

Submitted by Andréas Johansson, from Falkenberg, Sweden

Referees: Prof. Neeraj Suri, Ph.D.; Prof. Christof Fetzer, Ph.D.

Date of submission: 19.11.2007
Date of oral examination: 14.01.2008

Darmstadt 2008
D17

Summary

The premise behind this thesis is the observation that Operating Systems (OS), being the foundation behind operations of computing systems, are complex entities and also subject to failures. Consequently, when they do fail, the impact is the loss of system service and the applications running thereon. While a multitude of sources for OS failures exist, device drivers are often identified as a prominent cause behind failures.

In order to characterize the impact of driver failures, at both the OS and application levels, this thesis develops a framework for error propagation-based robustness profiling for an OS. The framework is first developed conceptually and then experimentally validated on a real OS, namely Windows CE .Net. The choice of Windows CE is driven by its representativeness for a multitude of OS's, as well as the ability to customize the OS components for particular needs.

For experimental validation, fault injection is a prominent technique that can be used to simulate faults (or errors) in the system by inserting synthetic ones and studying their effect. Three key questions with such a technique are where, what and when to inject faults. This thesis shows how injecting errors at the interface between drivers and the OS can be very effective in evaluating the effects driver faults can have.

To quantify the OS's robustness, this thesis defines a series of error propagation measures, specifically tailored for device drivers. These measures allow for quantifying and comparing both individual services and device drivers on their susceptibility and diffusing abilities.

This thesis compares three contemporary error models on their suitability for robustness evaluation. The classical bit-flip model is found to identify a higher number of severe failures in the system. It also identifies failures for more services than both other models, data type and fuzzing. However, its main drawback is that it requires substantially more injections than the other two. Fuzzing, even though not giving rise to as many failures, is able to find new additional services with severe failures.

A careful study of the injections performed with the bit-flip model shows that only a few bits are generally useful for identifying new services with robustness weaknesses. Consequently, a new composite model is proposed, combining the most effective bits of the bit-flip model with the fuzzing model's ability to identify new services, giving rise to a new model without loss of important information and at the same time incurring a moderate number of injections.
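
As a rough illustration of why the error models differ in injection count, the following Python sketch generates corrupted values for a single parameter under a bit-flip model, a fuzzing model and a composite of the two. It is a sketch under stated assumptions, not the thesis's implementation: the 32-bit parameter width, the selected bit positions, the number of fuzzing values and all function names are hypothetical, and the data type model is left out.

```python
import random

WORD_BITS = 32  # assumed parameter width, chosen for illustration only


def bit_flip_cases(value, bits=WORD_BITS):
    """Yield one corrupted value per bit position (single-bit flips)."""
    for bit in range(bits):
        yield value ^ (1 << bit)


def fuzzing_cases(count, bits=WORD_BITS, seed=0):
    """Yield `count` pseudo-random replacement values of the given width."""
    rng = random.Random(seed)
    for _ in range(count):
        yield rng.getrandbits(bits)


def composite_cases(value, selected_bits, fuzz_count, bits=WORD_BITS, seed=0):
    """Flip only a chosen subset of bits, then add a few fuzzing values."""
    for bit in selected_bits:
        yield value ^ (1 << bit)
    yield from fuzzing_cases(fuzz_count, bits, seed)


if __name__ == "__main__":
    original = 0x00000010
    # Bit-flip needs one injection per bit of the parameter.
    print(len(list(bit_flip_cases(original))))
    # The composite model needs len(selected_bits) + fuzz_count injections.
    print(len(list(composite_cases(original, [0, 31], 5))))
```

The point of the sketch is only the cost structure: exhaustive bit flips scale with the parameter width, while the composite variant scales with the number of bits kept plus the number of random values.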

To answer the question of when to inject an error this thesis proposes a novel model of a driver's usage profile, focusing on high-level operations being carried out. It guides the injection of errors to instances when the driver is carrying out specific operations. Results from extensive fault injection experiments show that more service vulnerabilities can be discovered. Furthermore, a priori profiling of the drivers can show how effective the proposed approach will be.

Kurzfassung (German abstract)

The background of this dissertation is the observation that the operating system, which forms the basis for the operation of computer systems, has a very complex structure, which can frequently lead to faults in the operating system. When these internal faults result in service failures, the applications running on the operating system are also at risk. Although there are in general many sources of faults, faulty drivers are often cited as the most frequent cause.

To characterize the effects of driver defects at the operating system and application levels, this dissertation develops a framework for robustness evaluation based on the propagation of errors. The framework is developed conceptually and also validated experimentally on a real operating system. The chosen operating system, Windows CE .Net, is representative of many other operating systems. It has a modular design, which considerably simplifies adapting the operating system components to different needs.

Fault injection is an important technique for experimental validation, in which faults are simulated by injecting them into the system and observing their consequences. Three important aspects have to be considered: which faults should be injected, where, and when. This dissertation shows that fault injection at the interface between the operating system and the drivers is an effective way of assessing the consequences of driver faults.

To quantify the robustness of an operating system, a set of error propagation metrics is defined that is tailored specifically to driver faults. With these metrics, services and drivers can be compared with respect to their susceptibility and their ability to spread errors.

This dissertation compares three contemporary error models with respect to their suitability for robustness evaluation. The classical bit-flip model identifies severe failures in the system most often. Compared with the other two models, data type and fuzzing, it also identifies the most services that could lead to failures. Its greatest drawback, however, is that it requires a very large number of injections. Fuzzing identifies fewer services, but finds new faulty services that bit-flip does not detect.

A careful study of the results of the bit-flip model shows that a subset of the bits is already sufficient to identify new services that lead to robustness failures. A new, composite model is therefore proposed that combines the good properties of the bit-flip model with the fuzzing model's ability to identify new services. The new model loses no important information and requires considerably fewer injections overall.

To answer the question of when it makes sense to inject errors, a new timing model based on the usage profile of the driver is proposed. The new model is based on the execution of commands at a higher layer. Specific fault injections are performed at the time specific commands are executed. The results of the fault injections show that many times more failure-prone services can be found. In addition, the usage profile of the driver gives advance information about the effectiveness of the new method.

Acknowledgements

The path to a Ph.D. is a long, winding and narrow one. At the beginning it is wide and spacious, only to become narrower and narrower. Sometimes it goes steeply upwards, sometimes downwards. Sometimes you think you see an opening and light after the next turn, only to find a dead end. There are crossings, where one has to choose which path to pursue. Some paths look more promising than others but you quickly learn that the easy path is not always the shortest.

I am now at the end of the path, only to realize that it is the beginning of a new. I would not have gotten here without the assistance and encouragement of several people. First of all I would like to thank my guide and mentor, Prof. Neeraj Suri. Thanks for your guidance and support. I would also like to thank all present and former members of the DEEDS group: Martin, Vilgot, Örjan, Arshad, Robert, Adina, Dinu, Marco, Dan, Ripon, Brahim, Faisal, Majid and Matthias. Many thanks also to Birgit, Ute, Sabine and Boyan. A great many thanks also to Prof. Christof Fetzer for accepting to be my opponent.

I am also grateful for funding for conducting my research coming from the projects EC IP DECOS, EC NoE ReSIST and by research grants from Microsoft Research. A part of the research validation was accomplished over an internship at Microsoft Research, Cambridge. A special thanks to Brendan Murphy at MSR for hosting my internship and for all discussions and help writing papers.

Finally to my beautiful and supporting wife, Mia. Thank you for everything!

Contents

1 Introduction
1.1 Dependability: The Basic Concepts
1.1.1 Dependability Attributes
1.1.2 Dependability Threats
1.1.3 Dependability Means
1.1.4 Alternate Terminology: Software Engineering
1.1.5 Bohrbugs and Heisenbugs
1.2 Robustness Evaluation
1.3 Thesis Research Questions & Contributions
1.4 Thesis Structure

2 Background and Context
2.1 A Short Operating System History
2.1.1 OS Design
2.1.2 Device Drivers
2.1.3 What is the Problem?
2.2 Sources of Failures of Operating Systems
2.2.1 Hardware Related
2.2.2 Software Related
2.2.3 User Related
2.2.4 Device Drivers
2.3 Fault Injection
2.4 Operating Systems Dependability Evaluation
2.5 Other Techniques for Verification and Validation
2.5.1 Formal Methods
2.5.2 Testing
2.6 Summary

3 System and Error Model
3.1 System Model
3.2 Error Model
3.2.1 Error Type
3.2.2 Error Location
3.2.3 Error Trigger
3.2.4 Other Contemporary Software Error Models
3.3 Experimental Environment
3.3.1 Windows CE .Net
3.3.2 Device Drivers in Windows CE
3.3.3 Hardware
3.3.4 Software
3.3.5 Selected Drivers for Case Study
3.4 Summary

4 Fault Injection Framework
4.1 Introduction
4.2 Evaluation, Campaign & Run
4.3 Hardware Setup
4.4 Software Setup
4.5 Injection Setup
4.5.1 Experiment Manager
4.5.2 Host Computer
4.5.3 Interceptors
4.5.4 Test Applications
4.6 Pre-Profiling
4.7 Summary of Research Contributions

5 Error Propagation in Operating Systems
5.1 Introduction
5.2 Failure Mode Analysis
5.3 Error Propagation
5.3.1 Failure Class Distribution
5.3.2 Error Propagation Measures
5.3.3 Use of Measures
5.4 Discussion
5.5 Related Work
5.6 Summary of Research Contributions

6 Error Model Evaluation
6.1 Introduction
6.2 Considered Error Models
6.2.1 Data Type Error Model
6.2.2 Bit-Flip Error Model
6.2.3 Fuzzing Error Model
6.3 Error Propagation
6.3.1 Failure Class Distribution
6.3.2 Estimating Service Error Permeability
6.3.3 Service Error Exposure
6.3.4 Service Error Diffusion
6.3.5 Driver Error Diffusion
6.4 Comparing Error Models
6.4.1 Number of Failures
6.4.2 Injection Efficiency
6.4.3 Execution Time
6.4.4 Coverage: Identifying Services
6.4.5 Implementation Complexity
6.5 Composite Error Model
6.5.1 Distinguishing Control vs Data
6.5.2 The Number of Injections for Fuzzing
6.5.3 Composite Model & Effectiveness
6.6 Discussion
6.7 Related Work
6.8 Summary of Research Contributions

7 Error Timing Models
7.1 Introduction
7.2 Timing Models
7.2.1 Event-Trigger
7.2.2 Time-Trigger
7.3 Driver Usage Profile
7.3.1 Call String
7.3.2 Call Blocks
7.3.3 Operational Phases
7.4 Experimental Setup
7.4.1 Targeted Drivers
7.4.2 Error Model
7.4.3 Injection
7.4.4 Call Strings and Call Blocks
7.5 Result of Evaluation
7.5.1 Serial Port Driver
7.5.2 Ethernet driver
7.6 Discussion
7.6.1 Difference in Driver Types
7.6.2 Comparing with First Occurrence
7.6.3 Identifying Call Blocks
7.6.4 Workload
7.6.5 Error Duration
7.6.6 Timing Errors
7.7 Related Work
7.8 Summary of Research Contributions

8 Conclusion and Future Research
8.1 Contributions
8.1.1 Category 1: Conceptual
8.1.2 Category 2: Experimental Validation
8.1.3 Injection Framework
8.2 Applications of Robustness Evaluation
8.2.1 Robustness Profiling
8.2.2 Robustness Evaluation in Testing
8.2.3 Robustness Enhancing Wrappers
8.3 Outlook on the Future
8.3.1 Fault Injection Technology
8.3.2 Error Propagation
8.3.3 Error Models
8.3.4 Error Timing
8.4 Practical Lessons Learned

Bibliography

List of Figures

1.1 The dependability and security tree
1.2 The attributes of dependability and security
1.3 The fault → error → failure process
1.4 The What, Where and When dimensions of fault injection
2.1 Microkernel design
3.1 The system model
3.2 The driver model
3.3 Error manifestation example
3.4 Overview of the Windows CE .Net architecture
4.1 The hardware setup
4.2 Overview of the experimental setup
4.3 Building an OS image
4.4 Injection process
4.5 The data type tracking mechanism
5.1 Error propagation measures
6.1 Injection efficiency
6.2 Class 3 failures for the BF model
6.3 Cumulative number of service with Class 3 failures
6.4 Diffusion stability for the FZ model
6.5 Failure class distribution compared with CO
6.6 Number of injections for the composite model
7.1 Generic call string example
7.2 Example of driver calling services
7.3 Driver operational phases
7.4 Workload operational phases
7.5 Call profile for cerfio_serial
7.6 Call profile of 91C111
7.7 Timing experiment setup
7.8 Serial driver call string
7.9 Serial driver call blocks
7.10 Ethernet driver call string
7.11 Serial driver error timing failure class distribution
7.12 Ethernet driver call blocks
7.13 Serial driver call profile
7.14 Ethernet driver error timing failure class distribution
7.15 Ethernet driver call profile

List of Tables

3.1 Overview of used data types
3.2 Data type error cases for type int
3.3 Data type error cases for strings
3.4 Stream interface for serial drivers
3.5 Summary of symbols introduced
4.1 Driver services used
5.1 The CRASH severity scale
5.2 Failure classes
5.3 Summary of the error propagation measures introduced
6.1 Targeted drivers
6.2 Results of fault injection for BF model
6.3 Service Error Permeability results - Serial driver
6.4 Service Error Permeability results - Ethernet driver
6.5 Service Error Permeability results - CompactFlash driver
6.6 Service Error Exposure results - Serial port driver
6.7 Service Error Exposure results - Ethernet driver
6.8 Service Error Exposure results - CompactFlash card driver
6.9 Service Error Diffusion results - BF - cerfio_serial
6.10 Service Error Diffusion results - BF - 91C111
6.11 Service Error Diffusion results - BF - atadisk
6.12 Driver Error Diffusion results
6.13 Results of fault injection
6.14 Experiment execution times
6.15 Driver Diffusion for Class 3 failures
6.16 Service Error Diffusion results - DT - cerfio_serial
6.17 Service Error Diffusion results - DT - 91C111
6.18 Services identified by Class 3 failures
6.19 Comparing import and export services
6.20 Diffusion results for the three drivers
6.21 Required manual reboots
7.1 Stream interface for serial port driver
7.2 Serial driver call blocks
7.3 NDIS functions supported by Ethernet driver
7.4 Ethernet driver call blocks
7.5 Comparison of first-occurrence and call block injection
7.6 Error timing injection results
7.7 Class 3 services for cerfio_serial

Chapter 1

Introduction

What is robustness, and why is it important?

As the usage of computers proliferates, a consequence is our increasing reliance on their operation in diverse application environments. The use of computers, and especially computer software, promises many advantages compared to electronic or purely mechanical solutions, including rapid development, flexibility, effective component reuse (both software and hardware), no aging effects, etc.

However, software brings about new problems. Fulfilling not only functional requirements, but also requirements on determinism, real-time and dependability properties becomes increasingly difficult. Software engineering tries to handle these problems by structuring the development process. However, that engineering software is difficult has long been known. Leveson notes that as software is divided into components (a well established technique to handle complexity) a new complexity is introduced in the many explicit and implicit interfaces that arise [Leveson, 1995]. Furthermore, the lack of physical constraints makes software inherently more flexible (which is positive) but also gives rise to new, unexpected and unintended interactions (which may be hard to find, quantify and master). In contrast to physical systems, small perturbations in software may give rise to serious failures without much delay. These problems require new methods for building and verifying systems based on software.

A key design model used to handle some of the complexities is to use standard platform components to build applications upon, the OS being the most significant such software platform. The OS forms the basic interface to which applications and services can be built. Consequently a reliance on continued provisioning of correct service is put on the OS, and when this is not the case the system might not function properly.

This thesis addresses the problem of evaluating the robustness of an OS, i.e., to which degree an OS tolerates perturbations in its environment. Such evaluations can serve several purposes, such as gaining information on how the system can fail when operational, guiding verification/validation efforts towards services which are more likely to spread or be the source of errors, and guiding the addition of robustness enhancing components where they are most effective.

This chapter first presents the basic terminology used in the thesis and then introduces the area of robustness evaluation. The research problems addressed are presented and discussed together with the contributions provided.

1.1 Dependability: The Basic Concepts

Dependability is the ability of a system to avoid service failures that are more frequent and more severe than is acceptable [Avižienis et al., 2004]. This definition implies that the system is well specified, together with the services it provides, such that failures of the system can be clearly defined and detected. It also requires acceptable service failure frequencies to be established and that the severities of failures are known and can be estimated.

[Avižienis et al., 2004] is a collective effort by the dependable computing community to agree on a set of standard terms. Further definitions can for instance be found in the IEEE Standard Glossary of Software Engineering [IEE, 1990] or in Dependability: Basic Concepts and Terminology, which presents the basic terminology in five different languages [Laprie, 1992]. This section provides a brief introduction to the terms most commonly used in the field and relevant to the work in this thesis.

Dependability can be seen as an umbrella, incorporating several attributes, including not only attributes directly related to functionality, but also attributes related to security. In this thesis, no emphasis is put on security related attributes. They are included and discussed briefly in this chapter for completeness. Figure 1.1 shows an overview of the attributes of dependability, the threats to dependability and the means to achieve dependability.

[Figure: tree with root "Dependability and Security" and three branches. Attributes: Availability, Reliability, Safety, Confidentiality, Integrity, Maintainability. Threats: Faults, Errors, Failures. Means: Fault Prevention, Fault Tolerance, Fault Removal, Fault Forecasting.]

Figure 1.1: The dependability and security tree, from [Avižienis et al., 2004].

1.1.1 Dependability Attributes

Dependability is a composite term, encompassing several aspects relating to the provision of functionality, security and maintainability:

• Availability - The ability of the system to be ready for use when required

• Reliability - The ability of the system to continuously provide stipulated services for a specified period of time

• Safety - The absence of catastrophic consequences on users and the environment

• Integrity - The absence of improper system alterations

• Confidentiality - The absence of disclosure of confidential information to unauthorized entities

• Maintainability - The ability of the system to undergo repairs and modifications

Dependability and security are obviously related. Availability, for instance, is a concern both from a dependability perspective (lack of service) and from a security perspective (denial of service). Figure 1.2 shows how dependability and security attributes are related.

[Figure: the attributes Availability, Reliability, Safety, Confidentiality, Integrity and Maintainability, related to Dependability and Security.]

Figure 1.2: The attributes of dependability and security, from [Avižienis et al., 2004].

1.1.2 Dependability Threats

To facilitate a discussion regarding the causes, effects, detection and recovery from faults in the system a distinction is made between faults, errors and failures. Faults are the causes of failures in the system by being activated (becoming errors) and then propagating to the outputs of the system and there causing a failure. Figure 1.3 illustrates how a fault propagating to a failure of one component (Component A) can be the input (fault) of another component (Component B) and so on.

[Figure: Components A, B and C in sequence; within a component a fault leads to an error and then to a failure, and the failure of one component acts as a fault for the next.]

Figure 1.3: The fault → error → failure process.

Faults

Faults are the sources for errors and failures of a system or component, including faults appearing during development, physical faults in hardware and interaction faults occurring in interactions with external components. A fault by itself is not sufficient to cause a failure of the system; it must also be activated. Certain triggering conditions are required for the fault to be activated. For instance, the part of the hardware containing the fault must be used, or for software the code containing the fault must be executed. When the fault is activated it becomes an error. As the triggering mechanism for the fault activation process typically is time-dependent, faults in the system can be present without immediately being activated and are then called dormant faults. A good example thereof is a software fault (bug) in a module that is only triggered for certain input values, which may appear at some later point in time as the module is used.
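
The dormant-fault example can be made concrete with a small Python sketch. The module, its formula and the input values are hypothetical and chosen only to show a fault that stays dormant until one particular input arrives, at which point it is activated and surfaces as a failure at the module's output.

```python
def scale_reading(raw):
    """Hypothetical module that scales a raw reading.

    Dormant fault: the divisor is zero when raw == 100, an input that the
    usual workload in this sketch does not produce.
    """
    return 100 * raw / (100 - raw)


def run(inputs):
    """Feed inputs to the module; return the step at which it fails, or None."""
    for step, raw in enumerate(inputs):
        try:
            scale_reading(raw)
        except ZeroDivisionError:
            # The fault was activated and propagated to the module's output.
            return step
    # The faulty code was executed, but the triggering value never occurred.
    return None


if __name__ == "__main__":
    print(run([1, 2, 3]))     # the fault stays dormant
    print(run([5, 100, 7]))   # the fault is activated at the second input
```

Ordinary testing with inputs like the first list would never reveal the fault, which is why injected or deliberately unusual inputs are useful.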

Errors

Errors change the internal state of the system in a way that may cause the system to fail. For this to happen the error must cause a series of chain reactions, where the error is propagated through the system by internal computations. Errors are transformed into other errors in a similar manner as faults are activated. Eventually a propagated error may cause an incorrect service output (or lack of output) violating the specification for the system, i.e., a failure. Thus, errors too can stay dormant in the system, waiting for the triggering conditions for propagation to occur before they propagate.

Failures

Failures are observed on the outputs of the system and are detectable as deviations from an assumed specification. As illustrated in Figure 1.3 a failure may itself cause a fault in another component. There are multiple facets to failures in a system. Some failures may be of higher criticality than others. Similarly, a failure need not mean a total absence of service provision; some systems can provide a limited level of service, i.e., there is a service degradation.

1.1.3 Dependability Means

There are four ways in which dependability can be achieved and analyzed: fault prevention, fault tolerance, fault removal and fault forecasting. This thesis is mainly concerned with fault forecasting, and to some degree with fault tolerance and removal.

Fault Prevention

The main intent with fault prevention is to avoid introducing faults in the system during its development by use of mature software engineering practices and tools. Faults arising in the field are avoided by the use of high quality hardware.

Fault Tolerance

Fault tolerance is a fundamental switch in mental model compared to fault prevention. In fault prevention one avoids introducing faults in the system. In fault tolerance on the other hand, faults are assumed to be present in the system, due to imperfect design methodologies, aging of hardware, interaction faults with components outside the control of the development team etc. Fault tolerance is based on the premise of error detection and recovery. Errors are detected and recovered from, or errors are masked using redundancy. Dependability is then achieved by tolerating the faults rather than avoiding their introduction, which may be very hard, or too costly.

Fault Removal

Fault removal aims to remove faults either during the development stage of the system or during the operational stage. Development stage methods are broken down into verification and validation, where verification relates to verifying that an implemented system actually implements the specification given, and validation to checking the specification itself. At runtime diagnosis and compensation techniques can be used to remove faults from the system.

The contributions in this thesis relate mainly to verification, more specifically to dynamic verification, such as testing.

Fault Forecasting

Fault forecasting aims to qualitatively and quantitatively evaluate system behavior in the presence of faults. It aims to establish failure modes of the system and to evaluate other system attributes regarding the dependability of the system. The intent is not the same as in fault removal (e.g., verification) but to establish operational characteristics of the system.

The thrust of this thesis is on robustness evaluation, which is a part of fault forecasting.

1.1.4 Alternate Terminology: Software Engineering

In the area of software engineering a slightly different terminology for dependability facets exists. In software engineering an error represents the mistake made by the programmer that made him/her introduce a flaw in the code, the fault (also known as bug). The consequence of the dormant fault is that it may get activated and then propagate to the software outputs, causing a failure of the component.

In this thesis we will consistently use the terminology from the dependability community as presented in 1.1. It allows for a discussion on the representativeness of injected errors and is aligned with the large body of work presented in Chapter 2.

1.1.5 Bohrbugs and Heisenbugs

As stated our main focus is on faults originating from software. Furthermore, we focus on the subset of software faults that are transient in nature and require complex triggering for activation. These faults, known as Heisenbugs [Gray, 1985] are of key interest because they are less likely to be found using traditional testing techniques. They represent faults that rarely appear in normal circumstances and contexts and are therefore harder to find. The opposite, Bohrbugs, have simpler and deterministic activation conditions and are easily repeatable. Some authors use the term Mandelbug for bugs which given the exact same conditions, sometimes appear, sometimes not [Grottke and Trivedi, 2007]. Using this terminology Heisenbugs are a subset of Mandelbugs.

1.2 Robustness Evaluation

A term related to dependability is robustness. Robustness is defined as "the degree to which a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions" [IEE, 1990]. Robustness is therefore related to and an influence on the dependability of the system. At the same time it is more restricted than dependability, since it only relates to external perturbations and not internal ones.

The focus of this thesis is on the robustness of OS's, as it gives useful and meaningful information about the system without requiring a specific operational scenario to be in place, as would be the case for, for instance, reliability or availability. Robustness is concerned with the cases where external components (including human users) do not behave as expected or as stipulated/assumed by the designer of a component. As such, robustness evaluation complements traditional verification and validation techniques (also formal ones). The goal of robustness evaluation is to identify potential vulnerabilities¹ in the system. Such vulnerabilities may or may not be triggered in a specific operational scenario. They may or may not constitute design faults (e.g., software bugs) in the system, and they may or may not lead to severe consequences for availability, reliability, safety or security.
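
To make the notion of a robustness evaluation more tangible, the sketch below probes a small service with valid and invalid inputs and classifies each outcome. It is only an illustration under assumptions: the service `port_name`, the outcome labels and the choice of which exceptions count as graceful rejection are all hypothetical, and the thesis's own failure classes are not reproduced here.

```python
PORT_NAMES = ["COM1", "COM2"]


def port_name(index):
    """Hypothetical service under evaluation: map a port index to its name."""
    if not isinstance(index, int):
        raise TypeError("index must be an int")
    # Out-of-range indices are not checked and raise an undocumented IndexError.
    return PORT_NAMES[index]


def probe(service, inputs, expected_errors=(TypeError, ValueError)):
    """Call the service with each input and classify what happens."""
    results = {}
    for value in inputs:
        try:
            service(value)
            results[repr(value)] = "no failure observed"
        except expected_errors:
            # The service rejected the invalid input in a documented way.
            results[repr(value)] = "rejected gracefully"
        except Exception:
            # Any other exception is treated as a robustness weakness.
            results[repr(value)] = "robustness failure"
    return results


if __name__ == "__main__":
    for value, outcome in probe(port_name, [0, "x", 7]).items():
        print(value, "->", outcome)
```

A probe like this says nothing about how likely the invalid inputs are in operation, which matches the point above that robustness evaluation does not need a specific operational scenario.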

As Commercial-Off-The-Shelf (COTS) components are more and more becoming the standard building blocks in modern designs their composition is a key aspect of the verification process. Robustness of individual components is of great importance since components built to be re-used in multiple contexts cannot be built with any such explicit context in mind. When combined with new components, having different failure characteristics, components may be faced with unexpected and abnormal inputs. Therefore, components should be built to respond robustly to such inputs and evaluating their robustness may reveal information on how well suited they are for a particular composition.

¹ The term vulnerability does not refer only to security vulnerabilities, but to weaknesses in a system that might lead to robustness failures. These might also include security-related weaknesses.

Many types of users may be interested in performing robustness evaluations as described in this thesis, including developers for debugging or profiling purposes; integrators for finding possible interaction problems between components; testers for identification of vulnerabilities; or system designers and managers for suitability tests, resource guidance or identification of weak components. We will emphasize when the aspects apply to a particular aspect of software development, but use the more general term evaluator for the person or entity conducting the evaluation.

1.3 Thesis Research Questions & Contributions

The use of COTS components, such as OS’s, is becoming more and more
common, also for products with stringent requirements on dependability.
For such component-based designs to fulfil these requirements, one needs to
establish the amount of trust that can be put on these components to work
in a specific environment, including how well they handle faults appearing
in other components of the system. To answer such a question, the failure
characteristics of the OS need to be established. This includes how the
OS can fail due to faults in the environment. Are certain services provided
by the OS more vulnerable? Are certain other components more likely to
cause a failure of the OS and consequently a failure of the system?
Using a model where the OS is the main platform component in a system
also containing applications and device drivers interfacing with the hardware,
these fundamental questions regarding the OS have guided the work presented
in this thesis. To give insights into how such questions can be answered, an
experimental error propagation and effect framework has been defined, where
synthetic errors are injected in the interface between the OS and its drivers.
Drivers have been identified as one of the main contributors of OS failures
[Murphy and Levidow, 2000; Simpson, 2003; Ganapathi et al., 2006]. Along
the same line Chou et al. found that driver code contains up to seven times as
many bugs as other parts of the Linux kernel [Chou et al., 2001]. As data on
how a system handles errors typically is not available as the system is built,
techniques are needed to speed up this process. One such technique is fault
injection, where synthetic faults (or errors) are injected and the behavior
of the system is observed.² This methodology raises additional questions
regarding how to inject errors, where to inject them, when to inject them
and which error model to use. These three questions are fundamental to any
fault injection approach and are orthogonal, as illustrated in Figure 1.4.

² Traditionally, the technique is called fault injection, even when errors are
injected.

Figure 1.4: Three fundamental dimensions in fault injection (axes: Type,
Location, Timing).

Each injected error modifies some part of the system. The error type
refers to how the system is modified, like flipping a bit, or assigning a wrong
value. Where the error is injected is its location dimension, like in CPU
registers, memory or in parameters to function calls. The timing of the
injection specifies when the error is injected, relative to some system event,
like boot-up time. The timing can principally be time or event triggered.
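
The three dimensions can be made concrete with a small sketch. The following
Python fragment is only an illustration under stated assumptions and is not
the injection framework of this thesis: the names Injection, flip_bit,
assign_value and run_with_injection are hypothetical, the location is reduced
to the name of a function-call parameter, and the timing is event triggered
(the n-th observed call).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Injection:
    """One injection, described along the three dimensions (hypothetical type)."""
    error_type: Callable[[int], int]  # type: how the value is modified
    location: str                     # location: which call parameter is hit
    trigger_call: int                 # timing: inject at the n-th call (1-based)


def flip_bit(bit: int) -> Callable[[int], int]:
    """Error type: flip one bit of a 32-bit value."""
    def corrupt(value: int) -> int:
        return (value ^ (1 << bit)) & 0xFFFFFFFF
    return corrupt


def assign_value(wrong: int) -> Callable[[int], int]:
    """Error type: replace the value with a fixed wrong value."""
    def corrupt(_value: int) -> int:
        return wrong
    return corrupt


def run_with_injection(calls: List[Dict[str, int]],
                       injection: Injection) -> List[Dict[str, int]]:
    """Replay a sequence of calls and corrupt one parameter of one call."""
    observed = []
    for n, params in enumerate(calls, start=1):
        params = dict(params)  # copy, so the original call trace is untouched
        if n == injection.trigger_call and injection.location in params:
            params[injection.location] = injection.error_type(
                params[injection.location])
        observed.append(params)
    return observed
```

Because the dimensions are independent fields, one of them can be varied while
the other two are held fixed, which is one way to read the orthogonality
illustrated in Figure 1.4.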
One may argue that fault injection, being an experimental technique, is
inherently limited since it does not provide completeness. Injecting a number
of faults and not finding any failures is no proof of correctness (as in the
system’s ability to handle all faults/errors). However, we argue that even
with lack of completeness, evaluation using fault injection is still very useful
since it gives information about how the system behaves in practice. Even
though formal techniques and additional layers of fault-tolerance
software may be desirable, they are not always possible for large systems, such
as an OS, due to performance, compliance or cost limitations.
On these premises we are interested in finding out how OS’s behave in
the presence of errors, more specifically errors in device drivers. Consequently
the following research questions are posed for this thesis:

Thesis Research Questions

The research questions posed for this thesis are grouped into two broad
categories, first the conceptual definition of robustness profiling and the
associated measures, and then the quantifiable experimental aspects of
validating the proposed measures.

Category 1: Conceptual


Research Question 1 [RQ1]: How do errors in device drivers propagate
in an OS? What is a good model for identification of such propagation paths?
Chapter 3 sets up the model used to evaluate and profile the OS. The
model must allow for clear definition of propagation paths, and for useful,
easily interpretable results to be extracted.

Research Question 2 [RQ2]: What are quantifiable measures of
robustness profiling of OS’s?
Chapter 5 presents a framework that allows for identification of error
propagation paths, that help us quantify which services are more likely to
spread errors in the system. It also allows us to identify, for an application,
which OS services used are more likely to be vulnerable to propagating errors.
Additionally, device drivers can be ranked based on their potential diffusion
of errors, allowing a designer to make informed choices on whether to include
a driver in the system, to enhance its robustness or to find an alternate
driver. Chapter 6 presents experimental results for a case study conducted
for Windows CE .Net.

Category 2: Experimental Validation

Research Question 3 [RQ3]: Where to inject? Where are errors
representing faults in drivers best injected? What are the advantages and
disadvantages of different locations?
We have chosen to inject errors in the interface between the OS and its
drivers. This level of injection provides flexibility and portability among
other advantages. Chapter 3 presents our system model and shows where
errors are injected. Chapter 4 details our injection framework, which allows
for errors to be injected.

Research Question 4 [RQ4]: What to inject? Which error model
should be used for robustness evaluation? What are the trade-offs that can be
made?
The choice of error model is not straightforward and there are trade-offs
to be made on the amount of details provided, the time/effort required and
the implementation complexity. It is shown in Chapter 6 how such trade-offs
can be made and three distinct error models are evaluated: bit-flips,
data-type and fuzzing. A novel composite error model is provided, combining
the higher vulnerability exposure of the bit-flip error model with the low
costs of the fuzzing error model.
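
To make the three error models tangible, the sketch below generates corrupted
values for a single 32-bit integer parameter. It is a simplified illustration
and not the implementation evaluated in Chapter 6: the function names are
hypothetical, the boundary values listed for the data-type model are assumed
for the example, and the number of generated values is used here only as a
rough proxy for injection cost.

```python
import random

WIDTH = 32
MASK = (1 << WIDTH) - 1


def bit_flip_errors(value: int) -> list:
    """Bit-flip model: one corrupted value per bit position (WIDTH in total)."""
    return [(value ^ (1 << bit)) & MASK for bit in range(WIDTH)]


def data_type_errors(value: int) -> list:
    """Data-type model: a few values picked from the parameter's type.

    The original value is ignored; the selection below (zero, one, all ones,
    sign-bit boundary values) is an assumed example for an unsigned 32-bit
    integer, not a list taken from the thesis.
    """
    return [0, 1, MASK, 1 << (WIDTH - 1), (1 << (WIDTH - 1)) - 1]


def fuzzing_errors(value: int, count: int, seed: int = 0) -> list:
    """Fuzzing model: `count` pseudo-random values; the original is ignored."""
    rng = random.Random(seed)  # seeded so that experiments are repeatable
    return [rng.getrandbits(WIDTH) for _ in range(count)]
```

For one parameter the bit-flip model yields 32 injections, whereas the
data-type and fuzzing models can be run with far fewer, which hints at the
kind of detail-versus-effort trade-off the research question refers to.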

Research Question 5 [RQ5]: When to inject? Which timing model
should be used for injection?
When doing in-situ fault injection experiments, which are needed for
robustness evaluation, the time of injection becomes an issue. The state of
the system evolves as it executes and consequently also its susceptibility to
faults. A novel approach to selecting the time of injection is presented in
Chapter 7, based on the usage profile of the driver.

Thesis Contributions

The research presented here constitutes several important contributions to
the research community. Each contribution lists the corresponding research
questions it helps answer in brackets.
Contribution 1: A framework is presented for characterizing error
propagation in an OS, focusing on a key source of OS failures, errors in device
drivers. [RQ1, RQ2, RQ4]
Contribution 2: A series of error propagation measures are defined,
which are used to profile the robustness of the OS. [RQ2, RQ4]
Contribution 3: A large scale case study for Windows CE .Net has been
carried out, where fault injection is used to validate the proposed measures.
[RQ3, RQ4, RQ5]
Contribution 4: A detailed investigation on the effectiveness and
efficiency of several error models for use in OS robustness evaluations. Models
are compared on several parameters, including number of provoked failures,
service coverage, required execution time and implementation complexity.
[RQ3]
Contribution 5: We show how a new composite error model can be
used when profiling drivers, combining desirable properties of other models
giving excellent coverage characteristics for a moderate number of injections.
[RQ3]
Contribution 6: The impact of the time of injection is studied and it is
shown that for a certain class of drivers, the impact is high. This indicates
that controlling the time of injection is important. [RQ5]
Contribution 7: A novel approach to selecting the right time to inject is
presented together with a large case study supporting the results. The model
uses the new concept of call block to define the time of injection. [RQ5]
Contribution 8: A flexible fault injection framework for Windows CE
.Net has been implemented and used to carry out the fault injection
experiments required. The framework allows for easy extension to new error
models, drivers and services. [RQ3, RQ4, RQ5]


Publications Resulting from the Thesis

The work reported in the thesis is supported by a number of international
publications:
• Andréas Johansson, Neeraj Suri and Brendan Murphy, On the Impact
of Injection Triggers for OS Robustness Evaluation, Proceedings of
the International Conference on Software Reliability Engineering
(ISSRE), 2007.
• Andréas Johansson and Neeraj Suri, Robustness Evaluation of
Operating Systems, Chapter 12 of Information Assurance: Dependability
and Security in Networked Systems, Editors: Yi Qian, James Joshi,
David Tipper and Prashant Krishnamurthy, To be published in 2007
by Morgan Kaufmann.
• Andréas Johansson, Neeraj Suri and Brendan Murphy, On the
Selection of Error Model(s) For OS Robustness Evaluation, Proceedings
of the International Conference on Dependable Systems and Networks
(DSN), 2007.
• Andréas Johansson and Neeraj Suri, Error Propagation Profiling
of Operating Systems, Proceedings of the International Conference on
Dependable Systems and Networks (DSN), 2005.
• Andréas Johansson, Adina Sârbu, Arshad Jhumka and Neeraj Suri,
On Enhancing the Robustness of Commercial Operating Systems,
Proceedings of the International Service Availability Symposium (ISAS),
Springer Lecture Notes on Computer Science 3335, 2004.
Additionally, the author has been involved in the following publications
that are not directly covered by the thesis:
• Andréas Johansson and Brendan Murphy, Failure Analysis of
Windows Device Drivers, Workshop on Reliability Analysis of System
Failure Data, Cambridge UK, 2007.
• Constantin Sârbu, Andréas Johansson, Falk Fraikin and Neeraj Suri,
Improving Robustness Testing of COTS OS Extensions, Proceedings
of the International Service Availability Symposium (ISAS), Springer
Lecture Notes on Computer Science 4328, 2006.
• Neeraj Suri and Andréas Johansson, Survivability of Operating
Systems: Profiling Vulnerabilities, FuDiCo II: Bertinoro Workshop on
Future Directions in Distributed Computing, 2004.

1.4 Thesis Structure

The structure of the following chapters follows the structure of the research
questions postulated previously:

Chapter 1 introduces the research problems studied and the contributions.
Also, it introduces the terminology used throughout the thesis.
Chapter 2 gives a background and context to the problems approached
in this thesis by surveying related work.
Chapter 3 presents and discusses the system and error model used.
The experimental environment is presented, both in terms of hardware and
software.
Chapter 4 presents our experimental methodology and presents details
regarding the fault injection technique used.
Chapter 5 introduces our error propagation framework and introduces
the key measures used for error propagation and effect analysis. Their use
and interpretation is discussed.
Chapter 6 investigates the impact of the choice of error model by
presenting a comprehensive experimental evaluation of three error models. The
evaluation builds on the measures introduced in Chapter 5.
Chapter 7 shows the impact of the time of injection and presents a novel
approach to choosing relevant injection times.
Chapter 8 finally puts the contributions of the thesis back into context
by discussing the general conclusions to be drawn. Additionally, a discussion
on how the results can be applied for several other research fields is provided
and future research directions are outlined.

Chapter 2

Background and Context

What is an OS, and how has its robustness been evaluated? What
is the state of the art and state of the practice in OS robustness
evaluation?

Over the years the OS has evolved in its complexity and roles. What
started as a program to help computer operators read jobs from tapes for
large mainframe computers, is today present in a multitude of computing
products and responsible for serving multiple concurrent users and handling
a wide range of devices. The sophistication of the services provided has
increased tremendously over the years, as has the reliance on the correct and
timely provision of service to applications and users. This has given rise to
a whole area of dependability evaluations and enhancements.
This chapter aims to relate the work presented in this thesis to the large
body of work performed by other researchers. Thus it forms the background
and the context for the research questions posed and puts the contributions
presented into perspective.


2.1 A Short Operating System History

The first computers were programmed by hand and the programs were given
to an administrator as punch cards, who then placed them in the card
reader for the computers. As computers evolved and the uses and requirements
for computations increased it became evident that some form of control
software was needed, both to abstract away the intricacies of the hardware
and to allow for concurrent access for multiple users. The OS was born
to handle multiple jobs that needed time on the CPU. At first these jobs were
batched and the role of the OS was to read the code for one job into memory
(from tapes or punch cards) and when it was finished write the output on
printers, tapes etc. One major issue with batching of jobs was that while the
computer, which was a horrendously expensive piece of equipment, was waiting
for some external device it could not make any progress and was simply
idle. This was solved when multiprogramming was introduced in OS’s. The
memory available to the computer was partitioned across multiple jobs, such
that when one job was waiting for some I/O operation to complete, another
job could use the processor to perform computations. Further improvements
followed, such as timesharing where multiple users attached to terminals
could share the computer, by dividing the time used on the processor across
the users. As computers became smaller, faster and more user friendly, the
number of computer users also increased. Several different OS’s evolved,
the most prominent ones being first UNIX (which comes in many flavors,
including OS’s like GNU/Linux and Mac OS X/Darwin), later followed by
Microsoft DOS and Windows. Many special purpose OS’s were developed,
for instance for Real-Time systems, or for large-scale servers. Good text
books on general OS related themes include the classical books by
Tanenbaum [2001] and Silberschatz et al. [2004].

2.1.1 OS Design

One of the key goals for an OS is protection. It should prevent users and
processes from gaining access to data (read, modify, execute etc), devices and
other processes in an uncontrolled manner. This includes both unintentional and
intentional (even malicious) accesses. A common technique to enforce this is
to define (in hardware) different privilege levels, where processes executing
with higher privilege can access lower privilege processes, but not the other
way around. For most OS’s two such levels (or modes) are defined, user
level and kernel level. Only at the kernel level is it possible to use some
processor instructions. By executing the OS at the higher privilege level
(kernel mode) it can control user processes’ access to the system. Naturally
failures of kernel mode components are potentially more severe than those of
user mode components, since few protection mechanisms exist to prevent them
from corrupting important system data and components.

Figure 2.1: Example of microkernel design. Figure from [Tanenbaum, 2001].
(User mode: client processes, process server, terminal server, ..., file
server, memory server. Kernel mode: microkernel.)

There have been two main design principles for general OS’s, the
monolithic kernel and the microkernel-based design. In a microkernel-based design
the OS kernel is kept small and provides only low level services, such as
process and memory management, inter-process communication (IPC) etc. The
microkernel is the only entity of the OS running in privileged mode. Other
services that one wants the OS to provide, such as file systems, device drivers
etc execute in user mode (and are often referred to as servers) as illustrated
in Figure 2.1. Applications request OS services using IPC to the particular
server providing the service, as shown by the arrow in the figure.
In a monolithic design on the other hand, all OS services execute in
privileged mode, and applications make system calls to use the services¹
provided by the system. The model of the OS is layered vertically, with each
layer using services of lower layers, whereas the microkernel design is more
of a horizontal design. This design is reflected in our system model, which is
shown in Figure 3.1.

¹ Throughout we will use the term service which is more general than the term
system call. However, for the system used in this thesis they are synonyms.

2.1.2 Device Drivers

Device drivers are, as the name suggests, responsible for interaction with
devices. There are also drivers for virtual devices (protocols etc) and other
software making use of the driver architecture to extend the functionality
of the OS. A driver’s role is to encapsulate and handle the device specific
interaction needed in order for the OS and applications to use the device.
As many devices have special functionalities, or use specific protocols, the
drivers provide a middle layer between the OS and the devices.


In order to facilitate OS-driver interactions, and to make it easier to
develop drivers, the interface between a driver and the OS is typically
standardized. This means that a driver needs to implement certain functionality
for the OS to be able to interact with it. In exchange the OS provides
functionalities that make it easier to develop and maintain drivers. This way
the OS can handle whole classes of drivers the same way, making it significantly
easier to develop new devices (and drivers) for an existing OS. Using device
drivers also potentially simplifies the porting of the OS to multiple hardware
architectures, as the drivers can be used to handle parts of the
hardware-specific features of different architectures.
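
The idea of a standardized OS-driver interface can be sketched in a few lines.
The example below is a toy model under stated assumptions, not the driver
interface of Windows CE or of any other OS: the entry points open, read and
close, and the classes Driver, NullDriver and Kernel, are hypothetical names
chosen for illustration.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Entry points every driver must implement (hypothetical interface)."""

    @abstractmethod
    def open(self) -> int:
        """Prepare the device and return a handle."""

    @abstractmethod
    def read(self, handle: int, size: int) -> bytes:
        """Return `size` bytes from the device."""

    @abstractmethod
    def close(self, handle: int) -> None:
        """Release the handle."""


class NullDriver(Driver):
    """A trivial virtual device that only produces zero bytes."""

    def open(self) -> int:
        return 1

    def read(self, handle: int, size: int) -> bytes:
        return bytes(size)

    def close(self, handle: int) -> None:
        pass


class Kernel:
    """Handles every registered driver through the same interface."""

    def __init__(self) -> None:
        self._drivers = {}

    def register(self, name: str, driver: Driver) -> None:
        if not isinstance(driver, Driver):
            raise TypeError("driver does not implement the Driver interface")
        self._drivers[name] = driver

    def read_device(self, name: str, size: int) -> bytes:
        driver = self._drivers[name]
        handle = driver.open()
        try:
            return driver.read(handle, size)
        finally:
            driver.close(handle)
```

Since the kernel only ever calls the shared entry points, this boundary is
also a natural place to intercept calls and corrupt parameters or return
values, in line with the injection location chosen in this thesis.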

2.1.3 What is the Problem?

There are several reasons why it is difficult to design and test an OS. First
of all, most OS’s are general-purpose, i.e., they are built to handle a wide
variety of workloads and they can be highly parameterized to be used in
different environments. Furthermore, the OS kernel runs in high-privilege
mode, where failures easily take down the whole system. OS’s often have long
run-times, especially in the server and embedded areas, making them sensitive
to resource exhaustion and leakage problems. Many OS functionalities
are service-oriented, meaning that correctness of execution may be hard to
define and limit, making testing and other means of verification and
validation hard. Lastly the sheer size of modern OS’s poses a problem for
thorough verification and validation. To give a hint on size, Steve Jobs (CEO
of Apple Inc.) was quoted as saying that Mac OS X contained 86 million lines
of code [Jobs, 2006].

2.2 Sources of Failures of Operating Systems

This section will survey some of the sources for OS failures. To get
information on common sources for failures the most straightforward technique
is to collect data from deployed systems in the field. Most companies collect
failure data for their systems to some extent, but there are several aspects
warranting consideration, such as privacy, user participation, unbiased data
sets etc [Murphy, 2004].
One of the most influential papers within its field is Gray’s 1985 classical
paper “Why Do Computers Stop and What Can Be Done About It?” [Gray,
1985]. Studying outage reports for a large number of Tandem systems, four
main classes of sources for outages were identified: administration, software,
hardware and environment. Administration and software were found to
be the largest contributors (42% and 25% respectively). Another interesting
finding was that a majority of the software faults investigated for a specific
subsystem were Heisenbugs, not Bohrbugs, which supports the idea of using
software fault tolerance through redundancy, e.g., process pairs etc. A later
report in 1990 reports on a trend that software is increasingly being the
source of failures (up to 60% in 1989).
The rest of the section covers different sources of faults. There are many
possible classifications of faults, and in this thesis we will consider three
classes: hardware, software and user-related faults. This section will review
each of these in turn and relate them to OS failures. As device drivers are a
major source of OS failures and of interest to this thesis, the last subsection
is dedicated to faults in device drivers.

2.2.1 Hardware Related

Hardware related faults are faults stemming from physical defects or
phenomena in the hardware platform upon which the OS is running. Hardware
faults may have different causes, such as power glitches, wear-out/aging,
radiation/EMI etc. Much work has been spent on characterizing and protecting
against hardware faults. Common examples include error correcting codes on
memory, redundant bus lines, redundant disks (RAID etc) and many other
techniques. From an OS perspective these four classes of faults have broadly
been defined in [Kao et al., 1993]:

• Memory faults, corrupting memory locations, either code or data,
• CPU faults, computation, control flow and register faults. They appear
as register corruptions (PC, PSR, SP etc),
• Bus faults, affecting bus lines, and
• I/O faults, external devices causing problems.

Cosmic rays penetrating the atmosphere may cause transistor voltage levels
to transiently change when they hit chips. Due to their nature (transient)
such errors are often referred to as soft errors (compared to hard, permanent
errors). The rate at which they occur is referred to as the soft error rate
(SER). It is expected that future chips will have an increase in SER due to the
scaling of size and supply voltage [Shivakumar et al., 2002; Constantinescu,
2003, 2005]. By the principle of error containment, errors are best handled
close to the source. However, many hardware errors may propagate to the
software level undetected. This is especially true for system-level software,
including device drivers. Such errors may be difficult to discriminate from
software faults.
This relation was studied in [Iyer and Velardi, 1985]. Data was collected for
an installation of the MVS OS at Stanford. The purpose of the study was
to evaluate the OS’s ability to detect and diagnose software errors that are
related to hardware errors. The investigation showed that the OS was rarely
able to correctly diagnose the error as hardware related, and less so than for
pure software errors.

2.2.2 Software Related

Sullivan and Chillarege studied error reports for software errors in the MVS
OS, between the years 1985-1989 [Sullivan and Chillarege, 1991]. Two main
groups of software defects (termed errors henceforth) were analyzed, regular
and overlay errors. Regular errors represent a “typical software error
encountered in the field”. Overlays are errors where memory areas have
been overwritten, such as buffer overruns. The most common types of mistakes
were found to be related to memory allocation, copying overruns and
pointer management. Other error types identified include register reuse,
type mismatch, uninitialized pointer, undefined state, synchronization,
deadlock, sequence error, statement logic, data error, compilation errors and
others/unknown. The study was later extended to include database management
systems and further refined and related to the development process through
the concept of defect type [Sullivan and Chillarege, 1992]. This gave rise
to the classical Orthogonal Defect Classification (ODC) process [Chillarege
et al., 1992; Chillarege, 1996]. ODC contains seven defect types: function,
assignment, interface, checking, timing/serialization, build/package/merge
and documentation. Each defect can be classified to belong to one of these
types. These types have later been used to build libraries for injection of
artificial faults in systems, for instance [Christmansson and Chillarege, 1996;
Christmansson et al., 1998; Durães and Madeira, 2006].

2.2.3 User Related

Murphy and Levidow investigated the sources for outages for Windows NT,
and found drivers to be a significant source of system crashes [Murphy and
Levidow, 2000]. However, system outages are mostly found to be planned
(installation of hardware/OS/applications etc). Xu et al. [1999] investigated
the causes for Windows NT system reboots and found planned maintenance
and configuration to be responsible for 31% of the downtime for a production
environment network of servers. Software and hardware were found to also
cause a significant part of system downtime (22% and 10%).
Even though user-related faults are a key to minimizing outages of systems,
they are not the focus of this thesis. We focus on software related faults and
their consequences.

2.2.4 Device Drivers

Device drivers are typically developed by a different party than the OS
vendor. However, for some general drivers (bus drivers, for instance) generic
drivers may be developed by the vendor. Testing of drivers is therefore
normally performed by the device manufacturer. Due to the number of devices
produced and the time-to-market pressure, device drivers may not have the
same level of quality as other parts of the OS. Developers often do not have
the time, or are not skilled enough, to handle the sometimes intricate
interaction required with the OS and the devices. Device drivers typically
execute in kernel mode, meaning that a critical fault in a device driver may
take down the whole system. Recent efforts have made user-mode drivers
possible in many modern OS’s [Corbet et al., 2005].
Device drivers today form the largest part (in terms of lines of code)
within the OS. Chou et al. [2001] reported that 70% of the Linux kernel code
is device drivers. That device drivers are a major source for OS failures is
therefore no surprise. Several field studies have found drivers to be a main
source of system failures, e.g., [Murphy and Levidow, 2000; Ganapathi and
Patterson, 2005; Ganapathi et al., 2006].
Ganapathi et al. investigated the causes of kernel crashes for the Windows
OS [Ganapathi and Patterson, 2005; Ganapathi et al., 2006]. Crash
reports were collected from volunteers using the BOINC (Berkeley Open
Infrastructure for Network Computing) platform [BOI]. Device drivers were
found to be the major cause for kernel crashes, but the study is based solely
on crashes, not outages, which may not be due to crashes.
In contrast to the previous studies, which were based on collecting error
reports and logs, Chou et al. have studied Linux kernels spanning seven years
using static analysis [Chou et al., 2001]. The analysis was static compiler
based, using the source code of the kernel. They found device drivers to have
error rates of up to three to seven times that of other parts of the kernel.
Furthermore they found that some functions have distinctly more bugs than
others (clustering of bugs) and that newer files are more prone to errors than
older ones.


2.3 Fault Injection

Fault injection is a technique where faults (or errors) are intentionally
inserted in a system to observe how the system reacts. The technique started
in the hardware area, with the specific purpose of testing fault-tolerance
mechanisms [Arlat et al., 1993]. It has also been proposed to use fault
injection as part of certification for high-assurance systems [Voas, 1999].
Fault injection can be performed at different levels in the system (like
hardware, software, protocols) and at different stages in development (on
design models, prototypes or deployed systems). The focus in this thesis is
on executable systems, i.e., at least a prototype of the system needs to exist
for the evaluation to be performed. Furthermore, we focus on techniques
implemented in software, so called SWIFI techniques (SoftWare Implemented
Fault Injection). Other techniques require use of special-purpose hardware or
use abstract models of the system to inject faults. SWIFI has the advantage
of being more flexible and cheaper. Surveys of fault injection techniques,
covering all three classes, include Clark and Pradhan [1995], Hsueh et al.
[1997] and Carreira et al. [1999]. The rest of this section covers a wide range
of SWIFI tools.
FIAT (Fault Injection-based Automated Testing) is a fault injection tool
for dependable distributed applications [Segall et al., 1988; Barton et al.,
1990]. System dependability properties, especially error coverages and
latencies, were evaluated. Injections were performed by bit-level corruption of
a task’s data and/or code memory areas. Three types of corruptions were
used, zero-byte, set-byte and 2-bit compensate. The outcomes of experiments
were classified on a five-grade scale, from machine crash to invalid output
and no error.
In an early work on fault injection, Chillarege and Bowen introduced the
concept of failure acceleration achieved through fault injection. Failure
acceleration occurs when the fault → error → failure process is accelerated, by
decreasing the fault and error latencies, and increasing the probability that a
fault causes a failure [Chillarege and Bowen, 1989]. This makes experiments
faster to perform and allows for estimations of the transition probabilities
(fault → error and error → failure), which is typically not possible from
field data (which focus mostly on failures). They reported on a fault
injection study performed on the MVS OS, where a random (virtual) page in
memory was set to 0xFF, generating an invalid address/opcode, thereby
increasing the probability that a fault causes a failure. It was found that only
a small fraction of the injected faults led to a complete failure of the primary
service of the system (16%), whereas most (70%) led to no loss of service
at all. Careful study of the latter category led to the definition of potential
hazard, an error which has caused damage in the system but does not lead
to failure under the current operating state. Potential hazards may lead to
failure at a later stage, triggered by changes in workload, and may explain
previously observed relations between workload and failures.
FERRARI (Fault and ERRor Automatic Real-time Injector) injects errors
in application and OS processes simulating low level hardware faults.
Injection is performed either by first corrupting the memory of the process
before it is started, or by injecting faults during execution, triggered either
spatially (i.e., after a certain code location is reached) or by a timeout
defined by the user [Kanawati et al., 1995]. Injections are performed purely in
software, using software traps. Faults are injected in the address, data or
control line for the targeted instruction, resulting in for instance different
instructions being executed or operands being modified. The actual injection
is performed using bit-level modifications. Both transient and permanent
faults can be injected.
FINE (Fault Injection and moNitoring Environment) was used to study
the propagation of errors in OS’s. FINE can inject both hardware (CPU,
memory, bus) and software related faults (initialization, assignment,
condition). FINE was one of the first fault injectors to be implemented in kernel
space, making OS evaluation possible. Previous tools, such as FERRARI,
executed in user-mode and thus had no access to kernel memory areas [Kao
et al., 1993]. A new system call was implemented (ftrace) used to specify
injection and insertion of probes from user-space. [Kao et al., 1993] reports
on experiments for SunOS 4.1.2 using randomly placed bit-flips in code memory
and randomly selected global variables. Software faults were manually
injected in the kernel text segment. Only 8% of the injected faults led to error
propagation to another subsystem, with most of them caused by corrupted
function call parameters. FINE was later extended for use with distributed
systems as DEFINE [Kao and Iyer, 1994].
FTAPE (Fault Tolerance And Performance Evaluator) was used to compare
fault-tolerant computer systems [Tsai and Iyer, 1995]. The tool combines
a fault injector with a workload generator and monitor, to allow
injection of faults under high stress conditions, when faults are more likely to
propagate [Tsai et al., 1996, 1999]. Faults can be injected throughout the
system (CPU, memory, disks). The tool can inject single, as well as multiple
bit-flips and stuck-at faults.
FTAPE was later extended to NFTAPE, which is an extensible tool using
generic components to perform fault injection in a distributed fashion. So
called lightweight fault injector components are defined to perform the actual
injection; monitor and trigger components can similarly be provided by the
user [Stott et al., 2000]. NFTAPE has been used to study error sensitivity
of Linux [Gu et al., 2004].
MAFALDA (Microkernel Assessment by Fault injection AnaLysis and
Design Aid) uses fault injection to assess the robustness of microkernel-based
OS’s [Arlat et al., 2002]. Injections are made into both code and data areas of
the OS, as well as into the parameters of kernel calls using the bit-flip error
model. MAFALDA can be used to study system failure modes and error
propagation across the components of the system. The authors also showed
how, using a formal description of function behavior, error detection wrappers
can be defined for kernel functions. An extension of the tool, MAFALDA-RT,
was designed to also handle real-time systems [Rodriguez et al., 2002].
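
The notion of an error detection wrapper derived from a description of
function behavior can be sketched as follows. This is only an analogy in
Python and not MAFALDA’s mechanism: the behavior description is reduced to a
precondition and a postcondition, and the names detection_wrapper,
DetectedError and allocate are hypothetical.

```python
import functools


class DetectedError(Exception):
    """Raised when a call violates the described behavior."""


def detection_wrapper(pre, post):
    """Build a wrapper that checks `pre` before and `post` after the call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapped(*args):
            if not pre(*args):
                raise DetectedError("precondition violated: %r" % (args,))
            result = func(*args)
            if not post(result, *args):
                raise DetectedError("postcondition violated: %r" % (result,))
            return result
        return wrapped
    return decorate


# Hypothetical stand-in for a kernel call: the described behavior says that
# sizes lie in 1..4096 and that the returned buffer has the requested length.
@detection_wrapper(pre=lambda size: 0 < size <= 4096,
                   post=lambda result, size: len(result) == size)
def allocate(size):
    return bytearray(size)
```

A bit-flip in a high-order bit of size, for example 16 turning into
16 ^ (1 << 20), is stopped by the precondition check before the wrapped
function runs.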
DOCTOR (integrateD sOftware fault injeCTiOn enviRonment) is a tool
for SWIFI for distributed real-time systems [Han et al., 1995]. It can inject
communication faults, such as lost or duplicated messages. Hardware faults
are simulated in CPU registers, bus or memory as single or multiple bit
faults. The location (in memory) can be defined by the user or randomly
selected. For driving the experiments various synthetic workloads can be
automatically generated or user-defined programs are used. DOCTOR was
for instance used to evaluate a distributed diagnosis algorithm implemented
on HARTS, a distributed shared-memory-based real-time architecture [Shin,
1991].
Xception uses debugging and performance monitoring capabilities of
processors to inject errors in CPU functional units. The processor is
instructed to halt when faults are to be injected and low-level exception
handling code performs the injection. The advantage of this approach is that
interference with the target system is minimized; it requires no source code
access, or trace-based execution of applications or OS. Focus is on simulating
hardware transient faults, as these form the majority of faults in modern
processors [Shivakumar et al., 2002; Constantinescu, 2005]. Several triggers are
supported, including address-based (fetch of opcode from specific address)
and timeout-based. Faults are injected as bit level faults (stuck-at, flips and
masks). Xception was later used for other studies, including [Madeira et al.,
2000, 2002].
PROPANE (PROPagation ANalysis Environment) is a fault injection
tool used to primarily study error propagation in embedded software
[Hiller et al., 2002b; Hiller, 2002]. Data-level errors are targeted, by
modifying data values, either on the bit-level or by fixed values or offsets.
Injections are triggered by selecting injection locations. Additionally timers
can be set, either using clock counters or counters on reaching injection
locations. PROPANE can inject transient, intermittent and permanent
errors. Since instrumentation is done on the source code of the target
software, propagation can be studied down to individual signals (variables) of
the components of the system. Together with the measures defined in the
EPIC framework (Exposure, Permeability, Impact, Criticality), PROPANE
was used to evaluate the propagation of errors in an aircraft arrestment
system [Hiller et al., 2004; Hiller, 2002].
RIDDLE (Random and Intelligent Data Design Library Environment) tests
application and system services/libraries on Windows NT using random but
syntactically correct strings as input [Ghosh et al., 1998]. The program is
observed for unexpected termination, crashes, unhandled exceptions etc. The
approach taken is similar to that of [Miller et al., 1990] described in
Section 2.4.

FST (Failure Simulation Tool) wraps applications running on the Windows
family of OS’s with an instrumentation layer, whereby failing OS functions
can be simulated. On a technical level the wrapping is performed in a very
similar manner to the Interceptor modules used in this thesis (see Section
4.5 for more details). Failures in the OS are simulated by throwing
exceptions and returning error codes. The faults were selected from the set
of outcomes from previous experiments on the system using RIDDLE [Ghosh et
al., 1998]. Applications are deemed robust if they do not hang, crash or
disrupt the system in the presence of perturbations.
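
The wrapping idea can be illustrated with a minimal Python sketch. It is not
FST's implementation: the names simulate_failure, os_read and application, as
well as the error code -1, are hypothetical and chosen only for illustration.
A wrapper sits between the application and a stand-in OS function and, when
armed, either raises an exception or returns an error code instead of calling
the real function.

```python
import functools

def simulate_failure(func, mode=None, error_code=-1):
    """Wrap func so that a failing OS function can be simulated.

    mode is None (pass through), "exception" or "error_code".
    All names are hypothetical; this is not FST's API.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if mode == "exception":
            raise OSError("simulated OS failure in " + func.__name__)
        if mode == "error_code":
            return error_code
        return func(*args, **kwargs)
    return wrapper

def os_read(handle):
    """Stand-in for an OS service used by the application."""
    return len(handle)

def application(read):
    """A robust application checks return values and handles exceptions."""
    try:
        result = read("handle")
    except OSError:
        return "handled exception"
    if result < 0:
        return "handled error code"
    return "ok"
```

An application that neither caught the exception nor checked the return value
would be the non-robust case in this sketch.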
HEALERS (HEALers Enhanced Robustness and Security) is a system for
automatically increasing the robustness of C libraries [Fetzer and Xiao,
2002a,b]. HEALERS uses adaptive fault injection to evaluate the robustness of
individual parameters in library functions. Information present in header
files and manual pages is used to build fault injectors, which progressively
test functions to compute the robust argument type, i.e., the set of values
for which the function does not crash or return with an error. This
information is used to automatically build robustness wrappers for the
selected library functions. HEALERS was later extended (the new tool is
called Autocannon), using an extended type system from Ballista (see Section
2.4) to further simplify the generation of robustness wrappers [Süßkraut and
Fetzer, 2007]. Wrappers are defined as predicates over a set of tests on
parameter values, making the approach more flexible and extensible than the
original HEALERS approach.
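
The notion of a robust argument type can be sketched as follows. This is a
simplified illustration under stated assumptions, not the HEALERS algorithm:
the "type" is approximated by the subset of a finite candidate list that the
function survives, and robust_values, make_wrapper and fragile_sqrt are
hypothetical names.

```python
def robust_values(func, candidates, is_error):
    """Return the candidates for which func neither raises nor reports an error."""
    robust = []
    for value in candidates:
        try:
            result = func(value)
        except Exception:
            continue  # counted as a crash in this sketch
        if not is_error(result):
            robust.append(value)
    return robust

def make_wrapper(func, robust, error_result):
    """Build a wrapper that rejects arguments outside the computed robust set."""
    allowed = list(robust)
    def wrapper(value):
        if value not in allowed:
            return error_result
        return func(value)
    return wrapper

def fragile_sqrt(x):
    """Hypothetical library function: raises on None, returns -1 for negatives."""
    if x is None:
        raise TypeError("null argument")
    if x < 0:
        return -1
    return x ** 0.5
```

The wrapper turns a potential crash into a defined error return, which is the
role the text assigns to robustness wrappers.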
AutoPatch reuses parts of the HEALERS system to investigate applications’
handling of error codes from library functions [Süßkraut and Fetzer, 2006].
Error injection is used to find unsafe functions, i.e., error code return
values are injected for function calls, and applications not handling them
(crashing) are labeled unsafe. Unsafe functions can be automatically patched
using a variety of patching techniques.
DTS (Dependability Test Suite) tests applications running on Windows NT by
corrupting the parameters to library calls [Tsai and Singh, 2000]. The server
in a client-server system (Apache, IIS, SQL Server) was targeted and outcomes
were classified from a client’s perspective, i.e., retry required, server
restart required, complete failure etc. The servers were tested running
stand-alone and using two different fault-tolerance middleware solutions.
Three bit-level fault models were used for the parameters to the library
calls: setting all bits, zeroing all bits or flipping all bits. Large-scale
injections verified that the evaluated middleware reduced the number of
failures considerably.
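
The three bit-level fault models named above are simple enough to sketch in
Python. The 32-bit parameter width and all function names are assumptions
made for illustration; they are not taken from DTS.

```python
MASK32 = 0xFFFFFFFF  # assumption: parameters are 32-bit words

def set_all_bits(value):
    """Fault model 1: every bit forced to one."""
    return MASK32

def zero_all_bits(value):
    """Fault model 2: every bit forced to zero."""
    return 0

def flip_all_bits(value):
    """Fault model 3: every bit inverted."""
    return ~value & MASK32

FAULT_MODELS = {"set": set_all_bits, "zero": zero_all_bits, "flip": flip_all_bits}

def corrupt(value, model):
    """Apply the named fault model to a parameter value."""
    return FAULT_MODELS[model](value & MASK32)
```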

G-SWFIT (Generic Software Fault Injection Technique) uses software mutations
to inject software faults into the binary of a target program [Durães and
Madeira, 2006]. A field study of real faults was used to generate mutations.
First the faults were classified according to ODC (see Section 2.2.2). This
classification is then refined with an orthogonal classification of missing,
wrong and extraneous constructs, which allows for more precise fault
injection. The binary of the target is searched for patterns relating to
higher-level code constructs, where code mutations chosen from a
representative set of software faults are inserted. Using failure mode
analysis, the behavior of three test programs is compared when inserting
low-level mutations and source code faults. Overall, most of the source code
level faults could be reproduced by the mutations to some extent.

Several hardware-based techniques have been developed as well, injecting
faults at different levels of the system. MESSALINE [Arlat et al., 1990,
1993] injects faults at the pins of ICs of fault tolerant systems. Karlsson
et al. [Karlsson et al., 1994] use heavy-ion radiation for validation of
fault-handling mechanisms.

Fault injection has mainly been used to evaluate fault tolerance mechanisms
or robustness issues. However, it has also been found useful in the area of
security, especially for protocols [PRO]. In [Chen et al., 2002] errors were
injected in two kernel-based firewalls on Linux, and it was found that errors
can in some cases lead to security vulnerabilities. Modeling a realistic
installation suggests that error-caused vulnerabilities are a non-negligible
source of security concerns. Du and Mathur [2000] injected errors in the
environment of an application and observed the applications for security
violations. NFTAPE has been used to inject control flow bit-flips in the user
authentication section of sshd and ftpd on Linux and it was found that such
faults may open up the affected servers for vulnerabilities [Xu et al.,
2001]. Neves et al. present the AJECT tool, which performs attack injection
for detecting vulnerabilities [Neves et al., 2006]. Attacks target protocols
by varying the messages used.

2.4 Operating Systems Dependability Evaluation

Several past efforts have focused on evaluation of dependability and
robustness issues in OS’s, including the previously mentioned field studies.
This section is dedicated to dependability benchmarks, where standard
methodologies and tools are used to evaluate and compare systems.

In benchmarking the aim is to compare competing systems using a fair and
repeatable process. Benchmarks for comparing computer performance are
abundant and have found widespread use, even though the interpretation of the
results often is non-trivial. One of the most influential benchmarks is the
Standard Performance Evaluation Corporation (SPEC) benchmark [SPE; Henning,
2000]. The most well known benchmark from SPEC is probably SPECint for
measuring integer computing capabilities of CPUs, but the organization offers
benchmarks in many areas, such as graphics, high-performance computing (HPC)
and web-based systems. Several other benchmarks exist for specific areas,
such as the Embedded Microprocessor Benchmark Consortium’s (EEMBC) benchmarks
[EEM; Weiss and Clucas, 1999], Transaction Processing Performance Council
(TPC) [TPC] and LINPACK [LIN] to mention but a few.

Dependability benchmarks are not aimed at comparing performance (only), but
how “dependable” a system behaves. Two well known projects on dependability
benchmarking are the Ballista project from Carnegie Mellon University [Bal]
and the EU-IST project DBench [DBE; Kanoun et al., 2001]. An introduction to
the general problem of benchmarking and specific issues related to
dependability benchmarking is given in [Johansson, 2001].
Ballista is a robustness benchmark [Koopman, 1999; DeVale et al., 1999;
Koopman and DeVale, 1999]. The first version of Ballista targeted the POSIX
interface found on many OS’s. It builds a testing wrapper for the function
targeted and then automatically builds test cases by selecting parameter
values from a set of valid and invalid values for that particular data type.
Since the number of different data types used in the POSIX interface is
relatively low, the number of types for which injectors need to be specified
is also low. Adding new functions to be tested only requires defining values
for any new type not previously used, making the approach very scalable.
Extensive experimentation done on several OS’s revealed multiple robustness
issues [Koopman and DeVale, 1999, 2000]. Ballista was later used for testing
I/O libraries [DeVale and Koopman, 2001], CORBA implementations [Pan et al.,
2001] and for Win32 interfaces in Windows [Shelton et al., 2000].
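
The type-driven generation of test cases can be sketched in a few lines of
Python. The value pools, the type names and the function repeat_len are
hypothetical, and the exhaustive cross product is an assumption of this
sketch rather than a description of how Ballista selects combinations.

```python
import itertools

# Hypothetical per-type pools of valid and invalid test values.
TEST_VALUES = {
    "int": [0, 1, -1, 2**31 - 1, -2**31],
    "str": ["", "a", None],
}

def generate_test_cases(param_types):
    """Return one argument tuple per combination of per-type test values."""
    pools = [TEST_VALUES[t] for t in param_types]
    return list(itertools.product(*pools))

def run_tests(func, param_types):
    """Call func with every generated case and record whether it raised."""
    results = []
    for args in generate_test_cases(param_types):
        try:
            func(*args)
            results.append((args, "ok"))
        except Exception as exc:
            results.append((args, type(exc).__name__))
    return results

def repeat_len(n, s):
    """Hypothetical function under test; fails when s is None."""
    return len(s) + n
```

Testing a further function that takes the same parameter types needs no new
value pools, which is the scalability argument made in the text.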
Mendonca and Neves used fault injection to test functions in the Windows DDK
(the interface for device drivers) [Mendonca and Neves, 2007]. Since the DDK
for Windows exports more than a thousand functions, only functions used in at
least 95% of the drivers were tested. Each function was tested in isolation,
similar to tests in Ballista, and failure mode analysis was performed. The
failure modes were related not only to the system robustness (crash, hang
etc.) but also to consistency of data on disk, where FAT32 and NTFS were
compared. Three versions of Windows were compared (XP SP2, Server 2003 and
Vista RC1) and the results showed great similarities, indicating that the
tested functions have not undergone fundamental changes impacting robustness
throughout the three versions tested. It was also found that NTFS, as
expected, showed no file system inconsistencies, whereas FAT32 did in some
cases.
Early benchmarking projects aimed at OS’s targeted UNIX systems. The crashme
program was developed to test the robustness of the OS by executing random
data [Carrette]. This was achieved by first allocating an array and filling
it with random data. Then several child processes are spawned that try to
execute the data as if it was a code segment. The system is subjected to a
large number of such processes, with the intent of testing the error
detection and handling capabilities of the OS. This simple test successfully
crashed several UNIX systems. The program was later extended to CMU Crashme,
which subjected UNIX system calls to random strings, thereby testing their
parameter checking code [Mukherjee and Siewiorek, 1997]. This modified
version could crash the Mach 3.0 OS in less than ten seconds. The authors
also pointed out the usefulness of modular benchmarks, targeting specific
areas of the system, such as file, memory and inter-process communication
subsystems for an OS [Siewiorek et al., 1993; Dingman et al., 1995; Mukherjee
and Siewiorek, 1997].

Another approach using random data was carried out by Miller et al. [1990],
where a series of commercial UNIX implementations were benchmarked and
compared. The target was not the UNIX kernel per se, but a set of utility
applications commonly included in most UNIX OS’s, such as awk, diff and grep.
The tests consisted of supplying random strings as inputs to these utilities
(which typically work on text input). The technique was named fuzzing and has
served as inspiration to the area of Random Testing and also to the fuzzing
error model used in this thesis. Robustness was measured by observing the
behavior of the application, where crashes or hangs were undesirable outcomes
(showing non-robust behavior). A surprising number of deficiencies were
found, with large differences between the benchmarked systems. The
experiments were later repeated with similar results [Miller et al., 1995].
Studies have also been conducted for Windows NT [Forrester and Miller, 2000]
and MacOS [Miller et al., 2006].
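
A minimal in-process sketch of this kind of random-input testing is shown
below. It is an illustration only: the original studies ran real UNIX
utilities as separate processes and also detected hangs, which this sketch
does not attempt. The names random_input, fuzz and fragile_utility are
hypothetical.

```python
import random

def random_input(rng, max_len=64):
    """Return a random byte string of random length."""
    length = rng.randint(0, max_len)
    return bytes(rng.getrandbits(8) for _ in range(length))

def fuzz(target, runs, seed):
    """Feed random inputs to target and collect the inputs that make it raise."""
    rng = random.Random(seed)  # seeded so that failures can be reproduced
    crashes = []
    for _ in range(runs):
        data = random_input(rng)
        try:
            target(data)
        except Exception as exc:
            crashes.append((data, type(exc).__name__))
    return crashes

def fragile_utility(data):
    """Hypothetical text utility that assumes its input is valid ASCII."""
    return data.decode("ascii").upper()
```

Recording the seed with each run makes it possible to replay an input that
exposed a failure.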

A dependability benchmark for fault-tolerant systems was developed in [Tsai
et al., 1996] using the FTAPE fault injection tool. The measures were the
number of catastrophic incidents and the performance degradation of the
system. Faults were injected into the CPU, memory and I/O components of the
system. Since the systems tested consisted of redundant computers (Triple
Modular Redundancy, TMR) the expected outcome is only a performance
degradation.

Using PostMark™, a file system performance benchmark, as workload, Kanoun et
al. developed a dependability benchmark for several versions of Windows and
Linux [Kanoun et al., 2005]. The benchmark is defined as a measure of the
robustness of the OS’s ability to withstand invalid API inputs. Reaction and
restart times are also measured for the compared OS’s. The same authors have
also defined a dependability benchmark using the TPC-C transactional
performance benchmark as workload [Kalakech et al., 2004a,b] for Windows.

Brown et al. argue that the human factor must be included in a dependability
benchmark by showing that most of the outages for large systems are caused by
(human) operators [Brown and Patterson, 2001]. They present a dependability
benchmark which includes real human operators, who perform both detection and
recovery actions for injected faults as well as standardized maintenance
tasks [Brown et al., 2002]. Vieira and Madeira also consider operator faults
for studying recovery procedures in database management systems (DBMS)
[Vieira and Madeira, 2002a,b]. A portable faultload for DBMS’s is defined in
[Vieira and Madeira, 2004], used to form a dependability benchmark for
On-Line Transaction Processing systems (OLTP). Two kinds of measures are
used, performance-related (from TPC-C) and dependability-related.
Performance-related measures were taken both with and without faults
injected. Dependability-related measures include data integrity and
availability measures. The benchmark was applied to four different DBMS’s and
three different OS’s, and comparisons were made across them.

2.5 Other Techniques for Verification and Validation

This section presents complementary techniques for verification and
validation and their relation to the fault injection-based techniques used in
this thesis. In general, no single technique is to be preferred, and we
emphasize that the proposed techniques must also be used as complements to
other verification and validation techniques.

2.5.1 Testing

Software testing is the most basic form of verification for software. The
developer (or a designated tester) builds a test case which defines the
context for the test and the inputs, as well as the expected outputs, based
on the specification of the tested component. The test is executed and the
result compared to the expected result. A plethora of testing techniques
exist and in this section we highlight the ones more relevant to this work.
Good introductory texts on software testing include the seminal book The Art
of Software Testing by Myers [2004], or Testing Computer Software by Kaner et
al. [1999].

From an implementation point of view testing and fault injection have many
commonalities, especially fault injection focusing on interfaces, which is
the case in this thesis. This type of fault injection resembles widely spread
unit testing approaches, such as equivalence class testing or boundary value
testing [Myers, 2004].

Conceptually, software testing’s goal is to identify faults, i.e., bugs,
whereas the goal of a robustness evaluation is to identify weaknesses. A
weakness in this sense may not be a bug (although it might), because it may
lie outside of the scope of the specification, or may arise only in a certain
context.

In equivalence partitioning testing of a function one typically focuses on
both valid and invalid classes of inputs. The input space is split into a set
of equivalence classes where it is assumed that the function behaves
similarly (in terms of correctness) for all values within a class. The
specification for the input is required for performing the partitioning.
Boundary-value testing can be seen as an extension of equivalence
partitioning testing where one focuses on the values around the boundaries of
the equivalence classes.
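
As a small illustration, consider a hypothetical function whose specification
accepts integers in a closed range [lo, hi]. The Python sketch below (all
names are assumptions made for this example) partitions the input space into
one valid and two invalid classes and derives the values on and next to each
boundary of the valid class.

```python
def equivalence_classes(lo, hi):
    """Partition the integers into classes relative to the valid range [lo, hi]."""
    return {
        "below (invalid)": lambda v: v < lo,
        "in range (valid)": lambda v: lo <= v <= hi,
        "above (invalid)": lambda v: v > hi,
    }

def classify(value, lo, hi):
    """Return the name of the equivalence class that value belongs to."""
    for name, member in equivalence_classes(lo, hi).items():
        if member(value):
            return name

def boundary_values(lo, hi):
    """Values on and next to each boundary of the valid class."""
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]
```

One representative per class plus the boundary values gives a small test set
for the range check of such a function.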

2.5.2 Formal Methods

Any form of fault injection is inherently a dynamic testing method [Myers,
2004]. We consider dynamic testing techniques as ours, and more formal
techniques, including static analysis (like [Ball and Rajamani, 2002]) as
complements. Both are useful for building more dependable systems and both
have their strengths and weaknesses. Formal proofs (like theorem proving) are
used to prove that the implementation is made according to the specification,
which also needs to be expressed formally. Another approach is to use a model
of the system, build and check all the states of the system and verify that
they do not violate the specification, for instance by leading to a violation
of the safety specification for the system [Kumar and Li, 2002].

[Hayes and Offutt, 2006] uses static analysis of the user input
specifications to programs both to identify inconsistencies in the
specification itself and to generate test cases for testing. A large
empirical case study showed that the automatic tool found defects faster than
expert testers, but not necessarily more. It also found defects not found by
human testers at all. The results support the complementary use of automatic
tools with domain expertise.

Even though formal methods are theoretically attractive, since they offer
means to reach completeness, testing techniques and related experimental
techniques like fault injection are likely to prevail for many years to come,
due to their ease of use and understanding. However, test automation is a
necessary evolution in testing, as systems grow larger and more complex.

2.6 Summary

This chapter has presented background information and reviewed related
research within the areas of OS robustness evaluation and fault injection.
On this background we have identified the OS as being the key to system
dependability and robustness since it is the platform on which applications
and services are built. Furthermore, device drivers were identified as the
main source of software-related causes of system failures. Several previous
studies have focused on interfaces (OS-Application and OS-Driver) as this
facilitates portability and fair comparison, important aspects of benchmarks.
Fault injection has in multiple previous studies been shown to be an
effective means for evaluation of dependability attributes of OS’s.

Our report on related work does not stop with this chapter. Throughout the
thesis we will give pointers to relevant studies where appropriate.

Chapter 3

System and Error Model

What are the system boundaries, and what is an error?

OS’s are key building blocks in virtually all computer based systems, ranging
from small deeply embedded control systems, to desktop workstations and large
servers for online transactions. Consequently OS dependability is an
important objective and a prerequisite for dependable provision of services.
This chapter builds the foundation for the following chapters, starting by
presenting the system model used. Then a general error model is defined, in
terms of location, type and trigger, followed by the presentation of the
three error models used in the following chapters. The experimental setup
used is presented, both in terms of hardware and software. The chapter is
concluded with a summary containing a table of the symbols introduced for
reference in later chapters.

3.1 System Model

Most modern OS’s are monolithic, i.e., the OS kernel providing the most basic
functionalities runs in kernel space, as illustrated in Figure 3.1. This is
in contrast to, for instance, microkernel-based OS’s, where the functionality
of the OS kernel is spread across multiple subcomponents with well specified
interfaces.

We use a generic model of the OS. Similar to many other studies, e.g.,
[Albinet et al., 2004; Durães and Madeira, 2003], we model a monolithic
system. The model consists of four layers: applications, OS, drivers and
hardware platform. We have chosen this model as it is generic enough to apply
to several commercial OS’s, like Windows or Linux. It is also sufficient for
measuring the robustness of the system due to errors in drivers by studying
error propagation.

Each layer consists of one or more subcomponents (like different applications
in the application layer, or different drivers in the driver layer). Our
model does not specify the subcomponents required in each layer since they
differ for each specific OS. Each layer provides services to be used by
neighboring layers. A service can be realized in many ways. Function calls
(like API’s, Application Programming Interfaces, or system calls) are common,
but in general other mechanisms could be used, like the message passing
paradigm used for communication between the OS and the drivers defined in the
Windows Driver Model (WDM) found on Windows XP [Oney, 2003]. The nature of
the communication is not important for the model; what is important is that
the service is syntactically specified, and that the flow of information can
be intercepted and modified. This is required to be able to inject errors and
to observe the outcome of each injection. The specification is defined in an
interface. The two interfaces of interest here are the OS-Application and
OS-Driver interfaces indicated in Figure 3.1.
The system has a set of n applications, APP_1 ... APP_n. The application set
includes all applications running on the system which are not required for
the OS to function properly. This includes applications added for the purpose
of the evaluation, called benchmark applications, or test applications.
Applications make use of OS-level libraries to implement their
functionalities. Typically, applications run in user space and the OS and the
device drivers execute in privileged mode.

The OS layer includes the OS kernel and all required libraries delivered as
parts of the OS. Examples of such libraries are libraries used by
applications to interface with the OS (POSIX, C-runtimes etc.).

The OS provides a set S of services to be used by applications (the s_i’s in
Figure 3.1). The OS-Driver interface consists of services provided both by
the OS (the os_x.y’s in Figure 3.1) and services provided by drivers (the
ds_x.y’s in Figure 3.1). Collectively they are referred to as the set O of
services in this interface. Each application APP_x uses a set of OS services,
termed A_x, where A_x ⊆ S.

[Figure 3.1: System model. From top to bottom: application layer (APP_1 ...
APP_n), OS-Application interface (services s_i), OS layer (operating system),
OS-Driver interface (services os_x.y and ds_x.y), driver layer (D_1, D_2,
..., D_N) and hardware layer (hardware platform).]

A driver is modeled as a component having both import and export interfaces
as illustrated in Figure 3.2. The exported interface consists of a set of
services that the OS calls to request the driver to perform operations. These
services are termed ds_x.y for the y-th service provided by driver D_x. The
imported service interface is used by the driver to accomplish these requests
and can be from the OS itself or other libraries in the system. These
services are termed os_x.y for the y-th service imported by driver D_x. When
no distinction is made between imported and exported services we term a
service at this interface s_x.y ∈ O. In the target environment used in this
thesis (Windows CE .Net) a service corresponds to a function call.
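
The set relations of the system model can be written down directly. The
following Python sketch uses hypothetical service and application names; only
the relations themselves (A_x ⊆ S, and O as the collection of all imported
and exported driver services) come from the model.

```python
# Hypothetical service names; only the set relations come from the model.
S = {"s1", "s2", "s3", "s4"}                       # OS services offered to applications
A = {"APP1": {"s1", "s2"}, "APP2": {"s2", "s4"}}   # A_x: services used by APP_x

ds = {"D1": {"ds1.1", "ds1.2"}, "D2": {"ds2.1"}}   # services exported by driver D_x
os_ = {"D1": {"os1.1"}, "D2": {"os2.1", "os2.2"}}  # services imported by driver D_x

# O: all services in the OS-Driver interface, imported and exported alike.
O = set().union(*ds.values(), *os_.values())

def applications_using(service):
    """Applications whose service set A_x contains the given OS service."""
    return sorted(app for app, used in A.items() if service in used)
```

A lookup such as applications_using shows which applications are directly
exposed if a given OS service misbehaves.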
To perform error propagation analysis we require sufficient access (with
specification) to the system to be able to intercept information flow in the
two interfaces defined (OS-Driver and OS-Application). In most cases this can
be achieved without requiring access to source code, neither for the drivers,
nor for the OS itself. For Windows CE .Net no such access is required.
However, access is needed to the source code of the benchmark applications,
for instrumenting them with assertions used to track the outcomes of
injections. The availability of interface specifications is a basic
requirement for any OS open for extensions by new types of
drivers/applications.

[Figure 3.2: Driver model. The target driver D_x exports services (ds_x.y)
and imports services from the OS and other components (os_x.y).]

3.2 Error Model

In order to conduct fault injection based experimental stressing, three
questions arise, namely:

• Where to inject?
• What to inject?
• When to inject?

The answers to these questions correspond to three properties of an error
model, referred to as the error type, error location and error trigger.
Another, fourth property, related to the error trigger, is for how long to
inject. Each of these properties of an error model is discussed in the
following subsections.

Throughout this thesis we do not make any distinction between the terms
errors and faults. Consequently, we will use error when discussing the
perturbations inserted in the system. When the distinction is needed, we will
explicitly use the term fault.
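
The properties of an error model can be captured in a small data structure.
The Python sketch below is one possible encoding, assumed for illustration
and not taken from the thesis implementation: the type is a function from the
original to the erroneous value, the location is a service name plus a
parameter index, the trigger is a call count and the duration is a number of
consecutive calls. The class names ErrorModel and Injector are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ErrorModel:
    error_type: Callable[[int], int]  # what to inject: original value -> erroneous value
    service: str                      # where to inject: the targeted service ...
    parameter: int                    # ... and the index of the targeted parameter
    trigger_call: int                 # when to inject: the n-th call to the service (1-based)
    duration: int = 1                 # for how long: number of consecutive calls affected

class Injector:
    """Applies an ErrorModel to the argument lists of intercepted calls."""

    def __init__(self, model):
        self.model = model
        self.calls = 0

    def intercept(self, service, args):
        if service != self.model.service:
            return args
        self.calls += 1
        first = self.model.trigger_call
        if first <= self.calls < first + self.model.duration:
            args = list(args)
            index = self.model.parameter
            args[index] = self.model.error_type(args[index])
        return args
```

With duration set to 1 the sketch injects a transient error; a larger value
keeps the error active over several consecutive calls.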

3.2.1 Error Type

The error type constitutes the nature of the error. The error type relates to
the origin of the error, i.e., the fault, but also to the manifestation of
faults as errors. The error type describes how an error changes some internal
state of the system, from the originally (assumed) correct state, to another,
possibly erroneous, state.

Depending on the goal of the evaluation, error types are chosen either to
match, as closely as possible, errors expected to appear in the system as it
is deployed in the field, or generic error types are used, based on their
ability to provoke the system such that weaknesses in handling perturbations
are discovered. As our interest is in robustness, i.e., how the system
handles external perturbations, our goal is to use models that provoke as
many and as diverse vulnerabilities as possible. It can also be argued that
when the purpose of the evaluation is comparative, as is the case here, the
value of using a realistic error model is decreased, assuming that the
relative effect of different models is the same [Hiller, 2002]. Chapter 6
studies the selection of different error models explicitly. As previously
noted, robustness evaluation can also be a means for finding
security-relevant vulnerabilities.
Fault injection originates from the desire to estimate the effectiveness of
error detection and recovery mechanisms (EDRMs) built into a system. For this
purpose one chooses to use the error model used for the design of these
mechanisms. The second type of evaluation is explorative in nature. Without
knowledge of the presence or coverage of any EDRMs, the system is evaluated
to see how it handles the perturbations injected. This type of evaluation can
be guided by the need to explore extra-functional behavior of the system (or
lack thereof) or by lack of operational scenarios. The second case is
especially true for general purpose system components, such as OS’s, which
may be used in many, fundamentally different, operational contexts.
The focus of this thesis is on exploration of robustness vulnerabilities of
OS’s. To this end we have chosen error types based on their usefulness in
other research projects as well as real-world projects as reported in
literature. Three main error models have been used: data type-based errors,
bit-flips and fuzzing (random values).

An error appearing at the interface of a component (such as a device driver)
appears as a data level error, i.e., the (data) value of some parameter used
in the interface has an erroneous value. What constitutes erroneous depends
on the interface/parameter in question and the state the system is in. For
instance, a driver returning an erroneous “busy” value may only cause a small
delay for the overall system (provided that a retry mechanism exists). A
driver responding “ready” when it in fact is not ready to receive commands
may cause severe failures in the system.

There are many possible sources for erroneous values to appear at the
interface, such as propagating hardware errors, faulty assignments of
variables in the driver code, wrong user inputs or concurrency problems. ODC
is a framework to classify software defects and many of them can manifest as
interface errors (one class in ODC refers specifically to interface defects)
[Chillarege, 1996]. As we model the effects of faults, i.e., errors, the
interface level errors model the effects of many of the underlying faults
(and consequently ODC classes of defects) as propagating errors. However, as
we focus solely on data level errors for device drivers at the interface to
the OS we do not offer complete coverage of all operational faults.
Therefore, our approach should be treated as a complement to other
techniques.

All three models used represent data level errors, but differ in the
implementation and the level of semantic expressiveness. The three models
will now be presented one by one.

Data Type Error Model

The manifestation of a data error also depends on the data type of the
parameter in question. Since modern compilers contain type checkers, not all
assignments are possible for a parameter, which restricts the set of possible
errors for the selected parameters. The errors in the data type (DT) error
model therefore depend on the data type of the parameter in question. As most
device drivers are written in the C programming language we will consider
C-style data types. This excludes high level abstract data types supported in
other high-level languages, such as classes/objects in object-oriented
programming languages.

Some OS’s do provide the possibility to write device drivers in other
programming languages, for instance C++. However, since most drivers are
still written in C we focus on such interfaces. In principle, object-oriented
interfaces can be seen as extensions of the data structures used (we already
support the struct data structure) in C.
For each data type used a set of injection cases is defined. These are
predefined, before injection, and are chosen based on their effectiveness in
exposing vulnerabilities in the system [Koopman et al., 1997]. Values include
predefined (no randomness involved) test values, offset values and boundary
values. Offset values modify the original value, for instance using addition
or subtraction operations on the original value. The number of injections
defined is typically relatively low, allowing this error model to incur fewer
injections (on average) than, for instance, the bit-flip model [Koopman et
al., 1997]. Chapter 6 discusses the number of injections required compared to
other error models.

Since the injection cases are defined on a data type basis, the number of
such data types used becomes the scaling factor for the data type error
model. Typically, multiple services in such interfaces use the same data
types. For instance, in [Kropp et al., 1998] only 20 data types were used for
the 233 POSIX functions targeted. Each of the 233 functions used a
combination of the 20 data types, making this error model scale very well
with the number of functions. No specialized injection scaffolding is
required for each tested function. However, one caveat is that information on
the data type used is required to select the right injection cases.

Table 3.1: Overview of the data types used.

Data type     C-Type            #Cases
Integers      int               7
              unsigned int      5
              long              7
              unsigned long     5
              short             7
              unsigned short    5
              LARGE_INTEGER     7
Misc          void *            3
              HKEY              6
              struct {...}      *
Strings                         4
Characters    char              7
              unsigned char     5
              wchar_t           5
Boolean       bool              1
Enums         multiple cases    #identifiers

Table 3.1 shows an overview of all the data types used and the number of
cases implemented for each of the types. The injection cases were chosen
based on their reported use in literature, such as the Ballista project
[Bal], and to include cases modifying the original value. Not relying solely
on statically defined values, like boundary values and special values, allows
exploration of “close to correct” values, which may be very problematic to
detect and recover from. Furthermore we have only selected values which can
occur in real code. It is important to be able to match the injected error to
a hypothetical fault in the code. Since we simulate mostly software faults,
each error injected must have been possible to introduce by an implementation
fault. Consequently, each injected error must be compilable, i.e., it must
pass the type check by the compiler. By having a specific injection case for
each data type this property is maintained.
Data type errors also treat pointers as a special data type and reserve one
injection case for the pointer, namely setting it to NULL. Wrong use of
NULL-pointers is a common programming mistake. This is done for explicit
pointers, but not for implicit pointers, such as strings, which have this
case defined as a special injection case. To further illustrate how data type
errors are defined, Table 3.2 shows the cases for the type int. Cases 1 and 2
modify the original value by adding an offset to it. Cases 3-7 use commonly
difficult values and boundary values. Table 3.3 shows the errors injected for
string parameters (both for Unicode and ASCII strings). The first case
effectively evaluates the reliance on the end character for strings. The
second case shortens the string by disregarding the first character and the
third case replaces the entire string with an empty string. The last case
sets the pointer to the string to NULL.

Table 3.2: Data type error cases for type int.

Case #    New value
1         (Original value) - 1
2         (Original value) + 1
3         1
4         0
5         -1
6         INT_MIN
7         INT_MAX

Table 3.3: Data type error cases for strings.

Case #    New value
1         Overwrite end of string ('\0')
2         Increase reference pointer
3         Replace with empty string
4         Set reference to NULL
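
The injection cases of Tables 3.2 and 3.3 can be sketched in Python as
follows. This is an illustration under stated assumptions, not the thesis
implementation: int is taken to be a 32-bit two's complement type, wrap-around
at the boundaries is not modelled, a C string is represented as a byte string
ending in a NUL byte, the overwriting byte b"X" is arbitrary, and a NULL
reference is represented by None.

```python
INT_MIN = -2**31      # assumption: 32-bit two's complement int
INT_MAX = 2**31 - 1

def int_error_cases(original):
    """Return the seven injection values for an int parameter (cf. Table 3.2)."""
    return [
        original - 1,  # case 1: offset
        original + 1,  # case 2: offset
        1,             # case 3
        0,             # case 4
        -1,            # case 5
        INT_MIN,       # case 6: boundary
        INT_MAX,       # case 7: boundary
    ]

def string_error_cases(original):
    """Return the four injection values for a NUL-terminated string (cf. Table 3.3)."""
    if not original.endswith(b"\0"):
        raise ValueError("expected a NUL-terminated byte string")
    return [
        original[:-1] + b"X",  # case 1: terminator overwritten
        original[1:],          # case 2: reference advanced past the first character
        b"\0",                 # case 3: empty string
        None,                  # case 4: NULL reference
    ]
```

Because each case list depends only on the data type, the same two functions
serve every service whose parameters are of these types.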
The choice of values was based on known problematic values and previous
studies, and the number of values was kept relatively low. The choice of
effective values (those exposing vulnerabilities) is difficult and context
dependent, and is similar to the problems arising when selecting suitable
equivalence classes for functional testing [Hamlet, 2006]. For this study we
have therefore opted for a simple and lightweight data type model. Our model
does not have the same expressive power as the ones used for instance in
[Koopman et al., 1997] or [Fetzer and Xiao, 2002b], as it is based solely on
the data type of the parameter. The in-situ injection strategy effectively
limits the possible types of injections that can be carried out. The model
used is chosen for its simplicity and low number of injection cases.

Bit-Flip Error Model

When hardware elements are exposed to, for instance, radiation and EMI the
voltage levels in transistors can change, causing the logical ones or zeros
to change values or even get stuck at a certain value. The bit-flip model
(BF) simulates these types of faults in hardware, by selectively flipping
certain bits, changing the value from one to zero or vice versa.

The BF model was first introduced to simulate hardware errors as above, and
was used in multiple fault injection tools (see Section 2.3 for several
examples of such tools). At first, hardware-based injection tools were used,
but SoftWare Implemented Fault Injection (SWIFI) soon emerged, where hardware
faults are injected using software mechanisms. SWIFI improves flexibility and
eases implementation (no special hardware components needed), but may be
limited in which areas of hardware can be targeted. Once injection could be
conducted using software, bit-flips were soon also used to simulate software
faults [Voas and Charron, 1996]. There is still a debate whether the BF model
accurately reflects software faults. Some authors argue that this relation is
of lesser importance, especially for robustness evaluation, and that the
important question is whether the effects of the injected faults (the errors)
are the same as those of real faults [Jarboui et al., 2002b].
In the BF model each parameter is seen as a data word, where selected bits
are flipped to simulate faults in the module. A difference is made between
single event upsets (SEU, also referred to as soft errors in discussions on
hardware reliability) where only one bit is flipped, and the case where
multiple bits are flipped. In this thesis focus is on the simpler SEU model,
where one of the bits is selected as target and flipped. For a 32 bit
architecture this typically results in 32 injections per parameter. Some data
types do not use the full 32 bits and thus a smaller number of bits can be
used.

The greatest advantage of the bit-flip model when used on interface
parameters is simultaneously its greatest weakness, namely simplicity. While
being very simple to implement it lacks in expressiveness for more complex
errors in abstract data types, such as strings etc.
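
The single-bit-flip case is short enough to state as code. The Python sketch
below (the function name is an assumption) enumerates every value obtained by
flipping exactly one bit of a word, which gives the 32 injections per 32-bit
parameter mentioned above and fewer for narrower data types.

```python
def single_bit_flips(value, width=32):
    """Return every value obtained by flipping exactly one bit of a width-bit word."""
    mask = (1 << width) - 1
    return [(value ^ (1 << bit)) & mask for bit in range(width)]
```

Each returned value differs from the original in exactly one bit position, so
the number of injections grows linearly with the width of the parameter.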

Fuzzing Error Model

The first use of the fuzzing error model (FZ) in the context of robustness
evaluation was reported in [Miller et al., 1990]. Here UNIX utility programs
were fed random input data and their behavior was observed. The technique of
random input for robustness evaluation was further developed in [Ghosh et
al., 1998; Oehlert, 2005; Godefroid et al., 2007] and is advocated for
instance by Microsoft as part of their Secure Development Lifecycle [Howard
and Lipner, 2006] and is mainly focused on files and network protocols.

For interface fault injection, fuzzing translates into replacing the value of
a parameter in the interface with a random value. The random value is
uniformly selected across all 32-bit values. The uniform distribution is
selected since no knowledge regarding operational profiles or equivalence
classes of parameter values is present (or assumed) which could justify using
a different distribution. More sophisticated techniques can also be applied,
as in the area of random testing (see Section 2.5.1). As true randomness is
difficult to achieve, pseudo-random generators are used. Care needs to be
taken to make sure that the seeds used for these generators are selected
differently for each experiment, else the same value will be chosen for each
injection.
It is important to note that whereas BF (and in some cases also DT)
modifies a given value, i.e., the new erroneous value depends on the original
(presumably correct) value, fuzzing completely replaces the value with a new
one. This means that BF can be expected to more effectively test "close to
correct" values, whereas FZ is better (thanks to the random selection) at
finding more rare values causing vulnerabilities.
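
The difference between the two models can be illustrated with a minimal sketch. It is not the framework's code (which is written in C/C++); the function names, the base seed and the seed-derivation formula are assumptions made only for this example. The BF cases are derived from the original value, one per flipped bit, while the FZ value ignores the original and comes from a pseudo-random generator that is seeded differently for each experiment.

```python
import random

WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def bit_flip_cases(value, used_bits=WORD_BITS):
    """BF/SEU model: one erroneous value per single flipped bit.

    `used_bits` restricts injection to the bits a data type actually uses.
    """
    value &= WORD_MASK
    return [value ^ (1 << bit) for bit in range(used_bits)]


def make_rng(experiment_id, base_seed=2008):
    """Derive a distinct but reproducible generator per experiment.

    The derivation formula is an arbitrary choice for this sketch.
    """
    return random.Random(base_seed * 1_000_003 + experiment_id)


def fuzz_value(rng):
    """FZ model: replace the parameter by a uniformly chosen 32-bit value."""
    return rng.randrange(1 << WORD_BITS)
```

For a full 32-bit parameter `bit_flip_cases` yields the 32 injections mentioned above; passing `used_bits=8` would yield 8 cases for a data type that only uses the low byte.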

3.2.2 Error Location
As location and distribution of actual faults may be unknown, or too costly
to fully explore, a common approach is to inject errors instead of faults, i.e., to
inject the consequences of activated faults rather than the faults themselves
[Barton et al., 1990]. Many faults may manifest as the same error, i.e., at the
same location/level in the system. This concept is illustrated in Figure 3.3.
An injected error may represent multiple faults (at 2a or 2b), originating at
different locations. An error injected at the interface (at 3.) may therefore
represent multiple errors having propagated to the same location.
Jarboui et al. [2003] make a distinction between the level in the system
where a fault is injected and the reference location of the originating fault,
d_r. In Figure 3.3 this corresponds to the distance between points 1 and 2.
Furthermore, a distinction is made between the location of the injected error
and the level where the failures of the system are observed, d_o. In Figure 3.3
this corresponds to the distance between points 1 and 4.
In this thesis we have focused on errors appearing in the interface between
the OS and its device drivers. For the purpose of robustness evaluation of
an OS, this interface represents a good location for injecting errors for the
following reasons:

• This is a standard interface defined for the OS. Using a standard
interface facilitates fair comparisons across drivers for the same OS.
• The interface allows for low-intrusion interception of the calls being
made across the interface. Low intrusion means that no source code
changes are needed, neither to the OS, nor to drivers, which may not be
available for commercial products.
• As described above, injecting errors in the interface between the driver
and the OS allows for simulating multiple faults within the driver and
in the hardware it controls.
• Injecting errors at this level allows for achieving 100% activation ratio
for the injected errors using pre-profiling. This process is described in
Section 4.6.
• No driver-specific knowledge is needed, making the approach readily
available also for non-driver experts.
• The interface is an open interface, in that other 3rd party developers
are given access to the full interface, making robustness a key issue for
the vendor of the OS.

[Figure 3.3 diagram: Component A and Component B; 1. Fault injection
location; 2. Fault origins (2a, 2b); 3. Error location in interface;
4. Observation point.]

Figure 3.3: Error manifestation example.

3.2.3 Error Trigger
An error can be a permanent defect present in the system, or a combination
of defects and external perturbations that together lead to an error. This
gives rise to two distinct properties of an error, relating to timing: the event
triggering the appearance of an error and the duration of the error once it
appears.


For hardware related errors, permanent errors are not uncommon, even
though recent research indicates that the ratio of transient errors is
increasing. For software, permanent errors (Bohrbugs) are targeted using testing.
For robustness evaluation the main target is Heisenbugs. Furthermore, the
very nature of Heisenbugs makes them difficult to find since a simple
relation between inputs and failure cannot be established. Therefore, they are
instead simulated by the injection of errors in the system. In this work we
have focused on a transient error model as we believe this to more closely
represent behavior by the drivers not found through standard testing
routines. Multiple other fault injection tools allow for injection of transient,
intermittent and permanent faults, e.g., [Han et al., 1995; Stott et al., 2000].
The trigger used for an injection is studied in depth in Chapter 7. Choosing
which type of trigger to use (event- or time-driven) and which parameters to
use is a non-trivial task. This thesis proposes a novel event-driven approach,
resulting in an effective, yet simple, process for the selection of triggering
events. The proposed approach is based on the usage profile of the drivers,
which allows injection in different states of the system.

3.2.4 Other Contemporary Software Error Models
The work reported in [Albinet et al., 2004; Durães and Madeira, 2003; Arlat
et al., 2002; Gu et al., 2004; Jarboui et al., 2002a] explored the use of
various error models and injection techniques for OS robustness evaluation and
benchmarking. In [Jarboui et al., 2002a], for instance, error models similar
to ours are used, but are injected at different levels within the Linux kernel.

Durães et al. use a code mutation error model, where code segments of
device drivers are targeted [Durães and Madeira, 2003]. Mutations have long
been used to assess the effectiveness of testing. For instance, DeMillo used
code mutations for investigating the efficiency of a set of test cases in
discovering flaws of a piece of software [DeMillo et al., 1978]. The authors develop a
theory regarding the coupling effect, namely that if a set of test cases (input
values) can distinguish all (simple) mutations of a program from the correct
one, then it will also detect more complex faults in the code. Mutations were
later used in fault-based testing (for instance [Zeil, 1983; Morell, 1990]) to
verify that certain code level errors are not present in a piece of software.

Finally, in [Moraes et al., 2006] the authors note that errors appearing
at interfaces of components, though being useful for robustness evaluation,
do not necessarily represent faults in the code. Since we are indeed focusing
on robustness, interface errors are relevant. However, continued research on
error propagation can hopefully reveal which errors can be represented at the
interface, and which not.

3.3 Experimental Environment

The experimental environment detailed in this section was chosen to
represent a commonly used OS. We chose Windows CE .Net because it provides
the possibility to customize the whole system image in an easy manner and
it is a wide-spread OS, with uses in a wide range of products. Furthermore,
its architecture resembles that of most modern OS's, making it an excellent
case study. This section first introduces Windows CE .Net, its architecture
and tool support. Then the hardware setup used is presented together with
a description of the software setup.

3.3.1 Windows CE .Net
Windows CE .Net is an OS from Microsoft, targeted mainly at the
embedded market. It is highly configurable, making it widely used in different
configurations in diverse products, such as mobile phones and PDAs, Internet
surf stations, Point-of-Sale stations, GPS navigators etc. Windows CE is the
foundation for more specific embedded OS's from Microsoft, such as Pocket
PC and Windows Mobile.

The first products based on Windows CE were released in 1996
as so called Handheld PCs. Further revisions of the first version have led to
the currently latest version 6.0 at the time of writing this thesis. The work in
this thesis is based on version 4.2 of the OS, called Windows CE .Net, which
was released in 2003. For reasons of continuity we have chosen not to move to
a newer version. Therefore, the rest of this thesis describes version 4.2 of
the OS. A good introductory textbook on programming for Windows CE is
Douglas Boling's book Programming Microsoft Windows CE .Net [Boling,
2003].

Figure 3.4 shows an overview of the architecture of Windows CE .Net.
It shows how the OS layer is split in two parts, the generic OS layer
provided by Microsoft, which forms the interface used by applications, and the
OEM layer which is provided by the OEM (Original Equipment
Manufacturer) embedding Windows CE in a product sold to customers. The OEM
layer makes it possible to use Windows CE for many hardware architectures
and for a multitude of peripheral devices and technologies. Although device
drivers are the main responsibility of the OEMs, some generic drivers are
also provided by Microsoft as part of the OS package.

In relation to our system model (Figure 3.1) the Driver Layer is formed
by the drivers provided either by the OEM or Microsoft. The rest of the
OEM layer is considered as part of the OS Layer.

[Figure 3.4 diagram, layers from top to bottom:
Application Layer: Internet Client Services, WinCE Applications, User
Interface, Custom Applications.
Operating System Layer: Core DLL, Object Store, Multimedia Technologies,
Graphic Windowing and Event System (GWES), Device Manager,
Communication Services and Networking, Kernel.
OEM Layer: OEM Adaptation Layer, Boot Loader, Configuration Files, Drivers.
Hardware Layer.]

Figure 3.4: An overview of the architecture of Windows CE .Net. Figure
adopted from [MSDN].

Windows CE .Net, although being a completely different OS than other
OS's produced by Microsoft, offers many of the same services and interfaces
used on the Windows platform. Examples include the .Net platform (as a
subset known as .Net Compact Framework), Win32, MFC etc. At design
time the designer can choose which components to include in the OS image.
A component is a separate piece of functionality that can either be included
in the OS image or not. The designer uses the Platform Builder tool to build
the OS image and to download it to the target machine.

3.3.2 Device Drivers in Windows CE
A device driver for Windows CE .Net is a dynamic link library (Dll). It is
dynamically linked into another process at load time. This host process can
then use the services provided by the driver. Most drivers in Windows CE
.Net are loaded by the device manager (device.exe). The Registry¹ is used
to specify which drivers are to be loaded in the system, and in which order
(if there are dependencies across drivers the order might be important).

The interface used for communication between the OS and the driver
is defined in the C programming language. Each driver exports a set of
services and uses (imports) services from the OS to perform the service
requested. Applications access devices through the OS, for example through the
file system interface. These calls are translated to corresponding calls into the
driver.

Since Windows CE .Net is supported on many hardware platforms and
used mainly for embedded systems it supports many different peripheral
devices. Windows CE .Net supports three basic types of drivers: native, bus
and Stream interface drivers. Native drivers are built-in drivers provided
by the hardware vendors. They are typically tied to specific hardware and
OS versions, for controlling things such as keyboards, touchscreens etc. They
might use completely custom interfaces to the OS and therefore often require
changes when new versions of the OS are released. Native drivers can be seen
as extensions of the OS for the required hardware, rather than supporting
add-on devices.

The most common type of interface is the Stream interface, which
provides standard entry points for a driver. Since the interface is well specified
it allows for 3rd party developers to build drivers for the OS. Table 3.4 shows
an overview of the entry points provided. The prefix COM is by
convention used for serial drivers; other drivers use other prefixes, such as CON for
console drivers or WAV for audio wave drivers.

¹ The Registry is a Windows specific technique to centrally store
configuration information, both for the system and for applications.

Table 3.4: Stream interface for serial driver.

Number  Name
0       COM_Init
1       COM_Deinit
2       COM_Open
3       COM_Close
4       COM_Read
5       COM_Write
6       COM_Seek
7       COM_IOControl
8       COM_PowerDown
9       COM_PowerUp
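
The naming convention behind Table 3.4 can be sketched in a few lines. This is only an illustration: the helper name `entry_point_names` is hypothetical, and the `PREFIX_Name` form with an underscore is assumed from the usual Windows CE stream driver convention.

```python
# Entry points of the Stream interface, in the order used in Table 3.4.
STREAM_ENTRY_POINTS = (
    "Init", "Deinit", "Open", "Close", "Read",
    "Write", "Seek", "IOControl", "PowerDown", "PowerUp",
)


def entry_point_names(prefix):
    """Map entry point number to exported name for a given driver prefix."""
    return {
        number: f"{prefix}_{name}"
        for number, name in enumerate(STREAM_ENTRY_POINTS)
    }
```

Calling `entry_point_names("COM")` gives the names of a serial driver, while `entry_point_names("WAV")` gives the corresponding names for an audio wave driver.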

We focus mostly on Stream interface drivers, as these use a standard
interface allowing fair comparison across drivers, are typically developed by
3rd parties and represent add-on peripherals where a choice among competing
products can actually be made.


3.3.3 Hardware
The hardware setup used for the experiments presented in this thesis is
an XScale-based reference board produced by Intrinsyc Ltd [Int]. Multiple
boards were acquired to allow for parallel execution of the injection
experiments.

The boards are based on the Intel PXA250 architecture, with an Intel
XScale processor chip. Each board carries 64 MB RAM and 32 MB
Flash-based ROM. A bootloader is present in flash and allows for simple download
of new OS images, either to the ROM, or to RAM for immediate boot. A
dedicated flash chip is also present, with access from user space applications.
The boards are equipped with a set of serial port connections (standard
RS232), where one acts as debug port. Two standard Ethernet sockets
(RJ45) allow for network connection. Each board is also equipped with a
CompactFlash socket, where peripheral devices can be attached onto the
PCMCIA bus.

3.3.4 Software
Using the provided Platform Builder tool, a small-footprint image
containing the OS and the associated software modules described in Section 4.5 is
built and downloaded to the target board using Ethernet. Starting with the
smallest supported image, only components for the desired drivers and the
hardware specific components supplied by the vendor were included. This
resulted in an image with a footprint of less than 3 MB. Since the boards used
are headless, only minimal graphics and windowing components are needed.
Also media related components (e.g., readers and viewers of different file
formats) and Internet components (web, telnet, and ftp servers etc.) are left out
of the image. This way we get a system which contains a minimum number of
components that may influence the result of the experiments.

3.3.5 Selected Drivers for Case Study
Three drivers were chosen as representative for a case study: a serial port
driver (cerfioserial), a network card driver (91C111) and a driver for
accessing a CompactFlash card connected to the PCMCIA bus (atadisk). These
drivers were chosen as they represent different classes of drivers: the first
two are common types of communication and the third gives access to external
sources attached to the system. The first two are supplied, in this case, by
the vendor of the development board (not the same as the producer of the
hardware circuits), whereas the CompactFlash driver is delivered as part of
the OS. They also represent drivers typically found on many systems and
platforms. All three provide the required Stream interface entry points and
are loaded by device.exe at load time.

3.4 Summary

This chapter has introduced the preliminaries needed for the discussion in the
following chapters. The system model used was introduced and for reference
Table 3.5 provides an overview of the symbols defined. The error models used
were introduced and discussed followed by a description of the experimental
environment used, including both hardware and software aspects.

Table 3.5: Summary of symbols introduced.

Symbol   Description
APP_i    Application i out of a total n applications running on the system
D_x      Driver x from a total of N drivers in the system
s_i      A service provided by the OS, to be used by an application.
os_x.j   A service provided by the OS, used by driver D_x.
ds_x.j   A service provided by driver D_x to be used by the OS.
s_x.y    Any service in the OS-Driver interface, disregarding the
         difference between imports and exports.
S        The set of OS services provided in the OS-Application interface.
A_x      The set of OS services used by APP_x.
O        The set of services in the OS-Driver interface.

Chapter 4

Fault Injection Framework

How to perform fault injection for device drivers?

In order to provide the infrastructure and support needed to perform the
required experiments a fault injection framework has been implemented for
Windows CE .Net. The framework allows for injection of a varied set of
error models in the interface between the OS and its device drivers. This
chapter first discusses the requirements on the injection framework and then
describes the architecture of the framework, its implementation for Windows
CE .Net and the extension possibilities it provides.

4.1 Introduction

When performing research using fault injection a flexible environment is
needed, such that new ideas can easily be pursued, without extensive
redesign of the underlying tool¹. The framework should support automatic
configuration, such that management of configuration settings is simplified
and the risk of mistakes is minimized. It should also simplify the collection,
analysis and storage of data.

A plethora of fault injection tools exist in literature (see Section 2.3 for a
comprehensive list). However, even though it may have been possible to
adapt some of these tools, we decided to implement a new injection framework
to better suit our needs. For the implementation several requirements were
postulated, which allow for a flexible fault injection environment:

• Extensibility: Injection of multiple drivers, using multiple error
models should be supported. The per-driver scaffolding should be
minimized.

• Black-box: No access to source code should be assumed, to allow
external evaluators the possibility to use the tool.

• Data handling: The data extracted from the experiments should be
processed and stored without loss of information, and allow for easy
extension and interoperability with external tools.

• Automation: Automation is key to a) reduce the time overhead
associated with configuring and running fault injection experiments; b)
minimize the risk of user mistakes in configuring the experiments; and
c) more easily adapt to changes in the setup requiring configuration
changes.

These design requirements were the basis for the design of our fault
injection framework. Even though one of the goals is extensibility we have
chosen to implement the framework for a specific OS, Windows CE .Net.
Extending the framework for other OS's is part of future directions. The
rest of the chapter describes the overall design of the framework, the physical
setup used and the functionality of the system components.

¹ As our tool is flexible and supports extensions we will refer to it as a
framework for fault injection.

4.2 Evaluation, Campaign & Run
To simplify the discussion we make a distinction between an evaluation, an
injection campaign and an injection run. An injection run is the injection of a
specific error and the observation (and logging) of the effects of the injection.
An injection campaign is a collection of injection runs, pertaining to, in our
case, a specific driver, error model and interface (Dll, imported/exported).
A set of injection campaigns forms the basis for the evaluation of a system,
including possibly multiple drivers and error models.
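
The three levels can be pictured as nested records. The sketch below only illustrates the terminology; the class names, field names and outcome labels are assumptions made for the example and do not mirror the framework's actual data layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InjectionRun:
    error_id: int   # hypothetical identifier of the injected error
    outcome: str    # hypothetical label for the observed effect


@dataclass
class InjectionCampaign:
    driver: str
    error_model: str
    interface: str  # e.g. "exported", or the name of an imported Dll
    runs: list = field(default_factory=list)


@dataclass
class Evaluation:
    campaigns: list = field(default_factory=list)

    def total_runs(self):
        """Number of injection runs over all campaigns."""
        return sum(len(campaign.runs) for campaign in self.campaigns)

    def drivers(self):
        """Drivers covered by this evaluation, without duplicates."""
        return sorted({campaign.driver for campaign in self.campaigns})
```

An evaluation of one driver under two error models would then hold two campaigns, each with its own list of runs.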

4.3 Hardware Setup
The target system for the evaluation runs on a dedicated computer, the
Target Computer. As already mentioned we use special development boards
for the evaluation. The evaluator controls the evaluation from a normal PC
workstation (in our setup running Windows XP SP2).

[Figure 4.1 diagram: Host Computer, Serial connection, Ethernet Switch,
Target Computers.]

Figure 4.1: The hardware setup. Using an Ethernet switch to connect each
development board to the private network. Each board is also connected
directly to the Host Computer via serial cables.

Figure 4.1 shows the hardware setup, with two target boards connected.
Each board is connected to a private network using an Ethernet switch, which
provides dynamic IP addresses using a built-in DHCP server. Each board is
connected to the Host Computer over a debug serial connection to access the
boot menu of the boot loader and to read debug output. Additional serial
connections are used by each board's workload, as described below.

4.4 Software Setup

The binary image downloaded to a board for each experiment campaign
contains all necessary software components required for performing the
experiments. This includes all components comprising the system depicted on
the target computer side of Figure 4.2.

Platform Builder together with the Embedded Visual C++ 4.0 tool were
used to compile and build the applications and OS images used. Most of the
code for the components detailed in Section 4.5 was written in C++, or in
some cases the C programming language. Furthermore SQL Server was used
on the Host Computer (Windows XP workstation) to store the experiment
data. Applications running on the Host Computer are written either in C#
(.Net) or C++.

4.5 Injection Setup

Each injection is specified using three (integer) parameters: service ID,
parameter number and injection case number. The service IDs can be selected
in any order, as long as each service for each campaign is uniquely identified.
When storing data in the database on the Host Computer, each service
obviously has to have a globally unique identifier. Parameters (including return
values) are simply numbered as they appear in the argument list. Finally
injection cases have to be uniquely identifiable for each parameter and are
defined for the data type and error model used.
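
As an illustration of this addressing scheme, the following sketch enumerates all injections of a campaign as (service ID, parameter number, injection case number) triples. The names `InjectionSpec` and `enumerate_injections`, and the input format, are hypothetical and chosen only for this example.

```python
from typing import NamedTuple


class InjectionSpec(NamedTuple):
    service_id: int
    parameter: int
    case: int


def enumerate_injections(cases_per_parameter):
    """List every injection of a campaign.

    `cases_per_parameter` maps a service ID to a list holding, for each
    parameter in argument-list order, the number of injection cases that
    the data type and error model define for it.
    """
    specs = []
    for service_id in sorted(cases_per_parameter):
        for parameter, n_cases in enumerate(cases_per_parameter[service_id]):
            specs.extend(
                InjectionSpec(service_id, parameter, case)
                for case in range(n_cases)
            )
    return specs
```

With the bit-flip model a service with two full 32-bit parameters would contribute 64 triples to the list.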
To support extensibility and to have a flexible injection framework we
have opted to use SWIFI. No specialized hardware support is required and
injections are performed using software only. Figure 4.2 shows an overview
of the main software components of the system, showing both the target and
Host Computer. Apart from the OS itself and its drivers, the system contains
the following main modules:

• Host Computer: The main responsibility of the Host Computer is to
receive and store log messages sent by the Experiment Manager.
The Host Computer is additionally used to build and download new
OS images to the target computers.
• Experiment Manager: Responsible for setup, control and logging of
experiments. It communicates with the Host Computer, which stores
log messages.
• Interceptor: The Interceptor is a module used to intercept
communication between the OS and the targeted driver. Two types of
Interceptors are used, one for tracking calls and one for injecting errors.
• Test Applications: The workload consists of a set of test applications
exercising the system and the drivers in a multitude of ways.

[Figure 4.2 diagram: Target Computer with Test Applications, Experiment
Manager (Exp. Setup, Exp. Sync., Logging, Restarting), Operating System,
Interceptor and Target driver; Host Computer with Echo server and Logging
server.]

Figure 4.2: An overview of the experimental setup.
The Injector and the Experiment Manager interact to coordinate each
injection run and to send log messages to the Host Computer. Similarly, the
test applications report their progress to the Experiment Manager, which
forwards log messages. Information between modules is exchanged using
message queues, a message passing primitive native to Windows CE .Net.

The components of the target computer are built into a new OS image
as described in the previous section. The OS image is downloaded to the
onboard flash memory and loaded into RAM each time the OS (cold) boots.

A possible risk when conducting fault injection is the presence of dormant
faults, i.e., faults from previous injections that are left dormant in the system.
This can lead to unpredictable and non-reproducible results, as the outcome
of an injection may be affected by such dormant faults. To minimize this
risk the system is (cold) restarted between each injection, resulting in a
fresh copy of the OS image being loaded for each injection run. Logs are
sent and stored on a different machine (the Host Computer) and a minimal
set of configuration information is stored in flash memory. This process is
conservative and common for fault injection experiments, e.g., [Chillarege
and Bowen, 1989; Gu et al., 2003]. However, the restart before each
injection incurs a substantial run-time overhead when errors are masked or
overwritten.

The process of producing an injection-ready OS image is illustrated in
Figure 4.3. First the binary of the original driver is scanned to identify
exported and imported services. Together with information in system header
files and the online documentation the Interceptor module is constructed (A
in Figure 4.3). For intercepting imported functions the binary of the
original driver is modified to import the functions from the Interceptor module
instead of the original services (B in Figure 4.3). For exported services the
system is configured to use the Interceptor module instead of the original
driver, by modifying the configuration of the loading process of drivers in
the system Registry (C in Figure 4.3). Lastly the (modified) driver,
Interceptor, configuration and other system components are merged into a single
OS image to be downloaded onto the target computer (D in Figure 4.3).

[Figure 4.3 diagram. Steps: A. Build specialized interceptor wrapper;
B. Modify driver to use wrapper; C. Modify Registry; D. Build new OS image.
Elements: Driver, Modified Driver, Header Files,
Specification/Documentation, Interceptor, Registry configuration,
Experiment Manager, Test Applications, Injection image.]

Figure 4.3: Building an OS image for injection.

4.5.1 Experiment Manager
The Experiment Manager runs as a separate process on each board and
is responsible for setup, monitoring and logging. The parameters needed to
configure the system are set up using the system Registry, which is part of the
binary image of the OS built offline. These settings remain static throughout
the experiment. Dynamic information concerning which injections have taken
place and the configuration for the next injection is stored in a plain text
configuration file in persistent (flash) storage on the device.

At boot time the Experiment Manager starts by setting up a connection for
sending log messages to the Host Computer. From this point on log messages
can be sent to the Host Computer. Depending on the targeted driver either
serial or Ethernet communication is used for this connection.

Next it reads the configuration file to find out which service is to be
targeted for the next injection (see Figure 4.4). When the injection data has
been read it is marked as pending and the change is flushed to be made
persistent. Once an experiment has finished, and before rebooting, the
pending flag is changed to finished, indicating that the Experiment Manager could
reboot the system in a clean way. This simple mechanism allows us to
detect hangs and unclean reboots by the system. If, at boot time, a pending
flag is found for the next injection a special log message is sent with this
information.

The Experiment Manager then waits for the Interceptor to send a "Ready
for injection" message, after which it sends the injection data (service,
parameter and injection case). The Interceptor then handles the rest of the
actual injection.

After an injection has been configured the test applications are started
and monitored. Each test application updates the Experiment Manager on
any assertion violation. Their exit status is also monitored by the Experiment
Manager and if they exit abnormally this is logged. If they have not exited
within a given time period (about three times normal execution time) they
are considered hung/crashed and this fact is logged.
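
The pending/finished flag mechanism described above can be sketched as follows. This is a simplified illustration and not the Experiment Manager's implementation: the JSON file layout, the status names `todo` and `aborted`, and the function names are assumptions. The thesis only states that a plain text file in flash is used and that a pending flag found at boot time is reported.

```python
import json
import os


def load_state(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_state(path, state):
    """Write the state and force it to persistent storage."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())


def boot(path, log):
    """Run at boot: report unclean reboots, then mark the next injection.

    Returns the id of the injection to perform, or None if none is left.
    """
    state = load_state(path)
    for entry in state:
        if entry["status"] == "pending":
            # A pending flag at boot means the previous run never finished.
            log(f"unclean reboot during injection {entry['id']}")
            entry["status"] = "aborted"  # assumption: do not retry
    nxt = next((e for e in state if e["status"] == "todo"), None)
    if nxt is not None:
        nxt["status"] = "pending"
    save_state(path, state)
    return nxt["id"] if nxt is not None else None


def finish(path, injection_id):
    """Run just before a clean reboot."""
    state = load_state(path)
    for entry in state:
        if entry["id"] == injection_id:
            entry["status"] = "finished"
    save_state(path, state)
```

If the system hangs or reboots without `finish` having been called, the next call to `boot` finds the entry still marked as pending and logs it.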
Once the outcomes for each of the test applications are known the system
is automatically cold restarted. The cold restart ensures that any data left
in RAM is replaced and that a clean image of the OS is read back in from
flash storage. For the case when an error causes the system to not respond
to the Experiment Manager's attempts to reboot the system, a dedicated
reboot process is used. This process is started automatically at boot time
and simply tries to reboot the system after a specific timeout has triggered
(currently four minutes). To define such a timeout is common practice for
fault injection [Arlat et al., 1990; Durães and Madeira, 2006]. If this also
fails a manual hardware reset is required by the evaluator.

4.5.2 Host Computer
The Host Computer is used to manage and support the experiments. The
Host Computer runs a set of experiment servers. Each server is responsible
for communicating with one Target Computer. The server can be configured
to respond both on network and serial communication. It is responsible for
both echoing test application data sent as part of the workload on each Target
Computer and for receiving log messages. It also keeps a timer for each log
stream and if no message is detected within a given time the operator is
alerted that the board may have hung, requiring a hardware reset.
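
The per-stream timer can be sketched as a small bookkeeping class. The class name, its methods and the explicit `now` argument are assumptions made for this illustration; the actual servers are written in C# or C++ and their design is not described in more detail here.

```python
class LogStreamMonitor:
    """Tracks when each board last sent a log message."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def message_received(self, board, now):
        """Record the arrival time (in seconds) of a log message."""
        self.last_seen[board] = now

    def silent_boards(self, now):
        """Boards whose log stream has been quiet for longer than the timeout."""
        return sorted(
            board
            for board, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

A server loop would call `silent_boards` periodically and alert the operator for every board returned.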
Log messages are stored sequentially in a text file, one file per managed
board. The operator can also add custom log messages to the log file, for
instance that a board was manually restarted. The log files are processed
off-line and the results are stored in a relational database. Currently we use
SQL Server 2005 from Microsoft, but other databases could be used as we do
not rely on specific functionalities of the underlying server. Header files are
processed to match services to function names for easier handling of the log
data. The use of a database significantly improves working with the data.
Queries can be posed in a structured way and saved for later use using the
SQL query language, in our case Transact-SQL [TSQ].

Experiments are classified into failure classes as presented in Section 5.2.
This is done automatically on the data stored in the database for each
experiment. Failure classes are defined as disjoint predicates on the experiment
data, and are implemented as views in SQL. This is a very flexible approach
as improvements to the failure classification scheme can be introduced very
quickly by modifying the SQL definitions for each view. It is also very easy
to run consistency checks on the data to check that the failure classes are
indeed disjoint and complete (each experiment is in one, and only one class).
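
The idea of failure classes as disjoint predicates, together with the consistency check, can be sketched outside SQL as well. The three classes and the record fields below are placeholders invented for this example; the classes actually used are those of Section 5.2, and in the framework they are implemented as SQL views.

```python
# Hypothetical failure classes, each a predicate over one experiment record.
FAILURE_CLASSES = {
    "no_failure": lambda e: not e["app_error"] and not e["hung"],
    "application_error": lambda e: e["app_error"] and not e["hung"],
    "system_hang": lambda e: e["hung"],
}


def classify(experiment, classes=FAILURE_CLASSES):
    """Return the single class an experiment belongs to."""
    matches = [name for name, pred in classes.items() if pred(experiment)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one class, got {matches}")
    return matches[0]


def check_partition(experiments, classes=FAILURE_CLASSES):
    """Return the experiments matching zero or more than one class."""
    return [
        e for e in experiments
        if sum(1 for pred in classes.values() if pred(e)) != 1
    ]
```

An empty result from `check_partition` corresponds to the classes being disjoint and complete on the recorded data.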

4.5.3 Interceptors
Communication between the targeted driver and the OS is tracked using
Interceptor modules. There are two types of Interceptors, trackers and
injectors. The Tracker is used only in Chapter 7 to track the calls made to a
driver. The Injector is used to perform the actual injection.

Each service exported and imported by a driver is a function call to/from
a dynamic link library (Dll). As previously described in Section 3.2.1 some
error models require that the data types of the parameters used in function
calls are tracked. Tracking the data type used serves two purposes: first
and foremost it is used to select the specific error to inject, but it is also used
to reduce the number of injections for error models based on bit strings (like
the BF model described in Section 3.2.1) by restricting injection to the bits
used.

The C language does not provide any reflective mechanisms whereby the
data type of a parameter can be discovered at run-time. This information is
pertinent for data type-based injection. For most functions the information
is available in the form of header files present on the system. In some rare
cases online product documentation is used to resolve parameter definitions,
in this case Microsoft's online developer documentation [MSDN].

Injection is done on one interface at a time. Exported functions are
targeted separately from imported functions, thereby defining a single
injection campaign. Functions from one imported Dll are targeted separately.
The injection module wraps the driver and acts as a "trojan horse" to both
the driver and the OS, by imitating the behavior of the other party.
Similar strategies have been used in previous fault injection tools, for instance,
[Segall et al., 1988].

For each injection campaign (driver/Dll) a separate Injector module is
built, where an injection wrapper is built for each function in the service
interface. The Injector interacts with the Experiment Manager and activates
the targeted injection wrappers and can be configured to make multiple
wrappers active for an injection run. However, for the experiments carried out only
one was made active and the non-activated wrappers act as passthroughs,
without touching the parameter values used.
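
The behavior of a single injection wrapper can be sketched as a callable that forwards every call and, when activated, corrupts one parameter first. This is a conceptual illustration only: the class and method names are hypothetical, and the one-shot deactivation models a transient error, which is an assumption of the sketch rather than a description of the C++ wrappers.

```python
class InjectionWrapper:
    """Wraps one service; passes calls through unless an injection is armed."""

    def __init__(self, target):
        self.target = target
        self.active = None  # (parameter index, corrupting function) or None

    def activate(self, parameter, corrupt):
        """Arm the wrapper to corrupt one parameter of the next call."""
        self.active = (parameter, corrupt)

    def __call__(self, *args):
        if self.active is not None:
            index, corrupt = self.active
            args = list(args)
            args[index] = corrupt(args[index])
            self.active = None  # transient error: inject only once
        return self.target(*args)
```

A wrapper that is never activated behaves exactly like the wrapped function, which is the passthrough case.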
To make sure that the injection framework itself does not modify or perturb
the behavior of the system each experiment campaign starts with an error-free
run, i.e., a run where all wrappers are in place but act as passthroughs. This
run allows the evaluator to verify that communication with the Host
Computer is set up properly and that no unexpected problems have arisen, before
any actual experiments are carried out. During this error-free run the system
is profiled to minimize the number of injections that need to be carried out.
This process is further detailed in Section 4.6.

When targeting imported services the binary of the driver is modified.
The binary format of drivers on Windows CE .Net follows the Portable
Executable (PE) format [Mic, 2006], where Dlls being dynamically linked to
the driver are specified together with the services used. By modifying the
name of the library being linked, the Interceptor Dll can be linked instead
of the original Dll. A dedicated application has been implemented for
performing the modification of the binary image of a driver. It runs on the Host
Computer and is used before building a new OS image. The Interceptor is
implemented to export all functions that the driver uses in the original Dll.
The Interceptor then in turn loads the original Dll and can pass any calls
along.

The Injector can work in three modes: a) for testing purposes it can act
as a complete passthrough, b) it can wait for injection instructions from the
Experiment Manager and then activate the appropriate injection wrapper,
and c) it can build all injection wrappers with passthrough functionality and
query their injection cases. The latter mode allows the creation of injection
cases on the fly. This is important, as when a new error model is
implemented/enhanced or when new functions are wrapped, the injection cases
are automatically generated without human assistance, saving time and
reducing the probability for mistakes. The two latter modes are illustrated in
Figure 4.4.

In Figure 4.4, when the system boots it checks for the presence of a
configuration file, specifying which errors are to be injected and which injections
have already been performed, as previously described. If one exists it is read
and the Interceptor builds the required injection wrapper. Next the
Interceptor sends the "ready for injection" message to the Experiment Manager,
which then starts the test applications (workload). The Interceptor is now
ready to inject the error when the specified trigger fires. When injection is
finished (intermittent and permanent error models may inject multiple
errors) the system waits for the test applications to exit before updating the
configuration file and then rebooting. Note that if the system is still
prepared to inject errors when the test applications exit the system is rebooted
anyway, else permanent errors, or errors not being triggered by a certain
workload, lead to livelock (the system waiting for the error to be triggered,
which will not happen).
In parallel to the injection process a watchdog timeout process is started
which reboots the system after a set timeout from boot time has elapsed. The
purpose of the watchdog is to reboot the system in case the rest of the system
fails due to an error and the normal reboot step is not reached. Currently
the timeout is set to 200 seconds, more than twice the normal execution time
of the system.

Building injection wrappers is currently a manual process, but since the
information required is (mostly) available in parsable header files an
automatic processing is possible, similar to the approach used in [Fetzer and Xiao,
2002b]. As the wrapper only needs to be implemented once for each
function, and can thereafter be used for any error model, the approach still scales
reasonably well. However, for a large scale deployment of the approach this
step needs to be automated as much as possible.

[Figure 4.4 flow chart. Boot, then "Config exists?".
If a configuration exists: Read Error, Build injection wrapper, Start
workload, Trigger?, Inject error, Finished injecting?, Workload finished?,
Update config, Reboot.
If not: Build all injection wrappers, Start workload, Record profile,
Workload finished?, Write config.
A "Timeout?" check runs in parallel.
Legend: Start, Condition, Wait-for, Action.]

Figure 4.4: An overview of the injection process.
Specifying Error Model

Each Injector builds an in-memory data structure (a C++ object) containing
information regarding the data types used, and pointers to the values passed
for the targeted service. For each service one such object is built. The data
types used for a specific service are hard coded into the code. This is no real
limitation as the information is static (function definitions) and only needs to
be defined once for each function. The error model is specified using a plugin
model, where a model is specified using the error type, timing (duration) and
trigger, analogously to the description in Section 3.2.
For each parameter targeted for injection the data type needs to be
recorded and the error selected accordingly. The data tracking mechanism
is implemented in C++, but the interface between device drivers and the
OS is defined using C. We make a distinction between three major classes of
parameter types, namely:

• Basic types - The basic types include the built-in types provided
  by the programming language, like integers and booleans, as well as
  specialized types where it makes sense to use specific injection cases,
  for instance HKEY representing a handle to a Registry key.
• Structures - This is the struct type used in C. It contains a set of
  members which themselves could be of any of the three classes.
• Pointers - These are references to other values, which could be basic
  types, structures or other pointers.

The tracking mechanism is built around the concept of a Parameter.
A Parameter is of any of the three classes: basic type, structure or pointer.
Figure 4.5 shows the relation between the three classes of types found.
Structures and pointers refer to other parameter values, which in turn can be any
of the three classes.
The three classes are implemented as C++ classes inheriting from a
Parameter base class. For each data type tracked a new specialized class is
implemented. A Parameter provides methods for specifying error models,
querying for injection cases and injecting errors. Structures and pointers
furthermore contain methods for adding members and references.
Figure 4.5 illustrates our implementation of the data tracking mechanism.
For each targeted parameter an object is defined, whose class inherits from
the Parameter class. Using inheritance, new data types can easily be added,
provided they inherit from the Parameter class and implement the required
methods for injecting errors etc.
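
The class hierarchy described above can be sketched as follows. The thesis
implementation is written in C++; this Python version is only an illustration,
and the method names (`set_error_model`, `injection_cases`, `add_member`,
`add_reference`) as well as the `bit_flip_model` plugin are hypothetical. For
brevity a pointer here only forwards to its target; corrupting the pointer value
itself is not modelled.

```python
class Parameter:
    """Base class: every tracked value can list its injection cases."""
    def __init__(self, name):
        self.name = name
        self.error_model = None

    def set_error_model(self, model):
        self.error_model = model

    def injection_cases(self):
        raise NotImplementedError

class BasicType(Parameter):
    def __init__(self, name, bits):
        super().__init__(name)
        self.bits = bits

    def injection_cases(self):
        # The error model plugin decides which cases apply to this type.
        return [(self.name, case) for case in self.error_model(self)]

class Structure(Parameter):
    def __init__(self, name):
        super().__init__(name)
        self.members = []

    def add_member(self, member):
        self.members.append(member)

    def set_error_model(self, model):
        super().set_error_model(model)
        for member in self.members:
            member.set_error_model(model)

    def injection_cases(self):
        return [c for m in self.members for c in m.injection_cases()]

class Pointer(Parameter):
    def __init__(self, name, target=None):
        super().__init__(name)
        self.target = target

    def add_reference(self, target):
        self.target = target

    def set_error_model(self, model):
        super().set_error_model(model)
        if self.target is not None:
            self.target.set_error_model(model)

    def injection_cases(self):
        return [] if self.target is None else self.target.injection_cases()

def bit_flip_model(param):
    """Hypothetical plugin: one injection case per bit of a basic type."""
    return [f"flip bit {b}" for b in range(param.bits)]
```

A wrapper for a function taking a pointer to a two-member structure would
build one `Pointer`, one `Structure` and two `BasicType` objects, set the
error model once on the outermost object, and query it for all cases.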
As previously explained in Section 4.5.3 an injection wrapper is built for
each targeted function. The wrapper builds a model of the interface using the
classes explained above together with information regarding the error model.
Using this information, the Experiment Manager can query the wrapper for
the injection cases possible for the given error model. This enables automatic
configuration and generation of injection cases, illustrated in the right branch
of Figure 4.4.

[Figure 4.5: class diagram. Parameter is the base class of Basic type,
Structure and Pointer. Examples of basic data types: int, unsigned int, char,
unsigned char, void, short, unsigned short, HKEY, bool, wchar_t.]

Figure 4.5: Data type tracking mechanism.

The plugin model makes the injection framework considerably more flexible,
compared to hard coded injections on a service/driver basis. Once the
injection wrapper is defined several injection campaigns can easily be performed
by specifying different error types, durations and trigger models. The
error type, duration and trigger are specified as keys in the Registry, which
are extracted by the Experiment Manager at boot time. The injection wrapper
is then instructed to build the corresponding injection object online. This
facilitates a very flexible architecture that automatically extracts the injection
cases to be performed. The number of injections required is extracted
from the error type object. The first time the system is booted all injection
objects are built and the injections to be performed are stored in a file.
Three error type plugins have been implemented (BF, DT and FZ), but
additional models can easily be implemented, including timing errors (delays).
Three error durations are implemented: transient (occurs only once),
intermittent (occurs x times) and permanent. As previously described only
the transient model has been evaluated. The triggering mechanisms implemented
include first-occurrence, call block-based and timeout-based models.
One can also specify more advanced triggering mechanisms, where errors are
triggered after x calls to a service or only after calls to a certain (other)
service.
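
How a trigger and a duration combine at one injection point can be sketched
as below. This is an illustration under stated assumptions, not the framework's
code: the class and parameter names are hypothetical, the trigger is the simple
"after x calls" variant, an intermittent error is assumed to hit x consecutive
calls, and a single bit flip is used as the corruption purely as an example.

```python
class InjectionPoint:
    """Wraps one parameter of one service call (hypothetical sketch)."""
    def __init__(self, corrupt, trigger_call=1, max_injections=1):
        self.corrupt = corrupt
        self.trigger_call = trigger_call      # inject from this call onwards
        # 1: transient, x > 1: intermittent, None: permanent
        self.max_injections = max_injections
        self.calls = 0
        self.injected = 0

    def on_call(self, value):
        """Return the value the callee should see for this invocation."""
        self.calls += 1
        armed = self.calls >= self.trigger_call
        budget_left = (self.max_injections is None
                       or self.injected < self.max_injections)
        if armed and budget_left:
            self.injected += 1
            return self.corrupt(value)
        return value

def flip_bit(bit):
    """Corruption function that inverts one bit of an integer value."""
    return lambda value: value ^ (1 << bit)
```

With `trigger_call=1` and `max_injections=1` this reduces to a transient error
on the first occurrence of the call.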


4.5.4 Test Applications

The workload for a robustness evaluation is typically the set of user
applications running on the system, together with their inputs. The purpose of
the workload in this context is two-fold: a) to drive the use of the system
and make sure that all relevant parts of the OS are used, including devices
attached to the system; and b) to detect any robustness violations in the
OS services used.
When the robustness evaluation is carried out on a finished product (or
prototype) the workload should as closely as possible mimic the use of the product
in its operational setting. However, for generic components, like an OS,
information on operational settings might be unknown. For such cases generic
workloads are typically used. Examples include standard performance benchmark
applications [Barton et al., 1990; Kanawati et al., 1992; Carreira et al.,
1998]. The same problem arises when testing components, which may be
reused in the future. The future uses may not match the operational profile
available at development time and thus testers are forced to anticipate
"typical" use of the component [Weyuker, 1998].
Another approach is to test each service individually by defining a test
harness simulating operational conditions. The harness needs to set up any
specific context (such as held resources, open files etc.) needed for the service
to be called in a realistic setting. This approach was used for instance
in Ballista [Koopman and DeVale, 1999] and HEALERS [Fetzer and Xiao,
2002b]. We have opted to use a realistic workload instead, thus avoiding the
problem of defining appropriate context scenarios.
The workload used in this thesis consists of a set of test applications. The
purpose of the test applications is to exercise a wide variety of OS services and
specifically the device drivers evaluated. Each test application is enhanced
with additional assertions that verify that each call to an OS service returns
the expected result. The expected result is derived from the documentation
of the OS and a golden run of the application. The test applications are kept
as simple as possible to make them deterministic. This allows assertions to
be manually inserted into the code of each test application to verify the
correctness. Three types of test applications are used, using the OS in different
ways:

[Memory Management]: The application allocates memory and accesses it.
The memory is then released.

[File System Operations]: Normal text files are created and opened.
Some text is written to the file and read back. File attributes are
set and checked. The file is finally deleted.

[Driver Specific]: The driver specific test application uses the driver by
issuing driver specific operations.

For each driver tested a specific test application is built to test the driver's
functionality. For a network card driver packets are sent and received using a
connection to the Host Computer. Similarly, the serial driver reads and writes
on the serial port connected to the Host Computer. Specific consistency
checking assertions are added to check for any errors in the received echo
strings. Similarly, the echo server on the Host Computer also checks for
incomplete or otherwise erroneous messages. The CompactFlash driver is
tested in a similar fashion to the file system tests above.
To get a consistent system for all injections, and to establish a common
ground for comparing drivers, all driver specific test applications are executed
for each test, even when the specific driver is not targeted.

4.6 Pre-Profiling

A key concern with any fault injection is to keep the level of activation
for the injected faults as high as possible, i.e., as many as possible of the
faults should be activated and become errors. Since typically the goal of
fault injection is to exercise the system's fault/error handling mechanisms
and observe how it behaves in the presence of faults, the activation rate is a
measure of how effectively these mechanisms are tested. Note that the goal
of injecting faults can also be to assess the activation rate itself, i.e., how
easily faults are activated and become errors. However, this is not the case
in this thesis. In general, one wants to achieve an acceleration of the fault
→ error → failure process [Chillarege and Bowen, 1989].
Robustness evaluations of large systems, such as OS's, do not target
evaluation of specific fault tolerance mechanisms. Therefore, one cannot generally
know if a non-activated injected fault is still dormant, has been overwritten,
or detected and corrected. To cope with this an experimental timeout is set,
after which the fault is declared to have disappeared, either by being
overwritten or handled by the system. The timeout needs to be long enough to
justify this assumption, but short enough to make experimentation feasible.
A high activation level naturally helps speed up the experimental process,
as fewer experiments need to run until the timeout elapses. For the
experiments presented in subsequent chapters a timeout of four minutes was used.
This was set to be more than 100% longer than the execution time of the
test applications in fault-free scenarios. Initial experiments with significantly
longer timeouts did not reveal any dormant faults surfacing.


By injecting errors at a high-level interface which is readily accessible
one can achieve a 100% activation ratio of injected errors, by employing a
pre-profiling stage before the experiments start. This stage profiles the
component in question and records each invocation made in the interface.
This information is then used to filter out any injections that would never
take place (because the function is not used). This technique assumes a
deterministic invocation pattern in the sense that the same set of services
is invoked for each run of the system. This property is also required to get
repeatable results, an important aspect of system evaluation. The use of test
applications that give rise to a deterministic workload makes this assumption
justified and indeed for the experiments carried out for this thesis, no such
deviations were observed. However, it is important to re-profile the system
whenever changes are made, especially regarding the test applications, as they
might give rise to new invocation patterns for the drivers.
For the injection experiments carried out a pre-profiling stage is run for
each experiment campaign. The first time the system boots the test
applications are executed as they are when errors are injected. In this case, an
invocation profile of the targeted driver is automatically collected instead of
an error being injected (right branch of Figure 4.4). Based on the services
that are marked as used the injection cases generated can be filtered, such
that errors are only injected for services that are actually used. By
automatically performing the profiling for each new configuration any changes made
to the system configuration are always considered.
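
The filtering step can be sketched in a few lines. The fragment below is an
illustration only: the record layout (a list of invoked service names, and
injection cases as dictionaries with a `service` key) and the service names used
in the example are assumptions, not the framework's actual data format.

```python
from collections import Counter

def collect_profile(invocation_log):
    """Count how often each service was invoked during the profiling run."""
    return Counter(invocation_log)

def filter_cases(cases, profile):
    """Keep only injection cases whose target service was actually invoked."""
    return [c for c in cases if profile[c["service"]] > 0]
```

For example, with a profile built from the log `["svc_open", "svc_read",
"svc_read"]`, a case targeting the never-invoked `svc_ioctl` is dropped,
while cases for `svc_open` and `svc_read` are kept. The stored counts are
also available for trigger models that depend on the number of invocations.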
In addition to recording which services are invoked, the number of
invocations is also stored. This is used to further eliminate non-activated
injections when investigating the time of injection in Chapter 7.
In the DTS tool a similar approach to ours is used [Tsai and Singh,
2000]. Library calls made by an application are targeted. Only calls actually
performed are targeted, reducing the potentially large set of functions
considerably. We use the same strategy, reducing the number of targeted functions
by 36.7% on average. Table 4.1 shows the number of services specified and
the number of services actually used. The table shows that many services
are not used, indicating that more complex workloads may potentially cause
more services to be targeted and thus more injections to be performed.
In [Süßkraut and Fetzer, 2007] static analysis of library code is performed
to reduce the number of injections. Using DT injections similar to ours, the
injection cases to use are selected depending on how each parameter might be
used. This information is found by statically analyzing the code to find out
which (other) library functions are called for the targeted function and based
on this restrict the number of injections. This does not limit the number of
services targeted, but the number of injection cases required for each function
parameter.

Table 4.1: The number of services specified in the OS-Driver interface and
the number of services used for the specified workload.

Driver          Specified   Used   [%]
cerfio_serial   60          46     76.7
91C111          54          26     48.1
atadisk         47          30     63.8

4.7 Summary of Research Contributions

This chapter presents the injection framework used for performing the fault
injection experiments performed for this thesis. The framework is implemented
for Windows CE .Net, specifically focusing on extensibility and automation.
Using a plugin model three error models have been implemented
and the framework is easily extended to include more error models. Several
aspects of the process have been automated, to minimize the risk of
user mistakes and to expedite experimentation. The framework provides the
following benefits for the evaluation of OS robustness:

• A flexible and extensible plugin model for easy adoption of new error
  models.

• A low-intrusion, black-box injection methodology, not requiring access
  to source code.

• A high degree of automation, not requiring manual actions to be taken
  when changes to the used error model or workload are performed.

• An efficient selection of injection cases, eliminating injections that
  would not have been activated for the used workload.


Chapter 5

Error Propagation in Operating Systems - Where to Inject

How to measure error propagation in OS's? What are quantifiable
measures of error propagation?

That software components contain faults that lead to undesirable behaviors
in the form of errors and failures is a fact that will not disappear in the
foreseeable future. In a system design it is therefore important to quantify
how such errors may affect the system as a whole. An important aspect of
this is how errors spread throughout the system, i.e., how they propagate.
Another aspect is the effect they have, i.e., the failure modes they give rise
to. Both are important as they allow a designer to a) quantify potential
dependability bottlenecks in the system, b) design higher
quality components and c) guide the addition of enhancements in the system
in an effective manner.
This chapter introduces a series of measures that can be used to quantify
error propagation, RQ2 - quantifiable measures of robustness. They are
based on the previously described system model, and show that introducing
errors in the interface between device drivers and the OS can be used to
study how the OS treats faulty drivers. As such it also considers RQ3 - the
question of where to inject errors.


5.1 Introduction

This chapter defines a framework for the evaluation of error propagation
with respect to robustness in OS's. As detailed in Chapter 1 an important
aspect when increasing the robustness of a system is identifying potential
sources and sinks for error propagation (RQ1). The goal of this chapter is
to define measures that help identify such services. To achieve this we
use failure mode analysis and four separate measures: Service Error
Permeability, Service Error Exposure, Service Error Diffusion and Driver Error
Diffusion. After their definition a discussion on their use and its implications
is presented.
As previously discussed, the use of the OS-Driver interface for measuring
error propagation and effect is suitable for many reasons, such as portability
across drivers, low intrusion (as no source code changes are required) and
the fact that it allows injection of multiple driver faults. The measures
presented in this chapter are defined for errors appearing at this level, and
therefore further substantiate the choice of error location, i.e., answer the
question of where to inject errors.
This chapter presents the analytical foundation upon which the following
chapters build. Further discussion on using the framework presented here
and its implications from a quantitative point of view will follow in
subsequent chapters.

5.2 Failure Mode Analysis

Robustness evaluation is similar to Failure Mode and Effect Analysis
(FMEA), a well known technique in reliability engineering [Leveson, 1995].
In FMEA the failure modes of individual components are postulated and
their effect on other components and the whole system derived. Robustness
evaluation differs, as it is experimental in nature, treating a system not as a
static entity, but a dynamic system in context.
Failure modes for use in robustness evaluation are typically postulated
beforehand (they may be iteratively refined of course) as a set of failure
modes, or classes. For safety critical systems, FMEA may also be performed
to investigate which failure modes the system or component shows in operation.
Even though the techniques developed here may help towards this goal
as well, it is not the prime focus of this thesis.
The failure severity scale used in this thesis is similar to several previous
severity scales. Siewiorek et al. used a five grade scale for their robustness
benchmark [Siewiorek et al., 1993]. Many fault injection tools have used
similar scales as well, like MAFALDA [Arlat et al., 2002; Rodriguez et al.,
2002], NFTAPE [Gu et al., 2003] and others, for instance [Chillarege and
Bowen, 1989; Barton et al., 1990; Marsden and Fabre, 2001; Durães and
Madeira, 2006]. As a representative example of failure modes defined from
an application perspective, the CRASH severity scale presented in [Koopman
et al., 1997] is shown in Table 5.1. The API of the OS is tested by creating
a specific task that calls the targeted function and the outcome is classified
according to the CRASH scale.

Table 5.1: The CRASH severity scale from [Koopman et al., 1997].

Failure Mode    Description
Catastrophic    System crash
Restart         The task is hung and requires a restart
Abort           The task terminates abnormally
Silent          No error report is generated by the OS, even
                though the operation tested cannot be performed
                and should generate an error
Hindering       Incorrect error returned

The failure classes used in this thesis are listed in Table 5.2. We use the
term failure class as each failure class may correspond to multiple failure
modes depending on the desired level of granularity. The chosen classes
represent generic classes that apply to general purpose systems. The failure
classes are defined to be disjoint, such that the outcome of an experiment can
be unambiguously determined to be a member of a specific class. Whenever
the outcome fits the description of more than one class it is assigned the more
severe one. For instance, an error that first causes an application to crash
and then the rest of the system would only be considered in the latter class.
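
The "most severe class wins" rule can be written down directly. The sketch
below assumes that the classes of Table 5.2 are ordered NF < Class 1 < Class 2
< Class 3 and that an experiment yields a list of observed class labels; the
label strings and function name are choices made for this illustration.

```python
# Severity order assumed from Table 5.2 (higher number is more severe).
SEVERITY = {"NF": 0, "Class 1": 1, "Class 2": 2, "Class 3": 3}

def classify(observed):
    """Return the single failure class assigned to one experiment."""
    if not observed:
        # Nothing visible was observed: the No Failure class applies.
        return "NF"
    return max(observed, key=SEVERITY.__getitem__)
```

An experiment in which an application first crashes (Class 2) and the OS then
hangs (Class 3) is thereby recorded once, as Class 3.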

Table 5.2: The failure classes used.

Failure Class   Description
Class NF        When no visible effect can be seen as an outcome of
                an experiment, the No Failure class is used. This
                indicates that the error was either not activated or
                was masked by the OS.
Class 1         The error propagated, but still satisfied the OS service
                specification as defined in the documentation.
                Examples of Class 1 outcomes are when an error
                code is returned that is a member of the set of
                allowed codes for this call or if a data value was
                corrupted and propagated to the service, but did
                not violate the specification.
Class 2         The error propagated and violated the service
                specification. Examples in this category are the return of an
                unspecified error code, or a call that directly causes the
                application to hang or crash while other applications
                in the system remain unharmed.
Class 3         The OS hung or crashed due to the error. If the
                OS hangs or crashes, no progress is possible. For
                a crashed OS, this state must be detected by an
                outside monitor unless this state is automatically
                detected internally and the machine is rebooted.

5.3 Error Propagation

Error propagation in software happens when a fault is activated (becomes
an error) and then subsequently used in a computation, leading to a new
error at a different location [Lee and Iyer, 1993; Voas et al., 1996]. As an
example, consider a faulty line of code where the wrong value is assigned to an
integer variable. This value is read and used in a condition statement and
the wrong decision is taken, leading to a set of statements being executed in
error. The error has now propagated from the assignment to another part of
the component. The error might continue to propagate and may propagate to
other components by function calls, message passing, shared memory areas
etc. Eventually, the error might propagate to the outputs of the system,
there causing a failure.
Knowing which errors propagate and where is important because it
enables countermeasures to be taken. A general design principle in the design
of dependable systems is the concept of error containment, i.e., that a
component masks any errors, not exposing interacting components to propagating
errors [Pradhan, 1996]. For software this is difficult to realize in practice for
every type of error. Instead, the failure modes and the propagation paths
need to be found, such that their impact can be characterized.
At least three main uses for knowledge regarding error propagation can
be envisioned:

• Identifying robustness bottlenecks in the system. Some components may
  be more likely to spread errors or more likely to be the sink for errors
  propagating in the system. These components should be the focus of
  other verification and validation efforts.
• Exposing flaws and their impact on system dependability. Error
  propagation may reveal real flaws in the software and may thereby assist in
  designing higher quality components. The impact of such flaws can
  be characterized using for instance failure mode analysis, which helps a
  designer focus attention on the components which cause severe damage.
• Locating error detection and recovery mechanisms. The error propagation
  paths identify locations where specific error detection and recovery
  may be added. By placing them along such paths their effectiveness is
  increased.

The framework presented in this chapter aims at all of these three goals.
A discussion is provided on how the measures defined can help achieve this.
Chapter 6 presents an implementation of these measures on a real OS and
discusses their suitability.

5.3.1 Failure Class Distribution

The simplest way to compare a system's ability to withstand errors in drivers
is to compare the number of severe failures the system incurs as a result of
injected errors. Since the number of failures depends on the chosen error
model, i.e., the number of injections performed, one can use the ratios of
failures in the different failure classes, the failure class distribution.
The failure class distribution highlights key differences between drivers
and gives a fast overview of the ability of different drivers and/or error models
to provoke failures in the system. However, when more detailed results and
guidance are needed more refined measures should be used, such as the ones
presented in the next section.
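
As a small worked example, the distribution is simply the per-class share of
all experiment outcomes. The sketch below assumes one class label per
experiment; the label strings and the example counts are illustrative and are
not results from the thesis.

```python
from collections import Counter

CLASSES = ("NF", "Class 1", "Class 2", "Class 3")

def failure_class_distribution(outcomes):
    """Map each failure class to its share of all experiment outcomes."""
    if not outcomes:
        raise ValueError("no experiment outcomes given")
    counts = Counter(outcomes)
    total = len(outcomes)
    return {cls: counts[cls] / total for cls in CLASSES}
```

Because the result is a ratio, two drivers (or two error models) with different
numbers of injections can be compared side by side.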

5.3.2 Error Propagation Measures

In the context of this thesis we are interested in how errors in device drivers
propagate throughout the system. To do this, we need to clearly specify the
observation points where error propagation is measured. Errors are injected
in the interface between the driver and the OS. Observations are made from
a user perspective by observing the behavior of user-space applications. This
gives us the ability to characterize the relation between drivers' ability to
spread errors and an application's use of OS services.

[Figure 5.1: four panels (a) to (d), each showing applications APP1 ... APPn
on top of the Operating System, which in turn sits on top of drivers
D1 ... D2 ... DN.]

Figure 5.1: The four propagation measures introduced: a) Service Error
Permeability, b) Service Error Exposure, c) Service Error Diffusion, and d)
Driver Error Diffusion.

With the intent of finding robustness bottlenecks in the system three main
goals are identified: a) to identify services in the OS-Application interface
that are the likely sinks for propagating errors, b) to identify services in the
OS-Driver interface that are more likely to spread errors, and c) to identify
drivers that are more likely to spread errors in the system, given that errors
are present. To facilitate such an identification, a basic propagation measure
is defined, the Service Error Permeability, capturing the likelihood that an
error in the OS-Driver interface will spread to an OS-Application service. The
objectives for our measures are illustrated in Figure 5.1 and are summarized
as follows:

(a) Measure for degree of error porosity of an OS service: Service Error
    Permeability,
(b) Measure for error exposure of an OS service: Service Error Exposure,
(c) Measure of a driver service's proneness to spread errors: Service Error
    Diffusion, and
(d) Measure of a driver's ability to spread errors in the system through the
    OS-Driver interface: Driver Error Diffusion.

Service Error Permeability

The Service Error Permeability is the probability that an error propagates
from a specific service in the OS-Driver interface to a specific service in the
OS-Application interface. It is conditioned on the presence of an error in the
first place. That it is a conditional probability is significant, since otherwise we
would have to know the probability of error occurrence, which is inherently
difficult and system specific. With the conditional probability we still get
an assessment of the system's ability to contain error propagation, and when
an error occurrence probability is known it can be combined with the Service
Error Permeability.
Two classes of services are identified in the OS-Driver interface, as shown
previously in Figure 3.2 (page 36). Each driver $D_x$ exports a set of services
$ds_{x.1} \cdots ds_{x.N}$. These are the services that the OS calls to instruct the driver
to perform a certain operation. To implement its functionality a driver also
uses a set of OS services $os_{x.1} \cdots os_{x.M}$.
For a given driver or project, only one of the classes may be of interest, for
instance, for drivers that do not make extensive use of the export interface.
The Service Error Permeability is therefore defined for each class explicitly.
$PDS^i_{x.y}$ is defined for the exported services and $POS^i_{x.y}$ for the imported.
The Service Error Permeabilities are defined for a driver $D_x$, an
OS-Application service $s_i$ and an OS-Driver service (either $ds_{x.y}$ or $os_{x.y}$):

    $PDS^i_{x.y} = \Pr(\text{error in } s_i \mid \text{error in } ds_{x.y})$    (5.1)

    $POS^i_{x.y} = \Pr(\text{error in } s_i \mid \text{error in use of } os_{x.y})$    (5.2)

Typically, propagation is evaluated to a service in a specific application,
i.e., $s_i \in A_x$ is defined for a specific application $APP_x$, and this is the way it
is interpreted here.
Service Error Permeability gives an indication of the permeability of the
particular OS service, i.e., how easily the OS lets errors in a specific service
in the OS-Driver interface propagate to a service used by an application.
A higher permeability implies that precautions need to be taken for the
services involved. Such precautions could entail either ensuring that the
services are properly used (fault prevention and removal methods), including
handling exceptional situations, or addition of error handling code. Note
that Equation 5.2 allows us to compare the same OS service used by different
drivers. The impact of the context induced by different drivers can thus be
studied.
Note that the Service Error Permeability is defined with respect to subsets
of the services at the OS-Application interface, S, and OS-Driver interface,
O. For service pairs that are not members of these subsets, no assertion can be made
about their permeability. It is therefore desirable to make these subsets
representative of the set of services used when the system is operational.
Service Error Exposure

To find OS services that are more exposed to errors propagating through
the system, a set of relevant drivers needs to be considered. The propagation
from this set of drivers can be combined into a composite measure, namely the
Service Error Exposure¹ ($E^i$). Service Error Exposure considers each driver's
contribution to the propagation of errors to a specific OS-Application service
$s_i$ (see Figure 3.1). Thus it is an estimation of how exposed this service is
to propagating errors from these drivers. Both PDS and POS contribute to
the Service Error Exposure, and consequently both are part of Equation 5.3.
We use the measure Service Error Permeability to compose the Service
Error Exposure for an OS service $s_i$, namely $E^i$:

    $E^i = \sum_{\forall x} \sum_{\forall j} POS^i_{x.j} + \sum_{\forall x} \sum_{\forall j} PDS^i_{x.j}$    (5.3)

Service Error Exposure considers all drivers' influence on one OS service.
Thus its use is mostly to compare OS services and rank them based on their
exposure to propagating errors. It can therefore be used to guide verification
efforts of applications or placement of error handling mechanisms on the
application level. Note that this expression implies aggregating all imported
and exported Service Error Permeabilities (5.1 & 5.2 above) and all considered
drivers. When comparing services on a driver per driver basis the driver
specific Service Error Exposure can be applied:

    $E^i_x = \sum_{\forall j} POS^i_{x.j} + \sum_{\forall j} PDS^i_{x.j}$    (5.4)

The driver specific service exposure allows study of driver attributed
differences in exposure to propagating errors. It also makes assessment of
exposure independent of the selected set of drivers.
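
Equations 5.3 and 5.4 are plain sums over permeability values, which the
following sketch makes explicit. It assumes the permeabilities are held in two
dictionaries keyed by (driver, driver-level service, application service); this
layout, the function name and the example values are illustrative assumptions.

```python
def service_error_exposure(pos, pds, app_service, driver=None):
    """Sum all permeabilities into one application-level service.

    With driver=None all drivers are aggregated (Equation 5.3); with a
    driver name only that driver contributes (Equation 5.4).
    """
    total = 0.0
    for table in (pos, pds):
        for (drv, _service, app), value in table.items():
            if app == app_service and (driver is None or drv == driver):
                total += value
    return total
```

For instance, with imported-service values 0.25 (driver D1) and 0.5 (driver D2)
and an exported-service value 0.125 (D1), all for the same application service,
the exposure is 0.875 overall and 0.375 when restricted to D1.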

Service Error Diffusion

Whereas Service Error Exposure considers a specific service at the
OS-Application level, Service Error Diffusion ($SE_{x.y}$) focuses on specific services
on the OS-Driver level. This allows us to pinpoint services which are more
likely to spread errors through the system. $SE_{x.y}$ for an imported service is
defined for a driver $D_x$ and a specific $os_{x.y}$:

    $SE_{x.y} = \sum_{\forall i} POS^i_{x.y}$    (5.5)

Service Error Diffusion for exported services can be calculated analogously
to 5.5 using $PDS^i_{x.y}$.

¹ We will use the terms Service Error Exposure and Service Exposure interchangeably.

Service Error Diffusion can be used to rank driver services on their ability
to spread errors in the system. Values can be compared either globally (across
all drivers) or locally, for a specific driver $D_x$.

Driver Error Diffusion

Driver Error Diffusion is used to rank drivers on their ability to spread errors
in the system. To do this, the Service Error Permeability values for one driver
are aggregated. A higher value means that the driver may more easily spread
errors in the system. For a driver $D_x$ and a set of services, the Driver Error
Diffusion, $D^x$, is defined as:

    $D^x = \sum_{\forall i} \sum_{\forall j} POS^i_{x.j} + \sum_{\forall i} \sum_{\forall j} PDS^i_{x.j}$    (5.6)

Analogous to Service Error Exposure, a higher Driver Error Diffusion
value is an indication of where efforts need to be spent on verification
or where error detection/recovery mechanisms should be placed in the
system. Since Driver Error Diffusion focuses on the driver level, locations are
identified on this level as well.
Once a ranking across drivers exists, the driver(s) with the highest Driver
Error Diffusion should be the first targets. Details on specific error paths can
now be used (i.e., Service Error Permeability) to guide the composition and
placement of detection and recovery mechanisms.
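
The two diffusion measures and the resulting driver ranking can be sketched
as follows. As before, permeabilities are assumed to be stored in dictionaries
keyed by (driver, driver-level service, application service); the layout and
function names are assumptions made for this illustration.

```python
def service_error_diffusion(table, driver, service):
    """Equation 5.5: sum over all application services for one driver service."""
    return sum(value for (drv, svc, _app), value in table.items()
               if drv == driver and svc == service)

def driver_error_diffusion(pos, pds, driver):
    """Equation 5.6: sum all permeabilities attributed to one driver."""
    return sum(value for table in (pos, pds)
               for (drv, _svc, _app), value in table.items()
               if drv == driver)

def rank_drivers(pos, pds):
    """Order drivers from highest to lowest Driver Error Diffusion."""
    drivers = {key[0] for table in (pos, pds) for key in table}
    return sorted(drivers,
                  key=lambda d: driver_error_diffusion(pos, pds, d),
                  reverse=True)
```

Passing the table of imported-service permeabilities to
`service_error_diffusion` gives the value for an imported service; passing the
exported-service table gives the analogous value for an exported one.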

5.3.3 Use of Measures

The previous section presented six different measures. A natural question an
evaluator might have is therefore which of these to use for a specific project.
Three key uses for error propagation analysis were identified in Section
5.3. When the goal is to identify robustness bottlenecks, Service Error
Exposure and Service Error Diffusion can be used to guide the search for specific
services, and more information can then be gained by studying individual
Service Error Permeability values for each considered service. Drivers with
potential for spreading errors can be identified using Driver Error Diffusion.
Information used for debugging (exposing flaws) can be gained by looking
at the specific injection cases identified by Service Error Exposure and Service
Error Diffusion. These also help in locating error detection and recovery
mechanisms in the system by identifying prominent propagation paths.
Ultimately it is the level of detail required that guides the use of the
proposed measures. As all presented measures are based on the same raw
data, no significant overhead is attached to the calculation of each measure.
For the experimental setup used all measures are predefined as SQL scripts,
which are loaded and executed on the data stored in the database. All
scripts execute for at most a few seconds on the workstation used. The use
of a database means that the only operation required when new experiments
have been conducted is to load them into the database, a task for which a
dedicated application has been implemented, greatly simplifying the task.
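
To give an impression of what such a script might look like, the sketch below
computes a permeability estimate per service pair with a single SQL
aggregate, run here through Python's built-in sqlite3 module. The table name,
column names and the relative-frequency estimate are assumptions made for
this illustration; the thesis does not list its schema or scripts here.

```python
import sqlite3

# Hypothetical schema: one row per injection and observed application service.
SCHEMA = """
CREATE TABLE experiment (
    driver TEXT,
    driver_service TEXT,
    app_service TEXT,
    propagated INTEGER
)"""

# AVG over a 0/1 column gives the share of injections that propagated.
PERMEABILITY_SQL = """
SELECT driver, driver_service, app_service,
       AVG(propagated) AS permeability
FROM experiment
GROUP BY driver, driver_service, app_service
ORDER BY driver, driver_service, app_service
"""

def load_and_query(rows):
    """Load experiment rows into an in-memory database and run the script."""
    con = sqlite3.connect(":memory:")
    con.execute(SCHEMA)
    con.executemany("INSERT INTO experiment VALUES (?, ?, ?, ?)", rows)
    result = con.execute(PERMEABILITY_SQL).fetchall()
    con.close()
    return result
```

The remaining measures are sums over this result, so each can be expressed as
a further `GROUP BY` over the same raw data.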

5.4 Discussion

This section provides a general discussion on some of the concepts presented
in this chapter. A more detailed discussion regarding implementation details
and interpretation of the measures is found in Chapter 6.

Failure Classes

When investigating error propagation in a system not all errors can be treated
equally. Some errors lead to severe failures and some to mere annoyances.
What constitutes the "severity" of a failure is system dependent and a
subjective property. Different users may consider different failures as worst. For
instance, Durães et al. [Durães and Madeira, 2003] define a set of generic
failure modes, and depending on the user of the evaluation define severity
scales as subsets of the generic failure modes accordingly. From a feedback
point of view the worst failure is a loss of data without any warning,
whereas from an availability point of view a completely unresponsive system
is the worst.
For general purpose systems, such differentiating views become problematic.
As an example, many users are frustrated when their desktop PC
crashes due to a failure of some driver they did not know existed on their
system. However, that the system "crashes" may be an explicit decision by the
OS to avoid inconsistencies of data or even loss of data. When no knowledge
of the cause or remedy for an error exists, the only sane thing to do might
be to take the system down and hope that the error has disappeared when
the system is restarted. Had the system been written for a dedicated purpose,
correct diagnosis and recovery might have been possible, and the crash
avoided. A crash can also be desirable if the OS is to implement "fail silent"
behavior, i.e., when the system fails it does so by ceasing to respond and
without any other side effects [Powell et al., 1988].
Since this thesis is concerned with robustness of OS's we have opted to use
a generic severity scale for the failure modes defined. It is important to note
that even though we do consider the scale as containing progressively more
severe failures, this only reflects a generic severity ranking. Context input
is needed to refine the severity scale used for a specific system, in order to
define useful and comparable failure classes. Additionally, more fine-grained
failure modes can be defined when knowledge regarding a specific system is
available. For instance, applications running on the system might be of different
criticality and failure modes reflecting this may be desirable.

Interpretation & Evaluation

The usefulness of analysis using failure classes is that resources can be guided
to the more severe failure classes, thus using them more efficiently. This
applies to both fault prevention and removal efforts, such as improvement of
the engineering process or different kinds of verification efforts, as well as to
fault tolerance approaches where error detection and recovery is enhanced
by addition of software mechanisms.
A typical process is to start with the most severe failure class, and then
progressively approach the less severe classes as time and money permit.
This helps to ensure that efforts are spent where the pay-off is the greatest
and may also be used as a stop criterion for robustness enhancement.
Another important practical aspect is the impact different failure classes
have. Class 3 failures force the whole system to halt, i.e., one could argue
that the error propagated to all services on the OS-Application level. In
such cases, the services on the OS-Application layer do not impact the
relative comparison of drivers, i.e., the Driver Error Diffusion. When comparing
drivers using Driver Error Diffusion for Class 3 failures one can therefore
simplify Equations 5.1 and 5.2 to only consider the probability that an error
propagates at all (since we know it propagates to all services). The
consideration of each OS-Application level service would only give a linear scaling
of the Driver Error Diffusion, not affecting the relative order across drivers.
Chapter 6 shows how such simplifications can be made for a real system.

Imports vs. Exports
In this chapter we make a distinction between the imported and the exported services of a driver. This distinction may not be useful in all contexts, and the services can then simply be “bundled” together to form one set of services. This would simplify Equations 5.3-5.6 by using only one term $PS^i_{x.y}$, defined as follows:

$$PS^i_{x.y} = \Pr(\text{error in } s_i \mid \text{error in use of } s_{x.y}) \tag{5.7}$$

where $s_{x.y}$ is a service from the combined set (O) of all imported and exported services for driver $D_x$.

Error Distribution & Operational Profile
An important aspect when interpreting the values for the measures presented is the lack of explicit dependence on error input distribution. In any practical setting such a distribution is very important. From a robustness point of view, the error distribution may be of less importance since the goal is not to estimate the reliability of the system. Equations 5.1 and 5.2 are conditioned on the presence of an error and can be combined with an error distribution when available.
Another important aspect is the implicit dependence on the operational profile of the system. The operational profile includes the usage profile of the applications running on the system which implicitly gives rise to a driver usage profile. Depending on how applications are used, different services provided by the OS are used and the usage profiles of drivers differ. For a certain profile some services may not be used at all, whereas in others they are frequently used. This influences the values of the measures, since only services actually used are included.
The operational profile of a system may not be known at the time of the evaluation, or may change with time. This means that the robustness profile of the system may change as well. It is therefore important to try to use a profile closely matching the expected one when doing experimental estimations of the measures.
As a last point it is important to note that we do not try to test drivers per se, so this measure only tells us which drivers may corrupt the system by spreading errors. Also, we emphasize that the intent of these measures is not to obtain absolute values, but to obtain relative rankings.

5.5 Related Work

Error propagation analysis and failure mode analysis are two intertwined concepts. Since both are well established techniques there is a plethora of literature making use of them. This section reviews some of the more important research contributions within both areas.
Error propagation studies how the effects of errors percolate through the system and affect other components than the source component of the fault [Lee and Iyer, 1993]. Voas et al. present EPA (Extended Propagation Analysis) [Voas et al., 1996, 1997; Voas, 1997a] which identifies code locations which might violate the safety requirements of the system if faults occur in these locations. The authors introduce the term failure tolerance to mean that the system is tolerant to failures of 3rd party software. In [Voas et al., 1997] the authors further speculate that a similar technique would be most useful for an OS setting, since system software consists of a multitude of interacting components.

For dependable system designs, error propagation is a phenomenon that is to be avoided, since it allows the failure of one subsystem to affect the behavior of other subsystems and lead to a failure of the entire system. However, error propagation is a useful property in software testing, as it helps to reveal state corruption due to faults by propagating such faults to the interfaces of the system [Voas and Miller, 1994a, 1995; Voas et al., 1997]. Therefore, for components with high error propagation probability, testing is more likely to reveal faults in the code, if they are present. From this point of view, robustness evaluation identifies hot-spots in the system where errors can lead to severe consequences. Once these hot-spots have been identified and treated (either by ensuring that faults are not present or by adding error detection/recovery capabilities) the likelihood of the system propagating errors is lowered. In [Voas and Miller, 1994b] the propagation information is used to place assertions at the most effective locations in the code, such that testing becomes effective.

Michael and Jones show that data state errors in software propagate uniformly, i.e., either all data state errors for a specific location propagate to the outputs of a program, or none of them do [Michael and Jones, 1997]. From a theoretical viewpoint this is surprising, but to a practitioner it might not be. When only a small subset of the value domain for a variable can be considered “correct”, then most changes to this variable will be erroneous and may trigger propagation of faults. This is especially true for values which are assumed to be correct by the developer, and are therefore not checked for validity in the code.

Hiller developed an extensive propagation profiling framework for embedded control systems [Hiller et al., 2004; Hiller, 2002]. Based on a component model, error propagation metrics similar to the ones developed in this thesis are presented. Whereas the focus there was on data level errors in control software, we focus on a single (although complex) component of computer-based systems, the OS.

5.6 Summary of Research Contributions

This chapter introduced the concept of error propagation in OS’s, using failure mode analysis. A series of OS level propagation measures were defined that help an evaluator find system bottlenecks and guide further efforts to components that are more likely to spread errors or more likely to be the sink for propagating errors. Related research projects in the area of failure mode analysis and error propagation, especially related to OS’s, were reviewed. Table 5.3 summarizes the measures introduced in this chapter.
The following research contributions are presented in this chapter:

• Error propagation in the context of OS’s is defined in a generic and system independent manner. The fundamental propagation measure Service Error Permeability is used to measure the propagation across services in the OS-Driver interface with services in the OS-Application layer.

• The Service Error Exposure measure is introduced to measure the influence propagating errors have on specific OS services. It can be used to compare services in a relative manner.

• Service Error Diffusion can be used to rank services in the OS-Driver interface on their ability to spread errors.

• The Driver Error Diffusion measure similarly allows relative comparison across drivers on their ability to spread errors.

Table 5.3: Summary of the error propagation measures introduced.

| Symbol | Equation | Description |
|---|---|---|
| $PDS^i_{x.y}$ | 5.1 | The Service Error Permeability for driver services is the probability that an error in a driver service $ds_{x.y}$ will propagate to an OS-Application service $s_i$. |
| $POS^i_{x.y}$ | 5.2 | The Service Error Permeability for an OS-Driver service is the probability that an error in an OS service $os_{x.y}$ used by driver $D_x$ will propagate to an OS-Application service $s_i$. |
| $PS^i_{x.y}$ | 5.7 | The combined Service Error Permeability makes no distinction between imported and exported functions. |
| $E^i$ | 5.3 | The Service Error Exposure is used to compare OS-Application services on their susceptibility to propagating errors. |
| $E^i_x$ | 5.4 | The driver specific Service Error Exposure is used to compare OS-Application services on their susceptibility to propagating errors. It differs from $E^i$ in that it considers each driver individually. |
| $SE_{x.y}$ | 5.5 | Service Error Diffusion is used to compare OS-Driver services on their ability to spread errors in the system. |
| $D_x$ | 5.6 | Driver Error Diffusion is used to identify and compare drivers on their ability to spread errors in the system. |

Chapter 6

Error Model Evaluation - What to Inject

Which error model should be used for OS robustness evaluation? What are the trade-offs to make across error models?

The choice of error model for robustness evaluation of OS’s influences both the results and the time required to perform the evaluation. This chapter investigates the effectiveness of three error models: bit-flips, data-type errors and fuzzing errors. It builds on the measures introduced in Chapter 5 and uses these to evaluate the three error models. It helps answer RQ2 - quantifiable measures - and RQ4 - what to inject.
An extensive series of fault injection experiments shows that the bit-flip model allows for more detailed results, however at a higher cost in terms of the number of injections required. Fuzzing is found to be cheap to implement but is less precise compared to bit-flips. A novel composite error model is presented where the low cost of fuzzing is combined with the higher level of detail of bit-flips, resulting in high precision with moderate setup/execution costs. Furthermore, this chapter shows how the error propagation measures introduced in Chapter 5 can be used in the context of a real system.

6.1 Introduction

When performing a robustness evaluation of an OS several factors influence the choice of the error model used. In many cases the representativeness of the error model used, compared to the errors found in a deployed system, is of the utmost importance. To be able to estimate the behavior of the OS in an operational setting, the injected errors need to match real errors as closely as possible. However, for COTS components, such as OS’s, this is difficult, since a) the type and distribution of real errors may not be known, b) the operational setting might not yet be known, or c) the operational composition of the system may not be known.
This thesis focuses on the robustness aspect of the OS, taking a specific composition of the system into consideration. Robustness of the system is evaluated with a typical composition, to identify system vulnerabilities in the form of error propagation paths. This is for instance done when different platforms (OS and hardware) are evaluated at a prototype stage, or when platform robustness is evaluated as part of quality assurance of an entire system. In this type of setting, the evaluator typically lacks access to the source code of system components, or lacks the ability (or permissions) to modify it.
Given the context of robustness evaluation (errors are external to the OS) and the lack of source code, we target the interface between device drivers and the OS. This interface is typically defined as a set of functions that are called to perform services. The target for injection is the parameters to such functions that carry erroneous information from a driver to the OS.
A key question becomes which error model to choose for the evaluation, i.e., how erroneous states of the system are modeled for injecting errors. This chapter evaluates three contemporary error models based on their effectiveness in finding system vulnerabilities, their ease of implementation and the time required for performing the experimentation. A detailed discussion on each of the criteria investigated is provided.
The error models are evaluated using the three drivers previously introduced: cerfioserial (serial port), 91C111 (Ethernet card) and atadisk (CompactFlash).

6.2 Considered Error Models

The considered error models for this study were introduced and described in Section 3.2. This section first briefly discusses the three models. Table 6.1 shows an overview of the three models, showing the number of services in the OS-Driver interface and the total number of injections performed for each error model. The number of used services differs across the drivers, with the serial driver using the most. One can also see that the BF model, as expected, incurs the highest number of injections and the DT model the fewest.

Table 6.1: Overview of the target drivers.

| Driver | # Services | # Injection cases (BF) | # Injection cases (DT) | # Injection cases (FZ) |
|---|---|---|---|---|
| cerfioserial | 60 | 2653 | 397 | 1395 |
| 91C111 | 54 | 1850 | 283 | 990 |
| atadisk | 47 | 1486 | 267 | 899 |

6.2.1 Data Type Error Model
The data type (DT) error model modifies the value of a parameter based on its data type, and is presented in detail in Section 3.2.1. It has been shown in previous studies that this type of injection is very scalable in terms of the number of data types used in API’s, such as POSIX [DeVale et al., 1999]. The total number of data types targeted for the experiments reported here was 22. Given that the average number of services targeted across all three drivers was 54, with typically more than one parameter, this is a fairly low number.

6.2.2 Bit-Flip Error Model
For the bit-flip (BF) error model each targeted parameter is considered as a 32 bit value. 32 injection cases are defined, flipping the bits from bit 0 (least significant bit) to bit 31 (most significant bit) one after the other.
The flipping is achieved by casting the value to an integer and then using the xor-function. This approach is also detailed for instance in [Voas and Charron, 1996]. The new value is then used in the call to the real function.
The BF model does not necessarily need to be adapted to the data type used. However, it is beneficial to do so for some specific reasons:
• Reducing the number of bits used for data types using fewer bits reduces the total number of injections required. For instance, the data type char uses only 8 bits, whereas the type int uses 32. Since injecting in the remaining 24 bits of a char does not reflect a software error for this type, the number of bits targeted can be restricted to 8.
• Many of the parameters used in the interface are pointers to some other data type (or void). By tracking such relations the value of the pointer target can be used for injection, more closely simulating software errors than targeting the reference pointer values alone.
• A common feature in the driver interface is to use pointers to structures (struct’s). Without details on their members, BF errors cannot directly target them. Data type tracking facilitates this.

6.2.3 Fuzzing Error Model
The fuzzing (FZ) error model uses a pseudo-random generator to generate random values to inject. The targeted service invocation is intercepted and a new random value is chosen to replace the existing value. We use the standard C-runtime function rand() to generate the random values. Each target computer (board) stores the generated random value in persistent storage and uses this value as seed to the random generator for the next injection. This way we avoid generating the same value for each injection (which would have been the case had the same seed been used).
For each service targeted fifteen injections with different random values are performed. This number was selected to give a reasonable execution time of the experiments and yet produce useful results. Section 6.5.2 further discusses the number of injections for the FZ model.
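The seed chaining described above can be illustrated with a short sketch. It is written in Python rather than with the C runtime's rand(), and the dictionary `store` merely stands in for the board's persistent storage; the helper name `next_fuzz_value` and the initial seed 12345 are hypothetical.

```python
import random


def next_fuzz_value(store, width=32):
    """Draw one pseudo-random value and persist it as the next seed."""
    rng = random.Random(store["seed"])   # seed read from simulated persistent storage
    value = rng.getrandbits(width)       # value that would replace the parameter
    store["seed"] = value                # stored so the next injection starts elsewhere
    return value


store = {"seed": 12345}                  # hypothetical initial seed
values = [next_fuzz_value(store) for _ in range(15)]   # fifteen injections per service
```

Because each injection is followed by a restart of the target, re-seeding with a fixed constant would reproduce the same first value every time; feeding the previous value back as the seed avoids this while keeping the whole sequence reproducible from the initial seed.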

6.3 Error Propagation
This section details our experimental estimation of the error propagation measures defined in Chapter 5. It is demonstrated how the analytical expressions in the previous chapter can be adapted to assessment with fault injection. A series of simplifications is presented, together with results and interpretations from large scale fault injection experiments. To shorten the discussion, focus in this section will be put solely on the bit-flip model, with Section 6.4 devoted to the comparison across the three models.

6.3.1 Failure Class Distribution
Table 6.2 shows the failure class distribution for the three drivers using the BF model. The atadisk driver has the highest ratio of Class 3 failures, even though for all three drivers the ratio stays below 4.5% of the injected errors. The cerfioserial driver has considerably more Class 2 failures than the other two drivers. This is due to a number of injections for this driver leading to hangs in OS services used by applications, i.e., services hang unexpectedly. The serial driver, being inherently of blocking nature, is the only driver to show such behavior to a significant extent. The other two drivers have higher Class 1 ratios instead and all three drivers have roughly the same amount of Class NF failures, above 70%. The high number of Class NF failures suggests that there is potentially room for reducing the number of injections further, beyond what is already done through the pre-profiling stage.

Table 6.2: The failure class distribution for the BF model.

| Driver | NF | [%] | C1 | [%] | C2 | [%] | C3 | [%] |
|---|---|---|---|---|---|---|---|---|
| cerfioserial | 2060 | 77.65 | 38 | 1.43 | 481 | 18.13 | 74 | 2.79 |
| 91C111 | 1320 | 71.35 | 416 | 22.49 | 50 | 2.70 | 64 | 3.46 |
| atadisk | 1117 | 75.17 | 300 | 20.19 | 3 | 0.20 | 66 | 4.44 |

6.3.2 Estimating Service Error Permeability
Service Error Permeability is the conditional probability that an error appearing in an OS-Driver interface service will propagate to a service in the OS-Application interface, given that one appears (see Equations 5.1 and 5.2). A distinction is made between errors appearing in services provided by drivers (exports) and those provided by the OS itself (imports). A simplification is also made in Equation 5.7 where no such distinction is made.
Service Error Permeability is estimated by the use of fault injection as the ratio between the number of injections resulting in a failure and the total number of injections for a given service. Service Error Permeability is calculated the same way for both imported and exported services. We denote, for a driver $D_x$, the number of injected errors in a service $os_{x.y}$ [1] with $N_{x.y}$ and the number of failures for service $s_i$ with $n_i$. The estimated Service Error Permeability is then calculated as follows:

$$SP^i_{x.y} = \frac{n_i}{N_{x.y}} \tag{6.1}$$

Typically one studies each failure class in isolation. In this case $n_i$ is the number of injections resulting in failure of the specific class under study. $SP^i_{x.y}$ is used as an estimate of both $PDS^i_{x.y}$ and $POS^i_{x.y}$ and as such corresponds to Equation 5.7.
Service Error Permeability can be used to study the relation between tuples of OS-Driver services and OS-Application services. The number of services in the OS-Driver interface can be seen in Table 6.1. For each of the services, Service Error Permeability is defined in relation to each service studied at the OS-Application layer. Since Service Error Permeability is a probability, the values will be in the range [0...1]. It is important to note that a value of 0.0 must not be interpreted as a proof that no errors will propagate along this path. It is only an indication that the likelihood is low, given the error model used. Similarly, a value of 1.0 only indicates that errors are likely to propagate, but is again dependent on the error model used.

[1] The calculation for exported services ($ds_{x.y}$) is analogous to that for imported.
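As a small illustration of Equation 6.1, the estimate is simply a ratio of counts taken from the injection logs. The sketch below is in Python; the function name `estimate_permeability` and the counts used in the example are hypothetical and not taken from the experiments.

```python
def estimate_permeability(failures, injections):
    """Estimate SP = n_i / N_x.y for one (OS-Driver service, OS-Application service) pair."""
    if injections <= 0:
        raise ValueError("at least one injection is required")
    if not 0 <= failures <= injections:
        raise ValueError("failures must lie between 0 and the number of injections")
    return failures / injections


# Hypothetical counts: 96 injections (three 32-bit parameters), 3 Class 1 failures.
sep = estimate_permeability(3, 96)
```

An estimate of 0.0 from such a ratio only says that none of the performed injections propagated, which is why the text cautions against reading it as a proof of absence.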
As previously mentioned, propagation results are best interpreted by studying the individual failure classes separately. For Service Error Permeability Class 3 failures are not relevant, since a Class 3 failure will have the same effect on all application level services: it renders the entire system unresponsive, either through a hang or a crash. Similarly, error propagation is hard to track for most Class 2 failures, since the effect needs to be pinpointed to the specific service being the victim of the failure. This would require tracking not only negative reports for each service, i.e., when errors propagate, but also positive reports, i.e., each service call needs to be logged. This would put a tremendous pressure on the tracking mechanism to safely store or forward information on each call. Consequently we have not considered Service Error Permeability for Class 2 failures, and it thus remains as a future extension to our work.
Our investigation using Service Error Permeability focuses on Class 1 failures. Many such propagation paths exist (although with many having an estimated propagation permeability of 0.0). Therefore, Tables 6.3, 6.4 and 6.5 for brevity present only the prominent error propagation paths identified, for each of the three drivers using the BF model.
Table 6.3 shows all propagation paths for cerfioserial. “String compare” is not a specific OS service per se, but an added consistency check for the received echo strings sent by the test application. Similarly, “Serial Echo Error” is the echo server check performed on the host side. Driver services with the prefix COM are exported driver services.
Several observations can be made regarding the propagation paths reported in Table 6.3. First, the table shows that both imported and exported functions can lead to Class 1 failures. Second, some propagation paths are distinctly more prominent than others, having Service Error Permeability values of up to 1.0, i.e., each injected error leads to a Class 1 failure reported by the application testing the serial port functionality. Third, a “clustering” effect can be seen, where many OS-Driver services have multiple propagation paths with the same Service Error Permeability value. This indicates that injecting a transient error in these services will corrupt the “state” of the system and cause subsequent service invocations to fail as well. This is no big surprise considering the type of services involved. For instance, when COMOpen fails to open the serial port, subsequent service requests by the application will fail as well.

Table 6.3: Class 1 error propagation paths for cerfioserial, based on Service Error Permeability (SEP), for the BF model.

| # | OS-Driver | OS-Application | SEP |
|---|---|---|---|
| 1 | InterruptDisable | CreateFile | 1.000 |
| 2 | InterruptDisable | GetCommState | 1.000 |
| 3 | InterruptDisable | WriteFile | 1.000 |
| 4 | InterruptDisable | SetCommTimeouts | 1.000 |
| 5 | InterruptDisable | ReadFile | 1.000 |
| 6 | InterruptDisable | GetCommTimeouts | 1.000 |
| 7 | InterruptDisable | String compare | 1.000 |
| 8 | InterruptDisable | CloseHandle | 1.000 |
| 9 | memcpy | SetCommState | 0.042 |
| 10 | COMOpen | WriteFile | 0.031 |
| 11 | COMOpen | SetCommState | 0.031 |
| 12 | COMOpen | ReadFile | 0.031 |
| 13 | COMOpen | SetCommTimeouts | 0.031 |
| 14 | COMOpen | String compare | 0.031 |
| 15 | COMOpen | GetCommState | 0.031 |
| 16 | COMRead | String compare | 0.016 |
| 17 | EventModify | String compare | 0.016 |
| 18 | EventModify | Serial Echo Error | 0.016 |
| 19 | COMIOControl | String compare | 0.010 |
| 20 | COMIOControl | GetCommState | 0.010 |
| 21 | COMIOControl | Serial Echo Error | 0.010 |
Table 6.4 shows the top thirty Class 1 error propagation paths for the Ethernet driver. One service has a Service Error Permeability value of 1.0, with several others having high permeability values. The Ndis prefix indicates that these services are provided by the Ndis library, a system library provided to support and simplify network card drivers. Similarly, the OS-Application services are also network related, as expected since the test application most affected is using networking services heavily. The most permeable path is surprisingly enough for NKDbgPrintfW, a function which prints debug information used by developers. This suggests that even functions that are not “expected” by developers to cause propagating errors must be used with care.
The clustering of services is again shown clearly, i.e., several application level services show the same Service Error Permeability values. This indicates that they fail together; when one service fails, several other services fail too. 91C111 had in total 71 propagation paths with a Service Error Permeability value above 0.0.

Table 6.4: Class 1 error propagation paths for 91C111, based on Service Error Permeability (SEP), for the BF model.

| # | OS-Driver | OS-Application | SEP |
|---|---|---|---|
| 1 | NKDbgPrintfW | WSACleanup | 1.0000 |
| 2 | NKDbgPrintfW | connect | 1.0000 |
| 3 | NKDbgPrintfW | shutdown | 1.0000 |
| 4 | NKDbgPrintfW | closesocket | 1.0000 |
| 5 | NdisOpenConfiguration | WSACleanup | 0.9375 |
| 6 | NdisOpenConfiguration | connect | 0.9375 |
| 7 | NdisOpenConfiguration | shutdown | 0.9375 |
| 8 | NdisOpenConfiguration | closesocket | 0.9375 |
| 9 | NdisCloseConfiguration | WSACleanup | 0.8750 |
| 10 | NdisCloseConfiguration | connect | 0.8750 |
| 11 | NdisCloseConfiguration | shutdown | 0.8750 |
| 12 | NdisCloseConfiguration | closesocket | 0.8750 |
| 13 | NdisInitializeWrapper | connect | 0.5625 |
| 14 | NdisInitializeWrapper | shutdown | 0.5625 |
| 15 | NdisInitializeWrapper | WSACleanup | 0.5625 |
| 16 | NdisInitializeWrapper | closesocket | 0.5625 |
| 17 | NdisMRegisterMiniport | closesocket | 0.4844 |
| 18 | NdisMRegisterMiniport | shutdown | 0.4844 |
| 19 | NdisMRegisterMiniport | WSACleanup | 0.4844 |
| 20 | NdisMRegisterMiniport | connect | 0.4844 |
| 21 | NdisMRegisterAdapterShutdownHandler | WSACleanup | 0.4688 |
| 22 | NdisMRegisterAdapterShutdownHandler | connect | 0.4688 |
| 23 | NdisMRegisterAdapterShutdownHandler | closesocket | 0.4688 |
| 24 | NdisMRegisterAdapterShutdownHandler | shutdown | 0.4688 |
| 25 | QueryPerformanceCounter | closesocket | 0.4688 |
| 26 | QueryPerformanceCounter | shutdown | 0.4688 |
| 27 | QueryPerformanceCounter | connect | 0.4688 |
| 28 | QueryPerformanceCounter | WSACleanup | 0.4688 |
| 29 | NdisReadConfiguration | shutdown | 0.4393 |
| 30 | NdisReadConfiguration | closesocket | 0.4393 |

Table 6.5: Selection of Class 1 error propagation paths for atadisk, based on Service Error Permeability (SEP), for the BF model.

| # | OS-Driver | OS-Application | SEP |
|---|---|---|---|
| 1 | DSKInit | GetFileTime | 1.0000 |
| 2 | DSKInit | CloseHandle | 1.0000 |
| 3 | DSKInit | GetFileInformationByHandle | 1.0000 |
| 4 | READPORTUSHORT | CloseHandle | 1.0000 |
| 5 | READPORTUSHORT | CreateFile | 1.0000 |
| 6 | READPORTUSHORT | GetFileInformationByHandle | 1.0000 |
| 7 | DetectATADisk | CreateFile | 1.0000 |
| 8 | DetectATADisk | WriteFile | 1.0000 |
| 9 | DetectATADisk | CloseHandle | 1.0000 |
| 10 | DetectATADisk | SetEndOfFile | 1.0000 |

Table 6.5 for the CompactFlash driver shows a similar trend to the previous two tables. For this driver, two exported services show up, DSKInit and DetectATADisk. The application level services in the list are related to file operations, as expected for this driver. In total atadisk has 176 registered propagation paths.
It is important to note that in Tables 6.3, 6.4 and 6.5 a higher value for Service Error Permeability indicates a higher likelihood of propagating errors resulting in Class 1 failures. A lower value may therefore be an indication of proneness to higher severity failures, or of a higher degree of fault tolerance. Service Error Permeability must therefore be used in conjunction with other measures, such as Service Error Exposure and Driver Error Diffusion.

6.3.3 Service Error Exposure
Service Error Exposure considers all driver-level services’ contribution to the failure seen for a specific OS-Application service. Therefore, analogously to Service Error Permeability, we only consider Class 1 failures for Service Error Exposure as well.
The number of services used by each test application was purposely kept low, as a consequence of keeping the test applications small and simple. Tables 6.6, 6.7 and 6.8 show the driver specific Service Error Exposure calculated for each service for injections in cerfioserial, 91C111 and atadisk, respectively.
The driver specific Service Error Exposure is calculated using Equation 5.4, which is based on the Service Error Permeability values (partially) presented in the previous section. Since it is a sum of probabilities, the values are not bounded by 1, and no specific interpretation can be made of a single value. Their use lies in the relative comparison across many services. A higher value indicates that a service is more exposed to propagating errors from the drivers considered.

Table 6.6: Service Error Exposure values for cerfioserial, using the BF model.

| # | Service | Service Error Exposure |
|---|---|---|
| 1 | String compare | 1.0730 |
| 2 | GetCommState | 1.0417 |
| 3 | SetCommTimeouts | 1.0313 |
| 4 | WriteFile | 1.0313 |
| 5 | ReadFile | 1.0313 |
| 6 | CreateFile | 1.0000 |
| 7 | CloseHandle | 1.0000 |
| 8 | GetCommTimeouts | 1.0000 |
| 9 | SetCommState | 0.0729 |
| 10 | Serial Echo Error | 0.0260 |
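To make the relation between Tables 6.3 and 6.6 concrete, the following Python sketch sums the permeability values of Table 6.3 per OS-Application service. It assumes that the driver specific Service Error Exposure of Equation 5.4 is the plain sum of the Service Error Permeability values over all OS-Driver services, as the text describes; the names `PATHS` and `service_error_exposure` are introduced here for illustration only.

```python
from collections import defaultdict

# (OS-Driver service, OS-Application service, SEP) triples from Table 6.3.
PATHS = [
    ("InterruptDisable", "CreateFile", 1.000),
    ("InterruptDisable", "GetCommState", 1.000),
    ("InterruptDisable", "WriteFile", 1.000),
    ("InterruptDisable", "SetCommTimeouts", 1.000),
    ("InterruptDisable", "ReadFile", 1.000),
    ("InterruptDisable", "GetCommTimeouts", 1.000),
    ("InterruptDisable", "String compare", 1.000),
    ("InterruptDisable", "CloseHandle", 1.000),
    ("memcpy", "SetCommState", 0.042),
    ("COMOpen", "WriteFile", 0.031),
    ("COMOpen", "SetCommState", 0.031),
    ("COMOpen", "ReadFile", 0.031),
    ("COMOpen", "SetCommTimeouts", 0.031),
    ("COMOpen", "String compare", 0.031),
    ("COMOpen", "GetCommState", 0.031),
    ("COMRead", "String compare", 0.016),
    ("EventModify", "String compare", 0.016),
    ("EventModify", "Serial Echo Error", 0.016),
    ("COMIOControl", "String compare", 0.010),
    ("COMIOControl", "GetCommState", 0.010),
    ("COMIOControl", "Serial Echo Error", 0.010),
]


def service_error_exposure(paths):
    """Sum the permeability of every path ending in each OS-Application service."""
    exposure = defaultdict(float)
    for _driver_service, app_service, sep in paths:
        exposure[app_service] += sep
    return dict(exposure)


exposure = service_error_exposure(PATHS)
ranking = sorted(exposure, key=exposure.get, reverse=True)  # most exposed first
```

With the three-decimal inputs of Table 6.3 the sums agree with Table 6.6 up to rounding (for example 1.073 for “String compare” and 0.073 for SetCommState), and the resulting ranking starts with “String compare” and ends with “Serial Echo Error”.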

All three tables show the same clustering effect observed for failures, indicated by observing the same Service Error Exposure value. That this is the case is not surprising. Considering for instance CreateFile and CloseHandle in Table 6.8 it is understandable that if CreateFile returns with an error, then its handle will be invalid. A subsequent attempt to close it will also return an error, and consequently the error propagates to both services.
Another useful piece of information can be seen in Table 6.6 which shows “String compare” to have the highest Service Error Exposure value. Since this is a check performed on the received data and not a returned error code for an OS service, this indicates that in some cases erroneous data can be received without any other service indicating an error. This corresponds to a silent error, suggesting that data integrity checks might be needed for critical data. Similarly, “Serial Echo Error” signals that the data sent to the Host Computer is in some cases corrupted, suggesting that similar data integrity checks may be needed at the receiving side as well.
In Table 6.7 services 5-15 are from the test application for atadisk, indicating that in some cases injecting errors in the interface for 91C111 can cause Class 1 failures also for applications not using the faulty driver.

Table 6.7: Service Error Exposure values for 91C111 using the BF model.

| # | Service | Service Error Exposure |
|---|---|---|
| 1 | connect | 6.5020 |
| 2 | shutdown | 6.5020 |
| 3 | WSACleanup | 6.5020 |
| 4 | closesocket | 6.4083 |
| 5 | CreateFile | 0.1250 |
| 6 | CloseHandle | 0.1250 |
| 7 | ReadFile | 0.0938 |
| 8 | sizeof | 0.0625 |
| 9 | GetFileSize | 0.0625 |
| 10 | GetFileTime | 0.0313 |
| 11 | GetFileInformationByHandle | 0.0313 |
| 12 | SetEndOfFile | 0.0313 |
| 13 | WriteFile | 0.0313 |
| 14 | SetFilePointer | 0.0313 |
| 15 | DeleteFile | 0.0313 |
| 16 | getaddrinfo | 0.0208 |

Table 6.8: Service Error Exposure values for atadisk using the BF model.

| # | Service | Service Error Exposure |
|---|---|---|
| 1 | CloseHandle | 30.1790 |
| 2 | CreateFile | 30.1790 |
| 3 | ReadFile | 22.6436 |
| 4 | sizeof | 15.0957 |
| 5 | GetFileSize | 15.0832 |
| 6 | WriteFile | 7.54787 |
| 7 | SetFilePointer | 7.54787 |
| 8 | DeleteFile | 7.54787 |
| 9 | SetEndOfFile | 7.54787 |
| 10 | GetFileTime | 7.53537 |
| 11 | GetFileInformationByHandle | 7.53537 |

6.3.4 Service Error Diffusion
When considering which OS-Driver services are more likely to spread errors in the system one can use the Service Error Diffusion measure, which considers one service’s proneness to spread errors. Since we are considering driver services specifically we do not have the failure class restrictions that apply to application level measures. We will therefore concentrate on the most severe failure class, Class 3 failures.
Service Error Diffusion is defined in Equation 5.5 as a sum over all application level services. Since all services are affected uniformly, this translates into a simple scaling of the effects with the number of services used. A simplified expression can therefore be applied, where the application level services are not accounted for individually. This simplified version is shown in Equation 6.2, and is defined for driver $D_x$ and service $s_{x.y}$ (either $os_{x.y}$ or $ds_{x.y}$):

$$SE_{x.y} = \frac{n_{x.y}}{N_{x.y}} \tag{6.2}$$

where $n_{x.y}$ is the number of Class 3 failures and $N_{x.y}$ the number of injections performed for service $s_{x.y}$, as above.

Table 6.9: Class 3 Service Error Diffusion values for cerfioserial using the BF model.

| # | Service | Service Error Diffusion |
|---|---|---|
| 1 | memset | 0.3125 |
| 2 | memcpy | 0.2083 |
| 3 | MmMapIoSpace | 0.1528 |
| 4 | LoadLibraryW | 0.0909 |
| 5 | FreeLibrary | 0.0625 |
| 6 | DisableThreadLibraryCalls | 0.0625 |
| 7 | LocalAlloc | 0.0313 |
| 8 | SetProcPermissions | 0.0313 |
| 9 | HalTranslateBusAddress | 0.0236 |
| 10 | CreateThread | 0.0234 |

Tables 6.9, 6.10 and 6.11 present the non-zero valued services for the three drivers. One service stands out among the data, wcslen, for atadisk, which has a Service Error Diffusion value of 1.0, which means that all injected errors for this service resulted in a Class 3 failure. This makes this service a top candidate for further robustness enhancement. Comparing the number of services in these tables with the number of services used in the OS-Driver interface for the three drivers (Table 6.1) one can observe that a small number of services give rise to all Class 3 failures, for all three drivers.
Furthermore, it can be seen that some services cause severe failures for all three drivers, such as memset and memcpy. These are low-level system functions present in many drivers and the Service Error Diffusion is comparable across all three drivers, indicating that for these drivers propagation is independent of the driver itself. If generic error detection and recovery mechanisms could be defined for these two services, 84 Class 3 failures could be removed across all three drivers for the BF model. This corresponds to 41% of the Class 3 failures reported for the experiments.

Table 6.10: Class 3 Service Error Diffusion values for 91C111 using the BF model.

| # | Service | Service Error Diffusion |
|---|---|---|
| 1 | memset | 0.2708 |
| 2 | NdisAllocateMemory | 0.1250 |
| 3 | DisableThreadLibraryCalls | 0.1250 |
| 4 | QueryPerformanceCounter | 0.0938 |
| 5 | LoadLibraryW | 0.0909 |
| 6 | NdisMSynchronizeWithInterrupt | 0.0781 |
| 7 | FreeLibrary | 0.0625 |
| 8 | VirtualCopy | 0.0313 |
| 9 | NdisMSetAttributesEx | 0.0188 |
| 10 | RegOpenKeyExW | 0.0093 |
| 11 | NdisInitializeWrapper | 0.0078 |
| 12 | NdisMRegisterInterrupt | 0.0077 |
| 13 | KernelIoControl | 0.0063 |

Table 6.11: Class 3 Service Error Diffusion values for atadisk using the BF model.

| # | Service | Service Error Diffusion |
|---|---|---|
| 1 | wcslen | 1.0000 |
| 2 | wcscpy | 0.2727 |
| 3 | memcpy | 0.2708 |
| 4 | memset | 0.1875 |
| 5 | DisableThreadLibraryCalls | 0.0625 |
| 6 | LocalAlloc | 0.0313 |
| 7 | MapPtrToProcess | 0.0313 |
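The 41% figure can be checked against the Class 3 counts of Table 6.2. The short Python sketch below only reproduces that arithmetic; the count of 84 removable failures is taken from the text as given, since the per-service failure counts are not listed in this section.

```python
# Class 3 failure counts per driver for the BF model, from Table 6.2.
class3_failures = {"cerfioserial": 74, "91C111": 64, "atadisk": 66}

# Class 3 failures attributed to memset and memcpy, as stated in the text.
removable = 84

total = sum(class3_failures.values())
share = removable / total
print(f"{removable} of {total} Class 3 failures = {share:.0%}")
```

The total of 204 Class 3 failures is the sum over the three drivers, so the share is 84/204, which rounds to 41%.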

6.3.5 Driver Error Diffusion
While Service Error Diffusion is used to identify individual services that are more likely to spread errors at the OS-Driver interface, this may become hard to overview, and services may be spread across multiple drivers, making any specific driver level improvement efforts costly. In this case one may want to focus on the driver that is more prone to spreading errors, rather than on individual services. To this end we use Driver Error Diffusion.
Driver Error Diffusion can, similarly to Service Error Diffusion, be simplified for Class 3 failures. Equation 6.3 presents the estimated Driver Error Diffusion not considering application level services. Driver Error Diffusion thus transforms to a sum of Service Error Diffusion values as follows:

$$D_x = \sum_{\forall y} SE_{x.y} = \sum_{\forall y} \frac{n_{x.y}}{N_{x.y}} \tag{6.3}$$

Table 6.12 shows the Driver Error Diffusion values for all three drivers using the BF model. The values presented are calculated with the simplified expression in Equation 6.3.

Table 6.12: Driver Error Diffusion for all three drivers considering Class 3 failures.

| Driver | Diffusion |
|---|---|
| cerfioserial | 1.00 |
| 91C111 | 0.93 |
| atadisk | 1.86 |

From Table 6.12 one can see that atadisk is clearly more prone to diffusing errors in the system. 91C111 and cerfioserial are very close, with cerfioserial having a slightly higher value. Considering these values an evaluator might consider devoting extra resources to ensuring that atadisk does not contain errors.
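Equation 6.3 can be checked directly against the published numbers: summing the non-zero Service Error Diffusion values of Tables 6.9, 6.10 and 6.11 should give the Driver Error Diffusion values of Table 6.12. The Python sketch below does this; the dictionary name `SERVICE_DIFFUSION` and the helper `driver_error_diffusion` are introduced for illustration, and only the values (not the service names) are needed for the sum.

```python
# Non-zero Class 3 Service Error Diffusion values per driver (Tables 6.9-6.11).
SERVICE_DIFFUSION = {
    "cerfioserial": [0.3125, 0.2083, 0.1528, 0.0909, 0.0625,
                     0.0625, 0.0313, 0.0313, 0.0236, 0.0234],
    "91C111": [0.2708, 0.1250, 0.1250, 0.0938, 0.0909, 0.0781, 0.0625,
               0.0313, 0.0188, 0.0093, 0.0078, 0.0077, 0.0063],
    "atadisk": [1.0000, 0.2727, 0.2708, 0.1875, 0.0625, 0.0313, 0.0313],
}


def driver_error_diffusion(service_values):
    """Equation 6.3: sum the Service Error Diffusion over all services of a driver."""
    return sum(service_values)


diffusion = {driver: round(driver_error_diffusion(values), 2)
             for driver, values in SERVICE_DIFFUSION.items()}
ranking = sorted(diffusion, key=diffusion.get, reverse=True)  # most diffusing first
```

Rounded to two decimals, the sums match Table 6.12 (1.00, 0.93 and 1.86), and the ranking places atadisk first, followed by cerfioserial and 91C111. Services with a diffusion of zero do not contribute, which is why only the tabulated services are needed.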

6.4 Comparing Error Models
There are many criteria one could have for selecting the error model to use. Depending on the goal of the evaluation, criteria such as execution time or number of found failures may not be equally important. Therefore, we compare the three models on a wide range of criteria. First, a discussion on the uses and implications of each evaluation criterion is presented, followed by a presentation and interpretation of the results. Table 6.13 shows an overview of the results for the three models.
The following criteria are considered when comparing the error models:

• Number of failures found: The absolute number of failures is important as a higher number may give more insight into how the system can fail and consequently give better feedback to developers of the system to improve it.

• Number of injections and execution time: The number of injections influences the time required to perform the evaluation. The relationship is not linear, since the execution time also depends on the outcome, but more injections generally mean longer execution time.

• Injection efficiency: The efficiency of the injections is measured as the number of failures per injection. This measure helps in making a trade-off between the two previous criteria.

• Coverage: The term coverage is here used to compare different models’ ability to pinpoint certain services as potentially vulnerable. Since no information on the real vulnerabilities exists, the comparison is based on a best-effort strategy, where the relative coverage across models is compared.

• Implementation complexity: The complexity of the implementation is a measure of the effort required to implement the error model. Since no lab experiments with real developers have been conducted, the comparison remains subjective. However, there are clear and distinct differences in the implementation effort needed for the studied models.

Table 6.13: The number of experiments is shown for each driver, error model and failure class.

Driver         Error Model   No Failure       Class 1         Class 2         Class 3
                             #      %         #     %         #     %         #    %
cerfio_serial  BF            2060   77.65     38    1.43      481   18.13     74   2.79
               DT            264    66.50     65    16.37     53    13.35     15   3.78
               FZ            908    65.09     20    1.43      401   28.74     66   4.73
91C111         BF            1320   71.35     416   22.49     50    2.70      64   3.46
               DT            203    71.73     66    23.32     1     0.35      13   4.59
               FZ            574    57.98     389   39.29     0     0.00      27   2.73
atadisk        BF            1117   75.17     300   20.19     3     0.20      66   4.44
               DT            172    64.42     90    33.71     1     0.37      4    1.50
               FZ            565    62.85     310   34.48     4     0.44      6    0.67

6.4.1 Number of Failures

The absolute number of failures that an error model triggers is important from a feedback perspective. The more cases of triggering vulnerabilities shown, the easier it will be to identify the vulnerability and possibly remove it. Table 6.13 shows the number of failures found for each of the four failure classes, error model and driver. From the table it can clearly be seen that the BF model, having the most injections, also incurs the most Class 3 failures. The number of failures for the BF model is comparable across all three drivers. For the other two models there are differences in the number of Class 3 failures, indicating that there are differences between the drivers in their ability to spread errors.

Table 6.13 further substantiates the fact that cerfio_serial is more prone to Class 2 failures than the other two drivers. That this behavior is driver related, and not dependent on the error model, is further supported by the fact that the percentage of injections leading to Class 2 failures is distinctly higher for all error models for cerfio_serial, compared to the other two drivers.

However, it is important to note that since the approach is experimental, all results are dependent on the specific setup used. In this case all test applications are written in a straightforward manner, without any explicit fault-tolerance mechanisms. Such mechanisms will most probably change the results of the evaluation, and the results presented indeed suggest that such mechanisms are needed.

Furthermore, many injections do not result in any observable error propagation (58-78%) within the time used for each injection, i.e., no deviation from the expected behavior was observed. This is in line with multiple previous studies, e.g., [Durães and Madeira, 2003], [Gu et al., 2003] and [Jarboui et al., 2002a]. Experiments in the Class NF category are either masked by the system, for instance parameters not used in this context or overwritten; or handled by built-in error detection/correction mechanisms checking incoming parameter values for correctness. Another explanation could be that the fault is dormant in the system and has not yet propagated to the OS-Application interface. It is important to note that all errors injected were in fact activated, since the pre-profiling eliminates services not used prior to injection (Section 4.6). The high number of Class NF experiments indicates that there is room for improving the selection of injection cases beyond the pre-profiling already carried out.

6.4.2 Execution Time
The number of injections performed and the time required for executing the experiments are related. An increase in the number of injections will mean increased execution time. However, the outcome of the experiments influences the execution time. An injection that does not lead to error propagation can be considerably faster than one that leads to a system hang, which requires first that the hang is detected and then a reboot of the system.

Table 6.14 reports the execution times for the injections performed. The times reported include only the actual execution time, not implementation, setup and off-line processing times.

Table 6.14: Experiment execution times.

Driver          Error Model   Execution Time
                              hours   minutes
cerfio_serial   BF            38      14
                DT            5       15
                FZ            20      44
91C111          BF            17      20
                DT            1       56
                FZ            7       48
atadisk         BF            20      51
                DT            2       56
                FZ            11      55

From the table it can clearly be seen that the BF model, having the most injections, also has the longest execution time. The time required for BF is roughly twice as much as for FZ and seven to eight times as much as DT. As noted above the outcome of the experiments influences the execution time, and this might differ across drivers. In our setup cerfio_serial and atadisk both take longer time when failing, which also influences the execution time.

A factor influencing the effective experiment time is the degree of operator involvement. The operator is required to specify the experiment to run (which driver, error model etc). The setup time is the same for all models. Additionally, some injections force the system into a state where it cannot automatically reboot, requiring a manual reboot by the operator. Consequently, without external reboot mechanisms the experiment is delayed until the operator is notified and can perform the reboot, which can prolong the execution time substantially. This additional delay is not part of the execution time in Table 6.14 since no assumption is made on the presence of the operator. The issue of manual reboots is further discussed in Section 6.6.

6.4.3 Injection Efficiency
The absolute number of failures each error model gives rise to also needs to be put in contrast to the number of injections performed, to give an indication of the efficiency of the model. Figure 6.1 graphically shows the data in Table 6.13. It can be seen that the overall trend is similar across all three drivers. This would clearly favor models incurring fewer injections, especially DT, but also FZ.

[Figure 6.1: Injection efficiency. Bar chart of the failure class distributions in percent (Class 1, Class 2 and Class 3) for the BF, DT and FZ models on cerfio_serial, 91C111 and atadisk.]
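The efficiency measure can be recomputed directly from the counts in Table 6.13. The following is a minimal sketch for cerfio_serial only; it assumes that the four outcome counts of a model add up to its number of injections, and the names `COUNTS` and `class3_efficiency` are illustrative, not part of the injection framework.

```python
# Outcome counts for cerfio_serial as read from Table 6.13:
# (No Failure, Class 1, Class 2, Class 3) for each error model.
COUNTS = {
    "BF": (2060, 38, 481, 74),
    "DT": (264, 65, 53, 15),
    "FZ": (908, 20, 401, 66),
}


def class3_efficiency(no_failure, class1, class2, class3):
    """Class 3 failures per injection.

    Assumption of this sketch: the four outcome counts together
    equal the number of injections performed.
    """
    injections = no_failure + class1 + class2 + class3
    return class3 / injections


EFFICIENCY = {model: class3_efficiency(*c) for model, c in COUNTS.items()}

for model, value in EFFICIENCY.items():
    print(f"{model}: {value:.2%} of the injections end in a Class 3 failure")
```

Under this reading BF triggers the most Class 3 failures in absolute terms for this driver, while FZ and DT trigger more of them per injection.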

The other criteria used are quantitative in nature, where a relative scale of “goodness” can be defined and used to rank the models. One aspect that cannot be compared quantitatively is a model’s ability to assess the “true” propagation patterns of the system. An efficient model may be one giving a bigger “bang for the buck”, at least when it comes to finding failure-triggering vulnerabilities, but may still be misleading. A separate concern is therefore to inspect the differences in propagation results for the three models studied, represented by the Driver Error Diffusion values for all three models and drivers in Table 6.15.

Table 6.15: Driver Diffusion for Class 3 failures.

Driver          BF     DT     FZ
cerfio_serial   1.00   1.50   1.93
91C111          0.93   0.98   0.59
atadisk         1.86   0.63   0.19

Table 6.15 shows that there are indeed differences across the results of the three models. DT and FZ identify the serial driver to be the most vulnerable driver, whereas BF pin-points atadisk to be the most vulnerable, with the serial driver and Ethernet driver having very similar values. It can also be observed that the results for atadisk are clearly more spread than for the other two drivers, with 91C111 being fairly consistent across all three models. This indicates that the services for 91C111 giving rise to Class 3 failures suffer from “uniform” vulnerabilities, i.e., any small change in the data supplied will trigger failures. On the contrary, services used by atadisk have vulnerabilities triggered only for more specific values, in this case triggered by bit-flips.

A straightforward view of the results is presented in Table 6.13, where the number of experiments in each failure class is detailed, both in actual numbers and as percentages of all injections.

The first observation is that for all drivers and error models the percentage of injections ending up as Class 3 failures is below 5%, indicating that the OS is capable of handling many perturbations and avoiding a catastrophic failure.

Table 6.16: Class 3 Service Error Diffusion values for cerfio_serial using the DT error model.

#   Service              Service Error Diffusion
1   MmMapIoSpace         0.5000
2   LocalAlloc           0.4000
3   LoadLibraryW         0.2500
4   SetProcPermissions   0.2000
5   memcpy               0.0909
6   CreateThread         0.0625

When comparing the error models, clear differences can be identified. For instance, where Driver Error Diffusion for DT previously identified cerfio_serial as the most diffusive driver, 91C111 has a higher ratio of Class 3 failures for this error model. Only considering the ratio may in this case be misleading, as cerfio_serial has more services with high Service Error Diffusion compared to 91C111 for DT, as seen from Tables 6.16 and 6.17. Similarly, where Driver Error Diffusion with the BF model indicates atadisk to be by far the most diffusive driver, the ratios of Class 3 failures show them to be relatively close. This is an effect of diffusion being a “sum of probabilities”. Diffusion shows that atadisk has more services with a high permeability (especially wcslen with 1.0) than 91C111 (Tables 6.10 and 6.11).

Table 6.17: Class 3 Service Error Diffusion values for 91C111 using the DT error model.

#   Service                  Service Error Diffusion
1   LoadLibraryW             0.2500
2   memcpy                   0.1818
3   NdisAllocateMemory       0.1764
4   RegOpenKeyExW            0.1500
5   memset                   0.1333
6   NdisMRegisterInterrupt   0.0555
7   NdisMSetAttributesEx     0.0322
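Because Driver Error Diffusion is a sum of the per-service values (Equation 6.3), the DT column of Table 6.15 can be cross-checked against Tables 6.16 and 6.17. The sketch below does this; the dictionary layout and the function name `driver_error_diffusion` are illustrative choices, and the values are copied from the two tables.

```python
# Class 3 Service Error Diffusion values for the DT model,
# copied from Table 6.16 (cerfio_serial) and Table 6.17 (91C111).
SERVICE_DIFFUSION_DT = {
    "cerfio_serial": {
        "MmMapIoSpace": 0.5000,
        "LocalAlloc": 0.4000,
        "LoadLibraryW": 0.2500,
        "SetProcPermissions": 0.2000,
        "memcpy": 0.0909,
        "CreateThread": 0.0625,
    },
    "91C111": {
        "LoadLibraryW": 0.2500,
        "memcpy": 0.1818,
        "NdisAllocateMemory": 0.1764,
        "RegOpenKeyExW": 0.1500,
        "memset": 0.1333,
        "NdisMRegisterInterrupt": 0.0555,
        "NdisMSetAttributesEx": 0.0322,
    },
}


def driver_error_diffusion(service_values):
    """Driver Error Diffusion as the sum of Service Error Diffusion values."""
    return sum(service_values.values())


DIFFUSION_DT = {
    driver: driver_error_diffusion(values)
    for driver, values in SERVICE_DIFFUSION_DT.items()
}

for driver, value in DIFFUSION_DT.items():
    print(f"{driver}: {value:.2f}")
```

The sums round to the DT values reported for the two drivers, which illustrates why a driver with several moderately diffusive services can rank above one with a higher overall ratio of Class 3 failures.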

6.4.4 Coverage: Identifying Services
Table 6.18 depicts services incurring Class 3 failures and the number of failures for each service/error model. BF outperforms the other error models, both in terms of the number of identified vulnerable services, and the total number of Class 3 failures (more clearly visible in Table 6.13). BF identifies 22 individual services, DT 12 and FZ 11 services.

Considering which services the different models uniquely identify, i.e., services which only one model identifies, again BF outperforms DT and FZ. BF identifies seven services and FZ two, which are not identified by any other error model. DT identifies no such unique services. From the number of failures identified, FZ identifies several services with only one case, suggesting that the random nature of the FZ error model has a higher probability of finding unique service vulnerabilities, whereas the more systematic injections performed for BF typically reveal more than one failure.
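The coverage comparison amounts to simple set operations over Table 6.18. The sketch below encodes the per-service Class 3 counts as read from that table (the table is rotated in the extracted text, so the individual counts should be checked against the original) and derives the services each model identifies and those it identifies uniquely. The names `TABLE_6_18`, `identified` and `unique_to` are illustrative.

```python
# Class 3 failure counts per service as (BF, DT, FZ), as read from Table 6.18.
TABLE_6_18 = {
    "CreateThread": (3, 1, 0),
    "DDKReg_GetWindowInfo": (0, 0, 1),
    "DisableThreadLibraryCalls": (8, 0, 1),
    "FreeLibrary": (4, 0, 1),
    "HalTranslateBusAddress": (3, 0, 5),
    "KernelIOControl": (1, 0, 0),
    "LoadLibraryW": (2, 2, 0),
    "LocalAlloc": (2, 4, 0),
    "MapPtrToProcess": (2, 1, 1),
    "memcpy": (46, 3, 18),
    "memset": (74, 3, 34),
    "MmMapIoSpace": (11, 9, 27),
    "NdisAllocateMemory": (12, 3, 0),
    "NdisInitializeWrapper": (1, 0, 0),
    "NdisMRegisterInterrupt": (1, 1, 0),
    "NdisMSetAttributesEx": (3, 1, 1),
    "NdisMSynchronizeWithInterrupt": (5, 0, 0),
    "QueryPerformanceCounter": (3, 0, 0),
    "RegOpenKeyExW": (1, 3, 0),
    "SetProcPermissions": (1, 1, 9),
    "VirtualAlloc": (0, 0, 1),
    "VirtualCopy": (4, 0, 0),
    "wcscpy": (6, 0, 0),
    "wcslen": (11, 0, 0),
}
MODELS = ("BF", "DT", "FZ")


def identified(model):
    """Services for which the model triggered at least one Class 3 failure."""
    column = MODELS.index(model)
    return {svc for svc, counts in TABLE_6_18.items() if counts[column] > 0}


def unique_to(model):
    """Services identified by this model and by no other model."""
    others = set().union(*(identified(m) for m in MODELS if m != model))
    return identified(model) - others


for model in MODELS:
    print(model, len(identified(model)), sorted(unique_to(model)))
```

With these counts the sketch reproduces the figures quoted in the text: 22, 12 and 11 identified services, seven services unique to BF, two unique to FZ and none unique to DT.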

Table 6.18: Services identified by Class 3 failures.

#    Service                         BF   DT   FZ
1    CreateThread                    3    1    0
2    DDKReg_GetWindowInfo            0    0    1
3    DisableThreadLibraryCalls       8    0    1
4    FreeLibrary                     4    0    1
5    HalTranslateBusAddress          3    0    5
6    KernelIOControl                 1    0    0
7    LoadLibraryW                    2    2    0
8    LocalAlloc                      2    4    0
9    MapPtrToProcess                 2    1    1
10   memcpy                          46   3    18
11   memset                          74   3    34
12   MmMapIoSpace                    11   9    27
13   NdisAllocateMemory              12   3    0
14   NdisInitializeWrapper           1    0    0
15   NdisMRegisterInterrupt          1    1    0
16   NdisMSetAttributesEx            3    1    1
17   NdisMSynchronizeWithInterrupt   5    0    0
18   QueryPerformanceCounter         3    0    0
19   RegOpenKeyExW                   1    3    0
20   SetProcPermissions              1    1    9
21   VirtualAlloc                    0    0    1
22   VirtualCopy                     4    0    0
23   wcscpy                          6    0    0
24   wcslen                          11   0    0

6.4.5 Implementation Complexity
The implementation cost, measured as the time required for the implementation of an error model, is naturally subjective. The amount of programming experience, knowledge of the area and the availability of tools and documentation all influence the required time. However, some observations during the course of implementing the injection framework suggest that there are differences across the error models.

Whereas Table 6.14 shows that BF and FZ are clearly more expensive in terms of execution time compared to DT, a major drawback with the DT error model is the cost for implementation. The difference lies in that for every function in the service interface the data type of each parameter needs to be tracked, such that the right injector can be chosen. BF and FZ on the other hand do not have this requirement, making their implementation costs considerably cheaper. Both use simple injection technologies, making the two models comparable in terms of implementation costs. Additionally, the time required for defining the injection cases for each data type is considerably higher for the DT model.
The cost for the DT model could potentially be reduced by use of automatic parsing tools and/or reflection-capable programming languages. The implementation cost is also a one-time cost for each driver which might be acceptable if the experiments are to be repeated in a regression testing fashion. Furthermore, the cost might be acceptable in comparison to other verification efforts used. Further research on the true costs for such error models is indeed warranted.

It is also important to note that even though the BF and FZ models do not require data type tracking they can benefit from it. By knowing the data type used, the number of bits targeted can be limited for data types not using all bits anyway (such as 8-bit integers). This technique has been applied in the experiments presented in this thesis.

6.5 Composite Error Model

Two major findings can be extracted from the previously presented results, namely: a) that BF identifies the most Class 3 failures, both in terms of absolute number and in the number of individual services identified, and b) FZ, even though not triggering as many failures as BF, identifies Class 3 failures for additional services, beyond those identified by BF and DT combined.

Even though these results must be interpreted in the context of our case study they show that different error models, although being injected on the same level and therefore being comparable, have different properties. It would be desirable to combine the models (BF and FZ) into a combined model, drawing on the strengths of both models. We do this by combining the two models into a so-called composite model (CO).

For the composite model we will focus on the Class 3 failure class as it in most cases is the critical class for robustness evaluation. The main hurdle for use of the BF model was the comparatively high number of injections. Thus we first focus on reducing the number of injections required by studying the impact each bit injection has and selecting only a subset of the bits for injections. Furthermore the number of injections required for the FZ model is studied to find a reasonable injection set. The following subsections will detail these studies.

6.5.1 Distinguishing Control vs Data
As mentioned, the number of required injections for BF increases the required execution time dramatically compared to the other two models. The high number of cases for each parameter is due to the fact that one injection is made for each bit in the parameter value, thus typically 32 injections per parameter. For a parameter of type int holding an integer value this uniform injection may represent a valid selection of error values. However, in many cases, especially for device drivers written in C, an integer value may not actually be used to represent all 32-bit integer values. Instead only a small subset of the values is used, and consequently only a small subset of the bits. It is therefore conceivable that not all 32 bits need to be targeted.
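The per-bit injection scheme described above can be sketched as follows. This is an illustration only, not the injection framework used in the experiments: the function name `bit_flip_values` and its `bits` parameter are assumptions introduced here to show how one single-bit flip per targeted bit yields 32 error values for a 32-bit parameter, and how a subset of bits could be targeted instead.

```python
def bit_flip_values(value, width=32, bits=None):
    """Return (bit, corrupted_value) pairs, one single-bit flip per targeted bit.

    value: the original parameter value, treated as an unsigned integer.
    width: number of bits in the parameter (32 for a typical int).
    bits:  bit positions to target; all positions when None.
    """
    if bits is None:
        bits = range(width)
    mask = (1 << width) - 1
    corrupted = []
    for bit in bits:
        if not 0 <= bit < width:
            raise ValueError(f"bit {bit} is outside a {width}-bit parameter")
        corrupted.append((bit, (value ^ (1 << bit)) & mask))
    return corrupted


# One injection per bit: 32 error values for a 32-bit parameter.
print(len(bit_flip_values(0x00001000)))
# Targeting only selected bits reduces the number of injections.
print(bit_flip_values(0, bits=[0, 31]))
```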

[Figure 6.2: The number of services identified by Class 3 failures by the BF model. x-axis: bit position; y-axis: number of Class 3 service failures.]

Figure 6.2 shows for each of the targeted bits how many services were identified as having Class 3 failures when bit-flips were injected in that bit. It can clearly be seen from the figure that there is no uniform distribution across the bits. The lower order bits, bits 0-9, identify more services than the other bits, with the exception of the most significant bit (31) which typically has special significance, such as being the sign bit for signed data types.

Figure 6.2 shows the number of specific services identified with vulnerabilities for each bit, but not whether the services identified by bit 0 are the same as those for bit 1. For this we use Figure 6.3 which shows the cumulative number of identified services. Reading the figure from left to right it shows how many services are identified by first injecting in bit 0, followed by addition of injections in bit 1, then 2 and so on. It shows that the set of vulnerable services increases in size up till bit 9 where twenty different services were identified. Another service is found at bit 27 and the last one at bit 31, which as previously mentioned often has special significance. Closer inspection reveals that the service first found at bit 27 is also identified by bit 31.

[Figure 6.3: Moving from bit 0 and upwards the number of services increases until bit 10. x-axis: bit position; y-axis: accumulated number of Class 3 failures.]

The observations made regarding the impact of individual bits suggest that the subset of bits for which bit-flip injections should be made can be reduced to only include bits 0-9 and 31, i.e., in total 11 bits compared to the original 32 bits considered. This translates into a reduction of injections by 49.8% in total compared to the full set used previously. Some fault injection tools support such specification of injections, like Xception [Carreira et al., 1998].

Studying the parameters used for services identified by BF, but not by FZ, shows a clear trend: many of these parameters are control values, like pointers to data, handles to files, modules, functions etc. Such parameters are intuitively more sensitive to small value changes, i.e., changes caused by flipping lower order bits. As an example consider a pointer to some data stored in an allocated memory area. Large changes to the pointer value (changes of higher order bits) are more likely to cause the erroneous value to lie outside the allocated memory area than a smaller change. Memory access errors, though being severe, can be detected by the system (in some cases); however, small changes may be harder to detect and may cause failures that are harder to prevent. Similarly, the FZ model, using random values, is more likely to choose values that are well outside the expected data range. This is reflected in the difference observed between the two models. On the other hand, FZ’s random nature means it can find vulnerabilities not found by more “structured” approaches, reflected in the fact that FZ identifies several additional service failures on top of those found by BF alone.
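The profiling step behind Figures 6.2 and 6.3 can be sketched as a cumulative union over per-bit sets of failing services, followed by dropping bits that add nothing new. The profile below is entirely hypothetical (service names and their distribution over bits are made up for illustration, and it uses fewer services than the experiments found); only the shape of the procedure is taken from the text, and the highest-bit-first elimination order is an assumption of this sketch.

```python
def cumulative_services(profile, width=32):
    """Number of distinct services found when injecting in bits 0..k, for each k."""
    seen, counts = set(), []
    for bit in range(width):
        seen |= profile.get(bit, set())
        counts.append(len(seen))
    return counts


def reduced_bit_set(profile, width=32):
    """Drop bits, highest first, whose services are all found by the bits still kept."""
    kept = [bit for bit in range(width) if profile.get(bit)]
    for bit in sorted(kept, reverse=True):
        others = set().union(*(profile[b] for b in kept if b != bit))
        if profile[bit] <= others:
            kept.remove(bit)
    return kept


# Hypothetical profile: each of the bits 0-9 reveals one service of its own,
# the bits 10-30 only re-find a known service, except bit 27, whose service
# is also found by bit 31; bit 31 additionally reveals one more service.
PROFILE = {bit: {f"svc{bit}"} for bit in range(10)}
PROFILE.update({bit: {"svc0"} for bit in range(10, 31)})
PROFILE[27] = {"svc_x"}
PROFILE[31] = {"svc_x", "svc_y"}

print(cumulative_services(PROFILE))
print(reduced_bit_set(PROFILE))
```

For this made-up profile the reduced set is bits 0-9 together with bit 31, mirroring the selection made in the text; with real data the outcome depends on the measured per-bit profile.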

6.5.2 The Number of Injections for Fuzzing

[Figure 6.4: Stability of Diffusion for the FZ model with respect to the number of injections. x-axis: number of FZ injections per parameter (1-15); y-axis: Diffusion value; one curve each for atadisk, 91C111 and cerfio_serial.]

Since the FZ model, in contrast to BF and DT, requires the evaluator to set the number of injections to be performed, this becomes an important question for judging the usefulness of the FZ error model. Previously, we have already shown how the fifteen injections performed provide comparable and useful results. One question remaining is whether fifteen injections are sufficient for assessing error propagation. Figure 6.4 shows how the Driver Error Diffusion values stabilize as the number of injections is increased. The values stabilize after roughly ten injections, but there are some differences across the drivers suggesting that stabilization may also be driver dependent. However, for these three drivers the curves remain clearly separated for any number of injections shown.

6.5.3 Composite Model & Effectiveness
The results from the previous section clearly show the need for using multiple error models. When resources are plentiful it is therefore recommendable to use multiple error models to get comprehensive coverage. When the cost of evaluation (in implementation and experimentation time/effort) needs to be decreased this may not be desirable. In this section we therefore propose and evaluate a composite model (CO). The new CO model combines the BF model using the least significant bits (together with the most significant one) with a series of FZ experiments. Section 6.5.2 indicates that even as few as ten FZ injections give stable Driver Error Diffusion values. This number was chosen to decrease the overall number of injections, but it is reasonable that more FZ injections will increase the probability of finding “rare” cases.
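As an illustration of how the injection set for one parameter could be assembled under the composite model, the sketch below combines bit-flips in bits 0-9 and 31 with ten random values. It is a sketch under stated assumptions: the names `CO_BITS`, `CO_FUZZ_COUNT` and `composite_values` are hypothetical, and drawing the fuzzing values uniformly over the full parameter width with a seeded generator is an assumption of this example, not a description of the framework used in the thesis.

```python
import random

CO_BITS = tuple(range(10)) + (31,)  # bits 0-9 and 31 (Section 6.5.1)
CO_FUZZ_COUNT = 10                  # FZ injections per parameter (Section 6.5.2)


def composite_values(value, rng, width=32):
    """Error values for one parameter: selected bit-flips followed by random values.

    rng is a random.Random instance, so a run can be repeated by reusing the seed.
    Bits outside the parameter width are skipped (e.g. for 8-bit parameters).
    """
    mask = (1 << width) - 1
    flips = [(value ^ (1 << bit)) & mask for bit in CO_BITS if bit < width]
    fuzz = [rng.getrandbits(width) for _ in range(CO_FUZZ_COUNT)]
    return flips + fuzz


values = composite_values(0x00001000, random.Random(2008))
print(len(values))  # 11 bit-flips plus 10 random values for a 32-bit parameter
```

For a 32-bit parameter this gives 21 injections instead of the 32 bit-flips plus fifteen random values used when both full models are applied.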

Table 6.19: Diffusion values for the three drivers.

Driver          Diffusion
cerfio_serial   1.58
91C111          0.91
atadisk         1.01

[Figure 6.5: Failure class distribution for CO compared to BF, DT and FZ. Bar chart of the failure class distributions in percent (Class 1, Class 2 and Class 3) for cerfio_serial, 91C111 and atadisk.]

The CO model is evaluated by considering the BF injections in bits 0-9 and bit 31, together with the first ten FZ injections. CO identifies all services having Class 3 failures but one (VirtualAlloc) compared to the full set of BF + FZ injections. An overview of the results is shown in Figure 6.5. The figure shows that the results achieved with this subset of injections are comparable with the results for the other models alone, making its ability to assess propagation effects on par with the other models. This is further substantiated by the Driver Error Diffusion values for the CO model presented in Table 6.19.

[Figure 6.6: The number of injections for the composite model compared to bit-flips and fuzzing together. Bars for "All BF & FZ" and "CO only" for atadisk, 91C111 and cerfio_serial; y-axis: number of injections.]

The number of injections required is compared in Figure 6.6. The figure clearly shows that the number of injections is well below half of the experiments required for the full set of BF and FZ, which translates into considerable savings in experimentation time.

6.6 Discussion
This section discusses some important aspects of the work and the results provided for the case study.

Imports vs. Exports
In the system model presented in Chapter 3 a distinction is made between imports and exports. The data presented in this chapter has not made any distinction between imported and exported services. The reason is simple: in no case was an injection in an exported service able to trigger a Class 3 failure in the system.

Table 6.20: A comparison between the results for imported and exported services.

Driver          Interface   Class 3   Class 2   Class 1
cerfio_serial   import      74        450       35
                export      0         31        3
91C111          import      64        50        416
                export      0         0         0
atadisk         import      66        3         257
                export      0         0         43

There could be many reasons for this effect. First of all the number of services in the exported interface is lower than in the imported one (typically around 10 compared to 30-40), which makes it reasonable that most failures will be found in the larger set. Secondly, the exported interface is a standard interface, used by many drivers. It is therefore likely that the OS takes special care to validate misuses of these services and that major flaws have already been detected during testing. Thirdly, these services very closely match OS-Application services, which suggests that the amount of additional work done by the OS is small for these services. Consequently, the effects errors can have will be mostly on the applications themselves, as Class 2 and Class 1 failures.

Experimental Techniques
As with any experimental evaluation technique it is important to consider the limitations of the chosen approach. Uncertainties are introduced at multiple levels and they need to be identified and understood to properly interpret the results.

First of all the error models used and evaluated here are generic. They are not based on any specific system scenario, but rather represent the subset of data level errors occurring at the OS-Driver interface. If system-specific faults are to be considered more specific error models need to be included as well or instead of the generic ones presented here. Furthermore, even for the subset of conceivable errors appearing at this interface only a small fraction is actually used. The results provided in this chapter support our belief that these are representative for a wider selection of errors since even though there are differences between the models, they overall show a similar pattern.

Secondly, the results presented are influenced by external factors such as the selected workload and the composition of the system (the selected OS components). To minimize the variability of the results and to minimize unpredictable influences we use a targeted generic workload and minimize the number of system components (see Section 3.3). For a specific system a workload closely resembling the expected one should be used as well, and the system should be composed such that it resembles the final system as closely as possible.

Finally, the experimental procedures themselves may be a source of influence on the final results, both in terms of what outcomes are observed and how, and any undesirable influence caused by the added software used for the experimentation. We have followed common practice in the selection of observation points, namely from a user perspective. Furthermore, we have minimized the number of components required for the execution of the experiments and made efforts to minimize any potential impact they might have. However, as the experiments have not yet been repeated in a similar environment we cannot be 100% certain that no such influence exists.

Bug vs. Vulnerability
An interesting question that arises when studying the results presented is whether the presence of Class 2 and Class 3 failures is an indication of “bugs” in the system. The answer is both yes and no, since a vulnerability discovered by experimental fault injection may, or may not, be present in a deployed system (2). A common case, especially for device drivers, is that documentation states that certain rules should be obeyed when using specific services. These rules may not be enforced, for performance reasons, e.g., the cost of checking each parameter value for a driver may be too high. Instead a “gentlemen’s agreement” is used, where the OS assumes that services are not misused. If a driver does misuse such a service it may not be considered a bug in the system in the traditional sense, but it surely is a robustness vulnerability. Such vulnerabilities have recently attracted more attention in research, since they constitute threats to the system’s security where an attacker can render the system non-responsive and thereby threaten the availability of the system. As previously mentioned we focus on robustness and therefore consider all discovered failures of the system vulnerabilities.

(2) Hence the use of the term vulnerability instead of bug.

Structured vs. Random Injection
There is an ongoing debate in the testing community whether random testing is an appropriate testing technique in general, or whether one should aim for more classical techniques, such as equivalence classes or boundary value testing [Hamlet, 2006]. Our choice of models reflects this conflict, where DT and BF represent more structured approaches, whereas FZ introduces randomness. The results presented also support many researchers’ view on random testing, namely that it has many weaknesses, but may in some cases be preferred, because no alternative is definitely better.

The advantage of structured approaches is that they can draw from existing knowledge when selecting injection cases; this is for instance very clear in the case of DT. DT is on the other hand limited by the ability of the evaluator to select appropriate injection cases, an inherently very difficult task. BF makes this task simpler, by defining injections based on the representation of the injection target (the parameter value), but is still limited to the specific modifications done by flipping the bits. FZ imposes no such restrictions, simply choosing randomly selected values.

The results clearly show that BF finds more vulnerabilities, in more services, and that DT is clearly more efficient (requires fewer injections) than FZ. However, FZ is able to identify services beyond the set identified by BF and DT, also with a limited number of injections.

Overall, the results favor using multiple error models, and the composite model shows that using the two models requiring the least implementation effort can give very promising results.

Operator Involvement
The degree to which the operator (the person setting up the experiments and supervising them) is involved in the process affects the effective time required to perform the experiments. First of all the operator is required to configure the system and specify which experiments to perform. For the framework used in this thesis this time is the same for each error model. The second task is to supervise the experiments and when needed manually restart boards that have hung. For the experiments presented here, it happens in many cases that the system crashes without being able to automatically restart itself. In these cases the operator (in this case the author himself) has to manually force a cold restart of the target board. Table 6.21 presents data on the amount of manual reboots required.

Table 6.21: The percentage of Class 3 failures that required the boards to be manually rebooted by the evaluator.

Driver          Error Model   Manual reboots [%]
cerfio_serial   BF            8.1
                DT            46.7
                FZ            15.2
91C111          BF            54.7
                DT            23.1
                FZ            37.0
atadisk         BF            9.1
                DT            25.0
                FZ            16.7

When the system is unable to restart automatically, experiments are delayed until the operator notices the problem and takes action. The Host Computer is equipped with a watchdog timer that notifies the operator if no log messages have been received within the last four minutes, well beyond the execution time of an experiment that is automatically self rebooted (which is also triggered by a watchdog timeout as described in Section 4.5.3).

Since only eleven services overall have failures requiring the operator to manually reboot the machines, the number of such reboots for a given driver and error model depends on how these services are used, giving rise to the differences reported in Table 6.21.

A further development of the injection framework would be to implement the hardware required to automatically reboot the target board when the host machine watchdog is triggered.

As described previously, a generic time penalty is assigned every manually rebooted experiment. The times reported in Table 6.14 are therefore not considering the operator time.

The results in Table 6.21 indicate the efficiency achievable with watchdog timers monitoring system processes. For many of the injected errors a system level monitoring watchdog, which restarts failing processes, could increase the availability of the system. This requires that “micro-rebooting” of the targeted components is possible. Such strategies have for instance been deployed in [Candea et al., 2004; Herder et al., 2007].
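The host-side watchdog described above can be sketched as a small timer that is reset by every log message and flags the need for operator action after four minutes of silence. This is an illustration only: the class `HostWatchdog`, its methods and the choice to stay silent before the first log message arrives are assumptions of this sketch, not a description of the actual host software.

```python
class HostWatchdog:
    """Flags the need for operator action when the target board stays silent."""

    def __init__(self, timeout_s=4 * 60):
        self.timeout_s = timeout_s      # four minutes, as in the setup described
        self.last_message_at = None     # time of the most recent log message

    def log_received(self, now):
        """Record that a log message arrived from the target at time `now` (seconds)."""
        self.last_message_at = now

    def operator_needed(self, now):
        """True when no log message has been received within the timeout."""
        if self.last_message_at is None:
            return False                # assumption: nothing to time out yet
        return now - self.last_message_at > self.timeout_s


watchdog = HostWatchdog()
watchdog.log_received(now=0.0)
print(watchdog.operator_needed(now=120.0))  # False: still within four minutes
print(watchdog.operator_needed(now=300.0))  # True: the board has been silent too long
```

Passing the current time explicitly keeps the sketch deterministic; a real monitor would read a monotonic clock and, as suggested in the text, could drive reboot hardware instead of notifying the operator.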

Extraction of Bit Profiles
Section 6.5.1 uses the bit profiles to find which bits find the most service failures. A very practical question is of course if the found profile is generally applicable to many systems or if the results are specific to the setup used for these experiments. To get a clear picture of this more drivers and systems would have to be profiled, and relations to specific drivers and service types further evaluated.

6.7 Related Work
There have been several efforts made to compare error models and to find representative errors to inject for specific systems and purposes. In this section we review some of the most relevant efforts that relate to the work in this thesis. We have therefore limited the selection to those that consider software faults, especially with focus on OS’s and robustness evaluations. A longer treatment of related work is found in Chapter 2.

Albinet et al. have also studied errors in device drivers by injecting errors in the OS-Driver interface [Albinet et al., 2004]. The error model used is DT using the terminology in this thesis, but with a lower number of injection cases compared to ours. Injection on a Linux-based system shows a higher ratio of kernel hangs than observed in this chapter. This may be due to differences between the two systems, or the drivers tested.

Arlat et al. study the dependability of microkernel-based systems using fault injection. The MAFALDA tool is used to inject faults and perform failure mode analysis. The error model consists of both injections in parameter values to microkernel services and injections in both code and data segments of a component. For both locations bit-flips are used to simulate both software and hardware faults. The type of injections is partially similar to ours (service parameters) but encompasses only the BF model. The results for parameter injection show a very low ratio of kernel hangs and crashes, possibly suggesting that microkernel architectures are better at handling these types of errors than monolithic systems.

Jarboui et al. compare BF and DT error models for the Linux kernel [Jarboui et al., 2002b,a]. Firstly, they also find a distinct difference in the number of injections required for BF compared to DT. Similarly to our results the number of severe outcomes is small and both models show similar behavior in terms of failure mode distributions. They also compare these results with errors injected inside the kernel code (which we have not done) and observe a higher ratio of severe outcomes, suggesting that there is a certain level of parameter validation performed for kernel services present in the system.

The fuzzing model was first used for utility programs for UNIX systems [Miller et al., 1990] and has later been applied also for protocols and application interfaces [Howard and Lipner, 2006; Fre]. Oehlert makes a distinction between intelligent and unintelligent fuzzing [Oehlert, 2005]. The former more closely resembles our DT model, where knowledge about the format used is assumed. The latter does not require any prior knowledge and may therefore better explore unexpected inputs, which was also shown in the results presented in this chapter.

To the best of our knowledge this thesis represents a first effort to use fuzzing at the OS-Driver interface and the first time fuzzing is quantitatively compared to other error models in a comparable setting.

6.8SummaryofResearchContributions
Thischapterpresentsacomparativestudyofthreedifferenterrormodels:
usingbit-flipsdata(BFfrom),adata-trepresenype(DTtativ)eandcasefuzzingstudy(FZconducted).TheonmodelsWindoarewsCEcompared.Net.
Furthermore,therobustnessmeasuresintroducedinChapter5areusedto
comparethethreemodelsontheirabilitiestotriggererrorpropagationin
theusedassystem.inputOvanderallthefollorecommendationswingkeyforobservfutureationsrobustnessaremade,evwhicaluations:hcanbe

•ThemeasuresderivedinChapter5areshowntobeusefulforstudying
failureandpropagationcharacteristicsoftheOS,identifyingservices
anddriverswithpotentialrobustnessvulnerabilities.
•TheBFmodelfindsmorevulnerabilitiesthantheothermodels.Italso
identifiesmoreservicesintheOS-Driverinterfacehavingvulnerabilities
makingitthepreferredchoiceforrobustnessevaluations.
•Allthreemodelsarewellsuitedtostudyerrorpropagationcharacter-
isticsusingespeciallytheDriverErrorDiffusionmeasures.Somedif-
ferencesacrossthemodelsareobserved,relatingtotheuseofcontrol
valuesintheinterfaces.
•TheDTmodelusesthefewestinjections,followedbytheFZmodeland
theBFmodel.Theuseofprofilingcanreducethenumberofinjections
forBFandacarefulstudyofthenumberofinjectionsforFZshows
thatonecanperformexperimentswithrelativelyfewinjections.
•IntermsofimplementationcoststheDTmodelisthemostcostly.
BFandFZarecomparablebutthetimerequiredforanyimplementa-
tiondependsonmanyfactors,includingskills,experiencesandavail-
abilityoftoolsanddocumentation.

• For identifying service vulnerabilities the BF model is the preferred choice. However, the random nature of FZ allows for finding other vulnerabilities that the other two models do not find.
• A new composite error model is defined as a composition of bit-flips and fuzzing injections. A subset of the available bits for a parameter is targeted after a profiling step revealing which bits have a higher impact on the system.

Chapter 7

Error Timing Models - When to Inject

When - in the time domain - should errors be injected?

Services used in driver/OS interactions are typically invoked multiple times during the lifetime of the driver. Therefore, when injecting errors in services which are called multiple times the time at which an error is injected will obviously affect the outcome of the experiment. Consequently, controlling the time at which the error is injected is a crucial part of the robustness evaluation. Multiple tools have been developed which allow for control of the time of injection. Most tools allow injection based on user-defined events or according to some time distribution. However, surprisingly little research has been spent on strategies for selecting injection times, beyond time distributions.

This chapter is devoted to a novel timing model used for robustness evaluation of OS's to errors in device drivers. It helps answering RQ5 - when to inject - by defining a usage profile of a driver, which can be used to control and select the time at which the errors are injected. Extensive experimentation shows that controlling the time of injection is indeed important, and furthermore that its effectiveness depends on the usage profile of the driver.

7.1 Introduction

When discussing timing issues for fault injection two aspects of the errors injected are relevant: the time at which an error is injected and the duration for which it stays active. This chapter focuses on the former property of an error.

For the duration of injected errors we focus on software related errors. The system is assumed to function properly when no errors are injected. The duration of injected errors is transient. Transient errors appear and then disappear shortly thereafter. This model reflects Heisenbugs, i.e., those software faults which due to external conditions do not deterministically reoccur every time the system is used. Such faults are hard to find with traditional testing techniques and may therefore occur even in well tested systems. Injections are performed in the OS-Driver interface, thus limiting the potential injection instances to when services in this interface are used. The transient model translates into the error being injected once for the targeted service and then disappearing before the second call to the same service.

Two main generic strategies exist to trigger the injection of an error: event-triggered and time-triggered. In the former approach specific events are used to trigger the injection, and in the latter approach time is used to trigger injection. Event-driven injection typically allows for a more fine-grained control of the individual injections, but requires the triggering events to be defined. Time-triggered injection distributes injections over time and consequently relies on a larger number of injections.

This chapter presents an approach extending the event-triggered approach presented in Chapter 6, where errors are injected in the first call to a service, a so called first-occurrence approach. First-occurrence only targets the first call to a service, disregarding any subsequent calls. The usage profile of the driver is used to build a usage model of the driver, and the service calls to be targeted can be selected to cover a wider spectrum of system states.

The rest of the chapter is structured as follows: First a discussion on the two alternative timing models provides the foundation and background needed for the rest of the chapter. The driver usage profile model is presented and discussed, followed by a description of the evaluation criteria used for the experimental evaluation of the approach. The description of the implementation and the results are then presented and discussed. The conclusions made and a summary of the research contributions follow in the last section.

7.2 Timing Models

The time at which an error is injected is also referred to as the triggering mechanism for the error. The need for controlling and monitoring the triggering event has previously been identified as important, but inherently difficult [Whittaker, 2003]. Several of the fault injection tools surveyed in Section 2.3 allow for controlling the triggering of errors, at least to some extent. On an abstract level an error is always triggered by an event.

For practical purposes one makes a distinction between injections triggered by special events taking place in the system, and those triggered by time alone, giving rise to the two classes of timing models for fault injection, event-triggered and time-triggered.

7.2.1 Event-Trigger
In the event-triggered case, the most common approach for software is location-based injection. This strategy is based on the premise that since the system's vulnerable states cannot generally be postulated a-priori, the events triggering injection are based on reaching certain locations in the code of a module. The simplest of such strategies is to inject the first time a certain location is reached (first-occurrence strategy).

A location-based approach is relevant especially for code injection, where errors (or faults) are injected directly into the source code (or executable binary) to mimic software faults [Durães and Madeira, 2002; Ng and Chen, 2001], or into the instruction stream of the CPU [Gu et al., 2003]. This comes from the fact that the software faults mimicked have specific locations in the code.

A variation of the first-occurrence approach is to use an n-occurrence trigger, i.e., the error is injected after n calls to a service, or after reaching a location for the nth time. This approach is a generalization of the first-occurrence approach, but requires the user to set the value of n, which is far from trivial. The approach is implemented for instance in the FERRARI tool [Kanawati et al., 1995].
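The n-occurrence trigger can be sketched in a few lines of Python. This is only an illustration and not the FERRARI implementation: the names OccurrenceTrigger and injection_points are hypothetical, and "after n calls" is interpreted here as "in the n-th call", which is an assumption.

```python
from collections import Counter


class OccurrenceTrigger:
    """Fires once, in the n-th call to one service (n=1 is first-occurrence)."""

    def __init__(self, service, n=1):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.service = service
        self.n = n
        self.calls = Counter()  # number of calls seen so far, per service
        self.fired = False

    def on_call(self, service):
        """Record one call; return True if the error should be injected in it."""
        self.calls[service] += 1
        if (not self.fired and service == self.service
                and self.calls[service] == self.n):
            self.fired = True  # transient error: inject exactly once
            return True
        return False


def injection_points(trace, service, n=1):
    """Return the indices in `trace` where the trigger fires (at most one)."""
    trigger = OccurrenceTrigger(service, n)
    return [i for i, called in enumerate(trace) if trigger.on_call(called)]
```

With n=1 the sketch reduces to the first-occurrence strategy; any other n has to be chosen by the user, which is exactly the difficulty noted above.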
In [Tsai et al., 1999] CPU registers and memory are targeted for fault injection using bit-flips. To maximize the activation rate of faults, and their impact, the timing of the injections is controlled by the workload in the system. Offline analysis detects paths through the workload for a given set of inputs and faults are injected along the paths. Alternatively, the activation rate of the CPU is used to perform the injections at peak-usage times.

The advantage of the event-triggered approach is that it can be tailored to evaluate specific locations in a system, or as specific events take place. It can therefore speed up the evaluation process by reducing the number of injected errors and focusing on only the relevant events in the system. The disadvantage is that often these events need to be defined by the user. Selecting them is a difficult process, possibly requiring deep understanding of the system, its components and their interaction.

7.2.2 Time-Trigger
When using a time-triggered approach a timeout is defined, after which the error is injected. Typically a large number of injections are performed, and their injection times follow some specific distribution (e.g., uniform, normal, exponential etc.) [Kao and Iyer, 1994; Han et al., 1995; Rodriguez et al., 2002]. In this case, often the location is also randomly selected across a set of pre-defined locations. This approach is common when simulating physical faults (radiation, EMI etc.) which are inherently “random” in nature [Karlsson et al., 1994].

Alternatively, the triggering event is defined as a combination of time and location, such that after the timeout has elapsed the error is injected at a pre-defined location, possibly using first-occurrence, or after the nth occurrence of a call. FERRARI, for instance, allows for specifying a time distribution (such as uniform) after which the fault is injected [Kanawati et al., 1992, 1995].

Many fault injection tools support both event and time-triggered injection [Carreira et al., 1998; Kanawati et al., 1995], but still leave the burden of choosing events and/or distributions to the user.

The time-triggered strategy does away with the burden of selecting triggering events, but on the other hand instead relies on a large number of injections to get statistically significant results. In any case a distribution needs to be selected and justified.
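As an illustration of the time-triggered strategy, the following Python sketch draws injection timeouts from one of the distributions mentioned above. It does not reproduce any of the cited tools: the function name sample_injection_times, the parameter names and the distribution parameters (for instance the mean of the exponential distribution) are arbitrary assumptions made for the example.

```python
import random


def sample_injection_times(count, horizon, distribution="uniform", seed=None):
    """Draw `count` injection times (in seconds) within [0, horizon]."""
    if count < 0 or horizon <= 0:
        raise ValueError("count must be >= 0 and horizon must be > 0")
    rng = random.Random(seed)  # seeded generator for repeatable campaigns
    times = []
    while len(times) < count:
        if distribution == "uniform":
            t = rng.uniform(0.0, horizon)
        elif distribution == "exponential":
            # Mean of horizon / 4: an arbitrary, illustrative choice.
            t = rng.expovariate(4.0 / horizon)
        elif distribution == "normal":
            # Centered on the middle of the run: also an illustrative choice.
            t = rng.gauss(horizon / 2.0, horizon / 6.0)
        else:
            raise ValueError(f"unknown distribution: {distribution}")
        if 0.0 <= t <= horizon:  # discard samples outside the run
            times.append(t)
    return sorted(times)
```

Each returned value is one timeout; one experiment is run per timeout, which is why the number of injections grows with the desired statistical confidence.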

7.3 Driver Usage Profile
As seen from Section 7.2 both models have advantages and disadvantages. Since effectiveness is one of the key goals of robustness evaluation and with our focus on software faults we have opted for an event-driven approach. The event-driven approach is more suitable for driver errors as it is easier to control and is more fine-grained. It is also more suitable for software faults.

As many services are called multiple times during the execution of the test applications the question arises which of these calls to target. Targeting each call will generally not be possible due to the large number of calls. Thus a selection of a subset is required. The simplest, and most straight-forward approach is to use the first-occurrence approach. This is indeed the approach used in the previous chapter and in several other projects. However, when injecting in interfaces between components (such as drivers), the injected error may stem from several distinct locations in the component, i.e., may have several distinct fault origins. Thus, injection on first-occurrence will only target a subset of those potential fault locations, namely those corresponding to the first call. Subsequent calls to a service will not be targeted, even though the driver (and indeed the whole system) may be in a different state which may be of interest to evaluate.
We have developed a methodology to select the relevant service invocation based on the observation that a service request from an application translates into one or more calls to the driver by the OS. In any practical context only a small subset of the possible sequences of calls that can be made are actually observed. As an example, it does not make sense for an application to read from a file before it has been opened, although there is nothing stopping it from trying. Since we base our evaluation on a system which is functional, it can be expected that such behavior is not present when the workload used is executed. Thus, the operational behavior of the driver can be broken down into a series of calls to the driver, where certain sub-sequences are more frequent than others. Such subsequences represent common sequences of calls made from applications, thereby defining the “operations” performed by the driver, such as “creating a file”, or “setting connection parameters” etc. Such an elementary sequence of calls is called a call block. In our model, the call blocks are used to trigger the injection of errors in the OS-Driver interface, thereby giving a more fine-grained control over the time of injection, compared to first-occurrence and time-triggered injection.

The driver usage profile is defined as an ordered list of calls to services provided by the driver (the ds_x.y services in Figure 3.1). This definition is slightly different from the traditional definition of an operational profile as defined for instance by Musa [Musa, 1993]. The operational profile is typically defined as the frequency distribution across a component's functions. Our usage profile additionally considers the order in which the functions are called. The list of calls made is termed the call string of the driver. The call string is further divided into a set of call blocks, which are disjoint sub-sequences of the call string.

7.3.1 Call String
Figure 7.1 illustrates the driver usage profile as a time-service diagram. The calls made to the driver's services are illustrated as rectangles in the figure (a-d). The call string is formed by assigning tokens to each service from a predefined alphabet and then for each call adding one token to the list. The call string for the example in Figure 7.1 is thus ababcdabdab. As can be seen, some sequences of calls are repeating, forming call blocks as subsequences of the call string (α, β and γ). Note that sequential execution of the driver is assumed.

[Figure: calls to the services a-d over time, grouped into the call blocks α1, α2, β, α3, γ and α4.]
Figure 7.1: Example of called services.
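The construction of a call string from a recorded trace can be sketched as follows. This is a minimal illustration, assuming that tokens are assigned in order of first appearance; the function name build_call_string and the service names in the usage note are hypothetical and are not taken from the tracker implementation.

```python
import string


def build_call_string(trace, alphabet=string.ascii_lowercase):
    """Map each service in `trace` to one token and return (call string, map)."""
    tokens = {}  # service name -> token, in order of first appearance
    call_string = []
    for service in trace:
        if service not in tokens:
            if len(tokens) >= len(alphabet):
                raise ValueError("alphabet too small for the services used")
            tokens[service] = alphabet[len(tokens)]
        call_string.append(tokens[service])  # one token per call
    return "".join(call_string), tokens
```

For a trace such as open, read, open, read, seek, close, open, read, close, open, read (hypothetical names) the sketch yields the call string ababcdabdab used in the example of Figure 7.1.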

7.3.2 Call Blocks
The example in Figure 7.1 shows that services a and b are called multiple times during the execution. Each call to service a is followed by a call to b; a and b thus form a call block, which is repeated during the execution of the driver. The sequences cd and d are not repeating, and cannot be added to any other call block. These sequences form the non-repeating call blocks. The call string is thus split into call blocks as indicated in Figure 7.1. Using a conventional regular expression syntax the sequence can be represented compactly as (ab){2}cd(ab)(d)(ab), where (ab){2} means that the sequence ab is repeated twice.

Currently the assignment of call blocks is performed through a combination of identification of repeating blocks and a priori knowledge regarding the functionality of the driver. As call strings grow in size automated techniques will be required to handle the large number of tokens. Section 7.6 discusses the use of special data structures to automate call block identification.

Figure 7.2 illustrates a similar call block structure, and additionally shows the services called within each call block. When targeting service s_i using the first-occurrence approach injections are performed only in the first call to that service. Subsequent calls to s_i, for instance in α2 or α3, are not targeted. Using the call block strategy multiple calls to s_i can be targeted.
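Once the call blocks have been decided upon, splitting a call string and writing it in the compact notation is mechanical. The sketch below assumes single-character tokens and a greedy, longest-block-first match; the function names and the block names are illustrative only. It puts every block in parentheses, so the example yields (ab){2}(cd)(ab)(d)(ab) rather than the slightly shorter form given above.

```python
def split_into_blocks(call_string, blocks):
    """Split `call_string` left to right; `blocks` maps block name -> tokens."""
    # Try longer blocks first so that "cd" is preferred over "d".
    ordered = sorted(blocks.items(), key=lambda item: -len(item[1]))
    position, names = 0, []
    while position < len(call_string):
        for name, tokens in ordered:
            if call_string.startswith(tokens, position):
                names.append(name)
                position += len(tokens)
                break
        else:
            raise ValueError(f"no call block matches at position {position}")
    return names


def compact(call_string, blocks):
    """Run-length encode consecutive identical blocks, e.g. (ab){2}."""
    names = split_into_blocks(call_string, blocks)
    parts, i = [], 0
    while i < len(names):
        j = i
        while j < len(names) and names[j] == names[i]:
            j += 1
        run = j - i
        parts.append(f"({blocks[names[i]]})" + (f"{{{run}}}" if run > 1 else ""))
        i = j
    return "".join(parts)
```

The greedy match is sufficient for the small example; it is not a general solution for ambiguous block definitions.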

[Figure: OS services (s_i, s_j, s_k, s_m) called by the driver within the call blocks α1, α2, β, γ, α3 and α4 over time.]
Figure 7.2: Example of a driver calling OS services in different call blocks.

7.3.3 Operational Phases
In general, a driver's lifetime can be split into three disjoint phases: initialization, working and clean up phases, as seen in Figure 7.3. In the initialization phase the driver sets up required data structures and registers its presence with the OS. Thereafter follows the working phase, where the driver performs work on behalf of applications or the OS itself. Finally, the clean up phase unregisters the driver with the OS and releases any resources held.

The operational phases become relevant when discussing selection of call blocks for injection. Intuitively it can be expected that failures in the initialization and clean up phases are more severe than in the working phase, as in these phases the driver interacts with many OS services which may affect the state not only of the driver but of the whole system. The working phase, on the other hand, is where drivers spend the most time (and may therefore have been more extensively tested), and perturbations may be expected and therefore considered by developers.

[Figure: time line divided into initialization phase, operational phase and clean up phase.]
Figure 7.3: The operational phases of a driver.
In this work we are mostly focusing on the driver's operational phase. However, with application level knowledge, similar phases can be defined also for the workload used. Looking at the workload used for the case study (Section 4.5.4), the driver specific test applications can be decomposed into two rounds of initialization, working and clean up phases, as illustrated in Figure 7.4. Note that this is specific to these test applications and requires access to and knowledge of the applications.

[Figure: time line with two rounds (Round 1, Round 2), each consisting of initialization, operational and clean up phases.]
Figure 7.4: The operational phases of the workload.

Each call block is targeted for injection, i.e., each operation performed by the driver. For call blocks that are repeated multiple times a filtering may take place, to reduce the number of injections. Preferably at least one call block per operational phase should in this case be targeted. For each call block, the first-occurrence approach can be used on each service called in that call block. Note that typically not all services are called in each call block, giving rise to some call blocks requiring many injections and some few.
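The difference between the two selection strategies can be made concrete with a small sketch. The data layout (a list of call block instances, each with the OS services called inside it) and all names are assumptions made for illustration; they do not describe the actual injector.

```python
def first_occurrence_targets(run):
    """One target per service: its first call in the whole run."""
    seen, targets = set(), []
    for block, services in run:
        for index, service in enumerate(services):
            if service not in seen:
                seen.add(service)
                targets.append((block, index, service))
    return targets


def call_block_targets(run, selected=None):
    """One target per service and per selected call block instance."""
    targets = []
    for block, services in run:
        if selected is not None and block not in selected:
            continue  # filtered out, e.g. a repeated instance of a block
        seen = set()  # first occurrence is evaluated per call block
        for index, service in enumerate(services):
            if service not in seen:
                seen.add(service)
                targets.append((block, index, service))
    return targets
```

In a hypothetical run where α1 and α2 both call s_i and s_j, first-occurrence targets each service once, whereas the call block strategy targets both services in both instances unless the `selected` filter removes one of them.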

7.4 Experimental Setup
To evaluate the usefulness of the proposed approach an implementation has been made for Windows CE .Net. The serial port driver and the Ethernet driver were selected and call blocks were derived for both drivers.

The proposed approach will be compared to a traditional first-occurrence approach. The main criteria used for the comparison will be the number of injections required, the failure class distribution observed and the number of severe vulnerabilities observed for each of the two approaches.

This section first presents the two drivers and the injection strategy, and then details the call blocks identified for each of the drivers.

7.4.1 Targeted Drivers
Two drivers are selected for this case study, the serial port driver (cerfio_serial) and the Ethernet driver (91C111). The two drivers are well suited for evaluation as they a) represent functionality found in all modern OS's, and b) represent different functionalities, giving rise to a different usage profile of the OS.

The difference in OS usage profile can be illustrated by studying the frequencies at which OS services are called by the drivers for some workload, in our case the test applications described in Section 4.5.4. Figures 7.5 and 7.6 show the difference in profile for the two drivers. The x-axes show the services called by the two drivers and the y-axes show the number of calls made to each service. The two figures clearly show that cerfio_serial calls a higher number of services, and more frequently, than 91C111.

[Figure: bar chart of the number of invocations per service.]
Figure 7.5: Call profile for cerfio_serial

[Figure: bar chart of the number of invocations per service.]
Figure 7.6: Call profile of 91C111

cerfio_serial uses 41 services. On average a service is invoked 30.5 times for the given workload with a standard deviation of 53.5 and a median of 2. This reflects the fact that there are some services that are used frequently (for reading/writing, synchronization etc.) and some only once or twice (like configuration of the device). For 91C111 the average number of invocations is 5.4 with a median of 1 and a standard deviation of 11.7.

Both figures show that many services are called multiple times, indicating that first-occurrence may not find all vulnerabilities. The difference between the two drivers also suggests that, as more services are called frequently for cerfio_serial, it is to be expected that the call block approach will be more effective for this driver.

[Figure: the target system (test applications, operating system and libraries, tracker, injector, target driver) and the host computer running the experiment manager (experiment setup, experiment synchronization, logging, restarting).]
Figure 7.7: The experimental setup.

7.4.2 Error Model
Building on the results from Chapter 6 the bit-flip (BF) model was chosen to evaluate the two approaches. The model was chosen as it is the most vulnerability-revealing model of the ones evaluated in Chapter 6 and will therefore better explore the potential of both approaches. Focus will be put on the most severe class of failures and experiments are therefore focused on the import interface of the drivers, which was the only one to experience Class 3 failures in Chapter 6.

7.4.3 Injection
The experimental setup used for the injection differs slightly from the one used in the previous chapter in that a new Interceptor module (tracker) has been introduced between the OS and the targeted driver. The experimental setup is shown in Figure 7.7. For the call block approach the injector is configured to inject errors when the tracker signals that the targeted call block is reached. Apart from these changes the setup remains the same as previously described.

Before injections can be made, a profiling execution of the driver is performed. During this error-free execution the tracker module records all calls being made to the driver, i.e., it records the call string for the driver. For the two drivers and the workload used for these experiments the call strings were both deterministic, i.e., every time the workload was executed without injecting errors the same call string was generated. This is a simplification and a result of choosing a simple and deterministic workload. Using a deterministic workload reduced the triggering of injections in a specific call block to counting calls made to OS services. Section 7.6 further discusses this issue.
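With a deterministic workload, triggering inside a given call block reduces to a call counter, which can be sketched as follows. The profile format, the function and class names and the example data are hypothetical; the sketch only illustrates the counting idea and is not the tracker or injector code.

```python
def call_count_for_target(profile, block, service):
    """Return k such that the k-th call to `service` is its first call in `block`.

    `profile` is the error-free run: a list of (call block, OS service) pairs.
    """
    count = 0
    for current_block, called in profile:
        if called == service:
            count += 1
            if current_block == block:
                return count
    raise LookupError(f"{service} is never called in call block {block}")


class CountingTrigger:
    """Injects in the k-th call to one OS service during the injection run."""

    def __init__(self, service, k):
        self.service, self.k, self.seen = service, k, 0

    def should_inject(self, called):
        if called != self.service:
            return False
        self.seen += 1
        return self.seen == self.k


# Hypothetical profile of an error-free run (block, OS service called).
PROFILE = [("delta", "LocalAlloc"), ("alpha", "LocalAlloc"),
           ("alpha", "memcpy"), ("beta1", "memcpy"),
           ("beta1", "LocalAlloc"), ("gamma1", "memcpy")]
```

The counter is only valid as long as the injection run follows the same call string as the profiling run, which is the determinism assumption stated above.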

7.4.4 Call Strings and Call Blocks
This section reports on the call strings and call blocks identified for the two targeted drivers.

Serial Port Driver
The workload for cerfio_serial first writes a string of characters to the serial port which are read by the host computer connected to it. The host computer echoes the same string back and the characters are read one by one by the application. This process is then repeated once more. The workload generates calls to the driver, forming the call string shown in Figure 7.8, represented as a regular expression. In total the call string for the serial port driver contains 152 tokens.

The first call performed is an initialization call to the driver, DllMain. Such an entry point exists for each driver. After this follows a series of calls to perform the services requested.

The tokens used in Figure 7.8 are listed in Table 7.1, which shows the Stream interface entry points provided by cerfio_serial. Additionally, the DllMain function called when the DLL is loaded is assigned the token D. More detailed information on the Stream interface and device drivers for Windows CE .Net can be found in [Boling, 2003].

Table 7.1: Stream interface for serial driver.
Number  Name            Purpose
0       COM_Init        Initializing the driver.
1       COM_Deinit      Deprecated.
2       COM_Open        Open a connection to the device or a file.
3       COM_Close       Close a previously opened connection or file.
4       COM_Read        Read from an open connection or file.
5       COM_Write       Write to an open connection or file.
6       COM_Seek        Move within the file. Usually does not work on connection-oriented devices.
7       COM_IOControl   Send control commands to the device. These are typically device specific.
8       COM_PowerDown   Tell the device to move to a power saving state.
9       COM_PowerUp     Tell the device to come back from power saving states.

D02775(747){23}732775(747){23}73
Figure 7.8: The serial driver call string.

The calls to DllMain and to COM_Init make up the initialization phase of the driver. The working phase (which of course is workload dependent) consists of a series of calls to COM_Open, COM_Read, COM_Write, COM_Close and COM_IOControl. The clean up phase of the workload finishes with the call to COM_Close. The pattern is then repeated once more.

The call string in Figure 7.8 is split into call blocks, as illustrated in Table 7.2 and Figure 7.9. Five call blocks are identified (δ, α, β, γ and ω), some of which are repeating. For the remainder of the presentation in this chapter we term the targeted call blocks δ, α, β1, γ1, ω1, β2, γ2 and ω2 as shown along the x-axis in Figure 7.9. Note that the x-axis indicates time as sequences of call blocks, not physical time, i.e., the length of a box does not represent the execution time of the service.
Table 7.2: Call blocks for cerfio_serial.
Call block  Tokens    Occurrences
δ           pre-load  1
α           0         1
β           2775      2
γ           747       46
ω           73        2

For the call blocks that reoccur once (β and ω) we target each instance of the call block. Call block γ repeats 46 times in all, and targeting each instance would clearly be very time consuming. Therefore, we target one instance of γ in each repeating sequence. For the first sequence we target the first instance and for the second we have chosen one instance arbitrarily (the sixth).

[Figure: the call blocks δ, α, β1, γ1, ω1, β2, γ2 and ω2 along the time axis, each labelled with its tokens.]
Figure 7.9: The call blocks for the serial port.

Ethernet Driver
The network card driver workload works in a similar fashion as the cerfio_serial workload. A message is sent over the network and is echoed back by the host computer. However, as the NDIS wrapper is used instead of the Stream interface for the Ethernet driver a slightly different tracker mechanism is used (the driver exports Stream functions as well though, as part of being a proper driver).

Table 7.3: NDIS callback functions for the passthrough wrapper.
Number  Name
1       MiniportInitialize
2       MiniportSend
3       MiniportSendPackets
4       MiniportQueryInformation
5       MiniportProcessSetPowerOid
6       MiniportSetInformation
7       MiniportReturnPacket
8       MiniportTransferData
9       MiniportHalt
10      MiniportReset
11      MiniportCancelSendPackets
12      MiniportDevicePnPEvent
13      MiniportAdapterShutdown

The NDIS architecture is a layered one, with the upper layer being protocol layers (TCP/IP etc.) and the lower layer being the device driver (termed miniport driver). The layered model, with its defined interfaces, makes it possible to introduce new filtering layers in-between existing layers (which is what some firewalls and anti-virus software do). A passthrough wrapper is implemented and added on top of the miniport driver targeted (91C111.dll). The passthrough wrapper, as the name suggests, does not alter the data in any way; it simply passes it through to the miniport driver. The purpose is only to track the calls made to the driver. The callback functions exported are summarized in Table 7.3. Less than ten functions were exercised for our workload, which allowed us to still use single numerical tokens for the call string. Note that this is no real limitation to the approach, since any alphabet could have been used.
D1(4){9}1(4){15}666(4){10}666444336663444(3){9}
Figure 7.10: The Ethernet driver call string.

The call string for the Ethernet driver is shown in Figure 7.10. Again, the first call is to DllMain. Additionally the driver has a specific setup export (DriverEntry); together these form the first entry in the call string (D).

Table 7.4: Call blocks for the Ethernet driver.
Call block  Tokens               Occurrences
δ           DllMain+DriverEntry  1
βα144444444444444412
2444666γ14444444µω33666636443333333331
As for cerfio_serial a manual inspection gave rise to the call blocks illustrated in Figure 7.11. As the Ethernet driver gave rise to significantly fewer call blocks we target all call blocks using any services. For some call blocks no OS services were used, and consequently no injections were performed. Similarly, the last call block (ω) gives rise to no calls made to the OS. Therefore, it was not further split into call blocks.

[Figure: the call blocks δ, α1, α2, β, γ1, µ, γ2 and ω along the time axis.]
Figure 7.11: The network driver call blocks.

7.5 Result of Evaluation
Fault injection experiments were carried out for both drivers. Injections were performed using both the first-occurrence approach and the proposed call block approach. An overview of the results is shown in Table 7.5 and a more detailed view is shown in Table 7.6. Note the use of the previously introduced names for identifying the call blocks. The results are graphically illustrated in Figures 7.12 and 7.13. A discussion and further interpretation of these results is found in Section 7.6.

Table 7.5: Comparing the first-occurrence and call block approaches on the number of injections and the number of experiment outcomes in the most severe failure class (Class 3).
                  Serial driver            Network driver
Trigger           #Injections  #Class 3    #Injections  #Class 3
First occurrence  2428         10          1818         12
Call blocks       8408         13          2356         12
7.5.1SerialPortDriver
ThedistributionofoutcomesacrossfailureclassesisshowninFigure7.12.
Forcomparisonpurposestheoutcomeofthefirst-occurrenceinjectionsare
shownaswell.Additionallythenumberofinjectionsforeachcallblockis
shownasopaque,blackbars.ForillustrativepurposestheClassNFisnot
shown,butalldataisfoundinTable7.6.
Callblocksγ1andγ2exhibitthelowestnumberofClass3failures.These
callblockscorrespondtotheworkingphaseofthedriver,wereitisonlysend-
ingandreceivingdata.Sincetheworkdoneinthisphasecorrespondstothe
mainpartofadriver’slifetime,itisreasonabletoexpectittobesufficiently
wellspecifiedandunderstoodfortheOStobeimplementedtoleratingmany

136

CHAPTER7.ERRORTIMINGMODELS

canfluctuationalsobeinobservdevice/drived,witherγb2ehahavingvior.aAslighsmalltlyhigherdifferenceratio.inClass2behavior
Comparedtotheworkingphase,theinitializationphaseofthedriver
shophasewsaofthehighertestratioofapplicationClass(3β)shofailuresws(aδhighandα).ratioofSimilarlyClass,the3failures.initializationThe
andsameω2holdsshowforclosethetocleanidenupticalphaseofdistributions,thetestβ2shoapplicationwsmore(ω).Class1Whereasfailuresω1
.βthan1

Class 2

Class 1

#Injections

#InjectionsClass 1 Class 2 Class 3 45.040.0 200035.030.0 150025.0Number of injectionsFailure class distributions in percent20.0 100015.010.0 5005.00.0FOδαβ1γ1ω1β2γ2ω2 0
Call blocks

oFigureccurrence7.12:Fapproacailurehclassandforeacdistributionhcallbloandcknuofmbcererfioofserial.injectionsforthefirst-

From Figure 7.12 it can also be seen that the first initialization call block δ is less prone to Class 3 failures than the second call block in the initialization phase, α. When DllMain is called (i.e., in δ) driver developers are discouraged (in the documentation) from including any time consuming or complex operations, restricting it to only initialization of synchronization objects and other lightweight operations. This minimizes calls made to the OS in this critical phase of the system. Consequently we observe fewer Class 3 failures. However, as these operations may now fail, with the whole driver being unable to make progress, we see a rise in Class 2 failures instead compared to α.

Comparing with the first-occurrence injections it can clearly be seen that the first-occurrence approach gives very similar results as the call blocks in the initialization phase. This behavior is of course expected, since it injects in the first call to each service used by the driver.

Not only the distribution across failure classes is of interest, but also the number of services found to have severe (Class 3) failures. Table 7.7 presents the result for cerfio_serial. Of the 41 services being targeted, 13 services caused Class 3 failures. Compared with the first-occurrence approach this is an increase of three services, i.e., three services not previously found to have Class 3 failures were identified. This indicates that choosing the triggering event for injection for the serial port driver has a significant impact on the results obtained and that first-occurrence is not sufficient for a comprehensive evaluation.

Table 7.6: Injection results for the serial driver (cerfio_serial.dll) and the network card driver (91C111.dll).
Driver             Call block  #Injections  NF #  NF %     C1 #  C1 %    C2 #  C2 %    C3 #  C3 %
cerfio_serial.dll  FO          2428         1889  76.98%   35    1.44%   450   18.53%  74    3.05%
cerfio_serial.dll  δ           1217         923   75.84%   18    1.48%   269   22.10%  7     0.58%
cerfio_serial.dll  α           1098         969   88.25%   9     0.82%   103   9.38%   17    1.55%
cerfio_serial.dll  β1          1356         1082  79.79%   5     0.37%   218   16.08%  51    3.76%
cerfio_serial.dll  γ1          580          565   97.41%   2     0.34%   13    2.24%   0     0.00%
cerfio_serial.dll  ω1          1124         979   87.10%   103   9.16%   24    2.14%   18    1.60%
cerfio_serial.dll  β2          1328         979   73.72%   161   12.12%  145   10.92%  43    3.24%
cerfio_serial.dll  γ2          580          550   94.83%   2     0.34%   28    4.83%   0     0.00%
cerfio_serial.dll  ω2          1125         984   87.47%   90    8.00%   32    2.84%   19    1.69%
91C111.dll         FO          1818         1288  70.85%   416   22.88%  50    2.75%   64    3.52%
91C111.dll         δ           1756         1159  66.00%   494   28.13%  32    1.82%   71    4.04%
91C111.dll         α1          96           96    100.00%  0     0.00%   0     0.00%   0     0.00%
91C111.dll         α2          171          101   59.06%   65    38.01%  1     0.58%   4     2.34%
91C111.dll         β           171          99    57.89%   65    38.01%  4     2.34%   3     1.75%
91C111.dll         γ1          171          102   59.65%   64    37.43%  2     1.17%   3     1.75%
91C111.dll         µ           0            -     -        -     -       -     -       -     -
91C111.dll         γ2          0            -     -        -     -       -     -       -     -
91C111.dll         ω           0            -     -        -     -       -     -       -     -

Table 7.7: The services having Class 3 failures for cerfio_serial. Services not identified by first-occurrence are marked with X.
Service / Call block  FO  δ  α  β1  γ1  ω1  β2  γ2  ω2
xxxCreateThreadEventModifyDisableThreadLibraryCallsxxXX
xxreeLibraryFxxranslateBusAddressHalTXInitializeCriticalSectionInterlocLoadLibraryWkedDecrementxxX
LomemcpcalAlloycxxxxx
xxxmemsetxMmMapIoSpaceTSetProcPransBusAddrTermissionsoStaticxXxx

xxx

X

7.5.2 Ethernet driver
Table 7.6 and Figure 7.13 show that call block δ shows a very similar distribution as the first-occurrence approach. For call blocks µ, γ2 and ω the driver does not perform any calls to the OS, and consequently no injections were performed and these call blocks are not shown in Figure 7.13.

The call blocks α2, β and γ1 show a very similar behavior, due to the fact that only one service is used for all three call blocks. Therefore the same number of injections are performed. No injections in call block α1 show any Class 3 failures. Overall no new services were found to have Class 3 failures (see Table 7.5).

[Figure: failure class distributions in percent (Class 1, Class 2, Class 3) and number of injections for FO and the call blocks δ, α1, α2, β and γ1.]
Figure 7.13: Failure class distribution and number of injections for each call block of 91C111. Call blocks without injections are excluded.

7.6 Discussion
This section interprets and discusses the results of the case study. Focus is mostly on the most severe class of failures, Class 3.

7.6.1 Difference in Driver Types
The results for the two drivers differ significantly. For the serial driver three new services were found to experience Class 3 failures. The Ethernet driver on the other hand did not get any additional service vulnerabilities with the new call block approach.

Section 7.4.1 shows that the serial driver uses more services, and more frequently, than the Ethernet driver, suggesting that it would be more susceptible to call block injections, and this is indeed also the case. For the Ethernet driver no new Class 3 failures were identified and the δ call block shows a very similar behavior to the first-occurrence injections, further substantiating the intuition.

Further analysis on the calls made for each call block is presented in Figures 7.14 and 7.15. Figure 7.14 shows the call profile for cerfio_serial, i.e., the number of calls made by the driver for each of the call blocks identified. It can clearly be observed that the calls made are spread throughout the lifetime of the driver, with call block β having the most calls.

[Figure: number of calls made by the driver in each call block (δ, α, β1, γ1, ω1, β2, γ2, ω2).]
Figure 7.14: The call profile for the serial driver.

Figure 7.15 shows the call profile for the Ethernet driver. It shows that the Ethernet driver is more active in the initialization phase of the driver (to the left in the figures) than in the working phase. Compared with cerfio_serial in Figure 7.14 the distribution of calls is clearly geared towards the initialization phase.

This explains why the call block approach finds more new vulnerabilities for the serial driver than for the Ethernet driver. It confirms that for drivers which perform few calls, made mainly during the initialization phase, first-occurrence is to be preferred.

The profiling of the drivers shows that the effectiveness of the approach can to some degree be predicted and suggests that a profiling of the targeted drivers should be conducted before triggering techniques are selected, to minimize the time for implementation and number of injections required. Note that such profiling can be conducted prior to injecting any errors! However, profiles such as Figures 7.14 and 7.15 do require call blocks to be defined.

[Figure: number of calls made by the driver in each call block of the Ethernet driver.]
Figure 7.15: The call profile for the Ethernet driver.

7.6.2 Comparing with First Occurrence
The first-occurrence approach has several distinct advantages compared to the call block approach. First and foremost it uses fewer injections (Table 7.5). It is also appropriate when doing code level injections, where the location of the modeled fault coincides with the injected error.

The higher number of injections for the call block approach translates into longer time required for performing the evaluation. This can be somewhat lessened by performing pre-profiling to remove non-significant experiments. Unfortunately, the higher number of injections is inherently inevitable since the call block approach injects in multiple invocations of a service, whereas first-occurrence only injects in the first call to a service.

The additional costs give rise to a trade-off with the usefulness of the results. New service vulnerabilities can indeed be identified using the call block approach, as shown in Section 7.5. For the serial driver, first-occurrence found ten Class 3 services, whereas the call block injections found thirteen. This represents an increase of 30%. On the other hand, for the Ethernet driver no additional services are identified due to its call profile being geared towards the initial phase.

7.6.3 Identifying Call Blocks
The length of a call string varies depending on the workload used to generate it. Manual inspection was sufficient to identify call blocks for the experiments presented in this chapter. However, this will not be feasible for longer call strings, for which some level of automation is required.

The repeating nature of call blocks is what makes them identifiable in the call string. Identifying a repeating sequence of tokens from a given alphabet in a string is a well studied problem. One example of its use is the identification of repeating sequences in DNA. Multiple data structures and algorithms have been devised for this purpose. Many examples of such problems can be found for instance in [Gusfield, 1997]. We have explored the use of suffix trees, a special tree data structure, which finds repeating sequences quickly. However, further research is needed to develop appropriate tools and techniques to fully take advantage of this data structure.
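To make the idea of automatic identification concrete, the following Python sketch lists every token sequence that occurs more than once in a call string. It is a naive n-gram count, deliberately not a suffix tree, so it is only practical for short call strings; the function name and the example tokens are assumptions made for illustration.

```python
from collections import defaultdict


def repeated_sequences(call_string, min_len=2, min_count=2):
    """Map each token sequence of length >= min_len that starts at
    min_count or more positions to the list of its start indices."""
    n = len(call_string)
    found = {}
    for length in range(min_len, n):
        positions = defaultdict(list)
        for start in range(n - length + 1):
            window = tuple(call_string[start:start + length])
            positions[window].append(start)
        for sequence, starts in positions.items():
            if len(starts) >= min_count:
                found[sequence] = starts
    return found


# Hypothetical call string: the block (a, b, c) is executed twice.
call_string = ["d", "a", "b", "c", "a", "b", "c", "w"]
repeats = repeated_sequences(call_string)
longest = max(repeats, key=len)
print(longest, repeats[longest])  # ('a', 'b', 'c') [1, 4]
```

The longest repeated sequence is a natural call block candidate, although a human evaluator would still have to confirm that it corresponds to a higher level operation on the driver.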

Figures 7.12 and 7.13 show that some call blocks are more useful in identifying Class 3 failures. These call blocks typically belong to either the initialization or cleanup phases of the driver and workload. By focusing mostly on the initialization (like first-occurrence) and cleanup phases one could potentially reduce the number of injections required.

7.6.4 Workload
With operational phases and call blocks forming the basis for the usage profile, it is important to identify representative workloads to be used to drive experiments. Representative here means resembling the system’s expected workload once it becomes operational. In many cases no, or only partial, information is available regarding the expected operational profile. If it is not known, a common approach is to use a synthetic workload, exercising the system in a diverse way [Johansson, 2001]. The alternative to using a synthetic workload is to use real-world applications. However, many applications are not suitable to be used directly, due to required user inputs, non-determinism etc.

As call blocks represent higher level operations carried out on the driver, it is important that the call string is stable across runs, i.e., that the same call string is generated every time the workload is executed. The workload used in this thesis is indeed stable and deterministic, and gives rise to the same call string for each run. This is an important property for any workload used for a comparative purpose and is a well established approach in the benchmarking community, like the standard tests in SPEC [SPE]. A deterministic workload is key for achieving reproducibility, an important property identified for dependability benchmarks [Johansson, 2001; Kanoun et al., 2005]. As one still strives after using real-world workloads, an approach is to modify applications by removing sources of non-determinism, such as user inputs, by using specific user scenarios or use cases.

Another source of non-determinism for device drivers is the fact that they can, in general, be accessed by several applications concurrently. Depending on the semantics of the device (or driver) this may be supported or not. A serial driver does for instance not generally accept concurrent accesses, whereas a network card driver typically does. For the experiments in this thesis we have deliberately focused on single-access application scenarios. This is a simplification and extending the approach to handle concurrency is a topic for future research. Concurrent accesses would give rise to interleaved tokens in the call string, belonging to different processes/tasks. Furthermore, the implementation of triggers will be more complex than the current one (based on simple counters), since online pattern recognition is required. Whether concurrent access should at all be considered for comparative (benchmarking) purposes can be discussed, as it gives rise to potential issues with reproducibility.

7.6.5 Error Duration
As previously described we have used a transient duration model for the injected errors, i.e., each error appears once and then disappears for subsequent invocations of the same services. Our injection framework does support intermittent (error disappears after n invocations) as well as permanent errors. However, none of these models have been evaluated yet.
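
The three duration models can be summarized in a small Python sketch. This is only an illustration of the definitions above, not the framework's implementation; the function name, the model labels and the 1-based invocation counter are assumptions.

```python
def should_inject(model, invocation, n=3):
    """Decide whether the error is present on a given invocation.

    `invocation` is the 1-based count of calls to the targeted service,
    starting at the call on which the trigger fired.
    """
    if model == "transient":
        # The error appears once and is gone on subsequent invocations.
        return invocation == 1
    if model == "intermittent":
        # The error disappears after n invocations.
        return invocation <= n
    if model == "permanent":
        # The error is present on every invocation.
        return True
    raise ValueError(f"unknown duration model: {model}")


for model in ("transient", "intermittent", "permanent"):
    print(model, [should_inject(model, i, n=2) for i in range(1, 5)])
```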

7.6.6 Timing Errors
For this work we have considered the time of injection. This is not to be confused with timing errors, which change the timing behavior of the system by for instance delaying or dropping calls being issued. Since many devices require specific timing requirements to be met, this may be a relevant error model for evaluating device drivers and OS’s. Experimental support for this error model is implemented in the fault injection framework developed, but no systematic evaluation has yet been carried out using this model.

7.7 Related Work
The call block strategy is one of several that use a profile of the system to trigger injection. In [Tsai et al., 1999] stress-based injections are performed, where injection is synchronized with high workload activity in the system. Similarly, application resource usage is analyzed to guide injection into resources actively used.

Tsai and Singh [2000] used a setup very similar to ours, but with the intent to test applications on Windows NT by corruption of parameter values to library calls. The first-occurrence strategy is used, and a comment is made regarding injection in subsequent calls: “...preliminary results showed that such injections produced similar results.” [Tsai and Singh, 2000, page 4]. We believe that this assertion does not generally hold. However, the results for the network driver show that depending on the call profile of the targeted component, different behaviors are observed. This suggests that more research is needed to completely characterize for which components first-occurrence is most suitable and for which not.

In this work we do not consider distributed systems explicitly. For distributed systems the concept of global state is of key importance and one may want to inject errors at particular local and/or global states. Whereas the technique presented here is suitable for local states, it does not handle global states of distributed systems, where even detection of specific global events/states is difficult. Some work has been done within this specific area, for instance the Loki tool [Chandra et al., 2004].

7.8 Summary of Research Contributions
The time at which a fault is injected can have an impact on the robustness evaluation of a system. In the context of device drivers for OS’s, this chapter has established that selection of the triggering events (controlling error timing) does impact the evaluation results. Furthermore, it is shown that a profiling of the driver reveals its sensitivity to the timing of injections.

The following distinct contributions are put forward in this chapter:

• A novel timing model for device drivers is presented. The new model is based on the concept of a call block, a subsequence of calls to the driver corresponding to higher level operations.

• It is detailed how the proposed model can be used to identify injection triggers for device drivers used in fault injection.

• A large case study for two device drivers shows that selecting the error timing impacts the robustness evaluation, especially for drivers which actively use OS services throughout their lifetime.

Chapter 8

Conclusion & Future Research

What have we learned, and how do we move forward?

This chapter concludes the thesis by summarizing its main contributions and discussing their relevance to the research community.

Additionally, this chapter aims to broaden the scope of the techniques used by surveying and discussing their usability in enhancing the dependability of OS’s. Different areas of dependability enhancements are discussed, covering both fault-removal and fault-tolerance techniques.

The work in this thesis forms the basis for many interesting new research directions. The thesis is therefore concluded by scoping out multiple future directions of research. This includes both refinements and extensions as well as new exciting problems warranting further research.

8.1 Contributions

This section recaps the research questions posed in Chapter 1 and discusses the individual contributions made and their relevance.

8.1.1 Category 1: Conceptual
Research Question 1: How do errors in device drivers propagate in an OS? What is a good model for identification of such propagation paths?

Errors in device drivers have been shown to cause many failures in OS’s and to propagate to applications running on the OS. Our model of an OS allows us to capture both errors causing severe failures in the system and those data errors propagating to applications through the OS.

The results presented in Chapter 6 clearly show that there is a significant difference across services regarding error propagation. Some services are identified as robust, i.e., severe system states cannot be provoked through them. Other services can lead to severe consequences, including a complete system crash. Furthermore, it is shown that the context in which the service is used, e.g., the driver using it, has a significant impact on its damage potential.

Research Question 2: What are quantifiable measures of robustness profiling of OS’s?

Chapter 5 presents a framework with which error propagation across the OS can be estimated. The measures presented allow for identifying individual services more vulnerable to propagating errors, either as sources or sinks. As such they are useful for discriminating services based on susceptibility to propagating errors, which gives developers hints on which services are more likely to experience problems during runtime. Furthermore, they allow discrimination across drivers and applications, which can be used to prioritize across multiple contending drivers/applications or for verification resource planning by managers. Components with higher exposure or diffusion should be the first targets for improvements.

Overall, the measures defined provide evaluators with the tools to make informed decisions, on various levels.

8.1.2 Category 2: Experimental Validation
Research Question 3: Where to inject? Where are errors representing faults in drivers best injected? What are the advantages and disadvantages of different locations?

The use of standard interfaces is beneficial since it facilitates both injection of errors and observing their effects with minimal intrusion on the system studied. It gives clear and easily interpretable feedback to the evaluator on where errors propagate and which services and drivers are more likely to spread them if they are present.

Research Question 4: What to inject? Which error model should be used for robustness evaluation? What are the trade-offs that can be made?

Chapter 6 evaluates three contemporary error models, chosen based on their suitability for injection at the OS-Driver interface and their prior use for this purpose. The contributions here are two-fold. First, to the best of our knowledge this thesis presents the first comprehensive comparison across multiple error models at the interface level. This study highlights the strengths and weaknesses of the models and the differences across them. Secondly, our comparison evaluates the models on the number of identified failures, coverage of services, execution time, efficiency and implementation complexity. This allows selecting the most appropriate error model, based on trade-offs across the parameters.

For the case study performed, bit-flips reveal the most severe (Class 3) failures and provoke failures in the highest number of services. Additional services having severe failures can be identified using the Fuzzing error model. The lowest number of injections was incurred by the data type error model.

Chapter 6 further shows how a new composite model can be defined, combining the bit-flip and Fuzzing error models to achieve a good trade-off between efficiency, coverage and number of injections performed.
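
As a rough illustration of why the three models differ in injection count, the Python sketch below generates corrupted values for a single 32-bit integer parameter under each model. It is a simplification made for this summary, not the models as implemented in Chapter 6: the function names, the number of fuzzing samples and the particular boundary values chosen for the data type model are all assumptions.

```python
import random


def bit_flips(value, width=32):
    """Bit-flip model: every value that differs from `value` in exactly
    one of the `width` low-order bits (one injection per bit)."""
    return [value ^ (1 << bit) for bit in range(width)]


def fuzz(width=32, count=5, seed=0):
    """Fuzzing model: `count` pseudo-random values of `width` bits.
    A fixed seed keeps the experiment reproducible."""
    rng = random.Random(seed)
    return [rng.getrandbits(width) for _ in range(count)]


def boundary_values_int32():
    """Data type model: a few hand-picked values for a signed 32-bit
    integer (illustrative choice of boundary cases)."""
    return [0, 1, -1, 2**31 - 1, -2**31]


print(len(bit_flips(0x1234)))        # 32 injections for one parameter
print(bit_flips(5, width=4))         # [4, 7, 1, 13]
print(len(boundary_values_int32()))  # 5 injections for one parameter
```

Under these assumptions the bit-flip model needs one injection per bit of every parameter, whereas the data type model needs only a handful of values per type, which is consistent with the data type model incurring the fewest injections.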

Research Question 5: When to inject? Which timing model should be used for injection?

The timing model used controls the time at which errors are injected, i.e., the events triggering injections. Classically, errors are injected either on first-occurrence, i.e., the first time a service is called, or according to some predefined time distribution. Chapter 7 proposes a novel timing model based on the usage profile of the component interface, in this case the OS-Driver interface. By first profiling the operations performed on the driver, injections can be concentrated on operations using the concept of call blocks, i.e., repeating sequences of calls.

The new timing model allows for more focused injections, giving more comprehensive results, without requiring deep knowledge of the services in the OS-Driver interface. Furthermore, the profiling of drivers reveals that certain types of drivers are more sensitive in the initialization phase, whereas some are sensitive also in the working and cleanup phases, suggesting that the first-occurrence approach may be more suitable for the former case.

8.1.3 Injection Framework
For carrying out all the fault injection experiments required, a flexible and scalable fault injection framework for Windows CE .Net has been implemented. The framework allows for easy and fast extension to new error models using a plugin model for error models. The flexible architecture makes it easy to implement new error models and to incorporate new drivers.
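
The plugin idea can be sketched in a few lines of Python. This is not the architecture of the Windows CE .Net framework, whose interfaces are not reproduced here; the registry, the `ErrorModel` base class and the two example plugins are hypothetical names used only to show how a new error model can be added without touching the injection logic.

```python
from abc import ABC, abstractmethod

# Registry mapping a model name to the class implementing it.
REGISTRY = {}


def register(name):
    """Class decorator that makes an error model available by name."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


class ErrorModel(ABC):
    """Interface every error model plugin has to implement."""

    @abstractmethod
    def corrupt(self, value: int) -> int:
        """Return the corrupted version of a parameter value."""


@register("bitflip")
class BitFlip(ErrorModel):
    def __init__(self, bit=0):
        self.bit = bit

    def corrupt(self, value):
        return value ^ (1 << self.bit)


@register("offbyone")
class OffByOne(ErrorModel):
    def corrupt(self, value):
        return value + 1


def inject(model_name, value, **options):
    """Injection logic: looks the model up by name, knows no model details."""
    return REGISTRY[model_name](**options).corrupt(value)


print(inject("bitflip", 8, bit=3))  # 0
print(inject("offbyone", 41))       # 42
```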

8.2 Applications of Robustness Evaluation

This section illustrates how the robustness evaluation framework presented in the previous chapters can be used to enhance the robustness of an OS. Such evaluations can serve primarily three purposes, a) as support in the testing of systems, b) as a source for developer feedback, helping developers build more robust code by highlighting potential robustness bottlenecks using robustness profiles, and c) as input to active robustness enhancing activities, such as addition of error detection and recovery modules. The following sections detail each of these potential uses of robustness evaluation.

8.2.1 Robustness Profiling
Robustness evaluation of platforms, such as OS’s, is valuable because it can provide useful feedback to the developers of applications built on top of such platforms. By providing so called robustness profiles, a developer is made aware of potential robustness vulnerabilities in the system and is better equipped to make decisions on which OS services to use and the consequences that might come from using them.

Robustness profiles give information on where potential robustness bottlenecks may exist in a system or component. The robustness profile can, for instance, consist of a subset of the measures presented in Chapter 5. The information gained can be used to raise awareness among developers on the consequences faults can have for specific services. When possible, developers can elect to use different services to achieve the same goal in a safer manner. Robustness profiles can be used as input to testers, who can focus testing on those parts of the system using vulnerable services. Robustness profiles can also be used to focus code inspections and design reviews on those parts more likely to cause damage in the system.

8.2.2 Robustness Evaluation in Testing
Robustness evaluation can be considered a special branch of testing, where focus is put on the non-functional requirements of the system. Typically a robustness evaluation requires an acceptable level of functionality to be present in the system being evaluated, i.e., the used workload must successfully execute on the OS without any errors. This implies that functional testing of the system has been carried out.

There are three phases of testing where robustness evaluation focusing on device drivers may be of great assistance:
• Acceptance Testing: To verify that the OS and its drivers behave at a reasonable level
• Integration Testing: To verify that a driver can be integrated and used in the system
• Regression Testing: When major configuration changes have been performed the robustness of the system needs to be re-evaluated

Acceptance Testing
As part of the requirements for a specific software component, requirements on robustness may be included. This may involve specifying which services may propagate errors and/or at which severity, for instance by specifying that no service may cause a crash of the system, no matter which values it is used with. As such the presented robustness evaluation framework may be used to validate or invalidate such properties.

Integration Testing
When integrating components with each other one needs to make sure that interaction across components works as expected. Furthermore, one may be interested in evaluating the consequences faults in one component have on the other component(s). When misbehaving components are able to cause severe failures they may either require additional focused verification efforts or may need to be equipped with error handling capabilities.

The technique presented in this thesis is well suited for testing the integration of new drivers in the OS as it works on the OS-Driver interface level, which is where the interaction takes place. It is important to note though that the error propagation profiling is not aimed at evaluating drivers specifically, but the OS-Driver interaction. Therefore, detected vulnerabilities for a new driver should not automatically be “blamed” solely on the driver. Similarly, robustness profiling does not focus on the functionality of a specific driver, and is therefore a complement to functional testing, not a replacement.

Regression Testing
As components evolve as part of the development process their error propagation abilities may change, for the better or the worse. As part of regression testing campaigns error propagation can be evaluated such that new propagation paths are detected as soon as possible and may be treated appropriately.

The scalability and automation possibilities of interface-based fault injection make it excellent for regression testing. Using the pre-profiling approach described, the injections are adapted to any changes in workload on the system. Additional injections for new services are easily defined. New error models require minimal changes to the system before inclusion.

8.2.3 Robustness Enhancing Wrappers
In many cases modifications to system components exhibiting robustness vulnerabilities are not possible, or even desirable. This is for instance the case for pure black-box systems, where the lack of access to source code prohibits any modifications. Even with access to source code, legal reasons may prohibit modifying the code. Typical robustness enhancing modifications include addition of error checking and handling code, such as executable assertions [Voas and Miller, 1994b; Hiller, 2000]. For systems geared towards high performance it may not be viable to add time-consuming checks to the components involved, especially for general-purpose systems such as OS’s.

An attractive alternative to modifying the involved components is to add new components “wrapping” the original component [Fraser et al., 1999; Ghosh et al., 1999; Mitchem et al., 2000]. Such wrappers can be added where needed and thus be applied on a policy basis or where likely to be most effective [Hiller et al., 2002a]. Error propagation analysis can be used to identify prominent propagation paths, such as in [Hiller et al., 2002a].

The data collected from fault injection experiments can be used to design assertions, which can be implemented as wrappers [Voas, 1997b; Whisnant et al., 2004]. It can also be used to enhance the wrapper design by other means.

Several research projects have looked into the use of wrappers for enhancing OS robustness and security. In [Arlat et al., 2002] the authors describe fault injection campaigns carried out on two microkernel-based OS’s. Robustness enhancing wrappers are added to some functional components of the OS by formally defining predicates that must hold over the course of the execution. This requires access to internal states of the OS which is implemented using reflection. It is noted that even when such access is possible, the definition of (correct) predicates is time-consuming and difficult. The required formal models of behavior may in practice make the approach extremely difficult to implement for general purpose COTS OS’s. The authors propose to instead use operational consistency checks, such as acceptance or validity checks.
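
The general shape of a validity-checking wrapper can be sketched in Python as follows. It is not the mechanism of any of the cited systems; the decorator, the error code and the wrapped `checksum` function are hypothetical and only illustrate that the original component is left unmodified while invalid arguments are intercepted before reaching it.

```python
import functools

ERROR_INVALID_PARAMETER = -1  # hypothetical safe return code


def checked(precondition, safe_return):
    """Wrap a function with a validity check on its arguments. If the
    check fails, `safe_return` is returned and the function is not called."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not precondition(*args, **kwargs):
                return safe_return
            return func(*args, **kwargs)
        return wrapper
    return decorator


@checked(lambda buf, length: 0 <= length <= len(buf), ERROR_INVALID_PARAMETER)
def checksum(buf, length):
    """Original component: trusts that `length` fits the buffer."""
    return sum(buf[:length]) % 256


print(checksum(b"\x01\x02\x03", 2))   # 3
print(checksum(b"\x01\x02\x03", 10))  # -1, rejected by the wrapper
```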

In [Fetzer and Xiao, 2002b,a] wrappers are used to track non-robust arguments to C libraries made by applications. Stateful wrappers keep track of memory allocations on the heap and stack and can verify that accesses are only made to allocated memory, presented first in [Fetzer and Xiao, 2001] (a similar technique is presented in [DeVale and Koopman, 2001] for exception hardening of I/O libraries). For memory not present on the heap or on the stack, signal handlers are set up to track any access violations. Furthermore, data structures are validated using existing validation functions provided by the system and when such functions are not available, state information is kept similar to memory allocation to verify the correctness of arguments. If a violation is found, a safe return code is returned to the application.

Nooks is an add-on subsystem to an OS, protecting the OS from a vast majority of failures in device drivers [Swift et al., 2005]. Drivers are isolated (i.e., wrapped) within lightweight protection domains. All interaction with the kernel is tracked, both to isolate failures and to facilitate cleanup procedures. The protection is achieved by limiting a driver’s write access to kernel memory and by kernel object tracking mechanisms. Nooks can protect against memory violations and kernel structure corruption. Fault injection was used to validate the approach and can be used to define specific parameter checks to improve failure isolation. In later work [Swift et al., 2006], the authors extended the recovery capabilities of the system by introducing shadow drivers. Shadow drivers temporarily take over while drivers are reloaded and restarted. The system also handles state information transfer to the newly started driver, making recovery transparent to the user of the system. The microreboot strategy is a promising recovery approach, as it avoids time consuming and possibly disruptive system reboots [Candea et al., 2004; Herder et al., 2007].

8.3 Outlook on the Future

This section reflects on the themes presented by discussing and speculating on future steps in research needed for them to further evolve.

8.3.1 Fault Injection Technology
The fault injection performed for this thesis is based on intercepting calls made in device drivers. Thus it is based on real calls made in the system, i.e., a workload is needed to generate the required calls. This is in contrast to approaches where test harnesses are set up to simulate operational conditions, where each service can be tested in isolation. The benefit of the latter approach is that more injections can be performed per time unit, but on the other hand it requires operational conditions to be set up, which may be difficult to do, especially for low-level system software, such as device drivers. Since both approaches have merits, a comprehensive evaluation of both on a larger project would give insights and guidance on where one is more useful than the other.

For the fault injection approach presented here to gain widespread acceptance and adoption it needs to be incorporated into a proper toolset. Such a toolset must minimize the semantical burden on the evaluator and automate the process of evaluation as much as possible, while still allowing for user-driven extensibility and scalability. These areas of the presented approach need to be handled by the tool:

• Profiling of the targeted driver, including identification of all used services and automatic generation of injection wrapper.
• Provide a selection of error models that the evaluator can choose from, as well as a standard interface for adding custom error models.
• Automatically perform the injections and collect the required logs and store them in a database.
• Provide the evaluator with the mechanisms to automatically calculate relevant measures, including error propagation.
• Additionally, throughout the process data must be stored in open formats, e.g., XML, enabling integration with external tools and future enhancements.

8.3.2 Error Propagation
There are many uses for information on error propagation, some already discussed previously in this chapter. In this thesis a four-graded scale has been used to classify each experiment into different failure classes. Propagation is then typically studied on a failure class basis. The scale used is based on severity, without any specific system in mind. However, further refinement of the scale may be useful for specific systems, specifically incorporating application level information, such as data corruption or other application specific failures of different severity. Developing guidelines for including application specific failures and still preserving comparative capabilities is an interesting challenge.

A related issue to the inclusion of application-specific information in failure classification is to further investigate the role of the workload selection on the outcome of the evaluation. It is well established that the used workload should as closely as possible resemble the real workload on the system, i.e., one uses the operational profile of the system [Musa, 1993]. However, the used workload can also have an impact on the failure revealing capabilities of the evaluation, where some workloads may be more likely to expose propagating errors than others. For instance, applications containing some level of error checking and correction may, transparently to the user, be able to handle (i.e., not reporting or showing effect of) many of the propagating errors. When such applications are used “as is” they may hide important robustness information from the evaluator. This suggests that the best option would be to use “error revealing” applications, which is what is done in this thesis. Due to these conflicting goals, the composition of an “efficient” workload becomes complicated, as it might not reflect the actual use of the system and therefore skew the results. More research is needed into identification of both useful and realistic workloads to be used in conjunction with fault injection experiments.

Which services may affect an application can be found out by studying the Service Exposure measure. This way individual services can be identified and the designer of the application can verify if these services are used in the application in the first place, and if so, that they are used properly and that propagating errors are handled. However, this process is complex and can be time-consuming, especially since it needs to be redone when changes have been made to the application. A new research direction is therefore to define Application Exposure measures, capturing how applications are affected by propagating errors. Then, techniques for assessment of such effects need to be found, for instance using fault injection. This is a challenging task, especially for cases when no source code is available. Finally, Application Exposure and robustness profiles of the OS can be composed into a system-level robustness profile, considering the specific applications running on the system.

8.3.3 Error Models
Chapter 6 evaluated the appropriateness of three error models and the results clearly favor simpler models, such as bit-flips and fuzzing, over semantically richer models such as the data type model. This is also in line with current advances in random testing [Hamlet, 2006; Pacheco et al., 2007]. However, the results presented here must be interpreted in light of the specific case study where they were found. Therefore, more research is needed in the area of software error models, both for simpler and more complex models. The composite model presented shows that, at least for specific systems/contexts, models may have to be combined to be most effective. These models, even though covering a wide spectrum of properties, are of course not complete. The set of models evaluated should therefore be enlarged, especially considering code level faults, such as mutations, to make a more comprehensive selection of models available for system evaluators.

Also on the data level, all three models can be extended. The data-level errors can for instance be extended with semantic knowledge of the functions tested in the interface. Fuzzing can be extended using advances in random testing [Hamlet, 2006]. Bit-flips can be further extended to incorporate multiple flips (extended from the SEU model) and further extend the work on selective bit injections.

Another important research direction is to validate the proposed robustness evaluation methodology as part of a structured development process. This would allow for a stronger connection with the “bug-revealing” capabilities of the chosen error models, an important aspect not focused on in this thesis.

8.3.4 Error Timing
The timing model presented was developed within the context of device drivers. However, we believe that it could be of much more general use in component-based testing. In systems where components are seen as black boxes and robustness evaluation is warranted, the selection of triggering events is as difficult as for device drivers. An extension of the work to such systems could potentially further validate its usefulness and make it more accessible to developers.

The drivers evaluated had no concurrent access patterns, simplifying the analysis. Further research is needed to handle concurrent access patterns.

For making the call block technique more approachable it should be based on automatic identification of call blocks from a given call string. Whether complete automation is possible is still an open question, but supporting tools can be developed based on pattern recognition techniques, such as suffix trees. Initial prototype tools have been implemented and show some promise. More research is needed in development of algorithms and tools to handle larger data sets.

8.4 Practical Lessons Learned
During the process of working on the material for this thesis several (non-scientific) lessons were learned. This section aims to list some of these lessons, and the purpose is to share our experience with these systems. Some of these may be obvious to experienced researchers, but we still hope they can be useful for young researchers and developers.

Lesson 1: Store structured data
Moving to a structured storage of data (i.e., a database vs. simple text files) has an associated overhead, in terms of time, effort and skills required. Our experience is that this cost is small compared to the benefits gained. Having data in a database makes it easy to change analysis tools, to modify the analysis or extend it. It also simplifies accessing the data as tools already exist to work with databases. It is also easy to search the data for inconsistencies, arising due to software bugs or incomplete log files, something which otherwise may be hard when the amount of data increases rapidly.
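
As a small illustration of this lesson, the Python sketch below stores experiment outcomes in an in-memory SQLite database and then runs one analysis query and one consistency query. The schema, the driver and service names and the outcomes are invented for the example; they are not the database or the data used in this thesis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE experiment (
           id INTEGER PRIMARY KEY,
           driver TEXT NOT NULL,
           service TEXT NOT NULL,
           error_model TEXT NOT NULL,
           failure_class INTEGER  -- NULL when the log file was incomplete
       )"""
)

# Hypothetical experiment outcomes (failure classes 0 to 3).
rows = [
    ("serial", "memcpy", "bitflip", 3),
    ("serial", "malloc", "fuzzing", 0),
    ("ethernet", "memcpy", "bitflip", 1),
    ("ethernet", "free", "datatype", None),
]
conn.executemany(
    "INSERT INTO experiment (driver, service, error_model, failure_class) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# Analysis: services with at least one Class 3 outcome.
class3 = [r[0] for r in conn.execute(
    "SELECT DISTINCT service FROM experiment "
    "WHERE failure_class = 3 ORDER BY service")]

# Consistency check: experiments without a recorded outcome.
incomplete = conn.execute(
    "SELECT COUNT(*) FROM experiment WHERE failure_class IS NULL"
).fetchone()[0]

print(class3, incomplete)  # ['memcpy'] 1
```

Changing the analysis then only means changing a query, and the same query mechanism doubles as a check for incomplete or inconsistent logs.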

Lesson 2: Use revision control
A key to any successful programming task is securing the code from accidental (or malicious) changes. This not only includes having a structured backup system (that is also tested!), but also using revision control for the source code. This simplifies the task, even when there is only one developer, for instance when working on multiple machines. Additionally, putting the results (the raw log files in our case) under revision control is also beneficial, as losing such files may destroy many hours of experimentation.

Lesson 3: Don’t trust the documentation!
In many cases documentation can be outdated, or incomplete. When the system doesn’t behave the way the documentation states it should, it may be that the documentation is outdated and not that something is wrong. This is especially true for articles written before the release of a software (white papers). Make sure that you have the latest version of the documentation and search the Internet for users experiencing similar problems. Newsgroups and web forums are good places to get answers.

Lesson 4: Use the right tool for the job
Typically there are many tools (such as programming languages) that may be used to accomplish a given task. They differ in ease of use and feature set. For instance, many high-level programming languages (such as Java/.Net) allow for very rapid development using modern development environments and provide simple to use interfaces to for instance build intuitive user interfaces, interacting with databases etc. Choose the tool that is best suited for the problem, considering the time it requires to implement the solution and the possibilities of extending it for future needs. Choosing a tool based only on familiarity might not be the best decision in the long run.

A last word from the author

This thesis has focused on identifying vulnerabilities and weaknesses in software systems, rather than on increasing the dependability of such systems. As a last comment on my work I would therefore like to paraphrase a comment made by Jim Gray in 1990. Namely that it is after all possible to build truly fault tolerant systems (containing software) having a mean time between failures of several years or more using the right techniques and tools [Gray, 1990]. It is encouraging as a software engineer to know that such goals are indeed achievable.

Bibliography

BOINC: Berkeley Open Infrastructure for Network Computing. Project web site. URL http://boinc.berkeley.edu/. Accessed 2007-10-27.

The Ballista Project. Project web site. URL http://www.ballista.org. Accessed 2007-10-27.

The EU-IST Dependability Benchmarking project (DBench). Project web site. URL http://www.dbench.org. Accessed 2007-10-27.

The Embedded Microprocessor Benchmark Consortium (EEMBC). Consortium web site. URL http://www.eembc.org. Accessed 2007-10-27.

FreeBSD Kernel Stress Test Suite. Online collection of tests. URL http://people.freebsd.org/~pho/stress/index.html. Accessed 2007-10-27.

IEEE Standard Glossary of Software Engineering. IEEE Standard 610.12-1990, December 1990.

Intrinsyc Software. Company web site. URL http://www.intrinsyc.com. Accessed 2007-10-27.

The LINPACK. Web page. URL http://www.netlib.org/linpack/. Accessed 2007-10-27.

PROTOS - Security Testing of Protocol Implementations. Project web site. URL http://www.ee.oulu.fi/research/ouspg/protos/. Accessed 2007-10-27.

The Standard Performance Evaluation Corporation (SPEC). Organization web site. URL http://www.spec.org. Accessed 2007-10-27.

The Transaction Processing Performance Council (TPC). Organization web site. URL http://www.tpc.org. Accessed 2007-10-27.

159

160

BIBLIOGRAPHY

Transact-SQL Reference for SQL Server 2005. Online reference document. URL http://msdn2.microsoft.com/en-us/library/ms189826.aspx. Accessed 2007-10-27.

Arnaud Albinet, Jean Arlat, and Jean-Charles Fabre. Characterization of the Impact of Faulty Drivers on the Robustness of the Linux Kernel. In Proceedings of the International Conference on Dependable Systems and Networks, pages 807–816, 2004.

Jean Arlat, Martine Aguera, Louis Amat, Yves Crouzet, Jean-Charles Fabre, Jean-Claude Laprie, Eliane Martins, and David Powell. Fault Injection for Dependability Validation: A Methodology and Some Applications. IEEE Transactions on Software Engineering, 16(2):166–182, February 1990.

Jean Arlat, Alain Costes, Yves Crouzet, Jean-Claude Laprie, and David Powell. Fault Injection and Dependability Evaluation of Fault-Tolerant Systems. IEEE Transactions on Computers, 42(8):913–923, August 1993.

Jean Arlat, Jean-Charles Fabre, Manuel Rodriguez, and Frederic Salles. Dependability of COTS Microkernel-Based Systems. IEEE Transactions on Computers, 51(2):138–163, February 2002.

Algirdas Avižienis, Jean-Claude Laprie, Brian Randell, and Carl Landwehr. Basic Concepts and Taxonomy of Dependable and Secure Computing. IEEE Transactions on Dependable and Secure Computing, 1(1):11–33, January-March 2004.

Thomas Ball and Sriram Rajamani. The SLAM Project: Debugging System Software via Static Analysis. In Proceedings of Symposium on Principles of Programming Languages, pages 1–3, 2002.

James H. Barton, Edward W. Czeck, Zary Z. Segall, and Daniel P. Siewiorek. Fault Injection Experiments Using FIAT. IEEE Transactions on Computers, 39(4):575–582, April 1990.

Douglas Boling. Programming Microsoft Windows CE .Net. Microsoft Press, third edition, 2003.

Aaron B. Brown and David A. Patterson. To Err is Human. In Proceedings of the First Workshop on Evaluating and Architecting System dependabilitY (EASY), 2001.

Aaron B. Brown, Leonard C. Chung, and David A. Patterson. Including the Human Factor in Dependability Benchmarks. In Proceedings of the DSN Workshop on Dependability Benchmarking, pages F9–14, 2002.


George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, and Armando Fox. Microreboot - A Technique for Cheap Recovery. In Proceedings of the Symposium on Operating System Design and Implementation, pages 31–44, 2004.

João Carreira, Henrique Madeira, and João Gabriel Silva. Xception: A Technique for the Experimental Evaluation of Dependability in Modern Computers. IEEE Transactions on Software Engineering, 24(2):125–136, February 1998.

João Viegas Carreira, Diamantino Costa, and João Gabriel Silva. Fault Injection Spot-Checks Computer System Dependability. IEEE Spectrum, 36(8):50–55, August 1999.

George J. Carrette. The web site for crashme. URL http://people.delphiforums.com/gjc/crashme.html. Accessed 2007-10-27.

Ramesh Chandra, Ryan M. Lefever, Kaustubh R. Joshi, Michel Cukier, and William H. Sanders. A Global-State-Triggered Fault Injector for Distributed System Evaluation. IEEE Transactions on Parallel and Distributed Systems, 15(7):593–605, 2004.

Shuo Chen, Jun Xu, Ravishankar K. Iyer, and Keith Whisnant. Evaluating the Security Threat of Firewall Data Corruption Caused by Instruction Transient Errors. In Proceedings of International Conference on Dependable Systems and Networks, pages 495–504, 2002.

Ram Chillarege. Handbook on Software Reliability Engineering, chapter Orthogonal Defect Classification, pages 359–400. McGraw-Hill, 1996.

Ram Chillarege and Nicholas S. Bowen. Understanding Large System Failures - A Fault Injection Experiment. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 356–363, 1989.

Ram Chillarege, Inderpal S. Bhandari, Jarir K. Chaar, Michael J. Halliday, Diane S. Moebus, Bonnie K. Ray, and Man-Yuen Wong. Orthogonal Defect Classification - A Concept for In-Process Measurements. IEEE Transactions on Software Engineering, 18(11):943–956, November 1992.

Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson R. Engler. An Empirical Study of Operating System Errors. In Proceedings of Symposium on Operating Systems Principles, pages 73–88, 2001.


Jörgen Christmansson and Ram Chillarege. Generation of an Error Set that Emulates Software Faults Based on Field Data. In International Symposium on Fault Tolerant Computing, pages 304–313, 1996.

Jörgen Christmansson, Martin Hiller, and Martin Rimén. An Experimental Comparison of Fault and Error Injection. In Proceedings of the International Conference on Software Reliability Engineering, pages 378–396, 1998.

Jeffrey Clark and Dhiraj K. Pradhan. Fault Injection: A Method for Validating Computer-System Dependability. IEEE Computer, 28(6):47–56, June 1995.

Christian Constantinescu. Neutron SER Characterization of Microprocessors. In Proceedings of the International Conference on Dependable Systems and Networks, pages 754–759, 2005.

Christian Constantinescu. Trends and Challenges in VLSI Circuit Reliability. IEEE Micro, 23(4):14–19, 2003.

Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman. Linux Device Drivers. O'Reilly, third edition, February 2005. URL http://www.oreilly.com/catalog/linuxdrive3/book/index.csp.

Richard A. DeMillo, Richard J. Lipton, and Frederick G. Sayward. Hints on Test Data Selection: Help for the Practicing Programmer. IEEE Computer, 11(4):34–41, 1978.

John DeVale and Philip Koopman. Performance Evaluation of Exception Handling in I/O Libraries. In Proceedings of the International Conference on Dependable Systems and Networks, pages 519–524, July 2001.

John DeVale, Philip Koopman, and David Guttendorf. The Ballista Software Robustness Testing Service. In Proceedings of the Testing Computer Software Conference, 1999.

Christopher P. Dingman, Joe Marshall, and Daniel P. Siewiorek. Measuring Robustness of a Fault-Tolerant Aerospace System. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 522–526, 1995.

Wenliang Du and Aditya P. Mathur. Testing for Software Vulnerability Using Environment Perturbation. In Proceedings of the International Conference on Dependable Systems and Networks, pages 603–612, 2000.


João Durães and Henrique Madeira. Multidimensional Characterization of the Impact of Faulty Drivers on the Operating System Behavior. IEICE Transactions, E86-D(12):2563–2570, December 2003.

João Durães and Henrique Madeira. Emulation of Software Faults by Educated Mutations at Machine-Code Level. In Proceedings of the International Symposium on Software Reliability Engineering, pages 329–340, 2002.

João Durães and Henrique Madeira. Emulation of Software Faults: A Field Data Study and a Practical Approach. IEEE Transactions on Software Engineering, 32(11):849–867, 2006.

Christof Fetzer and Zhen Xiao. A Flexible Generator Architecture for Improving Software Dependability. In Proceedings of the International Symposium on Software Reliability Engineering, pages 102–113, 2002a.

Christof Fetzer and Zhen Xiao. Detecting Heap Smashing Attacks Through Fault Containment Wrappers. In Proceedings of IEEE Symposium on Reliable Distributed Systems, pages 80–89, 2001.

Christof Fetzer and Zhen Xiao. An Automated Approach to Increasing the Robustness of C Libraries. In Proceedings of the International Conference on Dependable Systems and Networks, pages 155–164, June 2002b.

Justin E. Forrester and Barton P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the USENIX Windows Systems Symposium, pages 59–68, 2000.

Timothy Fraser, Lee Badger, and Mark Feldman. Hardening COTS Software with Generic Software Wrappers. In Proceedings of IEEE Symposium on Security and Privacy, pages 2–16, 1999.

Archana Ganapathi and David Patterson. Crash Data Collection: A Windows Case Study. In Proceedings of the International Conference on Dependable Systems and Networks, pages 280–285, 2005.

Archana Ganapathi, Viji Ganapathi, and David Patterson. Windows XP Kernel Crash Analysis. In Proceedings of Large Installation System Administration Conference, 2006. URL http://www.cs.berkeley.edu/~archanag/publications/lisa.pdf.

Anup Ghosh, Matthew Schmid, and Shah. Testing the Robustness of Windows NT Software. In Proceedings of International Symposium on Software Reliability Engineering, pages 231–235, 1998.


Anup Ghosh, Matthew Schmid, and Frank Hill. Wrapping Windows NT Software for Robustness. In Proceedings of International Symposium on Fault-Tolerant Computing, pages 344–347, 1999.

Patrice Godefroid, Michael Y. Levin, and David Molnar. Automated Whitebox Fuzz Testing. Microsoft Technical Report MSR-TR-2007-91, Microsoft Research, July 2007.

Jim Gray. A Census of Tandem System Availability Between 1985 and 1990. IEEE Transactions on Reliability, 39(4):409–418, 1990. ISSN 0018-9529.

Jim Gray. Why Do Computers Stop and What Can We Do About It. Technical Report TR85.5, Tandem, 1985.

Michael Grottke and Kishor S. Trivedi. Fighting Bugs: Remove, Retry, Replicate, and Rejuvenate. Computer, 40(2):107–109, 2007.

Weining Gu, Zbigniew Kalbarczyk, Ravishankar K. Iyer, and Zhenyu Yang. Characterization of Linux Kernel Behavior under Errors. In Proceedings of the International Conference on Dependable Systems and Networks, pages 459–468, 2003.

Weining Gu, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. Error Sensitivity of the Linux Kernel Executing on PowerPC G4 and Pentium 4 Processors. In Proceedings of the International Conference on Dependable Systems and Networks, pages 887–896, 2004.

Dan Gusfield. Algorithms on Strings, Trees and Sequences. Cambridge University Press, 1997.

Dick Hamlet. When Only Random Testing Will Do. In Proceedings of the International Workshop on Random Testing, pages 1–9, 2006. ISBN 1-59593-457-X. doi: http://doi.acm.org/10.1145/1145735.1145737.

Seungjae Han, Kang G. Shin, and Harold A. Rosenberg. DOCTOR: An Integrated Software Fault Injection Environment for Distributed Real-Time Systems. In Proceedings of the International Computer Performance and Dependability Symposium, pages 204–213, 1995.

Jane Huffman Hayes and Jeff Offutt. Input validation analysis and testing. Empirical Software Engineering, 11(4):493–522, December 2006.

John L. Henning. SPEC CPU2000: measuring CPU performance in the new millennium. IEEE Computer, 33(7):28–35, 2000.


Jorrit N. Herder, Herbert Bos, Ben Gras, Philip Homburg, and Andrew S. Tanenbaum. Failure resilience for device drivers. In Proceedings of the International Conference on Dependable Systems and Networks, pages 41–50, 2007.

Martin Hiller. Executable Assertions for Detecting Data Errors in Embedded Control Systems. In Proceedings of the International Conference on Dependable Systems and Networks, pages 24–33, 2000.

Martin Hiller. A Software Profiling Methodology for Design and Assessment of Dependable Software. Ph.D. Thesis, Department of Computer Engineering, Chalmers University of Technology, Göteborg, Sweden, 2002.

Martin Hiller, Arshad Jhumka, and Neeraj Suri. On the Placement of Software Mechanisms for Detection of Data Errors. In Proceedings of the International Conference on Dependable Systems and Networks, pages 135–144, 2002a.

Martin Hiller, Arshad Jhumka, and Neeraj Suri. PROPANE: An Environment for Examining the Propagation of Errors in Software. In Proceedings of International Symposium on Software Testing and Analysis, pages 81–85, July 2002b.

Martin Hiller, Arshad Jhumka, and Neeraj Suri. EPIC: Profiling the Propagation and Effect of Data Errors in Software. IEEE Transactions on Computers, 53(5):512–530, May 2004.

Michael Howard and Steve Lipner. The Security Development Lifecycle. Microsoft Press, first edition, 2006.

Mei-Chen Hsueh, Timothy K. Tsai, and Ravishankar K. Iyer. Fault Injection Techniques and Tools. IEEE Computer, 30(4):75–82, April 1997.

Ravishankar Iyer and Paola Velardi. Hardware-Related Software Errors: Measurement and Analysis. IEEE Transactions on Software Engineering, SE-11(2):223–231, 1985.

Tahar Jarboui, Jean Arlat, Yves Crouzet, and Karama Kanoun. Experimental Analysis of the Errors Induced into Linux by Three Fault Injection Techniques. In International Conference on Dependable Systems and Networks, pages 331–336, 2002a.

Tahar Jarboui, Jean Arlat, Yves Crouzet, Karama Kanoun, and Thomas Marteau. Analysis of the Effects of Real and Injected Software Faults. In Proceedings of the Pacific Rim International Symposium on Dependable Computing, pages 51–58, 2002b.

Tahar Jarboui, Jean Arlat, Yves Crouzet, Karama Kanoun, and Thomas Marteau. Impact of Internal and External Software Faults on the Linux Kernel. IEICE Transactions on Information and Systems, E86-D(12):2571–2578, 2003.

Steve Jobs. Keynote talk at Apple World Wide Developers Conference, 2006.

Andréas Johansson. Dependability Benchmarking. Master's thesis, Department of Computer Engineering, Chalmers University of Technology, Göteborg, Sweden, 2001.

Ali Kalakech, Tahar Jarboui, Jean Arlat, Yves Crouzet, and Karama Kanoun. Benchmarking Operating System Dependability: Windows 2000 as a Case Study. In Proceedings of the Pacific Rim International Symposium on Dependable Computing, pages 261–270, 2004a.

Ali Kalakech, Karama Kanoun, Yves Crouzet, and Jean Arlat. Benchmarking The Dependability of Windows NT4, 2000 and XP. In Proceedings of the International Conference on Dependable Systems and Networks, pages 681–686, 2004b.

Ghani A. Kanawati, Nasser A. Kanawati, and Jacob A. Abraham. FERRARI: A Tool for the Validation of System Dependability Properties. In International Symposium on Fault-Tolerant Computing, pages 336–344, 1992.

Ghani A. Kanawati, Nasser A. Kanawati, and Jacob A. Abraham. FERRARI: A Flexible Software-Based Fault and Error Injection System. IEEE Transactions on Computers, 44(2):248–260, February 1995.

Cem Kaner, Jack Falk, and Hung Q. Nguyen. Testing Computer Software. John Wiley & Sons, 1999.

Karama Kanoun, Jean Arlat, Diamantino J. G. Costa, Mario Dal Cin, Pedro Gil, Jean-Claude Laprie, Henrique Madeira, and Neeraj Suri. DBench (Dependability Benchmarking). In Workshop on The European Dependability Initiative, pages D12–15, 2001.

Karama Kanoun, Yves Crouzet, Ali Kalakech, Ana-Elena Rugina, and Philippe Rumeau. Benchmarking the Dependability of Windows and Linux using PostMark™ Workloads. In Proceedings of the International Symposium on Software Reliability Engineering, pages 11–20, 2005.


Wei-Lun Kao and Ravishankar K. Iyer. DEFINE: A Distributed Fault Injection and Monitoring Environment. In Workshop on Fault-Tolerant Parallel and Distributed Systems, pages 252–259, 1994.

Wei-Lun Kao, Ravishankar K. Iyer, and Dong Tang. FINE: A Fault Injection and Monitoring Environment for Tracing the UNIX System Behavior under Faults. IEEE Transactions on Software Engineering, 19(11):1105–1118, November 1993.

Johan Karlsson, Peter Liden, Peter Dahlgren, Rolf Johansson, and Ulf Gunneflo. Using Heavy-Ion Radiation to Validate Fault-Handling Mechanisms. IEEE Micro, 14(1):8–23, February 1994. ISSN 0272-1732.

Philip Koopman. Toward a scalable method for quantifying aspects of fault tolerance, software assurance, and computer security. In Computer Security, Dependability and Assurance: From Needs to Solutions, 1998. Proceedings, pages 103–131, 1999.

Philip Koopman and John DeVale. Comparing the Robustness of POSIX Operating Systems. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 72–79, 1999.

Philip Koopman and John DeVale. The Exception Handling Effectiveness of POSIX Operating Systems. IEEE Transactions on Software Engineering, 26(9):837–848, September 2000.

Philip Koopman, John Sung, Christopher Dingman, Daniel Siewiorek, and Ted Marz. Comparing Operating Systems Using Robustness Benchmarks. In Proceedings of the Symposium on Reliable Distributed Systems, pages 72–79, 1997.

Nathan P. Kropp, Philip J. Koopman, and Daniel P. Siewiorek. Automated Robustness Testing of Off-the-Shelf Software Components. In Proceedings of the International Symposium on Fault Tolerant Computing, pages 230–239, 1998.

Sanjeev Kumar and Kai Li. Using Model Checking to Debug Device Firmware. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation, 2002.

Jean-Claude Laprie, editor. Dependability: Basic Concepts and Terminology. Springer-Verlag, 1992.


Inhwan Lee and Ravishankar K. Iyer. Faults, Symptoms, and Software Fault Tolerance in the Tandem GUARDIAN90 Operating System. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 20–29, 1993.

Nancy G. Leveson. Safeware: System Safety and Computers. Addison-Wesley, 1995.

Henrique Madeira, Diamantino Costa, and Marco Vieira. On the Emulation of Software Faults by Software Fault Injection. In Proceedings of the International Conference on Dependable Systems and Networks, pages 417–426, June 2000.

Henrique Madeira, Raphael R. Some, F. Moreira, Diamantino Costa, and David Rennels. Experimental Evaluation of a COTS System for Space Applications. In Proceedings of the International Conference on Dependable Systems and Networks, pages 325–330, 2002.

Eric Marsden and Jean-Charles Fabre. Failure Mode Analysis of CORBA Service Implementations. In Proceedings of Middleware, volume 2218 of Lecture Notes in Computer Science, pages 216–231. Springer Verlag, 2001.

Manuel Mendonca and Nuno Neves. Robustness Testing of the Windows DDK. In Proceedings of the International Conference on Dependable Systems and Networks, pages 554–564, 2007.

Christoph C. Michael and Ryan C. Jones. On the Uniformity of Error Propagation in Software. In Proceedings of the Annual Conference on Computer Assurance, pages 68–76, 1997.

Visual Studio, Microsoft Portable Executable and Common Object File Format Specification. Microsoft, 8.0 edition, May 2006. URL http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx.

Barton Miller, David Koski, Cjin Pheow Lee, Vivekananda Maganty, Ravi Murthy, Ajitkumar Natarajan, and Jeff Steidl. Fuzz Revisited: A Re-examination of the Reliability of UNIX Utilities and Services. Technical Report 1268, Department of Computer Sciences, University of Wisconsin, 1995.

Barton P. Miller, Louis Fredriksen, and Bryan So. An Empirical Study of the Reliability of UNIX Utilities. Communications of the ACM, 33(12):32–44, December 1990. ISSN 0001-0782. doi: http://doi.acm.org/10.1145/96267.96279.


Barton P. Miller, Gregory Cooksey, and Fredrick Moore. An Empirical Study of the Robustness of MacOS Applications Using Random Testing. In Proceedings of the International Workshop on Random Testing, pages 46–54, 2006.

Terrence Mitchem, Raymond Lu, Richard O'Brien, and Kent Larson. Linux Kernel Loadable Wrappers. In Proceedings of DARPA Information Survivability Conference & Exposition, volume 2, pages 296–307, 2000.

R. Moraes, R. Barbosa, J. Duraes, N. Mendes, E. Martins, and H. Madeira. Injection of faults at component interfaces and inside the component code: are they equivalent? In Proceedings of the Dependable Computing Conference, pages 53–64, 2006.

L. J. Morell. A theory of fault-based testing. IEEE Transactions on Software Engineering, 16(8):844–857, August 1990. ISSN 0098-5589.

MSDN. The Microsoft Developer Network (MSDN). Online reference documents. URL http://www.msdn.microsoft.com. Accessed 2007-10-27.

Arup Mukherjee and Daniel P. Siewiorek. Measuring software dependability by robustness benchmarking. IEEE Transactions on Software Engineering, 23(6):366–378, June 1997. ISSN 0098-5589.

Brendan Murphy. Automating Software Failure Reporting. Queue, 2(8):42–48, 2004.

Brendan Murphy and Björn Levidow. Windows 2000 Dependability. In Proceedings of the Workshop on Dependable Networks and OS, 2000.

John Musa. Operational Profiles in Software-Reliability Engineering. IEEE Software, pages 14–32, March 1993.

Glenford J. Myers. The Art of Software Testing. John Wiley & Sons, second edition, 2004.

Nuno Ferreira Neves, João Antunes, Miguel Correia, Paulo Veríssimo, and Rui Neves. Using Attack Injection to Discover New Vulnerabilities. In Proceedings of the International Conference on Dependable Systems and Networks, pages 457–466, 2006.

Wee Teck Ng and Peter M. Chen. The Design and Verification of the Rio File Cache. IEEE Transactions on Computers, 50(4):322–337, 2001. ISSN 0018-9340.


Peter Oehlert. Violating Assumptions with Fuzzing. IEEE Security & Privacy Magazine, 3(2):58–62, 2005.

Walter Oney. Programming the MS Windows Driver Model. Microsoft Press, 2003.

Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, and Thomas Ball. Feedback-Directed Random Test Generation. In Proceedings of the International Conference on Software Engineering, pages 75–84, Washington, DC, USA, 2007. IEEE Computer Society. ISBN 0-7695-2828-7. doi: http://dx.doi.org/10.1109/ICSE.2007.37.

Jiantao Pan, Philip Koopman, Daniel Siewiorek, Yennun Huang, Robert Gruber, and Mimi Ling Jiang. Robustness Testing and Hardening of CORBA ORB Implementations. In Proceedings of the International Conference on Dependable Systems and Networks, pages 141–150, 2001.

David Powell, G. Bonn, D. Seaton, Paulo Verissimo, and F. Waeselynck. The Delta-4 Approach to Dependability in Open Distributed Computing Systems. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 246–251, 1988.

Dhiraj K. Pradhan, editor. Fault-Tolerant Computer System Design. Prentice Hall, 1996.

Manuel Rodriguez, Arnaud Albinet, and Jean Arlat. MAFALDA-RT: A Tool for Dependability Assessment of Real-Time Systems. In Proceedings of the International Conference on Dependable Systems and Networks, pages 267–272, 2002.

Z. Segall, D. Vrsalovic, D. Siewiorek, D. Yaskin, J. Kownacki, J. Barton, R. Dancey, A. Robinson, and T. Lin. FIAT - Fault Injection Based Automated Testing Environment. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 102–107, 1988.

Charles P. Shelton, Philip Koopman, and Kobey DeVale. Robustness Testing of the Microsoft Win32 API. In Proceedings of the International Conference on Dependable Systems and Networks, 2000.

Kang G. Shin. HARTS: a Distributed Real-Time Architecture. IEEE Computer, 24(5):25–35, May 1991.

Premkishore Shivakumar, Michael Kistler, Stephen W. Keckler, Doug Burger, and Lorenzo Alvisi. Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic. In Proceedings of the International Conference on Dependable Systems and Networks, pages 389–398, 2002.

Daniel Siewiorek, John J. Hudak, Byung-Hoon Suh, and Zary Segall. Development of a Benchmark to Measure System Robustness. In Proceedings of International Symposium on Fault-Tolerant Computing, pages 88–97, 1993.

Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne. Operating System Concepts. John Wiley & Sons, seventh edition, December 2004.

Daniel Simpson. Windows XP Embedded with Service Pack 1 Reliability. Technical report, Microsoft Corporation, 2003. URL http://msdn2.microsoft.com/en-us/library/ms838661.aspx. Accessed 2007-10-27.

David T. Stott, Benjamin Floering, Daniel Burke, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. NFTAPE: A Framework for Assessing Dependability in Distributed Systems with Lightweight Fault Injectors. In Proceedings of the International Symposium on Computer Performance and Dependability, pages 91–100, 2000.

Mark Sullivan and Ram Chillarege. Software Defects and their Impact on System Availability - A Study of Field Failures in Operating Systems. In International Symposium on Fault-Tolerant Computing, pages 2–9, 1991.

Mark Sullivan and Ram Chillarege. A Comparison of Software Defects in Database Management Systems and Operating Systems. In International Symposium on Fault-Tolerant Computing, pages 475–484, 1992.

Martin Süßkraut and Christof Fetzer. Robustness and Security Hardening of COTS Software Libraries. In International Conference on Dependable Systems and Networks, pages 61–71, 2007.

Martin Süßkraut and Christof Fetzer. Automatically Finding and Patching Bad Error Handling. In Proceedings of the European Dependable Computing Conference, pages 13–22, 2006.

Michael M. Swift, Brian N. Bershad, and Henry M. Levy. Improving the Reliability of Commodity Operating Systems. ACM Transactions on Computer Systems, 23(1):77–110, 2005.

Michael M. Swift, Muthukaruppan Annamalai, Brian N. Bershad, and Henry M. Levy. Recovering Device Drivers. ACM Transactions on Computer Systems, 24(4):333–360, November 2006.


Andrew S. Tanenbaum. Modern Operating Systems. Prentice Hall, second edition, 2001.

Timothy Tsai and Navjot Singh. Reliability Testing of Applications on Windows NT. In Proceedings of the International Conference on Dependable Systems and Networks, pages 427–436, 2000.

Timothy K. Tsai and Ravishankar K. Iyer. Measuring Fault Tolerance with the FTAPE Fault Injection Tool. In Proceedings of the Performance Tools/MMB, LNCS 977, pages 26–40. Springer Verlag, 1995.

Timothy K. Tsai, Ravishankar K. Iyer, and Doug Jewitt. An Approach towards Benchmarking of Fault-Tolerant Commercial Systems. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 314–325, 1996.

Timothy K. Tsai, Mei-Chen Hsueh, Hong Zhao, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. Stress-Based and Path-Based Fault Injection. IEEE Transactions on Computers, 48(11):1183–1201, November 1999.

Marco Vieira and Henrique Madeira. Portable Faultloads Based on Operator Faults for DBMS Dependability Benchmarking. In Proceedings of the International Computer Software and Applications Conference, pages 202–209, 2004.

Marco Vieira and Henrique Madeira. Recovery and Performance Balance of a COTS DBMS in the Presence of Operator Faults. In Proceedings of the International Conference on Dependable Systems and Networks, pages 615–624, 2002a.

Marco Vieira and Henrique Madeira. Definition of Faultloads Based on Operator Faults for DBMS Recovery Benchmarking. In Proceedings of the Pacific Rim International Symposium on Dependable Computing, pages 265–272, 2002b.

Jeffrey Voas. Certifying Software for High-Assurance Environments. IEEE Software, 31(6):48–54, July-August 1999.

Jeffrey Voas. Error Propagation Analysis for COTS Systems. IEE Computing & Control Engineering Journal, 8(6):269–272, December 1997a.

Jeffrey Voas. Building Software Recovery Assertions from a Fault Injection-based Propagation Analysis. In Proceedings of the International Computer Software and Applications Conference, pages 505–510, 1997b.


Jeffrey Voas and F. Charron. Tolerant Software Interfaces: Can COTS-based Systems be Trusted Without Them? In Proceedings of the International Conference on Computer Safety, Reliability and Security, 1996.

Jeffrey Voas and Keith W. Miller. Dynamic Testability Analysis for Assessing Fault Tolerance. High Integrity Systems Journal, 1(2):171–178, 1994a.

Jeffrey Voas and Keith W. Miller. Software Testability: the New Verification. IEEE Software, 12(3):17–28, 1995.

Jeffrey Voas and Keith W. Miller. Putting Assertions in Their Place. In Proceedings of the International Symposium on Software Reliability Engineering, pages 152–157, 1994b.

Jeffrey Voas, Gary McGraw, and Anup K. Ghosh. Gluing Software Together: How Good is Your Glue? In Proceedings of the Pacific Northwest Software Quality Conference, October 1996.

Jeffrey Voas, Frank Charron, Gary McGraw, Keith Miller, and Michael Friedman. Predicting How Badly “Good” Software Can Behave. IEEE Software, 14(4):73–83, July-August 1997.

Alan R. Weiss and Richard Clucas. The Standardization of Embedded Benchmarking: The Pitfalls and the Opportunities. In Proceedings of the Embedded Systems Conference, 1999.

Elaine Weyuker. Testing Component-Based Software: A Cautionary Tale. IEEE Software, 15(5):54–59, Sep.–Oct. 1998.

Keith Whisnant, Ravishankar Iyer, Zbigniew Kalbarczyk, Phillip H. Jones III, David Rennels, and Raphael Some. The Effects of an ARMOR-Based SIFT Environment on the Performance and Dependability of User Applications. IEEE Transactions on Software Engineering, 30(4):257–277, April 2004.

James A. Whittaker. How to Break Software. Addison-Wesley, 2003.

Jun Xu, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. Networked Windows NT System Field Failure Data Analysis. In Proceedings of the Pacific Rim International Symposium on Dependable Computing, pages 178–185, 1999.

Jun Xu, Shuo Chen, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. An Experimental Study of Security Vulnerabilities Caused by Errors. In Proceedings of the International Conference on Dependable Systems and Networks, pages 421–432, 2001.

Steven J. Zeil. Testing for Perturbations of Program Statements. IEEE Transactions on Software Engineering, SE-9(3):335–346, May 1983.

CV

Personal Data

Name: Andréas Johansson

Date of birth: March 24, 1977

Place of birth: Falkenberg, Sweden

School Education

1984-1993 Apelskolan, Ullared, Sweden

1993-1996 Falkenbergs Gymnasieskola, Falkenberg, Sweden

University Education

1997-2001 Master of Science in Computer Engineering, Chalmers, Göteborg, Sweden

2002-2008 Ph.D. in Computer Science, Technische Universität Darmstadt, Darmstadt, Germany