technischen_universitat_darmstadt - Andréas Johansson
Published by | technischen_universitat_darmstadt |
Published on | 01 January 2008 |
Robustness Evaluation of Operating Systems

Dissertation approved by the Department of Computer Science (Fachbereich Informatik) of the Technische Universität Darmstadt for the academic degree of Doktor-Ingenieur (Dr.-Ing.)

submitted by
Andréas Johansson
from Falkenberg, Sweden

Referees: Prof. Neeraj Suri, Ph.D.
Prof. Christof Fetzer, Ph.D.
Date of submission: 19.11.2007
Date of oral examination: 14.01.2008

Darmstadt 2008
D17
Summary

The premise behind this thesis is the observation that Operating Systems (OS), being the foundation behind operations of computing systems, are complex entities and also subject to failures. Consequently, when they do fail, the impact is the loss of system service and the applications running thereon. While a multitude of sources for OS failures exist, device drivers are often identified as a prominent cause behind failures.

In order to characterize the impact of driver failures, at both the OS and application levels, this thesis develops a framework for error propagation-based robustness profiling for an OS. The framework is first developed conceptually and then experimentally validated on a real OS, namely Windows CE .Net. The choice of Windows CE is driven by its representativeness for a multitude of OS's, as well as the ability to customize the OS components for particular needs.

For experimental validation, fault injection is a prominent technique that can be used to simulate faults (or errors) in the system by inserting synthetic ones and study their effect. Three key questions with such a technique are where, what and when to inject faults. This thesis shows how injecting errors at the interface between drivers and the OS can be very effective in evaluating the effects driver faults can have.

To quantify the OS's robustness, this thesis defines a series of error propagation measures, specifically tailored for device drivers. These measures allow for quantifying and comparing both individual services and device drivers on their susceptibility and diffusing abilities.

This thesis compares three contemporary error models on their suitability for robustness evaluation. The classical bit-flip model is found to identify a higher number of severe failures in the system. It also identifies failures for more services than both other models, data type and fuzzing. However, its main drawback is that it requires substantially more injections than the other two. Fuzzing, even though not giving rise to as many failures, is able to find new additional services with severe failures.

A careful study of the injections performed with the bit-flip model shows that only a few bits are generally useful for identifying new services with robustness weaknesses. Consequently, a new composite model is proposed, combining the most effective bits of the bit-flip model with the fuzzing model's ability to identify new services, giving rise to a new model without loss of important information and at the same time incurring a moderate number of injections.

To answer the question of when to inject an error this thesis proposes a novel model of a driver's usage profile, focusing on high-level operations being carried out. It guides the injection of errors to instances when the driver is carrying out specific operations. Results from extensive fault injection experiments show that more service vulnerabilities can be discovered. Furthermore, a priori profiling of the drivers can show how effective the proposed approach will be.
Kurzfassung (German abstract)

This dissertation is based on the observation that the operating system, which forms the foundation for the operation of computer systems, has a very complex structure, which can frequently lead to faults in the operating system. When these faults internal to the operating system result in service failures, the applications running on the operating system are endangered as well. Although there are in general many sources of faults, faulty drivers are often named as the most frequent cause.

To characterize the effects of driver defects at the operating system and application levels, this dissertation develops a framework for robustness evaluation based on the propagation of errors. The framework is both developed conceptually and validated experimentally on a real operating system. The chosen operating system, Windows CE .Net, is representative of many other operating systems. It has a modular design, which considerably simplifies adapting the operating system components to different needs.

Fault injection is an important technique for experimental validation, in which faults are simulated by injecting them into the system and observing their consequences. Three important aspects must be considered: which faults should be injected, where, and when. This dissertation shows that fault injection at the interface between the operating system and the drivers is an effective way of assessing the consequences of driver faults.

To quantify the robustness of an operating system, a series of error propagation metrics is defined that are specifically tailored to driver faults. Using these metrics, services and drivers can be compared with respect to their susceptibility and their ability to spread errors.

This dissertation compares three contemporary error models with regard to their suitability for robustness evaluation. The classical bit-flip model identifies severe failures in the system most frequently. More than the two other models, Data Type and Fuzzing, this model also identifies the most services that could lead to failures. The biggest drawback of this model, however, is that it requires a very large number of injections. Fuzzing identifies fewer services, but in return finds new faulty services not detected by bit-flip.

A careful study of the results of the bit-flip model shows that a subset of the bits is already sufficient to identify new services that lead to robustness failures. Consequently, a new composite model is proposed that combines the good properties of the bit-flip model with the ability of the fuzzing model to identify new services. The new model loses no important information and overall requires significantly fewer injections.

To answer the question of when it makes sense to inject faults, a new timing model based on the usage profile of the driver is proposed. The new model is based on the execution of commands in a higher layer. Specific fault injections are performed at the time when specific commands are executed. The results of the fault injections show that many times more failure-prone services can be found. In addition, the usage profile of the driver indicates in advance how effective the new method will be.
Acknowledgements

The path to a Ph.D. is a long, winding one. At the beginning it is wide and spacious, only to become narrower and narrower. Sometimes it goes steeply upwards, sometimes downwards. Sometimes you think you see an opening and light after the next turn, only to find a dead end. There are crossings, where one has to choose which path to pursue. Some paths look more promising than others but you quickly learn that the easy path is not always the shortest.

I am now at the end of the path, only to realize that it is the beginning of a new one. I would not have gotten here without the assistance and encouragement of several people. First of all I would like to thank my guide and mentor, Prof. Neeraj Suri. Thanks for your guidance and support. I would also like to thank all present and former members of the DEEDS group: Martin, Vilgot, Örjan, Arshad, Robert, Adina, Dinu, Marco, Dan, Ripon, Brahim, Faisal, Majid and Matthias. Many thanks also to Birgit, Ute, Sabine and Boyan. A great many thanks also to Prof. Christof Fetzer for accepting to be my opponent.

I am also grateful for funding for conducting my research coming from the projects EC IP DECOS, EC NoE ReSIST and by research grants from Microsoft Research. A part of the research validation was accomplished over an internship at Microsoft Research, Cambridge. A special thanks to Brendan Murphy at MSR for hosting my internship and for all discussions and help writing papers.

Finally to my beautiful and supporting wife, Mia. Thank you for everything!
Contents

1 Introduction
  1.1 Dependability: The Basic Concepts
    1.1.1 Dependability Attributes
    1.1.2 Dependability Threats
    1.1.3 Dependability Means
    1.1.4 Alternate Terminology: Software Engineering
    1.1.5 Bohrbugs and Heisenbugs
  1.2 Robustness Evaluation
  1.3 Thesis Research Questions & Contributions
  1.4 Thesis Structure
2 Background and Context
  2.1 A Short Operating System History
    2.1.1 OS Design
    2.1.2 Device Drivers
    2.1.3 What is the Problem?
  2.2 Sources of Failures of Operating Systems
    2.2.1 Hardware Related
    2.2.2 Software Related
    2.2.3 User Related
    2.2.4 Device Drivers
  2.3 Fault Injection
  2.4 Operating Systems Dependability Evaluation
  2.5 Other Techniques for Verification and Validation
    2.5.1 Formal Methods
    2.5.2 Testing
  2.6 Summary
3 System and Error Model
  3.1 System Model
  3.2 Error Model
    3.2.1 Error Type
    3.2.2 Error Location
    3.2.3 Error Trigger
    3.2.4 Other Contemporary Software Error Models
  3.3 Experimental Environment
    3.3.1 Windows CE .Net
    3.3.2 Device Drivers in Windows CE
    3.3.3 Hardware
    3.3.4 Software
    3.3.5 Selected Drivers for Case Study
  3.4 Summary
4 Fault Injection Framework
  4.1 Introduction
  4.2 Evaluation, Campaign & Run
  4.3 Hardware Setup
  4.4 Software Setup
  4.5 Injection Setup
    4.5.1 Experiment Manager
    4.5.2 Host Computer
    4.5.3 Interceptors
    4.5.4 Test Applications
  4.6 Pre-Profiling
  4.7 Summary of Research Contributions
5 Error Propagation in Operating Systems
  5.1 Introduction
  5.2 Failure Mode Analysis
  5.3 Error Propagation
    5.3.1 Failure Class Distribution
    5.3.2 Error Propagation Measures
    5.3.3 Use of Measures
  5.4 Discussion
  5.5 Related Work
  5.6 Summary of Research Contributions
6 Error Model Evaluation
  6.1 Introduction
  6.2 Considered Error Models
    6.2.1 Data Type Error Model
    6.2.2 Bit-Flip Error Model
    6.2.3 Fuzzing Error Model
  6.3 Error Propagation
    6.3.1 Failure Class Distribution
    6.3.2 Estimating Service Error Permeability
    6.3.3 Service Error Exposure
    6.3.4 Service Error Diffusion
    6.3.5 Driver Error Diffusion
  6.4 Comparing Error Models
    6.4.1 Number of Failures
    6.4.2 Execution Time
    6.4.3 Injection Efficiency
    6.4.4 Coverage: Identifying Services
    6.4.5 Implementation Complexity
  6.5 Composite Error Model
    6.5.1 Distinguishing Control vs Data
    6.5.2 The Number of Injections for Fuzzing
    6.5.3 Composite Model & Effectiveness
  6.6 Discussion
  6.7 Related Work
  6.8 Summary of Research Contributions
7 Error Timing Models
  7.1 Introduction
  7.2 Timing Models
    7.2.1 Event-Trigger
    7.2.2 Time-Trigger
  7.3 Driver Usage Profile
    7.3.1 Call String
    7.3.2 Call Blocks
    7.3.3 Operational Phases
  7.4 Experimental Setup
    7.4.1 Targeted Drivers
    7.4.2 Error Model
    7.4.3 Injection
    7.4.4 Call Strings and Call Blocks
  7.5 Result of Evaluation
    7.5.1 Serial Port Driver
    7.5.2 Ethernet driver
  7.6 Discussion
    7.6.1 Difference in Driver Types
    7.6.2 Comparing with First Occurrence
    7.6.3 Identifying Call Blocks
    7.6.4 Workload
    7.6.5 Error Duration
    7.6.6 Timing Errors
  7.7 Related Work
  7.8 Summary of Research Contributions
8 Conclusion and Future Research
  8.1 Contributions
    8.1.1 Category 1: Conceptual
    8.1.2 Category 2: Experimental Validation
    8.1.3 Injection Framework
  8.2 Applications of Robustness Evaluation
    8.2.1 Robustness Profiling
    8.2.2 Robustness Evaluation in Testing
    8.2.3 Robustness Enhancing Wrappers
  8.3 Outlook on the Future
    8.3.1 Fault Injection Technology
    8.3.2 Error Propagation
    8.3.3 Error Models
    8.3.4 Error Timing
  8.4 Practical Lessons Learned
Bibliography
List of Figures

1.1 The dependability and security tree
1.2 The attributes of dependability and security
1.3 The fault → error → failure process
1.4 The What, Where and When dimensions of fault injection
2.1 Microkernel design
3.1 The system model
3.2 The driver model
3.3 Error manifestation example
3.4 Overview of the Windows CE .Net architecture
4.1 The hardware setup
4.2 Overview of the experimental setup
4.3 Building an OS image
4.4 Injection process
4.5 The data type tracking mechanism
5.1 Error propagation measures
6.1 Injection efficiency
6.2 Class 3 failures for the BF model
6.3 Cumulative number of service with Class 3 failures
6.4 Diffusion stability for the FZ model
6.5 Failure class distribution compared with CO
6.6 Number of injections for the composite model
7.1 Generic call string example
7.2 Example of driver calling services
7.3 Driver operational phases
7.4 Workload operational phases
7.5 Call profile for cerfio serial
7.6 Call profile of 91C111
7.7 Timing experiment setup
7.8 Serial driver call string
7.9 Serial driver call blocks
7.10 Ethernet driver call string
7.11 Ethernet driver call blocks
7.12 Serial driver error timing failure class distribution
7.13 Serial driver call profile
7.14 Ethernet driver error timing failure class distribution
7.15 Ethernet driver call profile
List of Tables

3.1 Overview of data types used
3.2 Data type error cases for type int
3.3 Data type error cases for strings
3.4 Stream interface for serial driver
3.5 Summary of symbols introduced
4.1 Driver services used
5.1 The CRASH severity scale
5.2 Failure classes
5.3 Summary of the error propagation measures introduced
6.1 Targeted drivers
6.2 Results of fault injection for BF model
6.3 Service Error Permeability results - Serial driver
6.4 Service Error Permeability results - Ethernet driver
6.5 Service Error Permeability results - CompactFlash driver
6.6 Service Error Exposure results - Serial port driver
6.7 Service Error Exposure results - Ethernet port driver
6.8 Service Error Exposure results - CompactFlash card driver
6.9 Service Error Diffusion results - BF - cerfio serial
6.10 Service Error Diffusion results - BF - 91C111
6.11 Service Error Diffusion results - BF atadisk
6.12 Driver Error Diffusion results
6.13 Results of fault injection
6.14 Experiment execution times
6.15 Driver Diffusion for Class 3 failures
6.16 Service Error Diffusion results - DT - cerfio serial
6.17 Service Error Diffusion results - DT - 91C111
6.18 Services identified by Class 3 failures
6.19 Comparing import and export services
6.20 Diffusion results for the three drivers
6.21 Required manual reboots
7.1 Stream interface for serial port driver
7.2 Serial driver call blocks
7.3 NDIS functions supported by Ethernet driver
7.4 Ethernet driver call blocks
7.5 Comparison of first-occurrence and call block injection
7.6 Error timing injection results
7.7 Class 3 services for cerfio serial
Chapter 1

Introduction

What is robustness, and why is it important?

As the usage of computers proliferates, a consequence is our increasing reliance on their operations in diverse application environments. The use of computers, and especially computer software, promises many advantages compared to electronic or purely mechanical solutions, including rapid development, flexibility, effective component reuse (both software and hardware), no aging effects etc.

However, software brings about new problems. Fulfilling not only functional requirements, but also requirements on determinism, real-time and dependability properties becomes increasingly difficult. Software engineering tries to handle these problems by structuring the development process. However, that engineering software is difficult has long been known. Leveson notes that as software is divided into components (a well established technique to handle complexity) a new complexity is introduced in the many explicit and implicit interfaces that arise [Leveson, 1995]. Furthermore, the lack of physical constraints makes software inherently more flexible (which is positive) but also gives rise to new, unexpected and unintended interactions (which may be hard to find, quantify and master). In contrast to physical systems, small perturbations in software may give rise to serious failures without much delay. These problems require new methods for building and verifying systems based on software.

A key design model used to handle some of the complexities is to use standard platform components to build applications upon, the OS being the most significant such software platform. The OS forms the basic interface to which applications and services can be built. Consequently a reliance on continued provisioning of correct service is put on the OS, and when this is not the case the system might not function properly.

This thesis addresses the problem of evaluating the robustness of an OS, i.e., to which degree an OS tolerates perturbations in its environment. Such evaluations can serve several purposes, such as gaining information on how the system can fail when operational, guiding verification/validation efforts towards services which are more likely to spread or be the [...] of errors, and to guide the addition of robustness enhancing components where they are most effective.

This chapter first presents the basic terminology used in the thesis and then introduces the area of robustness evaluation. The research problems addressed are presented and discussed together with the contributions provided.
1.1 Dependability: The Basic Concepts

Dependability is the ability of a system to avoid service failures that are more frequent and more severe than is acceptable [Avižienis et al., 2004]. This definition implies that the system is well specified, together with the services it provides, such that failures of the system can be clearly defined and detected. It also requires acceptable service failure frequencies to be established and that the severities of failures are known and can be estimated.

[Avižienis et al., 2004] is a collective effort by the dependable computing community to agree on a set of standard terms. Further definitions can for instance be found in the IEEE Standard Glossary of Software Engineering [IEE, 1990] or in Dependability: Basic Concepts and Terminology, which presents the basic terminology in five different languages [Laprie, 1992]. This section provides a brief introduction to the terms most commonly used in the field and relevant to the work in this thesis.

Dependability can be seen as an umbrella, incorporating several attributes, including not only attributes directly related to functionality, but also attributes related to security. In this thesis, no emphasis is put on security related attributes. They are included and discussed shortly in this chapter for completeness. Figure 1.1 shows an overview of the attributes of dependability, the threats to dependability and the means to achieve dependability.
Figure 1.1: The dependability and security tree, from [Avižienis et al., 2004]. (Tree labels: Dependability and Security; Attributes: Availability, Reliability, Safety, Confidentiality, Integrity, Maintainability; Threats: Faults, Errors, Failures; Means: Fault Prevention, Fault Tolerance, Fault Removal, Fault Forecasting.)
1.1.1 Dependability Attributes

Dependability is a composite term, encompassing several aspects relating both to provision of functionality, security and maintainability:

• Availability - The ability of the system to be ready for use when required
• Reliability - The ability of the system to continuously provide stipulated services for a specified period of time
• Safety - The absence of catastrophic consequences on users and the environment
• Integrity - The absence of improper system alterations
• Confidentiality - The absence of disclosure of confidential information to unauthorized entities
• Maintainability - The ability of the system to undergo repairs and modifications

Dependability and security are obviously related. Availability, for instance, is a concern both from a dependability perspective (lack of service) and from a security perspective (denial of service). Figure 1.2 shows how dependability and security attributes are related.
Figure 1.2: The attributes of dependability and security, from [Avižienis et al., 2004]. (Diagram labels: Dependability, Security, Availability, Reliability, Safety, Confidentiality, Integrity, Maintainability.)
1.1.2 Dependability Threats

To facilitate a discussion regarding the causes, effects, detection and recovery from faults in the system a distinction is made between faults, errors and failures. Faults are the causes of failures in the system by being activated (becoming errors) and then propagating to the outputs of the system and there causing a failure. Figure 1.3 illustrates how a fault propagating to a failure of one component (Component A) can be the input (fault) of another component (Component B) and so on.
Figure 1.3: The fault → error → failure process. (Diagram: a chain across Component A, Component B and Component C, in which the failure of one component is the fault of the next.)
Faults

Faults are the sources for errors and failures of a system or component, including faults appearing during development, physical faults in hardware and interaction faults occurring in interactions with external components. A fault by itself is not sufficient to cause a failure of the system, it must also be activated. Certain triggering conditions are required for the fault to be activated. For instance, the part of the hardware containing the fault must be used, or for software the code containing the fault must be executed. When the fault is activated it becomes an error. As the triggering mechanism for the fault activation process typically is time-dependent, faults in the system can be present without immediately being activated and are then called dormant faults. A good example thereof is a software fault (bug) in a module that is only triggered for certain input values, which may appear at some later point in time as the module is used.

Errors

Errors change the internal state of the system in a way that may cause the system to fail. For this to happen the error must cause a series of chain reactions, where the error is propagated through the system by internal computations. Errors are transformed into other errors in a similar manner as faults are activated. Eventually a propagated error may cause an incorrect service output (or lack of output) violating the specification for the system, i.e., a failure. Thus, also errors can stay dormant in the system, waiting for the triggering conditions for propagation to take place, before they propagate.
Failures

Failures are observed on the outputs of the system and are detectable as deviations from an assumed specification. As illustrated in Figure 1.3 a failure may itself cause a fault in another component. There are multiple facets to failures in a system. Some failures may be of higher criticality than others. Similarly, a failure need not mean a total absence of service provision: some systems can provide a limited level of service, i.e., there is a service degradation.
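
The fault, error and failure chain described above can be illustrated with a small sketch. It is not taken from the thesis: the class `Counter`, the function `component_b` and the deliberately planted bug are hypothetical and only serve to show a dormant fault, its activation into an error, and the propagation of that error into a failure of a second component.

```python
class Counter:
    """Hypothetical Component A. Specification: value always stays in 0..LIMIT-1."""
    LIMIT = 4

    def __init__(self):
        self.value = 0

    def add(self, n):
        # FAULT (planted on purpose): wrap-around is only handled for n == 1.
        # The fault stays dormant as long as callers only use n == 1.
        if n == 1:
            self.value = (self.value + 1) % self.LIMIT
        else:
            # Activation: the internal state may now leave 0..LIMIT-1 (an ERROR).
            self.value += n


def component_b(counter):
    """Hypothetical Component B: uses A's output as an index into its own table."""
    slots = ["a", "b", "c", "d"]
    try:
        return slots[counter.value]
    except IndexError:
        # A's erroneous state reached B's output: an observable FAILURE.
        return "FAILURE"


c = Counter()
c.add(1)
c.add(1)
print(component_b(c))   # fault dormant, correct service
c.add(3)                # fault activated, state becomes erroneous
print(component_b(c))   # error has propagated to a failure
```

Note that the faulty branch can also execute without producing an error (for example `add(2)` starting from 0), which is one reason such faults may stay unnoticed for a long time.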
1.1.3 Dependability Means

There are four ways in which dependability can be achieved and analyzed: fault prevention, fault tolerance, fault removal and fault forecasting. This thesis is mainly concerned with fault forecasting, and to some degree with fault tolerance and removal.

Fault Prevention

The main intent with fault prevention is to avoid introducing faults in the system during its development by use of mature software engineering practices and tools. Faults arising in the field are avoided by the use of high quality hardware.

Fault Tolerance

Fault tolerance is a fundamental switch in mental model compared to fault prevention. In fault prevention one avoids introducing faults in the system. In fault tolerance on the other hand, faults are assumed to be present in the system, due to imperfect design methodologies, aging of hardware, interaction faults with components outside the control of the development team etc. Fault tolerance is based on the premise of error detection and recovery. Errors are detected and recovered from, or errors are masked using redundancy. Dependability is then achieved by tolerating the faults rather than avoiding introducing them, which may be very hard, or too costly.
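
The two fault tolerance strategies named above, masking through redundancy and detection followed by recovery, can be sketched as follows. This is a minimal illustration under assumptions, not a mechanism described in the thesis: the helper names `vote` and `with_recovery` are hypothetical, majority voting stands in for redundancy in general, and a simple retry stands in for recovery.

```python
from collections import Counter


def vote(replica_outputs):
    """Mask a minority of erroneous replica outputs by majority vote."""
    value, count = Counter(replica_outputs).most_common(1)[0]
    if count * 2 <= len(replica_outputs):
        # No strict majority: the error is detected but cannot be masked.
        raise RuntimeError("no majority among replicas")
    return value


def with_recovery(operation, retries=2):
    """Detect an error (signalled by an exception) and recover by retrying."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return operation()
        except Exception as exc:   # detection
            last_error = exc       # recovery attempt: run the operation again
    raise last_error               # recovery failed, report the last error


print(vote([7, 7, 9]))   # the single deviating replica is outvoted
```

Masking hides the error from the caller entirely, while the retry variant only helps against transient errors; a permanent fault would fail on every attempt.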
Fault Removal

Fault removal aims to remove faults either during the development stage of the system or during the operational stage. Development stage methods are broken down into verification and validation, where verification relates to verifying that an implemented system actually implements the specification given, and validation to checking the specification itself. At runtime, diagnosis and compensation techniques can be used to remove faults from the system.

The contributions in this thesis relate mainly to verification, more specifically to dynamic verification, such as testing.

Fault Forecasting

Fault forecasting aims to qualitatively and quantitatively evaluate system behavior in the presence of faults. It aims to establish failure modes of the system and to evaluate other system attributes regarding the dependability of the system. The intent is not the same as in fault removal (e.g., verification) but to establish operational characteristics of the system.

The thrust of this thesis is on robustness evaluation, which is a part of fault forecasting.
1.1.4 Alternate Terminology: Software Engineering

In the area of software engineering a slightly different terminology for dependability facets exists. In software engineering an error represents the mistake made by the programmer that made him/her introduce a flaw in the code, the fault (also known as bug). The consequence of the dormant fault is that it may get activated and then propagate to the software outputs, causing a failure of the component.

In this thesis we will consistently use the terminology from the dependability community as presented in Section 1.1. It allows for a discussion on the representativeness of injected errors and is aligned with the large body of work presented in Chapter 2.
1.1.5 Bohrbugs and Heisenbugs

As stated, our main focus is on faults originating from software. Furthermore, we focus on the subset of software faults that are transient in nature and require complex triggering for activation. These faults, known as Heisenbugs [Gray, 1985], are of key interest because they are less likely to be found using traditional testing techniques. They represent faults that rarely appear in normal circumstances and contexts and are therefore harder to find. The opposite, Bohrbugs, have simpler and deterministic activation conditions and are easily repeatable. Some authors use the term Mandelbug for bugs which, given the exact same conditions, sometimes appear, sometimes not [Grottke and Trivedi, 2007]. Using this terminology Heisenbugs are a subset of Mandelbugs.
1.2 Robustness Evaluation

A term related to dependability is robustness. Robustness is defined as "the degree to which a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions" [IEE, 1990]. Robustness is therefore related to and an influence on the dependability of the system. At the same time it is more restricted than dependability, since it only relates to external perturbations and not internal ones.

The focus of this thesis is on the robustness of OS's, as it gives useful and meaningful information about the system without requiring a specific operational scenario to be in place, as would be the case for, for instance, reliability or availability. Robustness is concerned with the cases where external components (including human users) do not behave as expected or as stipulated/assumed by the designer of a component. As such, robustness evaluation complements traditional verification and validation techniques (also formal ones). The goal of robustness evaluation is to identify potential vulnerabilities¹ in the system. Such vulnerabilities may or may not be triggered in a specific operational scenario. They may or may not constitute design faults (e.g., software bugs) in the system, and they may or may not lead to severe consequences for availability, reliability, safety or security.
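
As a toy illustration of the idea of probing a component with invalid inputs, the sketch below calls a small function with valid and invalid values and records how it reacts. Everything in it is an assumption made for illustration: `parse_percent` is a hypothetical component, `probe` is a hypothetical helper, and the three outcome labels are not the failure classes used later in the thesis.

```python
def parse_percent(text):
    """Hypothetical component under evaluation: parse a percentage in 0..100."""
    number = int(text)   # ValueError for non-numeric strings, TypeError for None
    if not 0 <= number <= 100:
        raise ValueError("out of range")
    return number


def probe(function, inputs):
    """Call `function` with each input and record how it reacts."""
    results = {}
    for value in inputs:
        try:
            function(value)
            results[repr(value)] = "no failure observed"
        except ValueError:
            # The invalid input was rejected in the documented way.
            results[repr(value)] = "rejected"
        except Exception as exc:
            # Any other exception is an unanticipated reaction to the input.
            results[repr(value)] = "vulnerability: " + type(exc).__name__
    return results


for item, outcome in probe(parse_percent, ["42", "abc", None, "500"]).items():
    print(item, "->", outcome)
```

Whether the unanticipated reaction to `None` matters depends on the context in which the component is used, which mirrors the point above that a vulnerability may or may not be triggered in a specific operational scenario.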
As Commercial-Off-The-Shelf (COTS) components are more and more becoming the standard building blocks in modern designs, their composition is a key aspect of the verification process. Robustness of individual components is of great importance since components built to be re-used in multiple contexts cannot be built with any such explicit context in mind. When combined with new components, having different failure characteristics, components may be faced with unexpected and abnormal inputs. Therefore, components should be built to respond robustly to such inputs and evaluating their robustness may reveal information on how well suited they are for a particular composition.

¹ The term vulnerability does not refer only to security vulnerabilities, but to weaknesses in a system that might lead to robustness failures. These might also include security-related weaknesses.
Many types of users may be interested in performing robustness evaluations as described in this thesis, including developers for debugging or profiling purposes; integrators for finding possible interaction problems between components; testers for identification of vulnerabilities; or system designers and managers for suitability tests, resource guidance or identification of weak components. We will emphasize when the aspects apply to a particular aspect of software development, but use the more general term evaluator for the person or entity conducting the evaluation.
1.3 Thesis Research Questions & Contributions

The use of COTS components, such as OS's, is becoming more and more common, also for products with stringent requirements on dependability. For such component-based designs to fulfil these requirements, one needs to establish the amount of trust that can be put on these components to work in a specific environment, including how well they handle faults appearing in other components of the system. To answer such a question, the failure characteristics of the OS need to be established. This includes how the OS can fail due to faults in the environment. Are certain services provided by the OS more vulnerable? Are certain other components more likely to cause a failure of the OS and consequently a failure of the system?

Using a model where the OS is the main platform component in a system also containing applications and device drivers interfacing with the hardware, these fundamental questions regarding the OS have guided the work presented in this thesis. To give insights into how such questions can be answered an experimental error propagation and effect framework has been defined, where synthetic errors are injected in the interface between the OS and its drivers.

Drivers have been identified as one of the main contributors of OS failures [Murphy and Levidow, 2000; Simpson, 2003; Ganapathi et al., 2006]. Along the same line Chou et al. found that driver code contains up to seven times as many bugs as other parts of the Linux kernel [Chou et al., 2001]. As data on how a system handles errors typically is not available as the system is built, techniques are needed to speed up this process. One such technique is fault injection, where synthetic faults (or errors) are injected and the behavior of the system is observed.² This methodology raises additional questions regarding how to inject errors, where to inject them, when to inject them and which error model to use. These three questions are fundamental to any fault injection approach and are orthogonal, as illustrated in Figure 1.4.

² Traditionally, the technique is called fault injection, even when errors are injected.
[Figure: three orthogonal axes labeled Type, Location and Timing.]
Figure 1.4: Three fundamental dimensions in fault injection.
Each injected error modifies some part of the system. The error type
refers to how the system is modified, like flipping a bit, or assigning a wrong
value. Where the error is injected is its location dimension, like in CPU
registers, memory or in parameters to function calls. The timing of the
injection specifies when the error is injected, relative to some system event,
like boot up time. The timing can principally be time or event triggered.
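The three dimensions can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the names `Injection`, `Injector`, `flip_bit` and `assign` are hypothetical, parameters are assumed to be 32-bit words, and the trigger is a simple event trigger that fires on the n-th intercepted call. It is not the injection framework described later in the thesis.

```python
from dataclasses import dataclass
from typing import Callable, Dict

WORD_MASK = 0xFFFFFFFF  # assumption: parameters are 32-bit words


def flip_bit(bit: int) -> Callable[[int], int]:
    """Error type: flip a single bit of the value."""
    return lambda value: (value ^ (1 << bit)) & WORD_MASK


def assign(wrong_value: int) -> Callable[[int], int]:
    """Error type: replace the value with a fixed wrong one."""
    return lambda value: wrong_value & WORD_MASK


@dataclass(frozen=True)
class Injection:
    error_type: Callable[[int], int]  # type: how the value is modified
    location: str                     # location: which call parameter
    trigger_call: int                 # timing: inject on the n-th call


class Injector:
    """Corrupts one parameter of one call and passes all others through."""

    def __init__(self, injection: Injection) -> None:
        self.injection = injection
        self.calls_seen = 0

    def intercept(self, params: Dict[str, int]) -> Dict[str, int]:
        self.calls_seen += 1
        result = dict(params)  # never mutate the caller's dictionary
        if self.calls_seen == self.injection.trigger_call:
            name = self.injection.location
            result[name] = self.injection.error_type(result[name])
        return result


injector = Injector(Injection(flip_bit(3), "length", trigger_call=2))
first = injector.intercept({"buffer": 0x1000, "length": 16})
second = injector.intercept({"buffer": 0x1000, "length": 16})
```

Because the three fields are independent, any error type can be combined with any location and any trigger, which is what orthogonality means here. A time trigger would replace the call counter with a clock comparison.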
One may argue that fault injection, being an experimental technique, is
inherently limited since it does not provide completeness. Injecting a number
of faults and not finding any failures is no proof of correctness (as in the
system's ability to handle all faults/errors). However, we argue that even
with lack of completeness, evaluation using fault injection is still very useful
since it gives information about how the system behaves in practice. Even
though formal techniques and addition of several layers of fault-tolerance
software may be desirable it is not always possible for large systems, such as
an OS, due to performance, compliance or cost limitations.
On these premises we are interested in finding out how OS's behave in
the presence of errors, more specifically errors in device drivers. Consequently
the following research questions are posed for this thesis:
Thesis Research Questions
The research questions posed for this thesis are grouped into two broad
categories, first the conceptual definition of robustness profiling and the
associated measures, and then the quantifiable experimental aspects of
validating the proposed measures.
Category 1: Conceptual
Research Question 1 [RQ1]: How do errors in device drivers propagate
in an OS? What is a good model for identification of such propagation paths?
Chapter 3 sets up the model used to evaluate and profile the OS. The
model must allow for clear definition of propagation paths, and for useful,
easily interpretable results to be extracted.
Research Question 2 [RQ2]: What are quantifiable measures of
robustness profiling of OS's?
Chapter 5 presents a framework that allows for identification of error
propagation paths, which helps us quantify which services are more likely to
spread errors in the system. It also allows us to identify, for an application,
which of the OS services used are more likely to be vulnerable to propagating
errors. Additionally, device drivers can be ranked based on their potential
diffusion of errors, allowing a designer to make informed choices on whether
to include a driver in the system, to enhance its robustness or to find an
alternate driver. Chapter 6 presents experimental results for a case study
conducted for Windows CE .Net.
Category 2: Experimental Validation
Research Question 3 [RQ3]: Where to inject? Where are errors
representing faults in drivers best injected? What are the advantages and
disadvantages of different locations?
We have chosen to inject errors in the interface between the OS and its
drivers. This level of injection provides flexibility and portability among
other advantages. Chapter 3 presents our system model and shows where
errors are injected. Chapter 4 details our injection framework, which allows
for errors to be injected.
Research Question 4 [RQ4]: What to inject? Which error model
should be used for robustness evaluation? What are the trade-offs that can be
made?
The choice of error model is not straightforward and there are trade-offs
to be made on the amount of details provided, the time/effort required and
the implementation complexity. It is shown in Chapter 6 how such trade-offs
can be made and three distinct error models are evaluated: bit-flips,
data-type and fuzzing. A novel composite error model is provided, combining
the higher vulnerability exposure of the bit-flip error model with the low
costs of the fuzzing error model.
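To make the cost difference between the three error models tangible, the following sketch generates the injected values each model would produce for a single parameter. It rests on assumptions not stated in this section: the parameter is treated as a 32-bit unsigned integer, the data-type values are a hypothetical boundary set, and the function names are illustrative. The composite model is not attempted because its construction is only given in Chapter 6.

```python
import random
from typing import Iterator, List

WIDTH = 32  # assumption: parameters are 32-bit unsigned integers
MASK = (1 << WIDTH) - 1


def bit_flips(value: int) -> Iterator[int]:
    """Bit-flip model: one injection per bit position of the value."""
    for bit in range(WIDTH):
        yield (value ^ (1 << bit)) & MASK


def data_type_values() -> List[int]:
    """Data-type model: fixed values chosen per type (hypothetical set)."""
    return [0, 1, 0x7FFFFFFF, 0x80000000, MASK]


def fuzz(count: int, seed: int) -> List[int]:
    """Fuzzing model: replace the value with pseudo-random ones."""
    rng = random.Random(seed)
    return [rng.getrandbits(WIDTH) for _ in range(count)]


original = 0x00000010
flipped = list(bit_flips(original))   # 32 injections for one parameter
typed = data_type_values()            # a handful of injections
fuzzed = fuzz(count=10, seed=1)       # as many injections as requested
```

Under these assumptions the bit-flip model needs one experiment per bit of every parameter, whereas the other two models let the evaluator bound the number of experiments, which is one way to read the time/effort trade-off mentioned above.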
Research Question 5 [RQ5]: When to inject? Which timing model
should be used for injection?
When doing in-situ fault injection experiments, which are needed for
robustness evaluation, the time of injection becomes an issue. The state of
the system evolves as it executes and consequently also its susceptibility to
faults. A novel approach to selecting the time of injection is presented in
Chapter 7, based on the usage profile of the driver.
Thesis Contributions
The research presented here constitutes several important contributions to
the research community. Each contribution lists the corresponding research
questions it helps answer in brackets.
Contribution 1: A framework is presented for characterizing error
propagation in an OS, focusing on a key source of OS failures, errors in
device drivers. [RQ1, RQ2, RQ4]
Contribution 2: A series of error propagation measures are defined,
which are used to profile the robustness of the OS. [RQ2, RQ4]
Contribution 3: A large scale case study for Windows CE .Net has been
carried out, where fault injection is used to validate the proposed measures.
[RQ3, RQ4, RQ5]
Contribution 4: A detailed investigation on the effectiveness and
efficiency of several error models for use in OS robustness evaluations. Models
are compared on several parameters, including number of provoked failures,
service coverage, required execution time and implementation complexity.
[RQ3]
Contribution 5: We show how a new composite error model can be
used when profiling drivers, combining desirable properties of other models
giving excellent coverage characteristics for a moderate number of injections.
[RQ3]
Contribution 6: The impact of the time of injection is studied and it is
shown that for a certain class of drivers, the impact is high. This indicates
that controlling the time of injection is important. [RQ5]
Contribution 7: A novel approach to selecting the right time to inject is
presented together with a large case study supporting the results. The model
uses the new concept of call block to define the time of injection. [RQ5]
Contribution 8: A flexible fault injection framework for Windows CE
.Net has been implemented and used to carry out the fault injection
experiments required. The framework allows for easy extension to new error
models, drivers and services. [RQ3, RQ4, RQ5]
Publications Resulting from the Thesis
The work reported in the thesis is supported by a number of international
publications:
• Andréas Johansson, Neeraj Suri and Brendan Murphy, On the Impact
of Injection Triggers for OS Robustness Evaluation, Proceedings of
the International Conference on Software Reliability Engineering
(ISSRE), 2007.
• Andréas Johansson and Neeraj Suri, Robustness Evaluation of
Operating Systems, Chapter 12 of Information Assurance: Dependability
and Security in Networked Systems, Editors: Yi Qian, James Joshi,
David Tipper and Prashant Krishnamurthy, To be published in 2007
by Morgan Kaufmann.
• Andréas Johansson, Neeraj Suri and Brendan Murphy, On the
Selection of Error Model(s) For OS Robustness Evaluation, Proceedings
of the International Conference on Dependable Systems and Networks
(DSN), 2007.
• Andréas Johansson and Neeraj Suri, Error Propagation Profiling
of Operating Systems, Proceedings of the International Conference on
Dependable Systems and Networks (DSN), 2005.
• Andréas Johansson, Adina Sârbu, Arshad Jhumka and Neeraj Suri,
On Enhancing the Robustness of Commercial Operating Systems,
Proceedings of the International Service Availability Symposium (ISAS),
Springer Lecture Notes on Computer Science 3335, 2004.
Additionally, the author has been involved in the following publications
that are not directly covered by the thesis:
• Andréas Johansson and Brendan Murphy, Failure Analysis of Windows
Device Drivers, Workshop on Reliability Analysis of System Failure
Data, Cambridge UK, 2007.
• Constantin Sârbu, Andréas Johansson, Falk Fraikin and Neeraj Suri,
Improving Robustness Testing of COTS OS Extensions, Proceedings
of the International Service Availability Symposium (ISAS), Springer
Lecture Notes on Computer Science 4328, 2006.
• Neeraj Suri and Andréas Johansson, Survivability of Operating
Systems: Profiling Vulnerabilities, FuDiCo II: Bertinoro Workshop on
Future Directions in Distributed Computing, 2004.
1.4 Thesis Structure
The structure of the following chapters follows the structure of the research
questions postulated previously:
Chapter 1 introduces the research problems studied and the contributions.
Also, it introduces the terminology used throughout the thesis.
Chapter 2 gives a background and context to the problems approached
in this thesis by surveying related work.
Chapter 3 presents and discusses the system and error model used.
The experimental environment is presented, both in terms of hardware and
software.
Chapter 4 presents our experimental methodology and presents details
regarding the fault injection technique used.
Chapter 5 introduces our error propagation framework and introduces
the key measures used for error propagation and effect analysis. Their use
and interpretation are discussed.
Chapter 6 investigates the impact of the choice of error model by
presenting a comprehensive experimental evaluation of three error models. The
evaluation builds on the measures introduced in Chapter 5.
Chapter 7 shows the impact of the time of injection and presents a novel
approach to choosing relevant injection times.
Chapter 8 finally puts the contributions of the thesis back into context
by discussing the general conclusions to be drawn. Additionally a discussion
on how the results can be applied for several other research fields is provided
and future research directions are outlined.
Chapter 2
Background and Context
What is an OS, and how has its robustness been evaluated? What
is the state of the art and state of the practice in OS robustness
evaluation?
Over the years the OS has evolved in its complexity and roles. What
started as a program to help computer operators read jobs from tapes for
large mainframe computers, is today present in a multitude of computing
products and responsible for serving multiple concurrent users and handling
a wide range of devices. The sophistication of the services provided has
increased tremendously over the years, as has the reliance on the correct and
timely provision of service to applications and users. This has given rise to
a whole area of dependability evaluations and enhancements.
This chapter aims to relate the work presented in this thesis to the large
body of work performed by other researchers. Thus it forms the background
and the context for the research questions posed and puts the contributions
presented into perspective.
2.1 A Short Operating System History
The first computers were programmed by hand and the programs were given
to an administrator as punch cards, who then placed them in the card
reader for the computers. As computers evolved and the uses and
requirements for computations increased it became evident that some form of
control software was needed, both to abstract away the intricacies of the
hardware and to allow for concurrent access for multiple users. The OS was born
to handle multiple jobs that needed time on the CPU. At first these jobs were
batched and the role of the OS was to read the code for one job into memory
(from tapes or punch cards) and when it was finished write the output on
printers, tapes etc. One major issue with batching of jobs was that while the
computer, which was a horrendously expensive piece of equipment, was
waiting for some external device it could not make any progress and was simply
idle. This was solved when multiprogramming was introduced in OS's. The
memory available to the computer was partitioned across multiple jobs, such
that when one job was waiting for some I/O operation to complete, another
job could use the processor to perform computations. Further improvements
followed, such as time sharing where multiple users attached to terminals
could share the computer, by dividing the time used on the processor across
the users. As computers became smaller, faster and more user friendly, the
number of computer users also increased. Several different OS's evolved,
the most prominent ones being first UNIX (which comes in many flavors,
including OS's like GNU/Linux and Mac OS X/Darwin), later followed by
Microsoft DOS and Windows. Many special purpose OS's were developed,
for instance for Real-Time systems, or for large-scale servers. Good text
books on general OS related themes include the classical books by
Tanenbaum [2001] and Silberschatz et al. [2004].
2.1.1 OS Design
One of the key goals for an OS is protection. It should prevent users and
processes from gaining access to data (read, modify, execute etc), devices and
other processes in an uncontrolled manner. This includes both unintentional
and intentional (even malicious) accesses. A common technique to enforce this
is to define (in hardware) different privilege levels, where processes executing
with higher privilege can access lower privilege processes, but not the other
way around. For most OS's two such levels (or modes) are defined, user
level and kernel level. Only at the kernel level is it possible to use some
processor instructions. By executing the OS at the higher privilege level
(kernel mode) it can control user processes' access to the system. Naturally
failures of kernel mode components are potentially more severe than failures
of user mode components, since few protection mechanisms exist to prevent
them from corrupting important system data and components.
[Figure: client processes and the process, terminal, file and memory servers
run in user mode; the microkernel runs in kernel mode.]
Figure 2.1: Example of microkernel design. Figure from [Tanenbaum, 2001].
There have been two main design principles for general OS's, the
monolithic kernel and the microkernel-based design. In a microkernel-based design
the OS kernel is kept small and provides only low level services, such as
process and memory management, inter-process communication (IPC) etc. The
microkernel is the only entity of the OS running in privileged mode. Other
services that one wants the OS to provide, such as file systems, device drivers
etc execute in user mode (and are often referred to as servers) as illustrated
in Figure 2.1. Applications request OS services using IPC to the particular
server providing the service, as shown by the arrow in the figure.
In a monolithic design on the other hand, all OS services execute in
privileged mode, and applications make system calls to use the services¹
provided by the system. The model of the OS is layered vertically, with each
layer using services of lower layers, whereas the microkernel design is more
of a horizontal design. This design is reflected in our system model, which is
shown in Figure 3.1.
2.1.2 Device Drivers
Device drivers are, as the name suggests, responsible for interaction with
devices. There are also drivers for virtual devices (protocols etc) and other
software making use of the driver architecture to extend the functionality
of the OS. A driver's role is to encapsulate and handle the device specific
interaction needed in order for the OS and applications to use the device.
As many devices have special functionalities, or use specific protocols, the
drivers provide a middle layer between the OS and the devices.
¹ Throughout we will use the term service which is more general than the term system
call. However, for the system used in this thesis they are synonyms.
In order to facilitate OS-driver interactions, and to make it easier to
develop drivers, the interface between a driver and the OS is typically
standardized. This means that a driver needs to implement certain functionality
for the OS to be able to interact with it. In exchange the OS provides
functionalities that make it easier to develop and maintain drivers. This way
the OS can handle whole classes of drivers the same way, making it
significantly easier to develop new devices (and drivers) for an existing OS.
Using device drivers also potentially simplifies the porting of the OS to
multiple hardware architectures, as the drivers can be used to handle parts of
the hardware-specific features of different architectures.
2.1.3 What is the Problem?
There are several reasons why it is difficult to design and test an OS. First
of all, most OS's are general-purpose, i.e., they are built to handle a wide
variety of workloads and they can be highly parameterized to be used in
different environments. Furthermore, the OS kernel runs in high-privilege
mode, where failures easily take down the whole system. OS's often have long
run-times, especially in the server and embedded areas, making them
sensitive to resource exhaustion and leakage problems. Many OS functionalities
are service-oriented, meaning that correctness of execution may be hard to
define and limit, making testing and other means of verification and
validation hard. Lastly the sheer size of modern OS's poses a problem for thorough
verification and validation. To give a hint on size, Steve Jobs (CEO of Apple
Inc.) was quoted as saying that Mac OS X contained 86 million lines of code
[Jobs, 2006].
2.2 Sources of Failures of Operating Systems
This section will survey some of the sources for OS failures. To get
information on common sources for failures the most straightforward technique
is to collect data from deployed systems in the field. Most companies collect
failure data for their systems to some extent, but there are several aspects
warranting consideration, such as privacy, user participation, unbiased data
sets etc [Murphy, 2004].
One of the most influential papers within its field is Gray's 1985 classical
paper "Why Do Computers Stop and What Can Be Done About It?" [Gray,
1985]. Studying outage reports for a large number of Tandem systems four
main classes of sources for outages were identified: administration, software,
hardware and environment. Administration and software were found to
be the largest contributors (42% and 25% respectively). Another interesting
finding was that a majority of the software faults investigated for a specific
subsystem were Heisenbugs, not Bohrbugs, which supports the idea of using
software fault tolerance through redundancy, e.g., process pairs etc. A later
report in 1990 reports on a trend that software is increasingly being the
source of failures (up to 60% in 1989).
The rest of the section covers different sources of faults. There are many
possible classifications of faults, and in this thesis we will consider three
classes: hardware, software and user-related faults. This section will review
each of these in turn and relate them to OS failures. As device drivers are a
major source of OS failures and of interest to this thesis, the last subsection
is dedicated to faults in device drivers.
2.2.1 Hardware Related
Hardware related faults are faults stemming from physical defects or
phenomena in the hardware platform upon which the OS is running. Hardware
faults may have different causes, such as power glitches, wear-out/aging,
radiation/EMI etc. Much work has been spent on characterizing and protecting
against hardware faults. Common examples include error correcting codes on
memory, redundant bus lines, redundant disks (RAID etc) and many other
techniques. From an OS perspective these four classes of faults have broadly
been defined in [Kao et al., 1993]:
• Memory faults, corrupting memory locations, either code or data,
• CPU faults, computation, control flow and register faults. They appear
as register corruptions (PC, PSR, SP etc),
• Bus faults, affecting bus lines, and
• I/O faults, external devices causing problems.
Cosmic rays penetrating the atmosphere may cause transistor voltage
levels to transiently change when they hit chips. Due to their nature (transient)
such errors are often referred to as soft errors (compared to hard,
permanent errors). The rate at which they occur is referred to as the soft error
rate (SER). It is expected that future chips will have an increase in SER due
to the scaling of size and supply voltage [Shivakumar et al., 2002;
Constantinescu, 2003, 2005].
By the principle of error containment, errors are best handled close to the
source. However, many hardware errors may propagate to the software level
undetected. This is especially true for system-level software, including device
drivers. Such errors may be difficult to discriminate from software faults.
This relation was studied in [Iyer and Velardi, 1985]. Data was collected for
an installation of the MVS OS at Stanford. The purpose of the study was
to evaluate the OS's ability to detect and diagnose software errors that are
related to hardware errors. The investigation showed that the OS was rarely
able to correctly diagnose the error as hardware related, and less so than for
pure software errors.
2.2.2 Software Related
Sullivan and Chillarege studied error reports for software errors in the MVS
OS, between the years 1985-1989 [Sullivan and Chillarege, 1991]. Two main
groups of software defects (termed errors henceforth) were analyzed, regular
and overlay errors. Regular errors represent a "typical software error
encountered in the field". Overlays are errors where memory areas have
been overwritten, such as buffer overruns. The most common types of
mistakes were found to be related to memory allocation, copying overruns and
pointer management. Other error types identified include register reuse,
type mismatch, uninitialized pointer, undefined state, synchronization,
deadlock, sequence error, statement logic, data error, compilation errors and
others/unknown. The study was later extended to include database management
systems and further refined and related to the development process through
the concept of defect type [Sullivan and Chillarege, 1992]. This gave rise
to the classical Orthogonal Defect Classification (ODC) process [Chillarege
et al., 1992; Chillarege, 1996]. ODC contains seven defect types: function,
assignment, interface, checking, timing/serialization, build/package/merge
and documentation. Each defect can be classified to belong to one of these
types. These types have later been used to build libraries for injection of
artificial faults in systems, for instance [Christmansson and Chillarege, 1996;
Christmansson et al., 1998; Durães and Madeira, 2006].
2.2.3 User Related
Murphy and Levidow investigated the sources for outages for Windows NT,
and found drivers to be a significant source of system crashes [Murphy and
Levidow, 2000]. However, system outages are mostly found to be planned
(installation of hardware/OS/applications etc). Xu et al. [1999] investigated
the causes for Windows NT system reboots and found planned maintenance
and configuration to be responsible for 31% of the downtime for a
production environment network of servers. Software and hardware were also
found to cause a significant part of system downtime (22% and 10%).
Even though user-related faults are a key to minimizing system outages,
they are not the focus of this thesis. We focus on software related faults and
their consequences.
2.2.4 Device Drivers
Device drivers are typically developed by a different party than the OS
vendor. However, for some general drivers (bus drivers, for instance) generic
drivers may be developed by the vendor. Testing of drivers is therefore
normally performed by the device manufacturer. Due to the number of devices
produced and the time-to-market pressure, device drivers are often not of the
same level of quality as other parts of the OS. Developers may not have time,
or be skilled enough, to handle the sometimes intricate interaction required
with the OS and the devices. Device drivers typically execute in kernel mode,
meaning that a critical fault in a device driver may take down the whole
system. Recent efforts have made user-mode drivers possible in many modern
OS [Corbet et al., 2005].
Device drivers today form the largest part (in terms of lines of code)
within the OS. Chou et al. [2001] reported that 70% of the Linux kernel code
is device drivers. That device drivers are a major source of OS failures is
therefore no surprise. Several field studies have found drivers to be a main
source of system failures, e.g., [Murphy and Levidow, 2000; Ganapathi and
Patterson, 2005; Ganapathi et al., 2006].
Ganapathi et al. investigated the causes of kernel crashes for the
Windows OS [Ganapathi and Patterson, 2005; Ganapathi et al., 2006]. Crash
reports were collected from volunteers using the BOINC (Berkeley Open
Infrastructure for Network Computing) platform [BOI]. Device drivers were
found to be the major cause for kernel crashes, but the study is based solely
on crashes, not outages, which may not be due to crashes.
In contrast to the previous studies, which were based on collecting error
reports and logs, Chou et al. have studied Linux kernels spanning seven years
using static analysis [Chou et al., 2001]. The analysis was static compiler
based, using the source code of the kernel. They found device drivers to have
error rates of up to three to seven times that of other parts of the kernel.
Furthermore they found that some functions have distinctly more bugs than
others (clustering of bugs) and that newer files are more prone to errors than
older ones.
2.3 Fault Injection
Fault injection is a technique where faults (or errors) are intentionally
inserted in a system to observe how the system reacts. The technique started
in the hardware area, with the specific purpose of testing fault-tolerance
mechanisms [Arlat et al., 1993]. It has also been proposed to use fault
injection as part of certification for high-assurance systems [Voas, 1999].
Fault injection can be performed at different levels in the system (like
hardware, software, protocols) and at different stages in development (on
design models, prototypes or deployed systems). The focus in this thesis is
on executable systems, i.e., at least a prototype of the system needs to exist
for the evaluation to be performed. Furthermore, we focus on techniques
implemented in software, so called SWIFI techniques (SoftWare Implemented
Fault Injection). Other techniques require use of special-purpose hardware or
use abstract models of the system to inject faults. SWIFI has the advantage
of being more flexible and cheaper. Surveys of fault injection techniques,
covering all three classes, include Clark and Pradhan [1995], Hsueh et al.
[1997] and Carreira et al. [1999]. The rest of this section covers a wide range
of SWIFI tools.
FIAT (Fault Injection-based Automated Testing) is a fault injection tool
for dependable distributed applications [Segall et al., 1988; Barton et al.,
1990]. System dependability properties, especially error coverages and
latencies were evaluated. Injections were performed by bit-level corruption of
a task's data and/or code memory areas. Three types of corruptions were
used, zero-byte, set-byte and 2-bit compensate. The outcome of experiments
was classified on a five-grade scale, from machine crash to invalid output
and no error.
In an early work on fault injection, Chillarege and Bowen introduced the
concept of failure acceleration achieved through fault injection. Failure
acceleration occurs when the fault→error→failure process is accelerated, by
decreasing the fault and error latencies, and increasing the probability that a
fault causes a failure [Chillarege and Bowen, 1989]. This makes experiments
faster to perform and allows for estimations of the transition probabilities
(fault→error and error→failure), which is typically not possible from
field data (which focuses mostly on failures). They reported on a fault
injection study performed on the MVS OS, where a random (virtual) page in
memory was set to 0xFF, generating an invalid address/opcode, thereby
increasing the probability that a fault causes a failure. It was found that only
a small fraction of the injected faults led to a complete failure of the primary
service of the system (16%), whereas most (70%) led to no loss of service
at all. Careful study of the latter category led to the definition of potential
hazard, an error which has caused damage in the system but does not lead
to failure under the current operating state. Potential hazards may lead to
failure at a later stage, triggered by changes in workload, and may explain
previously observed relations between workload and failures.
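As an illustration of how transition probabilities might be estimated from injection outcomes, the sketch below computes relative frequencies from an outcome log. The three outcome labels, the function name and the counts are assumptions made for this example; they are not the classification or the data of the MVS study.

```python
from collections import Counter
from typing import Iterable, Tuple

# Outcome labels are assumptions made for this sketch.
NO_ERROR = "no_error"  # the injected fault was never activated
ERROR = "error"        # an error was observed, but service continued
FAILURE = "failure"    # the error led to a loss of service


def transition_probabilities(outcomes: Iterable[str]) -> Tuple[float, float]:
    """Estimate P(fault -> error) and P(error -> failure) as frequencies."""
    counts = Counter(outcomes)
    injected = sum(counts.values())
    # Every failure passed through an error state first.
    errors = counts[ERROR] + counts[FAILURE]
    p_fault_to_error = errors / injected if injected else 0.0
    p_error_to_failure = counts[FAILURE] / errors if errors else 0.0
    return p_fault_to_error, p_error_to_failure


# Made-up outcome log, not data from any study.
log = [NO_ERROR] * 50 + [ERROR] * 30 + [FAILURE] * 20
p_fe, p_ef = transition_probabilities(log)
```

Such estimates require observing the intermediate error state, which an injection experiment can instrument but field failure reports usually do not record.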
FERRARI (Fault and ERRor Automatic Real-time Injector) injects
errors in application and OS processes simulating low level hardware faults.
Injection is performed either by first corrupting the memory of the process
before it is started, or by injecting faults during execution, triggered either
spatially (i.e., after a certain code location is reached) or by a timeout
defined by the user [Kanawati et al., 1995]. Injections are performed purely in
software, using software traps. Faults are injected in the address, data or
control line for the targeted instruction, resulting in for instance different
instructions being executed or operands being modified. The actual injection
is performed using bit-level modifications. Both transient and permanent
faults can be injected.
FINE (Fault Injection and moNitoring Environment) was used to study
the propagation of errors in OS's. FINE can inject both hardware (CPU,
memory, bus) and software related faults (initialization, assignment,
condition). FINE was one of the first fault injectors to be implemented in kernel
space, making OS evaluation possible. Previous tools, such as FERRARI,
executed in user-mode and thus had no access to kernel memory areas [Kao
et al., 1993]. A new system call was implemented (ftrace) used to specify
injection and insertion of probes from user-space. [Kao et al., 1993] reports
on experiments for SunOS 4.1.2 using randomly placed bit-flips in code
memory and randomly selected global variables. Software faults were manually
injected in the kernel text segment. Only 8% of the injected faults led to error
propagation to another subsystem, with most of them caused by corrupted
function call parameters. FINE was later extended for use with distributed
systems as DEFINE [Kao and Iyer, 1994].
FTAPE (Fault Tolerance And Performance Evaluator) was used to
compare fault-tolerant computer systems [Tsai and Iyer, 1995]. The tool
combines a fault injector with a workload generator and monitor, to allow
injection of faults under high stress conditions, when faults are more likely to
propagate [Tsai et al., 1996, 1999]. Faults can be injected throughout the
system (CPU, memory, disks). The tool can inject single, as well as multiple
bit-flips and stuck-at faults.
FTAPE was later extended to NFTAPE, which is an extensible tool using
generic components to perform fault injection in a distributed fashion. So
called lightweight fault injector components are defined to perform the actual
injection; monitor and trigger components can similarly be provided by the
user [Stott et al., 2000]. NFTAPE has been used to study error sensitivity
of Linux [Gu et al., 2004].
MAFALDA (Microkernel Assessment by Fault injection AnaLysis and
Design Aid) uses fault injection to assess the robustness of microkernel-based
OS's [Arlat et al., 2002]. Injections are made into both code and data areas of
the OS, as well as into the parameters of kernel calls using the bit-flip error
model. MAFALDA can be used to study system failure modes and error
propagation across the components of the system. The authors also showed
how, using a formal description of function behavior, error detection wrappers
can be defined for kernel functions. An extension of the tool, MAFALDA-RT,
was designed to also handle real-time systems [Rodriguez et al., 2002].
DOCTOR (integrateD sOftware fault injeCTiOn enviRonment) is a tool
for SWIFI for distributed real-time systems [Han et al., 1995]. It can inject
communication faults, such as lost or duplicated messages. Hardware faults
are simulated in CPU registers, bus or memory as single or multiple bit
faults. The location (in memory) can be defined by the user or randomly
selected. For driving the experiments various synthetic workloads can be
automatically generated or user-defined programs are used. DOCTOR was
for instance used to evaluate a distributed diagnosis algorithm implemented
on HARTS, a distributed shared-memory-based real-time architecture [Shin,
1991].
Xception uses debugging and performance monitoring capabilities of
processors to inject errors in CPU functional units. The processor is
instructed to halt when faults are to be injected and low-level exception
handling code performs the injection. The advantage of this approach is that
interference with the target system is minimized and that it requires no
source code access, or trace-based execution of applications or OS. Focus is on
simulating hardware transient faults, as these form the majority of faults in
modern processors [Shivakumar et al., 2002; Constantinescu, 2005]. Several
triggers are supported, including address-based (fetch of opcode from specific
address) and timeout-based. Faults are injected as bit level faults (stuck-at,
flips and masks). Xception was later used for other studies, including
[Madeira et al., 2000, 2002].
PROPANE (PROPagation ANalysis Environment) is a fault injection
tool used to primarily study error propagation in embedded software
[Hiller et al., 2002b; Hiller, 2002]. Data-level errors are targeted, by
modifying data values, either on the bit-level or by fixed values or offsets.
Injections are triggered by selecting injection locations. Additionally timers
can be set, either using clock counters or counters on reaching injection
locations. PROPANE can inject transient, intermittent and permanent
errors. Since instrumentation is done on the source code of the target
software, propagation can be studied down to individual signals (variables) of
the components of the system. Together with the measures defined in the
EPIC framework (Exposure, Permeability, Impact, Criticality), PROPANE
was used to evaluate the propagation of errors in an aircraft arrestment
system [Hiller et al., 2004; Hiller, 2002].
RIDDLE (Random and Intelligent Data Design Library Environment)
tests application and system services/libraries on Windows NT using random
but syntactically correct strings as input [Ghosh et al., 1998]. The program
is observed for unexpected termination, crashes, unhandled exceptions etc.
The approach taken is similar to that of [Miller et al., 1990] described in
Section 2.4.
FST (Failure Simulation Tool) wraps applications running on the
Windows family of OS's with an instrumentation layer, whereby failing OS
functions can be simulated. On a technical level the wrapping is performed in
a very similar manner to the Interceptor modules used in this thesis (see
Section 4.5 for more details). Failures in the OS are simulated by throwing
exceptions and returning error codes. The faults were selected from the set
of outcomes from previous experiments on the system using RIDDLE [Ghosh
et al., 1998]. Applications are deemed as robust if they do not hang, crash
or disrupt the system in presence of perturbations.
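The idea of simulating failing OS functions through a wrapping layer can be sketched in a few lines. Everything in the example is hypothetical: `open_file` stands in for an OS function, the two small applications are invented, and `intercept` is a plain function wrapper rather than the binary-level instrumentation a real tool would use. The sketch also only detects crashes (uncaught exceptions), not hangs or system disruption.

```python
from typing import Any, Callable, Optional


class SimulatedOSFailure(Exception):
    """Raised by the wrapper in place of a real OS failure."""


def intercept(real_call: Callable[..., Any],
              error_code: Optional[int] = None,
              raise_failure: bool = False) -> Callable[..., Any]:
    """Wrap an OS function so that it can fail on demand."""
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if raise_failure:
            raise SimulatedOSFailure(real_call.__name__)
        if error_code is not None:
            return error_code  # simulated failure: never call the OS
        return real_call(*args, **kwargs)
    return wrapper


def open_file(name: str) -> int:
    """Hypothetical OS function: returns a handle, or -1 on error."""
    return 3


def careful_app(open_fn: Callable[[str], int]) -> str:
    handle = open_fn("log.txt")
    return "fallback" if handle < 0 else "ok"


def careless_app(open_fn: Callable[[str], int]) -> str:
    handle = open_fn("log.txt")
    return {3: "ok"}[handle]  # trusts the handle without checking it


def is_robust(app: Callable, open_fn: Callable[[str], int]) -> bool:
    """An application counts as robust here if it does not crash."""
    try:
        app(open_fn)
        return True
    except Exception:
        return False
```

With `intercept(open_file, error_code=-1)` the careful application falls back while the careless one crashes, which mirrors the robust versus non-robust distinction described above.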
HEALERS (HEALers Enhanced Robustness and Security) is a system
for automatically increasing the robustness of C libraries [Fetzer and Xiao,
2002a,b]. HEALERS uses adaptive fault injection to evaluate the robustness
of individual parameters in library functions. Information present in header
files and manual pages is used to build fault injectors, which progressively test
functions to compute the robust argument type, i.e., the set of values for which
the function does not crash or return with an error. This information is used
to automatically build robustness wrappers for the selected library functions.
HEALERS was later extended (the new tool is called Autocannon), using
an extended type system from Ballista (see Section 2.4) to further simplify
the generation of robustness wrappers [Süßkraut and Fetzer, 2007]. Wrappers
are defined as predicates over a set of tests on parameter values, making the
approach more flexible and extensible than the original HEALERS approach.
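A robustness wrapper expressed as predicates over parameter values might look like the sketch below. It is a hand-written illustration in Python, not generated code and not the C wrappers the tools above produce; `robustness_wrapper`, `fragile_length` and the chosen predicate are assumptions made for the example.

```python
from typing import Any, Callable, Dict


def robustness_wrapper(fn: Callable[..., Any],
                       checks: Dict[int, Callable[[Any], bool]],
                       error_value: Any) -> Callable[..., Any]:
    """Return error_value instead of calling fn when a predicate fails."""
    def wrapped(*args: Any) -> Any:
        for index, is_safe in checks.items():
            # A missing argument is treated like an unsafe one.
            if index >= len(args) or not is_safe(args[index]):
                return error_value
        return fn(*args)
    return wrapped


def fragile_length(text: Any) -> int:
    """Hypothetical library function that fails on non-string input."""
    return len(text.encode("utf-8"))


# The predicate plays the role of the robust argument type for parameter 0.
safe_length = robustness_wrapper(
    fragile_length,
    checks={0: lambda value: isinstance(value, str)},
    error_value=-1,
)
```

Calling `safe_length(None)` returns the error value, whereas `fragile_length(None)` raises an exception. Keeping the predicates in a table separate from the wrapper logic is what makes it easy to add further tests per parameter.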
AutoPatch reuses parts of the HEALERS system to investigate
applications' handling of error codes from library functions [Süßkraut and Fetzer,
2006]. Error injection is used to find unsafe functions, i.e., error code
return values are injected for function calls, and applications not handling
them (crashing) are labeled unsafe. Unsafe functions can be automatically
patched using a variety of patching techniques.
DTS (Dependability Test Suite) tests applications running on Windows NT by corrupting the parameters to library calls [Tsai and Singh, 2000]. The server in a client-server system (Apache, IIS, SQL Server) was targeted and outcomes were classified from a client's perspective, i.e., retry required, server restart required, complete failure etc. The servers were tested running stand-alone and using two different fault-tolerance middleware solutions. Three bit-level fault models were used for the parameters to the library calls: setting all bits, zeroing all bits or flipping all bits. Large-scale injections verified that the evaluated middleware reduced the number of failures considerably.
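The three bit-level fault models can be written down in a few lines. The sketch below assumes 32-bit parameters; the width and the function names are assumptions made for illustration and are not taken from DTS itself.

```python
WIDTH = 32                  # assumed parameter width in bits
MASK = (1 << WIDTH) - 1     # 0xFFFFFFFF for a 32-bit word


def set_all_bits(value):
    """Replace the parameter with a word of all ones."""
    return MASK


def zero_all_bits(value):
    """Replace the parameter with a word of all zeros."""
    return 0


def flip_all_bits(value):
    """Invert every bit of the original parameter value."""
    return ~value & MASK


corrupted = [f(0x0000FFFF) for f in (set_all_bits, zero_all_bits, flip_all_bits)]
```

Only the third model depends on the original value; the first two replace it outright.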
G-SWFIT (Generic Software Fault Injection Technique) uses software mutations to inject software faults into the binary of a target program [Durães and Madeira, 2006]. A field study of real faults was used to generate mutations. First the faults were classified according to ODC (see Section 2.2.2). This classification is then refined with an orthogonal classification of missing, wrong and extraneous constructs, which allows for more precise fault injection. The binary of the target is searched for patterns relating to higher-level code constructs, where code mutations chosen from a representative set of software faults are inserted. Using failure mode analysis the behavior of three test programs is compared, when inserting low-level mutations and source code faults. Overall most of the source code level faults could be reproduced by the mutations to some extent.
Several hardware-based techniques have been developed as well, injecting faults at different levels of the system. MESSALINE [Arlat et al., 1990, 1993] injects faults at the pins to ICs of fault tolerant systems. Karlsson et al. [Karlsson et al., 1994] use heavy-ion radiation for validation of fault-handling mechanisms.

Fault injection has mainly been used to evaluate fault tolerance mechanisms or robustness issues. However, it has also been found useful in the area of security, especially for protocols [PRO]. In [Chen et al., 2002] errors were injected in two kernel-based firewalls on Linux, and it was found that errors in firewalls can in some cases lead to security vulnerabilities. Modeling a realistic installation suggests that error-caused vulnerabilities are a non-negligible source for security concerns. Du and Mathur [2000] injected errors in the environment of an application and observed the applications for security violations. NFTAPE has been used to inject control flow bit-flips in the user authentication section of sshd and ftpd on Linux and it was found that such faults may open up the affected servers for vulnerabilities [Xu et al., 2001]. Neves et al. present the AJECT tool which performs attack injection for detecting vulnerabilities [Neves et al., 2006]. Attacks target protocols by varying the messages used.
2.4 Operating Systems Dependability Evaluation
Several past efforts have focused on evaluation of dependability and robustness issues in OS's, including the previously mentioned field studies. This section is dedicated to dependability benchmarks, where standard methodologies and tools are used to evaluate and compare systems.

In benchmarking the aim is to compare competing systems using a fair and repeatable process. Benchmarks for comparing computer performance are abundant and have found widespread use, even though the interpretation of the results often is non-trivial. One of the most influential benchmarks is the Standard Performance Evaluation Corporation (SPEC) benchmark [SPE; Henning, 2000]. The most well known benchmark from SPEC is probably SPECint for measuring integer computing capabilities of CPUs, but the organization offers benchmarks in many areas, such as graphics, high-performance computing (HPC) and web-based systems. Several other benchmarks exist for specific areas, such as the Embedded Microprocessor Benchmark Consortium's (EEMBC) benchmarks [EEM; Weiss and Clucas, 1999], Transaction Processing Performance Council (TPC) [TPC] and LINPACK [LIN] to mention but a few.

Dependability benchmarks are not aimed at comparing performance (only), but how “dependable” a system behaves. Two well known projects on dependability benchmarking are the Ballista project from Carnegie Mellon University [Bal] and the EU-IST project DBench [DBE; Kanoun et al., 2001]. An introduction to the general problem of benchmarking and specific issues related to dependability benchmarking is given in [Johansson, 2001].
Ballista is a robustness benchmark [Koopman, 1999; DeVale et al., 1999; Koopman and DeVale, 1999]. The first version of Ballista targeted the POSIX interface found on many OS's. It builds a testing wrapper for the function targeted and then automatically builds test cases by selecting parameter values from a set of valid and invalid values for that particular data type. Since the number of different data types used in the POSIX interface is relatively low, the number of types for which injectors need to be specified is also low. Addition of new functions to be tested requires only to define values for any new type not previously used, making the approach very scalable. Extensive experimentation done on several OS's revealed multiple robustness issues [Koopman and DeVale, 1999, 2000]. Ballista was later used for testing I/O libraries [DeVale and Koopman, 2001], CORBA implementations [Pan et al., 2001] and for Win32 interfaces in Windows [Shelton et al., 2000].
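The way data type-based test generation scales can be sketched as follows. The value pools and type names below are hypothetical and much smaller than anything Ballista uses; the point is only that test values are attached to data types, so a function is tested by combining the pools of its parameter types.

```python
import itertools

# Hypothetical pools of valid and invalid test values, one pool per data type.
TEST_VALUES = {
    "int": [0, 1, -1, 2**31 - 1, -2**31],
    "buffer": [None, b"", b"abc"],
}


def generate_test_cases(signature):
    """Return every combination of test values for the given parameter types."""
    pools = [TEST_VALUES[type_name] for type_name in signature]
    return list(itertools.product(*pools))


# A function taking (buffer, int) is covered by 3 * 5 = 15 test cases.
cases = generate_test_cases(["buffer", "int"])
```

Testing a further function with the same parameter types needs no new pools; only a type not seen before requires new values to be defined.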
Mendonca and Neves used fault injection to test functions in the Windows DDK (the interface for device drivers) [Mendonca and Neves, 2007]. Since the DDK for Windows exports more than a thousand functions, only functions used in at least 95% of the drivers were tested. Each function was tested in isolation, similar to tests in Ballista, and failure mode analysis was performed. The failure modes were related, not only to the system robustness (crash, hang etc) but also to consistency of data on disk, where FAT32 and NTFS were compared. Three versions of Windows were compared (XP SP2, Server 2003 and Vista RC1) and the results showed great similarities, indicating that the tested functions have not undergone fundamental changes impacting robustness throughout the three versions tested. It was also found that NTFS, as expected, showed no file system inconsistencies, whereas FAT32 did in some cases.
Early benchmarking projects aimed at OS's target UNIX systems. The crashme program was developed to test the robustness of the OS by executing random data [Carrette]. This was achieved by first allocating an array and filling it with random data. Then several child processes are spawned that try to execute the data as if it was a code segment. The system is submitted to a large number of such processes, with the intent of testing the error detection and handling capabilities of the OS. This simple test successfully crashed several UNIX systems. The program was later extended to CMU Crashme which subjected UNIX system calls to random strings, thereby testing their parameter checking code [Mukherjee and Siewiorek, 1997]. This modified version could crash the Mach 3.0 OS in less than ten seconds. The authors also pointed out the usefulness of modular benchmarks, targeting specific areas of the system, such as file, memory and inter-process communication subsystems for an OS [Siewiorek et al., 1993; Dingman et al., 1995; Mukherjee and Siewiorek, 1997].

Another approach using random data was carried out by Miller et al. [1990], where a series of commercial UNIX implementations were benchmarked and compared. The target was not the UNIX kernel per se, but a set of utility applications commonly included in most UNIX OS's, such as awk, diff and grep. The tests consisted of supplying random strings as inputs to these utilities (which typically work on text input). The technique was named fuzzing and has served as inspiration to the area of Random Testing and also to the fuzzing error model used in this thesis. Robustness was measured by observing the behavior of the application, where crashes or hangs were undesirable outcomes (showing non-robust behavior). A surprising number of deficiencies were found, with large differences between the benchmarked systems. The experiments were later repeated with similar results [Miller et al., 1995]. Studies for Windows NT [Forrester and Miller, 2000] and MacOS [Miller et al., 2006] have also been conducted.
A dependability benchmark for fault-tolerant systems was developed in [Tsai et al., 1996] using the FTAPE fault injection tool. Measured was the number of catastrophic incidents and the performance degradation of the system. Faults were injected into the CPU, memory and I/O components of the system. Since the systems tested consisted of redundant computers (Triple Modular Redundancy - TMR) the expected outcome is only a performance degradation.

Using PostMark™, a file system performance benchmark, as workload, Kanoun et al. developed a dependability benchmark for several versions of Windows and Linux [Kanoun et al., 2005]. The benchmark is defined as a measure of the robustness of the OS's ability to withstand invalid API inputs. Reaction and restart times are also measured for the compared OS's. The same authors have also defined a dependability benchmark using the TPC-C transactional performance benchmark as workload [Kalakech et al., 2004a,b] for Windows.

Brown et al. argue that the human factor must be included in a dependability benchmark by showing that most of the outages for large systems are caused by (human) operators [Brown and Patterson, 2001]. They present a dependability benchmark which includes real human operators, performing both detection and recovery actions to injected faults and to perform standardized maintenance tasks [Brown et al., 2002]. Vieira and Madeira also consider operator faults for studying recovery procedures in database management systems (DBMS) [Vieira and Madeira, 2002a,b]. A portable faultload for DBMS's is defined in [Vieira and Madeira, 2004], used to form a dependability benchmark for On-Line Transaction Processing systems (OLTP). Two kinds of measures are used, performance-related (from TPC-C) and dependability-related. Performance-related measures were taken both with and without faults injected. Dependability-related measures include data integrity and availability measures. The benchmark was applied to four different DBMS's and three different OS's, and comparisons were made across them.
2.5 Other Techniques for Verification and Validation

This section presents complementary techniques for verification and validation and their relation to the fault injection-based techniques used in this thesis. In general, no one single technique is to be preferred and we emphasize that also the proposed techniques must be used as complements to other verification and validation techniques.

2.5.1 Testing

Software testing is the most basic form of verification for software. The developer (or a designated tester) builds a test case which defines the context for the test and the inputs, as well as the expected outputs, based on the specification of the tested component. The test is executed and the result compared to the expected result. A plethora of testing techniques exist and in this section we highlight the more relevant ones to this work. Good introductory texts to software testing include the seminal book The Art of Software Testing by Myers [2004], or Testing Computer Software by Kaner et al. [1999].

From an implementation point of view testing and fault injection have many commonalities, especially fault injection focusing on interfaces, which is the case in this thesis. This type of fault injection resembles widely spread unit testing approaches, such as equivalence class testing or boundary value testing [Myers, 2004].

Conceptually, software testing's goal is to identify faults, i.e., bugs, whereas the goal of a robustness evaluation is to identify weaknesses. A weakness in this sense may not be a bug (although it might), because it may lie outside of the scope of the specification, or may arise only in a certain context.

In equivalence partitioning testing of a function one typically focuses on both valid and invalid classes of inputs. The input space is split into a set of equivalence classes where it is assumed that the function behaves similarly (in terms of correctness) for all values within a class. The specification for the input is required for performing the partitioning. Boundary-value testing can be seen as an extension of equivalence partitioning testing where one focuses on the values around the boundaries of the equivalence classes.
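As a small illustration of the two techniques, consider a hypothetical function whose specification accepts integers in the closed range [lo, hi]. The helper names and the choice of representatives below are assumptions for illustration only.

```python
def equivalence_class_representatives(lo, hi):
    """Pick one representative per class: below, inside and above [lo, hi]."""
    return {
        "invalid_below": lo - 10,      # any value below lo would do
        "valid": (lo + hi) // 2,       # any value inside the range would do
        "invalid_above": hi + 10,      # any value above hi would do
    }


def boundary_values(lo, hi):
    """Values on and directly next to the two class boundaries."""
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]


representatives = equivalence_class_representatives(0, 100)
boundaries = boundary_values(0, 100)
```

Equivalence partitioning needs three tests here, one per class, while boundary-value testing adds the six values where off-by-one mistakes in range checks typically show up.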
2.5.2 Formal Methods

Any form of fault injection is inherently a dynamic testing method [Myers, 2004]. We consider dynamic testing techniques as ours, and more formal techniques, including static analysis (like [Ball and Rajamani, 2002]) as complements. Both are useful for building more dependable systems and both have their strengths and weaknesses. Formal proofs (like theorem proving) are used to prove that the implementation is made according to the specification, which also needs to be expressed formally. Another approach is to use a model of the system to build and check all the states of the system and verify that they do not violate the specification, for instance lead to a violation of the safety specification for the system [Kumar and Li, 2002].

[Hayes and Offutt, 2006] uses static analysis of the user input specifications to programs to both identify inconsistencies in the specification itself and to generate test cases for testing. A large empirical case study showed that the automatic tool found defects faster than expert testers, but not necessarily more. It also found defects not found by human testers at all. The results support the complementary use of automatic tools with domain expertise.

Even though formal methods are theoretically attractive, since they offer means to reach completeness, testing techniques and related experimental techniques like fault injection are likely to prevail for many years to come, due to their ease of use and understanding. However, test automation is a necessary evolution in testing, as systems grow larger and more complex.
2.6 Summary

This chapter has presented background information and reviewed related research within the areas of OS robustness evaluation and fault injection. On this background we have identified the OS as being the key to system dependability and robustness since it is the platform on which applications and services are built. Furthermore, device drivers were identified as the main source of software-related causes of system failures. Several previous studies have focused on interfaces (OS-Application and OS-Driver) as it facilitates portability and fair comparison, important aspects of benchmarks. Fault injection has in multiple previous studies been shown to be an effective means for evaluation of dependability attributes of OS's.

Our report on related work does not stop with this chapter. Throughout the thesis we will give pointers to relevant studies where appropriate.
Chapter 3

System and Error Model

What are the system boundaries, and what is an error?

OS's are key building blocks in virtually all computer-based systems, ranging from small deeply embedded control systems, to desktop workstations and large servers for online transactions. Consequently OS dependability is an important objective and a prerequisite for dependable provision of services. This chapter builds the foundation for the following chapters, starting by presenting the system model used. Then a general error model is defined, in terms of location, type and trigger, followed by the presentation of the three error models used in the following chapters. The experimental setup used is presented, both in terms of hardware and software. The chapter is concluded with a summary containing a table of the symbols introduced for reference in later chapters.
3.1 System Model

Most modern OS's are monolithic, i.e., the OS kernel providing the most basic functionalities runs in kernel space, as illustrated in Figure 3.1. This is in contrast to, for instance, microkernel-based OS's, where the functionality of the OS kernel is spread across multiple subcomponents with well specified interfaces.

We use a generic model of the OS. Similar to many other studies, e.g., [Albinet et al., 2004; Durães and Madeira, 2003], we model a monolithic system. The model consists of four layers: applications, OS, drivers and hardware platform. We have chosen this model as it is generic enough to apply to several commercial OS's, like Windows or Linux. It is also sufficient for measuring the robustness of the system due to errors in drivers by studying error propagation.

Each layer consists of one or more subcomponents (like different applications in the application layer, or different drivers in the driver layer). Our model does not specify the subcomponents required in each layer since they differ for each specific OS. Each layer provides services to be used by neighboring layers. A service can be realized in many ways. Common is for instance function calls (like API's, Application Programming Interfaces or system calls), but in general other mechanisms could be used like the message passing paradigm used for communication between the OS and the drivers defined in the Windows Driver Model (WDM) found on Windows XP [Oney, 2003]. The nature of the communication is not important for the model; what is important is that the service is syntactically specified, and that the flow of information can be intercepted and modified. This is required to be able to inject errors and to observe the outcome of each injection. The specification is defined in an interface. The two interfaces of interest here are the OS-Application and OS-Driver interfaces indicated in Figure 3.1.
The system has a set of n applications, APP_1 ... APP_n. The application set includes all applications running on the system which are not required for the OS to function properly. This includes applications added for the purpose of the evaluation, called benchmark applications, or test applications. Applications make use of OS-level libraries to implement their functionalities. Typically, applications run in user space and the OS and the device drivers execute in privileged mode.

The OS layer includes the OS kernel and all required libraries delivered as parts of the OS. An example of such libraries are libraries used by applications to interface with the OS (POSIX, C-runtimes etc.).

The OS provides a set S of services to be used by applications (s_i's in Figure 3.1). The OS-Driver interface consists of services provided both by the OS (os_x.y's in Figure 3.1) and services provided by drivers (the ds_x.y's in Figure 3.1). Collectively they are referred to as the set O of services in this interface. Each application APP_x uses a set of OS services, termed A_x, where A_x ⊆ S.

[Figure 3.1: System model. From top to bottom: application layer (APP_1 ... APP_n), OS-Application interface (services s_i), OS layer (operating system), OS-Driver interface (services os_x.y and ds_x.y), driver layer (D_1, D_2, ... D_N), hardware layer (hardware platform).]
A driver is modeled as a component having both import and export interfaces as illustrated in Figure 3.2. The exported interface consists of a set of services that the OS calls to request the driver to perform operations. These services are termed ds_x.y for the y-th service provided by driver D_x. The imported service interface is used by the driver to accomplish these requests and can be from the OS itself or other libraries in the system. These services are termed os_x.y for the y-th service imported by driver D_x. When no distinction is made between imported and exported services we term a service at this interface s_x.y ∈ O. In the target environment used in this thesis (Windows CE .Net) a service corresponds to a function call.

To perform error propagation analysis we require sufficient access (with specification) to the system to be able to intercept information flow in the two interfaces defined (OS-Driver and OS-Application). In most cases this can be achieved without requiring access to source code, neither for the drivers, nor for the OS itself. For Windows CE .Net no such access is required. However, access is needed to the source code of the benchmark applications, for instrumenting them with assertions used to track the outcomes of injections. The availability of interface specifications is a basic requirement for any OS open for extensions by new types of drivers/applications.
[Figure 3.2: Driver model. The target driver D_x exports services to the OS (ds_x.y) and imports services from the OS and other components (os_x.y).]

3.2 Error Model

In order to conduct fault injection based experimental stressing, three questions arise, namely:

• Where to inject?
• What to inject?
• When to inject?

The answers to these questions correspond to three properties of an error model: the error location (where), the error type (what) and the error trigger (when). Another, fourth property, related to the error trigger is for how long to inject. Each of these properties of an error model is discussed in the following subsections.

Throughout this thesis we do not make any distinction between the terms errors and faults. Consequently, we will use error when discussing the perturbations inserted in the system. When the distinction is needed, we will explicitly use the term fault.
3.2.1 Error Type

The error type constitutes the nature of the error. The error type relates to the origin of the error, i.e., the fault, but also to the manifestation of faults as errors. The error type describes how an error changes some internal state of the system, from the originally (assumed) correct state, to another, possibly erroneous, state.

Depending on the goal of the evaluation, error types are chosen either to as closely as possible match errors expected to appear in the system as it is deployed in the field, or generic error types are used, based on their ability to provoke the system such that weaknesses in handling perturbations are discovered. As our interest is on robustness, i.e., how the system handles external perturbations, our goal is to use models that provoke as many and as diverse vulnerabilities as possible. It can also be argued that when the purpose of the evaluation is comparative, as is the case here, the value of using a realistic error model is decreased, assuming that the relative effect of different models is the same [Hiller, 2002]. Chapter 6 studies the selection of different error models explicitly. As previously noted, robustness evaluation can also be a means for finding security-relevant vulnerabilities.
Fault injection originates from the desire to estimate the effectiveness of error detection and recovery mechanisms (EDRMs) built into a system. For this purpose one chooses to use the error model used for the design of these mechanisms. The second type of evaluation is explorative in nature. Without knowledge of presence or coverage of any EDRMs, the system is evaluated to see how it handles the perturbations injected. This type of evaluation can be guided by the need to explore extra-functional behavior of the system (or lack thereof) or by lack of operational scenarios. The second case is especially true for general purpose system components, such as OS's, which may be used in many, fundamentally different, operational contexts.

The focus of this thesis is on exploration of robustness vulnerabilities of OS's. To this end we have chosen error types based on their usefulness in other research projects as well as real-world projects as reported in literature. Three main error models have been used: data type-based errors, bit-flips and fuzzing (random values).

An error appearing at the interface of a component (such as a device driver) appears as a data level error, i.e., the (data) value of some parameter used in the interface has an erroneous value. What constitutes erroneous depends on the interface/parameter in question and the state the system is in. For instance, a driver returning an erroneous “busy” value may only cause a small delay for the overall system (provided that a retry mechanism exists). A driver responding “ready” when it in fact is not ready to receive commands may cause severe failures in the system.
There are many possible sources for erroneous values to appear at the interface, such as propagating hardware errors, faulty assignments of variables in the driver code, wrong user inputs or concurrency problems. ODC is a framework to classify software defects and many of them can manifest as interface errors (one class in ODC refers specifically to interface defects) [Chillarege, 1996]. As we model the effects of faults, i.e., errors, the interface level errors model the effects of many of the underlying faults (and consequently ODC classes of defects) as propagating errors. However, as we focus solely on data level errors for device drivers at the interface to the OS we do not offer complete coverage of all operational faults. Therefore, our approach should be treated as a complement to other techniques.

All three models used represent data level errors, but differ in the implementation and the level of semantic expressiveness. The three models will now be presented one by one.
Data Type Error Model

The manifestation of a data error also depends on the data type of the parameter in question. Since modern compilers contain type checkers, not all assignments are possible to do for a parameter, restricting the set of possible errors for the parameters. The erroneous values for data type (DT) errors are therefore selected depending on the data type of the parameter in question. As most device drivers are written in the C programming language we will consider C-style data types. This excludes high level abstract data types supported in other high-level languages, such as classes/objects in object-oriented programming languages.

Some OS's do provide the possibility to write device drivers in other programming languages, for instance C++. However, since most drivers are still written in C we focus on such interfaces. In principle, object-oriented interfaces can be seen as extensions of the data structures used (we already support the struct data structure) in C.

For each data type used a set of injection cases is defined. These are predefined, before injection, and are chosen based on their effectiveness in exposing vulnerabilities in the system [Koopman et al., 1997]. Values include predefined (no randomness involved) test values, offset values and boundary values. Offset values modify the original value, for instance using addition or subtraction operations on the original value. The number of injections defined is typically relatively low, allowing this error model to incur fewer injections (on average) than, for instance, the bit-flip model [Koopman et al., 1997]. Chapter 6 discusses the number of injections required compared to other error models.

Since the injection cases are defined on a data type basis the number of such data types used becomes the scaling factor for the data type error model. Typically, multiple services in such interfaces use the same data types. For instance, in [Kropp et al., 1998] only 20 data types were used for the 233 POSIX functions targeted. Each of the 233 functions used a combination of the 20 data types, making this error model scale very well with the number of functions. No specialized injection scaffolding is required for each tested function. However, one caveat is that information on the data type used is required to select the right injection cases.
Table 3.1: Overview of the data types used.

Data type     C-Type            #Cases
Integers      int               7
              unsigned int      5
              long              7
              unsigned long     5
              short             7
              unsigned short    5
              LARGE_INTEGER     7
Misc          void *            3
              HKEY              6
              struct {...} *
Strings                         4
Characters    char              7
              unsigned char     5
              wchar_t           5
Boolean       bool              1
Enums         multiple cases    #identifiers
Table 3.1 shows an overview of all the data types used and the number of cases implemented for each of the types. The injection cases were chosen based on their reported use in literature, such as the Ballista project [Bal], and to include cases modifying the original value. Not relying solely on statically defined values, like boundary values and special values, allows exploration of “close to correct” values, which may be very problematic to detect and recover from. Furthermore we have only selected values which can occur in real code. It is important to be able to match the injected error to a hypothetical fault in the code. Since we simulate mostly software faults, each error injected must have been possible to introduce by an implementation fault. Consequently, each injected error must be compilable, i.e., it must pass the type check by the compiler. By having a specific injection case for each data type this property is maintained.
Data type errors also treat pointers as a special data type and reserve one injection case for the pointer, namely setting it to NULL. Wrong use of NULL-pointers is a common programming mistake. This is done for explicit pointers, but not for implicit pointers, such as strings, which have this case defined as a special injection case. To further illustrate how data type errors are defined, Table 3.2 shows the cases for the type int. Cases 1 and 2 modify the original value by adding an offset to it. Cases 3-7 use commonly difficult values and boundary values. Table 3.3 shows the errors injected for string parameters (both for Unicode and ASCII strings). The first case effectively evaluates the reliance on the end character for strings. The second case shortens the string by disregarding the first character and the third case replaces the entire string with an empty string. The last case sets the pointer to the string to NULL.

Table 3.2: Data type error cases for type int.

Case #    New value
1         (Original value) - 1
2         (Original value) + 1
3         1
4         0
5         -1
6         MININT
7         MAXINT

Table 3.3: Data type error cases for strings.

Case #    New value
1         Overwrite end of string ('\0')
2         Increase reference pointer
3         Replace with empty string
4         Set reference to NULL
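The cases of Tables 3.2 and 3.3 can be sketched in a few lines. This is an illustration under stated assumptions, not the injector used in the thesis: a 32-bit int is assumed for MININT and MAXINT, a C string is modeled as a NUL-terminated bytes object, the filler byte written over the terminator is arbitrary, and the function names are hypothetical.

```python
INT_MIN = -2**31        # MININT, assuming a 32-bit int
INT_MAX = 2**31 - 1     # MAXINT, assuming a 32-bit int


def int_cases(original):
    """The seven injection cases for type int, in the order of Table 3.2.

    Cases 1 and 2 depend on the original value; wrap-around at the limits
    of the type is not modeled here.
    """
    return [original - 1, original + 1, 1, 0, -1, INT_MIN, INT_MAX]


def string_cases(buf):
    """The four injection cases for strings, in the order of Table 3.3.

    buf is a NUL-terminated bytes object standing in for a C string.
    """
    return [
        buf[:-1] + b"A",   # 1: overwrite the terminating '\0' (filler is arbitrary)
        buf[1:],           # 2: advance the pointer past the first character
        b"\x00",           # 3: replace with the empty string
        None,              # 4: set the reference to NULL
    ]


int_errors = int_cases(5)
string_errors = string_cases(b"abc\x00")
```

Each case yields a value of the same type as the original parameter, which mirrors the requirement that an injected error must be something an implementation fault could have produced.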
The choice of values was based on known problematic values and on previous studies, and the number of values was kept relatively low. The choice of effective values (those exposing vulnerabilities) is difficult and context dependent, and is similar to the problems arising when selecting suitable equivalence classes for functional testing [Hamlet, 2006]. For this study we have therefore opted for a simple and lightweight data type model. Our model does not have the same expressive power as the ones used for instance in [Koopman et al., 1997] or [Fetzer and Xiao, 2002b], as it is based solely on the data type of the parameter. The in-situ injection strategy effectively limits the possible types of injections that can be carried out. The model used is chosen for its simplicity and low number of injection cases.
Bit-Flip Error Model

When hardware elements are exposed to, for instance, radiation or EMI the voltage levels in transistors can change, causing the logical ones and zeros to change values or even get stuck at a certain value. The bit-flip model (BF) simulates these types of faults in hardware, by selectively flipping certain bits, changing the value from one to zero or vice versa.

The BF model was first introduced to simulate hardware errors as above, and was used in multiple fault injection tools (see Section 2.3 for several examples of such tools). At first, hardware-based injection tools were used, but SoftWare Implemented Fault Injection (SWIFI) soon emerged, where hardware faults are injected using software mechanisms. SWIFI improves flexibility and eases implementation (no special hardware components needed), but may be limited in which areas of hardware can be targeted. Once injection could be conducted using software, bit-flips were soon also used to simulate software faults [Voas and Charron, 1996]. There is still a debate whether the BF model accurately reflects software faults. Some authors argue that this relation is of lesser importance, especially for robustness evaluation, and that the important question is whether the effects of the injected faults (the errors) are the same as those of real faults [Jarboui et al., 2002b].

In the BF model each parameter is seen as a data word, where selected bits are flipped to simulate faults in the module. A difference is made between single event upsets (SEU, also referred to as soft errors in discussions on hardware reliability) where only one bit is flipped, and the case where multiple bits are flipped. In this thesis focus is on the simpler SEU model, where one of the bits is selected as target and flipped. For a 32 bit architecture this typically results in 32 injections per parameter. Some data types do not use the full 32 bits and thus a smaller number of bits can be used.

The greatest advantage of the bit-flip model when used on interface parameters is simultaneously its greatest weakness, namely simplicity. While being very simple to implement it lacks in expressiveness for more complex errors in abstract data types, such as strings etc.
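The SEU variant of the bit-flip model can be sketched as follows. The function name and the `width` parameter are assumptions for illustration; the sketch enumerates one erroneous value per bit position, which gives the 32 injections per parameter mentioned above for a 32-bit word.

```python
def bit_flips(value, width=32):
    """Return the erroneous values obtained by flipping one bit at a time.

    Element i of the result is `value` with bit i inverted, so each
    result differs from the original in exactly one bit position.
    """
    mask = (1 << width) - 1
    return [(value ^ (1 << bit)) & mask for bit in range(width)]


word_errors = bit_flips(0)             # 32 single-bit errors for a 32-bit word
byte_errors = bit_flips(0xFF, width=8) # narrower types need fewer injections
```

Because every injected value is derived from the original by an exclusive-or, the model stays close to the original value, which is the property contrasted with fuzzing in the next subsection.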
Fuzzing Error Model

The first use of the fuzzing error model (FZ) in the context of robustness evaluation was reported in [Miller et al., 1990]. Here UNIX utility programs were fed random input data and their behavior was observed. The technique of random input for robustness evaluation was further developed in [Ghosh et al., 1998; Oehlert, 2005; Godefroid et al., 2007] and is advocated for instance by Microsoft as part of their Secure Development Lifecycle [Howard and Lipner, 2006] and is mainly focused on files and network protocols.

For interface fault injection, fuzzing translates into replacing the value of a parameter in the interface with a random value. The random value is uniformly selected across all 32-bit values. The uniform distribution is selected since no knowledge regarding operational profiles or equivalence classes of parameter values is present (or assumed), which could justify using a different distribution. More sophisticated techniques can also be applied, as in the area of random testing (see Section 2.5.1). As true randomness is difficult to achieve, pseudo-random generators are used. Care needs to be taken to make sure that the seeds used for these generators are selected differently for each experiment, else the same value will be chosen for each injection.

It is important to note that whereas BF (and in some cases also DT) modifies a given value, i.e., the new erroneous value depends on the original (presumably correct) value, fuzzing completely replaces the value with a new one. This means that BF can be expected to more effectively test “close to correct” values, whereas FZ is better (thanks to the random selection) at finding more rare values causing vulnerabilities.
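The role of the seed can be made concrete with a short sketch. The function name `fuzz_values` is hypothetical, and Python's pseudo-random generator merely stands in for whatever generator an injector would use; the sketch draws uniformly distributed 32-bit replacement values from a generator with an explicit seed.

```python
import random


def fuzz_values(seed, count):
    """Draw `count` replacement values uniformly from all 32-bit words.

    A dedicated generator is created per call, so the sequence is fully
    determined by the seed.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


experiment_a = fuzz_values(seed=1, count=5)
experiment_a_repeat = fuzz_values(seed=1, count=5)  # same seed, same values
experiment_b = fuzz_values(seed=2, count=5)         # new seed, new values
```

Reusing a seed reproduces an experiment exactly, which is useful for repeatability; reusing it by accident across experiments means every injection receives the same "random" value, which is the pitfall noted above.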
3.2.2 Error Location

As location and distribution of actual faults may be unknown, or too costly to fully explore, a common approach is to inject errors instead of faults, i.e., to inject the consequences of activated faults rather than the faults themselves [Barton et al., 1990]. Many faults may manifest as the same error, i.e., at the same location/level in the system. This concept is illustrated in Figure 3.3. An injected error may represent multiple faults (at 2a or 2b), originating at different locations. An error injected at the interface (at 3.) may therefore represent multiple errors having propagated to the same location.

Jarboui et al. [2003] make a distinction between the level in the system where a fault is injected and the reference location of the originating fault, d_r. In Figure 3.3 this corresponds to the distance between points 1 and 2. Furthermore, a distinction is made between the location of the injected error and the level where the failures of the system are observed, d_o. In Figure 3.3 this corresponds to the distance between points 1 and 4.
In this thesis we have focused on errors appearing in the interface between the OS and its device drivers. For the purpose of robustness evaluation of an OS, this interface represents a good location for injecting errors for the following reasons:

• This is a standard interface defined for the OS. Using a standard interface facilitates fair comparisons across drivers for the same OS.
• The interface allows for low-intrusion interception of the calls being made across the interface. Low intrusion means that no source code changes are needed, neither to OS, nor to drivers, which may not be available for commercial products.
• As described above, injecting errors in the interface between the driver and the OS allows for simulating multiple faults within the driver and in the hardware it controls.
• Injecting errors at this level allows for achieving 100% activation ratio for the injected errors using pre-profiling. This process is described in Section 4.6.
• No driver-specific knowledge is needed, making the approach readily available also for non-driver experts.
• The interface is an open interface, in that other 3rd party developers are given access to the full interface, making robustness a key issue for the vendor of the OS.

[Figure 3.3: Error manifestation example, showing Component A and Component B with the following points marked: 1. Fault injection location; 2. Fault origins (2a, 2b); 3. Error location in interface; 4. Observation point.]
3.2.3 Error Trigger
An error can be a permanent defect present in the system, or a combination of defects and external perturbations that together lead to an error. This gives rise to two distinct properties of an error, relating to timing: the event triggering the appearance of an error and the duration of the error once it appears.
For hardware related errors, permanent errors are not uncommon, even though recent research indicates that the ratio of transient errors is increasing. For software, permanent errors (Bohrbugs) are targeted using testing. For robustness evaluation the main target is Heisenbugs. Furthermore, the very nature of Heisenbugs makes them difficult to find since a simple relation between inputs and failure cannot be established. Therefore, they are instead simulated by the injection of errors in the system. In this work we have focused on a transient error model as we believe this to more closely represent behavior by the drivers not found through standard testing routines. Multiple other fault injection tools allow for injection of both transient, intermittent and permanent faults, e.g., [Han et al., 1995; Stott et al., 2000].

The trigger used for an injection is studied in depth in Chapter 7. Which type of trigger to use (event- or time-driven) and which parameters to use is a non-trivial task. This thesis proposes a novel event-driven approach, resulting in an effective, yet simple, process for the selection of triggering events. The proposed approach is based on the usage profile of the drivers, which allows injection in different states of the system.
3.2.4 Other Contemporary Software Error Models
The work reported in [Albinet et al., 2004; Durães and Madeira, 2003; Arlat et al., 2002; Gu et al., 2004; Jarboui et al., 2002a] explored the use of various error models and injection techniques for OS robustness evaluation and benchmarking. In [Jarboui et al., 2002a], for instance, error models similar to ours are used, but are injected at different levels within the Linux kernel. Durães et al. use a code mutation error model, where code segments of device drivers are targeted [Durães and Madeira, 2003]. Mutations have long been used to assess the effectiveness of testing. For instance, DeMillo used code mutations for investigating the efficiency of a set of test cases in discovering flaws of a piece of software [DeMillo et al., 1978]. The authors develop a theory regarding the coupling effect, namely that if a set of test cases (input values) can distinguish all (simple) mutations of a program from the correct one, then it will also detect more complex faults in the code. Mutations were later used in fault-based testing (for instance [Zeil, 1983; Morell, 1990]) to verify that certain code level errors are not present in a piece of software.
Finally, in [Moraes et al., 2006] the authors note that errors appearing at interfaces of components, though being useful for robustness evaluation, do not necessarily represent faults in the code. Since we are indeed focusing on robustness, interface errors are relevant. However, continued research on error propagation can hopefully reveal which errors can be represented at the interface, and which cannot.
3.3 Experimental Environment
The experimental environment detailed in this section was chosen to represent a commonly used OS. We chose Windows CE .Net because it provides the possibility to customize the whole system image in an easy manner and it is a wide-spread OS, with uses in a wide range of products. Furthermore, its architecture resembles that of most modern OS's, making it an excellent case study. This section first introduces Windows CE .Net, its architecture and tool support. Then the hardware setup used is presented together with a description of the software setup.
3.3.1 Windows CE .Net
Windows CE .Net is an OS from Microsoft, targeted mainly at the embedded market. It is highly configurable, making it widely used in different configurations in diverse products, such as mobile phones and PDAs, Internet surf stations, Point-of-Sale stations, GPS navigators etc. Windows CE is the foundation for more specific embedded OS's from Microsoft, such as Pocket PC and Windows Mobile.

The first products based on Windows CE were released in 1996 as so-called Handheld PCs. Further revisions of the first version have led to the currently latest version, 6.0, at the time of writing this thesis. The work in this thesis is based on version 4.2 of the OS, called Windows CE .Net, which was released in 2003. For reasons of continuity we have chosen not to move to a newer version. Therefore, the rest of this thesis describes version 4.2 of the OS. A good introductory textbook on programming for Windows CE is Douglas Boling's book Programming Microsoft Windows CE .Net [Boling, 2003].

Figure 3.4 shows an overview of the architecture of Windows CE .Net. It shows how the OS layer is split in two parts, the generic OS layer provided by Microsoft, which forms the interface used by applications, and the OEM layer which is provided by the OEM (Original Equipment Manufacturer) embedding Windows CE in a product sold to customers. The OEM layer makes it possible to use Windows CE for many hardware architectures and for a multitude of peripheral devices and technologies. Although device drivers are the main responsibility of the OEMs, some generic drivers are also provided by Microsoft as part of the OS package.
In relation to our system model (Figure 3.1) the Driver Layer is formed by the drivers provided either by the OEM or Microsoft. The rest of the OEM layer is considered as part of the OS Layer.

[Figure 3.4: An overview of the architecture of Windows CE .Net. Figure adopted from [MSDN]. Layers shown, top to bottom: Application Layer (Internet Client Services, WinCE Applications, User Interface, Custom Applications); Operating System Layer (Core DLL, Object Store, Multimedia Technologies, Graphic Windowing and Event System (GWES), Device Manager, Communication Services and Networking, Kernel); OEM Layer (OEM Adaptation Layer, Boot Loader, Configuration Files, Drivers); Hardware Layer.]

Windows CE .Net, although being a completely different OS than other OS's produced by Microsoft, offers many of the same services and interfaces used on the Windows platform. Examples include the .Net platform (as a subset known as .Net Compact Framework), Win32, MFC etc. At design time the designer can choose which components to include in the OS image. A component is a separate piece of functionality that can either be included in the OS image or not. The designer uses the Platform Builder tool to build the OS image and to download it to the target machine.
3.3.2 Device Drivers in Windows CE
A device driver for Windows CE .Net is a dynamic link library (Dll). It is dynamically linked into another process at load time. This host process can then use the services provided by the driver. Most drivers in Windows CE .Net are loaded by the device manager (device.exe). The Registry¹ is used to specify which drivers are to be loaded in the system, and in which order (if there are dependencies across drivers the order might be important).

The interface used for communication between the OS and the driver is defined in the C programming language. Each driver exports a set of services and uses (imports) services from the OS to perform the services requested. Applications access devices through the OS, for example through the file system interface. These calls are translated to corresponding calls into the driver.

¹ The Registry is a Windows specific technique to centrally store configuration information, both for the system and for applications.

Since Windows CE .Net is supported on many hardware platforms and used mainly for embedded systems it supports many different peripheral devices. Windows CE .Net supports three basic types of drivers: native, bus and Stream interface drivers. Native drivers are built-in drivers provided by the hardware vendors. They are typically tied to specific hardware and OS versions, for controlling things such as keyboards, touchscreens etc. They might use completely custom interfaces to the OS and therefore often require changes when new versions of the OS are released. Native drivers can be seen as extensions of the OS for the required hardware, rather than supporting add-on devices.

The most common type of interface is the Stream interface, which provides standard entry points for a driver. Since the interface is well specified it allows for 3rd party developers to build drivers for the OS. Table 3.4 shows an overview of the entry points provided. The prefix COM is by convention used for serial drivers; other drivers use other prefixes, such as CON for console drivers or WAV for audio wave drivers.
Table 3.4: Stream interface for serial driver.
Number  Name
0       COM_Init
1       COM_Deinit
2       COM_Open
3       COM_Close
4       COM_Read
5       COM_Write
6       COM_Seek
7       COM_IOControl
8       COM_PowerDown
9       COM_PowerUp
We focus mostly on Stream interface drivers, as these use a standard interface allowing fair comparison across drivers, are typically developed by 3rd parties and represent add-on peripherals where a choice among competing products can actually be made.
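Because every Stream interface driver exposes the same ten entry points under a driver-specific prefix, the list of exported names can be derived mechanically from the prefix. The following C++ sketch illustrates this; the helper `stream_exports` is hypothetical, and the underscore separator between prefix and name is assumed from the usual Windows CE naming convention.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Entry-point names in the order of Table 3.4 (numbers 0 to 9).
const std::array<const char*, 10> kStreamEntryPoints = {
    "Init", "Deinit", "Open", "Close", "Read",
    "Write", "Seek", "IOControl", "PowerDown", "PowerUp"};

// Builds the exported names for a given driver prefix,
// e.g. "COM" for serial drivers or "WAV" for audio wave drivers.
std::vector<std::string> stream_exports(const std::string& prefix) {
    std::vector<std::string> names;
    names.reserve(kStreamEntryPoints.size());
    for (const char* entry : kStreamEntryPoints) {
        names.push_back(prefix + "_" + entry);  // separator is an assumption
    }
    return names;
}
```

A list of this kind is what makes it possible to compare drivers on a like-for-like basis: entry number 4 denotes the read service regardless of which driver is targeted.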
3.3.3 Hardware
The hardware setup used for the experiments presented in this thesis is an XScale-based reference board produced by Intrinsyc Ltd [Int]. Multiple boards were acquired to allow for parallel execution of the injection experiments.

The boards are based on the Intel PXA250 architecture, with an Intel XScale processor chip. Each board carries 64MB RAM and 32MB Flash-based ROM. A boot loader is present in flash and allows for simple download of new OS images, either to the ROM, or to RAM for immediate boot. A dedicated flash chip is also present, with access from user space applications.

The boards are equipped with a set of serial port connections (standard RS232), where one acts as debug port. Two standard Ethernet sockets (RJ45) allow for network connection. Each board is also equipped with a CompactFlash socket, where peripheral devices can be attached onto the PCMCIA bus.
3.3.4 Software
Using the provided Platform Builder tool, a small-footprint image containing the OS and the associated software modules described in Section 4.5 is built and downloaded to the target board using Ethernet. Starting with the smallest supported image, only components for the desired drivers and the hardware specific components supplied by the vendor were included. This resulted in an image with a footprint of less than 3MB. Since the boards used are headless, only minimal graphics and windowing components are needed. Also media-related components (e.g., readers and viewers of different file formats) and Internet components (web, telnet, and ftp servers etc.) are left out of the image. This way we get a system which contains a minimum number of components that may influence the result of the experiments.
3.3.5 Selected Drivers for Case Study
Three drivers were chosen as representative for a case study: a serial port driver (cerfio serial), a network card driver (91C111) and a driver for accessing a CompactFlash card connected to the PCMCIA bus (atadisk). These drivers were chosen as they represent different classes of drivers: the first two are common types of communication and the third provides access to external sources attached to the system. The first two are supplied, in this case, by the vendor of the development board (not the same as the producer of the hardware circuits), whereas the CompactFlash driver is delivered as part of the OS. They also represent drivers typically found on many systems and platforms. All three provide the required Stream interface entry points and are loaded by device.exe at load time.
3.4 Summary
This chapter has introduced the preliminaries needed for the discussion in the following chapters. The system model used was introduced and, for reference, Table 3.5 provides an overview of the symbols defined. The error models used were introduced and discussed, followed by a description of the experimental environment used, including both hardware and software aspects.
Table 3.5: Summary of symbols introduced.
Symbol   Description
APP_i    Application i out of a total of n applications running on the system
D_x      Driver x from a total of N drivers in the system
s_i      A service provided by the OS, to be used by an application.
os_x.j   A service provided by the OS, used by driver D_x.
ds_x.j   A service provided by driver D_x to be used by the OS.
s_x.y    Any service in the OS-Driver interface, disregarding the difference between imports and exports.
S        The set of OS services provided in the OS-Application interface.
A_x      The set of OS services used by APP_x.
O        The set of services in the OS-Driver interface.
Chapter 4
Fault Injection Framework
How to perform fault injection for device drivers?
In order to provide the infrastructure and support needed to perform the required experiments a fault injection framework has been implemented for Windows CE .Net. The framework allows for injection of a varied set of error models in the interface between the OS and its device drivers. This chapter first discusses the requirements on the injection framework and then describes the architecture of the framework, its implementation for Windows CE .Net and the extension possibilities it provides.
4.1 Introduction
When performing research using fault injection a flexible environment is needed, such that new ideas can easily be pursued, without extensive redesign of the underlying tool.¹ The framework should support automatic configuration, such that management of configuration settings is simplified and the risk of mistakes is minimized. It should also simplify the collection, analysis and storage of data.

A plethora of fault injection tools exist in literature (see Section 2.3 for a comprehensive list). However, even though some of the tools could have been adapted, we decided to implement a new injection framework to better suit our needs. For the implementation several requirements were postulated, which allow for a flexible fault injection environment:
• Extensibility: Injection of multiple drivers, using multiple error models should be supported. The per-driver scaffolding should be minimized.
• Black-box: No access to source code should be assumed, to allow external evaluators the possibility to use the tool.
• Data handling: The data extracted from the experiments should be processed and stored without loss of information, and allow for easy extension and interoperability with external tools.
• Automation: Automation is key to a) reduce the time overhead associated with configuring and running fault injection experiments; b) to minimize the risk of user mistakes in configuring the experiments; and c) to more easily adapt to changes in the setup requiring configuration changes.
These design requirements were the basis for the design of our fault injection framework. Even though one of the goals is extensibility we have chosen to implement the framework for a specific OS, Windows CE .Net. Extending the framework for other OS's is part of future directions. The rest of the chapter describes the overall design of the framework, the physical setup used and the functionality of the system components.

¹ As our tool is flexible and supports extensions we will refer to it as a framework for fault injection.
4.2 Evaluation, Campaign & Run
To simplify the discussion we make a distinction between an evaluation, an injection campaign and an injection run. An injection run is the injection of a specific error and the observation (and logging) of the effects of the injection. An injection campaign is a collection of injection runs, pertaining to, in our case, a specific driver, error model and interface (Dll, imported/exported). A set of injection campaigns forms the basis for the evaluation of a system, including possibly multiple drivers and error models.
4.3 Hardware Setup
The target system for the evaluation runs on a dedicated computer, the Target Computer. As already mentioned we use special development boards for the evaluation. The evaluator controls the evaluation from a normal PC workstation (in our setup running Windows XP SP2).

[Figure 4.1: The hardware setup. Using an Ethernet switch to connect each development board to the private network. Each board is also connected directly to the Host Computer via serial cables. Diagram labels: Host Computer, Serial connection, Ethernet Switch, Target Computers.]

Figure 4.1 shows the hardware setup, with two target boards connected. Each board is connected to a private network using an Ethernet switch, which provides dynamic IP addresses using a built-in DHCP server. Each board is connected to the Host Computer over a debug serial connection to access the boot menu of the boot loader and to read debug output. Additional serial connections are used by each board's workload, as described below.
4.4 Software Setup
The binary image downloaded to a board for each experiment campaign contains all necessary software components required for performing the experiments. This includes all components comprising the system depicted on the target computer side of Figure 4.2.

Platform Builder together with the Embedded Visual C++ 4.0 tool were used to compile and build the applications and OS images used. Most of the code for the components detailed in Section 4.5 was written in C++, or in some cases in C. Furthermore SQL Server was used on the Host Computer (Windows XP workstation) to store the experiment data. Applications running on the Host Computer are written either in C# (.Net) or C++.
4.5 Injection Setup
Each injection is specified using three (integer) parameters: service ID, parameter number and injection case number. The service IDs can be selected in any order, as long as each service for each campaign is uniquely identified. When storing data in the database on the Host Computer, each service obviously has to have a globally unique identifier. Parameters (including return values) are simply numbered as they appear in the argument list. Finally, injection cases have to be uniquely identifiable for each parameter and are defined for the data type and error model used.
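The three-integer specification lends itself to a simple enumeration of all runs in a campaign. The C++ sketch below is illustrative only: the struct `InjectionSpec`, the function `enumerate_runs` and the zero-based numbering are assumptions, not the data layout used by the framework.

```cpp
#include <cassert>
#include <vector>

// One injection run, identified by the three integers named in the text.
struct InjectionSpec {
    int service_id;  // unique within the campaign
    int parameter;   // position in the argument list (return values included)
    int case_no;     // injection case for that parameter
};

// cases[s][p] holds the number of injection cases that the chosen error
// model defines for parameter p of service s.
std::vector<InjectionSpec> enumerate_runs(
        const std::vector<std::vector<int>>& cases) {
    std::vector<InjectionSpec> runs;
    for (int s = 0; s < static_cast<int>(cases.size()); ++s) {
        for (int p = 0; p < static_cast<int>(cases[s].size()); ++p) {
            for (int c = 0; c < cases[s][p]; ++c) {
                runs.push_back({s, p, c});
            }
        }
    }
    return runs;
}
```

With two services, where the first has parameters with two and zero cases and the second has one parameter with three cases, the enumeration yields five runs.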
To support extensibility and to have a flexible injection framework we have opted to use SWIFI. No specialized hardware support is required and injections are performed using software only. Figure 4.2 shows an overview of the main software components of the system, showing both the target and Host Computer. Apart from the OS itself and its drivers, the system contains the following main modules:
• Host Computer: The main responsibility of the Host Computer is to receive and store log messages sent by the Experiment Manager. The Host Computer is additionally used to build and download new OS images to the target computers.
• Experiment Manager: Responsible for setup, control and logging of experiments. It communicates with the Host Computer, which stores log messages.
• Interceptor: The Interceptor is a module used to intercept communication between the OS and the targeted driver. Two types of Interceptors are used, one for tracking calls and one for injecting errors.
• Test Applications: The workload consists of a set of test applications exercising the system and the drivers in a multitude of ways.

[Figure 4.2: An overview of the experimental setup. Target Computer: Experiment Manager (Exp. Setup, Exp. Sync., Logging, Restarting), Test Applications, Operating System, Interceptor, Target driver. Host Computer: Echo server, Logging server.]
The Injector and the Experiment Manager interact to coordinate each injection run and to send log messages to the Host Computer. Similarly, the test applications report their progress to the Experiment Manager, which forwards log messages. Information between modules is exchanged using message queues, a message passing primitive native to Windows CE .Net.
The components of the target computer are built into a new OS image as described in the previous section. The OS image is downloaded to the onboard flash memory and loaded into RAM each time the OS (cold) boots.

A possible risk when conducting fault injection is the presence of dormant faults, i.e., faults from previous injections that are left dormant in the system. This can lead to unpredictable and non-reproducible results, as the outcome of an injection may be affected by such dormant faults. To minimize this risk the system is (cold) restarted between each injection, resulting in a fresh copy of the OS image being loaded for each injection run. Logs are sent and stored on a different machine (the Host Computer) and a minimal set of configuration information is stored in flash memory. This process is conservative and common for fault injection experiments, e.g., [Chillarege and Bowen, 1989; Gu et al., 2003]. However, the restart before each injection incurs a substantial run-time overhead when errors are masked or overwritten.

The process of producing an injection-ready OS image is illustrated in Figure 4.3. First the binary of the original driver is scanned to identify exported and imported services. Together with information in system header files and the online documentation the Interceptor module is constructed (A in Figure 4.3). For intercepting imported functions the binary of the original driver is modified to import the functions from the Interceptor module instead of the original services (B in Figure 4.3). For exported services the system is configured to use the Interceptor module instead of the original driver, by modifying the configuration of the loading process of drivers in the system Registry (C in Figure 4.3). Lastly the (modified) driver, Interceptor, configuration and other system components are merged into a single OS image to be downloaded onto the target computer (D in Figure 4.3).

[Figure 4.3: Building an OS image for injection. Steps: A. Build specialized interceptor wrapper; B. Modify driver to use wrapper; C. Modify Registry; D. Build new OS image. Elements shown: Driver, Modified Driver, Header Files, Specification/Documentation, Interceptor, Registry configuration, Experiment Manager, Test Applications, Injection image.]
4.5.1 Experiment Manager
The Experiment Manager runs as a separate process on each board and is responsible for setup, monitoring and logging. The parameters needed to configure the system are set up using the system Registry, which is part of the binary image of the OS built offline. These settings remain static throughout the experiment. Dynamic information concerning which injections have taken place and the configuration for the next injection is stored in a plain text configuration file in persistent (flash) storage on the device.

At boot time the Experiment Manager starts by setting up a connection for sending log messages to the Host Computer. Depending on the targeted driver, either serial or Ethernet communication is used. From this point on log messages can be sent to the Host Computer.
Next it reads the configuration file to find out which service is to be targeted for the next injection (see Figure 4.4). When the injection data has been read it is marked as pending and the change is flushed to be made persistent. Once an experiment has finished, and before rebooting, the pending flag is changed to finished, indicating that the Experiment Manager could reboot the system in a clean way. This simple mechanism allows us to detect hangs and unclean reboots by the system. If, at boot time, a pending flag is found for the next injection a special log message is sent with this information.

The Experiment Manager then waits for the Interceptor to send a “Ready for injection” message, after which it sends the injection data (service, parameter and injection case). The Interceptor then handles the rest of the actual injection.

After an injection has been configured the test applications are started and monitored. Each test application updates the Experiment Manager on any assertion violation. Their exit status is also monitored by the Experiment Manager and if they exit abnormally this is logged. If they have not exited within a given time period (about three times normal execution time) they are considered hung/crashed and this fact is logged.
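The pending/finished flag and the hang timeout described above amount to two small decisions, sketched here in C++. The enumerations and function names are hypothetical; only the rules themselves (a pending flag found at boot indicates an unclean reboot, and a test application exceeding roughly three times its normal execution time is treated as hung) come from the text.

```cpp
#include <cassert>

// State persisted in the configuration file for the next injection.
enum class RunState { NotStarted, Pending, Finished };

// Conclusion drawn by the Experiment Manager when it reads the flag at boot.
enum class BootFinding { StartNextInjection, UncleanReboot };

// A flag still marked Pending at boot means the previous run never reached
// its clean reboot step, i.e. the system hung or rebooted uncleanly.
BootFinding inspect_flag(RunState persisted) {
    return persisted == RunState::Pending ? BootFinding::UncleanReboot
                                          : BootFinding::StartNextInjection;
}

// A test application counts as hung/crashed once its run time exceeds
// factor * normal execution time (the text uses a factor of about three).
bool is_hung(double elapsed_s, double normal_s, double factor = 3.0) {
    return elapsed_s > factor * normal_s;
}
```

The flag has to be flushed to persistent storage before the injection starts; otherwise a crash during the run would leave no trace for the next boot to detect.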
Once the outcome for each of the test applications is known the system is automatically cold restarted. The cold restart ensures that any data left in RAM is replaced and that a clean image of the OS is read back in from flash storage. For the case when an error causes the system to not respond to the Experiment Manager's attempts to reboot the system, a dedicated reboot process is used. This process is started automatically at boot time and simply tries to reboot the system after a specific timeout has triggered (currently four minutes). Defining such a timeout is common practice for fault injection [Arlat et al., 1990; Durães and Madeira, 2006]. If this also fails a manual hardware reset is required by the evaluator.
4.5.2 Host Computer
The Host Computer is used to manage and support the experiments. The Host Computer runs a set of experiment servers. Each server is responsible for communicating with one Target Computer. The server can be configured to respond both on network and serial communication. It is responsible both for echoing test application data sent as part of the workload on each Target Computer and for receiving log messages. It also keeps a timer for each log stream and if no message is detected within a given time the operator is alerted that the board may have hung, requiring a hardware reset.

Log messages are stored sequentially in a text file, one file per managed board. The operator can also add custom log messages to the log file, for instance that a board was manually restarted. The log files are processed off-line and the results are stored in a relational database. Currently we use SQL Server 2005 from Microsoft, but other databases could be used as we do not rely on specific functionalities of the underlying server. Header files are processed to match services to function names for easier handling of the log data. The use of a database significantly improves working with the data. Queries can be posed in a structured way and saved for later use using the SQL query language, in our case Transact-SQL [TSQ].
Experiments are classified into failure classes as presented in Section 5.2. This is done automatically on the data stored in the database for each experiment. Failure classes are defined as disjoint predicates on the experiment data, and are implemented as views in SQL. This is a very flexible approach as improvements to the failure classification scheme can be introduced very quickly by modifying the SQL definitions for each view. It is also very easy to run consistency checks on the data to check that the failure classes are indeed disjoint and complete (each experiment is in one, and only one class).
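The consistency check mentioned above (every experiment falls in exactly one class) can be expressed independently of SQL. The following C++ sketch mirrors the idea with in-memory predicates; it is an illustration only. The fields of `Experiment` and the three classes returned by `example_classes` are invented for the example and are not the failure classes of Section 5.2.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical per-experiment observations (not the thesis schema).
struct Experiment {
    bool assertion_violated;  // a test application reported a violation
    bool app_hung;            // a test application exceeded its time limit
    bool unclean_reboot;      // a pending flag was found at the next boot
};

using FailureClass = std::function<bool(const Experiment&)>;

// Illustrative classes only: system-level failure, application-level
// failure, and no observed failure.
std::vector<FailureClass> example_classes() {
    return {
        [](const Experiment& e) { return e.unclean_reboot; },
        [](const Experiment& e) {
            return !e.unclean_reboot && (e.assertion_violated || e.app_hung);
        },
        [](const Experiment& e) {
            return !e.unclean_reboot && !e.assertion_violated && !e.app_hung;
        },
    };
}

// True if every experiment satisfies exactly one predicate, i.e. the
// classes are disjoint and complete over the given data.
bool is_partition(const std::vector<FailureClass>& classes,
                  const std::vector<Experiment>& data) {
    for (const Experiment& e : data) {
        int hits = 0;
        for (const FailureClass& c : classes) {
            if (c(e)) ++hits;
        }
        if (hits != 1) return false;
    }
    return true;
}
```

In the database setting the same check corresponds to counting, per experiment, the number of views in which it appears and verifying that the count is always one.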
4.5.3 Interceptors
Communication between the targeted driver and the OS is tracked using Interceptor modules. There are two types of Interceptors, trackers and injectors. The Tracker is used only in Chapter 7 to track the calls made to a driver. The Injector is used to perform the actual injection.

Each service exported and imported by a driver is a function call to/from a dynamic link library (Dll). As previously described in Section 3.2.1 some error models require the data types of the parameters used in function calls to be tracked. Tracking the data type used serves two purposes: first and foremost it is used to select the specific error to inject, but it is also used to reduce the number of injections for error models based on bit strings (like the BF model described in Section 3.2.1) by restricting injection to the bits used.

The C language does not provide any reflective mechanisms whereby the data type of a parameter can be discovered at run-time. This information is pertinent for data type-based injection. For most functions the information is available in the form of header files present on the system. In some rare cases online product documentation is used to resolve parameter definitions, in this case Microsoft's online developer documentation [MSDN].
Injection is done on one interface at a time. Exported functions are targeted separately from imported functions, thereby defining a single injection campaign. Functions from one imported Dll are targeted separately. The injection module wraps the driver and acts as a “trojan horse” to both the driver and the OS, by imitating the behavior of the other party. Similar strategies have been used in previous fault injection tools, for instance, [Segall et al., 1988].

For each injection campaign (driver/Dll) a separate Injector module is built, where an injection wrapper is built for each function in the service interface. The Injector interacts with the Experiment Manager and activates the targeted injection wrappers and can be configured to make multiple wrappers active for an injection run. However, for the experiments carried out only one was made active and the non-activated wrappers act as passthroughs, without touching the parameter values used.
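The distinction between an activated wrapper and a passthrough can be sketched as follows. This is a simplified C++ model under stated assumptions: the class `InjectionWrapper`, the single 32-bit parameter and the one-shot (transient) behavior are chosen for illustration, whereas the real wrappers mirror the C signatures of the driver interface.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for one service in the driver interface.
using Service = std::function<std::uint32_t(std::uint32_t)>;
using Corruptor = std::function<std::uint32_t(std::uint32_t)>;

class InjectionWrapper {
public:
    explicit InjectionWrapper(Service target) : target_(std::move(target)) {}

    // Arms the wrapper; until this is called it is a pure passthrough.
    void activate(Corruptor corrupt) { corrupt_ = std::move(corrupt); }

    std::uint32_t operator()(std::uint32_t arg) {
        if (corrupt_) {
            arg = corrupt_(arg);  // inject the error into the parameter
            corrupt_ = nullptr;   // transient model: inject only once
        }
        return target_(arg);      // forward the call to the real service
    }

private:
    Service target_;
    Corruptor corrupt_;  // empty while the wrapper is not activated
};
```

Because a non-activated wrapper forwards its arguments unchanged, all wrappers can stay in place for every run and only the targeted one needs to be armed.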
To make sure that the injection framework itself does not modify or perturb the behavior of the system each experiment campaign starts with an error-free run, i.e., a run where all wrappers are in place but act as passthroughs. This run allows the evaluator to verify that communication with the Host Computer is set up properly and that no unexpected problems have arisen, before any actual experiments are carried out. During this error-free run the system is profiled to minimize the number of injections that need to be carried out. This process is further detailed in Section 4.6.
When targeting imported services the binary of the driver is modified. The binary format of drivers on Windows CE .Net follows the Portable Executable (PE) format [Mic, 2006], where Dlls being dynamically linked to the driver are specified together with the services used. By modifying the name of the library being linked, the Interceptor Dll can be linked instead of the original Dll. A dedicated application has been implemented for performing the modification of the binary image of a driver. It runs on the Host Computer and is used before building a new OS image. The Interceptor is implemented to export all functions that the driver uses in the original Dll. The Interceptor then in turn loads the original Dll and can pass any calls along.

The Injector can work in three modes: a) for testing purposes it can act as a complete passthrough, b) it can wait for injection instructions from the Experiment Manager and then activate the appropriate injection wrapper, and c) it can build all injection wrappers with passthrough functionality and query their injection cases. The latter mode allows the creation of injection cases on the fly. This is important, as when a new error model is implemented/enhanced or when new functions are wrapped, the injection cases are automatically generated without human assistance, saving time and reducing the probability for mistakes. The two latter modes are illustrated in Figure 4.4.

In Figure 4.4, when the system boots it checks for the presence of a configuration file, specifying which errors are to be injected and which injections have already been performed, as previously described. If one exists it is read and the Interceptor builds the required injection wrapper. Next the Interceptor sends the “ready for injection” message to the Experiment Manager, which then starts the test applications (workload). The Interceptor is now ready to inject the error when the specified trigger fires. When injection is finished (intermittent and permanent error models may inject multiple errors) the system waits for the test applications to exit before updating the configuration file and then rebooting. Note that if the system is still prepared to inject errors when the test applications exit the system is rebooted anyway, else permanent errors, or errors not being triggered by a certain workload, lead to livelock (the system waiting for the error to be triggered, which will not happen).

In parallel to the injection process a watchdog timeout process is started which reboots the system after a set timeout from boot time has elapsed. The purpose of the watchdog is to reboot the system in case the rest of the system fails due to an error and the normal reboot step is not reached. Currently the timeout is set to 200 seconds, more than twice the normal execution time of the system.

Building injection wrappers is currently a manual process, but since the information required is (mostly) available in parsable header files an automatic processing is possible, similar to the approach used in [Fetzer and Xiao, 2002b]. As the wrapper only needs to be implemented once for each function, and can thereafter be used for any error model, the approach still scales reasonably well. However, for a large scale deployment of the approach this step needs to be automated as much as possible.
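[Figure 4.4: An overview of the injection process. Flowchart with a legend of Start, Condition, Wait-for and Action nodes. After Boot, the condition "Config exists?" selects one of two branches. Injection branch: Read error, Build injection wrapper, Start workload, Trigger?, Inject error, Finished injecting?, Workload finished?, Update config, Reboot. Profiling branch (no config): Build all injection wrappers, Start workload, Record profile, Workload finished?, Write config. A "Timeout?" condition leads to Reboot.]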
Specifying Error Model
Each Injector builds an in-memory data structure (a C++ object) containing information regarding the data types used, and pointers to the values passed for the targeted service. For each service one such object is built. The data types used for a specific service are hard coded into the code. This is no real limitation as the information is static (function definitions) and only needs to be defined once for each function. The error model is specified using a plugin model, where a model is specified using the error type, timing (duration) and trigger, analogously to the description in Section 3.2.
For each parameter targeted for injection the data type needs to be recorded and the error selected accordingly. The data tracking mechanism is implemented in C++, but the interface between device drivers and the OS is defined using C. We make a distinction between three major classes of parameter types, namely:
• Basic types - The basic types include the built-in types provided by the programming language, like integers and booleans, as well as specialized types where it makes sense to use specific injection cases, for instance HKEY representing a handle to a Registry key.
• Structures - This is the struct type used in C. It contains a set of members which themselves could be of any of the three classes.
• Pointers - These are references to other values, which could be basic types, structures or other pointers.
The tracking mechanism is built around the concept of a Parameter. A Parameter is of any of the three classes: basic type, structure or pointer. Figure 4.5 shows the relation between the three classes of types found. Structures and pointers refer to other parameter values, which in turn can be any of the three classes.

The three classes are implemented as C++ classes inheriting from a Parameter base class. For each data type tracked a new specialized class is implemented. A Parameter provides methods for specifying error models, querying for injection cases and injecting errors. Structures and pointers furthermore contain methods for adding members and references.
Figure4.5illustratesourimplementationofthedatatrackingmechanism.
Foreachtargetedparameteranobjectisdefined,whoseclassinheritsfrom
theParameterclass.Usinginheritance,newdatatypescaneasilybeadded,
providedtheyinheritfromtheParameterclassandimplementtherequired
methodsforinjectingerrorsetc.
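The class relation described above can be sketched in a few lines. This is only an illustration written in Python, not the C++ implementation of the framework: the class names other than Parameter, the method name injection_cases, the corrupted values and the parameter names (pConfig, baud, flags) are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Parameter(ABC):
    """Common base class for every tracked value."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def injection_cases(self):
        """Return (path, corrupted_value) pairs for this parameter."""


class IntParameter(Parameter):
    """A basic type. The three corrupted values are illustrative only."""

    def __init__(self, name, value):
        super().__init__(name)
        self.value = value

    def injection_cases(self):
        candidates = (0, -1, 2**31 - 1)
        return [(self.name, v) for v in candidates if v != self.value]


class StructParameter(Parameter):
    """A structure: its cases are the cases of all its members."""

    def __init__(self, name):
        super().__init__(name)
        self.members = []

    def add_member(self, member):
        self.members.append(member)

    def injection_cases(self):
        return [(f"{self.name}.{path}", value)
                for member in self.members
                for path, value in member.injection_cases()]


class PointerParameter(Parameter):
    """A pointer: corrupt the pointer itself, or the value it refers to."""

    def __init__(self, name, target=None):
        super().__init__(name)
        self.target = target

    def injection_cases(self):
        cases = [(self.name, None)]  # None stands in for a NULL pointer
        if self.target is not None:
            cases += [(f"{self.name}->{path}", value)
                      for path, value in self.target.injection_cases()]
        return cases


# Hypothetical service argument: a pointer to a structure with two integers.
config = StructParameter("config")
config.add_member(IntParameter("baud", 9600))
config.add_member(IntParameter("flags", 0))
argument = PointerParameter("pConfig", config)
cases = argument.injection_cases()
```

Because structures and pointers delegate to the parameters they contain, a new data type only has to supply its own injection_cases method to take part in the case generation.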
As previously explained in Section 4.5.3 an injection wrapper is built for each targeted function. The wrapper builds a model of the interface using the classes explained above together with information regarding the error model. Using this information, the Experiment Manager can query the wrapper for the injection cases possible for the given error model. This enables automatic configuration and generation of injection cases, illustrated in the right branch of Figure 4.4.

[Figure: a Parameter is a Basic type, a Structure or a Pointer. Examples of basic data types: int, unsigned int, char, unsigned char, void, short, unsigned short, HKEY, bool, wchar_t.]
Figure 4.5: Data type tracking mechanism.

The plugin model makes the injection framework considerably more flexible, compared to hard coded injections on a service/driver basis. Once the injection wrapper is defined several injection campaigns can easily be performed by specifying different error types, durations and trigger models. The error type, duration and trigger are specified as keys in the Registry, which are extracted by the Experiment Manager at boot time. The injection wrapper is then instructed to build the corresponding injection object online. This facilitates a very flexible architecture that automatically extracts the injection cases to be performed. The number of injections required is extracted from the error type object. The first time the system is booted all injection objects are built and the injections to be performed are stored in a file.
Three error type plugins have been implemented (BF, DT and FZ), but additional models can easily be implemented, including timing errors (delays). Three error durations are implemented: transient (occurs only once), intermittent (occurs x times) and permanent. As previously described only the transient model has been evaluated. The triggering mechanisms implemented include first-occurrence, call block-based and timeout-based models. One can also specify more advanced triggering mechanisms, where errors are triggered after x calls to a service or only after calls to a certain (other) service.
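How an error type, a duration and a trigger combine into one injection decision can be sketched as follows. The field names and the schedule helper are assumptions made for this illustration, and an intermittent error is assumed here to hit x consecutive calls. The framework itself reads these settings from Registry keys.

```python
from dataclasses import dataclass


@dataclass
class ErrorModel:
    error_type: str          # "BF", "DT" or "FZ" in the thesis
    duration: str            # "transient", "intermittent" or "permanent"
    trigger_after: int = 1   # first call (1-based) that receives an error
    repeats: int = 1         # the "x times" of an intermittent error

    def should_inject(self, call_number):
        """Return True if the call with this 1-based number gets an error."""
        if call_number < self.trigger_after:
            return False
        if self.duration == "permanent":
            return True
        span = 1 if self.duration == "transient" else self.repeats
        return call_number < self.trigger_after + span


def schedule(model, calls=4):
    """List the injection decision for the first few calls to a service."""
    return [model.should_inject(n) for n in range(1, calls + 1)]


# First-occurrence trigger with a transient error: only the first call is hit.
first_occurrence = schedule(ErrorModel("BF", "transient"))
# Trigger after two calls, intermittent error occurring twice.
after_two_calls = schedule(ErrorModel("DT", "intermittent", trigger_after=3, repeats=2))
# Permanent error starting with the second call.
permanent = schedule(ErrorModel("FZ", "permanent", trigger_after=2))
```

Keeping the three dimensions as independent fields mirrors the plugin idea: a new campaign is a new combination of values, not new injection code.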
4.5.4 Test Applications

The workload for a robustness evaluation is typically the set of user applications running on the system, together with their inputs. The purpose of the workload in this context is two-fold: a) to drive the use of the system and make sure that all relevant parts of the OS are used, including devices attached to the system; and b) to detect any robustness violations in the OS services used.

When the robustness evaluation is carried out on a finished product (or prototype) the workload should as closely as possible mimic the use of the product in its operational setting. However, for generic components, like an OS, information on operational settings might be unknown. For such cases generic workloads are typically used. Examples include standard performance benchmark applications [Barton et al., 1990; Kanawati et al., 1992; Carreira et al., 1998]. The same problem arises when testing components, which may be reused in the future. The future uses may not match the operational profile available at development time and thus testers are forced to anticipate “typical” use of the component [Weyuker, 1998].
Another approach is to test each service individually by defining a test harness simulating operational conditions. The harness needs to set up any specific context (such as held resources, open files etc.) needed for the service to be called in a realistic setting. This approach was used for instance in Ballista [Koopman and DeVale, 1999] and HEALERS [Fetzer and Xiao, 2002b]. We have opted to use a realistic workload instead, thus avoiding the problem of defining appropriate context scenarios.
The workload used in this thesis consists of a set of test applications. The purpose of the test applications is to exercise a wide variety of OS services and specifically the device drivers evaluated. Each test application is enhanced with additional assertions that verify that each call to an OS service returns the expected result. The expected result is derived from the documentation of the OS and a golden run of the application. The test applications are kept as simple as possible to make them deterministic. This allows assertions to be manually inserted into the code of each test application to verify the correctness. Three types of test applications are used, using the OS in different ways:

[Memory Management]: The application allocates memory and accesses it. The memory is then released.

[File System Operations]: Normal text files are created and opened. Some text is written to the file and read back. File attributes are set and checked. The file is finally deleted.

[Driver Specific]: The driver specific test application uses the driver by issuing driver specific operations.
For each driver tested a specific test application is built to test the driver's functionality. For a network card driver packets are sent and received using a connection to the Host Computer. Similarly, the serial driver reads and writes on the serial port connected to the Host Computer. Specific consistency checking assertions are added to check for any errors in the received echo strings. Similarly, the echo server on the Host Computer also checks for incomplete or otherwise erroneous messages. The CompactFlash driver is tested in a similar fashion to the file system tests above.

To get a consistent system for all injections, and to establish a common ground for comparing drivers, all driver specific test applications are executed for each test, even when the specific driver is not targeted.
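The idea of a simple, deterministic test application whose every step is checked against an expected result can be illustrated with the file system scenario. The sketch below is a portable Python analogue, not one of the Windows CE test applications. The function name and the way failed checks are reported are assumptions made for the example.

```python
import os
import stat
import tempfile


def file_system_test():
    """Run the file scenario and return the names of the checks that failed."""
    failed = []
    text = "robustness test"
    handle, path = tempfile.mkstemp(suffix=".txt")  # create the file
    os.close(handle)
    try:
        with open(path, "w") as f:                  # write some text
            if f.write(text) != len(text):
                failed.append("write")
        with open(path) as f:                       # read it back
            if f.read() != text:
                failed.append("read back")
        os.chmod(path, stat.S_IREAD)                # set an attribute (read-only)
        if os.stat(path).st_mode & stat.S_IWRITE:   # and check it
            failed.append("attribute")
    finally:
        os.chmod(path, stat.S_IREAD | stat.S_IWRITE)  # allow deletion again
        os.remove(path)                               # delete the file
    if os.path.exists(path):
        failed.append("delete")
    return failed
```

In a fault-free run the returned list is empty. A non-empty list after an injection would indicate that an error became visible at the application level.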
4.6 Pre-Profiling

A key concern with any fault injection is to keep the level of activation for the injected faults as high as possible, i.e., as many as possible of the faults should be activated and become errors. Since typically the goal of fault injection is to exercise the system's fault/error handling mechanisms and observe how it behaves in the presence of faults, the activation rate is a measure of how effectively these mechanisms are tested. Note that the goal of injecting faults can also be to assess the activation rate itself, i.e., how easily faults are activated and become errors. However, this is not the case in this thesis. In general, one wants to achieve an acceleration of the fault → error → failure process [Chillarege and Bowen, 1989].
Robustness evaluations of large systems, such as OS's, do not target evaluation of specific fault tolerance mechanisms. Therefore, one cannot generally know if a non-activated injected fault is still dormant, has been overwritten, or detected and corrected. To cope with this an experimental timeout is set, after which the fault is declared to have disappeared, either by being overwritten or handled by the system. The timeout needs to be long enough to justify this assumption, but short enough to make experimentation feasible. A high activation level naturally helps speed up the experimental process, as fewer experiments need to run until the timeout elapses. For the experiments presented in subsequent chapters a timeout of four minutes was used. This was set to be more than 100% longer than the execution time of the test applications in fault-free scenarios. Initial experiments with significantly longer timeouts did not reveal any dormant faults surfacing.
By injecting errors at a high-level interface which is readily accessible one can achieve a 100% activation ratio of injected errors, by employing a pre-profiling stage before the experiments start. This stage profiles the component in question and records each invocation made in the interface. This information is then used to filter out any injections that would never take place (because the function is not used). This technique assumes a deterministic invocation pattern in the sense that the same set of services is invoked for each run of the system. This property is also required to get repeatable results, an important aspect of system evaluation. The use of test applications that give rise to a deterministic workload makes this assumption justified and indeed for the experiments carried out for this thesis, no such deviations were observed. However, it is important to re-profile the system whenever changes are made, especially regarding the test applications, as they might give rise to new invocation patterns for the drivers.
For the injection experiments carried out a pre-profiling stage is run for each experiment campaign. The first time the system boots the test applications are executed as they are when errors are injected. In this case, an invocation profile of the targeted driver is automatically collected instead of an error being injected (right branch of Figure 4.4). Based on the services that are marked as used the injection cases generated can be filtered, such that errors are only injected for services that are actually used. By automatically performing the profiling for each new configuration any changes made to the system configuration are always considered.

In addition to recording which services are invoked, the number of invocations is also stored. This is used to further eliminate non-activated injections when investigating the time of injection in Chapter 7.
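The filtering step can be sketched as below, assuming the profile is available as a plain list of invoked service names and each injection case is a small record. The service names and the call_number field are hypothetical and only serve the illustration.

```python
from collections import Counter


def profile(invocation_log):
    """Count how often each service was invoked during the profiling run."""
    return Counter(invocation_log)


def filter_cases(injection_cases, invocation_counts):
    """Keep the cases whose target service is invoked often enough to be hit."""
    return [case for case in injection_cases
            if invocation_counts[case["service"]] >= case.get("call_number", 1)]


# Hypothetical profiling run: svc_c is specified in the interface but never used.
counts = profile(["svc_a", "svc_b", "svc_b", "svc_a"])
generated = [
    {"service": "svc_a", "call_number": 1},  # activated: svc_a is called twice
    {"service": "svc_b", "call_number": 3},  # never activated: only two calls
    {"service": "svc_c"},                    # never activated: service unused
]
activated = filter_cases(generated, counts)
```

Storing the invocation count rather than a used/unused flag is what allows the second case to be dropped: the service is used, but not often enough for an injection on its third call to take place.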
In the DTS tool a similar approach to ours is used [Tsai and Singh, 2000]. Library calls made by an application are targeted. Only calls actually performed are targeted, reducing the potentially large set of functions considerably. We use the same strategy, reducing the number of targeted functions by 36.7% on average. Table 4.1 shows the number of services specified and the number of services actually used. The table shows that many services are not used, indicating that more complex workloads may cause more services to be targeted and thus more injections to be performed.
In [Süßkraut and Fetzer, 2007] static analysis of library code is performed to reduce the number of injections. Using DT injections similar to ours, the injection cases to use are selected depending on how each parameter might be used. This information is found by statically analyzing the code to find out which (other) library functions are called for the targeted function and based on this restrict the number of injections. This does not limit the number of services targeted, but the number of injection cases required for each function parameter.

Table 4.1: The number of services specified in the OS-Driver interface and the number of services used for the specified workload.

Driver          Specified   Used   [%]
cerfio_serial   60          46     76.7
91C111          54          26     48.1
atadisk         47          30     63.8
4.7 Summary of Research Contributions

This chapter presents the injection framework used for performing the fault injection experiments performed for this thesis. The framework is implemented for Windows CE .Net, specifically focusing on extensibility and automation. Using a plugin model three error models have been implemented and the framework is easily extended to include more error models. Several aspects of the process have been automated, to minimize the risk of user mistakes and to expedite experimentation. The framework provides the following benefits for the evaluation of OS robustness:
• A flexible and extensible plugin model for easy adoption of new error models.
• A low-intrusion, black-box injection methodology, not requiring access to source code.
• A high degree of automation, not requiring manual actions to be taken when changes to the used error model or workload are performed.
• An efficient selection of injection cases, eliminating injections that would not have been activated for the used workload.
Chapter 5

Error Propagation in Operating Systems - Where to Inject

How to measure error propagation in OS's? What are quantifiable measures of error propagation?

That software components contain faults that lead to undesirable behaviors in the form of errors and failures is a fact that will not disappear in the foreseeable future. In a system design it is therefore important to quantify how such errors may affect the system as a whole. An important aspect of this is how errors spread throughout the system, i.e., how they propagate. Another aspect is the effect they have, i.e., the failure modes they give rise to. Both are important as they allow a designer to a) quantify potential dependability bottlenecks in the system, b) assist in designing higher quality components and c) guide the addition of enhancements in the system in an effective manner.

This chapter introduces a series of measures that can be used to quantify error propagation, RQ2 - quantifiable measures of robustness. They are based on the previously described system model, and show that introducing errors in the interface between device drivers and the OS can be used to study how the OS treats faulty drivers. As such the chapter also considers RQ3 - the question of where to inject errors.
5.1 Introduction
This chapter defines a framework for the evaluation of error propagation with respect to robustness in OS's. As detailed in Chapter 1 an important aspect when increasing the robustness of a system is identifying potential sources and sinks for error propagation (RQ1). The goal of this chapter is to define measures that help identify such services. To achieve this we use failure mode analysis and four separate measures: Service Error Permeability, Service Error Exposure, Service Error Diffusion and Driver Error Diffusion. After their definition a discussion on their use and its implications is presented.

As previously discussed, the use of the OS-Driver interface for measuring error propagation and effect is suitable for many reasons, such as portability across drivers, low intrusion (as no source code changes are required) and the possibility to inject multiple driver faults. The measures presented in this chapter are defined for errors appearing at this level, and therefore further substantiate the choice of error location, i.e., they answer the question of where to inject errors.

This chapter presents the analytical foundation upon which the following chapters build. Further discussion on using the framework presented here and its implications from a quantitative point of view is provided in subsequent chapters.
5.2 Failure Mode Analysis

Robustness evaluation is similar to Failure Mode and Effect Analysis (FMEA), a well known technique in reliability engineering [Leveson, 1995]. In FMEA the failure modes of individual components are postulated and their effect on other components and the whole system derived. Robustness evaluation differs, as it is experimental in nature, treating a system not as a static entity, but a dynamic system in context.

Failure modes for use in robustness evaluation are typically postulated beforehand (they may be iteratively refined of course) as a set of failure modes, or classes. For safety critical systems, FMEA may also be performed to investigate which failure modes the system or component shows in operation. Even though the techniques developed here may help towards this goal as well, it is not the prime focus of this thesis.
The failure severity scale used in this thesis is similar to several previous severity scales. Siewiorek et al. used a five grade scale for their robustness benchmark [Siewiorek et al., 1993]. Many fault injection tools have used similar scales as well, like MAFALDA [Arlat et al., 2002; Rodriguez et al., 2002], NFTAPE [Gu et al., 2003] and others, for instance [Chillarege and Bowen, 1989; Barton et al., 1990; Marsden and Fabre, 2001; Durães and Madeira, 2006]. As a representative example of failure modes defined from an application perspective, the CRASH severity scale presented in [Koopman et al., 1997] is shown in Table 5.1. The API of the OS is tested by creating a specific task that calls the targeted function and the outcome is classified according to the CRASH scale.

Table 5.1: The CRASH severity scale from [Koopman et al., 1997].

Failure Mode    Description
Catastrophic    System crash
Restart         The task is hung and requires a restart
Abort           The task terminates abnormally
Silent          No error report is generated by the OS, even though the operation tested cannot be performed and should generate an error
Hindering       Incorrect error returned
The failure classes used in this thesis are listed in Table 5.2. We use the term failure class as each failure class may correspond to multiple failure modes depending on the desired level of granularity. The chosen classes represent generic classes that apply to general purpose systems. The failure classes are defined to be disjoint, such that the outcome of an experiment can be unambiguously determined to be a member of a specific class. Whenever the outcome fits the description of more than one class it is assigned the more severe one. For instance, an error that first causes an application to crash and then the rest of the system would only be considered in the latter class.
5.3 Error Propagation

Error propagation in software happens when a fault is activated (becomes an error) and then subsequently used in a computation, leading to a new error at a different location [Lee and Iyer, 1993; Voas et al., 1996]. As an example, consider a faulty line of code where the wrong value is assigned to an integer variable. This value is read and used in a condition statement and the wrong decision is taken, leading to a set of statements being executed in error. The error has now propagated from the assignment to another part of the component. The error might continue to propagate and may propagate to other components by function calls, message passing, shared memory areas etc. Eventually, the error might propagate to the outputs of the system, there causing a failure.

Table 5.2: The failure classes used.

Failure Class   Description
Class NF        When no visible effect can be seen as an outcome of an experiment, the No Failure class is used. This indicates that the error was either not activated or was masked by the OS.
Class 1         The error propagated, but still satisfied the OS service specification as defined in the documentation. Examples of Class 1 outcomes are when an error code is returned that is a member of the set of allowed codes for this call or if a data value was corrupted and propagated to the service, but did not violate the specification.
Class 2         The error propagated and violated the service specification. For example, returning an unspecified error code or if the call directly causes the application to hang or crash but other applications in the system remain unharmed, result in this category.
Class 3         The OS hung or crashed due to the error. If the OS hangs or crashes, no progress is possible. For a crashed OS, this state must be detected by an outside monitor unless this state is automatically detected internally and the machine is rebooted.

Knowing which errors propagate and where is important because it enables countermeasures to be taken. A general design principle in the design of dependable systems is the concept of error containment, i.e., that a component masks any errors, not exposing interacting components to propagating errors [Pradhan, 1996]. For software this is difficult to realize in practice for every type of error. Instead, the failure modes and the propagation paths need to be found, such that their impact can be characterized. At least three main uses for knowledge regarding error propagation can be envisioned:
• Identifying robustness bottlenecks in the system. Some components may be more likely to spread errors or more likely to be the sink for errors propagating in the system. These components should be the focus of other verification and validation efforts.
• Exposing flaws and their impact on system dependability. Error propagation may reveal real flaws in the software and may thereby assist in designing higher quality components. The impact of such flaws can be characterized using for instance failure mode analysis, which helps a designer focus attention on the components which cause severe damage.
• Locating error detection and recovery mechanisms. The error propagation paths identify locations where specific error detection and recovery mechanisms may be added. By placing them along such paths their effectiveness is increased.
The framework presented in this chapter aims at all of these three goals. A discussion is provided on how the measures defined can help achieve this. Chapter 6 presents an implementation of these measures on a real OS and discusses their suitability.
5.3.1 Failure Class Distribution

The simplest way to compare a system's ability to withstand errors in drivers is to compare the number of severe failures the system incurs as a result of injected errors. Since the number of failures depends on the chosen error model, i.e., the number of injections performed, one can use the ratios of failures into different failure classes, the failure class distribution.

The failure class distribution highlights key differences between drivers and gives a fast overview of the ability of different drivers and/or error models to provoke failures in the system. However, when more detailed results and guidance are needed more refined measures should be used, such as the ones presented in the next section.
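A failure class distribution can be computed from raw outcomes as sketched below. The representation of an experiment as a list of observed classes is an assumption made for the example. The rule that an outcome matching several classes counts as the most severe one follows the definition of the failure classes.

```python
from collections import Counter

SEVERITY = ["NF", "Class 1", "Class 2", "Class 3"]  # least to most severe


def classify(observed):
    """Assign an experiment to the most severe class it matches."""
    return max(observed, key=SEVERITY.index) if observed else "NF"


def distribution(outcomes):
    """Return the fraction of experiments falling into each failure class."""
    counts = Counter(classify(observed) for observed in outcomes)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in SEVERITY}
    return {name: counts[name] / total for name in SEVERITY}


# Hypothetical outcomes of four injections for one driver.
outcomes = [
    [],                        # no visible effect
    ["Class 1"],
    ["Class 2", "Class 3"],    # application crash followed by an OS crash
    ["Class 1", "Class 2"],
]
result = distribution(outcomes)
```

Because the result is a set of ratios and not a set of counts, distributions obtained with different numbers of injections remain comparable.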
5.3.2 Error Propagation Measures

In the context of this thesis we are interested in how errors in device drivers propagate throughout the system. To do this, we need to clearly specify the observation points where error propagation is measured. Errors are injected in the interface between the driver and the OS. Observations are made from a user perspective by observing the behavior of user-space applications. This gives us the ability to characterize the relation between drivers' ability to spread errors and an application's use of OS services.
[Figure: four panels (a) to (d), each showing applications APP1 ... APPn above the Operating System and drivers D1 ... D2 ... DN below it.]
Figure 5.1: The four propagation measures introduced: a) Service Error Permeability, b) Service Error Exposure, c) Service Error Diffusion, and d) Driver Error Diffusion.
With the intent of finding robustness bottlenecks in the system three main goals are identified: a) to identify services in the OS-Application interface that are the likely sinks for propagating errors, b) to identify services in the OS-Driver interface that are more likely to spread errors, and c) to identify drivers that are more likely to spread errors in the system, given that errors are present. To facilitate such an identification, a basic propagation measure is defined, the Service Error Permeability, capturing the likelihood that an error in the OS-Driver interface will spread to an OS-Application service. The objectives for our measures are illustrated in Figure 5.1 and are summarized as follows:
(a) Measure for degree of error porosity of an OS service: Service Error Permeability,
(b) Measure for error exposure of an OS service: Service Error Exposure,
(c) Measure of a driver service's proneness to spread errors: Service Error Diffusion, and,
(d) Measure of a driver's ability to spread errors in the system through the OS-Driver interface: Driver Error Diffusion.
Service Error Permeability

The Service Error Permeability is the probability that an error propagates from a specific service in the OS-Driver interface to a specific service in the OS-Application interface. It is conditioned on the presence of an error in the first place. That it is a conditional probability is significant, since otherwise we would have to know the probability of error occurrence, which is inherently difficult and system specific. With the conditional probability we still get an assessment of the system's ability to contain error propagation and when an error occurrence probability is known it can be combined with the Service Error Permeability.

Two classes of services are identified in the OS-Driver interface, as shown previously in Figure 3.2 (page 36). Each driver $D_x$ exports a set of services $ds_{x.1} \cdots ds_{x.N}$. These are the services that the OS calls to instruct the driver to perform a certain operation. To implement its functionality a driver also uses a set of OS services $os_{x.1} \cdots os_{x.M}$.
For a given driver or project, only one of the classes may be of interest, for instance, for drivers that do not make extensive use of the export interface. The Service Error Permeability is therefore defined for each class explicitly. $PDS^i_{x.y}$ is defined for the exported services and $POS^i_{x.y}$ for the imported.

The Service Error Permeability is defined for a driver $D_x$, an OS-Application service $s_i$ and an OS-Driver service (either $ds_{x.y}$ or $os_{x.y}$):

$$PDS^i_{x.y} = \Pr(\text{error in } s_i \mid \text{error in } ds_{x.y}) \tag{5.1}$$

$$POS^i_{x.y} = \Pr(\text{error in } s_i \mid \text{error in use of } os_{x.y}) \tag{5.2}$$
Typically, propagation is evaluated to a service in a specific application, i.e., $s_i \in A_x$ is defined for a specific application $APP_x$, and this is the way it is interpreted here.

Service Error Permeability gives an indication of the permeability of the particular OS service, i.e., how easily the OS lets errors in a specific service in the OS-Driver interface propagate to a service used by an application. A higher permeability implies that precautions need to be taken for the services involved. Such precautions could entail either ensuring that the services are properly used (fault prevention and removal methods), including handling exceptional situations, or addition of error handling code. Note that Equation 5.2 allows us to compare the same OS service used by different drivers. The impact of the context induced by different drivers can thus be studied.

Note that the Service Error Permeability is defined with respect to subsets of the services at the OS-Application interface, S, and OS-Driver interface, O. For service pairs not members of these subsets, no assertion can be made about their permeability. It is therefore desirable to make these subsets representative of the set of services used when the system is operational.
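One plausible way to estimate such a conditional probability from injection experiments is the fraction of injections into a driver service for which an error was observed in a given application service. The sketch below assumes this estimator and a simple tuple layout for the experiment records. Both are assumptions of this illustration. The estimation actually used for the real system is discussed in Chapter 6.

```python
from collections import defaultdict


def permeability(experiments):
    """Estimate Pr(error in s_i | error in driver service) per service pair.

    experiments: iterable of (driver_service, app_service, propagated) tuples,
    one for each injection and each observed OS-Application service.
    """
    injected = defaultdict(int)
    propagated = defaultdict(int)
    for driver_service, app_service, did_propagate in experiments:
        pair = (driver_service, app_service)
        injected[pair] += 1
        if did_propagate:
            propagated[pair] += 1
    return {pair: propagated[pair] / injected[pair] for pair in injected}


# Hypothetical results: four injections in os_1.1 and one in os_1.2.
experiments = [
    ("os_1.1", "s_1", True),
    ("os_1.1", "s_1", False),
    ("os_1.1", "s_1", False),
    ("os_1.1", "s_1", True),
    ("os_1.2", "s_1", False),
]
estimates = permeability(experiments)
```

Pairs that never appear in the experiments are absent from the result, which matches the remark that no assertion can be made for service pairs outside the considered subsets.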
Service Error Exposure

To find OS services that are more exposed to errors propagating through the system, a set of relevant drivers needs to be considered. The propagation from this set of drivers can be combined into a composite measure, namely the Service Error Exposure¹ ($E^i$). Service Error Exposure considers each driver's contribution to the propagation of errors to a specific OS-Application service $s_i$ (see Figure 3.1). Thus it is an estimate of how exposed this service is to propagating errors from these drivers. Both PDS and POS contribute to the Service Error Exposure, and consequently both are part of Equation 5.3.

We use the measure Service Error Permeability to compose the Service Error Exposure for an OS service $s_i$, namely $E^i$:

$$E^i = \sum_{\forall x} \sum_{\forall j} POS^i_{x.j} + \sum_{\forall x} \sum_{\forall j} PDS^i_{x.j} \tag{5.3}$$

Service Error Exposure considers all drivers' influence on one OS service. Thus its use is mostly to compare OS services and rank them based on their exposure to propagating errors. It can therefore be used to guide verification efforts of applications or placement of error handling mechanisms on the application level. Note that this expression implies aggregating all imported and exported Service Error Permeabilities (5.1 & 5.2 above) and all considered drivers. When comparing services on a driver per driver basis the driver specific Service Error Exposure can be applied:

$$E^i_x = \sum_{\forall j} POS^i_{x.j} + \sum_{\forall j} PDS^i_{x.j} \tag{5.4}$$

The driver specific service exposure allows study of driver attributed differences in exposure to propagating errors. It also makes assessment of exposure independent of the selected set of drivers.
Service Error Diffusion

As Service Error Exposure considers a specific service at the OS-Application level, Service Error Diffusion ($SE^{x.y}$) focuses on specific services on the OS-Driver level. This allows us to pin-point services which are more likely to spread errors through the system. $SE^{x.y}$ for an imported service is defined for a driver $D_x$ and a specific $os_{x.y}$:

$$SE^{x.y} = \sum_{\forall i} POS^i_{x.y} \tag{5.5}$$

Service Error Diffusion for exported services can be calculated analogously to 5.5 using $PDS^i_{x.y}$.

Service Error Diffusion can be used to rank driver services on their ability to spread errors in the system. Values can be compared either globally (across all drivers) or locally, for a specific driver $D_x$.

¹ We will use the terms Service Error Exposure and Service Exposure interchangeably.
Driver Error Diffusion

Driver Error Diffusion is used to rank drivers on their ability to spread errors in the system. To do this, the Service Error Permeability values for one driver are aggregated. A higher value means that the driver may more easily spread errors in the system. For a driver $D_x$ and a set of services, the Driver Error Diffusion, $D^x$, is defined as:

$$D^x = \sum_{\forall i} \sum_{\forall j} POS^i_{x.j} + \sum_{\forall i} \sum_{\forall j} PDS^i_{x.j} \tag{5.6}$$

Analogous to Service Error Exposure, a higher Driver Error Diffusion value is an indication of where efforts need to be spent on verification or where error detection/recovery mechanisms should be placed in the system. Since Driver Error Diffusion focuses on the driver level, locations are identified on this level as well.

Once a ranking across drivers exists, the driver(s) with the highest Driver Error Diffusion should be the first targets. Details on specific error paths can now be used (i.e., Service Error Permeability) to guide the composition and placement of detection and recovery mechanisms.
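The three composite measures are all sums over the same set of Service Error Permeability values, taken along different indices. A minimal sketch, assuming the permeabilities are stored in a dictionary keyed by (driver, class, driver service, application service), is given below. The key layout, the function names and all numbers are assumptions made for the example.

```python
def service_error_exposure(perm, app_service, driver=None):
    """Exposure of one OS-Application service (Eq. 5.3), or its driver
    specific variant (Eq. 5.4) when a driver is given."""
    return sum(p for (x, kind, j, i), p in perm.items()
               if i == app_service and (driver is None or x == driver))


def service_error_diffusion(perm, driver, kind, service):
    """Diffusion of one OS-Driver service (Eq. 5.5): sum over all s_i."""
    return sum(p for (x, k, y, i), p in perm.items()
               if (x, k, y) == (driver, kind, service))


def driver_error_diffusion(perm, driver):
    """Diffusion of one driver (Eq. 5.6): sum over all its permeabilities."""
    return sum(p for (x, k, y, i), p in perm.items() if x == driver)


# Hypothetical permeability values; "os" marks imported, "ds" exported services.
perm = {
    ("D1", "os", 1, "s1"): 0.5,
    ("D1", "os", 1, "s2"): 0.25,
    ("D1", "ds", 1, "s1"): 0.25,
    ("D2", "os", 1, "s1"): 0.125,
}
exposure_s1 = service_error_exposure(perm, "s1")
exposure_s1_d1 = service_error_exposure(perm, "s1", driver="D1")
diffusion_os11 = service_error_diffusion(perm, "D1", "os", 1)
ranking = sorted(["D1", "D2"], key=lambda d: driver_error_diffusion(perm, d),
                 reverse=True)
```

With these made-up numbers D1 ranks above D2, so D1 would be the first target, and the permeabilities of its individual services would then indicate where to look.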
5.3.3 Use of Measures

The previous section presented six different measures. A natural question an evaluator might have is therefore which of these to use for a specific project. Three key uses for error propagation analysis were identified in Section 5.3. When the goal is to identify robustness bottlenecks, Service Error Exposure and Service Error Diffusion can be used to guide the search for specific services, and more information can then be gained by studying individual Service Error Permeability values for each considered service. Drivers with potential for spreading errors can be identified using Driver Error Diffusion. Information used for debugging (exposing flaws) can be gained by looking at the specific injection cases identified by Service Error Exposure and Service Error Diffusion. These also help in locating error detection and recovery mechanisms in the system by identifying prominent propagation paths.

Ultimately it is the level of detail required that guides the use of the proposed measures. As all presented measures are based on the same raw data, no significant overhead is attached to the calculation of each measure. For the experimental setup used all measures are predefined as SQL scripts, which are loaded and executed on the data stored in the database. All scripts execute for at most a few seconds on the workstation used. The use of a database means that the only operation required when new experiments have been conducted is to load them into the database, a task for which a dedicated application has been implemented, greatly simplifying the task.
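To give an impression of how a measure reduces to a short SQL aggregate over stored results, the sketch below uses an in-memory SQLite database through Python's sqlite3 module. The table layout, column names and values are assumptions made for this illustration and do not reproduce the scripts or the database used for the experiments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE permeability (
    driver TEXT, driver_service TEXT, app_service TEXT, value REAL)""")
conn.executemany("INSERT INTO permeability VALUES (?, ?, ?, ?)", [
    ("D1", "os_1.1", "s1", 0.5),
    ("D1", "os_1.1", "s2", 0.25),
    ("D1", "ds_1.1", "s1", 0.25),
    ("D2", "os_2.1", "s1", 0.125),
])

# Driver Error Diffusion: one aggregate per driver, highest first.
driver_ranking = conn.execute("""
    SELECT driver, SUM(value) AS diffusion
    FROM permeability
    GROUP BY driver
    ORDER BY diffusion DESC""").fetchall()

# Service Error Exposure: one aggregate per OS-Application service.
exposure = conn.execute("""
    SELECT app_service, SUM(value)
    FROM permeability
    GROUP BY app_service
    ORDER BY app_service""").fetchall()
conn.close()
```

Each measure is a different GROUP BY over the same table, which is consistent with the observation that all measures come from the same raw data at little extra cost.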
5.4 Discussion

This section provides a general discussion on some of the concepts presented in this chapter. A more detailed discussion regarding implementation details and interpretation of the measures is found in Chapter 6.

Failure Classes

When investigating error propagation in a system not all errors can be treated equally. Some errors lead to severe failures and some to mere annoyances. What constitutes the “severity” of a failure is system dependent and a subjective property. Different users may consider different failures as worst. For instance, Durães et al. [Durães and Madeira, 2003] define a set of generic failure modes, and depending on the user of the evaluation define severity scales as subsets of the generic failure modes accordingly. From a feedback point of view the worst failure is the loss of data without any warning, whereas from an availability point of view a completely unresponsive system is the worst.

For general purpose systems, such differentiating views become problematic. As an example, many users are frustrated when their desktop PC crashes due to the failure of some driver they did not know existed on their system. However, that the system “crashes” may be an explicit decision by the OS to avoid inconsistencies of data or even loss of data. When no knowledge of the cause or remedy for an error exists, the only sane thing to do might be to take the system down and hope that the error has disappeared when the system is restarted. Had the system been written for a dedicated purpose, correct diagnosis and recovery might have been possible, and the crash avoided. A crash can also be desirable if the OS is to implement “fail silent” behavior, i.e., when the system fails it does so by ceasing to respond and without any other side effects [Powell et al., 1988].
Since this thesis is concerned with robustness of OS's we have opted to use a generic severity scale for the failure modes defined. It is important to note that even though we do consider the scale as containing progressively more severe failures, this only reflects a generic severity ranking. Context input is needed to refine the severity scale used for a specific system, in order to define useful and comparable failure classes. Additionally, more fine-grained failure modes can be defined when knowledge regarding a specific system is available. For instance, applications running on the system might be of different criticality and failure modes reflecting this may be desirable.
Interpretation & Evaluation

The usefulness of analysis using failure classes is that resources can be guided to the more severe failure classes, thus using them more efficiently. This applies to both fault prevention and removal efforts, such as improvement of the engineering process or different kinds of verification efforts, as well as to fault tolerance approaches where error detection and recovery is enhanced by addition of software mechanisms.

A typical process is to start with the most severe failure class, and then progressively approach the less severe classes as time and money permit. This helps to ensure that efforts are spent where the pay-off is the greatest and may also be used as a stop criterion for robustness enhancement.

Another important practical aspect is the impact different failure classes have. Class 3 failures force the whole system to halt, i.e., one could argue that the error propagated to all services on the OS-Application level. In such cases, the services on the OS-Application layer do not impact the relative comparison of drivers, i.e., the Driver Error Diffusion. When comparing drivers using Driver Error Diffusion for Class 3 failures one can therefore simplify Equations 5.1 and 5.2 to only consider the probability that an error propagates at all (since we know it propagates to all services). The consideration of each OS-Application level service would only give a linear scaling of the Driver Error Diffusion, not affecting the relative order across drivers. Chapter 6 shows how such simplifications can be made for a real system.
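The linear scaling can be made explicit with a short derivation. Assume, as argued above, that a Class 3 failure reaches every one of the $|S|$ considered OS-Application services, so that the permeability of a driver service no longer depends on $i$. Writing $POS_{x.j}$ and $PDS_{x.j}$ for these service independent values (a notation introduced here only for illustration), the Driver Error Diffusion of Equation 5.6 becomes:

```latex
D^x = \sum_{\forall i} \sum_{\forall j} POS^i_{x.j} + \sum_{\forall i} \sum_{\forall j} PDS^i_{x.j}
    = |S| \left( \sum_{\forall j} POS_{x.j} + \sum_{\forall j} PDS_{x.j} \right)
```

Since the factor $|S|$ is the same for every driver, dropping it changes the values but not the order in which the drivers are ranked.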
Imports vs. Exports

In this chapter we make a distinction between the imported and the exported services of a driver. This distinction may not be useful in all contexts, and the services can then simply be “bundled” together to form one set of services. This would simplify Equations 5.3-5.6 by using only one term $PS^i_{x.y}$ defined as follows:

$$PS^i_{x.y} = \Pr(\text{error in } s_i \mid \text{error in use of } s_{x.y}) \tag{5.7}$$

where $s_{x.y}$ is a service from the combined set (O) of all imported and exported services for driver $D_x$.
Error Distribution & Operational Profile

An important aspect when interpreting the values for the measures presented is the lack of explicit dependence on error input distribution. In any practical setting such a distribution is very important. From a robustness point of view, the error distribution may be of less importance since the goal is not to estimate the reliability of the system. Equations 5.1 and 5.2 are conditioned on the presence of an error and can be combined with an error distribution when available.

Another important aspect is the implicit dependence on the operational profile of the system. The operational profile includes the usage profile of the applications running on the system which implicitly gives rise to a driver usage profile. Depending on how applications are used, different services provided by the OS are used and the usage profile of drivers differs. For a certain profile some services may not be used at all, whereas in others they are frequently used. This influences the values of the measures, since only services actually used are included.

The operational profile of a system may not be known at the time of the evaluation, or may change with time. This means that the robustness profile of the system may change as well. It is therefore important to try to use a profile closely matching the expected one when doing experimental estimations of the measures.

As a last point it is important to note that we do not try to test drivers per se, so these measures only tell us which drivers may corrupt the system by spreading errors. Also, we emphasize that the intent of these measures is not to obtain absolute values, but relative rankings.
5.5 Related Work
Error propagation analysis and failure mode analysis are two intertwined concepts. Since both are well established techniques there is a plethora of literature making use of them. This section reviews some of the more important research contributions within both areas.
Error propagation studies how the effects of errors percolate through the system and affect other components than the source component of the fault [Lee and Iyer, 1993]. Voas et al. present EPA (Extended Propagation Analysis) [Voas et al., 1996, 1997; Voas, 1997a], which identifies code locations which might violate the safety requirements of the system if faults occur in these locations. The authors introduce the term failure tolerance to mean that the system is tolerant to failures of 3rd party software. In [Voas et al., 1997] the authors further speculate that a similar technique would be most useful for an OS setting, since system software consists of a multitude of interacting components.
For dependable system designs, error propagation is a phenomenon that is to be avoided, since it allows the failure of one subsystem to affect the behavior of other subsystems and lead to a failure of the entire system. However, error propagation is a useful property in software testing, as it helps to reveal state corruption due to faults by propagating such faults to the interfaces of the system [Voas and Miller, 1994a, 1995; Voas et al., 1997]. Therefore, for components with high error propagation probability, testing is more likely to reveal faults in the code, if they are present. From this point of view, robustness evaluation identifies hot-spots in the system where errors can lead to severe consequences. Once these hot-spots have been identified and treated (either by ensuring that faults are not present or by adding error detection/recovery capabilities) the likelihood of the system propagating errors is lowered. In [Voas and Miller, 1994b] the propagation information is used to place such assertions at the most effective locations in the code, such that testing becomes effective.
Michael and Jones show that data state errors in software propagate uniformly, i.e., either all data state errors for a specific location propagate to the outputs of a program, or none of them do [Michael and Jones, 1997]. From a theoretical viewpoint this is surprising, but to a practitioner it might not be. When only a small subset of the value domain for a variable can be considered "correct", then most changes to this variable will be erroneous and may trigger propagation of faults. This is especially true for values which are assumed to be correct by the developer, and are therefore not checked for validity in the code.
Hiller developed an extensive propagation profiling framework for embedded control systems [Hiller et al., 2004; Hiller, 2002]. Based on a component model, error propagation metrics similar to the ones developed in this thesis are presented. Whereas the focus there was on data level errors in control software, we focus on a single (although complex) component of computer-based systems, the OS.
5.6 Summary of Research Contributions
This chapter introduced the concept of error propagation in OS's, using failure mode analysis. A series of OS level propagation measures were defined that help an evaluator find system bottlenecks and guide further efforts to components that are more likely to spread errors or more likely to be the sink for propagating errors. Related research projects in the area of failure mode analysis and error propagation, especially related to OS's, were reviewed. Table 5.3 summarizes the measures introduced in this chapter.
The following research contributions are presented in this chapter:
• Error propagation in the context of OS's is defined in a generic and system independent manner. The fundamental propagation measure Service Error Permeability is used to measure the propagation across services in the OS-Driver interface with services in the OS-Application layer.
• The Service Error Exposure measure is introduced to measure the influence propagating errors have on specific OS services. It can be used to compare services in a relative manner.
• Service Error Diffusion can be used to rank services in the OS-Driver interface on their ability to spread errors.
• The Driver Error Diffusion measure similarly allows relative comparison across drivers on their ability to spread errors.
Table 5.3: Summary of the error propagation measures introduced.
Symbol | Equation | Description
$PDS^{i}_{x.y}$ | 5.1 | The Service Error Permeability for driver services is the probability that an error in a driver service $ds_{x.y}$ will propagate to an OS-Application service $s_i$.
$POS^{i}_{x.y}$ | 5.2 | The Service Error Permeability for an OS-Driver service is the probability that an error in an OS service used by driver $D_x$ will propagate to an OS-Application service $s_i$.
$PS^{i}_{x.y}$ | 5.7 | The combined Service Error Permeability makes no distinction between imported and exported functions.
$E^{i}$ | 5.3 | The Service Error Exposure is used to compare OS-Application services on their susceptibility to propagating errors.
$E^{i}_{x}$ | 5.4 | The driver specific Service Error Exposure is used to compare OS-Application services on their susceptibility to propagating errors. It differs from $E^{i}$ in that it considers each driver individually.
$SE_{x.y}$ | 5.5 | Service Error Diffusion is used to compare OS-Driver services on their ability to spread errors in the system.
$D^{x}$ | 5.6 | Driver Error Diffusion is used to identify and compare drivers on their ability to spread errors in the system.
Chapter 6
Error Model Evaluation - What to Inject
Which error model should be used for OS robustness evaluation? What are the trade-offs to make across error models?
The choice of error model for robustness evaluation of OS's influences both the results and the time required to perform the evaluation. This chapter investigates the effectiveness of three error models: bit-flips, data-type errors and fuzzing errors. It builds on the measures introduced in Chapter 5 and uses these to evaluate the three error models. It helps answering RQ2 - quantifiable measures - and RQ4 - what to inject.
An extensive series of fault injection experiments shows that the bit-flip model allows for more detailed results, however, at a higher cost in terms of the number of injections required. Fuzzing is found to be cheap to implement but is less precise compared to bit-flips. A novel composite error model is presented where the low cost of fuzzing is combined with the higher level of details of bit-flips, resulting in high precision with moderate setup/execution costs. Furthermore, this chapter shows how the error propagation measures introduced in Chapter 5 can be used in the context of a real system.
6.1 Introduction
When performing a robustness evaluation of an OS several factors influence the choice of the error model used. In many cases the representativeness of the used error model compared to the errors found in a deployed system is of most importance. To be able to estimate the behavior of the OS in an operational setting, the injected errors need to match real errors as closely as possible. However, for COTS components, such as OS's, this is difficult, since a) the type and distribution of real errors may not be known, b) the operational setting might not yet be known, or c) the operational composition of the system may not be known.
This thesis focuses on the robustness aspect of the OS, taking a specific composition of the system into consideration. Robustness of the system is evaluated with a typical composition, to identify system vulnerabilities in the form of error propagation paths. This is for instance done when different platforms (OS and hardware) are evaluated on a prototype stage, or when platform robustness is evaluated as part of quality assurance of an entire system. In this type of setting, the evaluator typically lacks access to the source code of system components, or lacks the ability (or permissions) to modify it.
Given the context of robustness evaluation (errors are external to the OS) and lack of source code, we target the interface between device drivers and the OS. This interface is typically defined as a set of functions that are called to perform services. The target for injection is the parameters to such functions that carry erroneous information from a driver to the OS.
A key question becomes which error model to choose for the evaluation, i.e., how erroneous states of the system are modeled by injecting errors. This chapter evaluates three contemporary error models based on their effectiveness in finding system vulnerabilities, their ease of implementation and the time required for performing the experimentation. A detailed discussion on each of the criteria investigated is provided.
The error models are evaluated using the three drivers previously introduced: cerfioserial (serial port), 91C111 (Ethernet card) and atadisk (CompactFlash).
6.2 Error Models Considered
The considered error models for this study were introduced and described in Section 3.2. This section first briefly discusses the three models. Table 6.1 shows an overview of the three models, showing the number of services in the OS-Driver interface and the total number of injections performed for each error model. The number of used services differs across the drivers, with the serial driver using the most. One can also see that the BF model, as expected, incurs the highest number of injections and the DT model the fewest.
Table 6.1: Overview of the target drivers.
Driver | #Services | #Injection cases BF | #Injection cases DT | #Injection cases FZ
cerfioserial | 60 | 2653 | 397 | 1395
91C111 | 54 | 1850 | 283 | 990
atadisk | 47 | 1486 | 267 | 899
6.2.1 Data Type Error Model
The data type (DT) error model modifies the value of a parameter based on its data type, and is presented in detail in Section 3.2.1. It has been shown in previous studies that this type of injection is very scalable in terms of the number of data types used in API's, such as POSIX [DeVale et al., 1999]. The total number of data types targeted for the experiments reported here was 22. Given that the average number of services targeted across all three drivers was 54, with typically more than one parameter, this is a fairly low number.
6.2.2 Bit-Flip Error Model
For the bit-flip (BF) error model each targeted parameter is considered as a 32 bit value. 32 injection cases are defined, flipping the bits from bit 0 (least significant bit) to bit 31 (most significant bit) one after the other.
The flipping is achieved by casting the value to an integer and then using the xor-function. This approach is also detailed for instance in [Voas and Charron, 1996]. The new value is then used in the call to the real function.
The BF model does not necessarily need to be adapted to the data type used. However, it is beneficial to do so for some specific reasons:
• Reducing the number of bits used for data types using fewer bits reduces the total number of injections required. For instance, the data type char uses only 8 bits, whereas the type int uses 32. Since injecting in the remaining 24 bits of a char does not reflect a software error for this type, the number of bits targeted can be restricted to 8.
• Many of the parameters used in the interface are pointers to a value of some other data type (or void). By tracking such relations the pointer target can be used for injection, more closely simulating software errors than targeting the reference pointer values alone.
• A common feature in the driver interface is to use pointers to structures (struct's). Without details on their members, BF errors cannot directly target them. Data type tracking facilitates this.
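The bit-flip generation described above can be sketched in a few lines. This is only an illustrative Python sketch, not the injector used in the thesis (which intercepts C calls on the target): the function name `bit_flip_cases` and the `width` parameter are hypothetical. It treats the parameter as a fixed-width integer and XORs it with a one-bit mask, once per bit position.

```python
def bit_flip_cases(value, width=32):
    """Return the `width` single-bit-flip variants of `value`, bit 0 first.

    Hypothetical helper: the parameter is treated as a `width`-bit integer
    and each case XORs it with a mask that has exactly one bit set.
    """
    mask = (1 << width) - 1
    original = value & mask  # "cast" the parameter to a width-bit integer
    return [original ^ (1 << bit) for bit in range(width)]


# A parameter handled as a 32-bit value yields 32 injection cases.
int_cases = bit_flip_cases(0x0000000F)

# Adapting to the data type: a char-sized parameter needs only 8 cases.
char_cases = bit_flip_cases(ord("A"), width=8)

print(len(int_cases), len(char_cases))
```

Restricting `width` to the size of the data type is what reduces the number of injections in the first bullet above; pointer and struct tracking would additionally require knowledge of the pointed-to type, which this sketch does not model.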
6.2.3 Fuzzing Error Model
The fuzzing (FZ) error model uses a pseudo-random generator to generate random values to inject. The targeted service invocation is intercepted and a new random value is chosen to replace the existing value. We use the standard C-runtime function rand() to generate the random values. Each target computer (board) stores the generated random value in persistent storage and uses this value as seed to the random generator for the next injection. This way we avoid generating the same value for each injection (which would have been the case had the same seed been used).
For each service targeted fifteen injections with different random values are performed. This number was selected to give a reasonable execution time of the experiments and yet produce useful results. Section 6.5.2 further discusses the number of injections for the FZ model.
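The seed chaining described above can be illustrated with a short sketch. It is written in Python with the standard `random` module rather than the C runtime's rand() used in the thesis, and the names `fuzz_values`, `initial_seed` and `width` are assumptions made for the example. The point it shows is that reseeding each injection with the previously generated value produces a reproducible sequence that still changes across reboots.

```python
import random


def fuzz_values(initial_seed, count=15, width=32):
    """Generate `count` fuzz values, reseeding with the previous value.

    Hypothetical sketch: the variable `seed` stands in for the value the
    target board keeps in persistent storage between injections (each
    injection may end in a reboot, so in-memory generator state is lost).
    """
    seed = initial_seed
    values = []
    for _ in range(count):
        rng = random.Random(seed)       # fresh generator, as after a reboot
        value = rng.getrandbits(width)  # value injected into the parameter
        values.append(value)
        seed = value                    # persist it as the next seed
    return values


values = fuzz_values(initial_seed=1234)
print(len(values))
```

Because the sequence depends only on the initial seed, an experiment series can be replayed exactly, which is useful when a failure needs to be reproduced.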
6.3 Error Propagation
This section details our experimental estimation of the error propagation measures defined in Chapter 5. It is demonstrated how the analytical expressions in the previous chapter can be adapted to assessment with fault injection. A series of simplifications are presented and results and interpretations from large scale fault injection experiments are presented. To shorten the discussion, focus in this section will be put solely on the bit-flip model, with Section 6.4 devoted to the comparison across the three models.
6.3.1 Failure Class Distribution
Table 6.2 shows the failure class distribution for the three drivers using the BF model. The atadisk driver has the highest ratio of Class 3 failures, even though for all three drivers the ratio stays below 4.5% of the injected errors. The cerfioserial driver has considerably more Class 2 failures than the other two drivers. This is due to a number of injections for this driver leading to hangs in OS services used by applications, i.e., services hang unexpectedly. The serial driver, being inherently of blocking nature, is the only driver to show such behavior to a significant extent. The other two drivers have higher Class 1 ratios instead and all three drivers have roughly the same amount of Class NF failures, above 70%. The high number of Class NF failures suggests that there is potentially room for reducing the number of injections further, beyond what is already done through the pre-profiling stage.
Table 6.2: The failure class distribution for the BF model.
Driver | NF | [%] | C1 | [%] | C2 | [%] | C3 | [%]
cerfioserial | 2060 | 77.65 | 38 | 1.43 | 481 | 18.13 | 74 | 2.79
91C111 | 1320 | 71.35 | 416 | 22.49 | 50 | 2.70 | 64 | 3.46
atadisk | 1117 | 75.17 | 300 | 20.19 | 3 | 0.20 | 66 | 4.44
6.3.2 Estimating Service Error Permeability
Service Error Permeability is the conditional probability that an error appearing in an OS-Driver interface service will propagate to a service in the OS-Application interface, given that one appears (see Equations 5.1 and 5.2). A distinction is made between errors appearing in services provided by drivers (exports) and those provided by the OS itself (imports). A simplification is also made in Equation 5.7 where no such distinction is made.
Service Error Permeability is estimated by the use of fault injection as the ratio between the number of injections performed resulting in a failure and the total number of injections for a given service. Service Error Permeability is calculated the same way for both imported and exported services. We denote, for a driver $D_x$, the number of injected errors in a service [1] $os_{x.y}$ with $N_{x.y}$ and the number of failures for service $s_i$ with $n_i$. The estimated Service Error Permeability is then calculated as follows:
$$SP^{i}_{x.y} = \frac{n_i}{N_{x.y}} \tag{6.1}$$
[1] The calculation for exported services ($ds_{x.y}$) is analogous to that for imported.
Typically one studies each failure class in isolation. In this case $n_i$ is the number of injections resulting in failure of the specific class under study. $SP^{i}_{x.y}$ is used as an estimate of both $PDS^{i}_{x.y}$ and $POS^{i}_{x.y}$ and as such corresponds to Equation 5.7.
Service Error Permeability can be used to study the relation between tuples of OS-Driver services and OS-Application services. The number of services in the OS-Driver interface can be seen in Table 6.1. For each of the services, Service Error Permeability is defined in relation to each service studied at the OS-Application layer. Since Service Error Permeability is a probability, the values will be in the range [0...1]. It is important to note that a value of 0.0 must not be interpreted as a proof that no errors will propagate along this path. It is only an indication that the likelihood is low, given the error model used. Similarly, a value of 1.0 only indicates that errors are likely to propagate, but is again dependent on the error model used.
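The ratio in Equation 6.1 can be computed directly from injection logs. The following Python sketch assumes a hypothetical log format, one entry per injection listing the OS-Application services that reported a failure of the class under study; the function name and the example data are illustrative and not taken from the experiments.

```python
from collections import Counter


def estimate_permeability(injections):
    """Estimate Service Error Permeability for one OS-Driver service.

    `injections` holds one entry per injected error (N entries in total);
    each entry is the collection of OS-Application services that failed
    with the failure class under study. The estimate for an application
    service s_i is n_i / N, as in Equation 6.1.
    """
    total = len(injections)
    if total == 0:
        raise ValueError("no injections recorded for this service")
    failures = Counter()
    for failed_services in injections:
        failures.update(set(failed_services))  # count each service once per injection
    return {service: n / total for service, n in failures.items()}


# Hypothetical log: 32 bit-flip injections into one parameter, one of which
# makes both WriteFile and ReadFile report a Class 1 failure.
log = [[] for _ in range(31)] + [["WriteFile", "ReadFile"]]
permeability = estimate_permeability(log)
print(permeability)
```

Services that never fail are simply absent from the result, which corresponds to an estimated permeability of 0.0 and, as noted above, is not a proof that the path cannot propagate errors.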
As previously mentioned, propagation results are best interpreted by studying the individual failure classes separately. For Service Error Permeability Class 3 failures are not relevant, since a Class 3 failure will have the same effect on all application level services: it renders the entire system unresponsive, either through a hang or a crash. Similarly, error propagation is hard to track for most Class 2 failures, since the effect needs to be pinpointed to the specific service being the victim of the failure. This would require tracking not only negative reports for each service, i.e., when errors propagate, but also positive reports, i.e., each service call needs to be logged. This would put a tremendous pressure on the tracking mechanism to safely store or forward information on each call. Consequently we have not considered Service Error Permeability for Class 2 failures, and it thus remains as a future extension to our work.
Our investigation using Service Error Permeability focuses on Class 1 failures. Many such propagation paths exist (although with many having an estimated propagation permeability of 0.0). Therefore, Tables 6.3, 6.4 and 6.5 for brevity present only the prominent error propagation paths identified, for each of the three drivers using the BF model.
Table 6.3 shows all propagation paths for cerfioserial. "String compare" is not a specific OS service per se, but an added consistency check for the received echo strings sent by the test application. Similarly, "Serial Echo Error" is the echo server check performed on the host side. Driver services with the prefix COM are exported driver services.
Several observations can be made regarding the propagation paths reported in Table 6.3. First, the table shows that both imported and exported functions can lead to Class 1 failures. Second, some propagation paths are distinctly more prominent than others, having Service Error Permeability values of up to 1.0, i.e., each injected error leads to a Class 1 failure reported by the application testing the serial port functionality. Third, a "clustering" effect can be seen, where many OS-Driver services have multiple propagation paths with the same Service Error Permeability value. This indicates that injecting a transient error in these services will corrupt the "state" of the system and cause subsequent service invocations to fail as well. This is no big surprise considering the type of services involved. For instance, COMOpen, when failing to open the serial port, will cause subsequent service requests by the application to fail as well.
Table 6.3: Class 1 error propagation paths for cerfioserial, based on Service Error Permeability (SEP), for the BF model.
# | OS-Driver | OS-Application | SEP
1 | InterruptDisable | CreateFile | 1.000
2 | InterruptDisable | GetCommState | 1.000
3 | InterruptDisable | WriteFile | 1.000
4 | InterruptDisable | SetCommTimeouts | 1.000
5 | InterruptDisable | ReadFile | 1.000
6 | InterruptDisable | GetCommTimeouts | 1.000
7 | InterruptDisable | String compare | 1.000
8 | InterruptDisable | CloseHandle | 1.000
9 | memcpy | SetCommState | 0.042
10 | COMOpen | WriteFile | 0.031
11 | COMOpen | SetCommState | 0.031
12 | COMOpen | ReadFile | 0.031
13 | COMOpen | SetCommTimeouts | 0.031
14 | COMOpen | String compare | 0.031
15 | COMOpen | GetCommState | 0.031
16 | COMRead | String compare | 0.016
17 | EventModify | String compare | 0.016
18 | EventModify | Serial Echo Error | 0.016
19 | COMIOControl | String compare | 0.010
20 | COMIOControl | GetCommState | 0.010
21 | COMIOControl | Serial Echo Error | 0.010
Table 6.4 shows the top thirty Class 1 error propagation paths for the Ethernet driver. One service has a Service Error Permeability value of 1.0, with several others having high permeability values. The Ndis prefix indicates that these services are provided by the Ndis library, a system library provided to support and simplify network card drivers. Similarly, the OS-Application services are also network related, as expected since the test application most affected is using networking services heavily. The most permeable path is surprisingly enough for NKDbgPrintfW, a function which prints debug information used by developers. This suggests that even functions that are not "expected" by developers to cause propagating errors must be used with care.
The clustering of services is again shown clearly, i.e., several application level services show the same Service Error Permeability values. This indicates that they fail together; when one service fails, several other services fail too. 91C111 had in total 71 propagation paths with a Service Error Permeability value above 0.0.
Table 6.4: Class 1 error propagation paths for 91C111, based on Service Error Permeability (SEP), for the BF model.
# | OS-Driver | OS-Application | SEP
1 | NKDbgPrintfW | WSACleanup | 1.0000
2 | NKDbgPrintfW | connect | 1.0000
3 | NKDbgPrintfW | shutdown | 1.0000
4 | NKDbgPrintfW | closesocket | 1.0000
5 | NdisOpenConfiguration | WSACleanup | 0.9375
6 | NdisOpenConfiguration | connect | 0.9375
7 | NdisOpenConfiguration | shutdown | 0.9375
8 | NdisOpenConfiguration | closesocket | 0.9375
9 | NdisCloseConfiguration | WSACleanup | 0.8750
10 | NdisCloseConfiguration | connect | 0.8750
11 | NdisCloseConfiguration | shutdown | 0.8750
12 | NdisCloseConfiguration | closesocket | 0.8750
13 | NdisInitializeWrapper | connect | 0.5625
14 | NdisInitializeWrapper | shutdown | 0.5625
15 | NdisInitializeWrapper | WSACleanup | 0.5625
16 | NdisInitializeWrapper | closesocket | 0.5625
17 | NdisMRegisterMiniport | closesocket | 0.4844
18 | NdisMRegisterMiniport | shutdown | 0.4844
19 | NdisMRegisterMiniport | WSACleanup | 0.4844
20 | NdisMRegisterMiniport | connect | 0.4844
21 | NdisMRegisterAdapterShutdownHandler | WSACleanup | 0.4688
22 | NdisMRegisterAdapterShutdownHandler | connect | 0.4688
23 | NdisMRegisterAdapterShutdownHandler | closesocket | 0.4688
24 | NdisMRegisterAdapterShutdownHandler | shutdown | 0.4688
25 | QueryPerformanceCounter | closesocket | 0.4688
26 | QueryPerformanceCounter | shutdown | 0.4688
27 | QueryPerformanceCounter | connect | 0.4688
28 | QueryPerformanceCounter | WSACleanup | 0.4688
29 | NdisReadConfiguration | shutdown | 0.4393
30 | NdisReadConfiguration | closesocket | 0.4393
Table 6.5: Selection of Class 1 error propagation paths for atadisk, based on Service Error Permeability (SEP), for the BF model.
# | OS-Driver | OS-Application | SEP
1 | DSKInit | GetFileTime | 1.0000
2 | DSKInit | CloseHandle | 1.0000
3 | DSKInit | GetFileInformationByHandle | 1.0000
4 | READPORTUSHORT | CloseHandle | 1.0000
5 | READPORTUSHORT | CreateFile | 1.0000
6 | READPORTUSHORT | GetFileInformationByHandle | 1.0000
7 | DetectATADisk | CreateFile | 1.0000
8 | DetectATADisk | WriteFile | 1.0000
9 | DetectATADisk | CloseHandle | 1.0000
10 | DetectATADisk | SetEndOfFile | 1.0000
Table 6.5 for the CompactFlash driver shows a similar trend to the previous two tables. For this driver, two exported services show up, DSKInit and DetectATADisk. The application level services in the list are related to file operations, as expected for this driver. In total atadisk has 176 registered propagation paths.
It is important to note that in Tables 6.3, 6.4 and 6.5 a higher value for Service Error Permeability indicates higher likelihood of propagating errors resulting in Class 1 failures. A lower value may therefore be an indication of proneness to higher severity failures, or of a higher degree of fault tolerance. Service Error Permeability must therefore be used in conjunction with other measures, such as Service Error Exposure and Driver Error Diffusion.
6.3.3 Service Error Exposure
Service Error Exposure considers all driver-level services' contribution to the failure seen for a specific OS-Application service. Therefore, analogously to Service Error Permeability, we only consider Class 1 failures for Service Error Exposure as well.
The number of services used by each test application was purposely kept low, as a consequence of keeping the test applications small and simple. Tables 6.6, 6.7 and 6.8 show the driver specific Service Error Exposure calculated for each service for injections in cerfioserial, 91C111 and atadisk, respectively.
The driver specific Service Error Exposure is calculated using Equation 5.4, which is based on the Service Error Permeability values (partially) presented in the previous section. Since it is a sum of probabilities, its values are not limited to the range of a probability, and no specific interpretation can be made on a single value. Their use lies in the relative comparison across many services. A higher value indicates that a service is more exposed to propagating errors from the drivers considered.
Table 6.6: Service Error Exposure values for cerfioserial, using the BF model.
# | Service | Service Error Exposure
1 | String compare | 1.0730
2 | GetCommState | 1.0417
3 | SetCommTimeouts | 1.0313
4 | WriteFile | 1.0313
5 | ReadFile | 1.0313
6 | CreateFile | 1.0000
7 | CloseHandle | 1.0000
8 | GetCommTimeouts | 1.0000
9 | SetCommState | 0.0729
10 | SerialEchoError | 0.0260
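As a worked illustration of how the exposure values relate to the permeability values, the sketch below sums the Class 1 paths reported for cerfioserial in Table 6.3 per OS-Application service. It assumes that the driver specific exposure (Equation 5.4, not reproduced in this chapter) is the plain sum of the permeabilities of all paths ending in a given service, as the text states; the function name is hypothetical, and the inputs are the three-decimal values from the table, so the results agree with Table 6.6 only up to rounding.

```python
from collections import defaultdict

# (OS-Driver service, OS-Application service, SEP) as listed in Table 6.3.
PATHS = (
    [("InterruptDisable", app, 1.000) for app in (
        "CreateFile", "GetCommState", "WriteFile", "SetCommTimeouts",
        "ReadFile", "GetCommTimeouts", "String compare", "CloseHandle")]
    + [("memcpy", "SetCommState", 0.042)]
    + [("COMOpen", app, 0.031) for app in (
        "WriteFile", "SetCommState", "ReadFile", "SetCommTimeouts",
        "String compare", "GetCommState")]
    + [("COMRead", "String compare", 0.016),
       ("EventModify", "String compare", 0.016),
       ("EventModify", "Serial Echo Error", 0.016),
       ("COMIOControl", "String compare", 0.010),
       ("COMIOControl", "GetCommState", 0.010),
       ("COMIOControl", "Serial Echo Error", 0.010)]
)


def service_error_exposure(paths):
    """Sum permeability values per OS-Application service (assumed Eq. 5.4)."""
    exposure = defaultdict(float)
    for _driver_service, app_service, sep in paths:
        exposure[app_service] += sep
    return dict(exposure)


exposure = service_error_exposure(PATHS)
for service, value in sorted(exposure.items(), key=lambda kv: -kv[1]):
    print(f"{service:20s} {value:.3f}")
```

The resulting ranking places "String compare" first and "Serial Echo Error" last, matching the order in Table 6.6, which is the kind of relative comparison the measure is intended for.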
All three tables show the same clustering effect observed for failures, indicated by observing the same Service Error Exposure value. That this is the case is not surprising. Considering for instance CreateFile and CloseHandle in Table 6.8 it is understandable that if CreateFile returns with error, then its handle will be invalid. A subsequent try to close it will also return an error, and consequently the error propagates to both services.
Another useful piece of information can be seen in Table 6.6, which shows "String compare" to have the highest Service Error Exposure value. Since this is a check performed on the received data and no returned error code for an OS service, this indicates that in some cases erroneous data can be received without any other service indicating an error. This corresponds to a silent error, suggesting that data integrity checks might be needed for critical data. Similarly, "Serial echo error" signals that the data sent to the Host Computer in some cases is corrupted, suggesting that similar data integrity checks may be needed at the receiving side as well.
In Table 6.7 services 5-15 are from the test application for atadisk, indicating that in some cases injecting errors in the interface for 91C111 can cause Class 1 failures also for applications not using the faulty driver.
Table 6.7: Service Error Exposure values for 91C111 using the BF model.
# | Service | Service Error Exposure
1 | connect | 6.5020
2 | shutdown | 6.5020
3 | WSACleanup | 6.5020
4 | closesocket | 6.4083
5 | CreateFile | 0.1250
6 | CloseHandle | 0.1250
7 | ReadFile | 0.0938
8 | sizeof | 0.0625
9 | GetFileSize | 0.0625
10 | GetFileTime | 0.0313
11 | GetFileInformationByHandle | 0.0313
12 | SetEndOfFile | 0.0313
13 | WriteFile | 0.0313
14 | SetFilePointer | 0.0313
15 | DeleteFile | 0.0313
16 | getaddrinfo | 0.0208
Table 6.8: Service Error Exposure values for atadisk using the BF model.
# | Service | Service Error Exposure
1 | CloseHandle | 30.1790
2 | CreateFile | 30.1790
3 | ReadFile | 22.6436
4 | sizeof | 15.0957
5 | GetFileSize | 15.0832
6 | WriteFile | 7.54787
7 | SetFilePointer | 7.54787
8 | DeleteFile | 7.54787
9 | SetEndOfFile | 7.54787
10 | GetFileTime | 7.53537
11 | GetFileInformationByHandle | 7.53537
6.3.4 Service Error Diffusion
When considering which OS-Driver services are more likely to spread errors in the system one can use the Service Error Diffusion measure, which considers one service's proneness to spread errors. Since we are considering driver services specifically we do not have the failure class restrictions that apply to application level measures. We will therefore concentrate on the most severe failure class, Class 3 failures.
Service Error Diffusion is defined in Equation 5.5 as a sum over all application level services. Since all services are affected uniformly, this translates into a simple scaling of the effects with the number of services used. A simplified expression can therefore be applied, where the application level services are not accounted for individually. This simplified version is shown in Equation 6.2, and is defined for driver $D_x$ and service $s_{x.y}$ (either $os_{x.y}$ or $ds_{x.y}$):
$$SE_{x.y} = \frac{n_{x.y}}{N_{x.y}} \tag{6.2}$$
where $n_{x.y}$ is the number of Class 3 failures and $N_{x.y}$ the number of injections performed for service $s_{x.y}$, as above.
Table 6.9: Class 3 Service Error Diffusion values for cerfioserial using the BF model.
# | Service | Service Error Diffusion
1 | memset | 0.3125
2 | memcpy | 0.2083
3 | MmMapIoSpace | 0.1528
4 | LoadLibraryW | 0.0909
5 | FreeLibrary | 0.0625
6 | DisableThreadLibraryCalls | 0.0625
7 | LocalAlloc | 0.0313
8 | SetProcPermissions | 0.0313
9 | HalTranslateBusAddress | 0.0236
10 | CreateThread | 0.0234
Tables 6.9, 6.10 and 6.11 present the non-zero valued services for the three drivers. One service stands out among the data, wcslen, for atadisk, which has a Service Error Diffusion value of 1.0, which means that all injected errors for this service resulted in a Class 3 failure. This makes this service a top candidate for further robustness enhancement. Comparing the number of services in these tables with the number of services used in the OS-Driver interface for the three drivers (Table 6.1) one can observe that a small number of services give rise to all Class 3 failures, for all three drivers.
Furthermore, it can be seen that some services cause severe failures for all three drivers, such as memset and memcpy. These are low-level system functions present in many drivers and the Service Error Diffusion is comparable across all three drivers, indicating that for these drivers propagation is independent from the driver itself. If generic error detection and recovery mechanisms could be defined for these two services, 84 Class 3 failures could be removed across all three drivers for the BF model. This corresponds to 41% of the Class 3 failures reported for the experiments.
Table 6.10: Class 3 Service Error Diffusion values for 91C111 using the BF model.
# | Service | Service Error Diffusion
1 | memset | 0.2708
2 | NdisAllocateMemory | 0.1250
3 | DisableThreadLibraryCalls | 0.1250
4 | QueryPerformanceCounter | 0.0938
5 | LoadLibraryW | 0.0909
6 | NdisMSynchronizeWithInterrupt | 0.0781
7 | FreeLibrary | 0.0625
8 | VirtualCopy | 0.0313
9 | NdisMSetAttributesEx | 0.0188
10 | RegOpenKeyExW | 0.0093
11 | NdisInitializeWrapper | 0.0078
12 | NdisMRegisterInterrupt | 0.0077
13 | KernelIoControl | 0.0063
Table 6.11: Class 3 Service Error Diffusion values for atadisk using the BF model.
# | Service | Service Error Diffusion
1 | wcslen | 1.0000
2 | wcscpy | 0.2727
3 | memcpy | 0.2708
4 | memset | 0.1875
5 | DisableThreadLibraryCalls | 0.0625
6 | LocalAlloc | 0.0313
7 | MapPtrToProcess | 0.0313
6.3.5 Driver Error Diffusion
While Service Error Diffusion is used to identify individual services that are more likely to spread errors present at the OS-Driver interface, this may become hard to overview and services may be spread across multiple drivers, making any specific driver level improvement efforts costly. In this case one may want to focus on the driver that is more prone to spreading errors, rather than individual services. To this end we use Driver Error Diffusion.
Driver Error Diffusion can, similarly to Service Error Diffusion, be simplified for Class 3 failures. Equation 6.3 presents the estimated Driver Error Diffusion not considering application level services. Driver Error Diffusion thus transforms to a sum of Service Error Diffusion values as follows:
$$D^{x} = \sum_{\forall y} SE_{x.y} = \sum_{\forall y} \frac{n_{x.y}}{N_{x.y}} \tag{6.3}$$
Table 6.12 shows the Driver Error Diffusion values for all three drivers using the BF model. The values presented are calculated with the simplified expression in Equation 6.3.
Table 6.12: Driver Error Diffusion for all three drivers considering Class 3 failures.
Driver | Driver Error Diffusion
cerfioserial | 1.00
91C111 | 0.93
atadisk | 1.86
From Table 6.12 one can see that atadisk is clearly more prone to diffusing errors in the system. 91C111 and cerfioserial are very close, with cerfioserial having a slightly higher value. Considering these values an evaluator might consider devoting extra resources to ensuring that atadisk does not contain errors.
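The simplified Driver Error Diffusion is just the sum of a driver's non-zero Class 3 Service Error Diffusion values. The Python sketch below makes this concrete using the rounded values reported in Tables 6.9, 6.10 and 6.11; the variable and function names are chosen for the example only, and the per-service values are listed without their service names since only the sum matters here.

```python
# Class 3 Service Error Diffusion values per driver (BF model),
# taken from Tables 6.9, 6.10 and 6.11.
SERVICE_ERROR_DIFFUSION = {
    "cerfioserial": [0.3125, 0.2083, 0.1528, 0.0909, 0.0625,
                     0.0625, 0.0313, 0.0313, 0.0236, 0.0234],
    "91C111": [0.2708, 0.1250, 0.1250, 0.0938, 0.0909, 0.0781, 0.0625,
               0.0313, 0.0188, 0.0093, 0.0078, 0.0077, 0.0063],
    "atadisk": [1.0000, 0.2727, 0.2708, 0.1875, 0.0625, 0.0313, 0.0313],
}


def driver_error_diffusion(service_values):
    """Sum of Service Error Diffusion values for one driver (Equation 6.3)."""
    return sum(service_values)


diffusion = {driver: driver_error_diffusion(values)
             for driver, values in SERVICE_ERROR_DIFFUSION.items()}

# Rank drivers from most to least prone to spreading errors.
ranking = sorted(diffusion, key=diffusion.get, reverse=True)
for driver in ranking:
    print(f"{driver:14s} {diffusion[driver]:.2f}")
```

Rounded to two decimals the sums reproduce the values in Table 6.12, and the ranking puts atadisk first, followed by cerfioserial and 91C111.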
6.4 Comparing Error Models
There are many criteria one could have for selecting the error model to use. Depending on the goal of the evaluation, criteria such as execution time or number of found failures may not be equally important. Therefore, we compare the three models on a wide range of criteria. First, a discussion on the uses and implications of each evaluation criterion is presented, followed by a presentation and interpretation of the results. Table 6.13 shows an overview of the results for the three models.
The following criteria are considered when comparing the error models:
• Number of failures found: The absolute number of failures is important as a higher number may give more insight into how the system can fail and consequently give better feedback to developers of the system to improve it.
• Number of injections and execution time: The number of injections influences the time required to perform the evaluation. The relationship is not linear, since the execution time also depends on the outcome, but more injections generally mean longer execution time.
• Injection efficiency: The efficiency of the injections is measured as the number of failures per injection. This measure helps in making a trade-off between the two previous criteria.
• Coverage: The term coverage is here used to compare different models' ability to pinpoint certain services as potentially vulnerable. Since no information on the real vulnerabilities exists, the comparison is based on a best-effort strategy, where the relative coverage across models is compared.
• Implementation complexity: The complexity of the implementation is a measure of the effort required to implement the error model. Since no lab experiments with real developers have been conducted, the comparison remains subjective. However, there are clear and distinct differences in the implementation effort needed for the studied models.
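The injection efficiency criterion can be made concrete with the counts reported in Tables 6.1 and 6.13. The Python sketch below assumes that "failures" means any Class 1, 2 or 3 outcome (the criterion above does not say whether all classes are counted together, so the `classes` argument lets this be restricted); the data structure and function name are hypothetical.

```python
# Failure counts per (driver, error model) from Table 6.13 and the
# corresponding number of injections from Table 6.1.
RESULTS = {
    ("cerfioserial", "BF"): {"C1": 38, "C2": 481, "C3": 74, "injections": 2653},
    ("cerfioserial", "DT"): {"C1": 65, "C2": 53, "C3": 15, "injections": 397},
    ("cerfioserial", "FZ"): {"C1": 20, "C2": 401, "C3": 66, "injections": 1395},
    ("91C111", "BF"): {"C1": 416, "C2": 50, "C3": 64, "injections": 1850},
    ("91C111", "DT"): {"C1": 66, "C2": 1, "C3": 13, "injections": 283},
    ("91C111", "FZ"): {"C1": 389, "C2": 0, "C3": 27, "injections": 990},
    ("atadisk", "BF"): {"C1": 300, "C2": 3, "C3": 66, "injections": 1486},
    ("atadisk", "DT"): {"C1": 90, "C2": 1, "C3": 4, "injections": 267},
    ("atadisk", "FZ"): {"C1": 310, "C2": 4, "C3": 6, "injections": 899},
}


def injection_efficiency(entry, classes=("C1", "C2", "C3")):
    """Failures per injection, counting only the given failure classes."""
    return sum(entry[c] for c in classes) / entry["injections"]


efficiency = {key: injection_efficiency(entry) for key, entry in RESULTS.items()}
for (driver, model), value in sorted(efficiency.items()):
    print(f"{driver:14s} {model}  {value:.3f}")

# Efficiency restricted to the most severe class only.
class3_only = injection_efficiency(RESULTS[("atadisk", "BF")], classes=("C3",))
```

Restricting `classes` to Class 3 gives the percentages listed in the Class 3 column of Table 6.13, so the same helper can be used for a per-class comparison of the models.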
Table 6.13: The number of experiments is shown for each driver, error model and failure class.
Driver | Error Model | No Failure | % | Class 1 | % | Class 2 | % | Class 3 | %
cerfioserial | BF | 2060 | 77.65% | 38 | 1.43% | 481 | 18.13% | 74 | 2.79%
cerfioserial | DT | 264 | 66.50% | 65 | 16.37% | 53 | 13.35% | 15 | 3.78%
cerfioserial | FZ | 908 | 65.09% | 20 | 1.43% | 401 | 28.74% | 66 | 4.73%
91C111 | BF | 1320 | 71.35% | 416 | 22.49% | 50 | 2.70% | 64 | 3.46%
91C111 | DT | 203 | 71.73% | 66 | 23.32% | 1 | 0.35% | 13 | 4.59%
91C111 | FZ | 574 | 57.98% | 389 | 39.29% | 0 | 0.00% | 27 | 2.73%
atadisk | BF | 1117 | 75.17% | 300 | 20.19% | 3 | 0.20% | 66 | 4.44%
atadisk | DT | 172 | 64.42% | 90 | 33.71% | 1 | 0.37% | 4 | 1.50%
atadisk | FZ | 565 | 62.85% | 310 | 34.48% | 4 | 0.44% | 6 | 0.67%
6.4.1 Number of Failures
Theabsolutenumberoffailuresthatanerrormodeltriggersisimportant
fromafeedbackperspective.Themorecasesoftriggeringvulnerabilities
shown,theeasieritwillbetoidentifythevulnerabilityandpossiblyremove
it.Table6.13showsthenumberoffailuresfoundforeachofthefourfailure
classes,errormodelanddriver.Fromthetableitcanclearlybeseenthatthe
BFmodel,havingthemostinjections,alsoincurthemostClass3failures.
ThenumberoffailuresfortheBFmodeliscomparableacrossallthree
drivers.Fortheothertwomodelstherearedifferencesinthenumberof
Class3failures,indicatingthattherearedifferencesbetweenthedriversin
theirabilitytospreaderrors.
Table 6.13 further substantiates the fact that cerfio_serial is more prone
to Class 2 failures than the other two drivers. That this behavior is driver
related, and not dependent on the error model, is further supported by the
fact that the percentage of injections leading to Class 2 failure is
distinctly higher for all error models for cerfio_serial, compared to the
other two drivers.

However, it is important to note that since the approach is experimental,
all results are dependent on the specific setup used. In this case all test
applications are written in a straightforward manner, without any explicit
fault-tolerance mechanisms. Such mechanisms will most probably change the
results of the evaluation, and the results presented indeed suggest that
such mechanisms are needed.

Furthermore, many injections do not result in any observable error
propagation (58-78%) within the time used for each injection, i.e., no
deviation from the expected behavior was observed. This is in line with
multiple previous studies, e.g., [Durães and Madeira, 2003], [Gu et al.,
2003] and [Jarboui et al., 2002a]. Experiments in the Class NF category are
either masked by the system, for instance parameters not used in this
context or overwritten; or handled by built-in error detection/correction
mechanisms checking incoming parameter values for correctness. Another
explanation could be that the fault is dormant in the system and has not yet
propagated to the OS-Application interface. It is important to note that all
errors injected were in fact activated, since the pre-profiling eliminates
services not used prior to injection (Section 4.6). The high number of
Class NF experiments indicates that there is room for improving the
selection of injection cases beyond the pre-profiling already carried out.
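
The percentages discussed in this section follow directly from the counts in
Table 6.13. As a minimal sketch, the following Python snippet recomputes
them; the counts are transcribed from the table, while the names COUNTS and
class_percentages are chosen here for illustration only.

```python
# Outcome counts per (driver, error model), transcribed from Table 6.13:
# (no failure, Class 1, Class 2, Class 3).
COUNTS = {
    ("cerfio_serial", "BF"): (2060, 38, 481, 74),
    ("cerfio_serial", "DT"): (264, 65, 53, 15),
    ("cerfio_serial", "FZ"): (908, 20, 401, 66),
    ("91C111", "BF"): (1320, 416, 50, 64),
    ("91C111", "DT"): (203, 66, 1, 13),
    ("91C111", "FZ"): (574, 389, 0, 27),
    ("atadisk", "BF"): (1117, 300, 3, 66),
    ("atadisk", "DT"): (172, 90, 1, 4),
    ("atadisk", "FZ"): (565, 310, 4, 6),
}


def class_percentages(counts):
    """Return each outcome class as a percentage of all injections."""
    total = sum(counts)
    return tuple(100.0 * c / total for c in counts)


if __name__ == "__main__":
    for (driver, model), counts in COUNTS.items():
        nf, c1, c2, c3 = class_percentages(counts)
        print(f"{driver:14s} {model}  NF={nf:5.2f}%  "
              f"C1={c1:5.2f}%  C2={c2:5.2f}%  C3={c3:5.2f}%")
```

Running the loop reproduces the two observations made in the text: the
no-failure share lies between roughly 58% and 78% for every driver and
model, and the Class 2 share for cerfio_serial is well above that of the
other two drivers for all three models.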
6.4.2 Execution Time

The number of injections performed and the time required for executing the
experiments are related. An increase in the number of injections will mean
increased execution time. However, the outcome of the experiments influences
the execution time. An injection that does not lead to error propagation can
be considerably faster than one that leads to a system hang, which requires
first that the hang is detected and then a reboot of the system.

Table 6.14 reports the execution times for the injections performed. The
times reported include only the actual execution time, not implementation,
setup and off-line processing times.
Table 6.14: Experiment execution times.

Driver         Error Model   Execution Time
                             hours   minutes
cerfio_serial  BF              38      14
               DT               5      15
               FZ              20      44
91C111         BF              17      20
               DT               1      56
               FZ               7      48
atadisk        BF              20      51
               DT               2      56
               FZ              11      55
From the table it can clearly be seen that the BF model, having the most
injections, also has the longest execution time. The time required for BF is
roughly twice as much as for FZ and seven to eight times as much as DT. As
noted above the outcome of the experiments influences the execution time,
and this might differ across drivers. In our setup cerfio_serial and atadisk
both take longer time when failing, which also influences the execution
time.

A factor influencing the effective experiment time is the degree of operator
involvement. The operator is required to specify the experiment to run
(which driver, error model etc). The setup time is the same for all models.
Additionally, some injections force the system into a state where it cannot
automatically reboot, requiring a manual reboot by the operator.
Consequently, without external reboot mechanisms the experiment is delayed
until the operator is notified and can perform the reboot, which can prolong
the execution time substantially. This additional delay is not part of the
execution time in Table 6.14 since no assumption is made on the presence of
the operator. The issue of manual reboots is further discussed in
Section 6.6.
6.4.3 Injection Efficiency

The absolute number of failures each error model gives rise to also needs to
be put in contrast to the number of injections performed, to give an
indication of the efficiency of the model. Figure 6.1 graphically shows the
data in Table 6.13. It can be seen that the overall trend is similar across
all three drivers. This would clearly favor models incurring fewer
injections, especially DT, but also FZ.

Figure 6.1: Injection efficiency. [Bar chart: failure class distributions in
percent (Class 1, Class 2, Class 3) for the BF, DT and FZ models on
cerfio_serial, 91C111 and atadisk.]
The other criteria used are quantitative in nature, where a relative scale
of “goodness” can be defined and used to rank the models. One aspect that
cannot be compared quantitatively is a model’s ability to assess the “true”
propagation patterns of the system. An efficient model may be one giving a
bigger “bang for the buck”, at least if finding failure triggering
vulnerabilities, but may still be misleading. A separate concern is
therefore to inspect the differences in propagation results for the three
models studied, represented by the Driver Error Diffusion values for all
three models and drivers in Table 6.15.

Table 6.15: Driver Diffusion for Class 3 failures.

Driver          BF     DT     FZ
cerfio_serial   1.00   1.50   1.93
91C111          0.93   0.98   0.59
atadisk         1.86   0.63   0.19

Table 6.15 shows that there are indeed differences across the results of the
three models. DT and FZ identify the serial driver to be the most vulnerable
driver, whereas BF pin-points atadisk to be the most vulnerable, with the
serial and Ethernet drivers having very similar values. It can also be
observed that the results for atadisk are clearly more spread than for the
other two drivers, with 91C111 being fairly consistent across all three
models. This indicates that the services for 91C111 giving rise to Class 3
failures suffer from “uniform” vulnerabilities, i.e., any small change in
the data supplied will trigger failures. On the contrary, services used by
atadisk have vulnerabilities triggered only for more specific values, in
this case triggered by bit-flips.

A straightforward view of the results is presented in Table 6.13, where the
number of experiments in each failure class is detailed, both in actual
numbers and as percentages of all injections.

The first observation is that for all drivers and error models the
percentage of injections ending up as Class 3 failures is below 5%,
indicating that the OS is capable of handling many perturbations and
avoiding a catastrophic failure.
Table 6.16: Class 3 Service Error Diffusion values for cerfio_serial using
the DT error model.

#   Service              Service Error Diffusion
1   MmMapIoSpace         0.5000
2   LocalAlloc           0.4000
3   LoadLibraryW         0.2500
4   SetProcPermissions   0.2000
5   memcpy               0.0909
6   CreateThread         0.0625
When comparing the error models, clear differences can be identified. For
instance, where Driver Error Diffusion for DT previously identified
cerfio_serial as the most diffusive driver, 91C111 has a higher ratio of
Class 3 failures for this error model. Only considering the ratio may in
this case be misleading, as cerfio_serial has more services with high
Service Error Diffusion compared to 91C111 for DT, as seen from Tables 6.16
and 6.17. Similarly, where Driver Error Diffusion with the BF model
indicates atadisk to be by far the most diffusive driver, the ratios of
Class 3 failures show them to be relatively close. This is an effect of
diffusion being a “sum of probabilities”. Diffusion shows that atadisk has
more services (especially wcslen with 1.0) with a high permeability than
91C111 (Tables 6.10 and 6.11).

Table 6.17: Class 3 Service Error Diffusion values for 91C111 using the DT
error model.

#   Service                  Service Error Diffusion
1   LoadLibraryW             0.2500
2   memcpy                   0.1818
3   NdisAllocateMemory       0.1764
4   RegOpenKeyExW            0.1500
5   memset                   0.1333
6   NdisMRegisterInterrupt   0.0555
7   NdisMSetAttributesEx     0.0322
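
The “sum of probabilities” character of diffusion can be checked against the
tables. The sketch below assumes, in line with the description above, that
the Driver Error Diffusion for a model is the plain sum of the Class 3
Service Error Diffusion values of the driver's services; the values are
transcribed from Tables 6.16 and 6.17, and the function name is illustrative.

```python
# Class 3 Service Error Diffusion values for the DT model,
# transcribed from Tables 6.16 (cerfio_serial) and 6.17 (91C111).
SERVICE_DIFFUSION_DT = {
    "cerfio_serial": {
        "MmMapIoSpace": 0.5000,
        "LocalAlloc": 0.4000,
        "LoadLibraryW": 0.2500,
        "SetProcPermissions": 0.2000,
        "memcpy": 0.0909,
        "CreateThread": 0.0625,
    },
    "91C111": {
        "LoadLibraryW": 0.2500,
        "memcpy": 0.1818,
        "NdisAllocateMemory": 0.1764,
        "RegOpenKeyExW": 0.1500,
        "memset": 0.1333,
        "NdisMRegisterInterrupt": 0.0555,
        "NdisMSetAttributesEx": 0.0322,
    },
}


def driver_error_diffusion(service_values):
    """Sum the per-service diffusion values (assumed definition)."""
    return sum(service_values.values())


if __name__ == "__main__":
    for driver, services in SERVICE_DIFFUSION_DT.items():
        print(driver, round(driver_error_diffusion(services), 2))
```

Under this assumption the sums round to 1.50 for cerfio_serial and 0.98 for
91C111, which illustrates the point made above: cerfio_serial has fewer
Class 3 failures per injection than 91C111 under DT, but its failures are
concentrated in services with high individual diffusion values.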
6.4.4 Coverage: Identifying Services

Table 6.18 depicts services incurring Class 3 failures and the number of
failures for each service/error model. BF outperforms the other error
models, both in terms of the number of identified vulnerable services, and
the total number of Class 3 failures (more clearly visible in Table 6.13).
BF identifies 22 individual services, DT 12 and FZ 11 services.

Considering which services the different models uniquely identify, i.e.,
services which only one model identifies, again BF outperforms DT and FZ.
BF identifies seven services and FZ two, which are not identified by any
other error model. DT identifies no such unique services. From the number of
failures identified, FZ identifies several services with only one case,
suggesting that the random nature of the FZ error model has a higher
probability of finding unique service vulnerabilities, whereas the more
systematic injections performed for BF typically reveal more than one
failure.
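
The coverage figures quoted above can be derived mechanically from
Table 6.18 with a few set operations. The following Python sketch uses the
per-service counts transcribed from that table; the helper names identified
and uniquely_identified are introduced here for illustration.

```python
# Class 3 failures per service as (BF, DT, FZ), transcribed from Table 6.18.
CLASS3_FAILURES = {
    "CreateThread": (3, 1, 0),
    "DDKReg_GetWindowInfo": (0, 0, 1),
    "DisableThreadLibraryCalls": (8, 0, 1),
    "FreeLibrary": (4, 0, 1),
    "HalTranslateBusAddress": (3, 0, 5),
    "KernelIOControl": (1, 0, 0),
    "LoadLibraryW": (2, 2, 0),
    "LocalAlloc": (2, 4, 0),
    "MapPtrToProcess": (2, 1, 1),
    "memcpy": (46, 3, 18),
    "memset": (74, 3, 34),
    "MmMapIoSpace": (11, 9, 27),
    "NdisAllocateMemory": (12, 3, 0),
    "NdisInitializeWrapper": (1, 0, 0),
    "NdisMRegisterInterrupt": (1, 1, 0),
    "NdisMSetAttributesEx": (3, 1, 1),
    "NdisMSynchronizeWithInterrupt": (5, 0, 0),
    "QueryPerformanceCounter": (3, 0, 0),
    "RegOpenKeyExW": (1, 3, 0),
    "SetProcPermissions": (1, 1, 9),
    "VirtualAlloc": (0, 0, 1),
    "VirtualCopy": (4, 0, 0),
    "wcscpy": (6, 0, 0),
    "wcslen": (11, 0, 0),
}
MODELS = ("BF", "DT", "FZ")


def identified(model):
    """Services for which the model triggered at least one Class 3 failure."""
    column = MODELS.index(model)
    return {s for s, counts in CLASS3_FAILURES.items() if counts[column] > 0}


def uniquely_identified(model):
    """Services identified by this model and by no other model."""
    others = set()
    for other in MODELS:
        if other != model:
            others |= identified(other)
    return identified(model) - others


if __name__ == "__main__":
    for model in MODELS:
        print(model, len(identified(model)), sorted(uniquely_identified(model)))
```

The same helpers make it easy to ask follow-up questions, for example which
services would be missed if only BF were used (the union of the other two
sets minus the BF set).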
Table 6.18: Services identified by Class 3 failures.

#    Service                          BF   DT   FZ
1    CreateThread                      3    1    0
2    DDKReg_GetWindowInfo              0    0    1
3    DisableThreadLibraryCalls         8    0    1
4    FreeLibrary                       4    0    1
5    HalTranslateBusAddress            3    0    5
6    KernelIOControl                   1    0    0
7    LoadLibraryW                      2    2    0
8    LocalAlloc                        2    4    0
9    MapPtrToProcess                   2    1    1
10   memcpy                           46    3   18
11   memset                           74    3   34
12   MmMapIoSpace                     11    9   27
13   NdisAllocateMemory               12    3    0
14   NdisInitializeWrapper             1    0    0
15   NdisMRegisterInterrupt            1    1    0
16   NdisMSetAttributesEx              3    1    1
17   NdisMSynchronizeWithInterrupt     5    0    0
18   QueryPerformanceCounter           3    0    0
19   RegOpenKeyExW                     1    3    0
20   SetProcPermissions                1    1    9
21   VirtualAlloc                      0    0    1
22   VirtualCopy                       4    0    0
23   wcscpy                            6    0    0
24   wcslen                           11    0    0

6.4.5 Implementation Complexity

The implementation cost, measured as the time required for the
implementation of an error model, is naturally subjective. The amount of
programming experience, knowledge of the area and the availability of tools
and documentation all influence the required time. However, some
observations during the course of implementing the injection framework
suggest that there are differences across the error models.
Whereas Table 6.14 shows that BF and FZ are clearly more expensive in terms
of execution time compared to DT, a major drawback with the DT error model
is the cost for implementation. The difference lies in that for every
function in the service interface the data type of each parameter needs to
be tracked, such that the right injector can be chosen. BF and FZ on the
other hand do not have this requirement, making their implementation costs
considerably cheaper. Both use simple injection technologies, making the two
models comparable in terms of implementation costs. Additionally, the time
required for defining the injection cases for each data type is considerably
higher for the DT model.
The cost for the DT model could potentially be reduced by use of automatic
parsing tools and/or reflection-capable programming languages. The
implementation cost is also a one-time cost for each driver, which might be
acceptable if the experiments are to be repeated in a regression testing
fashion. Furthermore, the cost might be acceptable in comparison to other
verification efforts used. Further research on the true costs for such error
models is warranted.

It is also important to note that even though the BF and FZ models do not
require data type tracking they can benefit from it. By knowing the data
type used, the number of bits targeted can be limited for data types not
using all bits anyway (such as 8-bit integers). This technique has been
applied in the experiments presented in this thesis.
6.5 Composite Error Model

Two major findings can be extracted from the previously presented results,
namely: a) that BF identifies the most Class 3 failures, both in terms of
absolute number and in the number of individual services identified, and b)
FZ, even though not triggering as many failures as BF, identifies Class 3
failures for additional services, beyond those identified by BF and DT
combined.

Even though these results must be interpreted in the context of our case
study they show that different error models, although being injected on the
same level and therefore being comparable, have different properties. It
would be desirable to combine the models (BF and FZ) into a combined model,
drawing on the strengths of both models. We do this by combining the two
models into a so-called composite model (CO).

For the composite model we will focus on the Class 3 failure class as it in
most cases is the critical class for robustness evaluation. The main hurdle
for use of the BF model was the comparatively high number of injections.
Thus we first focus on reducing the number of injections required by
studying the impact each bit injection has and selecting only a subset of
the bits for injections. Furthermore the number of injections required for
the FZ model is studied to find a reasonable injection set. The following
subsections will detail these studies.
6.5.1 Distinguishing Control vs Data

As mentioned, the number of required injections for BF increases the
required execution time dramatically compared to the other two models. The
high number of cases for each parameter is due to the fact that one
injection is made for each bit in the parameter value, thus typically 32
injections per parameter. For a parameter of type int holding an integer
value this uniform injection may represent a valid selection of error
values. However, in many cases, especially for device drivers written in C,
an integer value may not actually be used to represent all 32-bit integer
values. Instead only a small subset of the values are used, and consequently
only a small subset of the bits. It is therefore conceivable that not all 32
bits need to be targeted.
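
The per-bit injection scheme can be sketched in a few lines of Python. This
is only an illustration of the error model, not the injection framework used
for the experiments; the function name bit_flip_values and its parameters
width and bits are assumptions introduced here.

```python
def bit_flip_values(value, width=32, bits=None):
    """Yield (bit, corrupted value) pairs, flipping exactly one bit each time.

    `width` is the size of the parameter in bits. `bits` optionally
    restricts the injections to a subset of bit positions.
    """
    mask = (1 << width) - 1
    positions = range(width) if bits is None else bits
    for bit in positions:
        if not 0 <= bit < width:
            raise ValueError(f"bit {bit} outside a {width}-bit parameter")
        yield bit, (value ^ (1 << bit)) & mask


if __name__ == "__main__":
    # One injection case per bit: 32 cases for a 32-bit parameter.
    print(len(list(bit_flip_values(0x1234))))
    # Restricting the targeted bits shrinks the set of injection cases.
    for bit, corrupted in bit_flip_values(0x1234, bits=[0, 1, 31]):
        print(bit, hex(corrupted))
```

Passing a smaller width (for example 8 for an 8-bit integer) or an explicit
list of bit positions is one way to express that not every bit of every
parameter needs to be targeted.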
Figure 6.2: The number of services identified by Class 3 failures by the BF
model. [Plot: number of Class 3 service failures versus bit position.]

Figure 6.2 shows for each of the targeted bits how many services were
identified as having Class 3 failures when bit-flips were injected in that
bit. It can clearly be seen from the figure that there is no uniform
distribution across the bits. The lower order bits, bits 0-9, identify more
services than the other bits, with the exception of the most significant bit
(31) which typically has special significance, such as being the sign bit
for signed data types.

Figure 6.2 shows the number of specific services identified with
vulnerabilities for each bit, but not whether the services identified by
bit 0 are the same as those for bit 1. For this we use Figure 6.3 which
shows the cumulative number of identified services. Reading the figure from
left to right it shows how many services are identified by first injecting
in bit 0, followed by addition of injections in bit 1, then 2 and so on. It
shows that the set of vulnerable services increases in size up till bit 9
where twenty different services were identified. Another service is found at
bit 27 and the last one at bit 31, which as previously mentioned often has
special significance. Closer inspection reveals that the service first found
at bit 27 is also identified by bit 31.

Figure 6.3: Moving from bit 0 and upwards the number of services increases
until bit 10. [Plot: accumulated number of Class 3 failures versus bit
position.]

The observations made regarding the impact of individual bits suggest that
the subset of bits for which bit-flip injections should be made can be
reduced to only include bits 0-9 and 31, i.e., in total 11 bits compared to
the original 32 bits considered. This translates into a reduction of
injections by 49.8% in total compared to the full set used previously. Some
fault injection tools support such specification of injections, like
Xception [Carreira et al., 1998].

Studying the parameters used for services identified by BF, but not by FZ,
shows a clear trend: many of these parameters are control values, like
pointers to data, handles to files, modules, functions etc. Such parameters
are intuitively more sensitive to small value changes, i.e., changes caused
by flipping lower order bits. As an example consider a pointer to some data
stored in an allocated memory area. Large changes to the pointer value
(changes of higher order bits) are more likely to cause the erroneous value
to lie outside the allocated memory area than a smaller change. Memory
access errors, though being severe, can be detected by the system (in some
cases); small changes, however, may be harder to detect and may cause
failures that are harder to prevent. Similarly, the FZ model, using random
values, is more likely to choose values that are well outside the expected
data range. This is reflected in the difference observed between the two
models. On the other hand, FZ’s random nature means it can find
vulnerabilities not found by more “structured” approaches, reflected in the
fact that FZ identifies several additional service failures on top of those
found by BF alone.
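
The pointer argument can be made concrete with a toy calculation. The sketch
below assumes a hypothetical 1 KiB buffer at a made-up address and a pointer
into its middle, and lists which single-bit flips leave the pointer inside
the buffer. None of the addresses or sizes are taken from the experiments.

```python
def flips_staying_in_buffer(pointer, base, size, width=32):
    """Return the bit positions whose flip keeps `pointer` inside the buffer.

    The buffer is the half-open address range [base, base + size).
    """
    inside = []
    for bit in range(width):
        corrupted = pointer ^ (1 << bit)
        if base <= corrupted < base + size:
            inside.append(bit)
    return inside


# Hypothetical values, chosen only for illustration.
BASE = 0x20000000        # start address of the allocated area
SIZE = 1024              # size of the allocated area in bytes
POINTER = BASE + 512     # pointer into the middle of the area

if __name__ == "__main__":
    print(flips_staying_in_buffer(POINTER, BASE, SIZE))
```

For this particular layout only flips in the low-order bits keep the pointer
within the allocated area; every flip in a higher bit moves it outside,
where a memory access error is at least potentially detectable. The exact
cut-off depends on the assumed buffer size and alignment.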
6.5.2 The Number of Injections for Fuzzing

Figure 6.4: Stability of Diffusion for the FZ model with respect to the
number of injections. [Plot: Diffusion value versus number of FZ injections
per parameter (1-15) for atadisk, 91C111 and cerfio_serial.]

Since the FZ model, in contrast to BF and DT, requires the evaluator to set
the number of injections to be performed, this becomes an important question
for judging the usefulness of the FZ error model. Previously, we have
already shown how the fifteen injections performed provide comparable and
useful results. One question remaining is whether fifteen injections is
sufficient for assessing error propagation. Figure 6.4 shows how the Driver
Error Diffusion values stabilize as the number of injections is increased.
The values stabilize after roughly ten injections, but there are some
differences across the drivers, suggesting that stabilization may also be
driver dependent. However, for these three drivers the curves remain clearly
separated for any number of injections shown.
6.5.3 Composite Model & Effectiveness

The results from the previous section clearly show the need for using
multiple error models. When resources are plentiful it is therefore
recommendable to use multiple error models to get comprehensive coverage. To
decrease the cost of evaluation (in implementation and experimentation
time/effort) this may not be desirable. In this section we therefore propose
and evaluate a composite model (CO). The new CO model combines the BF model
using the least significant bits (together with the most significant one)
with a series of FZ experiments. Section 6.5.2 indicates that even as few as
ten FZ injections give stable Driver Error Diffusion values. This number was
chosen to decrease the overall number of injections, but it is reasonable
that more FZ injections will increase the probability of finding “rare”
cases.
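
How such a composite injection set could be generated for a single parameter
is sketched below. The sketch assumes bit-flips in bits 0-9 and 31 plus ten
uniformly random values, as described in this chapter; the constant and
function names are illustrative, and the random number generator is passed
in explicitly so that a run can be repeated with the same seed.

```python
import random

CO_BITS = tuple(range(10)) + (31,)   # bits 0-9 and the most significant bit
CO_FUZZ_CASES = 10                   # number of random (FZ) values


def composite_injections(value, rng, width=32):
    """Return the CO injection cases for one parameter value.

    Each case is a tuple (model, index, corrupted value), where index is the
    flipped bit for BF cases and a running number for FZ cases.
    """
    mask = (1 << width) - 1
    cases = [("BF", bit, (value ^ (1 << bit)) & mask)
             for bit in CO_BITS if bit < width]
    cases += [("FZ", i, rng.getrandbits(width))
              for i in range(CO_FUZZ_CASES)]
    return cases


if __name__ == "__main__":
    rng = random.Random(2008)        # fixed seed, for repeatable experiments
    for model, index, corrupted in composite_injections(0x1000, rng):
        print(model, index, hex(corrupted))
```

For one 32-bit parameter this gives 11 + 10 = 21 cases, compared to
32 + 15 = 47 cases when all bit-flips and fifteen fuzzing values are used.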
Table 6.19: Diffusion values for the three drivers.

Driver          Diffusion
cerfio_serial   1.58
91C111          0.91
atadisk         1.01

Figure 6.5: Failure class distribution for CO compared to BF, DT and FZ.
[Bar chart: failure class distributions in percent (Class 1, Class 2,
Class 3) for BF, DT, FZ and CO on cerfio_serial, 91C111 and atadisk.]

The CO model is evaluated by considering the BF injections in bits 0-9 and
bit 31, together with the first ten FZ injections. CO identifies all
services having Class 3 failures but one (VirtualAlloc) compared to the full
set of BF + FZ injections. An overview of the results is shown in
Figure 6.5. The figure shows that the results achieved with this subset of
injections are comparable with the results for the other models alone,
making its ability to assess propagation effects on par with the other
models. This is further substantiated by the Driver Error Diffusion values
for the CO model presented in Table 6.19.

Figure 6.6: The number of injections for the composite model compared to
bit-flips and fuzzing together. [Bar chart: number of injections for “All BF
& FZ” and “CO only” on atadisk, 91C111 and cerfio_serial.]

The number of injections required is compared in Figure 6.6. The figure
clearly shows that the number of injections is well below half of the
experiments required for the full set of BF and FZ, which translates into
considerable savings in experimentation time.
6.6 Discussion

This section discusses some important aspects of the work and the results
provided for the case study.

Imports vs. Exports

In the system model presented in Chapter 3 a distinction is made between
imports and exports. The data presented in this chapter has not made any
distinction between imported and exported services. The reason is simple: in
no case was an injection in an exported service able to trigger a Class 3
failure in the system.

Table 6.20: A comparison between the results for imported and exported
services.

Driver          Interface   Class 3   Class 2   Class 1
cerfio_serial   import         74       450        35
                export          0        31         3
91C111          import         64        50       416
                export          0         0         0
atadisk         import         66         3       257
                export          0         0        43

There could be many reasons for this effect. First of all the number of
services in the exported interface is lower than in the imported (typically
around 10 compared to 30-40), which makes it reasonable that most failures
will be found in the larger set. Secondly, the exported interface is a
standard interface, used by many drivers. It is therefore likely that the OS
takes special care to validate misuses of these services and that major
flaws have already been detected during testing. Thirdly, these services
very closely match OS-Application services, which suggests that the amount
of additional work done by the OS is small for these services. Consequently,
the effects errors can have will be mostly on the applications themselves,
as Class 2 and Class 1 failures.
Experimental Techniques

As with any experimental evaluation technique it is important to consider
the limitations of the chosen approach. Uncertainties are introduced at
multiple levels and they need to be identified and understood to properly
interpret the results.

First of all the error models used and evaluated here are generic. They are
not based on any specific system scenario, but rather represent the subset
of data level errors occurring at the OS-Driver interface. If
system-specific faults are to be considered, more specific error models need
to be included as well as, or instead of, the generic ones presented here.
Furthermore, even for the subset of conceivable errors appearing at this
interface only a small fraction is actually used. The results provided in
this chapter support our belief that these are representative for a wider
selection of errors since even though there are differences between the
models, they overall show a similar pattern.

Secondly, the results presented are influenced by external factors such as
the selected workload and the composition of the system (the selected OS
components). To minimize the variability of the results and to minimize
unpredictable influences we use a targeted generic workload and minimize the
number of system components (see Section 3.3). For a specific system a
workload closely resembling the expected one should be used as well, and the
system should be composed such that it resembles the final system as closely
as possible.

Finally, the experimental procedures themselves may be a source of influence
on the final results, both in terms of what and how the outcomes are
observed, and any undesirable influence caused by the added software used
for the experimentation. We have followed common practice in the selection
of observation points, namely from a user perspective. Furthermore, we have
minimized the number of components required for the execution of the
experiments and made efforts to minimize any potential impact they might
have. However, as the experiments have not yet been repeated in a similar
environment we cannot be 100% certain that no such influence exists.
Bug vs. Vulnerability

An interesting question that arises when studying the results presented is
whether the presence of Class 2 and Class 3 failures is an indication of
“bugs” in the system. The answer is both yes and no, since a vulnerability
discovered by experimental fault injection may, or may not, be present in a
deployed system². A common case, especially for device drivers, is that
documentation states that certain rules should be obeyed when using specific
services. These rules may not be enforced, for performance reasons, e.g.,
the cost of checking each parameter value for a driver may be too high.
Instead a “gentlemen’s agreement” is used, where the OS assumes that
services are not misused. If a driver does misuse such a service it may not
be considered a bug in the system in the traditional sense, but it surely is
a robustness vulnerability. Such vulnerabilities have recently attracted
more attention in research, since they constitute threats to the system’s
security where an attacker can render the system non-responsive and thereby
threaten the availability of the system. As previously mentioned we focus on
robustness and therefore consider all discovered failures of the system
vulnerabilities.

² Hence the use of the term vulnerability instead of bug.
Structured vs. Random Injection

There is an ongoing debate in the testing community whether random testing
is an appropriate testing technique in general, or whether one should aim
for more classical techniques, such as equivalence classes or boundary value
testing [Hamlet, 2006]. Our choice of models reflects this conflict, where
DT and BF represent more structured approaches, whereas FZ introduces
randomness. The results presented also support many researchers’ view on
random testing, namely that it has many weaknesses, but may in some cases be
preferred, because no alternative is definitely better.

The advantage of structured approaches is that they can draw from existing
knowledge when selecting injection cases; this is for instance very clear in
the case of DT. DT is on the other hand limited to the ability of the
evaluator to select appropriate injection cases, an inherently very
difficult task. BF makes this task simpler, by defining injections based on
the representation of the injection target (the parameter value), but is
still limited to the specific modifications done by flipping the bits. FZ
imposes no such restrictions, simply choosing randomly selected values.

The results clearly show that BF finds more vulnerabilities, in more
services, and that DT is clearly more efficient (requires fewer injections)
than FZ. However, FZ is able to identify services beyond the set identified
by BF and DT, also with a limited number of injections.

Overall, the results favor using multiple error models, and the composite
model shows that using the two models requiring the least implementation
effort can give very promising results.
Operator Involvement

The degree to which the operator (the person setting up the experiments and
supervising them) is involved in the process affects the effective time
required to perform the experiments. First of all the operator is required
to configure the system and specify which experiments to perform. For the
framework used in this thesis this time is the same for each error model.
The second task is to supervise the experiments and when needed manually
restart boards that have hung. For the experiments presented here, it
happens in many cases that the system crashes without being able to
automatically restart itself. In these cases the operator (in this case the
author himself) has to manually force a cold restart of the target board.
Table 6.21 presents data on the amount of manual reboots required.

Table 6.21: The percentage of Class 3 failures that required the boards to
be manually rebooted by the evaluator.

Driver          Error Model   Manual reboots [%]
cerfio_serial   BF              8.1
                DT             46.7
                FZ             15.2
91C111          BF             54.7
                DT             23.1
                FZ             37.0
atadisk         BF              9.1
                DT             25.0
                FZ             16.7

When the system is unable to restart automatically experiments are delayed
until the operator notices the problem and takes action. The Host Computer
is equipped with a watchdog timer that notifies the operator if no log
messages have been received within the last four minutes, well beyond the
execution time of an experiment that is automatically self rebooted (which
is also triggered by a watchdog timeout as described in Section 4.5.3).
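
The host-side watchdog logic can be illustrated with a small sketch. This is
not the implementation used on the Host Computer; the class name
LogWatchdog, its methods and the injectable clock are assumptions made for
the example, with only the four minute timeout taken from the text.

```python
import time


class LogWatchdog:
    """Flags a stalled experiment when no log message arrives in time."""

    def __init__(self, timeout=240.0, clock=time.monotonic):
        self.timeout = timeout          # seconds; four minutes by default
        self.clock = clock              # injectable for testing
        self.last_message = clock()

    def message_received(self):
        """Call whenever a log message arrives from the target board."""
        self.last_message = self.clock()

    def expired(self):
        """True if the operator should be notified (or the board reset)."""
        return self.clock() - self.last_message > self.timeout


if __name__ == "__main__":
    watchdog = LogWatchdog()
    watchdog.message_received()
    print("notify operator" if watchdog.expired() else "experiment alive")
```

A supervising loop would call message_received() for every log line and poll
expired() periodically; the same check could drive an automatic reset of the
target board if suitable hardware is available.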
Since only eleven services overall have failures requiring the operator to
manually reboot the machines, the number of such reboots for a given driver
and error model depends on how these services are used, giving rise to the
differences reported in Table 6.21.

A further development of the injection framework would be to implement the
hardware required to automatically reboot the target board when the host
machine watchdog is triggered.

As described previously, a generic time penalty is assigned to every
manually rebooted experiment. The times reported in Table 6.14 therefore do
not consider the operator time.

The results in Table 6.21 indicate the efficiency achievable with watchdog
timers monitoring system processes. For many of the injected errors a system
level monitoring watchdog, which restarts failing processes, could increase
the availability of the system. This requires that “micro-rebooting” of the
targeted components is possible. Such strategies have for instance been
deployed in [Candea et al., 2004; Herder et al., 2007].
Extraction of Bit Profiles

Section 6.5.1 uses the bit profiles to find which bits find the most service
failures. A very practical question is of course if the found profile is
generally applicable to many systems or if the results are specific to the
setup used for these experiments. To get a clear picture of this more
drivers and systems would have to be profiled, and relations to specific
drivers and service types further evaluated.
6.7 Related Work

There have been several efforts made to compare error models and to find
representative errors to inject for specific systems and purposes. In this
section we review some of the most relevant efforts that relate to the work
in this thesis. We have therefore limited the selection to those that
consider software faults, especially with focus on OS’s and robustness
evaluations. A longer treatment of related work is found in Chapter 2.

Albinet et al. have also studied errors in device drivers by injecting
errors in the OS-Driver interface [Albinet et al., 2004]. The error model
used is DT using the terminology in this thesis, but with a lower number of
injection cases compared to ours. Injection on a Linux-based system shows a
higher ratio of kernel hangs than observed in this chapter. This may be due
to differences between the two systems, or the drivers tested.

Arlat et al. study the dependability of microkernel-based systems using
fault injection. The MAFALDA tool is used to inject faults and perform
failure mode analysis. The error model consists of both injections in
parameter values to microkernel services and injections in both code and
data segments of a component. For both locations bit-flips are used to
simulate both software and hardware faults. The type of injections is
partially similar to ours (service parameters) but encompasses only the BF
model. The results for parameter injection show a very low ratio of kernel
hangs and crashes, possibly suggesting that microkernel architectures are
better at handling these types of errors than monolithic systems.

Jarboui et al. compare BF and DT error models for the Linux kernel [Jarboui
et al., 2002b,a]. Firstly they also find a distinct difference in the number
of injections required for BF compared to DT. Similarly to our results the
number of severe outcomes is small and both models show similar behavior in
terms of failure mode distributions. They also compare these results with
errors injected inside the kernel code (which we have not done) and observe
a higher ratio of severe outcomes, suggesting that there is a certain level
of parameter validation performed for kernel services present in the system.

The fuzzing model was first used for utility programs for UNIX systems
[Miller et al., 1990] and has later been applied also for protocols and
application interfaces [Howard and Lipner, 2006; Fre]. Oehlert makes a
distinction between intelligent and unintelligent fuzzing [Oehlert, 2005].
The former more closely resembles our DT model, where knowledge about the
format used is assumed. The latter does not require any prior knowledge and
may therefore better explore unexpected inputs, which was also shown in the
results presented in this chapter.

To the best of our knowledge this thesis represents a first effort to use
fuzzing at the OS-Driver interface and the first time fuzzing is
quantitatively compared to other error models in a comparable setting.
6.8 Summary of Research Contributions

This chapter presents a comparative study of three different error models:
bit-flips (BF), data-type (DT) and fuzzing (FZ). The models are compared
using data from a representative case study conducted on Windows CE .Net.
Furthermore, the robustness measures introduced in Chapter 5 are used to
compare the three models on their abilities to trigger error propagation in
the system. Overall the following key observations are made, which can be
used as input and recommendations for future robustness evaluations:

• The measures derived in Chapter 5 are shown to be useful for studying
  failure and propagation characteristics of the OS, identifying services
  and drivers with potential robustness vulnerabilities.
• The BF model finds more vulnerabilities than the other models. It also
  identifies more services in the OS-Driver interface having
  vulnerabilities, making it the preferred choice for robustness
  evaluations.
• All three models are well suited to study error propagation
  characteristics, especially using the Driver Error Diffusion measures.
  Some differences across the models are observed, relating to the use of
  control values in the interfaces.
• The DT model uses the fewest injections, followed by the FZ model and the
  BF model. The use of profiling can reduce the number of injections for BF
  and a careful study of the number of injections for FZ shows that one can
  perform experiments with relatively few injections.
• In terms of implementation costs the DT model is the most costly. BF and
  FZ are comparable but the time required for any implementation depends on
  many factors, including skills, experiences and availability of tools and
  documentation.
• For identifying service vulnerabilities the BF model is the preferred
  choice. However, the random nature of FZ allows for finding other
  vulnerabilities that the other two models do not find.
• A new composite error model is defined as a composition of bit-flips and
  fuzzing injections. A subset of the available bits for a parameter is
  targeted after a profiling step revealing which bits have a higher impact
  on the system.
Chapter 7

Error Timing Models - When to Inject

When - in the time domain - should errors be injected?

Services used in driver/OS interactions are typically invoked multiple times
during the lifetime of the driver. Therefore, when injecting errors in
services which are called multiple times the time at which an error is
injected will obviously affect the outcome of the experiment. Consequently,
controlling the time at which the error is injected is a crucial part of the
robustness evaluation. Multiple tools have been developed which allow for
control of the time of injection. Most tools allow injection based on
user-defined events or according to some time distribution. However,
surprisingly little research has been spent on strategies for selecting
injection times, beyond time distributions.

This chapter is devoted to a novel timing model used for robustness
evaluation of OS’s to errors in device drivers. It helps answering RQ5 -
when to inject - by defining a usage profile of a driver, which can be used
to control and select the time at which the errors are injected. Extensive
experimentation shows that controlling the time of injection is indeed
important, and furthermore that its effectiveness depends on the usage
profile of the driver.
7.1 Introduction
When discussing timing issues for fault injection two aspects of the injected errors are relevant: the time at which an error is injected and the duration it stays active. This chapter focuses on the former property of an error.

For the duration of injected errors we focus on software related errors. The system is assumed to function properly when no errors are injected. The duration of injected errors is transient. Transient errors appear and then disappear shortly thereafter. This model reflects Heisenbugs, i.e., those software faults which due to external conditions do not deterministically reoccur every time the system is used. Such faults are hard to find with traditional testing techniques and may therefore occur even in well tested systems. Injections are performed in the OS-Driver interface, thus limiting the potential injection instances to when services in this interface are used. The transient model translates into the error being injected once for the targeted service and then disappearing before the second call to the same service.

Two main generic strategies exist to trigger the injection of an error: event-triggered and time-triggered. In the former approach specific events are used to trigger the injection, and in the latter approach time is used to trigger injection. Event-driven injection typically allows for a more fine-grained control of the individual injections, but requires the triggering events to be defined. Time-triggered injection requires no such definitions, but relies on a larger number of injections, distributed over time.
This chapter presents an approach extending the event-triggered approach presented in Chapter 6, where errors are injected in the first call to a service, a so called first-occurrence approach. First-occurrence only targets the first call to a service, disregarding any subsequent calls. The usage profile of the driver is used to build a usage model of the driver, and the service calls to be targeted can be selected to cover a wider spectrum of system states.

The rest of the chapter is structured as follows: First a discussion on the two alternative timing models provides the foundation and background needed for the rest of the chapter. The driver usage profile model is presented and discussed, followed by a description of the evaluation criteria used for the experimental evaluation of the approach. The implementation and the results are then presented and discussed. The conclusions and a summary of the research contributions follow in the last section.

7.2 Timing Models
The time at which an error is injected is also referred to as the triggering mechanism for the error. The need for controlling and monitoring the triggering event has previously been identified as important, but inherently difficult [Whittaker, 2003]. Several of the fault injection tools surveyed in Section 2.3 allow for controlling the triggering of errors, at least to some extent. On an abstract level an error is always triggered by an event.

For practical purposes one makes a distinction between injections triggered by special events taking place in the system, and those triggered by time alone, giving rise to the two classes of timing models for fault injection, event-triggered and time-triggered.

7.2.1 Event-Trigger
In the event-triggered case, the most common approach for software is location-based injection. This strategy is based on the premise that since the system's vulnerable states cannot generally be postulated a-priori, the events triggering injection are based on reaching certain locations in the code of a module. The simplest of such strategies is to inject the first time a certain location is reached (first-occurrence strategy).

A location-based approach is relevant especially for code injection, where errors (or faults) are injected directly into the source code (or executable binary) to mimic software faults [Durães and Madeira, 2002; Ng and Chen, 2001], or into the instruction stream of the CPU [Gu et al., 2003]. This comes from the fact that the software faults mimicked have specific locations in the code.

A variation of the first-occurrence approach is to use an n-occurrence trigger, i.e., the error is injected after n calls to a service, or after reaching a location for the nth time. This approach is a generalization of the first-occurrence approach, but requires the user to set the value of n, which is far from trivial. The approach is implemented for instance in the FERRARI tool [Kanawati et al., 1995].
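
The relation between the first-occurrence and n-occurrence triggers can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation used in this thesis or in FERRARI: the class name, the service names and the trace are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NthOccurrenceTrigger:
    """Fires exactly once, on the n-th call to the targeted service."""
    service: str   # hypothetical service name
    n: int = 1     # n = 1 gives the first-occurrence strategy
    seen: int = 0  # calls to the targeted service observed so far

    def should_inject(self, called_service: str) -> bool:
        if called_service != self.service:
            return False
        self.seen += 1
        # True for a single call only, so the injected error is transient.
        return self.seen == self.n


# Hypothetical trace of service calls.
trace = ["open", "read", "read", "read", "close"]
first = NthOccurrenceTrigger("read", n=1)
third = NthOccurrenceTrigger("read", n=3)
first_hits = [i for i, s in enumerate(trace) if first.should_inject(s)]
third_hits = [i for i, s in enumerate(trace) if third.should_inject(s)]
```

With n = 1 the trigger fires on the first call to the service; any other n has to be chosen by the user, which is the difficulty noted above.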
In [Tsai et al., 1999] CPU registers and memory are targeted for fault injection using bit-flips. To maximize the activation rate of faults, and their impact, the timing of the injections is controlled by the workload in the system. Offline analysis detects paths through the workload for a given set of inputs and faults are injected along the paths. Alternatively, the activation rate of the CPU is used to perform the injections at peak-usage times.

The advantage of the event-triggered approach is that it can be tailored to evaluate specific locations in a system, or as specific events take place. It can therefore speed up the evaluation process by reducing the number of injected errors and focusing on only the relevant events in the system. The disadvantage is that often these events need to be defined by the user. Selecting them is a difficult process, possibly requiring deep understanding of the system, its components and their interaction.

7.2.2 Time-Trigger
When using a time-triggered approach a timeout is defined, after which the error is injected. Typically a large number of injections are performed, and their injection times follow some specific distribution (e.g., uniform, normal, exponential etc.) [Kao and Iyer, 1994; Han et al., 1995; Rodriguez et al., 2002]. In this case, often the location is also randomly selected across a set of predefined locations. This approach is common when simulating physical faults (radiation, EMI etc.) which are inherently "random" in nature [Karlsson et al., 1994].

Alternatively, the triggering event is defined as a combination of time and location, such that after the timeout has elapsed the error is injected at a predefined location, possibly using first-occurrence, or after the nth occurrence of a call. FERRARI, for instance, allows for specifying a time distribution (such as uniform) after which the fault is injected [Kanawati et al., 1992, 1995].

Many fault injection tools support both event and time-triggered injection [Carreira et al., 1998; Kanawati et al., 1995], but still leave the burden of choosing events and/or distributions to the user.

The time-triggered strategy does away with the burden of selecting triggering events, but on the other hand instead relies on a large number of injections to get statistically significant results. In any case a distribution needs to be selected and justified.
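
As an illustration of the time-triggered strategy, the sketch below draws injection timeouts from a chosen distribution. It is a minimal example under assumptions: the function name, the experiment duration and the exponential rate are hypothetical choices, not parameters taken from any of the cited tools.

```python
import random


def sample_injection_times(count, duration, distribution="uniform", seed=0):
    """Draw `count` injection timeouts (seconds) within [0, duration]."""
    rng = random.Random(seed)  # fixed seed keeps the campaign repeatable
    times = []
    while len(times) < count:
        if distribution == "uniform":
            t = rng.uniform(0.0, duration)
        elif distribution == "exponential":
            # Mean of duration / 4 is an arbitrary, illustrative choice.
            t = rng.expovariate(4.0 / duration)
        else:
            raise ValueError(f"unknown distribution: {distribution}")
        if t <= duration:  # discard timeouts beyond the end of the run
            times.append(t)
    return sorted(times)


uniform_times = sample_injection_times(100, duration=60.0)
exponential_times = sample_injection_times(100, 60.0, "exponential")
```

The distribution argument is exactly the choice that, as noted above, has to be selected and justified by the user.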

7.3 Driver Usage Profile
As seen from Section 7.2 both models have advantages and disadvantages. Since effectiveness is one of the key goals of robustness evaluation and with our focus on software faults we have opted for an event-driven approach. The event-driven approach is more suitable for driver errors as it is easier to control and is more fine-grained. It is also more suitable for software faults.
As many services are called multiple times during the execution of the test applications the question arises which of these calls to target. Targeting each call will generally not be possible due to the large number of calls. Thus a selection of a subset is required. The simplest, and most straight-forward approach is to use the first-occurrence approach. This is indeed the approach used in the previous chapter and in several other projects. However, when injecting in interfaces between components (such as drivers), the injected error may stem from several distinct locations in the component, i.e., may have several distinct fault origins. Thus, injection on first-occurrence will only target a subset of those potential fault locations, namely those corresponding to the first call. Subsequent calls to a service will not be targeted, even though the driver (and indeed the whole system) may be in a different state which may be of interest to evaluate.
We have developed a methodology to select the relevant service invocation based on the observation that a service request from an application translates into one or more calls into the drivers by the OS. In any practical context only a small subset of the possible sequences of calls that can be made are actually observed. As an example, it does not make sense for an application to read from a file before it has been opened, although there is nothing stopping it from trying. Since we base our evaluation on a system which is functional, it can be expected that such behavior is not present when the workload used is executed. Thus, the operational behavior of the driver can be broken down into a series of calls to the driver, where certain sub-sequences are more frequent than others. Such subsequences represent common sequences of calls made from applications, thereby defining the "operations" performed by the driver, such as "creating a file", or "setting connection parameters" etc. Such an elementary sequence of calls is called a call block. In our model, the call blocks are used to trigger the injection of errors in the OS-Driver interface, thereby giving a more fine-grained control over the time of injection, compared to first-occurrence and time-triggered injection.
The driver usage profile is defined as an ordered list of calls to services provided by the driver (the ds_x.y services in Figure 3.1). This definition is slightly different from the traditional definition of an operational profile as defined for instance by Musa [Musa, 1993]. The operational profile is typically defined as the frequency distribution across a component's functions. Our usage profile additionally considers the order in which the functions are called. The list of calls made is termed the call string of the driver. The call string is further divided into a set of call blocks, which are disjoint subsequences of the call string.

7.3.1 Call String
Figure 7.1 illustrates the driver usage profile as a time-service diagram. The calls made to driver services are illustrated as rectangles in the figure (a-d).

The call string is formed by assigning tokens to each service from a predefined alphabet and then for each call adding one token to the list. The call string for the example in Figure 7.1 is thus ababcdabdab. As can be seen, some sequences of calls are repeating, forming call blocks as subsequences of the call string (α, β and γ). Note that sequential execution of the driver is assumed.

Figure 7.1: Example of called services. (Calls to services a-d along the time axis, grouped into the call blocks α1, α2, β, α3, γ and α4.)

7.3.2 Call Blocks
The example in Figure 7.1 shows that services a and b are called multiple times during the execution. Each call to service a is followed by a call to b; a and b thus form a call block, which is repeated during the execution of the driver. The sequences cd and d are not repeating, and cannot be added to any other call block. These sequences form the non-repeating call blocks. The call string is thus split into call blocks as indicated in Figure 7.1. Using a conventional regular expression syntax the sequence can be represented compactly as (ab){2}cd(ab)(d)(ab), where (ab){2} means that the sequence ab is repeated twice.
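
The construction of the call string for the example of Figure 7.1, and the check that the regular expression above describes it, can be sketched as follows. The service names are hypothetical placeholders, since the figure only labels the services a to d.

```python
import re

# Hypothetical service names mapped to the tokens a-d of Figure 7.1.
ALPHABET = {"service_a": "a", "service_b": "b",
            "service_c": "c", "service_d": "d"}


def build_call_string(trace):
    """Append one token per observed call, in call order."""
    return "".join(ALPHABET[name] for name in trace)


# The calls of Figure 7.1 in the order they are made.
trace = (["service_a", "service_b"] * 2
         + ["service_c", "service_d",
            "service_a", "service_b",
            "service_d",
            "service_a", "service_b"])

call_string = build_call_string(trace)
matches_blocks = re.fullmatch(r"(ab){2}cd(ab)(d)(ab)", call_string) is not None
```

Here `call_string` evaluates to ababcdabdab and the full match succeeds, i.e., the block expression covers the whole string with no tokens left over.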
Currently the assignment of call blocks is performed through a combination of identification of repeating blocks and a priori knowledge regarding the functionality of the driver. As call strings grow in size automated techniques will be required to handle the large number of tokens. Section 7.6 discusses the use of special data structures to automate call block identification.

Figure 7.2 illustrates a similar call block structure, and additionally shows the services called within each call block. When targeting service s_i using the first-occurrence approach, injections are performed only in the first call to that service. Subsequent calls to s_i, for instance in α2 or α3, are not targeted. Using the call block strategy multiple calls to s_i can be targeted.
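
The difference between the two selection strategies can be made concrete with a small sketch. The assignment of services to call blocks below is hypothetical (Figure 7.2 is not reproduced here), as are the function names; only the selection logic follows the description above.

```python
def first_occurrence_targets(blocks):
    """One target per service: its first call in the whole execution."""
    seen, targets = set(), []
    for block_name, services in blocks:
        for service in services:
            if service not in seen:
                seen.add(service)
                targets.append((block_name, service))
    return targets


def call_block_targets(blocks, selected):
    """First call to each service inside every selected call block."""
    targets = []
    for block_name, services in blocks:
        if block_name in selected:
            for service in dict.fromkeys(services):  # unique, in call order
                targets.append((block_name, service))
    return targets


# Hypothetical usage profile: (call block instance, services it calls).
blocks = [("alpha1", ["s_i", "s_j"]), ("alpha2", ["s_i", "s_j"]),
          ("beta", ["s_k"]), ("gamma", ["s_m"]),
          ("alpha3", ["s_i", "s_j"]), ("alpha4", ["s_i", "s_j"])]

fo = first_occurrence_targets(blocks)
cb = call_block_targets(blocks, {"alpha1", "alpha3", "beta", "gamma"})
```

In this sketch first-occurrence reaches s_i only in alpha1, whereas the call block selection also reaches it in alpha3, at the price of more injection targets.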

Figure 7.2: Example of a driver calling OS services in different call blocks. (OS services s_i, s_j, s_k and s_m called within the call blocks α1, α2, β, γ, α3 and α4 along the time axis.)

7.3.3 Operational Phases
In general, a driver's lifetime can be split into three disjoint phases: initialization, working and clean up phases, as seen in Figure 7.3. In the initialization phase the driver sets up required data structures and registers its presence with the OS. Thereafter follows the working phase, where the driver performs work on behalf of applications or the OS itself. Finally, the clean up phase unregisters the driver with the OS and releases any resources held.
The operational phases become relevant when discussing selection of call blocks for injection. Intuitively it can be expected that failures in the initialization and clean up phases are more severe than in the working phase, as in these phases the driver interacts with many OS services which may affect the state not only of the driver but the whole system. The working phase on the other hand, is where drivers spend the most time (may therefore have been more extensively tested) and perturbations may be expected and therefore considered by developers.

Figure 7.3: The operational phases of a driver. (Initialization phase, Operational phase and Clean up phase along the time axis.)
In this work we are mostly focusing on the driver's operational phase. However, with application level knowledge, similar phases can be defined also for the workload used. Looking at the workload used for the case study (Section 4.5.4), the driver specific test applications can be decomposed into two rounds of initialization, working and clean up phases, as illustrated in Figure 7.4. Note that this is specific to these test applications and requires access to and knowledge of the applications.

Figure 7.4: The operational phases of the workload. (Round 1 and Round 2, each consisting of Initialization, Operational and Clean up phases along the time axis.)

Each call block is targeted for injection, i.e., each operation performed by the driver. For call blocks that are repeated multiple times a filtering may take place, to reduce the number of injections. Preferably at least one call block per operational phase should in this case be targeted. For each call block, the first-occurrence approach can be used on each service called in that call block. Note that typically not all services are called in each call block, giving rise to some call blocks requiring many injections and some few.

7.4 Experimental Setup
To evaluate the usefulness of the proposed approach an implementation has been made for Windows CE .Net. The serial port driver and the Ethernet driver were selected and call blocks were derived for both drivers.

The proposed approach will be compared to a traditional first-occurrence approach. The main criteria used for the comparison will be the number of injections required, the failure class distribution observed and the number of severe vulnerabilities observed for each of the two approaches.

This section first presents the two drivers and the injection strategy, and then details the call blocks identified for each of the drivers.

7.4.1 Targeted Drivers
Two drivers are selected for this case study, the serial port driver (cerfioserial) and the Ethernet driver (91C111). The two drivers are well suited for evaluation as they a) represent functionality found in all modern OS's, and b) represent different functionalities, giving rise to a different usage profile of the OS.

The difference in OS usage profile can be illustrated by studying the frequencies at which OS services are called by the drivers for some workload, in our case the test applications described in Section 4.5.4. Figures 7.5 and 7.6 show the difference in profile for the two drivers. The x-axes show the services called by the two drivers and the y-axes show the number of calls made to each service. The two figures clearly show that cerfioserial calls a higher number of services and more frequently than 91C111.

Figure 7.5: Call profile for cerfioserial. (x-axis: Services; y-axis: Nr of invocations.)

Figure 7.6: Call profile of 91C111. (x-axis: Services; y-axis: Nr of invocations.)
cerfioserial uses 41 services. On average a service is invoked 30.5 times for the given workload with a standard deviation of 53.5 and median of 2. This reflects the fact that there are some services that are used frequently (for reading/writing, synchronization etc.) and some only once or twice (like configuration of the device). For 91C111 the average number of invocations is 5.4 with a median of 1 and standard deviation of 11.7.

Both figures show that many services are called multiple times, indicating that first-occurrence may not find all vulnerabilities. The difference between the two drivers also suggests that as more services are called frequently for cerfioserial it is to be expected that the call block approach will be more effective for this driver.

Figure 7.7: The experimental setup. (Components shown: Test Applications, Operating System & libs, Tracker, Injector and Target driver; on the Host Computer, the Experiment Manager handling experiment setup, experiment synchronization, logging and restarting.)

7.4.2 Error Model
Building on the results from Chapter 6 the bit-flip (BF) model was chosen to evaluate the two approaches. The model was chosen as it is the most vulnerability-revealing model of the ones evaluated in Chapter 6 and will therefore better explore the potential of both approaches. Focus will be put on the most severe class of failures and experiments are therefore focused on the import interface of the drivers, which was the only one to experience Class 3 failures in Chapter 6.
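
For readers unfamiliar with the bit-flip model, the sketch below shows the basic operation on one parameter value. It assumes a 32-bit parameter and one injection candidate per bit position; both the width and the function names are assumptions made for the illustration, and the actual model is defined in Chapter 6.

```python
def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Return `value` with a single bit inverted (width is an assumption)."""
    if not 0 <= bit < width:
        raise ValueError("bit position outside parameter width")
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


def bit_flip_variants(value: int, width: int = 32):
    """All corrupted values obtainable by flipping exactly one bit."""
    return [flip_bit(value, bit, width) for bit in range(width)]


# Hypothetical parameter value passed in a service call.
original = 0x0000_1000
variants = bit_flip_variants(original)
```

Each variant differs from the original in exactly one bit, so a single targeted call gives rise to as many injection experiments as the parameter has bits.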

7.4.3 Injection
The experimental setup used for the injection differs slightly from the one used in the previous chapter in that a new Interceptor module (tracker) has been introduced between the OS and the targeted driver. The experimental setup is shown in Figure 7.7. For the call block approach the injector is configured to inject errors when the tracker signals that the targeted call block is reached. Apart from these changes the setup remains the same as previously described.

Before injections can be made, a profiling execution of the driver is performed. During this error-free execution the tracker module records all calls being made to the driver, i.e., it records the call string for the driver. For the two drivers and the workload used for these experiments the call strings were deterministic, i.e., every time the workload was executed without injecting errors the same call string was generated. This is a simplification and a result of choosing a simple and deterministic workload. Using a deterministic workload reduced the triggering of injections in a specific call block to counting calls made to OS services. Section 7.6 further discusses this issue.
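
With a deterministic call string, locating a call block reduces to counting. The sketch below illustrates the idea with a counter over driver entry-point calls; this is a simplification chosen for the example, and the class, its methods and the block layout are hypothetical rather than the tracker's actual interface.

```python
def block_windows(blocks):
    """Map each call block instance to its (start, end) index range."""
    windows, position = {}, 0
    for name, tokens in blocks:
        windows[name] = (position, position + len(tokens))
        position += len(tokens)
    return windows


class Tracker:
    """Counts calls and reports whether the targeted block is active."""

    def __init__(self, window):
        self.start, self.end = window
        self.count = 0

    def on_driver_call(self) -> bool:
        active = self.start <= self.count < self.end
        self.count += 1
        return active


# Call block layout derived from a profiling run (serial driver tokens).
blocks = [("delta", "D"), ("alpha", "0"), ("beta1", "2775"),
          ("gamma1", "747" * 23), ("omega1", "73"),
          ("beta2", "2775"), ("gamma2", "747" * 23), ("omega2", "73")]
windows = block_windows(blocks)

tracker = Tracker(windows["beta2"])
total_calls = sum(len(tokens) for _, tokens in blocks)
active_calls = [i for i in range(total_calls) if tracker.on_driver_call()]
```

The injector would only be armed while the tracker reports the targeted block as active; the approach breaks down as soon as the call string differs between runs.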

7.4.4 Call Strings and Call Blocks
This section reports on the call strings and call blocks identified for the two targeted drivers.

Serial Port Driver
The workload for cerfioserial first writes a string of characters to the serial port, which is read by the host computer connected to it. The host computer echoes the same string back and the characters are read one by one by the application. This process is then repeated once more. The workload generates calls to the driver, forming the call string shown in Figure 7.8, represented as a regular expression. In total the call string for the serial port driver contains 152 tokens.

The first call performed is an initialization call to the driver, DllMain. Such an entry point exists for each driver. After this follows a series of calls to perform the services requested.
The tokens used in Figure 7.8 are listed in Table 7.1, which shows the Stream interface entry points provided by cerfioserial. Additionally, the DllMain function called when the Dll is loaded is assigned the token D. More detailed information on the Stream interface and device drivers for Windows CE .Net can be found in [Boling, 2003].

Table 7.1: Stream interface for serial driver.
Number  Name            Purpose
0       COMInit         Initializing the driver.
1       COMDeinit       Deprecated.
2       COMOpen         Open a connection to the device or a file.
3       COMClose        Close a previously opened connection or file.
4       COMRead         Read from an open connection or file.
5       COMWrite        Write to an open connection or file.
6       COMSeek         Move within the file. Usually does not work on connection-oriented devices.
7       COMIOControl    Send control commands to the device. These are typically device specific.
8       COMPowerDown    Tell the device to move to a power saving state.
9       COMPowerUp      Tell the device to come back from power saving states.

D02775(747){23}732775(747){23}73
Figure 7.8: The serial driver call string.

The calls to DllMain and to COMInit make up the initialization phase of the driver. The working phase (which of course is workload dependent) consists of a series of calls to COMOpen, COMRead, COMWrite, COMClose and COMIOControl. The clean up phase of the workload finishes with the call to COMClose. The pattern is then repeated once more.

The call string in Figure 7.8 is split into call blocks, as illustrated in Table 7.2 and Figure 7.9. Five call blocks are identified (δ, α, β, γ and ω), some of which are repeating. For the remainder of the presentation in this chapter we term the targeted call blocks δ, α, β1, γ1, ω1, β2, γ2 and ω2 as shown along the x-axis in Figure 7.9. Note that the x-axis indicates time as sequences of call blocks, not physical time, i.e., the length of a box does not represent the execution time of the service.

Table 7.2: Call blocks for cerfioserial.
Call block  Tokens    Occurrences
δ           pre-load  1
α           0         1
β           2775      2
γ           747       46
ω           73        2
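
The figures quoted above can be cross-checked mechanically. The following sketch expands the call string of Figure 7.8 and counts tokens and call block occurrences; the helper name and the tokenizing expression are illustrative assumptions, and the tokenization relies on the call string being exactly a concatenation of the blocks of Table 7.2.

```python
import re
from collections import Counter


def expand(call_string):
    """Expand '(xyz){n}' groups into n literal repetitions."""
    return re.sub(r"\((\w+)\)\{(\d+)\}",
                  lambda m: m.group(1) * int(m.group(2)),
                  call_string)


serial = expand("D02775(747){23}732775(747){23}73")
token_count = len(serial)

# Split the expanded string into the call blocks of Table 7.2
# (D = delta, 0 = alpha, 2775 = beta, 747 = gamma, 73 = omega).
block_counts = Counter(re.findall(r"D|0|2775|747|73", serial))
```

The expansion yields 152 tokens and 46 occurrences of the block 747, in agreement with the text and Table 7.2.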
For the call blocks that reoccur once (β and ω) we target each instance of the call block. Call block γ repeats all in all 46 times, and targeting each instance of them would clearly be very time consuming. Therefore, we target one instance of γ in each repeating sequence. For the first sequence we target the first instance and for the second we have chosen one instance arbitrarily (the sixth).

Figure 7.9: The call blocks for the serial port. (x-axis: time, as the sequence of call blocks δ, α, β1, γ1, ω1, β2, γ2 and ω2 with tokens D, 0, 2775, 747...747, 73, 2775, 747...747, 73; y-axis: Call Block.)

Ethernet Driver
The network card driver workload works in a similar fashion to the cerfioserial workload. A message is sent over the network and is echoed back by the host computer. However, as the NDIS wrapper is used instead of the Stream interface for the Ethernet driver, a slightly different tracker mechanism is used (the driver exports Stream functions as well though, as part of being a proper driver).

Table 7.3: NDIS callback functions for the passthrough wrapper.
Number  Name
1       MiniportInitialize
3       MiniportSend
3       MiniportSendPackets
4       MiniportQueryInformation
5       MiniportProcessSetPowerOid
6       MiniportSetInformation
7       MiniportReturnPacket
8       MiniportTransferData
9       MiniportHalt
10      MiniportReset
11      MiniportCancelSendPackets
12      MiniportDevicePnPEvent
13      MiniportAdapterShutdown

The NDIS architecture is a layered one, with the upper layer being protocol layers (TCP/IP etc.) and the lower layer being the device driver (termed miniport driver). The layered model, with its defined interfaces, makes it possible to introduce new filtering layers in-between existing layers (which is what some firewalls and anti-virus software do). A passthrough wrapper is implemented and added on top of the targeted miniport driver (91C111.dll). The passthrough wrapper, as the name suggests, does not alter the data in any way, but simply passes it through to the miniport driver. The purpose is only to track the calls made to the driver. The callback functions exported are summarized in Table 7.3. Less than ten functions were exercised for our workload, which allowed us to still use single numerical tokens for the call string. Note that this is no real limitation to the approach, since any alphabet could have been used.

D1(4){9}1(4){15}666(4){10}666444336663444(3){9}
Figure 7.10: The Ethernet driver call string.

The call string for the Ethernet driver is shown in Figure 7.10. Again, the first call is to DllMain. Additionally the driver has a specific setup export (DriverEntry) which together with DllMain forms the first entry in the call string (D).

Table 7.4: Call blocks for the Ethernet driver.
Call block  Tokens               Occurrences
δ           DllMain+DriverEntry  1
βα144444444444444412
2444666γ14444444µω33666636443333333331
As for cerfioserial a manual inspection gave rise to the call blocks illustrated in Figure 7.11. As the Ethernet driver gave rise to significantly fewer call blocks we target all call blocks using any services. For some call blocks no OS services were used, and consequently no injections were performed. Similarly, the last call block (ω) gives rise to no calls made to the OS. Therefore, it was not further split into call blocks.

Figure 7.11: The network driver call blocks. (x-axis: time; y-axis: Call Block, with the call blocks δ, α1, α2, β, γ1, µ, γ2 and ω.)

7.5 Result of Evaluation
Fault injection experiments were carried out for both drivers. Injections were performed using both the first-occurrence approach and the proposed call block approach. An overview of the results is shown in Table 7.5 and a more detailed view is shown in Table 7.6. Note the use of the previously introduced names for identifying the call blocks. The results are graphically illustrated in Figures 7.12 and 7.13. A discussion and further interpretation of these results is found in Section 7.6.

Table 7.5: Comparing the first-occurrence and call block approaches on the number of injections and the number of experiment outcomes in the most severe failure class (Class 3).
                  Serial driver           Network driver
Trigger           #Injections  #Class 3   #Injections  #Class 3
First occurrence  2428         10         1818         12
Call blocks       8408         13         2356         12

7.5.1 Serial Port Driver
The distribution of outcomes across failure classes is shown in Figure 7.12. For comparison purposes the outcomes of the first-occurrence injections are shown as well. Additionally the number of injections for each call block is shown as opaque, black bars. For illustrative purposes the Class NF is not shown, but all data is found in Table 7.6.

Call blocks γ1 and γ2 exhibit the lowest number of Class 3 failures. These call blocks correspond to the working phase of the driver, where it is only sending and receiving data. Since the work done in this phase corresponds to the main part of a driver's lifetime, it is reasonable to expect it to be sufficiently well specified and understood for the OS to be implemented to tolerate many fluctuations in device/driver behavior. A small difference in Class 2 behavior can also be observed, with γ2 having a slightly higher ratio.

Compared to the working phase, the initialization phase of the driver shows a higher ratio of Class 3 failures (δ and α). Similarly, the initialization phase of the test application (β) shows a high ratio of Class 3 failures. The same holds for the clean up phase of the test application (ω). Whereas ω1 and ω2 show close to identical distributions, β2 shows more Class 1 failures than β1.

Figure 7.12: Failure class distribution and number of injections for the first-occurrence approach and for each call block of cerfioserial. (x-axis: Call blocks FO, δ, α, β1, γ1, ω1, β2, γ2, ω2; left y-axis: Failure class distributions in percent for Class 1, Class 2 and Class 3; right y-axis: Number of injections.)
From Figure 7.12 it can also be seen that the first initialization call block δ is less prone to Class 3 failures than the second call block in the initialization phase, α. When DllMain is called (i.e., in δ) driver developers are discouraged (in the documentation) from including any time consuming or complex operations, restricting it to initialization of synchronization objects and other lightweight operations. This minimizes calls made to the OS in this critical phase of the system. Consequently we observe fewer Class 3 failures. However, as these operations may now fail, with the whole driver being unable to make progress, we see a rise in Class 2 failures instead compared to α.

Comparing with the first-occurrence injections it can clearly be seen that the first-occurrence approach gives very similar results to the call blocks in the initialization phase. This behavior is of course expected, since it injects in the first call to each service used by the driver.

Not only the distribution across failure classes is of interest, but also the number of services found to have severe (Class 3) failures. Table 7.7 presents the result for cerfioserial. From the 41 services being targeted, 13 services caused Class 3 failures. Compared with the first-occurrence approach this is an increase of three services, i.e., three services not previously found to have Class 3 failures were identified. This indicates that choosing the triggering event for injection for the serial port driver has a significant impact on the results obtained and that first-occurrence is not sufficient for a comprehensive evaluation.

Table 7.6: Injection results for the serial (cerfioserial.dll) and the network card driver (91C111.dll).
Driver            Call block  #Injections  NF #   NF %      C1 #  C1 %     C2 #  C2 %     C3 #  C3 %
cerfioserial.dll  FO          2428         1889   76.98%    35    1.44%    450   18.53%   74    3.05%
cerfioserial.dll  δ           1217         923    75.84%    18    1.48%    269   22.10%   7     0.58%
cerfioserial.dll  α           1098         969    88.25%    9     0.82%    103   9.38%    17    1.55%
cerfioserial.dll  β1          1356         1082   79.79%    5     0.37%    218   16.08%   51    3.76%
cerfioserial.dll  γ1          580          565    97.41%    2     0.34%    13    2.24%    0     0.00%
cerfioserial.dll  ω1          1124         979    87.10%    103   9.16%    24    2.14%    18    1.60%
cerfioserial.dll  β2          1328         979    73.72%    161   12.12%   145   10.92%   43    3.24%
cerfioserial.dll  γ2          580          550    94.83%    2     0.34%    28    4.83%    0     0.00%
cerfioserial.dll  ω2          1125         984    87.47%    90    8.00%    32    2.84%    19    1.69%
91C111.dll        FO          1818         1288   70.85%    416   22.88%   50    2.75%    64    3.52%
91C111.dll        δ           1756         1159   66.00%    494   28.13%   32    1.82%    71    4.04%
91C111.dll        α1          96           96     100.00%   0     0.00%    0     0.00%    0     0.00%
91C111.dll        α2          171          101    59.06%    65    38.01%   1     0.58%    4     2.34%
91C111.dll        β           171          99     57.89%    65    38.01%   4     2.34%    3     1.75%
91C111.dll        γ1          171          102    59.65%    64    37.43%   2     1.17%    3     1.75%
91C111.dll        µ           0            -      -         -     -        -     -        -     -
91C111.dll        γ2          0            -      -         -     -        -     -        -     -
91C111.dll        ω           0            -      -         -     -        -     -        -     -

Table 7.7: The services having Class 3 failures for cerfioserial. Services not identified by first-occurrence are marked with X.
Service/Call block  FO  δ  α  β1  γ1  ω1  β2  γ2  ω2
xxxCreateThreadEventModifyDisableThreadLibraryCallsxxXX
xxreeLibraryFxxranslateBusAddressHalTXInitializeCriticalSectionInterlocLoadLibraryWkedDecrementxxX
LomemcpcalAlloycxxxxx
xxxmemsetxMmMapIoSpaceTSetProcPransBusAddrTermissionsoStaticxXxx
xxx
X

7.5.2 Ethernet Driver
Table 7.6 and Figure 7.13 show that call block δ shows a very similar distribution to the first-occurrence approach. For call blocks µ, γ2 and ω the driver does not perform any calls to the OS, and consequently no injections were performed and these call blocks are not shown in Figure 7.13.

The call blocks α2, β and γ1 show a very similar behavior, due to the fact that only one service is used for all three call blocks. Therefore the same number of injections is performed. No injections in call block α1 show any Class 3 failures. Overall no new services were found to have Class 3 failures (see Table 7.5).

7.6 Discussion
This section interprets and discusses the results of the case study. Focus is mostly on the most severe class of failures, Class 3.

Figure 7.13: Failure class distribution and number of injections for each call block of 91C111. Call blocks without injections are excluded. (x-axis: Call blocks FO, δ, α1, α2, β, γ1; left y-axis: Failure class distributions [%] for Class 1, Class 2 and Class 3; right y-axis: #Injections.)

7.6.1 Difference in Driver Types
The results for the two drivers differ significantly. For the serial driver three new services were found to experience Class 3 failures. The Ethernet driver on the other hand did not get any additional service vulnerabilities with the new call block approach.

Section 7.4.1 shows that the serial driver uses more services, and more frequently than the Ethernet driver, suggesting that it would be more susceptible to call block injections, and this is indeed also the case. For the Ethernet driver no new Class 3 failures were identified and the δ call block shows a very similar behavior to the first-occurrence injections, further substantiating the intuition.

Further analysis on the calls made for each call block is presented in Figures 7.14 and 7.15. Figure 7.14 shows the call profile for cerfioserial, i.e., the number of calls made by the driver for each of the call blocks identified. It can clearly be observed that the calls made are spread throughout the lifetime of the driver, with call block β having the most calls.

Figure 7.14: The call profile for the serial driver. (x-axis: call blocks δ, α, β1, γ1, ω1, β2, γ2, ω2; y-axis: Nr of calls.)

Figure 7.15 shows the call profile for the Ethernet driver. It shows that the Ethernet driver is more active in the initialization phase of the driver (to the left in the figures) than in the working phase. Compared with cerfioserial in Figure 7.14 the distribution of calls is clearly geared towards the initialization phase.

This explains why the call block approach finds more new vulnerabilities for the serial driver than for the Ethernet driver. It confirms that for drivers which perform few calls, and especially during the initialization phase, first-occurrence is to be preferred.
The profiling of the drivers shows that the effectiveness of the approach can to some degree be predicted and suggests that a profiling of the targeted drivers should be conducted before triggering techniques are selected, to minimize the time for implementation and number of injections required. Note that such profiling can be conducted prior to injecting any errors. However, profiles such as Figures 7.14 and 7.15 do require call blocks to be defined.

Figure 7.15: The call profile for the Ethernet driver. (x-axis: call blocks δ, α1, α2, β, γ1, µ, γ2, ω; y-axis: Nr of calls.)

7.6.2 Comparing with First Occurrence
The first-occurrence approach has several distinct advantages compared to the call block approach. First and foremost it uses fewer injections. It is also appropriate when doing code level injections (Table 7.5), where the location of the modeled fault coincides with the injected error.

The higher number of injections for the call block approach translates into longer time required for performing the evaluation. This can be somewhat lessened by performing pre-profiling to remove non-significant experiments. Unfortunately, the higher number of injections is inherently inevitable since the call block approach injects in multiple invocations of a service, whereas first-occurrence only injects in the first call to a service.

The additional costs give rise to a trade-off with the usefulness of the results. New service vulnerabilities can indeed be identified using the call block approach, as shown in Section 7.5. First-occurrence found ten Class 3 services, whereas the call block injections found thirteen. This represents an increase of 30%. On the other hand, for the Ethernet driver no additional services are identified due to its call profile being geared towards the initial phase.
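
The trade-off for the serial driver can be quantified directly from the numbers in Table 7.5. The short calculation below is only an arithmetic illustration; the variable names are chosen for the example.

```python
# Serial driver figures from Table 7.5.
fo_injections, cb_injections = 2428, 8408
fo_services, cb_services = 10, 13   # services with Class 3 failures

# Relative cost: how many times more injections call blocks require.
injection_ratio = cb_injections / fo_injections

# Relative benefit: gain in services found to have Class 3 failures.
service_gain = (cb_services - fo_services) / fo_services

# Extra injections spent per additionally identified service.
cost_per_new_service = ((cb_injections - fo_injections)
                        / (cb_services - fo_services))

summary = (f"{injection_ratio:.2f}x injections, "
           f"{service_gain:.0%} more services, "
           f"{cost_per_new_service:.0f} extra injections per new service")
```

For these figures the call block approach needs roughly three and a half times as many injections for the 30% gain in identified services.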

7.6.3 Identifying Call Blocks
The length of a call string varies depending on the workload used to generate it. Manual inspection was sufficient to identify call blocks for the experiments presented in this chapter. However, this will not be feasible for longer call strings, for which some level of automation is required.

The repeating nature of call blocks is what makes them identifiable in the call string. Identifying a repeating sequence of tokens from a given alphabet in a string is a well studied problem. One example is the identification of repeating sequences in DNA. Multiple data structures and algorithms have been devised for this purpose. Many examples of such problems can be found for instance in [Gusfield, 1997]. We have explored the use of suffix trees, a special tree data structure, which finds repeating sequences quickly. However, further research is needed to develop appropriate tools and techniques to fully take advantage of this data structure.
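
To give a flavor of what automated identification could look like, the sketch below collapses directly repeated token sequences into the regular-expression notation used in this chapter. It is a simple greedy scan, not a suffix tree, and the function name and the bounds on block length are assumptions made for the example.

```python
def compress(call_string, min_len=2, max_len=4):
    """Collapse tandem repeats of min_len..max_len tokens into (unit){n}."""
    out, i, n = [], 0, len(call_string)
    while i < n:
        for size in range(min_len, max_len + 1):
            unit = call_string[i:i + size]
            reps = 1
            # Count how often the unit repeats back-to-back from position i.
            while call_string[i + reps * size:i + (reps + 1) * size] == unit:
                reps += 1
            if reps > 1:
                out.append(f"({unit}){{{reps}}}")
                i += reps * size
                break
        else:
            out.append(call_string[i])  # no repeat starts here
            i += 1
    return "".join(out)


serial = "D0" + "2775" + "747" * 23 + "73" + "2775" + "747" * 23 + "73"
serial_compressed = compress(serial)
example_compressed = compress("ababcdabdab")
```

For the serial driver the scan reproduces the call string of Figure 7.8. For the small example of Figure 7.1 it yields (ab){2}c(dab){2}, a valid but different grouping than the manually chosen blocks, which illustrates why a priori knowledge of the driver's functionality is currently combined with the search for repeats.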
Figures 7.12 and 7.13 show that some call blocks are more useful in identifying Class 3 failures. These call blocks typically belong to either the initialization or clean up phases of the driver and workload. By focusing mostly on the initialization (like first-occurrence) and clean up phases one could potentially reduce the number of injections required.

7.6.4 Workload
With operational phases and call blocks forming the basis for the usage profile it is important to identify representative workloads to be used to drive experiments, i.e., workloads representative of the system's expected workload once it becomes operational. In many cases no, or only partial information is available regarding the expected operational profile. If it is not known, a common approach is to use a synthetic workload, exercising the system in a diverse way [Johansson, 2001]. The alternative to using a synthetic workload is to use real-world applications. However, many applications are not suitable to be used directly, due to required user inputs, non-determinism etc.
As call blocks represent higher level operations carried out on the driver it is important that the call string is stable across runs, i.e., that the same call string is generated every time the workload is executed. The workload used in this thesis is indeed stable and deterministic, and gives rise to the same call string for each run. This is an important property for any workload used for comparative purposes and is a well established approach in the benchmarking community, like the standard tests in SPEC [SPE]. A deterministic workload is key for achieving reproducibility, an important property identified for dependability benchmarks [Johansson, 2001; Kanoun et al., 2005]. As one still strives after using real-world workloads, an approach is to modify applications by removing sources of non-determinism, such as user inputs, by using specific user scenarios or use cases.
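Whether a workload yields a stable call string can be checked mechanically by recording the call string of several runs and comparing them. The sketch below does this by hashing each recorded call string. The function names and the example runs are hypothetical, and the sketch assumes that each call is logged as a plain service name without embedded newlines.

```python
import hashlib

def call_string_digest(call_string):
    """Hash a call string so runs can be compared cheaply."""
    joined = "\n".join(call_string).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()

def is_stable(runs):
    """True if every run produced exactly the same call string."""
    return len({call_string_digest(run) for run in runs}) <= 1

# Hypothetical recorded runs of the same workload.
run_a = ["init", "open", "read", "close"]
run_b = ["init", "open", "read", "close"]
run_c = ["init", "open", "close"]
```

With these runs, `is_stable([run_a, run_b])` holds while `is_stable([run_a, run_c])` does not, which would flag the workload behind `run_c` as unsuitable for comparative experiments.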
Another source of non-determinism for device drivers is the fact that they can, in general, be accessed by several applications concurrently. Depending on the semantics of the device (or driver) this may be supported or not. A serial driver does for instance not generally accept concurrent accesses, whereas a network card driver typically does. For the experiments in this thesis we have deliberately focused on single-access application scenarios. This is a simplification and extending the approach to handle concurrency is a topic for future research. Concurrent accesses would give rise to interleaved tokens in the call string, belonging to different processes/tasks. Furthermore, the implementation of triggers will be more complex than the current one (based on simple counters), since online pattern recognition is required. Whether concurrent access should at all be considered for comparative (benchmarking) purposes can be discussed, as it gives rise to potential issues with reproducibility.
7.6.5 Error Duration
As previously described we have used a transient duration model for the injected errors, i.e., each error appears once and then disappears for subsequent invocations of the same services. Our injection framework does support intermittent (error disappears after n invocations) as well as permanent errors. However, none of these models have been evaluated yet.
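The three duration models differ only in the invocations on which the error is present once the trigger has fired. The following sketch expresses that difference as a single predicate. It is an illustration of the definitions given above, not the framework's implementation; the function name, the string labels and the default value of `n` are assumptions.

```python
def should_inject(duration, invocation, n=3):
    """Decide whether the error is present on a given invocation.

    invocation counts calls to the targeted service, starting at 1
    for the call on which the trigger fires.
    """
    if duration == "transient":
        return invocation == 1        # appears once, then disappears
    if duration == "intermittent":
        return invocation <= n        # disappears after n invocations
    if duration == "permanent":
        return True                   # present on every invocation
    raise ValueError(f"unknown duration model: {duration}")

# Presence of the error over five consecutive invocations per model.
pattern = {d: [should_inject(d, i) for i in range(1, 6)]
           for d in ("transient", "intermittent", "permanent")}
```

For five invocations, the transient model injects only on the first call, the intermittent model (with n = 3) on the first three, and the permanent model on all five.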
7.6.6 Timing Errors
For this work we have considered the time of injection. This is not to be confused with timing errors, which change the timing behavior of the system by for instance delaying or dropping calls being issued. Since many devices require specific timing requirements to be met, this may be a relevant error model for evaluating device drivers and OS's. Experimental support for this error model is implemented in the fault injection framework developed, but no systematic evaluation has yet been carried out using this model.
7.7 Related Work
The call block strategy is one of several that use a profile of the system to trigger injection. In [Tsai et al., 1999] stress-based injections are performed, where injection is synchronized with high workload activity in the system. Similarly, application resource usage is analyzed to guide injection into resources actively used.
Tsai and Singh [2000] used a setup very similar to ours, but with the intent to test applications on Windows NT by corruption of parameter values to library calls. The first-occurrence strategy is used, and a comment is made regarding injection in subsequent calls: "...preliminary results showed that such injections produced similar results." [Tsai and Singh, 2000, page 4]. We believe that this assertion does not generally hold. However, the results for the network driver show that, depending on the call profile of the targeted component, different behaviors are observed. This suggests that more research is needed to completely characterize for which components first-occurrence is most suitable and for which it is not.
In this work we do not consider distributed systems explicitly. For distributed systems the concept of global state is of key importance and one may want to inject errors at particular local and/or global states. Whereas the technique presented here is suitable for local states, it does not handle global states of distributed systems, where even detection of specific global events/states is difficult. Some work has been done within this specific area, for instance the Loki tool [Chandra et al., 2004].
7.8 Summary of Research Contributions
The time at which a fault is injected can have an impact on the robustness evaluation of a system. In the context of device drivers for OS's, this chapter has established that selection of the triggering events (controlling error timing) does impact the evaluation results. Furthermore, it is shown that a profiling of the driver reveals its sensitivity to the timing of injections.
The following distinct contributions are put forward in this chapter:
• A novel timing model for device drivers is presented. The new model is based on the concept of a call block, a subsequence of calls to the driver corresponding to higher level operations.
• It is detailed how the proposed model can be used to identify injection triggers for device drivers used in fault injection.
• A large case study for two device drivers shows that selecting the error timing impacts the robustness evaluation, especially for drivers which actively use OS services throughout their lifetime.
Chapter 8
Conclusion & Future Research
What have we learned, and how do we move forward?
This chapter concludes the thesis by summarizing its main contributions and discussing their relevance to the research community.
Additionally, this chapter aims to broaden the scope of the techniques used by surveying and discussing their usability in enhancing the dependability of OS's. Different areas of dependability enhancements are discussed, covering both fault-removal and fault-tolerance techniques.
The work in this thesis forms the basis for many interesting new research directions. The thesis is therefore concluded by scoping out multiple future directions of research. This includes both refinements and extensions as well as new exciting problems warranting further research.
8.1 Contributions
This section recaps the research questions posed in Chapter 1 and discusses the individual contributions made and their relevance.
8.1.1 Category 1: Conceptual
Research Question 1: How do errors in device drivers propagate in an OS? What is a good model for identification of such propagation paths?
Errors in device drivers have been shown to cause many failures in OS's and to propagate to applications running on the OS. Our model of an OS allows us to capture both errors causing severe failures in the system and those data errors propagating to applications through the OS.
The results presented in Chapter 6 clearly show that there is a significant difference across services regarding error propagation. Some services are identified to be robust, i.e., severe system states cannot be provoked through them. Other services can lead to severe consequences, including a complete system crash. Furthermore, it is shown that the context in which the service is used, e.g., the driver using it, has a significant impact on its damage potential.
Research Question 2: What are quantifiable measures of robustness profiling of OS's?
Chapter 5 presents a framework with which error propagation across the OS can be estimated. The measures presented allow for identifying individual services more vulnerable to propagating errors, either as sources or sinks. As such they are useful for discriminating services based on susceptibility to propagating errors, which gives developers hints on which services are more likely to experience problems during runtime. Furthermore, they allow discrimination across drivers and applications, which can be used to prioritize across multiple contending drivers/applications or for verification resource planning by managers. Components with higher exposure or diffusion should be the first targets for improvements.
Overall, the measures defined provide evaluators with the tools to make informed decisions, on various levels.
8.1.2 Category 2: Experimental Validation
Research Question 3: Where to inject? Where are errors representing faults in drivers best injected? What are the advantages and disadvantages of different locations?
The use of standard interfaces is beneficial since it facilitates both injection of errors and observing their effects with minimal intrusion on the system studied. It gives clear and easily interpretable feedback to the evaluator on where errors propagate and which services and drivers are more likely to spread them if they are present.
Research Question 4: What to inject? Which error model should be used for robustness evaluation? What are the trade-offs that can be made?
Chapter 6 evaluates three contemporary error models, chosen based on their suitability for injection at the OS-Driver interface and their prior use for this purpose. The contributions here are two-fold. First, to the best of our knowledge this thesis is the first comprehensive comparison across multiple error models at the interface level. This study highlights the strengths and weaknesses of the models and the differences across them. Secondly, our comparison evaluates the models on the number of identified failures, coverage of services, execution time, efficiency and implementation complexity. This allows selecting the most appropriate error model, based on trade-offs across the parameters.
For the case study performed, bit-flips reveal the most severe (Class 3) failures and provoke failures in the highest number of services. Additional services having severe failures can be identified using the Fuzzing error model. The smallest number of injections was incurred by the data type error model.
Chapter 6 further shows how a new composite model can be defined, combining the bit-flip and Fuzzing error models to achieve a good trade-off between efficiency, coverage and number of injections performed.
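As a rough illustration of what the two simpler error models do to a single integer parameter, the sketch below generates bit-flip and fuzzing values and combines them. The 32-bit width, the function names, the choice of which bits to flip and the number of fuzzing values are assumptions for the example; how the composite model actually selects its injections is defined in Chapter 6 and is not reproduced here.

```python
import random

WIDTH = 32  # assumed parameter width in bits

def bit_flips(value, bits=range(WIDTH)):
    """One corrupted value per flipped bit position."""
    return [value ^ (1 << b) for b in bits]

def fuzz(count, rng):
    """Random replacement values, independent of the original."""
    return [rng.getrandbits(WIDTH) for _ in range(count)]

def composite(value, flip_bits, fuzz_count, rng):
    """Illustrative combination: selected bit-flips plus a few fuzz values."""
    return bit_flips(value, flip_bits) + fuzz(fuzz_count, rng)

rng = random.Random(0)  # fixed seed keeps the campaign reproducible
cases = composite(0x0000_0010, flip_bits=range(8), fuzz_count=4, rng=rng)
```

Restricting the flipped bits and adding only a handful of random values keeps the number of injections per parameter small (12 here, compared with 32 for exhaustive single bit-flips), which is the kind of trade-off between coverage and number of injections discussed above.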
Research Question 5: When to inject? Which timing model should be used for injection?
The timing model used controls the time at which errors are injected, i.e., the events triggering injections. Classically errors are injected either on first-occurrence, i.e., the first time a service is called, or injected according to some predefined time distribution. Chapter 7 proposes a novel timing model based on the usage profile of the component interface, in this case the OS-Driver interface. By first profiling the operations performed on the driver, injections can be concentrated on operations using the concept of call blocks, i.e., repeating sequences of calls.
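The difference between the two trigger styles can be sketched on a profiled call string. The example below assumes a deterministic workload, so that an index into the call string can serve as a simple counter-based trigger, and it assumes that within each call block the first call to the targeted service is the one selected for injection. The function names, the half-open block ranges and the service names are hypothetical.

```python
def first_occurrence_trigger(call_string, service):
    """Index of the first call to `service`, or None if never called."""
    for i, call in enumerate(call_string):
        if call == service:
            return i
    return None

def call_block_triggers(call_string, service, blocks):
    """One injection point per call block: the first call to `service`
    inside each (start, end) index range found by profiling."""
    points = []
    for start, end in blocks:
        for i in range(start, end):
            if call_string[i] == service:
                points.append(i)
                break
    return points

# Hypothetical profile: tokens are service names, blocks are half-open ranges.
profile = ["alloc", "map", "alloc", "send", "alloc", "send", "free"]
blocks = [(0, 2), (2, 4), (4, 6), (6, 7)]
first = first_occurrence_trigger(profile, "alloc")
per_block = call_block_triggers(profile, "alloc", blocks)
```

First-occurrence yields a single injection point (index 0), whereas the call block trigger yields one point in each block that uses the service (indices 0, 2 and 4), so the same service is also exercised during later operations.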
The new timing model allows for more focused injections, giving more comprehensive results, without requiring deep knowledge of the services in the OS-Driver interface. Furthermore, profiling of drivers reveals that certain types of drivers are more sensitive in the initialization phase, whereas some are sensitive also in the working and cleanup phases, suggesting that the first-occurrence approach may be more suitable for the former case.
8.1.3 Injection Framework
For carrying out all the fault injection experiments required, a flexible and scalable fault injection framework for Windows CE .Net has been implemented. The framework allows for easy and fast extension to new error models using a plug-in model for error models. The flexible architecture makes it easy to implement new error models and to incorporate new drivers.
8.2 Applications of Robustness Evaluation
This section illustrates how the robustness evaluation framework presented in the previous chapters can be used to enhance the robustness of an OS. Such evaluations can serve primarily three purposes: a) as support in the testing of systems, b) as a source for developer feedback, helping developers build more robust code by highlighting potential robustness bottlenecks using robustness profiles, and c) as input to active robustness enhancing activities, such as addition of error detection and recovery modules. The following sections detail each of these potential uses of robustness evaluation.
8.2.1 Robustness Profiling
Robustness evaluation of platforms, such as OS's, is useful because it can provide useful feedback to the developers of applications built on top of such platforms. By providing so-called robustness profiles, a developer is made aware of potential robustness vulnerabilities in the system and is better equipped to make decisions on which OS services to use and the consequences that might come from using them.
Robustness profiles give information on where potential robustness bottlenecks may exist in a system or component. The robustness profile can, for instance, consist of a subset of the measures presented in Chapter 5. The information gained can be used to raise awareness among developers on the consequences faults can have for specific services. When possible, developers can elect to use different services to achieve the same goal in a safer manner.
Robustness profiles can be used as input to testers, who can focus testing on those parts of the system using vulnerable services. Robustness profiles can also be used to focus code inspections and design reviews on those parts more likely to cause damage in the system.
8.2.2 Robustness Evaluation in Testing
Robustness evaluation can be considered a special branch of testing, where focus is put on the non-functional requirements of the system. Typically a robustness evaluation requires an acceptable level of functionality to be present in the system being evaluated, i.e., the used workload must successfully execute on the OS without any errors. This implies that functional testing of the system has been carried out.
There are three phases of testing where robustness evaluation focusing on device drivers may be of great assistance:
• Acceptance Testing: To verify that the OS and its drivers behave at a reasonable level
• Integration Testing: To verify that a driver can be integrated and used in the system
• Regression Testing: When major configuration changes have been performed the robustness of the system needs to be re-evaluated
Acceptance Testing
As part of the requirements for a specific software component, requirements on robustness may be included. This may involve specifying which services may propagate errors and/or at which severity, for instance by specifying that no service may cause a crash of the system, no matter which values it is used with. As such the presented robustness evaluation framework may be used to validate or invalidate such properties.
Integration Testing
When integrating components with each other one needs to make sure that interaction across components works as expected. Furthermore, one may be interested in evaluating the consequences faults in one component have on the other component(s). When misbehaving components are able to cause severe failures they may either require additional focused verification efforts or may need to be equipped with error handling capabilities.
The technique presented in this thesis is well suited for testing the integration of new drivers in the OS as it works on the OS-Driver interface level, which is where the interaction takes place. It is important to note though that the error propagation profiling is not aimed at evaluating drivers specifically, but the OS-Driver interaction. Therefore, detected vulnerabilities for a new driver should not automatically be "blamed" solely on the driver. Similarly, robustness profiling does not focus on the functionality of a specific driver, and is therefore a complement to functional testing, not a replacement.
Regression Testing
As components evolve as part of the development process their error propagation abilities may change, for better or worse. As part of regression testing campaigns error propagation can be evaluated such that new propagation paths are detected as soon as possible and may be treated appropriately.
The scalability and automation possibilities of interface-based fault injection make it excellent for regression testing. Using the pre-profiling approach described, the injections are adapted to any changes in workload on the system. Additional injections for new services are easily defined. New error models require minimal changes to the system before inclusion.
8.2.3 Robustness Enhancing Wrappers
In many cases modifications to system components exhibiting robustness vulnerabilities are not possible, or even desirable. This is for instance the case for pure black-box systems, where the lack of access to source code prohibits any modifications. Even with access to source code, legal reasons may prohibit modifying the code. Typical robustness enhancing modifications include addition of error checking and handling code, such as executable assertions [Voas and Miller, 1994b; Hiller, 2000]. For systems geared towards high performance it may not be viable to add time-consuming checks to the components involved, especially for general-purpose systems such as OS's.
An attractive alternative to modifying the involved components is to add new components "wrapping" the original component [Fraser et al., 1999; Ghosh et al., 1999; Mitchem et al., 2000]. Such wrappers can be added where needed and thus be applied on a policy basis or where likely to be most effective [Hiller et al., 2002a]. Error propagation analysis can be used to identify prominent propagation paths, such as in [Hiller et al., 2002a].
The data collected from fault injection experiments can be used to design assertions, which can be implemented as wrappers [Voas, 1997b; Whisnant et al., 2004]. It can also be used to enhance the wrapper design by other means.
Several research projects have looked into the use of wrappers for enhancing OS robustness and security. In [Arlat et al., 2002] the authors describe fault injection campaigns carried out on two microkernel-based OS's. Robustness enhancing wrappers are added to some functional components of the OS by formally defining predicates that must hold over the course of the execution. This requires access to internal states of the OS, which is implemented using reflection. It is noted that even when such access is possible, the definition of (correct) predicates is time-consuming and difficult. The required formal models of behavior may in practice make the approach extremely difficult to implement for general purpose COTS OS's. The authors propose to instead use operational consistency checks, such as acceptance or validity checks.
In [Fetzer and Xiao, 2002b,a] wrappers are used to track non-robust arguments to C libraries made by applications. Stateful wrappers keep track of memory allocations on the heap and stack and can verify that accesses are only made to allocated memory, presented first in [Fetzer and Xiao, 2001] (a similar technique is presented in [DeVale and Koopman, 2001] for exception hardening of I/O libraries). For memory not present on the heap or on the stack, signal handlers are set up to track any access violations. Furthermore, data structures are validated using existing validation functions provided by the system and, when such functions are not available, state information is kept similar to memory allocation to verify the correctness of arguments. If a violation is found, a safe return code is returned to the application.
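The general idea of a stateful wrapper, keeping track of live allocations and rejecting calls whose arguments fall outside them, can be sketched as follows. This is a simplified illustration in Python and not the implementation of the cited work, which wraps C library functions; the class, the function names, the error code and the addresses are all hypothetical.

```python
class AllocationTracker:
    """Stateful wrapper state: remembers which buffers are live."""

    def __init__(self):
        self._live = {}  # base address -> size

    def on_alloc(self, base, size):
        self._live[base] = size

    def on_free(self, base):
        self._live.pop(base, None)

    def is_valid(self, addr, length):
        """True if [addr, addr+length) lies inside one live buffer."""
        return any(base <= addr and addr + length <= base + size
                   for base, size in self._live.items())

ERROR_CODE = -1  # hypothetical "safe" return code

def wrapped_copy(tracker, dest, length, real_copy):
    """Check the argument before forwarding to the wrapped function."""
    if length < 0 or not tracker.is_valid(dest, length):
        return ERROR_CODE
    return real_copy(dest, length)

tracker = AllocationTracker()
tracker.on_alloc(base=0x1000, size=64)
ok = wrapped_copy(tracker, 0x1000, 64, lambda d, n: n)
overflow = wrapped_copy(tracker, 0x1000, 65, lambda d, n: n)
tracker.on_free(0x1000)
stale = wrapped_copy(tracker, 0x1000, 8, lambda d, n: n)
```

The first call is forwarded to the wrapped function, while the write past the end of the buffer and the write to freed memory are both answered with the error code instead of reaching it.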
Nooks is an add-on subsystem to an OS, protecting the OS from a vast majority of failures in device drivers [Swift et al., 2005]. Drivers are isolated (i.e., wrapped) within lightweight protection domains. All interaction with the kernel is tracked, both to isolate failures and to facilitate cleanup procedures. The protection is achieved by limiting a driver's write access to kernel memory and by kernel object tracking mechanisms. Nooks can protect against memory violations and kernel structure corruption. Fault injection was used to validate the approach and can be used to define specific parameter checks to improve failure isolation. In later work [Swift et al., 2006], the authors extended the recovery capabilities of the system by introducing shadow drivers. Shadow drivers temporarily take over while drivers are reloaded and restarted. The system also handles state information transfer to the newly started driver, making recovery transparent to the user of the system. The microreboot strategy is a promising recovery approach, as it avoids time consuming and possibly disruptive system reboots [Candea et al., 2004; Herder et al., 2007].
8.3 Outlook on the Future
This section reflects on the themes presented by discussing and speculating on future steps in research needed for them to further evolve.
8.3.1 Fault Injection Technology
The fault injection performed for this thesis is based on intercepting calls made in device drivers. Thus it is based on real calls made in the system, i.e., a workload is needed to generate the required calls. This is in contrast to approaches where test harnesses are set up to simulate operational conditions, where each service can be tested in isolation. The benefit of the latter approach is that more injections can be performed per time unit, but on the other hand it requires operational conditions to be set up, which may be difficult to do, especially for low-level system software, such as device drivers. Since both approaches have merits, a comprehensive evaluation of both on a larger project would give insights and guidance on where one is more useful than the other.
For the fault injection approach presented here to gain widespread acceptance and adoption it needs to be incorporated into a proper toolset. Such a toolset must minimize the semantical burden on the evaluator and automate the process of evaluation as much as possible, while still allowing for user-driven extensibility and scalability. These areas of the presented approach need to be handled by the tool:
• Profiling of the targeted driver, including identification of all used services and automatic generation of the injection wrapper.
• Provide a selection of error models that the evaluator can choose from, as well as a standard interface for adding custom error models.
• Automatically perform the injections and collect the required logs and store them in a database.
• Provide the evaluator with the mechanisms to automatically calculate relevant measures, including error propagation.
• Additionally, throughout the process data must be stored in open formats, e.g., XML, enabling integration with external tools and future enhancements.
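A standard interface for adding custom error models, as called for in the list above, could take the shape of a small plug-in contract plus a registry. The sketch below is one possible design, not the interface of the framework developed for this thesis; the class names, the registry and the two example models are hypothetical.

```python
from abc import ABC, abstractmethod

class ErrorModel(ABC):
    """Interface every error model plug-in implements (hypothetical)."""
    name: str

    @abstractmethod
    def corrupt(self, value: int) -> list[int]:
        """Return the corrupted values to inject for one parameter."""

REGISTRY: dict[str, ErrorModel] = {}

def register(model: ErrorModel) -> None:
    """Make a model selectable by name in an injection campaign."""
    REGISTRY[model.name] = model

class LowBitFlip(ErrorModel):
    name = "low-bit-flip"

    def corrupt(self, value: int) -> list[int]:
        return [value ^ 1]  # flip only the least significant bit

class Boundary(ErrorModel):
    name = "boundary"

    def corrupt(self, value: int) -> list[int]:
        return [0, 0xFFFFFFFF]  # assumed 32-bit boundary values

register(LowBitFlip())
register(Boundary())
plan = {name: m.corrupt(6) for name, m in REGISTRY.items()}
```

With this arrangement the injection engine only depends on `ErrorModel.corrupt`, so an evaluator can add a model by writing one class and registering it, without touching the rest of the tool.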
8.3.2 Error Propagation
There are many uses for information on error propagation, some already discussed previously in this chapter. In this thesis a four-graded scale has been used to classify each experiment into different failure classes. Propagation is then typically studied on a failure class basis. The scale used is based on severity, without any specific system in mind. However, further refinement of the scale may be useful for specific systems, specifically incorporating application level information, such as data corruption or other application specific failures of different severity. Developing guidelines for including application specific failures and still preserving comparative capabilities is an interesting challenge.
A related issue to the inclusion of application-specific information in failure classification is to further investigate the role of the workload selection on the outcome of the evaluation. It is well established that the used workload should as closely as possible resemble the real workload on the system, i.e., one uses the operational profile of the system [Musa, 1993]. However, the used workload can also have an impact on the failure revealing capabilities of the evaluation, where some workloads may be more likely to expose propagating errors than others. For instance, applications containing some level of error checking and correction may, transparently to the user, be able to handle (i.e., not reporting or showing effect of) many of the propagating errors. When such applications are used "as is" they may hide important robustness information from the evaluator. This suggests that the best option would be to use "error revealing" applications, which is what is done in this thesis. Due to these conflicting goals, the composition of an "efficient" workload becomes complicated, as it might not reflect the actual use of the system and therefore skew the results. More research is needed into identification of both useful and realistic workloads to be used in conjunction with fault injection experiments.
Which services may affect an application can be found out by studying the Service Exposure measure. This way individual services can be identified and the designer of the application can verify if these services are used in the application in the first place, and if so, that they are used properly and that propagating errors are handled. However, this process is complex and can be time-consuming, especially since it needs to be redone when changes have been made to the application. A new research direction is therefore to define Application Exposure measures, capturing how applications are affected by propagating errors. Then, techniques for assessment of such effects need to be found, for instance using fault injection. This is a challenging task, especially for cases when no source code is available. Finally, Application Exposure and robustness profiles of the OS are composed into a system-level robustness profile, considering the specific applications running on the system.
8.3.3 Error Models
Chapter 6 evaluated the appropriateness of three error models and the results clearly favor simpler models, such as bit-flips and fuzzing, over semantically richer models such as the data type model. This is also in line with current advances in random testing [Hamlet, 2006; Pacheco et al., 2007]. However, the results presented here must be interpreted in light of the specific case study where they were found. Therefore, more research is needed in the area of software error models, both for simpler and more complex models. The composite model presented shows that, at least for specific systems/contexts, models may have to be combined to be most effective. These models, even though covering a wide spectrum of properties, are of course not complete. The set of models evaluated should therefore be enlarged, especially considering code level faults, such as mutations, to make a more comprehensive selection of models available for system evaluators.
Also on the data level, all three models can be extended. The data-level errors can for instance be extended with semantic knowledge of the functions tested in the interface. Fuzzing can be extended using advances in random testing [Hamlet, 2006]. Bit-flips can be further extended to incorporate multiple flips (extended from the SEU model) and further extend the work on selective bit injections.
Another important research direction is to validate the proposed robustness evaluation methodology as part of a structured development process. This would allow for a stronger connection with the "bug-revealing" capabilities of the chosen error models, an important aspect not focused on in this thesis.
8.3.4 Error Timing
The timing model presented was developed within the context of device drivers. However, we believe that it could be of much more general use in component-based testing. In systems where components are seen as black boxes and robustness evaluation is warranted, the selection of triggering events is as difficult as for device drivers. An extension of the work to such systems could potentially further validate its usefulness and make it more accessible to developers.
The drivers evaluated had no concurrent access patterns, simplifying the analysis. Further research is needed to handle concurrent access patterns.
For making the call block technique more approachable it should be based on automatic identification of call blocks from a given call string. Whether complete automation is possible is still an open question, but supporting tools can be developed based on pattern recognition techniques, such as suffix trees. Initial prototype tools have been implemented and show some promise. More research is needed in development of algorithms and tools to handle larger data sets.
8.4 Practical Lessons Learned
During the process of working on the material for this thesis several (non-scientific) lessons were learned. This section aims to list some of these lessons, and the purpose is to share our experience with these systems. Some of these may be obvious to experienced researchers, but we still hope they can be useful for young researchers and developers.
Lesson 1: Store structured data
Moving to a structured storage of data (i.e., a database vs. simple text files) has an associated overhead, in terms of time, effort and skills required. Our experience is that this cost is small compared to the benefits gained. Having data in a database makes it easy to change analysis tools, to modify the analysis or extend it. It also simplifies accessing the data as tools already exist to work with databases. It is also easy to search the data for inconsistencies, arising due to software bugs or incomplete log files, something which otherwise may be hard when the amount of data increases rapidly.
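As a small illustration of this lesson, the sketch below stores experiment outcomes in a relational table and runs both an analysis query and a consistency check against it. It uses Python's built-in sqlite3 module rather than the database used for this thesis, and the table layout, column names and example rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path would be used in practice
conn.execute("""
    CREATE TABLE experiment (
        id INTEGER PRIMARY KEY,
        driver TEXT NOT NULL,
        service TEXT NOT NULL,
        error_model TEXT NOT NULL,
        failure_class INTEGER   -- NULL marks an incomplete log
    )""")
rows = [  # hypothetical experiment outcomes
    ("serial", "memcpy", "bit-flip", 0),
    ("serial", "memcpy", "fuzzing", 3),
    ("net", "alloc", "bit-flip", 3),
    ("net", "alloc", "fuzzing", None),
]
conn.executemany(
    "INSERT INTO experiment (driver, service, error_model, failure_class) "
    "VALUES (?, ?, ?, ?)", rows)

# Analysis and consistency checks become one-line queries.
class3 = conn.execute(
    "SELECT driver, COUNT(*) FROM experiment "
    "WHERE failure_class = 3 GROUP BY driver ORDER BY driver").fetchall()
incomplete = conn.execute(
    "SELECT COUNT(*) FROM experiment WHERE failure_class IS NULL"
).fetchone()[0]
```

Changing the analysis, for example grouping by error model instead of by driver, then only requires a different query rather than a new log parser.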
Lesson 2: Use revision control
A key to any successful programming task is securing the code from accidental (or malicious) changes. This not only includes having a structured back-up system (that is also tested!), but also using revision control for the source code. This simplifies the task, even when there is only one developer, for instance when working on multiple machines. Additionally, putting the results (the raw log files in our case) under revision control is also beneficial, as losing such files may destroy many hours of experimentation.
Lesson 3: Don't trust the documentation!
In many cases documentation can be outdated, or incomplete. When the system doesn't behave the way the documentation states it should, it may be that the documentation is outdated and not that something is wrong. This is especially true for articles written before the release of a software (white papers). Make sure that you have the latest version of the documentation and search the Internet for users experiencing similar problems. Newsgroups and web forums are good places to get answers.
Lesson 4: Use the right tool for the job
Typically there are many tools (such as programming languages) that may be used to accomplish a given task. They differ in ease of use and feature set. For instance, many high-level programming languages (such as Java/.Net) allow for very rapid development using modern development environments and provide simple to use interfaces to, for instance, build intuitive user interfaces, interact with databases etc. Choose the tool that is best suited for the problem, considering the time it requires to implement the solution and the possibilities of extending it for future needs. Choosing a tool based only on familiarity might not be the best decision in the long run.
A last word from the author
This thesis has focused on identifying vulnerabilities and weaknesses in software systems, rather than on increasing the dependability of such systems. As a last comment on my work I would therefore like to paraphrase a comment made by Jim Gray in 1990. Namely that it is after all possible to build truly fault tolerant systems (containing software) having a mean time between failures of several years or more using the right techniques and tools [Gray, 1990]. It is encouraging as a software engineer to know that such goals are indeed achievable.
Bibliography
BOINC: Berkeley Open Infrastructure for Network Computing. Project web site. URL http://boinc.berkeley.edu/. Accessed 2007-10-27.
The Ballista Project. Project web site. URL http://www.ballista.org. Accessed 2007-10-27.
The EU-IST Dependability Benchmarking project (DBench). Project web site. URL http://www.dbench.org. Accessed 2007-10-27.
The Embedded Microprocessor Benchmark Consortium (EEMBC). Consortium web site. URL http://www.eembc.org. Accessed 2007-10-27.
FreeBSD Kernel Stress Test Suite. Online collection of tests. URL http://people.freebsd.org/~pho/stress/index.html. Accessed 2007-10-27.
IEEE Standard Glossary of Software Engineering. IEEE Standard 610.12-1990, December 1990.
Intrinsyc Software. Company web site. URL http://www.intrinsyc.com. Accessed 2007-10-27.
The LINPACK. Web page. URL http://www.netlib.org/linpack/. Accessed 2007-10-27.
PROTOS - Security Testing of Protocol Implementations. Project web site. URL http://www.ee.oulu.fi/research/ouspg/protos/. Accessed 2007-10-27.
The Standard Performance Evaluation Corporation (SPEC). Organization web site. URL http://www.spec.org. Accessed 2007-10-27.
The Transaction Processing Performance Council (TPC). Organization web site. URL http://www.tpc.org. Accessed 2007-10-27.
Transact-SQL Reference for SQL Server 2005. Online reference document. URL http://msdn2.microsoft.com/en-us/library/ms189826.aspx. Accessed 2007-10-27.
Arnaud Albinet, Jean Arlat, and Jean-Charles Fabre. Characterization of the Impact of Faulty Drivers on the Robustness of the Linux Kernel. In Proceedings of the International Conference on Dependable Systems and Networks, pages 807–816, 2004.
Jean Arlat, Martine Aguera, Louis Amat, Yves Crouzet, Jean-Charles Fabre, Jean-Claude Laprie, Eliane Martins, and David Powell. Fault Injection for Dependability Validation: A Methodology and Some Applications. IEEE Transactions on Software Engineering, 16(2):166–182, February 1990.
Jean Arlat, Alain Costes, Yves Crouzet, Jean-Claude Laprie, and David Powell. Fault Injection and Dependability Evaluation of Fault-Tolerant Systems. IEEE Transactions on Computers, 42(8):913–923, August 1993.
Jean Arlat, Jean-Charles Fabre, Manuel Rodriguez, and Frederic Salles. Dependability of COTS Microkernel-Based Systems. IEEE Transactions on Computers, 51(2):138–163, February 2002.
Algirdas Avižienis, Jean-Claude Laprie, Brian Randell, and Carl Landwehr. Basic Concepts and Taxonomy of Dependable and Secure Computing. IEEE Transactions on Dependable and Secure Computing, 1(1):11–33, January-March 2004.
Thomas Ball and Sriram Rajamani. The SLAM Project: Debugging System Software via Static Analysis. In Proceedings of Symposium on Principles of Programming Languages, pages 1–3, 2002.
James H. Barton, Edward W. Czeck, Zary Z. Segall, and Daniel P. Siewiorek. Fault Injection Experiments Using FIAT. IEEE Transactions on Computers, 39(4):575–582, April 1990.
Douglas Boling. Programming Microsoft Windows CE .Net. Microsoft Press, third edition, 2003.
Aaron B. Brown and David A. Patterson. To Err is Human. In Proceedings of the First Workshop on Evaluating and Architecting System dependabilitY (EASY), 2001.
Aaron B. Brown, Leonard C. Chung, and David A. Patterson. Including the Human Factor in Dependability Benchmarks. In Proceedings of the DSN Workshop on Dependability Benchmarking, pages F9–14, 2002.
George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, and Armando Fox. Microreboot - A Technique for Cheap Recovery. In Proceedings of the Symposium on Operating System Design and Implementation, pages 31–44, 2004.
João Carreira, Henrique Madeira, and João Gabriel Silva. Xception: A Technique for the Experimental Evaluation of Dependability in Modern Computers. IEEE Transactions on Software Engineering, 24(2):125–136, February 1998.
João Viegas Carreira, Diamantino Costa, and João Gabriel Silva. Fault Injection Spot-Checks Computer System Dependability. IEEE Spectrum, 36(8):50–55, August 1999.
George J. Carrette. The web site for crashme. URL http://people.delphiforums.com/gjc/crashme.html. Accessed 2007-10-27.
Ramesh Chandra, Ryan M. Lefever, Kaustubh R. Joshi, Michel Cukier, and William H. Sanders. A Global-State-Triggered Fault Injector for Distributed System Evaluation. IEEE Transactions on Parallel and Distributed Systems, 15(7):593–605, 2004.
Shuo Chen, Jun Xu, Ravishankar K. Iyer, and Keith Whisnant. Evaluating the Security Threat of Firewall Data Corruption Caused by Instruction Transient Errors. In Proceedings of International Conference on Dependable Systems and Networks, pages 495–504, 2002.
Ram Chillarege. Handbook on Software Reliability Engineering, chapter Orthogonal Defect Classification, pages 359–400. McGraw-Hill, 1996.
Ram Chillarege and Nicholas S. Bowen. Understanding Large System Failures - A Fault Injection Experiment. In Proceedings of the International Symposium on Fault-Tolerant Computing, pages 356–363, 1989.
Ram Chillarege, Inderpal S. Bhandari, Jarir K. Chaar, Michael J. Halliday, Diane S. Moebus, Bonnie K. Ray, and Man-Yuen Wong. Orthogonal Defect Classification - A Concept for In-Process Measurements. IEEE Transactions on Software Engineering, 18(11):943–956, November 1992.
Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson R. Engler. An Empirical Study of Operating System Errors. In Proceedings of Symposium on Operating Systems Principles, pages 73–88, 2001.
162
BIBLIOGRAPHY
J¨orgenChristmanssonandRamChillarege.GenerationofanErrorSetthat
EmulatesSoftwareFaultsBasedonFieldData.InInternationalSympo-
siumonFaultTolerantComputing,pages304–313,1996.
J¨orgenChristmansson,MartinHiller,andMartinRim´en.AnExperimental
nationalComparisonConferofFencaulteonandSoftwarErroreRInjection.eliabilityInEngineProceeeringdings,pagesofthe378–396,Inter-
1998.JeffreyClarkandDhirajK.Pradhan.FaultInjection:AMethodforValidat-
ingComputer-SystemDependability.IEEEComputer,28(6):47–56,June
1995.ChristianConstantinescu.NeutronSERCharacterizationofMicroproces-
temssors.InandProceeNetworksdings,ofpagesthe754–759,International2005.ConferenceonDependableSys-
ChristianIEEEMicroConstan,tinescu.23(4):14–19,Trends2003.andChallengesinVLSICircuitReliability.
JonathanCorbet,AlessandroRubini,andGregKroah-Hartman.Linux
DeviceDrivers.O’Reilly,thirdedition,February2005.URLhttp:
.//www.oreilly.com/catalog/linuxdrive3/book/index.csp
RichardA.DeMillo,RichardJ.Lipton,andFrederickG.Sayward.Hints
onTestDataSelection:HelpforthePracticingProgrammer.IEEECom-
1978.11(4):34–41,,puter
JohnDeValeandPhilipKoopman.PerformanceEvaluationofException
HandlinginI/OLibraries.InProceedingsoftheInternationalConference
onDependableSystemsandNetworks,pages519–524,July2001.
JohnDeVale,PhilipKoopman,andDavidGuttendorf.TheBallistaSoft-
wareRobustnessTestingService.InProceedingsoftheTestingComputer
SoftwareConference,1999.
ChristopherP.Dingman,JoeMarshall,andDanielP.Siewiorek.Measur-
ingRobustnessofaFault-TolerantAerospaceSystem.InProceedingsof
theInternationalSymposiumonFault-TolerantComputing,pages522–526,
1995.WenliangDuandAdityaP.Mathur.TestingforSoftwareVulnerabilityUsing
EnvironmentPerturbation.InProceedingsoftheInternationalConference
onDependableSystemsandNetworks,pages603–612,2000.
BIBLIOGRAPHY
163
Jo˜aoDur˜aesandHenriqueMadeira.MultidimensionalCharacterizationof
theImpactofFaultyDriversontheOperatingSystemBehavior.IEICE
Transactions,E86-D(12):2563–2570,December2003.
Jo˜aoDur˜aesandHenriqueMadeira.EmulationofSoftwareFaultsbyEd-
ucatedMutationsatMachine-CodeLevel.InProceedingsoftheInter-
nationalSymposiumonSoftwareReliabilityEngineering,pages329–340,
2002.Jo˜aoDur˜aesandHenriqueMadeira.EmulationofSoftwareFaults:AField
DataStudyandaPracticalApproach.IEEETransactionsonSoftware
2006.32(11):849–867,,eringEngineChristofFetzerandZhenXiao.AFlexibleGeneratorArchitectureforIm-
provingSoftwareDependability.InProceedingsoftheInternationalSym-
posiumonSoftwareReliabilityEngineering,pages102–113,2002a.
ChristofFetzerandZhenXiao.DetectingHeapSmashingAttacksThrough
FaultContainmentWrappers.InProceedingsofIEEESymposiumonRe-
liableDistributedSystems,pages80–89,2001.
ChristofFetzerandZhenXiao.AnAutomatedApproachtoIncreasingthe
RobustnessofCLibraries.InProceedingsoftheInternationalConference
onDependableSystemsandNetworks,pages155–164,June2002b.
JustinE.ForresterandBartonP.Miller.AnEmpiricalStudyoftheRobust-
nessofWindowsNTApplicationsUsingRandomTesting.InProceedings
oftheUSENIXWindowsSystemsSymposium,pages59–68,2000.
TimothyFraser,LeeBadger,andMarkFeldman.HardeningCOTSSoftware
withGenericSoftwareWrappers.InProceedingsofIEEESymposiumon
SecurityandPrivacy,pages2–16,1999.
ArchanaGanapathiandDavidPatterson.CrashDataCollection:AWin-
dowsCaseStudy.InProceedingsoftheInternationalConferenceonDe-
pendableSystemsandNetworks,pages280–285,2005.
ArchanaGanapathi,VijiGanapathi,andDavidPatterson.WindowsXP
KernelCrashAnalysis.InProceedingsofLargeInstallationSystemAd-
ministrationConference,2006.URLhttp://www.cs.berkeley.edu/
.archanag/publications/lisa.pdf~AnupGhosh,MatthewSchmid,andShah.TestingtheRobustnessofWin-
dowsNTSoftware.InProceedingsofInternationalSymposiumonSoftware
ReliabilityEngineering,pages231–235,1998.
164
BIBLIOGRAPHY
AnupGhosh,MatthewSchmid,andFrankHill.WrappingWindowsNT
SoftwareforRobustness.InProceedingsofInternationalSymposiumon
Fault-TolerantComputing,pages344–347,1999.
PatriceGodefroid,MichaelY.Levin,andDavidMolnar.AutomatedWhite-
boxFuzzTesting.MicrosoftTechnicalReportMSR-TR-2007-91,Microsoft
2007.Julyh,ResearcJimGray.ACensusofTandemSystemAvailabilityBetween1985and1990.
IEEETransactionsonReliability,39(4):409–418,1990.ISSN0018-9529.
JimGray.WhyDoComputersStopandWhatCanWeDoAboutIt.Tech-
nicalReportTR85.5,Tandem,1985.
MichaelGrottkeandKishorS.Trivedi.FightingBugs:Remove,Retry,
Replicate,andRejuvenate.Computer,40(2):107–109,2007.
WeiningGu,ZbigniewKalbarczyk,RavishankarK.Iyer,andZhenyuYang.
CharacterizationofLinuxKernelBehaviorunderErrors.InProceedingsof
theInternationalConferenceonDependableSystemsandnetworks,pages
2003.468,–459WeiningGu,ZbigniewKalbarczyk,andRavishankarK.Iyer.ErrorSen-
sitivityoftheLinuxKernelExecutingonPowerPCG4andPentium4
Processors.InProceedingsoftheInternationalConferenceonDependable
SystemsandNetworks,pages887–896,2004.
DanGusfield.AlgorithmsonStrings,TreesandSequences.CambridgeUni-
1997.Press,yersitvDickHamlet.WhenOnlyRandomTestingWillDo.InProceedingsofthe
InternationalWorkshoponRandomtesting,pages1–9,2006.ISBN1-
ttp://doi.acm.org/10.1145/1145735.1145737.hdoi:59593-457-X.SeungjaeHan,KangG.Shin,andHaroldA.Rosenberg.DOCTOR:AnIn-
tegratedSoftwareFaultInjectionEnvironmentforDistributedReal-Time
Systems.InProceedingsoftheInternationalComputerPerformanceand
DependabilitySymposium,pages204–213,1995.
JaneHuffmanHayesandJeffOffutt.Inputvalidationanalysisandtesting.
EmpiricalSoftwareEngineering,11(4):493–522,December2006.
JohnL.Henning.SPECCPU2000:measuringCPUperformanceinthenew
millennium.IEEEComputer,33(7):28–35,2000.
BIBLIOGRAPHY
165
JorritN.Herder,HerbertBos,BenGras,PhilipHomburg,andAndrewS.
Tanenbaum.Failureresiliencefordevicedrivers.InProceedingsofthe
InternationalConferenceonDependableSystemsandNetworks,pages41–
2007.50,MartinHiller.ExecutableAssertionsforDetectingDataErrorsinEmbed-
dedControlSystems.InProceedingsoftheInternationalConferenceon
DependableSystemsandNetworks,pages24–33,2000.
MartinHiller.ASoftwareProfilingMethodologyforDesignandAssessment
ofDependableSoftware.Ph.D.Thesis,DepartmentofComputerEngineer-
ing,ChalmersUniversityofTechnology,G¨oteborg,Sweden,2002.
MartinHiller,ArshadJhumka,andNeerajSuri.OnthePlacementofSoft-
wareMechanismsforDetectionofDataErrors.InProceedingsoftheInter-
nationalConferenceonDependableSystemsandNetworks,pages135–144,
2002a.MartinHiller,ArshadJhumka,andNeerajSuri.PROPANE:AnEnviron-
mentforExaminingthePropagationofErrorsinSoftware.InProceedings
ofInternationalSymposiumonSoftwareTestingandAnalysis,pages81–
2002b.July85,MartinHiller,ArshadJhumka,andNeerajSuri.EPIC:ProfilingtheProp-
agationandEffectofDataErrorsinSoftware.IEEETransactionson
Computers,53(5):512–530,May2004.
MichaelHowardandSteveLipner.TheSecurityDevelopmentLifecycle.Mi-
2006.edition,firstPress,crosoftMei-ChenHsueh,TimothyK.Tsai,andRavishankarK.Iyer.FaultInjection
TechniquesandTools.IEEEComputer,30(4):75–82,April1997.
RavishankarIyerandPaolaVelardi.Hardware-RelatedSoftwareErrors:
MeasurementandAnalysis.IEEETransactionsonSoftwareEngineering,
1985.SE-11(2):223–231,TaharJarboui,JeanArlat,YvesCrouzet,andKaramaKanoun.Experi-
mentalAnalysisoftheErrorsInducedintoLinuxbyThreeFaultInjection
Techniques.InInternationalConferenceonDependableSystemsandNet-
works,pages331–336,2002a.
TaharJarboui,JeanArlat,YvesCrouzet,KaramaKanoun,andThomas
Marteau.AnalysisoftheEffectsofRealandInjectedSoftwareFaults.In
166
BIBLIOGRAPHY
ProceedingsofthePacificRimInternationalSymposiumonDependable
2002b.51–58,pages,ComputingTaharJarboui,JeanArlat,YvesCrouzet,KaramaKanoun,andThomas
Marteau.ImpactofInternalandExternalSoftwareFaultsontheLinux
Kernel.IEICETransactionsonInformationandSystems,E86-D(12):
2003.2571–2578,SteveJobs.KeynotetalkatAppleWorldWideDevelopersConference,2006.
Andr´easJohansson.DependabilityBenchmarking.Master’sthesis,De-
partmentofComputerEngineering,ChalmersUniversityofTechnology,
G¨oteborg,Sweden,2001.
AliKalakech,TaharJarboui,JeanArlat,YvesCrouzet,andKaramaKa-
noun.BenchmarkingOperatingSystemDependability:Windows2000as
aCaseStudy.InProceedingsofthePacificRimInternationalSymposium
onDependableComputing,pages261–270,2004a.
AliKalakech,KaramaKanoun,YvesCrouzet,andJeanArlat.Benchmarking
TheDependabilityofWindowsNT4,2000andXP.InProceedingsof
theInternationalConferenceonDependableSystemsandNetworks,pages
2004b.681–686,GhaniA.Kanawati,NasserA.Kanawati,andJacobA.Abraham.FER-
RARI:AToolfortheValidationofSystemDependabilityProperties.In
InternationalSymposiumonFault-TolerantComputing,pages336–344,
1992.GhaniA.Kanawati,NasserA.Kanawati,andJacobA.Abraham.FER-
RARI:AFlexibleSoftware-BasedFaultandErrorInjectionSystem.IEEE
TransactionsonComputers,44(2):248–260,February1995.
CemKaner,JackFalk,andHungQ.Nguyen.TestingComputerSoftware.
1999.Sons,&WileyJohnKaramaKanoun,JeanArlat,DiamantionJ.G.Costa,MarioDalCin,Pedro
Gil,Jean-ClaudeLaprie,HenriqueMadeira,andNeerajSuri.Dbench(De-
pendabilityBenchmarking).InWorkshoponTheEuoropeanDependability
2001.D12–15,pages,InitiativeKaramaKanoun,YvesCrouzet,AliKalakech,Ana-ElenaRugina,and
PhilippeRumeau.TMBenchmarkingtheDependabilityofWindowsandLinux
usingPostMarkWorkloads.InProceedingsoftheInternationalSympo-
siumonSoftwareReliabilityEngineering,pages11–20,2005.
BIBLIOGRAPHY
167
Wei-LunKaoandRavishankarK.Iyer.DEFINE:ADistributedFaultInjec-
tionandMonitoringEnvironment.InWorkshoponFault-TolerantParallel
andDistributedSystems,pages252–259,1994.
Wei-LunKao,RavishankarK.Iyer,andDongTang.FINE:AFaultInjection
andMonitoringEnvironmentforTracingtheUNIXSystemBehaviorunder
Faults.IEEETransactionsonSoftwareEngineering,19(11):1105–1118,
November1993.
JohanKarlsson,PeterLiden,PeterDahlgren,RolfJohansson,andUlfGun-
neflo.UsingHeavy-IonRadiationtoValidateFault-HandlingMechanisms.
IEEEMicro,14(1):8–23,February1994.ISSN0272-1732.
PhilipKoopman.Towardascalablemethodforquantifyingaspectsoffault
tolerance,softwareassurance,andcomputersecurity.InComputerSecu-
rity,DependabilityandAssurance:FromNeedstoSolutions,1998.Pro-
ceedings,pages103–131,1999.
PhilipKoopmanandJohnDeVale.ComparingtheRobustnessofPOSIX
OperatingSystems.InProceedingsoftheInternationalSymposiumon
Fault-TolerantComputing,pages72–79,1999.
PhilipKoopmanandJohnDeVale.TheExceptionHandlingEffectivenessof
POSIXOperatingSystem.IEEETransactionsonSoftwareEngineering,
26(9):837–848,September2000.
PhilipKoopman,JohnSung,ChristopherDingman,DanielSiewiorek,and
TedMarz.ComparingOperatingSystemsUsingRobustnessBenchmarks.
InProceedingsoftheSymposiumonReliableDistributedSystems,pages
1997.72–79,NathanPKropp,PhilipJKoopman,andDanielPSiewiorek.Automated
RobustnessTestingofOff-the-ShelfSoftwareComponents.InProceedings
oftheInternationalSymposiumonFaultTolerantComputing,pages230–
1998.239,SanjeevKumarandKaiLi.UsingModelCheckingtoDebugDevice
Firmware.InProceedingsoftheUSENIXSymposiumonOperatingSys-
temsDesignandImplementation,2002.
Jean-ClaudeLaprie,editor.Dependability:BasicConceptsandTerminology.
1992.erlag,Springer-V
168
BIBLIOGRAPHY
InhwanLeeandRavishankarK.Iyer.Faults,Symptoms,andSoftwareFault
ToleranceintheTandemGUARDIAN90OperatingSystem.InProceedings
oftheInternationalSymposiumonFault-TolerantComputing,pages20–
1993.29,NancyG.Leveson.Safeware:SystemSafetyandComputers.Addison-
1995.,esleyWHenriqueMadeira,DiamantinoCosta,andMarcoVieira.OntheEmulation
ofSoftwareFaultsbySoftwareFaultInjection.InProceedingsoftheInter-
nationalConferenceonDependableSystemsandNetworks,pages417–426,
2000.JuneHenriqueMadeira,RaphaelR.Some,F.Moreira,DiamantinoCosta,and
DavidRennels.ExperimentalEvaluationofaCOTSSystemforSpace
Applications.InProceedingsoftheInternationalConferenceonDepend-
ableSystemsandNetworks,pages325–330,2002.
EricMarsdenandJean-CharlesFabre.FailureModeAnalysisofCORBA
ServiceImplementations.InProceedingsofMiddleware,volume2218of
LectureNotesonComputerScience,pages216–231.SpringerVerlag,2001.
ManuelMendoncaandNunoNeves.RobustnessTestingoftheWindows
DDK.InProceedingsoftheInternationalConferenceonDependableSys-
temsandNetworks,pages554–564,2007.
ChristophC.MichaelandRyanC.Jones.OntheUniformityofErrorPropa-
gationinSoftware.InProceedingsoftheAnnualConferenceonComputer
Assurance,pages68–76,1997.
VisualStudio,MicrosoftPortableExecutableandCommonObjectFileFor-
matSpecification.Microsoft,8.0edition,May2006.URLhttp://www.
.microsoft.com/whdc/system/platform/firmware/PECOFF.mspxBartonMiller,DavidKoski,CjinPheowLee,VivekanandaMaganty,Ravi
Murthy,AjitkumarNatarajan,andJeffSteidl.FuzzRevisited:ARe-
examinationoftheReliabilityofUNIXUtilitiesandServices.Technical
Report1268,DepartmentofComputerSciences,UniversityofWisconsin,
1995.BartonP.Miller,LouisFredriksen,andBryanSo.AnEmpiricalStudyofthe
ReliabilityofUNIXUtilities.CommunicationsoftheACM,33(12):32–44,
December1990.ISSN0001-0782.doi:http://doi.acm.org/10.1145/96267.
96279.
BIBLIOGRAPHY
169
BartonP.Miller,GregoryCooksey,andFredrickMoore.AnEmpiricalStudy
oftheRobustnessofMacOSApplicationsUsingRandomTesting.InPro-
ceedingsoftheInternationalWorkshoponRandomTesting,pages46–54,
2006.TerrenceMitchem,RaymondLu,RichardO’Brien,andKentLarson.Linux
KernelLoadableWrappers.InProceedingsofDARPAInformationSur-
vivabilityConference&Exposition,volume2,pages296–307,2000.
R.Moraes,R.Barbosa,J.Duraes,N.Mendes,E.Martins,andH.Madeira.
Injectionoffaultsatcomponentinterfacesandinsidethecomponentcode:
aretheyequivalent?InProceedingsoftheDependableComputingConfer-
ence,pages53–64,2006.
L.J.Morell.Atheoryoffault-basedtesting.IEEETransactionsonSoftware
Engineering,16(8):844–857,August1990.ISSN0098-5589.
MSDN.TheMicrosoftDeveloperNetwork(MSDN).Onlinereferencedocu-
ments.URLhttp://www.msdn.microsoft.com.Accessed2007-10-27.
ArupMukherjeeandDanielP.Siewiorek.Measuringsoftwaredependability
byrobustnessbenchmarking.IEEETransactionsonSoftwareEngineering,
0098-5589.ISSN1997.June23(6):366–378,BrendanMurphy.AutomatingSoftwareFailureReporting.Queue,2(8):
2004.42–48,BrendanMurphyandBj¨ornLevidow.Windows2000Dependability.In
ProceedingsoftheWorkshoponDependableNetworksandOS,2000.
JohnMusa.OperationalProfilesinSoftware-ReliabilityEngineering.IEEE
Software,pages14–32,March1993.
GlenfordJ.Myers.TheArtofSoftwareTesting.JohnWiley&Sons,2
2004.edition,NunoFerreiraNeves,Jo˜aoAntunes,MiguelCorreia,PauloVer´ıssimo,and
RuiNeves.UsingAttackInjectiontoDiscoverNewVulnerabilities.In
ProceedingsoftheInternationalConferenceonDependableSystemsand
2006.457–466,pages,NetworksWeeTeckNgandPeterM.Chen.ThedesignandVerificationoftheRio
FileCache.IEEETransactionsonComputers,50(4):322–337,2001.ISSN
0018-9340.
170
BIBLIOGRAPHY
PeterOehlert.ViolatingAssumptionswithFuzzing.IEEESecurity&Pri-
2005.3(2):58–62,,MagazinevacyWalterOney.ProgrammingtheMSWindowsDriverModel.MicrosoftPress,
2003.CarlosPacheco,ShuvenduK.Lahiri,MichaelD.Ernst,andThomasBall.
Feedback-DirectedRandomTestGeneration.InProceedingsoftheIn-
ternationalConferenceonSoftwareEngineering,pages75–84,Washing-
ton,DC,USA,2007.IEEEComputerSociety.ISBN0-7695-2828-7.doi:
ttp://dx.doi.org/10.1109/ICSE.2007.37.hJiantaoPan,PhilipKoopman,DanielSiewiorek,YennunHuang,Robert
Gruber,andMimiLingJiang.RobustnessTestingandHardeningof
CORBAORBImplementations.InProceedingsoftheInternationalCon-
ferenceonDependableSystemsandNetworks,pages141–150,2001.
DavidPowell,G.Bonn,D.Seaton,PauloVerissimo,andF.Waeselynck.
TheDelta-4ApproachtoDependabilityinOpenDistributedComputing
Systems.InProceedingsoftheInternationalSymposiumonFault-Tolerant
1988.246–251,pages,ComputingDhirajK.Pradhan,editor.Fault-TolerantComputerSystemDesign.Prentice
1996.Hall,ManuelRodriguez,ArnaudAlbinet,andJeanArlat.MAFALDA-RT:ATool
forDependabilityAssessmentofReal-TimeSystems.InProceedingsof
theInternationalConferenceonDependableSystemsandNetworks,pages
2002.267–272,Z.Segall,D.Vrsalovic,D.Siewiorek,D.Yaskin,J.Kownacki,J.Barton,
R.Dancey,A.Robinson,andT.Lin.FIAT-FaultInjectionBasedAuto-
matedTestingEnvironment.InProceedingsoftheInternationalSympo-
siumonFault-TolerantComputing,pages102–107,1988.
CharlesP.Shelton,PhilipKoopman,andKobeyDeVale.RobustnessTesting
oftheMicrosoftWin32API.InProceedingsoftheInternationalConference
onDependableSystemsandNetworks,2000.
KangG.Shin.HARTS:aDistributedReal-TimeArchitecture.IEEECom-
puter,24(5):25–35,May1991.
PremkishoreShivakumar,MichaelKistler,StephenW.Keckler,Doug
Burger,andLorenzoAlvisi.ModelingtheEffectofTechnologyTrendson
BIBLIOGRAPHY
171
theSoftErrorRateofCombinationalLogic.InProceedingsoftheInter-
nationalConferenceonDependableSystemsandNetworks,pages389–398,
2002.DanielSiewiorek,JohnJ.Hudak,Byung-HoonSuh,andZarySegal.Devel-
opmentofaBenchmarktoMeasureSystemRobustness.InProceedings
ofInternationalSymposiumonFault-TolerantComputing,pages88–97,
1993.AbrahamSilberschatz,PeterBaerGalvin,andGregGagnet.OperatingSys-
temsConcepts.JohnWiley&Sons,seventhedition,December2004.
DanielSimpson.WindowsXPEmbeddedwithServicePack1Reliabil-
ity.Technicalreport,MicrosoftCorporation,2003.URLhttp://msdn2.
2007-10-27.Accessed.us/library/ms838661.aspxmicrosoft.com/en-DavidT.Stott,BenjaminFloering,DanielBurke,ZbigniewKalbarczyk,and
RavishankarK.Iyer.NFTAPE:AFrameworkforAssessingDependability
inDistributedSystemswithLightweightFaultInjectors.InProceedingsof
theInternationalSymposiumonComputerPerformanceandDependabil-
2000.91–100,pages,ityMarkSullivanandRamChillarege.SoftwareDefectsandtheirImpacton
SystemAvailability-AStudyofFieldFailuresinOperatingSystems.In
InternationalSymposiumFault-TolerantComputing,pages2–9,1991.
MarkSullivanandRamChillarege.AComparisonofSoftwareDefectsin
DatabaseManagementSystemsandOperatingSystems.InInternational
SymposiumonFault-TolerantComputing,pages475–484,1992.
MartinS¨ußkrautandChristofFetzer.RobustnessandSecurityHardening
ofCOTSSoftwareLibraries.InInternationalConferenceonDependable
SystemsandNetworks,pages61–71,2007.
MartinS¨ußkrautandChristofFetzer.AutomaticallyFindingandPatching
BadErrorHandling.InProceedingsoftheEuropeanDependableComput-
ingConference,pages13–22,2006.
MichaelM.Swift,BrianN.Bershad,andHenryM.Levy.ImprovingtheReli-
abilityofCommodityOperatingSystems.ACMTransactionsonComputer
2005.23(1):77–110,,SystemsMichaelM.Swift,MuthukaruppanAnnamalai,BrianN.Bershad,and
HenryM.Levy.RecoveringDeviceDrivers.ACMTransactionsonCom-
puterSystems,24(4):333–360,November2006.
172
BIBLIOGRAPHY
AndrewS.Tanenbaum.ModernOperatingSystems.PrenticeHall,2edition,
2001.TimothyTsaiandNavjotSingh.ReliabilityTestingofApplicationsonWin-
dowsNT.InProceedingsoftheInternationalConferenceonDependable
SystemsandNetworks,pages427–436,2000.
TimothyK.TsaiandRavishankarK.Iyer.MeasuringFaultTolerance
withtheFTAPEFaultInjectionTool.InProceedingsofthePerformance
Tools/MMB,LNCS977,pages26–40.SpringerVerlag,1995.
TimothyK.Tsai,RavishankarK.Iyer,andDougJewitt.AnApproachto-
wardsBenchmarkingofFault-TolerantCommercialSystems.InProceed-
ingsoftheInternationalSymposiumonFault-TolerantComputing,pages
1996.314–325,TimothyK.Tsai,Mei-ChenHsueh,HongZhao,ZbigniewKalbarczyk,and
RavishankarK.Iyer.Stress-BasedandPath-BasedFaultInjection.IEEE
TransactionsonComputers,48(11):1183–1201,November1999.
MarcoVieiraandHenriqueMadeira.PortableFaultloadsBasedonOper-
atorFaultsforDBMSDependabilityBenchmarking.InProceedingsof
theInternationalComputerSoftwareandApplicationsConference,pages
2004.202–209,MarcoVieiraandHenriqueMadeira.RecoveryandPerformanceBalance
ofaCOTSDBMSinthePresenceofOperatorFaults.InProceedingsof
theInternationalConferenceonDependableSystemsandNetworks,pages
2002a.615–624,MarcoVieiraandHenriqueMadeira.DefinitionofFaultloadsBasedonOp-
eratorFaultsforDMBSRecoveryBenchmarking.InProceedingsofthe
PacificRimInternationalSymposiumonDependableComputing,pages
2002b.265–272,JeffreyVoas.CertifyingSoftwareforHigh-AssuranceEnvironments.IEEE
Software,31(6):48–54,July-August1999.
JeffreyVoas.ErrorPropagationAnalysisforCOTSSystems.IEEComputing
&ControlEngineeringJournal,8(6):269–272,December1997a.
JeffreyVoas.BuildingSoftwareRecoveryAssertionsfromaFaultInjection-
basedPropagationAnalysis.InProceedingsoftheInternationalComputer
SoftwareandApplicationsConference,pages505–510,1997b.
BIBLIOGRAPHY
173
JeffreyVoasandF.Charron.TolerantSoftwareInterfaces:CanCOTS-based
SystemsbeTrustedWithoutThem?InProceedingsoftheInternational
ConferenceonComputerSafety,ReliabilityandSecurity,1996.
JeffreyVoasandKeithW.Miller.DynamicTestabilityAnalysisforAssessing
FaultTolerance.HighIntegritySystemsJournal,1(2):171–178,1994a.
JeffreyVoasandKeithW.Miller.SoftwareTestability:theNewVerification.
IEEESoftware,12(3):17–28,1995.
JeffreyVoasandKeithW.Miller.PuttingAssertionsinTheirPlace.In
ProccedingsoftheInternationalSymposiumonSoftwareReliabilityEngi-
neering,pages152–157,1994b.
JeffreyVoas,GaryMcGraw,andAnupK.Gosh.GluingSoftwareTogether:
HowGoodisYourGlue?InProceedingsofthePacificNorthwestSoftware
QualityConference,oct1996.
JeffreyVoas,FrankCharron,GaryMcGraw,KeithMiller,andMichaelFried-
man.PredictingHowBadly“Good”SoftwareCanBehave.IEEESoftware,
1997.July-August14(4):73–83,AlanR.WeissandRichardClucas.TheStandardizationofEmbeddedbench-
marking:ThePitfallsandtheopportunities.InProceedingsoftheEmbed-
dedSystemsConference,1999.
ElaineWeyuker.TestingComponent-BasedSoftware:ACautionaryTale.
IEEESoftware,15(5):54–59,Sep.–Oct.1998.
KeithWhisnant,RavishankarIyer,ZbignewKalbarczyk,PhillipH.JonesIII,
DavidRennels,andRaphaelSome.TheEffectsofanARMOR-BasedSIFT
EnvironmentonthePerformanceandDependabilityofUserApplications.
IEEETransactionsonSoftwareEngineering,30(4):257–277,April2004.
JamesA.Whittaker.HowtoBreakSoftware.Addison-Wesley,2003.
JunXu,ZbigniewKalbarczyk,andRavishankarK.Iyer.NetworkedWindows
NTSystemFieldFailureDataAnalysis.InProceedingsofthePacificRim
InternationalSymposiumonDependableComputing,pages178–185,1999.
JunXu,ShuoChen,ZbigniewKalbarczyk,andRavishankarK.Iyer.AnEx-
perimentalStudyofSecurityVulnerabilitiesCausedbyErrors.InProceed-
ingsoftheInternationalConferenceonDependableSystemsandNetworks,
2001.421–432,pages
174
enStev
J.
Zeil.
ansactionsrT
estingT
on
for
eSoftwar
erturbationsP
eringEngine
,
of
Program
BIBLIOGRAPHYts.Statemen
SE-9(3):335–346,
yMa
1983.
IEEE
CV

Personal Data
Name: Andréas Johansson
Date of birth: March 24, 1977
Place of birth: Falkenberg, Sweden

School Education
1984-1993 Apelskolan, Ullared, Sweden
1993-1996 Falkenbergs Gymnasieskola, Falkenberg, Sweden

University Education
1997-2001 Master of Science in Computer Engineering, Chalmers, Göteborg, Sweden
2002-2008 Ph.D. in Computer Science, Technische Universität Darmstadt, Darmstadt, Germany