Time-efficient asynchronous service replication [Elektronische Ressource] / vorgelegt von Dan Dobre

technischen_universitat_darmstadt - Dan Dobre

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

190 pages

English

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

A propos
Informations
Extrait

Sujets

Informatik

Informations

Publié par	technischen_universitat_darmstadt
Publié le	01 janvier 2010
Nombre de lectures	13
Langue	English
Poids de l'ouvrage	1 Mo

Extrait

Time-E cient Asynchronous Service Replication
Vom Fachbereich Informatik der Technischen Universit at Darmstadt
genehmigte
Dissertation
zur Erlangung des akademischen Grades eines Doktor-Ingenieur (Dr.-Ing.)
vorgelegt von
Dipl.-Inform. Dan Dobre
aus Lugoj, Rum anien
Referenten:
Prof. Neeraj Suri
Prof. Michel Raynal
Datum der Einreichung: 23.06.2010
Datum der mundlic hen Prufung: 30.09.2010
Darmstadt 2010
D17iiAbstract
Modern critical computer applications often require continuous and correct oper-
ation despite the failure of critical system components. In a distributed system,
fault-tolerance can be achieved by creating multiple copies of the functionality and
placing them at di erent processes. The core constitutes a distributed protocol
run among the processes whose goal is to provide the end user with the illusion of
sequentially accessing a single correct copy. Not surprisingly, the e ciency of the
distributed protocol used has a severe impact on the application performance.
This thesis investigates the cost associated with implementing fundamental
abstractions constituting the core of service replication in asynchronous distributed
systems, namely (a) consensus and (b) the read/write register. The main question
addressed by this thesis is how e cient implementations of these abstractions
can be. The focus of the thesis lies on time complexity (or latency) as the main
e ciency metric, expressed as the number of communication steps carried out by
the algorithm before it terminates. Besides latency, important cost factors are the
resilience of an algorithm (i.e. the fraction of failures tolerated) and its message
complexity (the number of messages exchanged).
Consensus is perhaps the most fundamental problem in distributed computing.
In the consensus problem, processes propose values and unanimously agree on one
of the proposed values. In a purely asynchronous system, in which there is no
upper bound on message transmission delays, consensus is impossible if a single
process may crash. In practice however, systems are not asynchronous. They are
timely in the common case and exhibit asynchronous behavior only occasionally.
This observation has led to the concept of unreliable failure detectors to capture
the synchrony conditions su cient to solve consensus.
This thesis studies the consensus problem in asynchronous systems in which
processes may fail by crashing, enriched with unreliable failure detectors. It de-
termines how quickly consensus can be solved in the common case, characterized
by stable executions in which all failures have reliably been detected, settling im-
portant questions about consensus time complexity.
Besides consensus, the read/write register abstraction is essential to sharing
information in distributed systems, also referred to as distributed storage for its
importance as a building-block in practical distributed storage and le systems. We
study fault-tolerant read/write register implementations in which the data shared
by a set of clients is replicated on a set of storage base objects. We consider
robust storage implementations characterized by (a) wait-freedom (i.e. the fact
the read/write operations invoked by correct clients always return) and (b) strong
consistency guarantees despite a threshold of object failures. We allow for the most
general type of object failure, Byzantine, without assuming authenticated data to
limit the adversary. In this model, we determine the worst-case time complexity
of accessing such a robust storage, closing several fundamental complexity gaps.
iiiKurzfassung
Fur moderne sicherheitskritische Computeranwendungen ist eine ununterbrochene
und fehlerfreie Funktion erforderlich, oft auch dem Ausfall kritischer Systemkom-
ponenten zum Trotz. In einem verteilten System kann Fehlertoleranz dadurch
erreicht werden, dass mehrere identische Kopien einer Applikation erstellt, und
auf verschiedene, m oglicherweise fehleranf allige Prozesse plaziert werden. Kern
dieses Verfahrens ist ein verteiltes Protokoll, das von den Prozessen im verteil-
ten System ausgefuhrt wird, mit dem Ziel eine einzelne und ausfallsichere Kopie
zu simulieren. Endbenutzern wird der Eindruck vermittelt, auf eine korrekte,
hochverfugbare Kopie sequentiell zuzugreifen. Wie nicht anders zu erwarten hat
die E zienz des verwendeten, verteilten Protokolls eine signi kante Auswirkung
auf die Performanz der Applikation.
Diese Dissertation untersucht die Kosten grundlegender Abstraktionen verteil-
ten Rechnens, die den Kern der Replikation von Diensten in verteilten Systemen
bilden, n amlich (a) Consensus und (b) das Lese-/Schreibregister. Die Haupt-
fragestellung dieser Arbeit ist wie e zient Implementierungen dieser Abstraktio-
nen ub erhaupt sein k onnen. Dabei liegt das Augenmerk der Dissertation auf der
Zeitkomplexit at (oder Latenzzeit) als ma gebliche E zienzmetrik, gegeben durch
die Anzahl der Kommunikationsphasen (oder -schritte) die ein verteiltes Protokoll
ben otigt bevor es terminieren kann. Zwei wichtige Kostenfaktoren neben der
Latenzzeit sind die Ausfallsicherheit (die Anzahl der tolerierten Ausfalle) und die
Nachrichtenkomplexit at (die Anzahl der gesendeten Nachrichen) eines Protokolls.
Consensus ist h ochstwahrscheinlich das grundlegendste Problem auf dem Ge-
biet des verteilten Rechnens. Es kann wie folgt beschrieben werden: Prozesse
schlagen jeweils einen Wert vor und mussen sich auf einen der vorgeschlagenen
Werte einigen. In einem rein asynchronen System, in dem keine oberen Schranken
fur die Kommunikationszeit zwischen Prozessen existieren, ist Consensus unlosbar,
selbst wenn nur ein einziger Prozess ausfallen darf. In der praktischen Anwen-
dung sind allerdings solche Systeme meistens synchron (d.h. es gibt solche oberen
Schranken), und sie verhalten sich nur gelegentlich asynchron. Diese Beobach-
tung fuhrte zu dem Konzept des unverl asslichen Fehlerdetektors, der die hinre-
ichenden Synchronit atsbedingungen fur die L osbarkeit von Consensus erfasst und
abstrahiert.
Diese Arbeit untersucht das Consensus-Problem in asynchronen Syste-
men mit Anhalte-Ausf allen von Prozessen, und Verfugbark eit von Fehlerdetek-
toren, die auch unverl assliche Angaben ub er den Fehlerzustand von Prozessen
machen durfen. Es wird ermittelt wie schnell Consensus in F allen die in der
Praxis h au g auftreten gel ost werden kann in, z.B. in sogenannten stabilen
Ausfuhrungsinstanzen, in denen geschehene Ausf alle bereits verl asslich erkannt
worden sind, und keine weiteren Ausf alle mehr statt nden. O ene Fragen nach
der Latenzzeit von Consensus werden durch die Ergebnisse dieser Arbeit gekl art.
Neben Consensus, ist auch das Lese- und Schreibregister eine grundlegende
ivAbstraktion auf dem Gebiet des verteilten Rechnens, und erm oglicht den Prozessen
in einem verteilten System auf gemeinsame Daten zuzugreifen. Das Lese- und
Schreibregister wird oft auch, wegen seiner Relevanz als Baustein in praktischen
verteilten Speicher- und Dateisystemen, als Storage bezeichnet.
Diese Dissertation erforscht fehlertolerante Storage-Implementierungen, in
denen Daten, die von Clients gemeinsam genutzt werden, aus Grunden der
Verl asslichkeit und Hochverfugbark eit auf mehrere Storage-Server repliziert und
damit redundant gespeichert werden.
Es werden robuste Storage-Implementierungen betrachtet, die sich (a) durch
Wartefreiheit (d.h. von korrekten Clients aufgerufene Lese- und Schreiboperatio-
nen mussen stets terminieren) und (b) durch starke Konsistenzeigenschaften ausze-
ichnen, trotz der Fehlfunktion von Storage-Servern und Clients. Das untersuchte
Systemmodell erlaubt die allgemeinste Klasse von Funktionsfehlern, sogenannte
Byzantinische Fehler, ohne eine Authenti zierung der Daten anzunehmen um den
Angreifer zu begrenzen. In diesem Rahmen wird die Worst-Case Latenzzeit von
Lese- und Schreibzugri en auf ein robustes Storage untersucht und ermittelt, und
dadurch werden etliche grundlegende Komplexit atsluc ken geschlossen.
vviAcknowledgements
To begin with, I am deeply grateful to my advisor Prof. Neeraj Suri for
the huge amount of con dence he has placed in me, and for the unlimited
freedom I received for developing my own ideas and research direction. To
the thesis committee for the time spent reading and evaluating my thesis.
Special thanks goes to my co-advisor Prof. Michel Raynal who’s scienti c
writings have opened an entire new chapter in my professional life.
A big thanks goes to my close colleagues and dear friends, Marco and
Matthias, from the distributed computing \subgroup" at DEEDS. They al-
ways have o ered their availability and devoted their time and attention to
endless discussions about various research topics, and also about personal
subjects. Special thanks goes to Marco for helping me to improve most of
my writings, for the good time we had in Darmstadt, and the weeks spent
together on our fun travels to foreign countries. Also, special thanks to
Matthias who always took the time to listen to my ideas, pointing out the
bad and the boring ones, since the rst day he joined. I thank all my coau-
thor colleagues for their commitment to our joint publications, and the more
senior ones for their guidance.
I am grateful to my colleagues and friends Dinu, Marco, Matthias, Pe-
ter and Piotr with whom I’ve spent reams of pleasant lunch