A practical framework for adaptive metaheuristics [Elektronische Ressource] / von Klaus E. Varrentrapp

technischen_universitat_darmstadt - Klaus E. Varrentrapp

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

379 pages

Deutsch

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

A propos
Informations
Extrait

Description

Sujets

Informatik

Informations

Publié par	technischen_universitat_darmstadt
Publié le	01 janvier 2005
Nombre de lectures	21
Langue	Deutsch
Poids de l'ouvrage	3 Mo

Extrait

A Practical Framework for
Adaptive Metaheuristics
Vom Fachbereich Informatik
der Technischen Universitat? Darmstadt
genehmigte
Dissertationsschrift
zur Erlangung des akademischen Grades eines
Doktor-Ingenieurs (Dr.-Ing.)
von
Diplom-Informatiker
Klaus E. Varrentrapp
aus Frankfurt am Main
Referent: Prof.Dr.Wolfgang Bibel
Korreferent: Prof.Dr.Johannes Fur? nkranz
Tag der Einreichung: 15. Februar 2005
Tag der mu?ndlichen Pru?fung: 30. Mai 2005
Darmstadt 2005
Hochschulkennziﬀer D17Fur meine Oma, Mutter und Tante Doris?Danksagung
Eine Promotion abzuschließen ist ohne Hilfe und Unterstutzung kaum moglich. So auch in meinem? ?
Falle. AusdiesemGrundemoc? hteichmichbeiallen,diemiraufdieeineoderandereWeisegeholfen
haben, bedanken.
Insbesondere moc? hte ich meinem Doktorvater Prof. Dr. Wolfgang Bibel danken. Er hat mir diese
Promotion u?berhaupt erst ermogli? cht. Zunac? hst durch sein Vertrauen in mich bei meiner Ein-
stellung als Mitarbeiter und der Annahme als Doktorand. Daruber hinaus danke ich ihm fur die? ?
viele Geduld und Zeit, die er in die Betreuung meiner Arbeit gesteckt hat und die Aufmunterung
wahrend dieser Zeit.?
Des Weiteren danke ich Dr. Thomas Stut? zle, an den ich mich jederzeit mit Fragen und Problemen
wenden durfte und der mir immer weiterhelfen konnte.
Maria Tiedemann danke ich fur? ihre Gesprac? he und ihre Unterstut? zung in jeglichen Verwaltungs-
fragen.
Mein Dank gilt auch Dr. Mauro Birattari und Dr. Gunter Grieser fur die Hilfe zu Themen des?
maschinellen Lernens, Sergey Yevtushenko fur? seine Tipps bzgl. C++ und Tommaso Schiavinotto
ebenfalls fur seine Hilfe bei C++-Problemen und anderen Problemen mit meinem Computer.?
Allen bisher Genannten sowie Marco Chiarandini, Luis Paquete und Dr. Ulrich Scholz danke ich
fur viele große und kleine Diskussionen, Tipps und Hinweise und fur die tolle Arbeitsatmosphare.? ? ?
Stefan Pfetzing, Ben Herman und Marcus Assion danke ich dafur,? dass meine Rechner immer
funktionierten und ich uberhaupt erst arbeiten konnte.?
Patrick Duchstein, Oliver Korb, Jens Gimmler, Christian Bang, Alexander Dotor und Michael
Achenbachm?ochteichfur? dieZusammen-undMitarbeitamGAILSFrameworkdanken. Ohneihre
Hilfe hatte ich den enormen Implementierungsaufwand nicht bewaltigen konnen. Jurgen Henge-? ? ? ?
Ernst danke ich fur? seine Mitarbeit am Testbed in Vorbereitung auf meine Dissertation.
Weiterer Dank gebuhrt Ulrike Hißen fur ihre Geduld mit mir bei meinen vielen Fragen und ihre? ?
Anleitung durch den Dschungel der Formalitate? n.
Christa Turudic und Prof. Dr. Alejandro P. Buchmann danke ich ebenso wie Prof. Dr. Wolfgang
Bibel fur? die Mogli? chkeit, meine Promotion durch Weiterbeschaft? ung an der TU Darmstadt ab-
schließen zu konnen.?
Prof.Dr.JohannesFurnkranzdankeichfurseineKorrekturvorschlageundseinInteresseanmeiner? ? ?
Dissertation.
Sabrina Englert danke ich fur die Unterstutzung und Aufmunterung in zweiﬂerischen Zeiten und? ?
die Ermutigung, eine Promotion zu wagen.
DemCenterofScientiﬁcComputingandderUniversitatFrankfurtundinsbesonderederenAdmin-?istratoren danke ich fur die Moglichkeit, dort umfangreiche Experimente rechnen zu lassen.? ?
Prof. Dr. Thomas Hoﬀmann danke ich fur? die Mogli? chkeit, die Raum? lichkeiten und Ressourcen
seines Fachgebietes nutzen zu durfen.?
Ich danke daneben meinen Freunden, die mich immer unterstut? zt haben und mich jederzeit in
meinem Vorhaben bestarkt und an mich geglaubt haben.?
SchließlichgiltmeintieferDankmeinerFamilie, ohnedieichesnichtsoweithatt? ebringenkon? nen.
IchdankemeinenElternRenateundHorstK?ohlerundmeinenPatenDorisundHeinrichNaumann
fur ihre uneingeschrankte Liebe und Unterstutzung und ihr Grundvertrauen in mich. Diese Arbeit? ? ?
ist ihnen gewidmet.Extended Abstract
Local search methods are useful tools for tackling hard problems such as many combinatorial
optimization problems (COP). Local search methods work by probing (in contrast to enumerating)
the search space. So-called trajectory-based local search methods start from some initial solution
and iteratively replace the current solution by a neighboring one which diﬀers only in some local
changes. Thepossiblechangestothecurrentsolutionaretypicallydeﬁnedbylocalsearchoperators
andpotentiallyarerandomized. Thelong-termgoalistoﬁndaglobaloraverygoodlocaloptimum.
Experience from mathematics has shown that exploiting regularities in problem solving is beneﬁ-
cial. Consequently, identifying and exploiting regularities in the context of local search methods
is deemed to be desirable, too. Due to the complexity of the COPs tackled, regularities might
better be detected and learned automatically. This can be achieved by means of machine learning
techniques to extend existing local search methods. Learning requires feedback, but in the context
oflocalsearchmethods, instructivefeedbackisnotavailable, sinceglobalorverygoodlocaloptima
are not known in advance. Instead, evaluative feedback can be derived from the cost function of
COPs evaluating single solutions, for example.
Reinforcement learning (RL) is a machine learning technique that only needs evaluative feedback.
Itisbasedonthenotionofanagentthatmovesfromstatetostatebyapplyingactionsandreceiving
a scalar evaluative feedback for each move called reward. The rewards of an agent’s interaction in
the form of sequences of actions and moves from one state to the next are accumulated in a sum of
discountedrewardscalledreturn. Maximizingthereturnisthelong-termgoalofanagent. Inorder
to achieve this goal, a particular RL method called Q-learning established a so-called action-value
function. Action-valuefunctionsevaluateineachstatetheapplicableactionsintermsofhowuseful
theirapplicationinthisstatewouldbeinmaximizingthereturn. Duetothesizeofthestatespaces
tackled, action-value functions are represented by function approximators that work on a selection
of predictive real-valued state characteristics called features. Solving a RL problem can be done
by learning a useful action-value function during an agent’s interaction and next deriving a search
strategy called policy by applying in each state the best-valued action.
The present thesis attempts to develop learning local search methods in a general and practical
manner. One possibility to enhance local search methods with learning capabilities is by using
RL methods. RL techniques can be applied to Markov decision processes (MDP). The direct
application of existing RL techniques for extending existing local search methods is enabled by the
concept of a local search agent (LSA). The advancement of a trajectory-based local search method
can be regarded as the interaction of a virtual agent whose states basically consist of solutions
and whose actions are composed of arbitrary hierarchical compositions of local search operators,
altogether yielding the same setting as an MDP. The resulting LSA using RL can then be called a
learning LSA. The changes in cost for each move of a learning LSA can be used as reward. Based
on these, returns can be computed such that maximizing the return reﬂects the goal of ﬁnding
a global or a very good local optimum. The hierarchical structure of LSA actions allows to use
so-called ILS-actions. ILS-actions coincide with the application of one iteration of the well-knownIterated Local Search (ILS) metaheuristic. The advantage of this metaheuristic and this kind of
action is that only solutions from the subset of local optima – which must contain any acceptable
solution – are considered and thus introduces a search space abstraction which in turn can improve
performance. AlearningLSAthatemploysILS-actionsiterativelywillvisitlocaloptimainaguided
andadaptivemanner. TheresultingtheoreticalframeworkiscalledGuidedAdaptiveIteratedLocal
Search (GAILS).
In order to evaluate randomized GAILS algorithms, empirical experiments have to be conducted.
Each GAILS algorithm thereby consists of three, mainly independent parts. The ﬁrst part com-
prises the actions of a learning LSA which are speciﬁc to a problem type. The LSA actions being
arbitrary hierarchical compositions of local search operators are implemented through basic local
search operators. The second part represents the RL techniques used, which in turn transparently
useactionsandhenceareproblemtypeindependent. Thethirdpartconsistsofthefunctionapprox-
imators used by RL techniques to implement policies. The function approximators only require
as input a vector of real-valued features and this way are independent from the ﬁrst two parts.
Empirical experiments can be supported by providing a framework that can decouple these three
mainpartsinanyGAILSalgorithmprograminstantiation,thusallowingforanarbitraryreuseand
combination enabling rapid prototyping. The GAILS implementation framework is such an appli-
cation framework which is designed to rapidly implement learning LSAs reﬂecting the separation
of a learning LSA into its three main parts. It provides generic interfaces between components of
the three parts and this way provides for a separation of problem type speciﬁc states from search
control. It also provides for a separation of search control from the state of the search control unit.
Hierarchically built actions are mapped to object hierarchies.
Two GAILS algorithms according to Q-learning algorithm Q(0) and Q(‚) that are based on ILS-
actions were developed, built and compared to corresponding standar