DIY a compiler...

DIY a compiler...


435 pages
Le téléchargement nécessite un accès à la bibliothèque YouScribe
Tout savoir sur nos offres


and Writing



Publié par
Ajouté le 02 septembre 2011
Nombre de lectures 35
Langue English
Signaler un problème
Understanding and Writing Compilers A do-it-yourself guide Richard Bornat Middlesex University, London. First published|1979. Internet edition|2007; corrected 2008. Copyright c 1979, 2007, 2008 Richard Bornat. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the copyright notice above and the title of the book appear, and notice is given that copying is by permission of Richard Bornat. Original edition published by Macmillan Press Ltd., Basingstoke, UK. Internet by Richard Bornat, 28 Albany Road, LONDON N4 4RL, UK. Preface to the online edition I wrote this book on compiling in the late 1970s. It was a success. I still meet people who learnt about from it. I get three or four requests a year from people who’d like a copy. I used to tell them to use, but now there aren’t any copies even there. Since I own the copyright (thanks to Macmillan), I can publish it again, for free. For a while I tried to reproduce the original book from the original nro source, but Unix version 6 nro is long dead, and the printer-driver hacks that made Abold and underlining work are even deader. So I hacked it into LT X, and hereE it is. I xed the errors that I knew about, changed a very few infelicities, and otherwise tried to change as little as possible. Why isn’t it in C? When I began to write this book I was heavily into BCPL. By the time it was nished I was a C programmer. The display editor ded that I developed to write the book was written in C. C is in many respects a direct descendant of BCPL. I was pressed, even before the book came out, to rewrite all the examples in C. I didn’t do that for various reasons, some good and some bad. It would be an enormous e ort to do it now, because C is now so far from its origins. Old-fashioned opinions I agree with quite a lot of this book, but by no means all of it. I’ve decided to change nothing. For better or worse, here it is. i ii Preface In the past compiler writers and designers seemed to form an elite group within computing science, set apart by their esoteric knowledge and their ability to produce large, important system programs which really worked. The admiration of the computing public, whether it was once deserved or not, is no longer merited now that the principles of programming-language implementation are so well understood. Compiler-writing is no longer a mystery. This book attempts to explain and demystify the principles of compiler writing so that you can go out and build a working compiler of your own. There is enough detail in this book for you to build a for quite a complicated language { certainly PASCAL, perhaps ALGOL 68 or SIMULA 67 { but it doesn’t attempt an encyclopaedic coverage of the eld. It is intended more as an introduction to compiler-writing and a do-it-yourself kit for the compiler-writer, giving enough detail for you to understand the principles of the subject, than as a survey of past history or present horizons. The of interpretation are close enough to those of compilation for chapter 19 to give a simple introduction to interpreter writing. The method of treatment and the relative amount of attention given to various topics in this book re ects my own views about the relative importance of those topics. There is a separate section on run-time support, less attention is paid than is perhaps usual to the topic of parsing or syntax analysis and the discussion of translation is totally oriented to tree-walking. I have presented the subject in this way for both practical and educational reasons. First, the object code instruction sequences which implement run-time support are more important in practice than is usually recognised. It is di erences in run-time mechanisms, as much as or more than anything else, which distinguish one language from another { say SIMULA 67 from ALGOL 68, POP-2 from ALGOL 60 { and the e ciency of run-time support code fragments is crucial to the e ciency of the object program. Second, I believe it is more important to give a practical description of syntax analysis in a book which is intended for the practical compiler-writer than to give a more formal and complete introduction to the topic. The syntax analysis mechanisms chosen for illustration in section IV] are selected for their practical relevance. Of the three mechanisms presented, the ‘one-track’ and ‘operator-precedence’ mechanisms are now rather old-fashioned but are still quite adequate to the task of parsing popular modern languages. iii iv Finally, my insistence on tree-walking as the best means of translation is both because it makes explanation of translation algorithms much easier and enables me to bring out the topic of ‘crucial code fragments’ which forms so much of the life of the professional compiler writer; also in my experience it is a practical way in which both novice and expert can quickly build a working translator containing the minimum number of errors. Throughout the book I have emphasised the fact that the task of compilation can be divided into separate modular sub-tasks. It is largely the identi cation of and emphasis on this essential modularity that has clari ed the subject. Emphasis on modular design also helps me to avoid discussing every known technique for each of the tasks { if you know what the task is, one or two good ways to accomplish it and how to recognise another good way if somebody shows you one, then you are on the way to becoming a capable compiler writer. Throughout the book I have emphasised the need for the compiler to provide a service to its users. It seems to me that the demands of system or compiler e ciency are too often given precedence over the justi able demands of the user who wants understandable error reports, accurate and reliable object code or strict adherence to an industry standard. The same goes, I believe, for the demands of small compiler size or simplicity of construction. A good compiler can be acceptably e cient and reasonably small yet still provide a good user service. Indeed I believe that the well-designed compiler will out-perform the ‘e cient’ special-purpose construction, but even if it isn’t so the compiler writer should stand by the principle that machines are provided to save human time and e ort. A few seconds of machine time which saves minutes (or more often hours) of human time is machine time well spent! Host, Object and Source Language in Examples In the examples throughout the book most algorithms are written in a version of the language BCPL, details of which are brie y explained in appendix A. Some example algorithms are in PASCAL: I would have used PASCAL more extensively were it not for the fact that its lack of block structure, lack of conditional expressions and lack of a simple ‘union-type’ convention forces an obscure programming style. BCPL’s advantage is that it is untyped and that therefore the examples can show the bare bones of an algorithm without too much unnecessary hedging round of type declarations. At the same time this is a drawback: BCPL’s lack of data structure declarations makes it di cult to explain some fairly simple algorithms. Early examples give some explanation of the data structures which are manipulated by the algorithms: it is worth pointing out that in every case the values which are manipulated are pointers to record-structures rather than the structures themselves. Appendix B explains the assembly code which is used to illustrate the oper- ation of the translation algorithms in sections II and III. It is the code of a single-address, multi-register single-segment-addressing machine. Throughout v the book there is an emphasis on the machine-independence of compiler design and the fact that details of the object machine’s instruction set don’t a ect the design of the compiler. Nevertheless it is useful, when discussing translation al- gorithms, to illustrate the code of an example object machine in order to show the advantages of good design of code fragments. With the present explosion in the use of microprocessors interest in compil- ing has re-emerged, particularly interest in compiling system-programming lan- guages. The problems of compiler-writing for small machines are mainly to do with producing compact object code: the examples presented in this book are not directly oriented towards this purpose but may be readily adapted to it. The desire to produce a very small compiler, as opposed to a small object pro- gram, should be curbed until you properly understand the principles of compiler design (when perhaps the desire will go away!) The source language which is used to illustrate syntax analysis and translation algorithms has a variable syntax and semantics, oating somewhere in the space between BCPL, ALGOL 60, ALGOL 68 and PASCAL. Some attention is given to the detailed di culties which arise in writing a compiler for each of these languages. I didn’t wish to limit the discussion to those features displayed by any particular language, however, nor to get bogged down in the details of that language. Most examples therefore relate to the general di culties of compiling any language which includes recursive procedures, block-structuring, dynamic arrays, pointer variables, and so on. Acknowledgements I’m enormously indebted to all the people who have argued with me about compilers and compiling over the years, lectured to me, listened to me, corrected me and generally helped me develop the ideas I present in this book. Many of the good ideas are those of my mentors. Thanks especially to Je Rohl for helping me in my rst steps and encouraging me to develop my ideas as far as they have gone: thanks also to Bernard Sufrin for so many invaluable discussions about the best way to build and to explain compilers. Thanks to my colleagues at the University of Essex { Bruce Anderson, Mike Brady, Tony Brooker, Mike Foster, Pete Gardner, Pat Hayes, John Laski, Bob Wielinga and all the students who listened to earlier versions of this book in the form of lectures. This book was type-set on a DEC PDP-11/40 and printed using a Diablo printer. Thanks to the denizens of the Queen Mary College Computer Systems Labo- ratory for allowing me to use the machine so heavily and especially to George Coulouris, Jon Rowson, Ben Salama and Harold Thimbleby for leading me through the intricacies of the local software. vi Contents Preface to the online edition i Preface iii I Modular Organisation of Compilers 1 1 Phases and Passes 5 1.1 Tasks and Sub-tasks . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2 Translation and Optimisation . . . . . . . . . . . . . . . . . . . . 9 1.3 Object Descriptions in the Symbol Table . . . . . . . . . . . . . . 10 1.4 Run-time Support . . . . . . . . . . . . . . . . . . . . . . . . . . 11 1.5 Source Program Errors . . . . . . . . . . . . . . . . . . . . . . . . 12 1.6 Two-pass Compilation . . . . . . . . . . . . . . . . . . . . . . . . 14 1.7 An Example of Compilation . . . . . . . . . . . . . . . . . . . . . 17 2 Introduction to Translation 19 2.1 Phrases and Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.2 Tree Walking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.3 Linear Tree Representations . . . . . . . . . . . . . . . . . . . . . 25 2.4 Improving the Tree Walker . . . . . . . . . . . . . . . . . . . . . 27 2.5 Using the Symbol Table Descriptors . . . . . . . . . . . . . . . . 32 2.6 Translation Error Handling . . . . . . . . . . . . . . . . . . . . . 32 3 Introduction to Syntax Analysis 37 3.1 Language Descriptions (Grammars) . . . . . . . . . . . . . . . . . 39 3.2 Bottom-up Analysis of Expressions . . . . . . . . . . . . . . . . . 41 3.3 Top-down of Statements . . . . . . . . . . . . . . . . . . 47 vii