Want to Write a Compiler? Just Read These Two Papers (2008)

(prog21.dadgum.com)

94 points | by downbad_ 1 hour ago

18 comments

jll29 36 minutes ago
*Donald Knute -> Donald Ervin Knuth is the author of the book "The Art of Computer Programming" (in progress for a couple of decades, currently volume 4c is being written). It is quite advanced, and it will likely not cover compilers anymore (Addison-Wesley had commissioned a compiler book from Knuth when he was a doctoral candidate, now he is retired and has stated his goal for the series has changed).
I disagree with the author's point: the "Dragon book"'s ("Compilers: Principles, Techniques, and Tools" by Aho et al.) Chapter 2 is a self-sufficient introduction into compilers from end to end, and it can be read on its own, ignoring the rest of the excellent book.
Another fantastic intro to compiler writing is the short little book "Compilers" by Niklaus Wirth, which explains and contains the surprisingly short source code of a complete compiler (the whole book is highly understandable - pristine clarity, really) and all in <100 pages total (99).
(I learned enough from these two sources to write a compiler in high school.)
omcnoe 20 minutes ago
Been working on a toy compiler for fun recently.
I have ignored all the stuff about parsing theory, parser generators, custom DSL's, formal grammers etc. and instead have just been using the wonderful Megaparsec parser combinator library. I can easily follow the parsing logic, it's unambiguous (only one successful parse is possible, even if it might not be what you intended), it's easy to compose and re-use parser functions (was particularly helpful for whitespace sensitive parsing/line-fold handling), and it removes the tedious lexer/parser split you get with traditional parsing approaches.
GCUMstlyHarmls 1 hour ago
Nanopass paper seems to be dead but can be found here at least https://stanleymiracle.github.io/blogs/compiler/docs/extra/n...
[-]
- kunley 58 minutes ago
  Also:
  https://www.cambridge.org/core/journals/journal-of-functiona...
soegaard 18 minutes ago
An Incremental Approach to Compiler Construction
Abdulaziz Ghuloum
http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf
Abstract
Compilers are perceived to be magical artifacts, carefully crafted by the wizards, and unfathomable by the mere mortals. Books on compilers are better described as wizard-talk: written by and for a clique of all-knowing practitioners. Real-life compilers are too complex to serve as an educational tool. And the gap between real-life compilers and the educational toy compilers is too wide. The novice compiler writer stands puzzled facing an impenetrable barrier, “better write an interpreter instead.”
The goal of this paper is to break that barrier. We show that building a compiler can be as easy as building an interpreter. The compiler we construct accepts a large subset of the Scheme programming language and produces assembly code for the Intel-x86 architecture, the dominant architecture of personal computing. The development of the compiler is broken into many small incremental steps. Every step yields a fully working compiler for a progressively expanding subset of Scheme. Every compiler step produces real assembly code that can be assembled then executed directly by the hardware. We assume that the reader is familiar with the basic computer architecture: its components and execution model. Detailed knowledge of the Intel-x86 architecture is not required.
The development of the compiler is described in detail in an extended tutorial. Supporting material for the tutorial such as an automated testing facility coupled with a comprehensive test suite are provided with the tutorial. It is our hope that current and future implementors of Scheme ﬁnd in this paper the motivation for developing high-performance compilers and the means for achieving that goal.
morphle 18 minutes ago
Compiler writing has progressed a lot. Notably in meta compilers [1] written in a few hundred lines of code and adaptive compilation [3] and just in time compilers.
[1] Ometa https://tinlizzie.org/VPRIPapers/tr2007003_ometa.pdf
[2] Other ometa papers https://tinlizzie.org/IA/index.php/Papers_from_Viewpoints_Re...
[3] Adaptive compilation https://youtu.be/CfYnzVxdwZE?t=4575
armchairhacker 1 hour ago
Nowadays I’ve heard recommended Crafting Interpreters. (https://craftinginterpreters.com)
The Nanopass paper link doesn’t work.
[-]
- tmountain 24 minutes ago
  Incredible book for self guided learning!
- ramon156 31 minutes ago
  Awesome course! finished it while i was doing my final CS year because I had to wait on a bunch of classes (and frankly had no one to talk to before classes). I haven't tried nanopass, but there's other links that work, so I'll give it a go.
itsmemattchung 1 hour ago
It's been about 4 years since I took a compilers course (from OMSCS, graduate program) and still shutter ... it was, hands down, the most difficult (yet rewarding) classes I've taken.
[-]
- tjarjoura 14 minutes ago
  What did you find more painful about compilers than other forms of programming?
- nirvdrum 1 hour ago
  Based on another reply I can’t tell if there’s a clever window-based pun that I’m missing. If not, I think you want “shudder” and not “shutter” here. I’m sorry if I just ruined the joke.
- msla 1 hour ago
  It made me drink myself Venetian blind.
- Pay08 1 hour ago
  I would like to agree with this comment, but I certainly didn't find it rewarding. It was pure pain.
  [-]
  - kuboble 44 minutes ago
    10 years ago I took few coursera courses to fill the gaps in my computer science education.
    One of them was a compilers course done by karpathy. It was pure joy and a great learning experience.
    Also in my experience the joy of doing a course was much stronger correlated with the teacher's qualities rather than the subject itself.
fzeindl 45 minutes ago
A similarly scoped book series is „AI game programming wisdom“, which contains a multitude of chapters that focus on diverse, individual algorithms that can be practically used in games for a variety of usecases.
bradley13 32 minutes ago
Maybe I'm missing the point of this article? Writing a simple compiler is not difficult. It's not something for beginners, but towards the end of a serious CS degree program it is absolutely do-able. Parsing, transforming into some lower-level representation, even optimizations - it's all fun really not that difficult. I still have my copy of the "Dragon Book", which is where I originally learned about this stuff.
In fact, inventing new programming languages and writing compilers for them used to be so much of a trend that people created YACC (Yet Another Compiler Compiler) to make it easier.
LiamPowell 35 minutes ago
See also, Andy Keep's dissertation [1] and his talk at Clojure/Conj 2013 [2].
I think that the nanopass architecture is especially well suited for compilers implemented by LLMs as they're excellent at performing small and well defined pieces of work. I'd love to see Anthropic try their C compiler experiment again but with a Nanopass framework to build on.
I've recently been looking in to adding Nanopass support to Langkit, which would allow for writing a Nanopass compiler in Ada, Java, Python, or a few other languages [3].
[1]: https://andykeep.com/pubs/dissertation.pdf
[2]: https://www.youtube.com/watch?v=Os7FE3J-U5Q
[3]: https://github.com/AdaCore/langkit/issues/668
petcat 1 hour ago
And Nystrom's book
[-]
- vlaaad 6 minutes ago
  Yeah, I really enjoyed Crafting Interpreters, wholeheartedly recommend!
krtkush 1 hour ago
I wonder if it makes sense to do the nand2tetris course for an absolute beginner since it too has compiler creation in it.
[-]
- wartywhoa23 21 minutes ago
  I highly recommend nand2tetris to everyone. For me, nothing ever explained the whole domain from logic gates and inner workings of a CPU to compilers better than this course.
  [-]
  - wartywhoa23 19 minutes ago
    On a side note, why is imrozim's comment dead? What in the world is wrong with it? It's perfectly fine IMO.
- imrozim 54 minutes ago
  [dead]
downbad_ 1 hour ago
https://news.ycombinator.com/item?id=10786842
msla 1 hour ago
fanf2 on Dec 25, 2015 [dead] | parent | prev | next [–]
I quite like "understanding and writing compilers" by Richard Bornat - written in the 1970s using BCPL as the implementation language, so rather old-fashioned, but it gives a friendly gentle overview of how to do it, without excessive quantities of parsing theory.
[-]
- Tor3 17 minutes ago
  I like that book too, I bought it many decades ago and learned enough to write a transpiler for converting Fortran source code to Turbo Pascal.
- msla 1 hour ago
  https://news.ycombinator.com/item?id=10791521
zionpi 29 minutes ago
[dead]
consomida 1 hour ago
[dead]
imrozim 55 minutes ago
[dead]
rdevilla 54 minutes ago
Almost 20 years old. This knowledge, this entire field of study, is obsolete in the AI era. What's next, handwritten XHTML?
[-]
- KnuthIsGod 48 minutes ago
  Some of us enjoy intellectual challenges....
  The Bornat book looks great.
  The fact that it is in BCPL is ultra cool.
  [-]
  - rdevilla 42 minutes ago
    Liking intellectual challenges on Hacker News is very 2008. It's 2026, the AI will write a compiler in 5 minutes, no headache required.
    If you're still playing with Rubik's cubes you're going to get left behind.
    [-]
    - guitarlimeo 18 minutes ago
      > you're going to get left behind.
      What's wrong with that? Why do you fear getting left behind? This is just fearmongering.
      Mind you these are legitimate interests of people and in most of cases probably not related to professional work.
      And lol @ "It's 2026, the AI will write a compiler in 5 minutes, no headache required.", no it will not, have you seen Anthropic's post about Claude writing the "C compiler"?
      [-]
      - weavie 10 minutes ago
        I think this chap is just joking.
- noosphr 34 minutes ago
  lol