
Compiler


Vm

Parser. GCC. LLVM. LL vs. LR vs. GLR.

One of the other things that listening to Bjarne Stroustrup reminded me of is an idea that I’ve had kicking around in my head for quite some time about the way programming languages evolve. I can’t exactly quote Bjarne here, but I think he said something to the effect of, “One of the reasons I give this talk is because people still think of C++ as it existed back in 1986, not as it exists today.” Which reminded me of something I read a long time ago about avalanches and the way that they work. The interesting thing about avalanches is that, should you ever be unlucky enough to be caught in one, they go through two distinct phases. In the first phase, while the avalanche is making its way down the slope, it behaves almost as if it were a liquid.

Everything is moving, and it’s theoretically possible (if you don’t get bashed in the head by a rock or tree or something, and you can tell which way is “up”) to “swim” to the top of the avalanche and sort of float/surf on top of it.

Writing a (Ruby) compiler in Ruby bottom up - step 21.

2009-11-10. This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series. I've been lazy lately... Well, not really, I've been extremely busy, but I ought to have fit this in earlier. It's gotten harder and harder to get done too, since it's now more work because I had to go back and figure out a lot of the reasons for what I'd done. Anyway, finally a new part, though short.

Down the rabbit hole: attr_(reader|writer|accessor)

Adding "attr_reader" / "attr_writer" / "attr_accessor". Should be easy, right?

Trouble is you can't know that in advance.

    class Class
      def attr_reader foo
        puts "Hah! "
      end
    end

    class Foo
      attr_reader :bar
    end

    foo = Foo.new
    p foo.bar

Ouch. It doesn't mean we can't make some assumptions, though, as long as we can handle the worst case where someone does something stupid (later we may want to add an option to make it assume you're not being stupid, and enable additional optimizations). So how do we do this then? That's the ugly part.

How to write your own compiler.

Logic of Lemmings in Compiler Innovation.

By CACM Staff. Communications of the ACM, Vol. 52 No. 5, Pages 7-9. 10.1145/1506409.1506412

I am deeply ambivalent about what I read in the contributed article "Compiler Research: The Next 50 Years" by Mary Hall et al. (Feb. 2009). On the one hand, its description of the field's challenges and opportunities evokes great excitement; on the other, the realities cast a discouraging pall on that excitement.

The practical adoption of useful research results is generally a slow process, taking up to a decade or more to achieve. In compilers, however, technology transfer has actually proceeded negatively. It has been at least four decades since the idea first emerged that, besides translating to machine code, a compiler must be able to perform a second important function: automate detection of a large class of programming errors without the need for massive test suites. The trend has now shifted toward pervasive use of scripting languages that abandon static safety altogether. Rodney M.

Writing a compiler in Ruby bottom up - step 19.

2009-05-21. This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series. If you've been following the commits to the Github repository, you've already seen this go in...

Specifically, this was the state as at the end of this post. Here's finally some explanation.

The Object Model

Every object oriented language implements its own "brand" of object model. For this compiler I want to eventually approximate the Ruby object model. The Ruby model is, however, extremely dynamic, and extremely dynamic translates to hard to compile efficiently (a post on the problems facing compilation of Ruby is upcoming). For most of this series I've ignored performance issues, but only because they've not been structural or really significant. I will leave the details of the problems with compiling the Ruby model for my later post, but let's boil it down to something very simple.

First, let's take a look at some C code. int main() { Foo test; Eww..

A Machine-Checked Model for a Java-Like Language, Virtual Machin.

G. Klein and T. Nipkow, A Machine-Checked Model for a Java-Like Language, Virtual Machine, and Compiler, ACM TOPLAS, vol. 28, no. 4, 2006. We introduce Jinja, a Java-like programming language with a formal semantics designed to exhibit core features of the Java language architecture. Jinja is a compromise between realism of the language and tractability and clarity of the formal semantics. The following aspects are formalised: a big and a small step operational semantics for Jinja and a proof of their equivalence; a type system and a definite initialization analysis; a type safety proof of the small step semantics; a virtual machine (JVM), its operational semantics and its type system; a type safety proof for the JVM; a bytecode verifier, i.e. dataflow analyzer for the JVM; a correctness proof of the bytecode verifiers w.r.t. the type system; a compiler and a proof that it preserves semantics and well-typedness.

Compiler in Ruby pt. 13.

2009-01-25. This is the 13th in a series. Please start on part 1 first if you haven't already to get the background, or see here for the full series so far.

It's been a few months, and apart from my side step into parser land I haven't had much time to keep posting this series, in part because of work, and in part because of repeated illness that, after being poked and prodded and x-rayed and being subjected to ultrasound (no, I'm not pregnant, nor is anything about to rupture) and being tapped for what feels like several litres of blood, turns out to likely 'only' be infectious mononucleosis. Luckily I've avoided the sometimes lengthy bouts of fatigue, but 4 rounds with several days of fevers and fatigue in just a few months hasn't been entertaining, and if I'm unlucky I might keep getting them occasionally for a few more months.

Fun. Anyway, enough moaning. Let's first figure out how to do it, by resorting to our trusted old method of seeing what gcc does to this: Introducing "let"

'(Montreal Scheme/Lisp User Group) / Bienvenue.

The 90 Minute Scheme to C compiler - Marc Feeley

Marc Feeley gave us another really good presentation. It was more technical than the previous ones, but it was definitely worth it. (I think this may set the tone for future presentations... we'll see!) Marc showed us how to write a simple Scheme to C compiler, in Scheme. In only 90 minutes! The presentation is available in PDF format. Also, there are AVIs for the whole presentation: Part 1 and Part 2.

Compiling to JavaScript.

Treating JavaScript as a back-end for a compiler is becoming more and more popular.

Here are some examples of compilers that already target JavaScript:

This has come up in some of the ECMA-TG1 discussions, and I think some of Edition 4 will help compiler writers, in particular proper tail calls. On the other hand, I'm not sure whether "compiling X to JavaScript" is always a priori a net gain. Some of these compilers are more toys than anything else: people always get a kick out of translating idioms between languages. That's fun, if a little boring after a while. But a lot of these tools are aiming at building abstractions for web programming, which is much more ambitious. If you can properly build an abstraction on top of the many incompatibilities and low-level details of web platforms, then the abstraction is probably appropriate.

I still think these abstractions are an important goal, but the hard part isn't the compilers.

Writing a compiler in Ruby bottom up - step 10.

2008-07-10. This is the tenth in a series. Please start on part 1 first if you haven't already to get the background, or see here for the full series so far.

Uh, yeah. So much for posting the next part in a few days. I think I'll stop trying to second guess when I'll next have time (but since I'm going off to Norway for vacation for a week, it's a safe bet the next part won't show up until later than that). For those who care (anyone? Anyone at all? Didn't think so), I've been very busy lately, both with a big new project at work that'll see us relaunching the websites for three large restaurant chains in the UK.

My spare time has all been taken up by other projects which I might say more about some other time... Anyway, back to the subject at hand... Each step so far has included minor bits and pieces to test specific features. It will not be a proper parser, but merely a very simple case of rewriting pieces of text to turn something like this: (foo "bar" 1 2 3) into something like this: Then:

Primeval C: two very early compilers.

Several years ago, Paul Vixie and Keith Bostic found a DECtape drive, attached it to a VAX, and offered to read old DECtapes. Even at the time, this was an antiquarian pursuit, and it presented an opportunity to mine beneath the raised floor of the computer room and unearth some of the DECtapes we'd stored since the early 1970s.

Gradually, I've been curating some of this, and here offer some of the artifacts. Unfortunately existing tapes lack interesting things like earliest Unix OS source, but some indicative fossils have been prepared for exhibition. information: Warren Toomey, now at Bond University, has managed to make one of the compilers (last1120c, see below) compile itself using a First/Second edition Unix emulator for the PDP-11; see his ftp-available directory. As described in the C History paper, 1972-73 were the truly formative years in the development of the C language: this is when the transition from typeless B to weakly typed C took place, mediated by the (Neanderthal?)

Call-with-current-continuation.org.

Writing a compiler in Ruby bottom up - step 9.

2008-07-10. This is the ninth in a series. Please start on part 1 first if you haven't already to get the background, or see here for the full series so far.

UPDATE: Fixed a bug in compile_while

Ok, so I know it's been far too long. Lots going on at the moment... This is a very short part, but I'll try to get the next part cleaned up in the next few days (teaser at the end...)

I like small language cores, and it really appeals to me to implement control structures as methods rather than hardcoding them into the language. There's an added complication: using lambdas that are not proper closures, which is the case for us so far, means there's no way for the body to access external variables.

So I'm biting the bullet and adding a built in "while" construct for now. As usual we start with a simple C test to see how to do this: int main() { while (foo()) { bar(); } }

gcc -S, and this is the relevant part:

        jmp .L2
    .L3:
        call bar
    .L2:
        call foo
        testl %eax, %eax
        jne .L3

Then we add it to #compile_exp:

"clang" C Language Family Frontend for LLVM.

Writing a compiler in Ruby bottom up - step 8.

2008-06-01. This is the eighth in a series. Please start on part 1 first if you haven't already to get the background, or see here for the full series so far.

Last time we looked at an improved way of handling loops etc. using anonymous functions.
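As a rough illustration of the "while" lowering quoted from step 9 above, here is a minimal sketch of an emitter that reproduces the same jump-to-condition shape as the gcc output. The class name `WhileEmitter`, its helper methods, and the choice to pass the condition and body as plain function names are assumptions made for this example; the excerpt does not show how the series' own compile_while is written.

```ruby
# Hypothetical sketch: emit x86 assembly for `while (cond()) { body(); }`
# using the same layout gcc produced in the excerpt (jump to the test first,
# body placed before the test, conditional jump back to the body).
class WhileEmitter
  attr_reader :out

  def initialize
    @seq = 0   # counter used to generate unique local labels
    @out = []  # emitted assembly lines, in order
  end

  def new_label
    @seq += 1
    ".L#{@seq}"
  end

  def emit(line)
    @out << line
  end

  # cond_fn and body_fn are names of functions to call (an assumption of
  # this sketch; a real compiler would compile arbitrary expressions here).
  def compile_while(cond_fn, body_fn)
    cond_label = new_label
    body_label = new_label
    emit "\tjmp\t#{cond_label}"   # skip the body on entry, test first
    emit "#{body_label}:"
    emit "\tcall\t#{body_fn}"
    emit "#{cond_label}:"
    emit "\tcall\t#{cond_fn}"
    emit "\ttestl\t%eax, %eax"    # is the condition's return value zero?
    emit "\tjne\t#{body_label}"   # non-zero: run the body again
  end
end

emitter = WhileEmitter.new
emitter.compile_while("foo", "bar")
puts emitter.out
```

Placing the test at the bottom means each iteration executes only one jump, which is presumably why gcc chose this layout.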

But most of that is relatively limited if there's no way of modifying variables. Sure, you can get away with recursion and not allow mutation or even other side effects at all. It might satisfy some functional programming purists, but it would not satisfy me and it would also require some fairly sophisticated optimizations to get decent performance. I like the principle of reducing side effects as much as possible, and pushing side effects up, so much so that it's one of the factors I pointed out in my post about reducing coupling through unit tests.

But I don't like sacrificing simplicity and/or performance for the sake of a "purer" approach. So let's introduce assignment, and at the same time add in basic arithmetic and comparisons.

Assignment.

Anarres.

Introduction

The C Preprocessor is an interesting standard. It appears to be derived from the de-facto behaviour of the first preprocessors, and has evolved over the years.

Implementation is therefore difficult. JCPP is a complete, compliant, standalone, pure Java implementation of the C preprocessor. It is intended to be of use to people writing C-style compilers in Java using tools like sablecc, antlr, JLex, CUP and so forth. Download The latest version is considered the development version, and bug fixes will be applied to it without an update of the version number.

anarres-cpp-bin-1.2.6.tar.gz (Binaries and javadoc only, about 120 Kb)
anarres-cpp-src-1.2.6.tar.gz (Source and full build environment, about 8,700 Kb)

Development Documentation: Generated Javadoc. Older versions. See also Text::CPP (The C preprocessor in Perl).

Atul's Mini-C Compiler.

Writing a compiler in Ruby bottom up - step 7.

This is the seventh in a series. Please start on part 1 first if you haven't already to get the background, or see here for the full series so far.

I've combined two of the planned parts this time, what was in the list from last time as parts 7 and 8.

Making use of lambda / call

We can implement loops using recursion "manually", but adding lambdas now should make it possible to create a slightly cleaner version by actually defining a "while" function: (defun while (cond body) (if (call cond ()) (do (call body ()) (while cond body) ())) ) In a way, this isn't so far from Ruby blocks, except that we don't provide the syntactic sugar that allows the blocks to be defined without any extra vocabulary, so in use our "while" function would look like this: (while (lambda () (cond)) (lambda () (body))) Not terrible, and far better than earlier, but still not very clean.
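For comparison, the same idea can be sketched in plain Ruby, where lambdas are proper closures. This is only an illustration of the recursive "while" function described above, not code from the series; the names `while_fn` and `count_up` are made up for the example.

```ruby
# A recursive "while" built from two lambdas, mirroring the
# (defun while (cond body) ...) definition in the excerpt.
def while_fn(cond, body)
  if cond.call
    body.call
    while_fn(cond, body)  # recurse instead of looping
  end
end

# Usage: because Ruby lambdas close over local variables, the condition
# and the body can both see and modify `i` and `seen`.
def count_up(limit)
  i = 0
  seen = []
  while_fn(-> { i < limit }, -> { seen << i; i += 1 })
  seen
end

p count_up(3)
```

Ruby does not guarantee tail-call elimination by default, so a sketch like this will overflow the stack for large iteration counts; that is one practical reason to prefer a built-in loop construct.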

We finally have to add support for using the arguments passed. (call (lambda (i) (code here)) (0)) Adding function arguments into this:

Bootstrapping in Langdev.

Programming Languages: Application and Interpretation by Shriram.

Functional programming - coming to a compiler near you soon?

We can classify programming languages into a simple taxonomy: Commercial programmers have overwhelmingly developed software using imperative languages, with a strong shift from procedural languages to object oriented languages over time.

While declarative style programming has had some successes (most notably SQL), functional programming (FP) has been traditionally seen as a play-thing for academics. FP is defined in Wikipedia as: A programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. Whereas an imperative language allows you to specify a sequence of actions (‘do this, do that’), a functional language is written in terms of functions that transform data from one form to another. There is no explicit flow of control in a functional language.

In an imperative language variables generally refer to an address in memory, the contents of which can change (i.e. is ‘mutable’). y = f(x) + f(x); can always be rewritten as: z = f(x);

Writing a compiler in Ruby bottom up - step 3.