How do they compile a compiler?

Question:

2013-10-27 00:19:51 UTC

We have several compilers available for C, C++, assembly and other programming languages, like Borland C, Turbo C, Microsoft's C compiler and others. They all have .exe files for the Windows platform, like tcc.exe, bcc32.exe etc. So how did they compile and link the compiler when they didn't have a compiler? How was the first compiler compiled? How did they make those exe files without a compiler? I hope you guys understand what I am trying to ask. I asked this question to my teacher at school. She ignored me. :\ Please don't tell me I am a retard. :P I am a kid and I am learning C. I don't know much about programming. I am just curious. :)

Four answers:

Ratchetr

2013-10-27 01:06:17 UTC

It's a good question. You are not a retard. Your teacher...I'm not so sure about....

What you are asking about is called bootstrapping. See links.

It is a classic problem: How do you compile a compiler for language X, if the compiler is written in language X? You can't. There is no compiler. So you have to write the first compiler for language X in language Y. Once you have that, then you can compile language X with the compiler built using language Y, and you don't need language Y anymore.

But where did it all start? See "The chicken and egg problem" in link. If you follow the chicken and the egg back far enough, you will end up at a bunch of nerds entering programs into computers using mechanical devices, such as punched cards or toggle switches or by flipping wires around. But that manual labor was all done 50 or 60 years ago. Today it is just a software problem.

husoski

2013-10-27 01:51:27 UTC

It's a good question, and there are multiple answers.

The general name for the process is "bootstrapping" and it usually starts with a subset of the language, with just enough in it to write a compiler for that subset.

Suppose you have language X to implement, and you'd like to use language X to write the compiler. Define that subset and call it X0.

The first step is the hardest. You could write a compiler (or interpreter) for X0 in assembler, and then rewrite that compiler for X0 in the X0 language itself. The assembler-based compiler (when debugged) is used to compile the X0 compiler written in X0. When that is debugged, you have the first version of an X compiler written in X.

That's also one "bootstrapping step". Next, extend the X0 compiler, still using the X0 subset language, to support an improved version of the language (X1, say). One important test will be to verify that the X0 compiler and the new X1 compiler both compile the X1 compiler's source and produce equivalent results. Now you have an X1 compiler and can use X1 features for future refinements.

That's another "bootstrapping step", and you can continue in the same way to get the full language implemented.
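A toy sketch of those steps in Python, under loud assumptions: the language "X0" here is just Python with `fn` in place of `def`, a "compiler" is only a function that turns X0 text into Python text, and `seed_compile`, `load`, `X0_COMPILER_SRC` and `X1_COMPILER_SRC` are names made up for illustration. The seed function stands in for the assembler-written compiler. None of this resembles a real compiler; it only shows the order of the steps and the equivalence check.

```python
# Toy model (all names hypothetical): X0 is Python with "fn" instead of "def".
# A "compiler" is a function that turns X0 text into Python text.

def seed_compile(src):
    """Stage 0: stands in for the first compiler, hand-written in assembler."""
    out = []
    for line in src.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("fn "):
            indent = line[: len(line) - len(stripped)]
            line = indent + "def " + stripped[3:]
        out.append(line)
    return "\n".join(out) + "\n"

# The X0 compiler, written in X0. The keyword is spelled "f" + "n " so that
# compiling this text does not rewrite its own string literal.
X0_COMPILER_SRC = '''\
fn compile_src(src):
    return src.replace("f" + "n ", "def ")
'''

# An X1 compiler, still written in the X0 subset. X1 adds "ret" for "return".
X1_COMPILER_SRC = '''\
fn compile_src(src):
    src = src.replace("f" + "n ", "def ")
    return src.replace("re" + "t ", "return ")
'''

def load(python_text):
    """Stand-in for running a compiled binary: returns its compile_src."""
    namespace = {}
    exec(python_text, namespace)
    return namespace["compile_src"]

# Step 1: the seed builds the X0 compiler from its X0 source.
x0 = load(seed_compile(X0_COMPILER_SRC))
# Self-check: the X0 compiler rebuilds itself and the output is unchanged.
x0_rebuilt = x0(X0_COMPILER_SRC)
print(x0_rebuilt == seed_compile(X0_COMPILER_SRC))

# Step 2: the X0 compiler builds the X1 compiler, then X1 rebuilds itself.
x1_by_x0 = x0(X1_COMPILER_SRC)
x1 = load(x1_by_x0)
x1_by_x1 = x1(X1_COMPILER_SRC)
print(x1_by_x0 == x1_by_x1)  # the equivalence test described above

# The X1 compiler now accepts the new keyword.
print(x1("fn double(n):\n    ret n * 2\n"))
```

Once `x1` exists, the seed is no longer needed, which is the whole point of the exercise. Real compilers commonly do the same kind of check by comparing the output of successive build stages.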

Assembly is not the only option for the first step. Some languages have been "hand compiled" from the first source version. This was done (I think) with McKeeman/Horning/Wortman's XPL programming language, used to teach compiler construction techniques back when I was learning this stuff. (_A Compiler Generator_ was the text. See: http://www.cs.toronto.edu/XPL/)

Moving to another machine and/or OS can be done as a series of bootstrap steps, too. Usually, only the code generation modules are affected by this, at least at first. Write a modified version of the X compiler that produces code for the other CPU/OS. When that is debugged and produces programs that run on the other system, you can use it to compile the source for the new compiler into a version of the compiler that will RUN on the new system.
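To make the "only the code generation modules are affected" point concrete, here is a minimal sketch. Everything in it is assumed for illustration: the input is a postfix expression such as `2 3 + 4 *`, and the two targets (`old_cpu`, `new_cpu`) and their mnemonics are invented. The front end is shared; retargeting means adding one more table.

```python
def front_end(source):
    """Shared front end: turn '2 3 + 4 *' into target-independent operations."""
    ops = []
    for tok in source.split():
        if tok.isdigit():
            ops.append(("push", int(tok)))
        elif tok == "+":
            ops.append(("add", None))
        elif tok == "*":
            ops.append(("mul", None))
        else:
            raise ValueError(f"unexpected token: {tok}")
    return ops

# Two invented targets. Only these tables differ between them.
BACK_ENDS = {
    "old_cpu": {"push": "PUSH {}", "add": "ADD", "mul": "MUL"},
    "new_cpu": {"push": "ldi {}", "add": "sum", "mul": "prod"},
}

def compile_for(source, target):
    """Run the shared front end, then the chosen code generator."""
    table = BACK_ENDS[target]
    return [table[op].format(arg) for op, arg in front_end(source)]

print(compile_for("2 3 + 4 *", "old_cpu"))
print(compile_for("2 3 + 4 *", "new_cpu"))
```

A compiler running on the old system with the `new_cpu` back end is a cross compiler. Feeding it its own source is what produces a compiler that runs on the new system.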

sparky_dy

2013-10-27 01:09:46 UTC

When you have a whole new machine architecture, you write (in assembler) a cut-down C interpreter. It only has to understand enough of the language to run the source code of the compiler as an interpreted script, and it doesn't even have to be particularly fast, as it is only going to be run once. Then you start the interpreter running the C compiler's source code, and you use that interpreted compiler to compile the source code for the real compiler, which you will then use for the rest of the system that you are building.

Lesus

2013-10-27 00:26:28 UTC

The first compilers were written in assembler. The first assemblers were written in machine language. When a new processor is developed, usually cross assemblers and compilers are written, which run on an existing system and produce code for the new system.
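For a feel of what an assembler does, here is a small sketch of one for a made-up 8-bit machine. The opcodes, the mnemonics and the `assemble` function are all assumptions for illustration, not any real instruction set. Since it runs in Python on an existing system and emits bytes for a different machine, it is also a tiny example of a cross assembler.

```python
# Hypothetical machine: each instruction is one opcode byte, optionally
# followed by one operand byte. All opcode values are invented.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}
TAKES_OPERAND = {"LOAD", "ADD", "STORE"}

def assemble(text):
    """Translate mnemonic lines into machine-code bytes."""
    out = bytearray()
    for raw in text.splitlines():
        line = raw.split(";")[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        expected = 2 if mnemonic in TAKES_OPERAND else 1
        if len(parts) != expected:
            raise ValueError(f"wrong operand count: {line}")
        out.append(OPCODES[mnemonic])
        if mnemonic in TAKES_OPERAND:
            out.append(int(parts[1], 0) & 0xFF)  # accepts 5 or 0x10
    return bytes(out)

program = """
LOAD 5        ; put 5 in the accumulator
ADD 7         ; add 7 to it
STORE 0x10    ; write the result to address 0x10
HALT
"""
print(assemble(program).hex())  # prints 010502070310ff
```

Before assemblers existed, someone had to look up those byte values by hand and enter them directly, which is what "written in machine language" means.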


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.
