I still remember a discussion with a colleague in which I said, “That’s the transpiler,” and he replied, “The…what?”
If you’ve never heard that name, you’re not alone. As developers, we all get used to writing code in a high-level language that humans can understand. However, computers can only understand a program written in a binary system known as machine code.
To speak to a computer in its non-human language, we came up with two solutions: interpreters and compilers. Ironically, most of us know very little about them, although they’re a part of our daily coding life.
Compilers vs. Interpreters Explained
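- Compiler: A compiler translates code written in a high-level programming language into a lower-level language, such as machine code, ahead of time, before the program runs.
- Interpreter: An interpreter translates code written in a high-level programming language into machine code line-by-line as the code runs.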
In this post, I’ll dive into the journey of translating a high-level language into machine code that is ready for execution. I’ll focus on the inner workings of the two key players in this game, the compiler and the interpreter, and break down the related concepts.
Compilers vs. Interpreters: How Do They Work?
Compilers and interpreters are both programs that transform code, but they work in different ways:
- A compiler translates code written in a high-level programming language into a lower-level language such as assembly language, object code or machine code (binary 1 and 0 bits). It converts the code ahead of time, before the program runs.
- An interpreter translates the code line-by-line while the program is running. You’ve likely used interpreters unknowingly at some point in your career.
Compilers vs. Interpreters: Advantages and Disadvantages
Both compilers and interpreters have pros and cons:
- A compiler takes in the entire program and needs more time to analyze the source code, whereas an interpreter takes a single line of code and needs very little time to analyze it.
- Compiled code generally runs faster than interpreted code.
- A compiler reports errors at compile time, before the program runs. If your code has mistakes, it will not compile. An interpreter reports errors one by one, as it reaches each line.
- Interpretation does not replace compilation completely.
- Compilers can contain interpreters for optimization reasons like faster performance and smaller memory footprint.
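The difference in error reporting can be sketched in Python. This is only an illustration, not how any particular tool is built: Python’s built-in `compile()` stands in for a compiler that checks the whole program first, and the helper `run_line_by_line` is a hypothetical, deliberately naive line-by-line interpreter.

```python
SOURCE = """\
total = 1 + 2
total = total *
print(total)
"""

def check_whole_program(source):
    """Compiler-style: analyze everything first; nothing runs if it fails."""
    try:
        compile(source, "<program>", "exec")
    except SyntaxError as err:
        return f"line {err.lineno}: syntax error"
    return "ok"

def run_line_by_line(source):
    """Naive interpreter-style: execute each line as soon as it is read."""
    env = {}
    executed = []  # line numbers that ran before any error was hit
    for number, line in enumerate(source.splitlines(), start=1):
        try:
            exec(line, env)
        except SyntaxError:
            return executed, f"line {number}: syntax error"
        executed.append(number)
    return executed, "ok"

compile_status = check_whole_program(SOURCE)
executed_lines, run_status = run_line_by_line(SOURCE)
```

With this input, the compiler-style check rejects the program before anything runs, while the line-by-line version executes the first line and only then stops at the broken second line.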
4 Common Types of Interpreters to Know
Interpreters were used as early as 1952 to ease programming and also translate between low-level machine languages. The first interpreted high-level language was Lisp. Python, Ruby, Perl and PHP are other examples of programming languages that use interpreters.
Below are four common types of interpreters:
4 Types of Interpreters
- Bytecode interpreter
- Threaded code interpreter
- Abstract syntax tree interpreter
- Just-in-time compilation
1. Bytecode Interpreter
The trend toward bytecode interpretation and just-in-time compilation blurs the distinction between compilers and interpreters. In a bytecode interpreter, each instruction starts with a one-byte opcode, so the instruction set can contain up to 256 distinct instructions, although not all of them need to be used.
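To make the idea concrete, here is a minimal sketch of a bytecode interpreter for a toy stack machine. The opcodes `PUSH`, `ADD`, `MUL` and `HALT` and their byte values are invented for this example and do not belong to any real virtual machine.

```python
# Hypothetical one-byte opcodes for a toy stack machine.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF

def run_bytecode(code: bytes) -> int:
    stack = []
    pc = 0  # program counter: index of the next opcode byte
    while True:
        opcode = code[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(code[pc])  # the operand is the byte after the opcode
            pc += 1
        elif opcode == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode:#04x}")

# Bytecode for (2 + 3) * 4
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run_bytecode(program)
print(result)  # prints 20
```

Because every opcode fits in one byte, the dispatch step is just an index into the byte string followed by a comparison, which is what keeps this style of interpreter simple.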
2. Threaded Code Interpreter
Unlike bytecode interpreters, threaded code interpreters use pointers instead of bytes. Each instruction is a word pointing to a function or an instruction sequence, possibly followed by a parameter. The number of different instructions is limited by the available memory and address space.
Forth code, which is used in Open Firmware systems, is a classic example of threaded code. The source code is compiled into a bytecode known as “FCode,” which a virtual machine then interprets.
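A rough sketch of the threaded code idea in Python follows. Python has no raw pointers, so function references stand in for the addresses a real threaded code interpreter would store; the function names and the calling convention are assumptions made for this example.

```python
def push(stack, program, ip):
    stack.append(program[ip])  # the inline parameter follows the instruction
    return ip + 1              # skip over the parameter

def add(stack, program, ip):
    right, left = stack.pop(), stack.pop()
    stack.append(left + right)
    return ip

def mul(stack, program, ip):
    right, left = stack.pop(), stack.pop()
    stack.append(left * right)
    return ip

def run_threaded(program):
    stack = []
    ip = 0  # instruction pointer into the list of references
    while ip < len(program):
        instruction = program[ip]  # a function reference, not an opcode byte
        ip = instruction(stack, program, ip + 1)
    return stack.pop()

# (2 + 3) * 4 expressed as references to routines plus inline parameters
program = [push, 2, push, 3, add, push, 4, mul]
result = run_threaded(program)
print(result)  # prints 20
```

There is no opcode table to decode here: the program itself is a sequence of references, so adding a new instruction only means defining another function.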
3. Abstract Syntax Tree Interpreter
If you’re a TypeScript developer with some familiarity with the TypeScript architecture, you may have heard about the abstract syntax tree (AST).
An AST interpreter transforms the source code into an optimized abstract syntax tree, then executes the program by following this tree structure, or uses it to generate native code just-in-time.
An AST keeps the global program structure and the relations between statements. This allows the system to perform better analysis during runtime and makes an AST a better intermediate format for just-in-time compilers than a bytecode representation.
However, for interpreters, an AST causes more overhead. Interpreters that walk the abstract syntax tree are slower than those that generate bytecode.
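As a small illustration, the sketch below walks an AST to evaluate arithmetic expressions. It borrows Python’s standard `ast` module to build the tree; the `evaluate` function and the set of supported operators are my own simplification, not how any production interpreter is written.

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(node):
    """Execute the program by recursively walking its syntax tree."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        op = BINARY_OPS.get(type(node.op))
        if op is None:
            raise ValueError(f"unsupported operator: {type(node.op).__name__}")
        return op(evaluate(node.left), evaluate(node.right))
    raise ValueError(f"unsupported node: {type(node).__name__}")

tree = ast.parse("(2 + 3) * 4", mode="eval")
result = evaluate(tree)
print(result)  # prints 20
```

Every evaluation revisits each node and performs type checks along the way, which hints at why tree walking tends to carry more overhead than running a flat bytecode sequence.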
4. Just-in-Time Compilation
Just-in-time compilation (JIT) is a technique in which the intermediate representation is compiled to native machine code at runtime.
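The control flow behind JIT compilation can be mimicked in a few lines of Python. This is only an analogy: Python cannot easily emit native machine code, so a generated Python function stands in for the compiled output. The class `HotExpression` and the threshold `HOT_THRESHOLD` are hypothetical names chosen for this sketch.

```python
HOT_THRESHOLD = 3  # hypothetical: number of calls before we compile

class HotExpression:
    """A postfix expression in x, interpreted until it becomes hot."""

    def __init__(self, tokens):
        self.tokens = tokens  # e.g. ["x", 2, "*", 1, "+"] means x * 2 + 1
        self.calls = 0
        self.compiled = None  # stand-in for generated native code

    def _interpret(self, x):
        # Slow path: re-examine every token on every call.
        stack = []
        for token in self.tokens:
            if token == "x":
                stack.append(x)
            elif token in ("+", "*"):
                right, left = stack.pop(), stack.pop()
                stack.append(left + right if token == "+" else left * right)
            else:
                stack.append(token)
        return stack.pop()

    def _compile(self):
        # One-time translation of the postfix tokens into a Python function.
        stack = []
        for token in self.tokens:
            if token in ("+", "*"):
                right, left = stack.pop(), stack.pop()
                stack.append(f"({left} {token} {right})")
            else:
                stack.append(str(token))
        return eval(f"lambda x: {stack.pop()}")

    def __call__(self, x):
        if self.compiled is not None:
            return self.compiled(x)  # fast path after compilation
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.compiled = self._compile()
        return self._interpret(x)

f = HotExpression(["x", 2, "*", 1, "+"])
results = [f(i) for i in range(5)]
print(results)  # prints [1, 3, 5, 7, 9]
```

The first calls go through the interpreter; once the call counter reaches the threshold, the expression is translated once and later calls skip the token loop entirely.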
10 Common Types of Compilers to Know
Below are the common types of compilers you should know.
10 Types of Compilers
- Cross-compiler
- Native compiler
- Bootstrap compiler
- Decompiler
- Source-to-source compiler
- Language rewriter
- Bytecode compiler
- Just-in-time compiler
- AOT compilation
- Assembler
1. Cross-Compiler
A cross-compiler runs on a computer whose CPU or operating system differs from the one on which the code it produces will run.
2. Native Compiler
A native compiler produces an output that would run on the same type of computer and operating system as the compiler itself.
3. Bootstrap Compiler
A bootstrap compiler is a compiler written in the language that it is intended to compile.
4. Decompiler
A decompiler translates code from a low-level language to a higher-level one.
5. Source-to-Source Compiler (Transpiler)
A source-to-source compiler is a program that translates between high-level languages. This type of compiler is also known as a transcompiler or transpiler.
One well-known example of a transpiler is:
- Cfront: The original compiler for C++ (from around 1983). It used C as its target language and produced C code with no consistent indent style or pretty-printing, since the generated intermediate code was not usually intended to be read by humans.
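To show what translating between high-level languages looks like in miniature, here is a sketch that turns Lisp-style prefix arithmetic into Python source text. The mini-language and the function names `tokenize`, `parse`, `emit_python` and `transpile` are invented for this example.

```python
def tokenize(source):
    # Pad parentheses with spaces so split() separates every token.
    return source.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Build a nested list from the token stream."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return items
    return token

def emit_python(node):
    """Emit Python infix source for a parsed prefix expression."""
    if isinstance(node, list):
        op, *operands = node
        return "(" + f" {op} ".join(emit_python(n) for n in operands) + ")"
    return node

def transpile(source):
    return emit_python(parse(tokenize(source)))

python_source = transpile("(* (+ 2 3) 4)")
print(python_source)        # prints ((2 + 3) * 4)
print(eval(python_source))  # prints 20
```

The output is ordinary source code in another high-level language, which is then handed to that language’s own compiler or interpreter.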
6. Language Rewriter
A language rewriter is usually a program that translates the form of expressions without changing the language.
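One way to picture a rewriter is a pass that simplifies expressions while staying in the same language. The sketch below folds constant arithmetic in Python source and emits Python source again. It relies on the standard `ast` module (`ast.unparse` needs Python 3.9 or later); the class `FoldConstants` and the function `rewrite` are names I made up for this example.

```python
import ast

class FoldConstants(ast.NodeTransformer):
    """Replace constant sums and products with their computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite the operands first
        left, right = node.left, node.right
        both_numbers = (
            isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        )
        if not both_numbers:
            return node
        if isinstance(node.op, ast.Add):
            value = left.value + right.value
        elif isinstance(node.op, ast.Mult):
            value = left.value * right.value
        else:
            return node  # leave other operators untouched
        return ast.copy_location(ast.Constant(value=value), node)

def rewrite(source):
    tree = FoldConstants().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # Python in, Python out

rewritten = rewrite("y = x * (2 + 3)")
print(rewritten)  # prints y = x * 5
```

The input and output are both valid Python; only the form of the expression changed.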
7. Bytecode Compiler
A bytecode compiler translates a high-level language into a simple intermediate language that a bytecode interpreter or virtual machine can interpret. Examples include the bytecode compilers for Java and Python.
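Python makes its own bytecode compiler easy to observe. The sketch below uses the built-in `compile()` function and the standard `dis` module; the exact instruction names it lists vary between Python versions, so treat the listing as illustrative only.

```python
import dis

# Compile an expression to a code object without running it.
code = compile("(a + b) * 4", "<example>", "eval")

raw_bytecode = code.co_code  # the compiled instructions as raw bytes
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

# Hand the bytecode to the virtual machine together with variable values.
result = eval(code, {"a": 2, "b": 3})

print(opnames)  # instruction names differ across Python versions
print(result)   # prints 20
```

The compile step and the execution step are separate here, which is the division of labor between a bytecode compiler and the virtual machine that runs its output.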
8. Just-in-Time Compiler (JIT Compiler)
A JIT compiler defers compilation until runtime. It generally runs inside an interpreter.
Examples of JIT compilation include:
- The earliest published JIT compiler is generally attributed to work on LISP in 1960.
- JIT compilation later appeared in languages such as Smalltalk in the 1980s.
In Java, source files are first compiled into .class files that contain Java bytecode, a compact, platform-independent set of instructions. A bytecode interpreter then starts executing the bytecode, and the JIT compiler later translates frequently executed bytecode into machine code.
Java bytecode can either be interpreted at runtime by a virtual machine or compiled at load time or runtime into native code. Modern JVM implementations use the compilation approach, so after the initial startup time, performance can approach that of native code.
9. AOT Compilation
Ahead-of-time (AOT) compilation is the approach of compiling a higher-level programming language, or an intermediate representation such as Java bytecode, before the program runs rather than during execution.
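A loose Python analogy is byte-compiling a source file before it is ever run. Note the limits of the comparison: the standard `py_compile` module produces Python bytecode, not native machine code, so this only illustrates the "do the translation before runtime" part. The helper name `compile_ahead_of_time` and the file name `hello.py` are placeholders.

```python
import importlib.util
import pathlib
import py_compile
import tempfile

def compile_ahead_of_time(source_text):
    """Write a source file and byte-compile it without executing it."""
    workdir = pathlib.Path(tempfile.mkdtemp())
    source_path = workdir / "hello.py"
    source_path.write_text(source_text)
    pyc_path = py_compile.compile(
        str(source_path),
        cfile=str(workdir / "hello.pyc"),
        doraise=True,  # raise on compile errors instead of printing them
    )
    return pathlib.Path(pyc_path)

pyc = compile_ahead_of_time("print('hello')\n")

# A compiled file starts with the interpreter's bytecode magic number.
has_magic = pyc.read_bytes()[:4] == importlib.util.MAGIC_NUMBER
print(pyc.name, has_magic)
```

Nothing from the source file is executed during this step; the translation happens once, up front, and the result is stored for later runs.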
10. Assembler
An assembler translates human-readable assembly language into machine code. This translation process is called assembly. The inverse program, which converts machine code back into assembly language, is called a disassembler.
An assembly language (ASM) is a low-level programming language that is closely tied to the machine code instructions of a processor. That’s why every assembly language is designed for exactly one specific computer architecture.
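The mapping between mnemonics and machine code can be sketched with a toy assembler and disassembler. The instruction set below (`PUSH`, `ADD`, `MUL`, `HALT` and their byte values) is invented for this example and does not correspond to any real architecture.

```python
# Hypothetical instruction set: mnemonic -> (opcode byte, operand count)
INSTRUCTIONS = {"PUSH": (0x01, 1), "ADD": (0x02, 0), "MUL": (0x03, 0), "HALT": (0xFF, 0)}
MNEMONICS = {opcode: (name, argc) for name, (opcode, argc) in INSTRUCTIONS.items()}

def assemble(text):
    """Translate assembly text into machine code bytes."""
    output = bytearray()
    for line in text.splitlines():
        parts = line.split("#")[0].split()  # drop comments and blank lines
        if not parts:
            continue
        opcode, argc = INSTRUCTIONS[parts[0].upper()]
        if len(parts) - 1 != argc:
            raise ValueError(f"wrong operand count: {line!r}")
        output.append(opcode)
        output.extend(int(arg) for arg in parts[1:])  # one byte per operand
    return bytes(output)

def disassemble(code):
    """Translate machine code bytes back into assembly text."""
    lines, pc = [], 0
    while pc < len(code):
        name, argc = MNEMONICS[code[pc]]
        operands = [str(byte) for byte in code[pc + 1 : pc + 1 + argc]]
        lines.append(" ".join([name, *operands]))
        pc += 1 + argc
    return "\n".join(lines)

machine_code = assemble("PUSH 2  # first operand\nPUSH 3\nADD\nHALT")
print(machine_code.hex())         # prints 0102010302ff
print(disassemble(machine_code))  # prints the four instructions, one per line
```

Each mnemonic maps to exactly one opcode, which is why the translation is reversible and why an assembly language only makes sense for the architecture whose opcodes it names.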
Why Compilers and Interpreters Are Important
Both compilers and interpreters are computer programs that convert code written in a high-level language into lower-level code or machine code that computers understand. However, they differ in how they work and when to use them.
Even if you’re not going to implement the next compiler or interpreter, these insights should help to improve your knowledge of the tools you use as a developer every day.