Assembly language (ASM) is a low-level programming language, and an assembler is the program that converts it into machine code. Assembly languages differ between processor architectures. For example, there is one for the Intel and AMD processor architecture (x86_64), and there’s another for ARM architectures.
This tutorial is going to be oriented toward Intel’s architecture using the Hello World! example.
Steps to Write Hello World! In ASM
- Download NASM and a text editor (like VS Code).
- Structure the ASM program.
- Build the command line interface program that prints “Hello World!”.
- Assemble and link the results.
4 Steps to Write ‘Hello World!’ In ASM
There are four steps to write ‘Hello World!’ in ASM. Let’s take a look at each one.
1. Download the Tools
For this short program, we are going to use NASM and whatever text editor you like. I’m going to use VS Code for this example since it has some nice plugins.
To install NASM on Debian-based systems (Ubuntu, Pop!_OS, Linux Mint, etc.):
sudo apt-get update -y
sudo apt-get install -y nasm
I’m only going to provide an example for a Linux system because the system calls are different for Mac.
2. Structure the ASM Program
Now that we have our assembler and our text editor or IDE, what’s next? Let’s create a new file and name it helloWorld.asm.
With our empty file, we need to determine how it’s going to be organized. An assembly program for Linux is commonly divided into four sections. You only need to declare the ones you actually use.
The four sections are:
- .data: Where we are going to declare our global initialized variables.
- .rodata: Where we are going to declare our global initialized, read-only constants.
- .bss: Where we are going to declare our global uninitialized variables.
- .text: Where we are going to define our code.
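As a rough illustration of the four sections, here is a minimal NASM sketch. The names counter, greeting, and buffer are hypothetical and are not used elsewhere in this tutorial.

```nasm
section .data                 ; initialized, writable variables
counter:  DQ 0                ; hypothetical 8-byte variable that starts at 0

section .rodata               ; initialized, read-only constants
greeting: DB 'Hi', 10         ; hypothetical constant string plus a newline

section .bss                  ; uninitialized variables (only space is reserved)
buffer:   RESB 64             ; hypothetical 64-byte buffer

section .text                 ; executable instructions go here
```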
3. Build the CLI Program That Prints ‘Hello World!’
We’re trying to build a command line interface (CLI) program that prints ‘Hello World!’. To do so, we need to tell the linker where the program begins. We name that entry point ‘start’ and declare it global so that it is visible outside this file. So, we add our .text section with the ‘start’ label and put the global statement outside the section.
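A minimal sketch of that skeleton, assuming the entry point keeps the name start used throughout this tutorial:

```nasm
global start            ; make the start label visible to the linker

section .text
start:                  ; execution begins here once the linker is told to use start
    ; instructions will go here
```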
Since we don’t want to use any fancy C functions, nor any of those other high-level language functions, we are going to rely on system calls (syscalls).
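Syscalls are just calls to the operating system. On Linux, we trigger interrupt 0x80 and pass it the parameters we want it to handle in registers. Note that int 0x80 is Linux’s legacy 32-bit system call interface. It still works in a small 64-bit program like this one because the values and addresses we pass fit in 32 bits.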
For the function we’re using, sys_write, the interrupt receives four parameters:
- The function number (RAX).
- The file descriptor we want to write to (RBX).
- The memory address of the message we want to write (RCX).
- The size of the message in bytes (RDX).
RAX, RBX, RCX, and RDX are general-purpose registers. Let’s define the message first and call int 0x80 after that.
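Pieced together from the lines discussed in this section, the program so far might look like the following sketch. The layout and comments are an assumption. The instructions themselves are the ones explained below.

```nasm
global start

section .data
msg:     DB 'Hello World!', 10    ; the string plus a newline (ASCII 10)
msgSize  EQU $ - msg              ; length of msg in bytes

section .text
start:
    mov rax, 4          ; function number 4: sys_write
    mov rbx, 1          ; file descriptor 1: stdout
    mov rcx, msg        ; address of the message
    mov rdx, msgSize    ; number of bytes to write
    int 0x80            ; ask the kernel to perform the call
```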
There’s a lot of new info here. Let’s go line by line.
section .data
This is the section where we’re going to define our ‘Hello World!’ string variable. Since it will already be initialized, we declare it in .data.
msg: DB 'Hello World!', 10
This is our new string. It’s declared under the name msg, and we initialize it with define byte (DB), followed by the characters that will be displayed and a ‘, 10’, which is the ASCII code for our \n character. Each character in ‘Hello World!’ takes one byte of memory. By using DB, we’re asking the assembler to reserve 13 bytes: the 12 characters of the string, counting the space, plus the \n character.
msgSize EQU $ - msg
This one is a little bit tougher. We’re declaring a constant called msgSize. The $ symbol stands for the current address, which at this point is the byte right after the end of ‘Hello World!’ and its newline. Subtracting the address where msg begins leaves us with the number of bytes used by msg. We have our message. Let’s display it now!
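Before moving on, here is a small sketch of how $ behaves, using a hypothetical second string named bye. It suggests why the EQU line should sit directly after the data it measures: $ keeps advancing as more bytes are defined.

```nasm
section .data
bye:      DB 'Bye', 10       ; hypothetical string: 3 characters plus a newline
byeSize   EQU $ - bye        ; $ is just past the newline, so byeSize is 4

padding:  DB 0, 0            ; two more bytes defined after the string
wrong     EQU $ - bye        ; $ has moved on, so this evaluates to 6, not 4
```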
Here’s what’s happening:
mov rax, 4 ; function 4
mov rbx, 1 ; stdout
mov rcx, msg ; msg
mov rdx, msgSize ; size
int 0x80
NASM uses Intel syntax, in which the destination operand comes first. Each line can be divided into four fragments:
mov A, B ; comments
- mov: An instruction that copies the value from B to A.
- A: The destination register or memory location.
- B: The source register, memory location, or constant.
- comments: Everything after the semicolon is a comment and is ignored by the assembler.
What we’re doing here is moving the number 4 to our RAX register, because sys_write is function number 4 in the Linux int 0x80 system call table. We move the number 1 to RBX, representing STDOUT. Then, the memory address at which msg is defined is stored in RCX, and finally, the size goes in RDX. By calling int 0x80, we ask the kernel to take the parameters we placed in the registers and perform the write.
Our final step is to exit the program. And guess what? That requires another syscall. In this case, our function will be number 1 (exit), and our parameter will be 0, because that’s the exit code we want to return. A 0 usually means that the program was executed successfully, while a nonzero value such as 1 means that it wasn’t.
mov rax, 1 ; function 1
mov rbx, 0 ; code
int 0x80
4. Assemble and Link the Results
Let’s save our file as helloWorld.asm and head over to the terminal. If you’ve already installed NASM, head to the folder where you saved your .asm file, then assemble and link it. On Linux:
nasm -f elf64 -g -F DWARF helloWorld.asm
ld -e start -o helloWorld helloWorld.o
./helloWorld
And that’s it for today. You should get a ‘Hello World!’ message on your terminal. You can also review the full code to see if your attempt matches mine.
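For comparison, here is a sketch of how the pieces from step 3 fit together in helloWorld.asm. It assumes a Linux kernel with support for the 32-bit int 0x80 interface, which is commonly enabled but not guaranteed on every system.

```nasm
global start                      ; entry point, selected with ld -e start

section .data
msg:     DB 'Hello World!', 10    ; 12 characters plus a newline (ASCII 10)
msgSize  EQU $ - msg              ; 13, computed at assembly time

section .text
start:
    ; sys_write(stdout, msg, msgSize)
    mov rax, 4          ; function number 4: sys_write
    mov rbx, 1          ; file descriptor 1: stdout
    mov rcx, msg        ; address of the message
    mov rdx, msgSize    ; number of bytes to write
    int 0x80

    ; sys_exit(0)
    mov rax, 1          ; function number 1: sys_exit
    mov rbx, 0          ; exit code 0: success
    int 0x80
```

Assembled and linked with the commands above, this should print the message followed by a newline and then exit with code 0.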
Frequently Asked Questions
How do you write Hello World! in ASM?
There are four steps to write “Hello World!” in ASM x86_64:
- Download NASM and a text editor (like VS Code).
- Structure the ASM program.
- Build the command line interface program that prints “Hello World!”.
- Assemble and link the results.
What is ASM?
Assembly language (ASM) is a low-level programming language that an assembler converts into machine code. There are different assembly languages for different processor architectures.