In this article by Andrew Dennis, the author of Raspberry Pi Computer Architecture Essentials, we will discuss Assembly language and the assembler.
The Raspberry Pi comes equipped with an ARM v7 quad core processor. Each processor architecture has its own machine code, the set of binary-encoded instructions that it understands. Because machine code differs between architectures, the Raspberry Pi’s ARM machine code will not work on an IBM or Intel CPU.
Short of writing out 32-bit long binary machine code instructions, the lowest level of programming language we can find ourselves using is Assembler language, also known as Assembly language.
A computer architecture’s Assembly language is usually a one-to-one mapping onto the underlying machine code. Each machine instruction is represented by a mnemonic, a short human-readable name. A mnemonic combined with its operands describes an operation, such as addition or subtraction.
A program written in the Assembly language is translated into machine code by the Assembler program. This program passes through the code one or more times and generates an object file as part of this process. The Assembler in some cases will also perform a variety of optimizations on the code in its subsequent passes.
Following this, a program called the Linker takes the object file and generates an executable file you can run on your computer.
Two important terms you will come across when writing Assembly language are opcode and operand. The opcode is an instruction (such as add) and the operand is the data it acts on (such as a register or an integer value). In ARM machine code, the opcode and its operands are packed together into a single 32-bit (4-byte) instruction.
In this article, we will write a simple program in Assembly language in order to understand the basics. The ARM v7 architecture is covered in more detail in a useful PDF guide hosted by the University of Michigan at https://web.eecs.umich.edu/~prabal/teaching/eecs373-f10/readings/ARMv7-M_ARM.pdf.
You may be interested in reviewing this as a supplement to the topics covered here.
So, what do the mnemonic codes that make up Assembly language look like before being converted to machine code?
Let’s take a look at an example and see. Here, we demonstrate how we can take register 0 of the CPU and assign a number to it; in this case, 10.
MOV R0, #10
The mnemonic MOV (short for move) assigns a value to a register. A register, here R0, is a small storage location inside the processor itself, and #10 is the integer value being assigned.
You can read more on CPU registers at Wikipedia: https://en.wikipedia.org/wiki/Processor_register
As you explore the language further, you will become familiar with these types of commands, as they are the building blocks of your program.
How about looking at another example? What do you think this does?
ADD R0, R1, R2
This instruction introduces us to another mnemonic, ADD. Here, we are taking the values of registers 1 and 2, adding them together, and storing the result in register 0.
Running commands like this on the Raspberry Pi is very simple; we can add them to a file, then assemble and link them ourselves.
We shall now explore a short Assembly language program that incorporates these two commands, MOV and ADD.
Let’s start by creating a new directory under the pi user:
This will be the place we store our Assembly code.
Navigate into this directory, for example:
Next, we need to create a new file to place our code in. You can choose any text editor you are comfortable with in order to write the program. We have used Vim in the following example:
vim first_assem_prog.s
To this file, add the following block of code.
Make sure that you include the spacing as demonstrated next:
.global main
.func main
main:
    MOV R0, #0
    MOV R1, #10
    MOV R2, #20
    ADD R0, R1, R2
    BX LR
So, what does this program do?
The first line in the program uses the .global directive to declare a symbol called main. The .global prefix tells the Assembler that the name is global and thus available to the linker and the C runtime.
A directive is an instruction processed by the Assembler at assembly time, rather than executed by the processor. Assembly language itself, unlike C, does not require the program entry point to be called main. However, because we link our program with GCC, the C runtime startup code looks for a symbol called main and calls it, so we use that name here.
As you will see, we will use the GCC compiler/linker to build an executable for our program, so the format we are writing the Assembly language in mimics that of a C program in some areas. This is why you will see references to the C runtime mentioned when discussing Assembly in this article.
Following this, we declare main as a function using another directive, .func.
Now that main is declared, we mark where the function starts with the label main:, which in our case is the third line.
Contained in the function are three MOV instructions that assign values to the registers, followed by an ADD. From our earlier examples, these should be familiar. We assign the value 0 to R0, 10 to R1, and 20 to R2, then add R1 and R2 together and store the result (30) in R0.
Finally, BX LR branches to the address held in the link register (LR), returning control to whatever called main. By convention, the value in register 0 is the function’s return value, so the result is passed back to the C runtime and, from there, to the operating system.
As you can see, this program is very simple, but demonstrates how to add numbers and store the results.
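To make the register arithmetic concrete, the same sequence can be mimicked with ordinary shell variables. This is only an illustrative model: r0, r1, and r2 are hypothetical shell variables standing in for the registers, and nothing here touches the real CPU registers.

```shell
# Illustrative model only: shell variables stand in for registers R0-R2.
r0=0              # MOV R0, #0
r1=10             # MOV R1, #10
r2=20             # MOV R2, #20
r0=$((r1 + r2))   # ADD R0, R1, R2
echo "$r0"        # prints 30
```

Note that the initial MOV R0, #0 has no effect on the final result, because ADD overwrites R0 with the sum.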
Save the file and exit your text editor. You should now be back at the command line.
This leads us to the next step of assembling and linking in order to generate a file we can run.
Assembling and linking
Now that we have a program, we need to test it. This is a two-step process that involves assembling the code and then linking it, which we touched upon at the start of this article.
When you come to explore the C language next, you will see that linking is a component there as well; in fact, we use the same tool for both C and Assembly: the GCC compiler.
Briefly, the two steps to generate a runnable program can be summed up as:
- Assembling is the process of generating the machine code object file from the Assembly mnemonics
- Linking is the process of creating an executable from one or more object files
The first command we will run, called as (the GNU assembler), will take the code we wrote previously and create an object file as its output.
Run the following command from inside of the folder where you created your program:
as -o first_assem_prog.o first_assem_prog.s
If it assembled correctly, you should see no output.
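If you are curious about what the assembler produced, the objdump tool from GNU binutils can disassemble the object file and list the machine code next to each mnemonic. The following is a minimal sketch that assumes objdump is installed and that the object file sits in the current directory; the variable name obj is just for illustration.

```shell
# Disassemble the object file, if it exists, to view the generated machine code.
obj=first_assem_prog.o
if [ -f "$obj" ] && command -v objdump >/dev/null 2>&1; then
    objdump -d "$obj"    # -d disassembles the executable sections
else
    echo "skipping: $obj or objdump not available"
fi
```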
Following this, we need to run the linker, which we will invoke through the gcc command. The linker itself is a separate program called ld, and it can also be run directly. However, since we are writing our Assembly in a C-like manner and rely on the C runtime, we will use the gcc tool, which calls the linker for us.
You will also need to run this command in the same directory that you ran as in:
gcc -o first_assem_prog first_assem_prog.o
GCC stands for the GNU Compiler Collection.
If everything is successful, you shouldn’t see any output.
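The two steps can also be chained in a small shell script so that linking only happens if assembling succeeds. This is a sketch rather than a required part of the workflow: the function name build_asm is hypothetical, and the file names are the ones used in this article.

```shell
#!/bin/sh
# Sketch: assemble and link in one go. build_asm is a hypothetical helper name.
build_asm() {
    name=$1    # base name of the program, without extension
    # Assemble the .s file into an object file, then link it through gcc.
    # The && ensures gcc only runs when the assembler succeeded.
    as -o "$name.o" "$name.s" && gcc -o "$name" "$name.o"
}

# Only attempt the build when the source file is present.
if [ -f first_assem_prog.s ]; then
    build_asm first_assem_prog && echo "built first_assem_prog"
fi
```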
We now have an executable file we can run from the Linux command line.
To do this, you can simply type:
./first_assem_prog
You’ll notice there is no output, however. So, how do we know whether the program executed correctly?
We can use the Linux echo command, as follows:
echo $?
This displays the exit code of the previous process, which in our case is the program we just ran. The C runtime uses the value left in register 0 when main returns via BX LR as that exit code.
As our program stored the result 30 in register 0 before returning, this is the value we see when using the echo command.
You can try changing the values in your program and assembling and linking once more. The result you see when running echo should reflect your changes.
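One thing to keep in mind when experimenting: on Linux, the exit status seen by the shell is only 8 bits wide, so a result above 255 is reported modulo 256. The sketch below uses a subshell as a stand-in for the assembled program (an assumption made purely for illustration; it does not run the program itself).

```shell
# Stand-in for the program: a subshell that exits with the given status.
( exit 30 )
small=$?
echo "$small"      # prints 30

# 300 does not fit in 8 bits, so the shell sees 300 mod 256.
( exit 300 )
wrapped=$?
echo "$wrapped"    # prints 44
```

So if your registers add up to more than 255, expect echo to show the wrapped value rather than the full sum.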
Try changing the program to use R1 instead of R0 as the destination in the ADD instruction and see what happens.
So, in a few easy steps, you have created an Assembly language program and learned how to assemble, link, and run it.
In this article, we explored the programming languages we will be using in this book. This included Assembler and C/C++.