BareMetal OS - Programming

Version 0.5.0 (DRAFT), 14 February 2011 - Return Infinity

This documentation explains how to write software for BareMetal OS in assembly language. It shows you the tools you need, how BareMetal OS programs work, and how to use the system calls included in the kernel.



Overview

Introduction to 64-bit (x86-64) Assembly

Most of today's modern operating systems are written in high-level programming languages such as C and C++. High-level programming is very useful when portability and code-maintainability are crucial, but it adds an extra layer of complexity to the proceedings. BareMetal OS is written in assembly to remove those extra layers and get right down to the hardware level. It's more verbose and non-portable, but you don't have to worry about compilers and linkers. Assembly also gives you full control over what the computer is doing, at a granularity that is not possible with a high-level language.

Assembly language (or colloquially "asm") is a textual way of representing the instructions that a CPU executes. For instance, an instruction to move some memory in the CPU may be 11001001 01101110 - but that's hardly memorable. Assembly provides mnemonics to substitute for these instructions, such as mov rax, 55. These mnemonics correlate directly with machine-code CPU instructions, without the convoluted binary numbers.

Like most programming languages, assembly is a list of instructions to be followed in order. You can jump around between various places and set up subroutines/functions, but it's much more minimal than C/C++ and friends. You can't just print "Hello world" to the screen - the CPU has no concept of what a screen is! Instead, you work with memory, manipulating chunks of RAM, performing arithmetic on them and putting the results in the right place. Sound scary? It's a bit alien at first, but it's not hard to grasp once you wrap your head around the concept.

At the assembly language level, there is no such thing as a variable in the high-level language sense. What you do have, however, is a set of registers, which are on-CPU memory stores. You can put numbers into these registers and perform calculations on them. In 64-bit mode, the general purpose registers can hold numbers between 0 and 18446744073709551615 (The maximum 32 and 16-bit values are 4294967295 and 65535 respectively). Here is a list of the fundamental registers on an x86-64 CPU:

RAX, RBX, RCX, RDX: General-purpose registers for storing numbers that you're using. For instance, you may use RAX to store the character that has been pressed on the keyboard, while using RCX to act as a counter in a loop. These 64-bit registers can be used as 32-bit, 16-bit, or 8-bit registers as well (see the note below).
RSI, RDI: Source and destination data index registers. These point to places in memory for retrieving and storing data.
RSP: The Stack Pointer (explained in a moment).
RIP: The Instruction/Code Pointer. This contains the location in memory of the next instruction to be executed. As each instruction runs, RIP advances to the instruction that follows it. You cannot write to this register directly; jump, call, and return instructions change it to move around in your code.
Note: RAX is the 64-bit register but can also go by other names depending on "how much" of the register you want to use. EAX will use the lower 32 bits, AX will use the lower 16 bits, and AL will use the lower 8 bits. The same goes for the other registers.

So you can use these registers to store numbers as you work - a bit like variables, but they're much more fixed in size and purpose. There are a few others, notably the extra 64-bit registers (R8, R9, R10, R11, R12, R13, R14, and R15), 8 80-bit Floating Point registers (ST0 - ST7), and 16 128-bit SIMD registers (XMM0 - XMM15).
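The naming scheme for the smaller parts of a register can be sketched as follows. This is a minimal illustration rather than code from BareMetal OS; the values are arbitrary. Note one detail of x86-64 that the names alone do not reveal: writing to a 32-bit register such as EAX clears the upper 32 bits of the full 64-bit register, whereas 8-bit and 16-bit writes leave the rest untouched.

```nasm
[BITS 64]

	mov rax, 0x1122334455667788	; RAX = 0x1122334455667788
	mov al, 0xFF			; Only the low 8 bits change:  RAX = 0x11223344556677FF
	mov ax, 0x0000			; Only the low 16 bits change: RAX = 0x1122334455660000
	mov eax, 0x00000001		; A 32-bit write zeroes the upper half: RAX = 0x0000000000000001
```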

The stack is an area of your main RAM used for storing temporary information. A stack is a last in, first out (LIFO) data structure. The push operation adds to the top of the list, hiding any items already on the stack, or initializing the stack if it is empty. The pop operation removes an item from the top of the list, and returns this value to the caller. A pop either reveals previously concealed items, or results in an empty list. If you were to push the numbers 5, 7 and 15 onto the stack, you will pop them out as 15 first, then 7, and lastly 5. In assembly, you can push registers onto the stack and pop them out later - it's useful when you want to save the value of a register while you use that register for something else.
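The push and pop order described above could look like this in practice. It is only a sketch: the choice of registers is arbitrary, and the values are loaded into RAX first because this introduction focuses on pushing registers.

```nasm
[BITS 64]

	mov rax, 5
	push rax		; Stack (top first): 5
	mov rax, 7
	push rax		; Stack (top first): 7, 5
	mov rax, 15
	push rax		; Stack (top first): 15, 7, 5

	pop rbx			; RBX = 15 (last in, first out)
	pop rcx			; RCX = 7
	pop rdx			; RDX = 5, and the stack is back where it started
```

The same pair of instructions is what you would use to save a register before borrowing it: push it, use the register for something else, then pop it to restore the old value.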

Computer memory can be viewed as a linear line of cells ranging from byte 0 to whatever you have installed (millions of bytes on modern machines). Computers are better off with powers of two (because they're based on binary) whereas humans count in powers of 10 (10, 100, 1000 etc. - decimal). So we use hexadecimal, which is base 16, as a way of representing numbers. See this chart to understand:

Decimal      0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20
Hexadecimal  0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  10 11 12 13 14

As you can see, whereas our normal decimal system uses 0 - 9, hexadecimal uses 0 - F in counting. It's a bit weird at first, but you'll get the hang of it. In assembly programming, we identify hexadecimal (hex) numbers by tagging a '0x' at the beginning - so 0x0A is hex for the decimal number 10. (You can also denote hexadecimal in assembly by putting an 'h' at the end of the number - for instance, 0Ah.)
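As a short illustration of the two notations in NASM (the values are arbitrary examples), each of the first three lines below loads the same number into RAX. One NASM rule worth knowing: a number written with the 'h' suffix has to start with a digit, which is why the leading zero in 0Ah is needed.

```nasm
[BITS 64]

	mov rax, 10		; Decimal 10
	mov rax, 0x0A		; The same value in hex, using the '0x' prefix
	mov rax, 0Ah		; The same value in hex, using the 'h' suffix
	mov rbx, 0x14		; Hex 14 is decimal 20
	mov rcx, 0xFF		; Hex FF is decimal 255
```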

Let's finish off with a few common assembly instructions. These move data around, compare values, and perform calculations. They're the building blocks of your software - there are hundreds of instructions, but you don't have to memorise them all, because a small handful covers the vast majority of everyday code.

Please keep in mind that for this document and all of the BareMetal OS source files we are using the Intel syntax (not AT&T syntax). The Intel syntax is the most popular but may look odd to those who have used C or C++. Intel syntax uses this format: OPCODE DESTINATION, SOURCE.

ADD: Add - Adds the destination operand (first operand) and the source operand (second operand) and then stores the result in the destination operand. Example: add rax, 120 adds 120 to the RAX register. add rdx, rbx adds the value in RBX to RDX.
CALL: Call procedure - Jumps to a sub-procedure. CALL saves the address of the instruction that follows it to the stack and branches to the called procedure specified by the target operand. RET is required in the called procedure to restore program flow. Example: call b_print_char would call the BareMetal print character function.
CLC: Clear carry flag. Example: clc clears the Carry flag to 0.
CMP: Compare operands - Compares the first source operand with the second source operand and sets the status flags according to the results. Example: cmp rax, rbx compares the RAX and RBX registers. If they are equal the Zero flag is set (otherwise it is cleared). If RAX is less than RBX (as unsigned numbers) then the Carry flag is set (otherwise it is cleared). You can also compare immediate values like cmp rdx, 1200.
DEC: Decrement by 1 - Subtracts 1 from the destination operand. Example: dec rdx decrements the RDX register by 1.
DIV: Unsigned divide - Performs an unsigned divide of the value in the AX, DX:AX, EDX:EAX, or RDX:RAX registers (dividend) by the source operand (divisor) and stores the quotient and remainder in AL and AH, AX and DX, EAX and EDX, or RAX and RDX respectively. Example: div rbx divides RDX:RAX by RBX and stores the quotient in RAX and the remainder in RDX.
IN: Input from port - Copies the value from the I/O port specified with the second operand (source operand) to the destination operand (first operand). Example: in al, 0x20 inputs a byte from port 0x20 and puts it into AL. To access a port number above 0xFF (a 16-bit port number) you must use in al, dx where DX is set to the port number.
INC: Increment by 1 - Adds 1 to the destination operand. Example: inc rbx increments the RBX register by 1.
Jxx: Jump if condition - Checks the state of one or more of the status flags and, if the flags are in the specified state (condition), performs a jump to the target instruction specified by the destination operand. Examples:
jc test_area - Jump if Carry - Only jump if the Carry flag is set.
je test_area - Jump if Equal - Only jump if the Zero flag is set (the last comparison found the operands equal).
jnc test_area - Jump if Not Carry - Only jump if the Carry flag is not set.
jne test_area - Jump if Not Equal - Only jump if the Zero flag is not set.
More information on all of the available conditional jump types can be found in the x86-64 manuals from Intel or AMD.
JMP: Jump - Unconditionally transfers program control to a different point in the instruction stream without recording return information. Example: jmp test_area jumps to the code address of test_area (defined by "test_area:" in your source code).
LODSx: Load - Loads a byte, word, doubleword, or quadword (8-bit, 16-bit, 32-bit, or 64-bit) value from the address in the RSI register. Example: lodsb loads a byte into AL, lodsw loads a word into AX, lodsd loads a doubleword into EAX, and lodsq loads a quadword into RAX.
MOV: Move - Copies the second operand (source operand) to the first operand (destination operand). Example: mov rax, 1000 stores the value 1000 in the RAX register.
MUL: Unsigned multiply - Performs an unsigned multiplication of the value in AL, AX, EAX, or RAX by the source operand and stores the result in the AX (AH:AL), DX:AX, EDX:EAX, or RDX:RAX registers. Example: mul rbx multiplies RAX by RBX and stores the result in RDX:RAX.
OUT: Output to port - Copies the value from the second operand (source operand) to the I/O port specified with the destination operand (first operand). Example: out 0x20, al sends the byte in AL to port 0x20. To access a port number above 0xFF (a 16-bit port number) you must use out dx, al where DX is set to the port number.
POP: Pop data from the stack - Loads the value from the top of the stack to the location specified with the destination operand and then increments the stack pointer. Example: pop rcx pops a 64-bit value from the stack and places it in RCX. pop ax pops a 16-bit value from the stack and places it in AX.
PUSH: Push data onto the stack - Decrements the stack pointer and then stores the source operand on the top of the stack. Example: push rbx pushes the 64-bit value in RBX to the stack. push dx pushes the 16-bit value in DX to the stack. 8-bit registers such as DL cannot be pushed on their own.
RET: Return from procedure - Transfers program control to a return address located on the top of the stack. Example: ret returns to the address on the top of the stack. The address is usually placed on the stack by a CALL instruction, and the return is made to the instruction that follows the CALL instruction.
STC: Set carry flag. Example: stc sets the Carry flag to 1.
STOSx: Store - Stores a byte, word, doubleword, or quadword (8-bit, 16-bit, 32-bit, or 64-bit) value to the address in the RDI register. Example: stosb stores the byte in AL, stosw stores the word in AX, stosd stores the doubleword in EAX, and stosq stores the quadword in RAX.
SUB: Subtraction - Subtracts the second operand (source operand) from the first operand (destination operand) and stores the result in the destination operand. Example: sub rax, 57 subtracts 57 from the RAX register. sub rdx, rcx subtracts the value in RCX from the RDX register.
XCHG: Exchange data - Exchanges the contents of the destination (first) and source (second) operands. Example: xchg rax, rbx exchanges the contents of the RAX and RBX registers.
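To show how a few of these instructions combine, here is a small sketch that is not taken from the BareMetal OS sources. The label sum_loop and all of the values are made up for illustration. The first part adds the numbers 10 down to 1 using ADD, DEC, CMP, and a conditional jump; the second part performs one unsigned division.

```nasm
[BITS 64]

	mov rax, 0		; Running total
	mov rcx, 10		; Loop counter
sum_loop:			; Hypothetical label name
	add rax, rcx		; total = total + counter
	dec rcx			; counter = counter - 1
	cmp rcx, 0		; Has the counter reached zero?
	jne sum_loop		; If not, go around again
				; Here RAX = 55 (10 + 9 + ... + 1)

	mov rdx, 0		; Upper 64 bits of the dividend RDX:RAX
	mov rax, 17		; Lower 64 bits of the dividend
	mov rbx, 5		; Divisor
	div rbx			; RAX = 3 (quotient), RDX = 2 (remainder)
```

Clearing RDX before DIV matters: the instruction always treats RDX:RAX as one 128-bit dividend, so a leftover value in RDX would change the result.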

Let's look at some of these instructions in a little more detail. Consider the following code snippet:

	mov rbx, 0x1000
	mov rax, [rbx]
	cmp rax, 50
	jge label
	...

label:
	mov rax, 10

In the first instruction, we move the number 0x1000 into the RBX register. Then, in the second instruction, we store in RAX whatever's in the memory location pointed to by RBX. This is what the [rbx] means: if we just did mov rax, rbx it'd simply copy the number 0x1000 into the RAX register. But by using square brackets, we're saying: don't just copy the contents of RBX into RAX, but copy the contents of the memory address to which RBX points. Given that RBX contains 0x1000, this instruction says: find whatever is at memory location 0x1000, and put it into RAX.

So, if the 8 bytes of memory starting at location 0x1000 contain the number 37, then that number 37 will be put into the RAX register via our second instruction (a 64-bit register loads 64 bits at a time). Next up, we use cmp to compare the number in RAX with the number 50 (the decimal number 50 - we didn't use the '0x' prefix). The following jge instruction acts on the cmp comparison, which has set the status flags as described earlier. The jge label says: if the result from the previous comparison is greater than or equal (treating both numbers as signed), jump to the part of the code denoted by label:. So if the number in RAX is greater than or equal to 50, execution jumps to label:. If not, execution continues at the '...' stage.

One last thing: you can insert data into a program with the db (define byte) directive. For instance, this defines a series of bytes with the number zero at the end, representing a string:

mylabel: db 'Message here', 0

In our assembly code, we know that a string of characters, terminated by a zero, can be found at the mylabel: position. We could also set up a single byte to use somewhat like a variable:

foo: db 0

Now foo: points at a single byte in the code, which in the case of BareMetal OS will be writable as the OS is copied completely to RAM. So you could have this instruction:

mov al, [foo]

This moves the byte pointed to by foo into the AL register.
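Because the byte is writable, it can be read, changed, and written back. The following is a minimal sketch of that pattern, assuming the program layout described in the next section (code loaded at the 2 MiB mark and ending with ret); it does not use any system calls.

```nasm
[BITS 64]
[ORG 0x0000000000200000]

	mov al, [foo]		; Read the byte at foo into AL (0 on the first run)
	inc al			; Add 1 to it
	mov [foo], al		; Write the new value back to memory
	ret			; Return to OS

foo: db 0			; One byte of storage, used like a variable
```

The size of the memory access is taken from the register: AL is 8 bits wide, so exactly one byte is read and written.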

That's the essentials of x86-64 assembly language, and enough to get you started. When writing an OS or program, though, you'll need to learn much more as you progress, so see the Resources section for links to more in-depth assembly tutorials.

Introduction to BareMetal OS Programs

BareMetal OS programs are written in 64-bit (long mode) assembly language. The OS, OS buffers, CLI, and the memory structures needed for 64-bit mode are contained in the first 2 MiB of physical memory (0x0000000000000000 - 0x00000000001FFFFF). Memory above 2 MiB is available for programs.

BareMetal OS programs are loaded at the 2 MiB point (Address 0x0000000000200000). Consequently, programs will need to begin with these directives:

[BITS 64]
[ORG 0x0000000000200000]

There are many system calls available in BareMetal OS for controlling the screen, getting input, manipulating strings, loading/saving files, and more. All parameters to BareMetal OS system calls are passed in registers, and not on the stack. To use them in your programs, you need this line:

%INCLUDE "bmdev.asm"

This file contains no code, only a list of equ directives that point to system call vectors in the kernel. So, by including this file you can call, for instance, the BareMetal OS b_print_string routine without having to know exactly where it is in the kernel. bmdev.asm is included in the programs/ directory of the BareMetal OS download - it also provides a quick reference to the system calls.
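To make the idea of an equ vector concrete, here is a sketch of the mechanism. The name example_call and its address are hypothetical placeholders and do not come from bmdev.asm; the real names and addresses are the ones listed in that file.

```nasm
[BITS 64]
[ORG 0x0000000000200000]

example_call	equ 0x0000000000100000	; Hypothetical name and address, for illustration only

	call example_call		; Assembles as a call to the fixed address given above
	ret				; Return to OS
```

An equ line only gives a name to a number, so it adds nothing to the size of your program. The assembler substitutes the address wherever the name is used.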


Tools needed

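To write BareMetal OS programs you need:

1. NASM, an assembler that can generate plain binary files.
2. programs/bmdev.asm, the file of system call vectors described above.
3. A way to copy your programs to a BareMetal OS disk so that you can test them.

BareMetal OS uses NASM for its programs and kernel source code, which is why we recommend it here. You can, of course, use any other assembler that can output plain binary files (i.e. with no header) and accept 64-bit code. NASM is available in most Linux distro repositories, or you can download the Windows version from the NASM website (get the 'win32' file).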

For the second point, copy programs/bmdev.asm so that it's alongside your program's source code for inclusion.

For the third point, if you've written BareMetal OS to a real hard disk, you can just copy your .APP programs onto that disk (root directory only!), boot BareMetal OS and test them out.



Example

Source code

Here is an example BareMetal OS program, in NASM format, which prints a string to the screen:

[BITS 64]
[ORG 0x0000000000200000]

%INCLUDE "bmdev.asm"

start:					; Start of program label

	mov rsi, test_message		; Load RSI with memory address of string
	call b_print_string		; Print the string that RSI points to

	ret				; Return to OS

test_message: db 'My first program in BareMetal OS!', 13, 0

Building

Save the above code as test.asm, and enter this command to assemble it (works on both Linux and Windows):

nasm -f bin -o test.app test.asm

Using the '-f bin' option we tell NASM that we just want a plain binary file: no header or sections. The resulting executable file is test.app that we can copy to our hard drive or add to the virtual disk image as described in Copying files in the User Handbook. Then we can boot BareMetal OS and run the program.


Explanation

This is a very short program, but we'll explain exactly how it works for complete clarity. The first three lines will not be assembled to machine code instructions, but are just directives to tell NASM that we want to operate in 64-bit mode, that our code is written to be executed from the 2 MiB mark, and that we want to use system calls from the BareMetal OS API.

Next we have the 'start:' label, which is not essential but good practice to make things clear. We put the location of a zero-terminated string into the RSI register, then call the BareMetal OS b_print_string routine which we can access via the vectors listed in bmdev.asm.

All BareMetal OS programs must end with a ret instruction, which passes control back to the OS.
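As a small variation on the example program, the following sketch prints two strings by calling b_print_string twice. The label names and message text are made up for illustration. RSI is loaded again before the second call, so the sketch does not depend on whether the system call preserves that register.

```nasm
[BITS 64]
[ORG 0x0000000000200000]

%INCLUDE "bmdev.asm"

start:					; Start of program label

	mov rsi, first_message		; Load RSI with memory address of first string
	call b_print_string		; Print the string that RSI points to
	mov rsi, second_message		; Load RSI with memory address of second string
	call b_print_string		; Print the string that RSI points to

	ret				; Return to OS

first_message: db 'First line', 13, 0
second_message: db 'Second line', 13, 0
```

It can be assembled in the same way as test.asm above, with nasm -f bin and bmdev.asm alongside the source file.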



Resources

Online

Books



Extra

License

BareMetal OS is open source and released under the 3-clause "New BSD License" (see docs/LICENSE.TXT in the BareMetal OS distribution). Essentially, it means you can do anything you like with the code, including basing your own project on it, providing you retain the license file and give credit to Return Infinity and the BareMetal OS developers for their work.