Assembly?
Assembly is the low-level programming language that gives direct access to the instructions and registers of the processor. A program called the assembler translates assembly language into machine code. NetBSD installs the GNU assembler "gas" as /usr/bin/as, and this program assembles for the host processor architecture.
A higher-level compiler like "gcc" acts as a preprocessor to the assembler, by translating code from C (or another language) to assembly. Just run cc -S yourfile.c and look at the output yourfile.s to see the assembly code. An optimizing compiler often produces better assembly code than a human programmer would write by hand, even one who knows assembly language well.
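As a small file to try this on, here is a minimal sketch of a yourfile.c. The function name add_numbers is made up for illustration and does not come from any NetBSD source.

```c
/* yourfile.c: a tiny translation unit to feed to "cc -S".
 * add_numbers is a hypothetical example function. */
int add_numbers(int a, int b)
{
    /* On most targets this compiles to a handful of instructions. */
    return a + b;
}
```

Running cc -S on this file should leave a yourfile.s that defines the symbol add_numbers. The exact instructions depend on the processor architecture and on the optimization flags, so compare the output with and without -O2.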
There remain a few reasons to use assembly language. For example:
- You need direct access to processor registers (for example, to set the stack pointer).
- You need direct access to processor instructions (like for vector arithmetic or for atomic operations).
- You want to improve or fix the higher-level compiler, assembler, or linker.
- You want to optimize code, because your higher-level compiler was not good enough.
- You want to learn assembly language.
i386
The i386 architecture takes its name from the Intel 386, the first x86 processor to have a 32-bit mode. Other names for this architecture are:
- IA-32, which means Intel Architecture, 32 bit.
- x86, which can mean the 32-bit mode or the ancient 16-bit mode.
i386 assembly language can be written in either AT&T syntax or Intel syntax. Most programmers seem to prefer the Intel syntax.
nasm
NASM (the Netwide Assembler) is an x86 assembler that uses the Intel syntax. It is available as the devel/nasm package.
You can also use devel/yasm, which accepts NASM syntax.
Hello world, NetBSD/i386
; Hello world, NetBSD/i386 4.0
section .note.netbsd.ident progbits alloc noexec nowrite
dd 0x00000007 ; Name size
dd 0x00000004 ; Desc size
dd 0x00000001 ; value 0x01
db "NetBSD", 0x00, 0x00 ; "NetBSD\0\0"
dd 400000003 ; __NetBSD_Version__ (please see <sys/param.h>), 4 bytes to match Desc size
section .data
msg db "Hello world!", 0x0a ; "Hello world!\n"
len equ $ - msg
section .text
global _start
_start:
; write()
mov eax, 0x04 ; SYS_write
push len ; write(..., size_t nbytes)
push msg ; write(..., const void *buf, ...)
push 0x01 ; write(int fd, ...)
push 0x00
int 0x80
pop ebx
; exit()
mov eax, 0x01 ; SYS_exit
push 0x00 ; exit(int status)
push 0x00
int 0x80
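The .note.netbsd.ident section at the top of the listing is an ELF note: three 32-bit words (name size, desc size, type) followed by the name and the desc, each padded to a multiple of 4 bytes. As a rough sketch, the same 24 bytes can be described with a C struct. The struct and field names below are illustrative assumptions, not taken from a NetBSD header.

```c
#include <stdint.h>

/* Sketch of the note emitted into .note.netbsd.ident.
 * Struct and field names are hypothetical, chosen for this example. */
struct netbsd_ident_note {
    uint32_t namesz;   /* 7: length of "NetBSD" plus its terminating NUL */
    uint32_t descsz;   /* 4: size of the version word */
    uint32_t type;     /* 1 */
    char     name[8];  /* "NetBSD\0" plus one byte of padding */
    uint32_t desc;     /* __NetBSD_Version__, see <sys/param.h> */
};

/* Same values as the assembly listing; unused bytes of name[] are zero. */
static const struct netbsd_ident_note ident_note = {
    7, 4, 1, "NetBSD", 400000003u
};
```

The struct has no hidden padding on common ABIs, so its size matches the bytes that the dd and db directives emit.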
How to compile and link
To use the above code you need to assemble it and then link it:
$ nasm -f elf hello.asm
$ ld -o hello hello.o
$ ./hello
Hello world!
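For comparison, here is a hedged C sketch of what the write() part of the assembly program does. It assumes that the C library's write(2) wrapper ends up issuing the same SYS_write system call that the listing invokes by hand with int 0x80. The function name say_hello is made up for this example.

```c
#include <unistd.h>

/* Same message as the "msg" label in the assembly listing. */
static const char msg[] = "Hello world!\n";

/* Hypothetical helper: writes msg to standard output and returns
 * whatever write(2) returns (bytes written, or -1 on error). */
ssize_t say_hello(void)
{
    /* sizeof msg - 1 plays the role of "len equ $ - msg". */
    return write(STDOUT_FILENO, msg, sizeof msg - 1);
}
```

The assembly version has to load the system call number and push the arguments itself; in C the library wrapper hides those steps.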
gas
gas is the portable GNU assembler. It uses AT&T syntax and is modeled after the 4.2BSD assembler. You can use it on many CPU architectures.
Example:
.section ".note.netbsd.ident", "a"
.long 2f-1f
.long 4f-3f
.long 1
1: .asciz "NetBSD"
2: .p2align 2
3: .long 400000000
4: .p2align 2
.section .data
data_items: # this is an array
.long 3,39,41,21,42,34,42,23,38,37,15,37,16,17,18,25,23,12,31,2
.set DATASIZE, ( . - data_items) / 4 - 1
.section .text
.globl _start
_start:
movl $0, %edi # zero the index register
movl $DATASIZE, %ecx # set ecx to the index of the last item
movl data_items(,%ecx,4), %eax # load the last item
movl %eax, %ebx # it is the largest so far
main_loop:
decl %ecx # decrement index
movl data_items(,%ecx,4), %eax # step to next element
cmpl %eax, %ebx # compare largest so far with this element
cmovll %eax, %ebx # if ebx is less than eax, set ebx to eax
jecxz end_prog # if we are at item 0, end iteration
jmp main_loop # again!
end_prog:
pushl %ebx # exit status: the largest number
pushl %ebx # dummy slot where the kernel expects a return address
movl $1, %eax # call exit
int $0x80 # kernel
ret
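The loop in the listing above walks the array from the last index down to index 0 and keeps the largest value seen so far. A minimal C sketch of the same logic follows; the function name largest_item is an assumption made for this example.

```c
#include <stddef.h>

/* Same values as data_items in the assembly listing. */
static const int data_items[] = {
    3, 39, 41, 21, 42, 34, 42, 23, 38, 37,
    15, 37, 16, 17, 18, 25, 23, 12, 31, 2
};

/* Hypothetical helper. Precondition: count must be at least 1. */
int largest_item(const int *items, size_t count)
{
    size_t i = count - 1;        /* like DATASIZE: index of the last item */
    int largest = items[i];      /* the last item is the largest so far */

    while (i != 0) {             /* jecxz: stop once item 0 was examined */
        i--;                     /* decl %ecx */
        if (items[i] > largest)  /* cmpl + cmovll */
            largest = items[i];
    }
    return largest;
}
```

The assembly program passes this value to exit, so it becomes the exit status of the process.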
powerpc
PowerPC processors appear in many different hardware platforms; NetBSD has at least 11 ports for them (see Platforms). The easiest way to obtain a PowerPC machine is probably to acquire a used Macintosh, choosing from among the supported models for NetBSD/macppc.
PowerPC processors have 32-bit registers and pointers and use big-endian byte order.
- A few boards (none supported by NetBSD) run the PowerPC in little-endian mode to match the hardware.
- A few PowerPC processors also have a 64-bit mode. NetBSD 5.0 will support some Apple G5 machines with these processors, but only in 32-bit mode (see ppcg5 project).
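To make the byte order concrete: on a big-endian machine the most significant byte of a 32-bit word sits at the lowest address. The sketch below shows how to read a big-endian word portably in C and how to probe the byte order of the host. Both function names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: assemble a 32-bit value from 4 bytes stored in
 * big-endian order. Works on any host, whatever its own byte order. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: returns 1 on a big-endian host, 0 otherwise.
 * It looks at the first byte in memory of the 32-bit value 1. */
int host_is_big_endian(void)
{
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}
```

On NetBSD/macppc host_is_big_endian() would be expected to return 1, while on NetBSD/i386 it would return 0.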
gas
Here is an example of a program for gas:
## factorial.s
## This program is in the public domain and has no copyright.
###
## This is an example of an assembly program for NetBSD/powerpc.
## It computes the factorial of NUMBER using unsigned 32-bit integers
## and prints the answer to standard output.
.set NUMBER, 10
.section ".note.netbsd.ident", "a"
# ELF note to identify me as a native NetBSD program
# type = 0x01, desc = __NetBSD_Version__ from <sys/param.h>
##
.int 7 # length of name
.int 4 # length of desc
.int 0x01 # type
.ascii "NetBSD\0" # name
.ascii "\0" # padding
.int 500000003 # desc
.section ".data"
decbuffer:
.fill 16 # buffer for decimal ASCII
decbufend:
.ascii "\n" # newline at end of ASCII
.section ".text"
# PowerPC instructions need an alignment of 4 bytes
.balign 4
.globl _start
.type _start, @function
_start:
# compute factorial in %r31
li %r0, NUMBER
mtctr %r0 # ctr = number
li %r31, 1 # %r31 = factorial
li %r30, 1 # %r30 = next factor
factorial_loop:
mullw %r31, %r31, %r30 # multiply %r31 by next factor
addi %r30, %r30, 1 # increment next factor
bdnz+ factorial_loop # loop ctr times
# prepare to convert factorial %r31 to ASCII.
lis %r9, decbufend@ha
la %r4, decbufend@l(%r9) # %r4 = decbufend
lis %r8, decbuffer@ha
la %r29, decbuffer@l(%r8) # %r29 = decbuffer
li %r5, 1 # %r5 = length of ASCII
# Each loop iteration divides %r31 by 10 and writes digit to
# position %r4. Formula (suggested by gcc) to divide by 10
# is to multiply by 0xcccccccd / 0x800000000 = 0.100000000005821,
# which is to multiply by 0xcccccccd, then shift right 35.
##
.set numerator, 0xcccccccd
lis %r9, numerator@ha
la %r28, numerator@l(%r9) # %r28 = numerator
decloop:
cmpw %r29, %r4 # start of buffer <=> position
beq- buffer_overflow
# begin %r9 = (%r31 / 10)
mulhwu %r9, %r31, %r28 # %r9 = ((%r31 * %r28) >> 32)
addi %r4, %r4, -1 # move %r4 to next position
srwi %r9, %r9, 3 # %r9 = (%r9 >> 3) = %r31 / 10
mulli %r8, %r9, 10 # %r8 = (%r31 / 10) * 10
sub %r27, %r31, %r8 # %r27 = %r31 % 10 = digit
addi %r27, %r27, '0 # convert digit to ASCII
addi %r5, %r5, 1 # count this ASCII digit
stb %r27, 0(%r4) # write ASCII digit to buffer
mr. %r31, %r9 # %r31 /= 10, %r31 <=> 0
bne+ decloop # loop until %r31 == 0
# FALLTHROUGH
buffer_overflow:
# write(2) our factorial to standard output
li %r0, 4 # SYS_write from <sys/syscall.h>
li %r3, 1 # standard output
## %r4 # buffer
## %r5 # size of buffer
sc
# exit(2)
li %r0, 1 # SYS_exit from <sys/syscall.h>
li %r3, 0 # exit status
sc
.size _start, . - _start
With a NetBSD/powerpc system, you can run this program using
$ as -o factorial.o factorial.s
$ ld -o factorial factorial.o
$ ./factorial
3628800
$
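The least obvious part of the program is the division by 10, which it performs as a multiplication by 0xcccccccd followed by a right shift of 35 bits (mulhwu supplies the first 32 bits of the shift, srwi the remaining 3). The C sketch below mirrors that trick and the backwards digit loop. The function names div10 and format_decimal are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical helper: n / 10 for unsigned 32-bit n, without a divide.
 * The 64-bit product shifted right by 35 equals mulhwu then srwi 3. */
uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xcccccccdu) >> 35);
}

/* Hypothetical helper: writes the decimal digits of n backwards, ending
 * just before "end", and returns a pointer to the first digit.
 * The caller must provide at least 10 bytes of room before "end". */
char *format_decimal(uint32_t n, char *end)
{
    char *p = end;

    do {
        uint32_t q = div10(n);               /* q = n / 10 */
        *--p = (char)('0' + (n - q * 10));   /* digit = n % 10 */
        n = q;
    } while (n != 0);                        /* like "bne+ decloop" */
    return p;
}
```

Unlike the assembly version, this sketch leaves the buffer size check to the caller instead of testing for overflow on every iteration.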
Useful Documents
To learn about PowerPC assembly language, here are two documents to start with.
- IBM developerWorks. PowerPC Assembly. This is a very good introduction to PowerPC assembly. It provides and explains the Hello World example (but using a Linux system call).
- SunSoft and IBM. System V Application Binary Interface, PowerPC Processor Supplement (PDF file, hosted by Linux Foundation). This is the specification for 32-bit PowerPC code in ELF systems. It establishes %r1 as the stack pointer and describes the stack layout. It explains the C calling conventions, how to pass arguments to and return values from C functions, how to align data structures, and which registers to save to the stack.
- NetBSD, Linux and OpenBSD (and FreeBSD?) all use ELF with PowerPC and all follow this specification, with a few deviations and extensions.
Wiki Pages
- ELF Executables for PowerPC. This introduces assembly language with a commented example.