basic unix programming

Contents

Introduction
Compiling a C program
Using Make
Using BSD Make
Debugging programs with gdb
References

Introduction

This is a guide to Unix programming. It is intended to be as generic as possible, since we want to create programs that are as portable as NetBSD itself.

Compiling a C program

I will assume you know C and how to edit a file under Unix.

First, type the code into a file; let's say we put it in 'hello.c'. Now, to create a program from it, do the following:

$ gcc hello.c -o hello
<compiler output>

If gcc (the GNU Compiler Collection) is not available, use the command 'cc' instead of 'gcc'. Command line switches can be different, though. If you use C++, everything said here still applies, but replace 'gcc' with 'g++'.

Now, if there were no compiler errors, you can find the file 'hello' in the current working directory. To execute this program, type:

$ ./hello
<Your program's output>

The reason you can't just type 'hello' is a good one: the current directory is normally not in your PATH, so the shell only looks there when you say so explicitly. Beware: with some shells, the current line is overwritten when the program quits, so be sure to end the last line you print to the screen with a newline (\n). Forgetting this can be very frustrating, because the program then seems to give no output.

To compile larger programs, you can do this:

$ gcc -c file1.c
<compiler output>

And the same for file2.c etc. The -c switch tells gcc to only compile the code, and not link it into a complete program. This will result in a file with the same name as the source file, except the '.c' is replaced by '.o'.

Now, once you have compiled all your .c modules, you can tie the resulting object files (those with a .o extension) together with the following:

$ gcc file1.o file2.o file3.o -o program
<linker output>

Gcc will now link the object files together into your program. If all went well, you can run the program with './program'.

If you forget the -o switch on linking, the resulting default program will be called 'a.out' and will be placed in the current working directory.

If an include file can't be found, you can use the -I switch to point gcc to the correct path. Example:

$ gcc -c -I/usr/local/include blah.c

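This will compile blah.c into blah.o. When gcc comes across a #include, it will search /usr/local/include for that file in addition to the default system include directories (and, for includes written with double quotes, the directory of the source file). Directories given with -I are searched before the system ones.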

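When you start using libraries (like libX11), you can use the -l, -L and -R flags when linking. This works as follows:

$ gcc blah.o -lX11 -o program

This will try to link the program against the file 'libX11.so' or 'libX11.a', depending on whether your system links dynamically or statically. Put -l flags after the object files that use the library: many linkers resolve symbols from left to right and may otherwise skip the library.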

Often the linker cannot find a library that lives outside its default search path (as is the case for X11 here), so get into the habit of passing the path explicitly:

$ gcc -L/usr/X11R6/lib -R/usr/X11R6/lib blah.o -lX11 -o program

The -L flag tells the linker where to find the 'libX11.so' or 'libX11.a' file. The -R flag 'ties' this path into the program. It is needed in addition to -L because, with dynamic linking, the library's contents are not copied into the program; the library itself is loaded whenever the program is started, so its path must still be known at run time. With static linking, the contents of 'libX11.a' are copied into the final binary, which makes it a lot bigger. Nowadays almost all systems use dynamic linking, so 'libX11.so' is the file that gets used.

To make sure the dynamic linker can find the library, you need to tie in the library's path. Some systems (notably, most Linux systems) function even when you don't tie in the path yourself, because they 'know' about some common paths, but some other, more correct systems do not 'know' any of the common paths. This seems to be for security reasons, but I don't know the exact details of why this is insecure (it probably has to do with chrooted environments). Where the compiler accepts it, it never hurts to use '-R', so please do so in every project, since you might some day want to run it on another system. Some gcc installations, notably on Linux, want the same path spelled as '-Wl,-rpath,/usr/X11R6/lib' instead.

Using Make

It can be tedious work to type in those long command lines all the time, so you probably want to use Make to automate this task.

A Makefile generally looks like the following:

# This is a comment
program: hello.o
	gcc hello.o -o program

hello.o: hello.c
	gcc -c hello.c

Save the lines above in a file called 'Makefile' and type 'make'. Make will try to build the first 'target' (a word followed by a colon) it encounters, which in our case is 'program'. It looks at what is required before it can do so, which is hello.o (listed after the colon). Warning: the whitespace in front of the commands below the target must be exactly one tab, or it will not work. This is a deeply entrenched bug in all versions of Make. (Don't let anyone convince you it's a feature. It's not.)

So if 'hello.o' does not exist, Make must first build it before it can build 'program'. It can do this from 'hello.c', as the next target shows. If hello.o already exists, Make checks whether hello.c is newer (in date/time) than hello.o. If it is not, hello.o does not have to be rebuilt, as obviously(?) the file hasn't been modified. If hello.c is newer, Make has to rebuild hello.o from hello.c.

When Make has determined that it must build a target, it will look at the indented lines (with a tab) directly following the target line. These are the commands it will execute in sequence, expecting the target file to be the result of executing them.

If you wish to build a particular file only, just type 'make <target>', where <target> is the name of the target you wish to build. Now we'll conclude with a bigger example of a Makefile:

# Assign variables.  Doing this makes the Makefile easier to
# modify if a path is incorrect, or another compiler is used.
LINKFLAGS=-L/usr/X11R6/lib -R/usr/X11R6/lib

# NB! Don't assign CC here, as many Linux folks do: that makes it
# impossible to use another compiler without editing the Makefile.

# 'all' is the first target in this file, so it is the one that gets
# built by default (when you just type 'make').  Make has no special
# knowledge of the name 'all'; it is only a convention.
all: myprog

# In the following, '${CC}' will expand to the contents of the
# variable 'CC'.  Libraries (-lX11) come after the object files
# that use them.
myprog: first.o second.o
	${CC} ${LINKFLAGS} first.o second.o -lX11 -o myprog

first.o: first.c
	${CC} -c first.c

second.o: second.c
	${CC} -c second.c

As you can see, you can do rather interesting things with Make.

Consider using the BSD Make scripts; they simplify handling projects a lot. The system itself is built using them.

Using BSD Make

Makefiles are nice, but typing the same lines all the time can get very annoying, even if you use SUFFIXES.

BSD's Make is a very nice Make, which comes pre-packed with some files which can make your life a lot easier and your Makefiles more elegant, but it is not compatible with GNU make. Some things are even incompatible between the Makes of different versions of BSD. Now that we've got that out of the way, let's see an example:

PROG=   test
SRCS=   test_a.c test_b.c
# We have written no manpage yet, so tell Make not to try and
# build it from nroff sources.  If you do have a manpage, you
#  usually won't need this line since the default name
#  of the manpage is ${PROG}.1 .
MAN=

.include <bsd.prog.mk>

That's all there is to it! Put this in the same directory as the 'test' program's sources and you're good to go.

If you're on a non-BSD system, chances are that the normal 'make' program will choke on this file. In that case, the BSD Make might be installed as 'pmake' or 'bmake'. On Mac OS X, BSD make is called 'bsdmake'; the default 'make' there is GNU Make. If you can't find a BSD make on your particular system, ask your administrator about it.

The bsd.prog.mk file (in /usr/share/mk) does all the work of building the program, taking care of dependencies etc. This file also makes available a plethora of targets, like 'all', 'clean', 'distclean' and even 'install'. A good BSD Make implementation will even call 'lint' on your source files to ensure your code is nice and clean.

If you wish to add flags to the C compiler, the clean way to do it is like this:

CFLAGS+= -I/usr/X11R6/include

For the linker, this is done by

LDADD=  -lX11

If you're adding libraries or include paths, be sure to let lint know about them:

LINTFLAGS+= -lX11 -I/usr/X11R6/include

If you're creating a library, the Makefile looks slightly different:

LIB=    mylib
SRCS=   mylib_a.c mylib_b.c

.include <bsd.lib.mk>

A library doesn't have a manpage by default. You can force one to be built by supplying a MAN line, of course.

As you can see, the BSD Make system is extremely elegant for large projects. It works for simple projects too, but only if you have one program per directory: the system does not handle multiple programs in one directory at all. Of course, in large projects, using a directory for each program is a must to keep the project structured, so this shouldn't be a major problem.

The main directory of a project should contain this Makefile:

SUBDIR=         test mylib

.include <bsd.subdir.mk>

Additionally, bsd.prog.mk and bsd.lib.mk always include the file ../Makefile.inc, so you can keep global settings (like DEBUG switches etc) in a Makefile.inc file at toplevel.

For more information, there is usually a /usr/share/mk/bsd.README file which explains BSD Make more completely than this little document. See also BSD Make.

Tip: Use LaTeX-mk if you want to build LaTeX sources with this type of BSD Makefiles.

Debugging programs with gdb

Now you know how to compile and link a program, but debugging is often very useful as well. I'll quickly explain some common things one can do with gdb, the GNU debugger.

At first, when you see someone using the debugger, it looks like yet another black art that can be done by gurus only, but it's not really too difficult, even though it's a command line debugger. If you practice a bit, using gdb will become second nature.

First, it is handy to compile a program with debugging symbols. This means the debugger knows about the variable and function names you use in your program. To do this, use the -g switch for gcc when compiling:

$ gcc -g blah.c -o program

For each object file, you must use -g on compilation. On linking, -g can be omitted. Now, run the debugger:

$ gdb program
<output and prompt>

To run the program, just type 'run'. To load a new binary into gdb, use 'file <binary>'. For information about possible commands, use 'help'.

When your program crashes, you can use the command 'bt' to examine the stack. With 'frame <N>' you can select stack frame N. For example, suppose you're in the middle of a function foo which is called from function bar; you can switch to bar by moving one frame up the stack, towards the caller, with 'up' (or 'frame 1'). Stack frame 0 is always the most recently called function.

With 'l' (or 'list') you can list a certain file, at a certain point. By default 'l' shows the point where the debugger has halted right now. It always lists 10 lines of context, with the middle line being the current line. If you wish to see some other file or some other line, use 'l file.c:line', with file.c being the file to look at, line the number of the line. With every consecutive 'l', the next lines will be shown, so you can scroll through the program by hitting 'l' all the time.

The nice thing about gdb is that when you simply press enter, it repeats the last command. So you need to type 'l' only once, and keep pressing enter until you see the right piece of code. This is especially useful when using 'step' (see below).

To investigate a variable, just use 'print <variable>'. You can print the contents of pointers in the same fashion as you would in C itself, by prefixing the pointer with a *. You can also cast to int, char etc. if you wish to see only that many bits of the memory (useful when investigating buffers). Printing a variable only works if you've selected a stack frame in which the variable has a meaning. So if function foo has an integer called 'i' and function bar also has one, it depends on the selected stack frame which of the two i's contents you get to see.

With 'break' you can set breakpoints. The program being debugged will stop when it gets to the line of the breakpoint. Breakpoints work like listings when giving line numbers/filenames. You can clear a breakpoint by issuing the same, but with 'break' replaced by 'clear'.

To continue the program, use 'continue'. This just resumes executing the program from the point where it halted. If the program really crashed, continue is of course not possible.

To walk through a program line by line, use 'step'. This way, after each line is executed, you can investigate how variables changed. It is common practice to set a breakpoint at a place in the program where you expect a problem to occur. When that breakpoint is hit, you can use 'step' to step through the failing part of the program.

If you think a certain function will work, use 'finish', which is like a continue, but only for the current function. When the current function returns to its caller, the program stops again.

With 'kill' you can immediately kill your program. 'quit' quits gdb.

References

BSD Make
LaTeX-mk


Last edited late Sunday evening, November 20th, 2011
