Contents

  1. Introduction
  2. kqueue API
    1. kevent data structure
      1. <ident, filter> pair
      2. flags
      3. EV_SET() macro
    2. kqueue(2)
    3. kevent(2)
      1. timeout
  3. Examples
    1. A timer example
    2. A raw tcp client
    3. More examples
  4. Documentation

Introduction

The purpose of this document is to introduce the programmer to the methodology of kqueue, rather than providing a full and exhaustive documentation of its capabilities.

Kqueue provides a standard API for applications to register their interest in various events/conditions and have the notifications for these delivered in an efficient way. It was designed to be scalable, flexible, reliable and correct.

kqueue API

kevent data structure

The kevent structure goes like this:

struct kevent {
            uintptr_t ident;        /* identifier for this event */
            uint32_t  filter;       /* filter for event */
            uint32_t  flags;        /* action flags for kqueue */
            uint32_t  fflags;       /* filter flag value */
            int64_t   data;         /* filter data value */
            void     *udata;        /* opaque user data identifier */
};

<ident, filter> pair

A kevent is identified by an <ident, filter> pair. The ident might be a descriptor (file, socket, stream), a process ID or a signal number, depending on what we want to monitor. The filter identifies the kernel filter used to process the respective event. There are some pre-defined system filters, such as EVFILT_READ or EVFILT_WRITE, which are triggered when data is available for reading or when a write operation is possible, respectively.

If, for instance, we want to be notified when there's data available for reading in a socket, we have to specify a kevent in the form <sckfd, EVFILT_READ>, where sckfd is the file descriptor associated with the socket. If we would like to monitor the activity of a process, we would need a <pid, EVFILT_PROC> tuple. Keep in mind there can be only one kevent with the same <ident, filter> pair in our kqueue.

flags

After having designed a kevent, we should decide whether we want to have it added to our kqueue. For this purpose we set the flags member to EV_ADD. We could also delete an existing one by setting EV_DELETE or just disable it with EV_DISABLE.

Combinations may be made by OR'ing the desired values. For instance, EV_ADD | EV_ENABLE | EV_ONESHOT would translate to "Add the event, enable it and return only the first occurrence of the filter being triggered. After the user retrieves the event from the kqueue, delete it."

Conversely, if we would like to check whether a flag is set in a kevent, we would do it by AND'ing with the respective value. For instance:

if (myevent.flags & EV_ERROR) {
   /* handle errors */
}

EV_SET() macro

The EV_SET() macro is provided for ease of initializing a kevent structure. For the time being we won't elaborate on the rest of the kevent members; instead let's have a look at the case when we need to monitor a socket for any pending data for reading:

struct kevent ev;

EV_SET(&ev, sckfd, EVFILT_READ, EV_ADD, 0, 0, 0);

If we wanted to monitor a set of N sockets, we would write something like this:

struct kevent ev[N];
int i;

for (i = 0; i < N; i++)
   EV_SET(&ev[i], sckfd[i], EVFILT_READ, EV_ADD, 0, 0, 0);

kqueue(2)

The kqueue holds all the events we are interested in. Therefore, to begin with, we must create a new kqueue. We do so with the following code:

int kq;

if ((kq = kqueue()) == -1) {
   perror("kqueue");
   exit(EXIT_FAILURE);
}

kevent(2)

At this point the kqueue is empty. In order to populate it with a set of events, we use the kevent(2) function. This system call takes the array of changes we constructed before (the changelist), applies them to the kqueue, and then blocks until at least one event is triggered (or until an associated timeout expires). It returns the number of events retrieved and stores information about them in another array of struct kevent elements (the eventlist).

struct kevent chlist[N];   /* events we want to monitor */
struct kevent evlist[N];   /* events that were triggered */
int nev, i;

/* populate chlist with the events we are interested in */
/* ... */

/* loop forever */
for (;;) {
   nev = kevent(kq, chlist, N, 
                    evlist, N,
                    NULL);   /* block indefinitely */

   if (nev == -1) {
      perror("kevent()");
      exit(EXIT_FAILURE);
   }
   else if (nev > 0) {
      for (i = 0; i < nev; i++) {
         /* handle events */
      }
   }
}

timeout

Sometimes it is useful to set an upper limit on the time kevent() may block. That way, it will return even if none of the events was triggered. For this purpose we need the timespec structure, which is defined in sys/time.h:

struct timespec {
            time_t tv_sec;        /* seconds */
            long   tv_nsec;       /* and nanoseconds */
};

The above code would turn into the following:

struct kevent chlist[N];   /* events we want to monitor */
struct kevent evlist[N];   /* events that were triggered */
struct timespec tmout = { 5,     /* block for 5 seconds at most */ 
                          0 };   /* nanoseconds */
int nev, i;

/* populate chlist with the events we are interested in */
/* ... */

/* loop forever */
for (;;) {
   nev = kevent(kq, chlist, N, 
                    evlist, N,
                    &tmout);   /* set upper time limit to block */

   if (nev == -1) {
      perror("kevent()");
      exit(EXIT_FAILURE);
   }
   else if (nev == 0) {
      /* handle timeout */
   }
   else if (nev > 0) {
      for (i = 0; i < nev; i++) {
         /* handle events */
      }
   }
}

Note that if we pass a non-NULL pointer to a zeroed timespec structure, kevent() returns immediately, even when no events are pending. Calling it this way in a tight loop amounts to busy polling and gives up the efficiency of blocking on the kqueue.
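Timeouts are often expressed in milliseconds, whereas kevent() expects seconds and nanoseconds. A small conversion helper could look like the sketch below. ms_to_timespec() is a hypothetical name, not part of the kqueue API, and it assumes a non-negative argument.

```c
#include <time.h>

/*
 * Hypothetical helper: convert a non-negative number of milliseconds
 * into the struct timespec that kevent() expects as its timeout.
 */
struct timespec ms_to_timespec(long ms)
{
   struct timespec ts;

   ts.tv_sec  = ms / 1000;                /* whole seconds */
   ts.tv_nsec = (ms % 1000) * 1000000L;   /* remainder in nanoseconds */
   return ts;
}
```

With this helper, the 5 second limit used above could be written as struct timespec tmout = ms_to_timespec(5000); and a limit of 1500 ms yields 1 second plus 500000000 nanoseconds.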

Examples

A timer example

The following code will set up a timer that triggers a kevent every 5 seconds. Each time it does, the process will fork and the child will execute the date(1) command.

#include <sys/event.h>
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* for strerror() */
#include <unistd.h>

/* function prototypes */
void diep(const char *s);

int main(void)
{
   struct kevent change;    /* event we want to monitor */
   struct kevent event;     /* event that was triggered */
   pid_t pid;
   int kq, nev;

   /* create a new kernel event queue */
   if ((kq = kqueue()) == -1)
      diep("kqueue()");

   /* initialise kevent structure */
   EV_SET(&change, 1, EVFILT_TIMER, EV_ADD | EV_ENABLE, 0, 5000, 0);

   /* loop forever */
   for (;;) {
      nev = kevent(kq, &change, 1, &event, 1, NULL);

      if (nev < 0)
         diep("kevent()");

      else if (nev > 0) {
         if (event.flags & EV_ERROR) {   /* report any error */
            fprintf(stderr, "EV_ERROR: %s\n", strerror(event.data));
            exit(EXIT_FAILURE);
         }

         if ((pid = fork()) < 0)         /* fork error */
            diep("fork()");

         else if (pid == 0)              /* child */
            if (execlp("date", "date", (char *)0) < 0)
               diep("execlp()");
      }
   }

   close(kq);
   return EXIT_SUCCESS;
}

void diep(const char *s)
{
   perror(s);
   exit(EXIT_FAILURE);
}

Compile and run:

$ gcc -o ktimer ktimer.c -Wall -W -Wextra -ansi -pedantic
$ ./ktimer
Tue Mar 20 15:48:16 EET 2007
Tue Mar 20 15:48:21 EET 2007
Tue Mar 20 15:48:26 EET 2007
Tue Mar 20 15:48:31 EET 2007
^C

A raw tcp client

We will implement a raw TCP client using the kqueue framework. Whenever the host sends data to the socket, we will print it to the standard output stream. Similarly, when the user types something into the standard input stream, we will send it to the host through the socket. Basically, we need to monitor the following:

  1. any incoming host data in the socket
  2. any user data in the standard input stream

#include <sys/event.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFSIZE 1024

/* function prototypes */
void diep(const char *s);
int tcpopen(const char *host, int port);
void sendbuftosck(int sckfd, const char *buf, int len);

int main(int argc, char *argv[])
{
   struct kevent chlist[2];   /* events we want to monitor */
   struct kevent evlist[2];   /* events that were triggered */
   char buf[BUFSIZE];
   int sckfd, kq, nev, i;

   /* check argument count */
   if (argc != 3) {
      fprintf(stderr, "usage: %s host port\n", argv[0]);
      exit(EXIT_FAILURE);
   }

   /* open a connection to a host:port pair */
   sckfd = tcpopen(argv[1], atoi(argv[2]));

   /* create a new kernel event queue */
   if ((kq = kqueue()) == -1)
      diep("kqueue()");

   /* initialise kevent structures */
   EV_SET(&chlist[0], sckfd, EVFILT_READ, EV_ADD | EV_ENABLE, 0, 0, 0);
   EV_SET(&chlist[1], fileno(stdin), EVFILT_READ, EV_ADD | EV_ENABLE, 0, 0, 0);

   /* loop forever */
   for (;;) {
      nev = kevent(kq, chlist, 2, evlist, 2, NULL);

      if (nev < 0)
         diep("kevent()");

      else if (nev > 0) {
         if (evlist[0].flags & EV_EOF)                       /* read direction of socket has shutdown */
            exit(EXIT_FAILURE);

         for (i = 0; i < nev; i++) {
            if (evlist[i].flags & EV_ERROR) {                /* report errors */
               fprintf(stderr, "EV_ERROR: %s\n", strerror(evlist[i].data));
               exit(EXIT_FAILURE);
            }

            if (evlist[i].ident == sckfd) {                  /* we have data from the host */
               memset(buf, 0, BUFSIZE);
               if (read(sckfd, buf, BUFSIZE - 1) < 0)        /* keep the terminating NUL */
                  diep("read()");
               fputs(buf, stdout);
            }

            else if (evlist[i].ident == fileno(stdin)) {     /* we have data from stdin */
               memset(buf, 0, BUFSIZE);
               if (fgets(buf, BUFSIZE, stdin) != NULL)
                  sendbuftosck(sckfd, buf, strlen(buf));
            }
         }
      }
   }

   close(kq);
   return EXIT_SUCCESS;
}

void diep(const char *s)
{
   perror(s);
   exit(EXIT_FAILURE);
}

int tcpopen(const char *host, int port)
{
   struct sockaddr_in server;
   struct hostent *hp;
   int sckfd;

   if ((hp = gethostbyname(host)) == NULL)
      diep("gethostbyname()");

   if ((sckfd = socket(PF_INET, SOCK_STREAM, 0)) < 0)
      diep("socket()");

   memset(&server, 0, sizeof(server));
   server.sin_family = AF_INET;
   server.sin_port = htons(port);
   server.sin_addr = *((struct in_addr *)hp->h_addr);

   if (connect(sckfd, (struct sockaddr *)&server, sizeof(server)) < 0)
      diep("connect()");

   return sckfd;
}

void sendbuftosck(int sckfd, const char *buf, int len)
{
   int bytessent, pos;

   /* send() may transmit fewer bytes than requested, so loop until done */
   pos = 0;
   while (pos < len) {
      if ((bytessent = send(sckfd, buf + pos, len - pos, 0)) < 0)
         diep("send()");
      pos += bytessent;
   }
}

Compile and run:

$ gcc -o kclient kclient.c -Wall -W -Wextra -ansi -pedantic
$ ./kclient irc.freenode.net 7000
NOTICE AUTH :*** Looking up your hostname...
NOTICE AUTH :*** Found your hostname, welcome back
NOTICE AUTH :*** Checking ident
NOTICE AUTH :*** No identd (auth) response
USER guest tolmoon tolsun :Ronnie Reagan
NICK Wiz
:herbert.freenode.net 001 Wiz :Welcome to the freenode IRC Network Wiz
^C

(The USER and NICK lines are what we type.)

More examples

More kqueue examples (including the aforementioned) may be found here.

Documentation

  1. kqueue(2): kqueue, kevent NetBSD Manual Pages
  2. Kqueue: A generic and scalable event notification facility (pdf)
  3. kqueue slides
  4. The Julipedia: An example of kqueue

The example code at http://repo.or.cz/w/eleutheria.git/blob_plain/master:/kqueue/kqdir.c has a bug. It skips the first file in the directory of files it's supposed to monitor.

I was able to fix this by changing line 51 to while(cnt++ < 2 && (pdent = readdir(pdir)) != NULL)

Comment by Kevin early Thursday morning, January 1st, 2015