May 2008

Packet Reading with libpcap Part 1

Tcpdump, Snort and similar tools are great; administrators and programmers alike can leverage them for everything from basic packet header reading down to bit-for-bit analysis of what is happening on a network, when and where. How do they work? If someone wished to include packet reading functionality in their own software, what might be the best method? This text takes a first pass at setting up a simple packet reading program using the libpcap packet capture library.

A solid understanding of network packet structure and basic C programming skills (up to pointers and data structures) is recommended for this text.

The PCAP Site

Before delving any further it is worth noting that the tcpdump/libpcap site has a variety of documentation and examples that can be used if this text proves unsuitable or a reader simply wishes to get going a little faster. Indeed, the examples in the text below are derived from both examples at the tcpdump site and some of the tcpdump code itself.

What is libpcap?

The libpcap library can be used to read, record, inject and in general deal with network packets at a higher level than raw sockets. Essentially libpcap can be used to easily collect or manipulate packets. Libpcap functions also abstract away many of the differences between operating systems' network APIs, which makes programs that leverage libpcap generally more portable and saves the programmer the headache of writing their own network API layer. This is not to say that dealing with packets is easy even with libpcap; just slightly easier.

Getting Libpcap

Installing libpcap itself may not always be enough; below is how it is installed for a variety of systems - note one cheap way to make sure enough bits are installed is to install Nmap:

  • Debian/Ubuntu (and other Debian based distributions): apt-get install libpcap-dev
  • FreeBSD: cd /usr/ports/net/libpcap && make install
  • NetBSD: cd /usr/pkgsrc/net/libpcap && make install

Key Data Structures and Definitions

Before jumping head first into utilizing libpcap, an overview of the two major data structures and some of the definitions is needed. Point one of programming with libpcap is to understand that all the software does is get and (optionally) manipulate data - nothing more. Dealing with network packets (not unlike kernel programming) is not some mystical voodoo realm; it is data handling: nothing more, nothing less. Since the pktutils programs nread and nject (older versions of them, actually) are being used as a reference, the structure names map to those programs.

The IP Header

struct nread_ip {
    u_int8_t        ip_vhl;          /* header length, version    */
#define IP_V(ip)    (((ip)->ip_vhl & 0xf0) >> 4)
#define IP_HL(ip)   ((ip)->ip_vhl & 0x0f)
    u_int8_t        ip_tos;          /* type of service           */
    u_int16_t       ip_len;          /* total length              */
    u_int16_t       ip_id;           /* identification            */
    u_int16_t       ip_off;          /* fragment offset field     */
#define IP_DF 0x4000                 /* dont fragment flag        */
#define IP_MF 0x2000                 /* more fragments flag       */
#define IP_OFFMASK 0x1fff            /* mask for fragmenting bits */
    u_int8_t        ip_ttl;          /* time to live              */
    u_int8_t        ip_p;            /* protocol                  */
    u_int16_t       ip_sum;          /* checksum                  */
    struct  in_addr ip_src, ip_dst;  /* source and dest address   */
};
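
To make the macros and masks concrete, here is a minimal sketch, independent of libpcap, that applies the same bit operations to plain values. The helper names are hypothetical and not part of nread; the masks are copied from the structure above, and the inputs are assumed to be in host byte order already:

```c
#include <stdint.h>

/* Same masks as in struct nread_ip above. */
#define IP_DF      0x4000
#define IP_MF      0x2000
#define IP_OFFMASK 0x1fff

/* Hypothetical helpers; vhl is the first header byte, off is ip_off
 * after ntohs(). */
static unsigned ip_version(uint8_t vhl)
{
    return (vhl & 0xf0) >> 4;          /* high nibble: version */
}

static unsigned ip_header_bytes(uint8_t vhl)
{
    return (vhl & 0x0f) * 4u;          /* low nibble counts 32-bit words */
}

static int ip_dont_fragment(uint16_t off)
{
    return (off & IP_DF) != 0;
}

static int ip_more_fragments(uint16_t off)
{
    return (off & IP_MF) != 0;
}

static unsigned ip_fragment_offset(uint16_t off)
{
    return (off & IP_OFFMASK) * 8u;    /* offset is stored in 8-byte units */
}
```

For the common first byte 0x45, ip_version() yields 4 and ip_header_bytes() yields 20. A raw ip_off field has to go through ntohs() before being handed to the fragment helpers.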

The TCP Header

struct nread_tcp {
    u_short th_sport; /* source port            */
    u_short th_dport; /* destination port       */
    tcp_seq th_seq;   /* sequence number        */
    tcp_seq th_ack;   /* acknowledgement number */
#if BYTE_ORDER == LITTLE_ENDIAN
    u_int th_x2:4,    /* (unused)    */
    th_off:4;         /* data offset */
#endif
#if BYTE_ORDER == BIG_ENDIAN
    u_int th_off:4,   /* data offset */
    th_x2:4;          /* (unused)    */
#endif
    u_char th_flags;
#define TH_FIN 0x01
#define TH_SYN 0x02
#define TH_RST 0x04
#define TH_PUSH 0x08
#define TH_ACK 0x10
#define TH_URG 0x20
#define TH_ECE 0x40
#define TH_CWR 0x80
    u_short th_win; /* window */
    u_short th_sum; /* checksum */
    u_short th_urp; /* urgent pointer */
};

Note how the names and comments make the data structures very self explanatory. Again - it is just data: packet data at that. In fact the example programs do not use the data much until printing out information.
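
As an illustration of "it is just data", the TH_* bits can be turned into a short readable string with nothing more than a table and a loop. This is a sketch only; tcp_flags_str is a hypothetical helper that does not appear in nread, and the letters chosen for each flag are an assumption:

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits, as in struct nread_tcp above. */
#define TH_FIN  0x01
#define TH_SYN  0x02
#define TH_RST  0x04
#define TH_PUSH 0x08
#define TH_ACK  0x10
#define TH_URG  0x20
#define TH_ECE  0x40
#define TH_CWR  0x80

/*
 * Hypothetical helper: write one letter per set flag into out and
 * return out. The buffer must hold at least 9 bytes (8 flags + NUL).
 */
static char *tcp_flags_str(uint8_t flags, char out[9])
{
    static const struct { uint8_t bit; char letter; } map[] = {
        { TH_FIN, 'F' }, { TH_SYN, 'S' }, { TH_RST, 'R' }, { TH_PUSH, 'P' },
        { TH_ACK, 'A' }, { TH_URG, 'U' }, { TH_ECE, 'E' }, { TH_CWR, 'C' },
    };
    size_t i, n = 0;

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (flags & map[i].bit)
            out[n++] = map[i].letter;
    }
    out[n] = '\0';
    return out;
}
```

Given the th_flags byte of a SYN/ACK segment (0x12), the helper produces the string "SA".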

Following are the includes that will be used for all three passes of the program - note they are not all used until the last swipe at the program:

#define _BSD_SOURCE 1

#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>

#include <arpa/inet.h>
#include <net/ethernet.h>
 
#ifdef LINUX
#include <netinet/ether.h>
#endif
 
#include <netinet/if_ether.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <netinet/udp.h>
 
#include <fcntl.h>
#include <getopt.h>
#include <ifaddrs.h>
#include <netdb.h>
#include <pcap.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>
#include <unistd.h>

Pass One: Simple Header Reading

First set up the needed bits:

int main(void)
{
1   pcap_t *descr;                 /* Session descr             */
2   char *dev;                     /* The device to sniff on    */
3   char errbuf[PCAP_ERRBUF_SIZE]; /* Error string              */
4   struct bpf_program filter;     /* The compiled filter       */
5   bpf_u_int32 mask;              /* Our netmask               */
6   bpf_u_int32 net;               /* Our network address       */
7   u_char* args = NULL;           /* User data for pcap callback */
...

Line by Line

  1. pcap_t *descr - this is the session descriptor for a new pcap session.
  2. char *dev - the name of the network interface to read from.
  3. char errbuf[PCAP_ERRBUF_SIZE] - A buffer for pcap only errors (internal to libpcap but used by the caller).
  4. struct bpf_program filter - the compiled network filter; pcap utilizes the Berkeley Packet Filter (hence the bpf nomenclature).
  5. bpf_u_int32 mask - holds the netmask for the interface and filter.
  6. bpf_u_int32 net - holds the network address for the interface and filter.
  7. u_char* args = NULL - user data that pcap hands to the callback function on every call; nothing is passed here.

The function of the variables is pretty self-evident - so far nothing too strange. Next are some checks and opening the device itself. First make sure the user is root. On most systems pcap will simply fail to open the device for a non-root user, but life is simpler if the program exits well before even bothering to try:

    if (getuid()) {
        printf("Error! Must be root ... exiting\n");
        return (1);
    }

Set the device:

    dev = pcap_lookupdev(errbuf);

    if (dev == NULL) {
        printf("%s\n", errbuf);
        return (1);
    }

What is happening above: pcap_lookupdev() returns the name of the first valid network interface it can find that is not a loopback device. In later versions of the program the interface will be an argument option, so the default is not the only choice a user has.

Once the device is set it is time to set up the pcap descriptor:

    descr = pcap_open_live(dev, BUFSIZ, 1, 0, errbuf);

    if (descr == NULL) {
        printf("pcap_open_live(): %s\n", errbuf);
        return (1);
    }

Last but not least, before getting into more specific nuts and bolts, the network address and netmask of the interface must be ascertained; again pcap provides a method to do so:

    if (pcap_lookupnet(dev, &net, &mask, errbuf) == -1)
        net = mask = 0; /* lookup failed: carry on without an address */

Before actually calling pcap to read anything a decision has to be made: collect one packet, a number of packets, or read continuously? For the example continuous reading will be used. Prior to setting up the actual reading there is a little grunt work to be performed. Three basic hurdles need to be jumped before reading the actual data:

  1. Compile and setup a filter (for the purposes here there is no filter).
  2. Create the IP and Ethernet handlers to actually deal with the packets themselves.
  3. A callback function to cycle through the reader functions.

It certainly sounds challenging - but it is not as bad as it reads.

Compiling and Setting up a Filter

The pcap_compile and pcap_setfilter functions are pretty simple - especially considering at this point they are not being modified from the defaults:

    if (pcap_compile(descr, &filter, " ", 0, mask) == -1) {
        fprintf(stderr, "Error compiling pcap filter: %s\n", pcap_geterr(descr));
        return (1);
    }

    if (pcap_setfilter(descr, &filter) == -1) {
        fprintf(stderr, "Error setting pcap filter: %s\n", pcap_geterr(descr));
        return (1);
    }

Note that argument 3 of pcap_compile is where the filter expression is passed to libpcap, using the same syntax as tcpdump filters; for example pcap_compile(descr, &filter, "host 192.169.0.10", 0, mask) would read only packets to or from the host at IP address 192.169.0.10. The last argument is the netmask of the capture network (not its address), which libpcap needs only for filters that test for IPv4 broadcast addresses. After the call, filter holds the compiled program, which is what pcap_setfilter then applies.

Getting the Handlers Ready

The next step is writing the IP and Ethernet handlers; each should live in its own function.

Ethernet Handler

The Ethernet handler determines the frame type and returns it to the caller; for good measure it also prints out information. Later on it might be wise to make the Ethernet printing optional. First up are the passed in variables - pretty straightforward:

u_int16_t ethernet_handler (u_char *args, const struct pcap_pkthdr* pkthdr,
                                              const u_char* packet)

{
...
  1. The user argument handed through from pcap_loop() (unused here)
  2. The packet header
  3. Packet data

Local variables needed:

        u_int caplen = pkthdr->caplen; /* length of portion present from bpf  */
        u_int length = pkthdr->len;    /* length of this packet off the wire  */
        struct ether_header *eptr;     /* net/ethernet.h                      */
        u_short ether_type;            /* the type of packet (we return this) */
        eptr = (struct ether_header *) packet;
        ether_type = ntohs(eptr->ether_type);
...
  1. The length as described by bpf
  2. Actual length
  3. Ethernet header data structure
  4. Type of ethernet packet
  5. Create a pointer to the ethernet header
  6. Get the actual type and assign it
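
Note that caplen is declared above but never consulted before the packet pointer is cast. A minimal sketch of that missing length check follows, written against raw bytes so that it depends neither on libpcap nor on struct ether_header. The name ether_type_of and the constant ETHER_HEADER_BYTES are hypothetical, and untagged Ethernet (no VLAN tag) is assumed:

```c
#include <stddef.h>

#define ETHER_HEADER_BYTES 14   /* 6 dst + 6 src + 2 type, untagged */

/*
 * Hypothetical helper: return the EtherType of a captured frame in host
 * byte order, or -1 if fewer than ETHER_HEADER_BYTES bytes were captured.
 */
static int ether_type_of(const unsigned char *packet, size_t caplen)
{
    if (packet == NULL || caplen < ETHER_HEADER_BYTES)
        return -1;

    /* The type field is big-endian on the wire, at bytes 12 and 13. */
    return (packet[12] << 8) | packet[13];
}
```

In the handler this would be called with pkthdr->caplen, and a result of -1 would mean the frame is too short to inspect at all. A result of 0x0800 corresponds to ETHERTYPE_IP.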

Last, print out the information then compare the returned type to the types of interest to determine what to print and return the type:

        fprintf(stdout,"eth: ");
        fprintf(stdout,
        "%s ",ether_ntoa((struct ether_addr*)eptr->ether_shost));
        fprintf(stdout,
        "%s ",ether_ntoa((struct ether_addr*)eptr->ether_dhost));
 
        if (ether_type == ETHERTYPE_IP) {
                fprintf(stdout,"(ip)");
        } else  if (ether_type == ETHERTYPE_ARP) {
                fprintf(stdout,"(arp)");
        } else  if (ether_type == ETHERTYPE_REVARP) {
                fprintf(stdout,"(rarp)");
        } else {
                fprintf(stdout,"(?)");
        }
 
        fprintf(stdout," %u\n",length); /* print len */
 
        return ether_type;
}

IP Handler

The IP handler takes care of TCP/IP in total. Note that it has the same arguments as the Ethernet handler:

u_char* ip_handler (u_char *args,const struct pcap_pkthdr* pkthdr,
                                             const u_char* packet)
{
...

The local variables and structures are far different, however. Now both IP and TCP must be taken into account in addition to key data points that are extracted from the data structures for formatted printing:

        const struct nread_ip* ip;   /* ip structure                    */
        const struct nread_tcp* tcp; /* tcp structure                   */
        u_int length = pkthdr->len;  /* packet length off the wire      */
        u_int off, version;          /* fragment offset field, version  */
        int len;                     /* total length from the ip header */
...
  1. The IP data structure detailed earlier
  2. The TCP data structure detailed earlier
  3. The length of the packet off the wire
  4. The fragment offset field and the IP version
  5. The total length as recorded in the IP header

Populate the data and print:

        ip = (const struct nread_ip*)(packet + sizeof(struct ether_header));
        length -= sizeof(struct ether_header);
        /* the ip header length varies: IP_HL() counts 32-bit words */
        tcp = (const struct nread_tcp*)(packet + sizeof(struct ether_header) +
                                                IP_HL(ip) * 4);

        len     = ntohs(ip->ip_len); /* get packet length */
        version = IP_V(ip);          /* get ip version    */

        off = ntohs(ip->ip_off);

        fprintf(stdout,"ip: ");
        /* inet_ntoa() reuses one static buffer, so each address is
           printed in its own call; multi-byte fields are converted
           from network to host byte order before printing */
        fprintf(stdout,"%s:%u->",
                        inet_ntoa(ip->ip_src), ntohs(tcp->th_sport));
        fprintf(stdout,"%s:%u ",
                        inet_ntoa(ip->ip_dst), ntohs(tcp->th_dport));
        fprintf(stdout,
                "tos %u len %d off %u ttl %u prot %u cksum %u ",
                        ip->ip_tos, len, off, ip->ip_ttl,
                        ip->ip_p, ntohs(ip->ip_sum));

        fprintf(stdout,"seq %u ack %u win %u ",
                        (u_int)ntohl(tcp->th_seq), (u_int)ntohl(tcp->th_ack),
                        ntohs(tcp->th_win));
        printf("\n");

        return NULL;
}
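
The handler above trusts that the packet is long enough and really carries TCP. A hedged sketch of the checks that could precede the TCP cast is shown below, again on raw bytes so it stands alone without libpcap. The names tcp_header_offset and be16 and the constants are hypothetical, and untagged Ethernet carrying IPv4 is assumed:

```c
#include <stddef.h>
#include <stdint.h>

#define ETHER_BYTES   14   /* untagged Ethernet header     */
#define IP_MIN_BYTES  20   /* IPv4 header without options  */
#define TCP_MIN_BYTES 20   /* TCP header without options   */
#define PROTO_TCP      6   /* value of IPPROTO_TCP         */

/* Read a big-endian 16-bit value from the wire. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: return the byte offset of the TCP header inside
 * an Ethernet/IPv4 frame, or -1 if the capture is too short, is not
 * IPv4, is a later fragment, or does not carry TCP.
 */
static int tcp_header_offset(const unsigned char *packet, size_t caplen)
{
    const unsigned char *ip;
    size_t ip_hl;

    if (packet == NULL || caplen < ETHER_BYTES + IP_MIN_BYTES)
        return -1;
    ip = packet + ETHER_BYTES;

    if ((ip[0] >> 4) != 4)                 /* version nibble       */
        return -1;
    ip_hl = (size_t)(ip[0] & 0x0f) * 4;    /* header length, bytes */
    if (ip_hl < IP_MIN_BYTES)
        return -1;
    if (be16(ip + 6) & 0x1fff)             /* not the first fragment */
        return -1;
    if (ip[9] != PROTO_TCP)                /* protocol field       */
        return -1;
    if (caplen < ETHER_BYTES + ip_hl + TCP_MIN_BYTES)
        return -1;

    return (int)(ETHER_BYTES + ip_hl);
}
```

With a valid offset in hand, the source and destination ports are be16(packet + offset) and be16(packet + offset + 2). For a frame whose IP header has no options the offset is 34.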

The Callback Function

With the handlers in place a repeatable callback function can now be put into place:

void pcap_callback(u_char *args, const struct pcap_pkthdr* pkthdr,
                                             const u_char* packet)
{
        u_int16_t type = ethernet_handler(args, pkthdr, packet);

        if (type == ETHERTYPE_IP) {
                ip_handler(args, pkthdr, packet);
        } else if (type == ETHERTYPE_ARP) {
                /* noop */
        } else if (type == ETHERTYPE_REVARP) {
                /* noop */
        }
}

The callback function is pretty simple - if the Ethernet frame carries IP, call ip_handler; ARP and RARP are recognized but ignored for now. Back in the main() function, the pcap_loop() function is called with the callback as an argument in order to loop over the packets:

pcap_loop(descr, -1, pcap_callback, args);

The arguments for the loop are:

  1. Descriptor
  2. Number of loops or a -1 to loop forever.
  3. The callback function
  4. User argument handed to the callback on every call (NULL here)

Summary & Next Time

The example code in the text can be cobbled together to create a functional IP packet reader; however, there are some flaws in the examples from both a usability and a bug perspective:

  • Nowhere is there bit continuity checking for the address portion - this can cause invalid data to be printed and not acted upon.
  • A few checks are missing such as length checks, type checks etc.
  • The example code is far too restrictive; the caller (whether a user or another program) should be able to specify a variety of options such as which interface, a filter, verbosity etc.

Network programming can be difficult, especially when supporting multiple platforms, operating systems, protocols and the rest of the myriad issues a programmer can come across. Tools like libpcap make the heavy lifting not so heavy, allowing more focus on what to do with the information available versus how to collect it. The next text presents a more fleshed-out version of the packet reader with minimal error checking.
