C Network Programming Reference

1. Introduction

This article is meant to be a quick guide/reference for C programmers who are interested in network programming on Unix-like systems. The code in this article has been tested on Linux 6.11.6.

I am somewhat new to network programming myself, so if you have any suggestions, please feel free to contribute to this page.

Some other interesting resources about network programming:

3. Getting address information

Before establishing a connection, we need to create a socket. As I mentioned above, the operating system uses socket descriptors for identifying connections and transmitting data. Sockets are created with the socket(2) function.

#include <sys/types.h>
#include <sys/socket.h>

int socket(int domain, int type, int protocol);

We could call socket with values such as PF_INET [5], SOCK_STREAM and IPPROTO_IP. However, there is a cleaner way of obtaining the information that is used when making most of these networking calls: using the getaddrinfo(3) function.
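As a minimal sketch of that direct approach, the helper below hard-codes the IPv4 domain and the stream socket type. The name open_tcp_socket is hypothetical, and the sketch passes 0 as the protocol, which asks the system for the default protocol of that domain and type (TCP in this case).

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: create an IPv4 stream socket with hard-coded values.
 * Returns the socket descriptor, or -1 on error. */
int open_tcp_socket(void) {
    /* A protocol of 0 selects the default protocol for the given domain and
     * type, which is TCP for AF_INET with SOCK_STREAM. */
    const int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        fprintf(stderr, "Could not create socket: %s\n", strerror(errno));
    return sockfd;
}
```

The caller is responsible for closing the returned descriptor with close(2).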

3.1. Usage for getaddrinfo

The getaddrinfo function fills a linked list of addrinfo structures based on its arguments.

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

int getaddrinfo(const char* node,
                const char* service,
                const struct addrinfo* hints,
                struct addrinfo** res);

Here is a brief description of each parameter:

  1. The node parameter is used to specify the target host. This is usually an IPv4 or IPv6 address [6], but it can also be a network hostname, which will be looked up and resolved. It can also be NULL, as we will see when doing a passive open below.
  2. The service parameter is a string used to specify the target service. The string usually contains the target port as a decimal number, but it can also be a service name (such as “ftp” or “http”) which will be translated to the port number according to the services(5) file.
  3. The hints parameter is a pointer to an addrinfo structure containing some hints about the type of information we want to receive. Note that unused members of this hints structure must be set to zero, so a call to memset is convenient after the definition.
  4. The res parameter is a pointer to another addrinfo pointer, and the function will use it to build a linked list of addrinfo structures. The pointer that res points to should be freed by the caller with the freeaddrinfo function.

The getaddrinfo function returns 0 on success, or a non-zero error code on error. These error codes can be converted to a human-readable string with gai_strerror. The linked list filled by getaddrinfo (through the last argument) must be freed by the caller using freeaddrinfo.
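The parameters and return value described above can be sketched with a small helper that resolves an address, walks the resulting linked list through ai_next, and frees it. The name count_addresses is hypothetical. As an assumption for this sketch, AI_NUMERICHOST is set so that only numeric addresses are accepted and no network lookup takes place.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: resolve a numeric host and a service, and return the
 * number of entries in the resulting list, or -1 on error. */
int count_addresses(const char* node, const char* service) {
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints)); /* Unused members must be zero */
    hints.ai_family   = AF_UNSPEC;    /* Accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_NUMERICHOST; /* Assumption: numeric input only */

    struct addrinfo* res;
    const int status = getaddrinfo(node, service, &hints, &res);
    if (status != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        return -1;
    }

    /* Walk the linked list until the terminating NULL pointer. */
    int count = 0;
    for (const struct addrinfo* p = res; p != NULL; p = p->ai_next)
        count++;

    freeaddrinfo(res);
    return count;
}
```

Removing AI_NUMERICHOST from the hints would also allow hostnames, at the cost of a possible name lookup.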

Different members of the addrinfo will be used throughout this article, so here is the structure definition from <netdb.h>:

#include <netdb.h>

struct addrinfo {
    int ai_flags;             /* Input flags */
    int ai_family;            /* Protocol family for socket */
    int ai_socktype;          /* Socket type */
    int ai_protocol;          /* Protocol for socket */
    socklen_t ai_addrlen;     /* Length of socket address */
    struct sockaddr* ai_addr; /* Socket address for socket */
    char* ai_canonname;       /* Canonical name for service location */
    struct addrinfo* ai_next; /* Pointer to next in list */
};

The sockaddr structure, defined in <sys/socket.h>, contains information about the socket address. However, since its members are a bit abstract, a pointer to this sockaddr structure is usually cast to a pointer to a sockaddr_in or sockaddr_in6 structure (depending on whether it’s an IPv4 or IPv6 address, respectively), both defined in <netinet/in.h> [7].

#include <netinet/in.h>

struct sockaddr_in {
    sa_family_t     sin_family;     /* AF_INET */
    in_port_t       sin_port;       /* Port number */
    struct in_addr  sin_addr;       /* IPv4 address */
};

struct sockaddr_in6 {
    sa_family_t     sin6_family;    /* AF_INET6 */
    in_port_t       sin6_port;      /* Port number */
    uint32_t        sin6_flowinfo;  /* IPv6 flow info */
    struct in6_addr sin6_addr;      /* IPv6 address */
    uint32_t        sin6_scope_id;  /* Set of interfaces for a scope */
};

struct in_addr {
    in_addr_t s_addr;
};

struct in6_addr {
    uint8_t   s6_addr[16];
};

typedef uint32_t in_addr_t;
typedef uint16_t in_port_t;
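To illustrate how the generic sockaddr pointer is cast based on its family, here is a minimal sketch that extracts a printable address and a port from either structure. The name describe_sockaddr is hypothetical. It relies on inet_ntop(3) for the text conversion and on ntohs(3) because ports are stored in network byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: write the textual address of 'sa' into 'buf' and store
 * its port (in host byte order) in 'port'. Returns 0 on success, or -1 if the
 * family is unsupported or the buffer is too small. */
int describe_sockaddr(const struct sockaddr* sa, char* buf, socklen_t buflen,
                      uint16_t* port) {
    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in* in4 = (const struct sockaddr_in*)sa;
        *port = ntohs(in4->sin_port);
        return inet_ntop(AF_INET, &in4->sin_addr, buf, buflen) != NULL ? 0 : -1;
    }

    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6* in6 = (const struct sockaddr_in6*)sa;
        *port = ntohs(in6->sin6_port);
        return inet_ntop(AF_INET6, &in6->sin6_addr, buf, buflen) != NULL ? 0
                                                                         : -1;
    }

    return -1; /* Neither IPv4 nor IPv6 */
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for both address families.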

3.2. Example code for getaddrinfo

The following example shows a call to getaddrinfo, although more specific examples will be shown below. Remember to check the value returned by getaddrinfo, and to free the linked list of addrinfo structures with freeaddrinfo after you are done using it.

struct addrinfo hints;
memset(&hints, 0, sizeof(hints));
hints.ai_family   = AF_INET;     /* IPv4 */
hints.ai_socktype = SOCK_STREAM; /* TCP */

struct addrinfo* server_info;
const int status = getaddrinfo(ip, port, &hints, &server_info);
if (status != 0) {
    fprintf(stderr, "Error: %s\n", gai_strerror(status));
    abort();
}

/* ... */

freeaddrinfo(server_info);

We can then use the members of the filled server_info to create the socket. Remember to check the value returned by socket, and to close the socket descriptor after you are done using it.

const int sockfd = socket(server_info->ai_family,
                          server_info->ai_socktype,
                          server_info->ai_protocol);
if (sockfd < 0) {
    fprintf(stderr, "Could not create socket: %s\n", strerror(errno));
    abort();
}

/* ... */

close(sockfd);
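The list filled by getaddrinfo can contain more than one entry, and a given entry might not be usable on the current machine. A common pattern, sketched below under that assumption, is to try each entry until socket succeeds instead of only using the first one. Both function names are hypothetical, and the address and port in the usage function are arbitrary.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>

/* Hypothetical helper: return a socket for the first entry of 'list' that the
 * system can create a socket for, or -1 if none of them works. */
int socket_from_list(const struct addrinfo* list) {
    for (const struct addrinfo* p = list; p != NULL; p = p->ai_next) {
        const int sockfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (sockfd >= 0)
            return sockfd;
    }
    return -1;
}

/* Hypothetical usage: resolve a numeric address, create a socket and clean
 * up. Returns 0 on success, or -1 on error. */
int demo_socket_from_list(void) {
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;   /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    struct addrinfo* server_info;
    const int status = getaddrinfo("127.0.0.1", "4321", &hints, &server_info);
    if (status != 0) {
        fprintf(stderr, "Error: %s\n", gai_strerror(status));
        return -1;
    }

    const int sockfd = socket_from_list(server_info);
    freeaddrinfo(server_info); /* Safe here: nothing else uses the list */
    if (sockfd < 0)
        return -1;

    close(sockfd);
    return 0;
}
```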

4. Communicating through TCP

To communicate data through TCP, we need to either listen and accept incoming connections (a passive open), or establish a connection to another computer on a listening port (an active open).

4.1. Connecting with a passive open

These are the general steps for establishing a connection through a passive open:

  1. Obtain a socket descriptor, used for listening.
  2. Bind a local port to the socket descriptor.
  3. Start to listen on that socket descriptor.
  4. Wait for connections, and accept them.

4.1.1. Getting our address information

We know how to obtain information about an external address (using getaddrinfo), but we will also need to obtain information about ourselves before creating the socket. We need to make two small changes when making the call:

  1. Set hints.ai_flags to AI_PASSIVE.
  2. Pass NULL as the first (node) parameter of getaddrinfo.

From the getaddrinfo(3) man page:

If the AI_PASSIVE flag is specified in hints.ai_flags, and node is NULL, then the returned socket addresses will be suitable for bind(2)ing a socket that will accept(2) connections.

It’s important to note that the second argument when calling getaddrinfo will determine the port that we will use when listening, and therefore the port that the peer will have to use when connecting to us (i.e. when doing an active open). Note that all ports below 1024 are reserved [8] for the system, so you should use a number in the range [1024..65535] (inclusive), and it should not be in use by another program.
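As a small sketch of that rule, the helper below checks whether a service string is a decimal port outside the reserved range, assuming that the reserved ports are 0 to 1023. The name is_unreserved_port is hypothetical. Service names such as “http” are rejected, since only numbers are parsed, and the check does not tell whether the port is already in use.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: return 1 if 'service' is a decimal port number
 * outside the reserved range (between 1024 and 65535), or 0 otherwise. */
int is_unreserved_port(const char* service) {
    char* end;
    errno = 0;
    const long port = strtol(service, &end, 10);

    /* Reject overflow, empty input, and trailing non-digit characters. */
    if (errno != 0 || end == service || *end != '\0')
        return 0;

    return port >= 1024 && port <= 65535;
}
```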

This is the new code for obtaining our address information. In this case, the addrinfo structure filled by getaddrinfo will refer to the port 4321 of our machine.

struct addrinfo hints;
memset(&hints, 0, sizeof(hints));
hints.ai_family   = AF_INET;
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags    = AI_PASSIVE; /* New */

struct addrinfo* self_info;
const int status = getaddrinfo(NULL, "4321", &hints, &self_info); /* Updated */
if (status != 0) {
    fprintf(stderr, "Could not obtain our address info: %s\n",
            gai_strerror(status));
    abort();
}
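To see what the passive lookup returns, the sketch below copies the resulting address out of the list so it can be inspected. The name passive_lookup is hypothetical. According to getaddrinfo(3), when node is NULL and AI_PASSIVE is set, the returned address is the wildcard address (INADDR_ANY for IPv4), so the socket will accept connections on any of the host's addresses.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>

/* Hypothetical helper: do a passive IPv4 lookup for 'port' and copy the
 * resulting socket address into 'out'. Returns 0 on success, -1 on error. */
int passive_lookup(const char* port, struct sockaddr_in* out) {
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_PASSIVE;

    struct addrinfo* self_info;
    const int status = getaddrinfo(NULL, port, &hints, &self_info);
    if (status != 0) {
        fprintf(stderr, "Could not obtain our address info: %s\n",
                gai_strerror(status));
        return -1;
    }

    /* Since the hints asked for AF_INET, ai_addr points to a sockaddr_in. */
    memcpy(out, self_info->ai_addr, sizeof(*out));
    freeaddrinfo(self_info);
    return 0;
}
```

After a successful call with "4321", ntohs(out->sin_port) is 4321 and out->sin_addr.s_addr equals htonl(INADDR_ANY).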

4.1.2. Creating the passive socket

The socket(2) function returns a socket descriptor from the specified domain (e.g. IPv4 or IPv6), socket type (e.g. TCP or UDP) and protocol (e.g. IP).

#include <sys/types.h>
#include <sys/socket.h>

int socket(int domain, int type, int protocol);

On error, -1 is returned and errno is set. If the returned socket is valid, it must be closed by the caller using close(2).

Now that self_info contains information about the current machine, we can call socket just like we did before.

const int sockfd_listen = socket(self_info->ai_family,
                                 self_info->ai_socktype,
                                 self_info->ai_protocol);
if (sockfd_listen < 0) {
    fprintf(stderr, "Could not create socket: %s\n", strerror(errno));
    abort();
}

That sockfd_listen variable will be used for the process of accepting connections, not for transmitting data after the connection is established. This is normally referred to as a passive socket.

4.1.3. Binding the socket address

Next, we need to bind the socket address (IP address, port and protocol) to the socket descriptor we just created. This can be done with the bind(2) function.

#include <sys/types.h>
#include <sys/socket.h>

int bind(int sockfd, const struct sockaddr* addr, socklen_t addrlen);

The bind function returns zero on success, or -1 on error, setting errno appropriately. We could create our own sockaddr structure, but getaddrinfo already filled one for us, so we should use that.

const int status = bind(sockfd_listen,
                        self_info->ai_addr,
                        self_info->ai_addrlen);
if (status != 0) {
    fprintf(stderr, "Could not bind to socket descriptor: %s\n",
            strerror(errno));
    abort();
}
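For comparison, this is a sketch of what creating our own sockaddr structure could look like for IPv4, which is the work that getaddrinfo saves us. The name bind_manually is hypothetical. Note that the port and the address must be converted to network byte order with htons(3) and htonl(3).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv4 TCP socket and bind it to 'port' on
 * all local addresses, filling the sockaddr_in structure by hand. Returns
 * the socket descriptor, or -1 on error. */
int bind_manually(uint16_t port) {
    const int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        fprintf(stderr, "Could not create socket: %s\n", strerror(errno));
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons(port);       /* Network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_ANY); /* Any local address */

    if (bind(sockfd, (const struct sockaddr*)&addr, sizeof(addr)) != 0) {
        fprintf(stderr, "Could not bind to socket descriptor: %s\n",
                strerror(errno));
        close(sockfd);
        return -1;
    }

    return sockfd;
}
```

Passing 0 as the port asks the system to pick a free port, which can be convenient when testing.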

4.1.4. Listening for connections

After binding the socket address, we can start listening for connections. We do this with the listen(2) function.

#include <sys/types.h>
#include <sys/socket.h>

int listen(int sockfd, int backlog);

The first parameter is the passive socket we created earlier, and the second parameter is the maximum length to which the queue of pending connections for sockfd may grow [9]. The listen function returns zero on success, or -1 on error, setting errno appropriately.

const int status = listen(sockfd_listen, 10);
if (status != 0) {
    fprintf(stderr, "Could not listen for connections: %s\n", strerror(errno));
    abort();
}

Now the system is listening for connections on the port we specified when calling getaddrinfo (in this case 4321), and it will queue incoming connections until we accept them.
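The steps so far (address lookup, socket creation, binding and listening) can be combined into a single function. The following is a minimal sketch, not a definitive implementation: the name open_listener is hypothetical, it only tries the first entry of the list, and it returns -1 instead of calling abort so that the caller can decide how to handle errors.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>

/* Hypothetical helper: return a passive socket listening on 'port', or -1 on
 * error. */
int open_listener(const char* port, int backlog) {
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_PASSIVE;

    struct addrinfo* self_info;
    const int status = getaddrinfo(NULL, port, &hints, &self_info);
    if (status != 0) {
        fprintf(stderr, "Could not obtain our address info: %s\n",
                gai_strerror(status));
        return -1;
    }

    int sockfd_listen = socket(self_info->ai_family, self_info->ai_socktype,
                               self_info->ai_protocol);
    if (sockfd_listen < 0) {
        fprintf(stderr, "Could not create socket: %s\n", strerror(errno));
    } else if (bind(sockfd_listen, self_info->ai_addr,
                    self_info->ai_addrlen) != 0) {
        fprintf(stderr, "Could not bind: %s\n", strerror(errno));
        close(sockfd_listen);
        sockfd_listen = -1;
    } else if (listen(sockfd_listen, backlog) != 0) {
        fprintf(stderr, "Could not listen: %s\n", strerror(errno));
        close(sockfd_listen);
        sockfd_listen = -1;
    }

    freeaddrinfo(self_info); /* Not needed once the socket is bound */
    return sockfd_listen;
}
```

For example, open_listener("4321", 10) would be equivalent to the calls shown in the previous sections.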

4.1.5. Accepting connections

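Incoming connections are accepted with the accept(2) function. If there are no pending connections in the queue, accept blocks until one arrives (unless the socket has been marked as non-blocking).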

#include <sys/types.h>
#include <sys/socket.h>

int accept(int sockfd, struct sockaddr* addr, socklen_t* addrlen);

The first parameter of accept is the passive socket we created with socket(2) above. The other two parameters are used to retrieve information about the computer that is connecting to us, but they can be set to NULL if we don’t care about this information.

The accept function returns a new socket descriptor used for sending and receiving data in the accepted connection. On error, it returns -1 and sets errno.

const int sockfd_connection = accept(sockfd_listen, NULL, NULL);
if (sockfd_connection < 0) {
    fprintf(stderr, "Could not accept incoming connection: %s\n",
            strerror(errno));
    abort();
}

After the connection is accepted, we can send and receive data from the peer using the returned socket descriptor.
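When the address of the peer is of interest, the last two parameters of accept can be used instead of passing NULL. The sketch below assumes an IPv4 (AF_INET) listening socket, and the name accept_with_peer is hypothetical. Note that addrlen is both an input (the size of our buffer) and an output (the size of the address that was stored).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: accept a connection on 'sockfd_listen', write the
 * peer's IPv4 address into 'host' and store its port in 'port'. Returns the
 * new socket descriptor, or -1 on error. Assumes an AF_INET listener. */
int accept_with_peer(int sockfd_listen, char* host, socklen_t hostlen,
                     uint16_t* port) {
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof(peer); /* In: buffer size. Out: bytes used. */

    const int sockfd_connection =
      accept(sockfd_listen, (struct sockaddr*)&peer, &peer_len);
    if (sockfd_connection < 0) {
        fprintf(stderr, "Could not accept incoming connection: %s\n",
                strerror(errno));
        return -1;
    }

    /* Convert the binary address to text; use an empty string on failure. */
    if (inet_ntop(AF_INET, &peer.sin_addr, host, hostlen) == NULL &&
        hostlen > 0)
        host[0] = '\0';

    *port = ntohs(peer.sin_port); /* Ports are in network byte order */
    return sockfd_connection;
}
```

A host buffer of INET_ADDRSTRLEN bytes is enough for any IPv4 address. The returned descriptor must be closed by the caller, just like the one returned by accept.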

4.2. Connecting with an active open

TODO

4.3. Sending and receiving data through sockets

TODO

Footnotes:

1

Note that these are not the only existing transport protocols. Some other examples include the Datagram Congestion Control Protocol (DCCP) and the Stream Control Transmission Protocol (SCTP).

2

See RFC 793.

3

See RFC 768.

4

Leading zeros in each colon-separated group can be omitted, and one run of consecutive all-zero groups can be replaced with a double colon (::). Therefore, the “expanded” version of that IPv6 address is 2001:0db8:0000:0000:0000:8a2e:0370:7334.

5

The PF prefix stands for Protocol Family, whereas AF stands for Address Family. In practice, AF_INET and PF_INET have the same value.

6

The IPv4 and IPv6 formats are valid according to inet_aton(3) and inet_pton(3), respectively.

7

More specifically, the sockaddr structure from <sys/socket.h> contains only a sa_family_t member (sa_family) and a char sa_data[] array. Based on the sa_family member, we can decide which sockaddr_in* structure we should use, since they provide a nicer interface.

8

See also Registered port (Wikipedia) and List of TCP and UDP port numbers (Wikipedia).

9

A value of 5 or 10 for the backlog argument is fine. The system silently truncates the argument to the value in /proc/sys/net/core/somaxconn. Since Linux 5.4, the default in this file is 4096; in earlier kernels, the default value is 128.