C Network Programming Reference
1. Introduction
This article is meant to be a quick guide/reference for C programmers who are interested in network programming on Unix-like systems. The code in this article has been tested on Linux 6.11.6.
I am somewhat new to network programming myself, so if you have any suggestions, please feel free to contribute to this page.
Some other interesting resources about network programming:
3. Getting address information
Before establishing a connection, we need to create a socket. As I mentioned
above, the operating system uses socket descriptors for identifying connections
and transmitting data. Sockets are created with the socket(2)
function.
    #include <sys/types.h>
    #include <sys/socket.h>

    int socket(int domain, int type, int protocol);
We could call socket with values such as PF_INET [5], SOCK_STREAM and
IPPROTO_IP. However, there is a cleaner way of obtaining the information that
is used when making most of these networking calls: using the getaddrinfo(3)
function.
3.1. Usage for getaddrinfo
The getaddrinfo function fills a linked list of addrinfo structures based on
its arguments.
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>

    int getaddrinfo(const char* node, const char* service,
                    const struct addrinfo* hints, struct addrinfo** res);
Here is a brief description of each parameter:
- The node parameter is used to specify the target host. This is usually an IPv4 or IPv6 address [6], but it can also be a network hostname, which will be looked up and resolved. It can also be NULL, as we will see when doing a passive open below.
- The service parameter is a string used to specify the target service. The string usually contains the target port as a decimal number, but it can also be a service name (such as “ftp” or “http”), which will be translated to the port number according to the services(5) file.
- The hints parameter is an addrinfo structure containing some hints about the type of information we want to receive. Note that unused members of this hints structure must be set to zero, so a call to memset is convenient after the definition.
- The res parameter is a pointer to another addrinfo pointer, and the function will use it to build a linked list of addrinfo structures. The list stored through res should be freed by the caller with the freeaddrinfo function.
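As a rough illustration of the two kinds of strings that node can hold, the sketch below tells numeric addresses apart from hostnames using inet_pton(3). The helper name classify_node is made up for this example. Note also that inet_pton only accepts dotted-decimal IPv4 addresses, so it is stricter than what getaddrinfo itself accepts.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of any library): returns AF_INET if `node` is
 * an IPv4 literal, AF_INET6 if it is an IPv6 literal, or 0 if it is neither
 * (for example a hostname that would have to be resolved).
 */
int classify_node(const char* node) {
    struct in_addr v4;
    struct in6_addr v6;

    if (node == NULL)
        return 0;

    /* inet_pton returns 1 only when the text is a valid address. */
    if (inet_pton(AF_INET, node, &v4) == 1)
        return AF_INET;
    if (inet_pton(AF_INET6, node, &v6) == 1)
        return AF_INET6;

    return 0;
}
```

A caller could use a check like this to decide whether a name lookup is going to happen, but getaddrinfo does not require it.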
The getaddrinfo function returns 0 on success, or non-zero on error. The error
codes returned by this function can be converted to a human-readable string
with gai_strerror. The linked list filled by getaddrinfo (through the last
argument) must be freed by the caller using freeaddrinfo.
Different members of the addrinfo structure will be used throughout this
article, so here is the structure definition from <netdb.h>:
    #include <sys/socket.h>

    struct addrinfo {
        int              ai_flags;     /* Input flags */
        int              ai_family;    /* Protocol family for socket */
        int              ai_socktype;  /* Socket type */
        int              ai_protocol;  /* Protocol for socket */
        socklen_t        ai_addrlen;   /* Length of socket address */
        struct sockaddr* ai_addr;      /* Socket address for socket */
        char*            ai_canonname; /* Canonical name for service location */
        struct addrinfo* ai_next;      /* Pointer to next in list */
    };
The sockaddr structure, defined in <sys/socket.h>, contains useful information
about the socket address. However, since its members are a bit abstract, a
pointer to this sockaddr structure is usually cast to a pointer to a
sockaddr_in or sockaddr_in6 structure (depending on whether it’s an IPv4 or
IPv6 address, respectively), both defined in <netinet/in.h> [7].
    #include <netinet/in.h>

    struct sockaddr_in {
        sa_family_t    sin_family; /* AF_INET */
        in_port_t      sin_port;   /* Port number */
        struct in_addr sin_addr;   /* IPv4 address */
    };

    struct sockaddr_in6 {
        sa_family_t     sin6_family;   /* AF_INET6 */
        in_port_t       sin6_port;     /* Port number */
        uint32_t        sin6_flowinfo; /* IPv6 flow info */
        struct in6_addr sin6_addr;     /* IPv6 address */
        uint32_t        sin6_scope_id; /* Set of interfaces for a scope */
    };

    struct in_addr {
        in_addr_t s_addr;
    };

    struct in6_addr {
        uint8_t s6_addr[16];
    };

    typedef uint32_t in_addr_t;
    typedef uint16_t in_port_t;
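To make the cast more concrete, here is a minimal sketch that inspects the sa_family member, casts to the matching structure, and converts the address to text with inet_ntop(3). The helper name sockaddr_to_text and its parameters are assumptions made for this example. The port is stored in network byte order, hence the call to ntohs.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: writes the textual IP address of `addr` into `buf` and
 * stores the port (in host byte order) in `*port`. Returns 0 on success, or
 * -1 if the address family is not supported or the buffer is too small.
 */
int sockaddr_to_text(const struct sockaddr* addr, char* buf, size_t buflen,
                     unsigned* port) {
    const void* raw_addr;

    if (addr->sa_family == AF_INET) {
        const struct sockaddr_in* in4 = (const struct sockaddr_in*)addr;
        raw_addr = &in4->sin_addr;
        *port    = ntohs(in4->sin_port);
    } else if (addr->sa_family == AF_INET6) {
        const struct sockaddr_in6* in6 = (const struct sockaddr_in6*)addr;
        raw_addr = &in6->sin6_addr;
        *port    = ntohs(in6->sin6_port);
    } else {
        return -1;
    }

    /* inet_ntop returns NULL on error (for example, buffer too small). */
    if (inet_ntop(addr->sa_family, raw_addr, buf, (socklen_t)buflen) == NULL)
        return -1;
    return 0;
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for both address families, so it could be called on the ai_addr member of each entry returned by getaddrinfo.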
3.2. Example code for getaddrinfo
The following example shows a call to getaddrinfo, although more specific
examples will be shown below. Remember to check the value returned by
getaddrinfo, and to free the linked list of addrinfo structures with
freeaddrinfo after you are done using it.
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_INET;     /* IPv4 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    struct addrinfo* server_info;
    const int status = getaddrinfo(ip, port, &hints, &server_info);
    if (status != 0) {
        fprintf(stderr, "Error: %s\n", gai_strerror(status));
        abort();
    }

    /* ... */

    freeaddrinfo(server_info);
We can then use the members of the filled server_info to create the socket.
Remember to check the value returned by socket, and to close the socket
descriptor after you are done using it.
    const int sockfd = socket(server_info->ai_family,
                              server_info->ai_socktype,
                              server_info->ai_protocol);
    if (sockfd < 0) {
        fprintf(stderr, "Could not create socket: %s\n", strerror(errno));
        abort();
    }

    /* ... */

    close(sockfd);
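The snippet above only looks at the first element of the list. Since getaddrinfo fills a linked list, one possible variation is to walk it through the ai_next member and keep the first entry for which socket succeeds. This is only a sketch, and the helper name socket_from_list is made up for this example.

```c
#define _POSIX_C_SOURCE 200112L /* Ask for the POSIX declarations in <netdb.h> */

#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: tries to create a socket for each entry in `list`, in
 * order, and returns the first valid descriptor. If `used` is not NULL, it
 * receives the entry that worked. Returns -1 if no entry could be used.
 */
int socket_from_list(const struct addrinfo* list,
                     const struct addrinfo** used) {
    for (const struct addrinfo* p = list; p != NULL; p = p->ai_next) {
        const int fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd >= 0) {
            if (used != NULL)
                *used = p;
            return fd;
        }
    }
    return -1;
}
```

The caller is still responsible for closing the returned descriptor and for calling freeaddrinfo on the list.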
4. Communicating through TCP
To communicate data through TCP, we need to either listen and accept incoming connections (a passive open), or establish a connection to another computer on a listening port (an active open).
4.1. Connecting with a passive open
These are the general steps for establishing a connection through a passive open:
- Obtain a socket descriptor, used for listening.
- Bind a local port to the socket descriptor.
- Start to listen on that socket descriptor.
- Wait for connections, and accept them.
4.1.1. Getting our address information
We know how to obtain information about an external address (using
getaddrinfo), but we will also need to obtain information about ourselves
before creating the socket. We need to make two small changes when making the
call:
- Set hints.ai_flags to AI_PASSIVE.
- Pass NULL as the first (node) parameter of getaddrinfo.
From the getaddrinfo(3) man page:
“If the AI_PASSIVE flag is specified in hints.ai_flags, and node is NULL, then
the returned socket addresses will be suitable for bind(2)ing a socket that
will accept(2) connections.”
It’s important to note that the second argument when calling getaddrinfo will
determine the port that we will use when listening, and therefore the port that
the peer will have to use when connecting to us (i.e. when doing an active
open). Note that all ports below 1024 are reserved [8] for the system, so you
should use a number in the range [1024..65535] (inclusive), and it should not
be in use by another program.
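Since the port is passed to getaddrinfo as a string, it can be handy to validate it first. The following sketch checks that a string is a plain decimal number outside the reserved range (that is, not below 1024). The helper name is_unprivileged_port is made up for this example, and it deliberately rejects service names such as “http”.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns 1 if `str` is a decimal port number that is
 * not below 1024 and not above 65535, or 0 otherwise.
 */
int is_unprivileged_port(const char* str) {
    char* end = NULL;
    long value;

    /* Reject NULL, empty strings, signs and leading whitespace. */
    if (str == NULL || !isdigit((unsigned char)*str))
        return 0;

    errno = 0;
    value = strtol(str, &end, 10);

    /* Reject overflow and trailing garbage such as "43x1". */
    if (errno != 0 || *end != '\0')
        return 0;

    return value >= 1024 && value <= 65535;
}
```

This does not tell us whether the port is already in use by another program; that is only known once we try to bind to it.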
This is the new code for obtaining our address information. In this case, the
addrinfo structure filled by getaddrinfo will refer to the port 4321 of our
machine.
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_PASSIVE; /* New */

    struct addrinfo* self_info;
    const int status = getaddrinfo(NULL, "4321", &hints, &self_info); /* Updated */
    if (status != 0) {
        fprintf(stderr, "Could not obtain our address info: %s\n",
                gai_strerror(status));
        abort();
    }
4.1.2. Creating the passive socket
The socket(2) function returns a socket descriptor from the specified domain
(e.g. IPv4 or IPv6), socket type (e.g. TCP or UDP) and protocol (e.g. IP).
    #include <sys/types.h>
    #include <sys/socket.h>

    int socket(int domain, int type, int protocol);
On error, -1 is returned and errno is set. If the returned socket is valid, it
must be closed by the caller using close(2).
Now that self_info contains information about the current machine, we can call
socket just like we did before.
    /* Use the self_info list obtained with AI_PASSIVE above. */
    const int sockfd_listen = socket(self_info->ai_family,
                                     self_info->ai_socktype,
                                     self_info->ai_protocol);
    if (sockfd_listen < 0) {
        fprintf(stderr, "Could not create socket: %s\n", strerror(errno));
        abort();
    }
That sockfd_listen variable will be used for the process of accepting
connections, not for transmitting data after the connection is established.
This is normally referred to as a passive socket.
4.1.3. Binding the socket address
Next, we need to bind the socket address (IP address, port and protocol) to the
socket descriptor we just created. This can be done with the bind(2) function.
    #include <sys/types.h>
    #include <sys/socket.h>

    int bind(int sockfd, const struct sockaddr* addr, socklen_t addrlen);
The bind function returns zero on success, or -1 on error, setting errno
appropriately. We could create our own sockaddr structure, but getaddrinfo
already filled one for us, so we should use that.
    const int status = bind(sockfd_listen, self_info->ai_addr,
                            self_info->ai_addrlen);
    if (status != 0) {
        fprintf(stderr, "Could not bind to socket descriptor: %s\n",
                strerror(errno));
        abort();
    }
4.1.4. Listening for connections
After binding the socket address, we can start listening for connections. We do
this with the listen(2) function.
    #include <sys/types.h>
    #include <sys/socket.h>

    int listen(int sockfd, int backlog);
The first parameter is the passive socket we created earlier, and the second
parameter is the maximum length to which the queue of pending connections for
sockfd may grow [9]. The listen function returns zero on success, or -1 on
error, setting errno appropriately.
    const int status = listen(sockfd_listen, 10);
    if (status != 0) {
        fprintf(stderr, "Could not listen for connections: %s\n",
                strerror(errno));
        abort();
    }
Now the system is listening for connections on the port we specified when
calling getaddrinfo (in this case 4321), and it will queue incoming connections
until we accept them.
4.1.5. Accepting connections
Once we encounter an incoming connection, we can accept it using the accept(2)
function.
    #include <sys/types.h>
    #include <sys/socket.h>

    int accept(int sockfd, struct sockaddr* addr, socklen_t* addrlen);
The first parameter of accept is the passive socket we created with socket(2)
above. The other two parameters are used to retrieve information about the
computer that is connecting to us, but they can be set to NULL if we don’t care
about this information.
The accept function returns a new socket descriptor used for sending and
receiving data in the accepted connection. On error, it returns -1 and sets
errno.
    const int sockfd_connection = accept(sockfd_listen, NULL, NULL);
    if (sockfd_connection < 0) {
        fprintf(stderr, "Could not accept incoming connection: %s\n",
                strerror(errno));
        abort();
    }
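If we do care about who is connecting, the last two parameters of accept can point to a sockaddr_storage structure (large enough for both IPv4 and IPv6) and its length. The sketch below is one possible way of doing this. The helper names describe_peer and accept_and_describe are made up for this example, and the numeric conversion is done with getnameinfo(3).

```c
#define _POSIX_C_SOURCE 200112L /* Ask for the POSIX declarations in <netdb.h> */

#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: formats `addr` as "host:port" using numeric values
 * only (no name lookups). Returns 0 on success, -1 on failure.
 */
int describe_peer(const struct sockaddr* addr, socklen_t addrlen, char* out,
                  size_t outlen) {
    char host[128]; /* Assumed to be enough for a numeric address */
    char serv[6];   /* Up to "65535" plus the terminator */

    const int status = getnameinfo(addr, addrlen, host, sizeof(host), serv,
                                   sizeof(serv),
                                   NI_NUMERICHOST | NI_NUMERICSERV);
    if (status != 0)
        return -1;

    const int written = snprintf(out, outlen, "%s:%s", host, serv);
    return (written >= 0 && (size_t)written < outlen) ? 0 : -1;
}

/*
 * Hypothetical wrapper: accepts a connection on `sockfd_listen` and writes a
 * description of the peer into `out` (empty string if it can't be
 * described). Returns the new socket descriptor, or -1 on error.
 */
int accept_and_describe(int sockfd_listen, char* out, size_t outlen) {
    struct sockaddr_storage peer;
    socklen_t peer_len = sizeof(peer);

    const int fd = accept(sockfd_listen, (struct sockaddr*)&peer, &peer_len);
    if (fd < 0)
        return -1;

    if (describe_peer((const struct sockaddr*)&peer, peer_len, out,
                      outlen) != 0 &&
        outlen > 0)
        out[0] = '\0';

    return fd;
}
```

Just like with the plain call, the returned descriptor has to be closed by the caller once the connection is no longer needed.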
After the connection is accepted, we can send and receive data from the peer using the returned socket descriptor.
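Putting the previous steps together, a passive open could be wrapped in a single function. This is only a sketch under some assumptions: the name open_listening_socket is made up for this example, it reports errors by returning -1 instead of calling abort, and it tries every entry of the list rather than just the first one.

```c
#define _POSIX_C_SOURCE 200112L /* Ask for the POSIX declarations in <netdb.h> */

#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns a passive (listening) IPv4 TCP socket bound to
 * `port` on the local machine, or -1 on error.
 */
int open_listening_socket(const char* port, int backlog) {
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_PASSIVE;

    /* 1. Get our own address information. */
    struct addrinfo* self_info = NULL;
    if (getaddrinfo(NULL, port, &hints, &self_info) != 0)
        return -1;

    int sockfd_listen = -1;
    for (const struct addrinfo* p = self_info; p != NULL; p = p->ai_next) {
        /* 2. Create the passive socket. */
        sockfd_listen = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (sockfd_listen < 0)
            continue;

        /* 3. Bind the address and start listening. */
        if (bind(sockfd_listen, p->ai_addr, p->ai_addrlen) == 0 &&
            listen(sockfd_listen, backlog) == 0)
            break;

        /* This entry did not work, release it and try the next one. */
        close(sockfd_listen);
        sockfd_listen = -1;
    }

    freeaddrinfo(self_info);
    return sockfd_listen;
}
```

With such a helper, the listening side would reduce to something like open_listening_socket("4321", 10), followed by one call to accept per incoming connection.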
4.2. Connecting with an active open
TODO
4.3. Sending and receiving data through sockets
TODO
Footnotes:
Note that these are not the only existing transport protocols. Some other examples include the Datagram Congestion Control Protocol (DCCP) and the Stream Control Transmission Protocol (SCTP).
In an IPv6 address, leading zeros within a colon-separated group can be
omitted, and one run of consecutive all-zero groups can be replaced by a double
colon. Therefore, the “expanded” version of that IPv6 address is
2001:0db8:0000:0000:0000:8a2e:0370:7334.
[5] The PF prefix stands for Protocol Family, whereas AF stands for Address Family. In practice, AF_INET and PF_INET have the same value.
[6] The IPv4 and IPv6 formats are valid according to inet_aton(3) and inet_pton(3), respectively.
[7] More specifically, the sockaddr structure from <sys/socket.h> contains only a sa_family member (of type sa_family_t) and a char sa_data[] array. Based on the sa_family member, we can decide which sockaddr_in* structure we should use, since they provide a nicer interface.
[8] See also Registered port (Wikipedia) and List of TCP and UDP port numbers (Wikipedia).
[9] A value of 5 or 10 for the backlog argument is fine. If the argument is greater than the value in /proc/sys/net/core/somaxconn, the system silently caps it to that value. Since Linux 5.4, the default in this file is 4096; in earlier kernels, the default value is 128.