COMP 3621: Computer Networks Project: libmicrohttpd (Checkpoint 1)

Due: Feb 14th (not graded)

For this checkpoint, you will need to implement MHD_run and MHD_get_fdset. To understand what these do, look at the enum in microhttpd.h that lists the options for your web daemon when it is created. Three important options determine how your web daemon works:

MHD_USE_THREAD_PER_CONNECTION: In this mode, every new connection is associated with a separate thread. Thus, when your main thread checks for pending connections and calls accept, a socket will be returned. At this point you should pass your socket and/or MHD_Session struct to a new thread that will read and write to the socket.
MHD_USE_SELECT_INTERNALLY: In this mode, a separate thread will be created to run select. This thread will loop through all connections using select for multiplexing, i.e., a single thread will read from and write to all sockets and/or MHD_Sessions.
Neither of the above: If neither option is specified, you assume the web daemon will be using an external select. This is, in essence, the default option. In this mode the user of the library may call select on your set of sockets, which is why we have the function MHD_get_fdset. Regardless of what they are doing with the sockets, they will call MHD_run periodically. When they do, you check the status of your sockets with select and make a single pass through all sockets that are ready for reading or writing, processing the data for each of them.

You will notice from the above descriptions that MHD_run performs different actions depending on the type of server passed to it. If it's a threaded server, the user only calls MHD_run once to get the server to start accepting connections. If it's a server that uses an internal select, the user again only calls MHD_run once, which spawns a thread to busy loop on select.

If it's a daemon that uses external select, the user will call MHD_run repeatedly whenever they are ready for the daemon to process input and output on the already accepted sockets. In this MHD_run function, you will need to check for new connections, accept them, and add them to your list of sockets that belong to the FD_SETs. You will also have to check each socket for input and available output so that you can read from and write to them.
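
The three cases above amount to a small decision on the option bits. The following is only a sketch of that decision: the DEMO_ flag values and the demo_pick_mode helper are hypothetical names made up for this illustration, so use the real constants from microhttpd.h in your own code.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only; the real constants
   live in microhttpd.h and may have different values. */
enum demo_option
{
  DEMO_USE_THREAD_PER_CONNECTION = 1,
  DEMO_USE_SELECT_INTERNALLY = 2
};

enum demo_mode
{
  DEMO_MODE_THREADED,         /* one thread per accepted connection */
  DEMO_MODE_INTERNAL_SELECT,  /* one library thread loops on select */
  DEMO_MODE_EXTERNAL_SELECT   /* caller drives the daemon via MHD_run */
};

/* Decide which of the three behaviours the run function should take. */
static enum demo_mode demo_pick_mode(unsigned int options)
{
  if (options & DEMO_USE_THREAD_PER_CONNECTION)
    return DEMO_MODE_THREADED;
  if (options & DEMO_USE_SELECT_INTERNALLY)
    return DEMO_MODE_INTERNAL_SELECT;
  /* Neither bit set: fall back to external select. */
  return DEMO_MODE_EXTERNAL_SELECT;
}
```

Checking the threaded bit first is an arbitrary choice made for this sketch; the handout does not say what should happen if both options are set.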

select

int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timeval *timeout):
select is used for multiplexing over the set of sockets we have without blocking on a read or write. You hand it sets of sockets, and it marks which of them are ready for reading, ready for writing, or have an exceptional condition pending (we won't use the exception set, since it is for out-of-band data).

To use select, you first mark the sockets in the fd_set you are interested in as such:


int i, res;
fd_set read_set;
fd_set write_set;
struct timeval timeout;

/* zero out the timeout struct so select returns immediately instead
   of blocking (memset is declared in <string.h>) */
memset(&timeout, 0, sizeof(timeout));

FD_ZERO(&read_set);
FD_ZERO(&write_set);

/* listenfd is the socket descriptor that you listen for connections
   on, maxfd is the highest descriptor number you've seen from accept.
   This assumes every descriptor from listenfd through maxfd is an open
   socket; select fails with EBADF if a set holds a closed descriptor. */

for (i = listenfd; i <= maxfd; i++)
{
  FD_SET(i, &read_set);
  FD_SET(i, &write_set);
}

/* now call select; its first argument is the highest descriptor plus one */
res = select(maxfd + 1, &read_set, &write_set, NULL, &timeout);

/* to check to see if a socket is ready to be read from ... */
for (i = listenfd; i <= maxfd; i++)
{
  if (FD_ISSET(i, &read_set))
  {
     process_input(i);
  }
}

You'll need to check if the sockets are ready for writing also and process the output accordingly.
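
As a rough sketch of checking both sets in one pass, the helper below builds the read and write sets from an explicit array of descriptors rather than a contiguous range, then polls once with a zero timeout. The name demo_poll_once and its parameters are assumptions made for this example and are not part of the library.

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Poll the given descriptors once without blocking. On return,
   read_set and write_set hold only the descriptors that are ready.
   Returns what select returns: the number of ready entries, or -1. */
static int demo_poll_once(const int *fds, int nfds,
                          fd_set *read_set, fd_set *write_set)
{
  struct timeval timeout;
  int i;
  int maxfd = -1;

  memset(&timeout, 0, sizeof(timeout));  /* zero timeout: do not block */
  FD_ZERO(read_set);
  FD_ZERO(write_set);

  for (i = 0; i < nfds; i++)
  {
    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
    FD_SET(fds[i], read_set);
    FD_SET(fds[i], write_set);
    if (fds[i] > maxfd)
      maxfd = fds[i];
  }

  return select(maxfd + 1, read_set, write_set, NULL, &timeout);
}
```

After the call, test each descriptor with FD_ISSET against read_set and write_set separately: a socket can be writable without being readable, and vice versa.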

Accepting connections

If your listenfd socket is ready for reading, that means a connection is pending on it. You should call accept on the socket. However, we want to make sure that the socket is non-blocking if we are using either the internal or external select option with our daemon. We do so by setting a flag on the descriptor as follows:


int options, res;

options = fcntl(sockfd, F_GETFL, 0);
res = fcntl(sockfd, F_SETFL, options | O_NONBLOCK);

In this code, we use fcntl (declared in <fcntl.h>) to get the flags on a descriptor, and then fcntl again to set the flags to the bit-wise or of the returned flags and the non-blocking flag, O_NONBLOCK. We do this with all sockets we are using to prevent blocking (unless we are using a multi-threaded server, in which case it may not be necessary).
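
Both fcntl calls can fail, so it may help to wrap them in a small helper that checks the results. This is a minimal sketch; demo_set_nonblocking is a hypothetical name chosen for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode.
   Returns 0 on success, -1 if either fcntl call fails. */
static int demo_set_nonblocking(int fd)
{
  int flags;

  flags = fcntl(fd, F_GETFL, 0);
  if (flags == -1)
    return -1;

  /* Keep the existing flags and add O_NONBLOCK. */
  if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
    return -1;

  return 0;
}
```

You would call this on the listening socket and again on each socket returned by accept.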

MHD_Session

It's time for you to define your MHD_Session struct. This should at minimum hold the socket that the session is associated with, though you'll see that you need more than just a socket. Instead of writing a hash table or a linked list, we can keep a simple array of MHD_Session structs. However, you are probably wondering how many to keep. By default, select can only watch a limited number of descriptors, because the fd_set struct has a fixed size. This limit is defined by FD_SETSIZE. Some operating systems allow you to define FD_SETSIZE before including <sys/types.h> to increase the maximum size of the set. By default it's commonly 1024. For our purposes, leave it at its default size.
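
One possible way to lay this out, shown here only as a sketch, is a fixed array of FD_SETSIZE slots indexed directly by descriptor number. The struct demo_session and its fields are hypothetical stand-ins for your own MHD_Session, which will need more fields (see Buffers below).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical stand-in for MHD_Session; your struct will need more. */
struct demo_session
{
  int in_use;  /* nonzero while the slot holds a live connection */
  int fd;      /* socket for this session, -1 when unused */
};

/* One slot per descriptor that select can watch. */
static struct demo_session demo_sessions[FD_SETSIZE];

static void demo_sessions_init(void)
{
  int i;
  for (i = 0; i < FD_SETSIZE; i++)
  {
    demo_sessions[i].in_use = 0;
    demo_sessions[i].fd = -1;
  }
}

/* Claim the slot for fd. Returns NULL if fd cannot be used with select. */
static struct demo_session *demo_session_open(int fd)
{
  if (fd < 0 || fd >= FD_SETSIZE)
    return NULL;
  assert(!demo_sessions[fd].in_use);
  demo_sessions[fd].in_use = 1;
  demo_sessions[fd].fd = fd;
  return &demo_sessions[fd];
}

static void demo_session_close(int fd)
{
  assert(fd >= 0 && fd < FD_SETSIZE);
  demo_sessions[fd].in_use = 0;
  demo_sessions[fd].fd = -1;
}
```

Indexing by descriptor keeps lookup trivial at the cost of some unused slots; a compact array with a search would also work.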

Buffers

With TCP, just keeping track of a socket is not sufficient for managing multiple sockets since it is a streaming protocol and doesn't pay attention to message boundaries. Thus, with each socket we need to keep a buffer that we can read from and a buffer that we write to. If you overflow a write buffer, simply disconnect the socket for now.

Keeping a buffer in C is fairly simple. Just malloc an array of char (held in a char *) and keep track of its maximum size along with a counter that records how much data is currently in the buffer. In the next checkpoint, you will have to figure out how to read an entire HTTP message from the buffer.
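
A minimal sketch of such a buffer follows, assuming a fixed capacity chosen at allocation time. The struct demo_buffer and its helper functions are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-capacity byte buffer. */
struct demo_buffer
{
  char *data;  /* malloc'ed storage */
  size_t cap;  /* maximum number of bytes data can hold */
  size_t len;  /* number of bytes currently stored */
};

/* Allocate storage. Returns 0 on success, -1 if malloc fails. */
static int demo_buffer_init(struct demo_buffer *b, size_t cap)
{
  b->data = malloc(cap);
  b->cap = (b->data != NULL) ? cap : 0;
  b->len = 0;
  return (b->data != NULL) ? 0 : -1;
}

/* Bytes that can still be added before the buffer is full. */
static size_t demo_buffer_space(const struct demo_buffer *b)
{
  return b->cap - b->len;
}

/* Copy n bytes onto the end. Returns -1 on overflow, in which case
   nothing is copied and the caller can disconnect the socket. */
static int demo_buffer_append(struct demo_buffer *b, const void *src, size_t n)
{
  if (n > demo_buffer_space(b))
    return -1;
  memcpy(b->data + b->len, src, n);
  b->len += n;
  return 0;
}

/* Drop the first n bytes and slide the remainder to the front. */
static void demo_buffer_consume(struct demo_buffer *b, size_t n)
{
  assert(n <= b->len);
  memmove(b->data, b->data + n, b->len - n);
  b->len -= n;
}

static void demo_buffer_free(struct demo_buffer *b)
{
  free(b->data);
  b->data = NULL;
  b->cap = 0;
  b->len = 0;
}
```

Each session would hold two of these, one for reading and one for writing.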

Reading from sockets

ssize_t recv(int s, void *buf, size_t len, int flags):
To receive data from a socket, you call recv. You give it your socket, s in this case, a pointer to a buffer called buf, and the maximum number of bytes you would like to read, called len. Note that len must be less than or equal to the number of bytes of free space left in your buffer. If you've malloc'ed 256 bytes and the buffer is empty, you'd pass in 256; if it already holds 100 bytes, you'd pass a pointer 100 bytes into the buffer and a len of at most 156. The final argument is a flag that lets you specify how you wish to read from the socket, which can be left as 0 for our purposes.

Note that recv returns the number of bytes that were actually read. It may have read all the bytes you were expecting or just a few of them. This is why we have to buffer the data: the next call may read more, and along the way we have to examine our buffer for the end of the application layer message.
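
To illustrate accumulating partial reads, here is a sketch that appends whatever recv delivers to the end of the data already buffered. The helper demo_recv_append and its parameters are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Read from socket s into buf, starting after the *len bytes already
   stored. cap is the total size of buf. The caller must make sure
   there is free space, since a zero-length recv would look like a
   closed connection.
   Returns what recv returns:
     > 0  bytes appended (*len is advanced by that amount)
       0  the peer closed the connection
      -1  error; on a non-blocking socket, errno of EAGAIN or
          EWOULDBLOCK just means no data is available right now */
static ssize_t demo_recv_append(int s, char *buf, size_t cap, size_t *len)
{
  ssize_t n;

  assert(*len < cap);

  n = recv(s, buf + *len, cap - *len, 0);
  if (n > 0)
    *len += (size_t) n;
  return n;
}
```

Calling this each time select marks the socket readable builds up the message piece by piece in buf.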

Writing to sockets

ssize_t send(int s, const void *msg, size_t len, int flags):
To send data on the socket, you call send. You pass in a socket, s, your buffer, msg, the number of bytes you want to write from the start of your buffer, len, and a flag, which you will leave at 0 for our purposes.

send will return the number of bytes from your buffer that were copied into the OS buffer (and will be sent by the OS with TCP). You most definitely need to check this value or you will lose data! If send did not take everything, remove the bytes it did take from your buffer and try to send the rest on the next loop through the sockets.
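
The bookkeeping after a partial send might look like the following sketch, which removes the bytes the OS accepted and keeps the remainder at the front of the buffer for the next pass. The name demo_send_pending and its parameters are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to send the *len bytes queued in buf on socket s.
   Any bytes the OS accepted are removed from the front of buf and
   *len is reduced to the number of bytes still waiting.
   Returns the number of bytes accepted, 0 if nothing was queued,
   or -1 on error (buf and *len are left untouched in that case). */
static ssize_t demo_send_pending(int s, char *buf, size_t *len)
{
  ssize_t n;

  assert(buf != NULL && len != NULL);

  if (*len == 0)
    return 0;

  n = send(s, buf, *len, 0);
  if (n > 0)
  {
    /* Slide the unsent tail down to the start of the buffer. */
    memmove(buf, buf + n, *len - (size_t) n);
    *len -= (size_t) n;
  }
  return n;
}
```

Sliding the data with memmove is the simplest approach; keeping a separate start offset would avoid the copy at the cost of a little more bookkeeping.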