Web Server Application (epserver)
=================================

Introduction
------------

The ``epserver`` program is a sample HTTP web server which handles HTTP
requests and serves web pages through HTTP responses. ``epserver`` uses the
``epoll`` (event poll) interface to receive network events from mTCP sockets.

Code Walkthrough
----------------

The following sections provide an explanation of the main components of the
epserver code. All mOS library functions used in the sample code are prefixed
with ``mtcp_`` and are explained in detail in the `Programmer's Guide - mOS
Programming API`_.

(1) The main() Function
~~~~~~~~~~~~~~~~~~~~~~~~

The ``main()`` function performs the initialization and spawns an execution
thread for each CPU core.

The first task is to initialize the mOS library based on the mOS configuration
file. ``fname`` holds the path to the ``mos.conf`` file which is supplied to
``mtcp_init()``.

.. code-block:: c

    /* parse mos configuration file */
    ret = mtcp_init(fname);

The next step is to initialize the server settings, which is done by calling
the ``GlobInitServer()`` function. The ``GlobInitServer()`` function reads the
application-specific configuration file (``epserver.conf``) and loads all
available web pages. We describe the details of this function in the next
subsection.

The last step is to create and run the per-core mTCP threads. For each CPU
core, it creates a new mTCP thread which gets spawned from a function named
``RunMTCP()``. We describe this function in further detail in the later
subsections.

.. code-block:: c

    for (i = 0; i < core_limit; i++) {
        cores[i] = i;

        /* Run mtcp thread */
        if ((g_mcfg.cpu_mask & (1L << i)) &&
            pthread_create(&mtcp_thread[i], NULL,
                           RunMTCP, (void *)&cores[i])) {
            perror("pthread_create");
            TRACE_ERROR("Failed to create msg_test thread.\n");
            exit(-1);
        }
    }

(2) The Global Parameter Initialization Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``GlobInitServer()`` function loads the ``epserver`` application
configuration from the ``epserver.conf`` file. The following block shows an
example configuration for ``epserver``. The ``www_main`` parameter indicates
the path to the directory that holds the web pages. By setting the
``core_limit`` parameter, the application can override the number of CPU
cores to be used.

.. code-block:: c

    www_main = www
    core_limit = 8

The ``GlobInitServer()`` function parses the configuration file, stores the
parameters, and opens the files in the directory it has to serve.

.. code-block:: c

    if (!dir) {
        TRACE_ERROR("Failed to open %s.\n", www_main);
        perror("opendir");
        exit(-1);
    }

After opening the files, it loads the file contents into memory in order to
accelerate web page transfers (by avoiding expensive file read operations
during data transmission).

.. code-block:: c

    fcache[nfiles].file = (char *)malloc(fcache[nfiles].size);
    if (!fcache[nfiles].file) {
        TRACE_ERROR("Failed to allocate memory for file %s\n",
                    fcache[nfiles].name);
        perror("malloc");
        continue;
    }

    total_read = 0;
    while (1) {
        ret = read(fd, fcache[nfiles].file + total_read,
                   fcache[nfiles].size - total_read);
        if (ret <= 0) {
            break;
        }
        total_read += ret;
    }
    if (total_read < fcache[nfiles].size) {
        free(fcache[nfiles].file);
        continue;
    }
    close(fd);
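The loaded web pages are kept in a global ``fcache`` array, which the
transmission code later indexes via ``sv->fidx``. The exact definition of a
cache entry is not shown in this walkthrough; the sketch below is based only
on the fields used above, and the struct name and field size are illustrative
assumptions.

.. code-block:: c

    /* Hypothetical layout of one file-cache entry; the actual definition in
     * epserver.c may differ. Only the fields used in this walkthrough are
     * included. */
    #define MAX_FNAME_LEN 128       /* illustrative limit, not from the source */

    struct file_cache {
        char  name[MAX_FNAME_LEN];  /* file name under www_main */
        long  size;                 /* file size in bytes */
        char *file;                 /* file contents preloaded into memory */
    };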
(3) The RunMTCP() Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``RunMTCP()`` function is executed in a per-thread manner. First, the
``RunMTCP()`` function affinitizes each thread to a CPU core and creates an
mTCP context. Next, it calls the ``RunApplication()`` function, which opens
the sockets, receives HTTP requests, and sends back HTTP responses.

.. code-block:: c

    /* affinitize the mTCP thread to a core */
    mtcp_core_affinitize(core);

    /* mTCP initialization */
    mctx = mtcp_create_context(core);
    if (!mctx) {
        TRACE_ERROR("Failed to create mtcp context.\n");
        pthread_exit(NULL);
        return NULL;
    }

    RunApplication(mctx);

The ``RunApplication()`` function consists of the ``InitServer()`` and
``RunServer()`` functions. ``InitServer()`` creates a thread context which
holds thread-specific metadata, including epoll-related variables and per-flow
statistics grouped by status (e.g., started, pending, done, errors, and
incompletes). Inside the ``InitServer()`` function, it creates an epoll
descriptor to receive read and write availability events. Afterwards, it
creates a listening socket to accept connections from new clients.

.. code-block:: c

    /* create epoll descriptor */
    ctx->ep = mtcp_epoll_create(mctx, MAX_EVENTS);
    if (ctx->ep < 0) {
        TRACE_ERROR("Failed to create epoll descriptor!\n");
        exit(-1);
    }
    ...
    ctx->listener = CreateListeningSocket(ctx);
    if (ctx->listener < 0) {
        TRACE_ERROR("Failed to create listening socket.\n");
        exit(-1);
    }

``RunServer()`` is the core of this program. Using the ``epoll`` event API, it
accepts incoming connections and receives/sends web content.

.. code-block:: c

    while (1) {
        nevents = mtcp_epoll_wait(mctx, ep, events, MAX_EVENTS, -1);
        if (nevents < 0) {
            if (errno != EINTR)
                perror("mtcp_epoll_wait");
            break;
        }

        do_accept = FALSE;
        for (i = 0; i < nevents; i++) {

            if (events[i].data.sock == ctx->listener) {
                /* if the event is for the listener, accept connection */
                do_accept = TRUE;

            } else if (events[i].events & MOS_EPOLLERR) {
                int err;
                socklen_t len = sizeof(err);

                /* error on the connection */
                TRACE_APP("[CPU %d] Error on socket %d\n",
                          core, events[i].data.sock);
                if (mtcp_getsockopt(mctx, events[i].data.sock, SOL_SOCKET,
                                    SO_ERROR, (void *)&err, &len) == 0) {
                    if (err != ETIMEDOUT) {
                        fprintf(stderr, "Error on socket %d: %s\n",
                                events[i].data.sock, strerror(err));
                    }
                } else {
                    fprintf(stderr, "mtcp_getsockopt: %s (for sockid: %d)\n",
                            strerror(errno), events[i].data.sock);
                    exit(-1);
                }
                CloseConnection(ctx, events[i].data.sock,
                                &ctx->svars[events[i].data.sock]);

            } else if (events[i].events & MOS_EPOLLIN) {
                ret = HandleReadEvent(ctx, events[i].data.sock,
                                      &ctx->svars[events[i].data.sock]);

                if (ret == 0) {
                    /* connection closed by remote host */
                    CloseConnection(ctx, events[i].data.sock,
                                    &ctx->svars[events[i].data.sock]);
                } else if (ret < 0) {
                    /* if not EAGAIN, it's an error */
                    if (errno != EAGAIN) {
                        CloseConnection(ctx, events[i].data.sock,
                                        &ctx->svars[events[i].data.sock]);
                    }
                }

            } else if (events[i].events & MOS_EPOLLOUT) {
                struct server_vars *sv = &ctx->svars[events[i].data.sock];
                if (sv->rspheader_sent) {
                    SendUntilAvailable(ctx, events[i].data.sock, sv);
                } else {
                    TRACE_APP("Socket %d: Response header not sent yet.\n",
                              events[i].data.sock);
                }

            } else {
                assert(0);
            }
        }

        /* if do_accept flag is set, accept connections */
        if (do_accept) {
            while (1) {
                ret = AcceptConnection(ctx, ctx->listener);
                if (ret < 0)
                    break;
            }
        }
    }
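The listening socket used by the accept path above is created by
``CreateListeningSocket()``, whose body is not listed in this walkthrough.
The sketch below shows a minimal version of such a helper; the
``struct thread_context`` type name, the port number (80), and the backlog
value are illustrative assumptions, while the ``mtcp_*`` calls are the
standard mOS/mTCP socket APIs.

.. code-block:: c

    /* Minimal sketch of a listening-socket helper. The struct name, port,
     * and backlog below are assumptions for illustration. */
    static int
    CreateListeningSocket(struct thread_context *ctx)
    {
        int listener;
        struct mtcp_epoll_event ev;
        struct sockaddr_in saddr;

        /* create an mTCP socket and make it non-blocking */
        listener = mtcp_socket(ctx->mctx, AF_INET, SOCK_STREAM, 0);
        if (listener < 0)
            return -1;
        if (mtcp_setsock_nonblock(ctx->mctx, listener) < 0)
            return -1;

        /* bind to the HTTP port on any local address */
        memset(&saddr, 0, sizeof(saddr));
        saddr.sin_family = AF_INET;
        saddr.sin_addr.s_addr = INADDR_ANY;
        saddr.sin_port = htons(80);
        if (mtcp_bind(ctx->mctx, listener,
                      (struct sockaddr *)&saddr, sizeof(saddr)) < 0)
            return -1;

        /* start listening with an illustrative backlog */
        if (mtcp_listen(ctx->mctx, listener, 4096) < 0)
            return -1;

        /* register read events for the listener with epoll */
        ev.events = MOS_EPOLLIN;
        ev.data.sock = listener;
        mtcp_epoll_ctl(ctx->mctx, ctx->ep, MOS_EPOLL_CTL_ADD, listener, &ev);

        return listener;
    }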
Here are some detailed explanations of each sub-function used in the code
above:

* The ``AcceptConnection()`` function accepts connections from the listening
  queue (through the listening socket).

  .. code-block:: c

      c = mtcp_accept(mctx, listener, NULL, NULL);
      if (c >= 0) {
          TRACE_APP("New connection %d accepted.\n", c);
          ev.events = MOS_EPOLLIN;
          ev.data.sock = c;
          mtcp_setsock_nonblock(ctx->mctx, c);
          mtcp_epoll_ctl(mctx, ctx->ep, MOS_EPOLL_CTL_ADD, c, &ev);
          TRACE_APP("Socket %d registered.\n", c);
      } else {
          if (errno != EAGAIN) {
              TRACE_ERROR("mtcp_accept() error %s\n", strerror(errno));
          }
      }

* The ``HandleReadEvent()`` function reads the HTTP request from the socket
  and then responds to the request.

  .. code-block:: c

      /* HTTP request handling */
      rd = mtcp_read(ctx->mctx, sockid, buf, HTTP_HEADER_LEN);
      if (rd <= 0) {
          return rd;
      }
      memcpy(sv->request + sv->recv_len,
             (char *)buf, MIN(rd, HTTP_HEADER_LEN - sv->recv_len));
      sv->recv_len += rd;
      sv->request_len = find_http_header(sv->request, sv->recv_len);
      if (sv->request_len <= 0) {
          TRACE_ERROR("Socket %d: Failed to parse HTTP request header.\n"
                      "read bytes: %d, recv_len: %d, "
                      "request_len: %d, strlen: %ld, request: \n%s\n",
                      sockid, rd, sv->recv_len, sv->request_len,
                      strlen(sv->request), sv->request);
          return rd;
      }

      http_get_url(sv->request, sv->request_len, url, URL_LEN);
      TRACE_APP("Socket %d URL: %s\n", sockid, url);
      sprintf(sv->fname, "%s%s", www_main, url);
      TRACE_APP("Socket %d File name: %s\n", sockid, sv->fname);

      ...

      /* once the HTTP request has fully arrived, create an HTTP response
         header and transfer it */
      sprintf(response, "HTTP/1.1 %d %s\r\n"
              "Date: %s\r\n"
              "Server: Webserver on Middlebox TCP (Ubuntu)\r\n"
              "Content-Length: %ld\r\n"
              "Connection: %s\r\n\r\n",
              scode, StatusCodeToString(scode), t_str,
              sv->fsize, keepalive_str);
      len = strlen(response);
      TRACE_APP("Socket %d HTTP Response: \n%s", sockid, response);
      sent = mtcp_write(ctx->mctx, sockid, response, len);
      if (sent < len) {
          TRACE_ERROR("Socket %d: Sending HTTP response failed. "
                      "try: %d, sent: %d\n", sockid, len, sent);
          CloseConnection(ctx, sockid, sv);
      }
      TRACE_APP("Socket %d Sent response header: try: %d, sent: %d\n",
                sockid, len, sent);

* The ``SendUntilAvailable()`` function sends the HTTP response body to the
  client until the send buffer becomes unavailable or the end of the file is
  reached. As described earlier, there is no disk I/O during this step, since
  all the files are already loaded into memory. (Minimal sketches of the
  ``CloseConnection()`` and ``CleanServerVariable()`` helpers used here follow
  this list.)

  .. code-block:: c

      sent = 0;
      ret = 1;
      while (ret > 0) {
          len = MIN(SNDBUF_SIZE, sv->fsize - sv->total_sent);
          if (len <= 0) {
              break;
          }
          ret = mtcp_write(ctx->mctx, sockid,
                           fcache[sv->fidx].file + sv->total_sent, len);
          if (ret < 0) {
              if (errno != EAGAIN) {
                  TRACE_ERROR("Socket %d: Sending HTTP response body failed. "
                              "try: %d, sent: %d\n", sockid, len, ret);
              }
              break;
          }
          TRACE_APP("Socket %d: mtcp_write try: %d, ret: %d\n",
                    sockid, len, ret);
          sent += ret;
          sv->total_sent += ret;
      }

      /* if all the data has been sent, (1) wait for the next request
         (keep-alive) or (2) close the socket */
      if (sv->total_sent >= fcache[sv->fidx].size) {
          struct mtcp_epoll_event ev;
          sv->done = TRUE;
          finished++;

          if (sv->keep_alive) {
              /* if keep-alive connection, wait for the incoming request */
              ev.events = MOS_EPOLLIN;
              ev.data.sock = sockid;
              mtcp_epoll_ctl(ctx->mctx, ctx->ep, MOS_EPOLL_CTL_MOD,
                             sockid, &ev);
              CleanServerVariable(sv);
          } else {
              /* else, close connection */
              CloseConnection(ctx, sockid, sv);
          }
      }
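The ``CloseConnection()`` and ``CleanServerVariable()`` helpers are not listed
in this walkthrough. The sketches below show what they minimally need to do,
based on the per-connection state fields that appear in the code above; the
``MOS_EPOLL_CTL_DEL`` flag and the ``struct thread_context`` type name are
assumptions for illustration.

.. code-block:: c

    /* Sketch: deregister the socket from epoll and close it.
     * MOS_EPOLL_CTL_DEL is assumed to exist alongside ADD/MOD. */
    static void
    CloseConnection(struct thread_context *ctx, int sockid,
                    struct server_vars *sv)
    {
        mtcp_epoll_ctl(ctx->mctx, ctx->ep, MOS_EPOLL_CTL_DEL, sockid, NULL);
        mtcp_close(ctx->mctx, sockid);
    }

    /* Sketch: reset per-connection state so a keep-alive socket can serve
     * the next request. */
    static void
    CleanServerVariable(struct server_vars *sv)
    {
        sv->recv_len = 0;
        sv->request_len = 0;
        sv->total_sent = 0;
        sv->done = 0;
        sv->rspheader_sent = 0;
        sv->keep_alive = 0;
    }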
(4) Multi-process Version (DPDK-only)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can also run ``epserver`` in multi-process (single-threaded) mode. This
mode only works with the Intel DPDK driver. You can find ``epserver-mp`` in
the same directory as ``epserver``. The overall design of ``epserver-mp`` is
similar to that of ``epserver`` (only the ``pthreads`` are absent). One can
run ``epserver-mp`` on a 4-core machine using the following script:

.. code-block:: bash

    #!/bin/bash
    ./epserver-mp -f config/mos-master.conf -c 0 &
    sleep 5
    for i in {1..3}
    do
        ./epserver-mp -f config/mos-slave.conf -c $i &
    done

The ``-c`` switch is used to bind the process to a specific CPU core. Under
DPDK settings, the master process (core 0 in the example above) is responsible
for initializing the underlying DPDK-specific NIC resources once. The slave
processes (cores 1-3) share those initialized resources with the master
process.

The master process relies on the ``mos-master.conf`` file for configuration.
It has only one new keyword: ``multiprocess = 0 master``, where 0 stands for
the CPU core id. The ``mos-slave.conf`` configuration file has an additional
line, ``multiprocess = slave``, which (as the line suggests) sets the process
up as a DPDK secondary (slave) instance.

We employ a mandatory wait between the execution of the master and the slave
processes. This is needed to avoid potential race conditions on the shared
resources that both update.

.. _`Programmer's Guide - mOS Programming API`: ../programmer/04_mos_api.html
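For reference, the ``multiprocess`` lines described above would appear in the
two configuration files roughly as follows; only the keyword syntax is taken
from this guide, and the comments and surrounding layout are illustrative.

.. code-block:: text

    # mos-master.conf (excerpt): run as the DPDK primary process on core 0
    multiprocess = 0 master

    # mos-slave.conf (excerpt): run as a DPDK secondary (slave) process
    multiprocess = slave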