Web Client Application (epwget)
Introduction
The epwget program is a sample event-driven HTTP web client that sends
HTTP requests and receives web pages in the HTTP responses. epwget uses the
epoll (event poll) interface to detect when an mTCP socket is ready for
read or write operations.
Code Walkthrough
The following sections provide an explanation of the main components of the
epwget code.
All mOS library functions used in the sample code are prefixed with mtcp_
and are explained in detail in the Programmer’s Guide - mOS Programming API.
Note that we omit the error handling logic from the example code snippets for brevity.
(1) The main() Function
The main() function performs the initialization and launches an execution
thread for each CPU core.
The first task is to initialize the mOS library from the mOS configuration file.
fname holds the path to the mos.conf file, which is passed to the mtcp_init()
function. The mtcp_getconf() function then retrieves the current configuration
settings from the mOS core.
/* parse mos configuration file */
ret = mtcp_init(fname);
mtcp_getconf(&g_mcfg);
core_limit = g_mcfg.num_cores;
The next step initializes the global parameters using the GlbInitWget()
function, which is described in the next section.
The last step is to create and run the per-core mTCP threads. For each CPU core,
main() creates a new thread that starts in the RunMTCP() function.
for (i = 0; i < core_limit; i++)
    pthread_create(&mtcp_thread[i], NULL, RunMTCP, (void *)&cores[i]);
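The per-core thread launch can be sketched with plain pthreads as follows. This is a minimal illustration, not the epwget source: worker() is a hypothetical stand-in for RunMTCP(), and MAX_CORES, ran_on, and spawn_per_core() are names we introduce only for this example. It shows the two details the loop above depends on: each thread gets a pointer to its own slot in a cores array that outlives the thread, and the launcher joins every thread it started.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_CORES 16            /* hypothetical upper bound for this sketch */

/* Records which core ids had a worker run (illustration only). */
static int ran_on[MAX_CORES];

/* Hypothetical stand-in for RunMTCP(): reads its core id and marks it. */
static void *worker(void *arg)
{
    int core = *(int *)arg;     /* each thread receives its own core id */
    ran_on[core] = 1;
    return NULL;
}

/* Starts one thread per core and waits for all of them to finish.
 * Returns the number of threads that were started and joined. */
int spawn_per_core(int core_limit)
{
    static int cores[MAX_CORES];    /* stable storage that outlives the threads */
    pthread_t threads[MAX_CORES];
    int started = 0;

    assert(core_limit >= 0 && core_limit <= MAX_CORES);

    for (int i = 0; i < core_limit; i++) {
        cores[i] = i;
        if (pthread_create(&threads[i], NULL, worker, &cores[i]) != 0)
            break;                  /* stop launching on the first failure */
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started;
}
```

Passing &cores[i] rather than &i matters here: the loop variable changes while earlier threads may still be reading their argument.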
(2) The Global Parameter Initialization Function
The GlbInitWget() function loads the epwget application-specific
configuration from the epwget.conf file. The following block shows an example
configuration for epwget.conf.
The url parameter sets the URL of the file to be downloaded.
dest_port specifies the port number of the web server to connect to.
total_flows indicates the total number of flows (in other words, the
total number of downloads), and total_concurrency is the number of
flows allowed to run at the same time. By setting the core_limit
parameter, the application can override the number of CPU cores to be used.
url = 10.0.0.3/64K
dest_port = 80
total_flows = 100000
total_concurrency = 4000
core_limit = 8
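To make the file format concrete, here is a minimal sketch of how one "key = value" line of such a file could be parsed. It is not the parser used by GlbInitWget(); struct wget_conf and parse_conf_line() are hypothetical names, and we assume that values contain no whitespace, as in the example above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the five parameters shown above. */
struct wget_conf {
    char url[256];
    int dest_port;
    int total_flows;
    int total_concurrency;
    int core_limit;
};

/* Applies one "key = value" line to conf.
 * Returns 1 if the key was recognized and stored, 0 otherwise
 * (blank lines, malformed lines, and unknown keys all return 0). */
int parse_conf_line(const char *line, struct wget_conf *conf)
{
    char key[64], val[256];

    assert(line != NULL && conf != NULL);

    /* key: up to '=' or whitespace; value: up to the next whitespace */
    if (sscanf(line, " %63[^= \t\n] = %255[^ \t\n]", key, val) != 2)
        return 0;

    if (strcmp(key, "url") == 0)
        snprintf(conf->url, sizeof(conf->url), "%s", val);
    else if (strcmp(key, "dest_port") == 0)
        conf->dest_port = atoi(val);
    else if (strcmp(key, "total_flows") == 0)
        conf->total_flows = atoi(val);
    else if (strcmp(key, "total_concurrency") == 0)
        conf->total_concurrency = atoi(val);
    else if (strcmp(key, "core_limit") == 0)
        conf->core_limit = atoi(val);
    else
        return 0;

    return 1;
}
```

A caller would typically read the file line by line (for example with fgets()) and pass each line to this function.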
The GlbInitWget() function reads the configuration file and saves the
parameters in global variables. Note that our epwget implementation
assumes that the maximum number of file descriptors an mTCP thread can
create is three times the user-defined number of concurrent flows.
epwget overrides the max_concurrency and max_num_buffers parameters of the
mOS configuration using the mtcp_getconf() and mtcp_setconf() functions:
/* set the max number of fds 3x larger than concurrency */
max_fds = concurrency * 3;
mtcp_getconf(&mcfg);
mcfg.max_concurrency = max_fds;
mcfg.max_num_buffers = max_fds;
mtcp_setconf(&mcfg);
(3) The RunMTCP() Function
The RunMTCP() function runs once in each thread.
First, RunMTCP() pins the calling thread to its CPU core
and creates an mTCP context. Next, it calls the RunApplication()
function, which uses sockets to create connections, send HTTP requests,
and receive HTTP responses.
/* affinitize the mTCP thread to a core */
mtcp_core_affinitize(core);
/* mTCP initialization */
mctx = mtcp_create_context(core);
RunApplication(mctx);
The RunApplication() function consists of the InitWget() function and the
RunWget() function. InitWget() creates a thread context that holds
thread-specific metadata, including epoll-related variables and per-status
flow statistics (e.g., started, pending, done, errors, and incompletes).
One important role of InitWget() is to initialize the RSS
(receive-side scaling) setup, which involves deriving the source port number from
the remaining three parameters of the TCP connection 4-tuple (source network
address, destination network address, and destination port number).
mtcp_init_rss(mctx, saddr, IP_RANGE, daddr, dport);
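The idea of deriving a source port from the other three parameters can be sketched as a search: try candidate ports until the hash of the 4-tuple maps to the core that owns the connection. The code below is only an illustration under loud assumptions. toy_rss_hash() is a stand-in mixing function, not the hash a real NIC computes (commonly a Toeplitz hash), and pick_source_port() and the "hash modulo number of cores" mapping are hypothetical; the internals of mtcp_init_rss() are not shown in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the NIC's RSS hash. This is NOT the hash used by
 * real hardware; it only gives the sketch something to search over. */
static uint32_t toy_rss_hash(uint32_t saddr, uint32_t daddr,
                             uint16_t sport, uint16_t dport)
{
    uint32_t h = saddr ^ (daddr * 2654435761u);
    h ^= ((uint32_t)sport << 16) | dport;
    h ^= h >> 15;
    h *= 2246822519u;
    h ^= h >> 13;
    return h;
}

/* Returns the first source port in [1025, 65535] whose 4-tuple hash
 * maps to the given core (assumed mapping: hash % num_cores),
 * or 0 if no such port exists. */
uint16_t pick_source_port(uint32_t saddr, uint32_t daddr, uint16_t dport,
                          unsigned core, unsigned num_cores)
{
    assert(num_cores > 0 && core < num_cores);

    for (uint32_t p = 1025; p <= 65535; p++) {
        if (toy_rss_hash(saddr, daddr, (uint16_t)p, dport) % num_cores == core)
            return (uint16_t)p;
    }
    return 0;
}
```

Choosing the port this way means the response packets of a connection are steered to the same core that opened it, so no cross-core handoff is needed.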
Afterwards, epwget creates the epoll instance through which it receives read
and write availability events, as follows (note that we have simplified the
code for better readability):
ep = mtcp_epoll_create(mctx, ctx->maxevents);
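For readers unfamiliar with the epoll model, the following sketch shows the same create, register, and wait sequence with the standard Linux epoll API on a pipe. It is an analogy only: the mtcp_epoll_* calls take an mTCP context and mTCP socket ids rather than file descriptors, and epoll_pipe_demo() is a hypothetical name used just for this example.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Creates an epoll instance, registers the read end of a pipe for
 * EPOLLIN, makes it readable by writing one byte, waits for the
 * event, and reads the byte back.
 * Returns the number of bytes read (1 on success) or -1 on failure. */
int epoll_pipe_demo(void)
{
    int fds[2];
    struct epoll_event ev = {0}, out;
    char byte;
    int result = -1;

    if (pipe(fds) != 0)
        return -1;

    int ep = epoll_create1(0);
    if (ep < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    ev.events = EPOLLIN;        /* interested in "ready to read" */
    ev.data.fd = fds[0];

    if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
        write(fds[1], "x", 1) == 1 &&
        epoll_wait(ep, &out, 1, 1000) == 1 &&   /* wait up to 1 second */
        (out.events & EPOLLIN) && out.data.fd == fds[0])
        result = (int)read(fds[0], &byte, 1);

    close(ep);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```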
RunWget() is the core of this program. Using the epoll event API, it
creates new connections and sends or receives data.
while (!done) {
    /* until it meets the maximum number of concurrent connections, */
    while (mtcp_get_connection_cnt(ctx->mctx) < concurrency) {
        /* create a new connection */
        CreateConnection(ctx);
    }
    /* wait inside the epoll_wait call until there's any event */
    nevents = mtcp_epoll_wait(mctx, ctx->ep, ctx->events, ...);
    for (i = 0; i < nevents; i++) {
        if (ctx->events[i].events & MOS_EPOLLERR) {
            /* print an error message and close the connection */
            ...
        } else if (ctx->events[i].events & MOS_EPOLLIN) {
            /* read the data arrived at the socket buffer */
            HandleReadEvent(ctx, ctx->events[i].data.sock, ...);
        } else if (ctx->events[i].events == MOS_EPOLLOUT) {
            /* write HTTP request to the socket send buffer */
            SendHTTPRequest(ctx, ctx->events[i].data.sock, wv);
        }
    }
    ...
}
Here are some detailed explanations for each sub-function in the code above:
The CreateConnection() function creates a new mTCP socket, sets the socket as non-blocking, connects to the target web server, and adds the socket to the epoll event queue.
sockid = mtcp_socket(mctx, AF_INET, SOCK_STREAM, 0);
...
mtcp_setsock_nonblock(mctx, sockid);
...
mtcp_connect(mctx, sockid, &addr, sizeof(struct sockaddr_in));
...
mtcp_epoll_ctl(mctx, ctx->ep, MOS_EPOLL_CTL_ADD, sockid, &ev);
The SendHTTPRequest() function creates an outgoing HTTP request header, writes it to the socket, and opens a file to store the response data.
snprintf(request, HTTP_HEADER_LEN, "GET %s HTTP/1.0\r\n", ...);
len = strlen(request);
wr = mtcp_write(ctx->mctx, sockid, request, len);
...
wv->fd = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
The HandleReadEvent() function reads the payload from the socket and stores the data in the file.
rd = mtcp_read(mctx, sockid, buf, BUF_SIZE);
/* parse the http header */
...
if (writable) {
    /* store the data to the file */
    write(wv->fd, pbuf + wr, rd - wr);
}
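The request construction and header handling described in the items above can be sketched in plain C as follows. This is an illustration, not the epwget source: build_get_request() and find_body_offset() are hypothetical helpers, and the Host and Connection header lines are our assumption, since the snippet above elides everything after the request line. The second helper locates the blank line that ends the HTTP header, which is the point after which received bytes belong to the file body.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes a minimal HTTP/1.0 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    assert(buf != NULL && host != NULL && path != NULL);

    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",                /* blank line ends the header */
                     path, host);
    if (n < 0 || (size_t)n >= cap)
        return -1;                          /* output error or truncation */
    return n;
}

/* Returns the offset of the first body byte, i.e. the position just
 * past the first "\r\n\r\n", or -1 if the header is not complete yet. */
int find_body_offset(const char *data, size_t len)
{
    assert(data != NULL);

    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(data + i, "\r\n\r\n", 4) == 0)
            return (int)(i + 4);
    }
    return -1;
}
```

Because a response header may arrive split across several reads, a caller would normally keep accumulating data until find_body_offset() stops returning -1, and only then start writing bytes to the file.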
(4) Multi-process Version (DPDK-only)
You can also run epwget in multi-process (single-threaded) mode.
This mode works only with the Intel DPDK driver. The epwget-mp program is
located in the same directory as epwget. The overall design of epwget-mp
is similar to that of epwget, except that it does not use pthreads.
You can run epwget-mp on a 4-core machine using the following script:
#!/bin/bash
./epwget-mp -f config/mos-master.conf -c 0 &
sleep 5
for i in {1..3}
do
    ./epwget-mp -f config/mos-slave.conf -c $i &
done
The -c switch binds the process to a specific CPU core. Under DPDK
settings, the master process (core 0 in the example above) is responsible
for initializing the underlying DPDK-specific NIC resources once. The slave
processes (cores 1-3) share those initialized resources with the master process.
The master process relies on the mos-master.conf file for configuration.
It has only one new keyword: multiprocess = 0 master; where 0 stands for the
CPU core id. The mos-slave.conf configuration file has an additional line:
multiprocess = slave; which (as the line suggests) sets the process as a
DPDK secondary (slave) instance. We employ a mandatory wait between the
launch of the master and the slave processes (the sleep 5 in the script).
This is needed to avoid potential race conditions on the shared resources
that both kinds of process update.