The PathKit abstracts Scout paths from the Scout OS. The goal of the PathKit is to introduce the resource isolation and configurability provided by paths into a number of contexts, for instance within the Linux kernel or inside a user-space application. To this end, the PathKit provides a conceptual object-oriented architecture based on Scout paths. PathKit objects are implemented in many cases by writing wrappers around pre-existing system components such as queues, threads, or protocols.
The PathKit decomposes Scout paths into generic path components called Stages. Each Stage has a simple message-oriented interface that supports Push and Pull operations, resembling Click or the Scout partner interface:
    int Push(Stage s, Msg m);
    int Pull(Stage s, Msg m);
    int Destroy(Stage s);
The Push operation is used to push a message into the Stage, while Pull tries to fetch a message from the Stage. The Destroy operation is used to free state associated with the Stage.
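One way to picture this interface in C is a table of function pointers that each Stage subclass fills in. The sketch below is an illustration under assumptions, not the PathKit source: the layout of `Msg`, the `struct stage` fields, the return codes (0 for success, -1 for failure), and the one-message `slot` Stage are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message: the source names the type Msg but not its layout. */
struct msg { int len; };
typedef struct msg *Msg;

/* A Stage is a table of operations plus private per-Stage state. */
typedef struct stage *Stage;
struct stage {
    int (*push)(Stage s, Msg m);
    int (*pull)(Stage s, Msg m);
    int (*destroy)(Stage s);
    void *state;
};

/* Generic entry points: dispatch to the subclass, or fail if unsupported. */
int Push(Stage s, Msg m) { assert(s); return s->push ? s->push(s, m) : -1; }
int Pull(Stage s, Msg m) { assert(s); return s->pull ? s->pull(s, m) : -1; }
int Destroy(Stage s)     { assert(s); return s->destroy ? s->destroy(s) : 0; }

/* Example subclass (illustrative only): a Stage that holds one message. */
struct slot { int full; struct msg held; };

static int slot_push(Stage s, Msg m)
{
    struct slot *st = s->state;
    if (st->full)
        return -1;          /* already holding a message */
    st->held = *m;
    st->full = 1;
    return 0;
}

static int slot_pull(Stage s, Msg m)
{
    struct slot *st = s->state;
    if (!st->full)
        return -1;          /* nothing to fetch */
    *m = st->held;
    st->full = 0;
    return 0;
}

static int slot_destroy(Stage s)
{
    struct slot *st = s->state;
    st->full = 0;           /* drop any held message */
    return 0;
}

/* Bind the slot operations and state to a caller-provided Stage. */
void slot_init(Stage s, struct slot *st)
{
    assert(s && st);
    st->full = 0;
    s->push = slot_push;
    s->pull = slot_pull;
    s->destroy = slot_destroy;
    s->state = st;
}
```

Because callers only ever use `Push`, `Pull`, and `Destroy`, any Stage subclass can be chained or swapped without the caller knowing its concrete type.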
The PathKit framework contains several different subclasses of the Stage class. Since all of these are also Stages, they all support the Push/Pull interface described above. First, the atomic Stages:
A Demux inspects a message and pushes it to one or more Stages
based on the message contents. Typically a Demux will extract flow
information from a packet's headers and push the packet to the Path
responsible for this flow. The Demux interface provides methods to add
and delete mappings between keys and Stages, and to resolve a key into
a Stage.
A Queue stores messages. Messages are pushed to a Queue, which
either stores them or returns a failure code if the Queue is full.
Pulling from a Queue returns a message if the Queue is not empty;
otherwise the operation may block until a message is available. The
Queue interface provides a method to check the queue length.
A Thread represents a process in the system, and can store
scheduling state for that process (e.g., how many shares have been
assigned to the process in a proportional share CPU scheduler). Note
that there is not necessarily a one-to-one correspondence between
Thread objects and system processes. Multiple Threads may be
multiplexed onto one process if it is not necessary to save per-Thread
state (i.e., continuations), or multiple processes may share a single
Thread that provides an entry-point into a path. The Thread interface
provides methods to start and stop the Thread, a pointer to the
Thread's start function (the function that executes when the Thread
runs), and an entry-point method that a process can call to run the
start function in its own context. A Thread can be marked
non-blocking, so that pulling from an empty Queue returns failure
instead of blocking.
A Processing Stage transforms a message that is pushed to
or pulled from it, for example adding or removing protocol headers or
processing the message payload.
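The Queue and Demux behaviour described above can be sketched in a few lines of C. This is a minimal user-space illustration, not the PathKit code: the fixed capacities, the use of a destination port as the demultiplexing key, the return codes, and every function name are assumptions. Only the non-blocking case of Pull is modelled, so pulling from an empty Queue returns failure.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP   4   /* hypothetical capacity */
#define DEMUX_SLOTS 8   /* hypothetical table size */

enum { PUSH_OK = 0, PUSH_FULL = -1, PUSH_NO_MATCH = -2 };

struct msg { unsigned short dst_port; int len; };

/* Bounded FIFO stored in a ring buffer. */
struct queue { struct msg slots[QUEUE_CAP]; size_t head, count; };

int queue_push(struct queue *q, const struct msg *m)
{
    assert(q && m);
    if (q->count == QUEUE_CAP)
        return PUSH_FULL;               /* full: report failure */
    q->slots[(q->head + q->count) % QUEUE_CAP] = *m;
    q->count++;
    return PUSH_OK;
}

int queue_pull(struct queue *q, struct msg *out)
{
    assert(q && out);
    if (q->count == 0)
        return -1;                      /* empty: non-blocking failure */
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 0;
}

size_t queue_length(const struct queue *q) { assert(q); return q->count; }

/* Demux: a small table mapping keys to target Queues. */
struct demux_entry { int used; unsigned short key; struct queue *target; };
struct demux { struct demux_entry map[DEMUX_SLOTS]; };

struct queue *demux_resolve(const struct demux *d, unsigned short key)
{
    assert(d);
    for (size_t i = 0; i < DEMUX_SLOTS; i++)
        if (d->map[i].used && d->map[i].key == key)
            return d->map[i].target;
    return NULL;
}

int demux_add(struct demux *d, unsigned short key, struct queue *target)
{
    assert(d && target);
    if (demux_resolve(d, key))
        return -1;                      /* key already mapped */
    for (size_t i = 0; i < DEMUX_SLOTS; i++) {
        if (!d->map[i].used) {
            d->map[i].used = 1;
            d->map[i].key = key;
            d->map[i].target = target;
            return 0;
        }
    }
    return -1;                          /* table full */
}

int demux_delete(struct demux *d, unsigned short key)
{
    assert(d);
    for (size_t i = 0; i < DEMUX_SLOTS; i++) {
        if (d->map[i].used && d->map[i].key == key) {
            d->map[i].used = 0;
            return 0;
        }
    }
    return -1;                          /* no such key */
}

/* Push through the Demux: resolve the key, then push to the target. */
int demux_push(struct demux *d, const struct msg *m)
{
    assert(m);
    struct queue *q = demux_resolve(d, m->dst_port);
    return q ? queue_push(q, m) : PUSH_NO_MATCH;
}
```

Distinguishing "no match" from "queue full" in the return code lets a caller treat unclaimed messages differently from messages that were claimed but could not be stored.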
Two Stage objects are compound objects built out of the atomic Stages listed above. These are:
A Path consists of one or more Stages chained together. A
typical Path may start with a Queue, followed by a Thread, one or more
Processing Stages, and then another Queue (as shown).
A PktSched schedules messages contained in one or more Queues.
Pulling from the PktSched causes it to choose the next Queue to be
serviced according to its internal scheduling policy. A message is
then pulled from the chosen Queue and returned. The Queues managed by
a PktSched may also be part of a Path.
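As an illustration of pulling from a PktSched, the sketch below uses round-robin as the internal policy. The text does not say which policies PathKit implements, so round-robin is an assumption chosen for brevity, as are the integer messages and all names and sizes.

```c
#include <assert.h>
#include <stddef.h>

#define QCAP 4   /* hypothetical queue capacity */
#define MAXQ 3   /* hypothetical limit on managed queues */

/* Minimal bounded queue of integer "messages". */
struct queue { int items[QCAP]; size_t head, count; };

int queue_push(struct queue *q, int item)
{
    assert(q);
    if (q->count == QCAP)
        return -1;
    q->items[(q->head + q->count) % QCAP] = item;
    q->count++;
    return 0;
}

int queue_pull(struct queue *q, int *out)
{
    assert(q && out);
    if (q->count == 0)
        return -1;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    return 0;
}

/* Scheduler state: the managed queues and where the next scan starts. */
struct pktsched { struct queue *queues[MAXQ]; size_t nqueues, next; };

int pktsched_add(struct pktsched *ps, struct queue *q)
{
    assert(ps && q);
    if (ps->nqueues == MAXQ)
        return -1;
    ps->queues[ps->nqueues++] = q;
    return 0;
}

/* Pull: pick the next non-empty queue in round-robin order and
 * return one message from it. Fails if every queue is empty. */
int pktsched_pull(struct pktsched *ps, int *out)
{
    assert(ps && out);
    for (size_t i = 0; i < ps->nqueues; i++) {
        size_t idx = (ps->next + i) % ps->nqueues;
        if (queue_pull(ps->queues[idx], out) == 0) {
            ps->next = (idx + 1) % ps->nqueues;  /* resume after this queue */
            return 0;
        }
    }
    return -1;
}
```

A different policy, such as weighted or priority scheduling, would change only the choice of `idx` inside `pktsched_pull`; callers still see a single Pull operation.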
The objects described above are generic classes that simply define interfaces. In order to build functioning paths using the PathKit it is necessary to provide implementations of these objects, either as wrappers of existing functionality or by writing new code. The following objects have been implemented in the PathKit:
In order to use the PathKit in an existing system such as the Linux kernel, we connect PathKit objects to the rest of the system using thin layers of code called shims. In SILK, three shims tie Path objects into the Linux kernel: socket, raw socket, and netfilter. The picture below illustrates how paths and shims relate in SILK.
The picture on the left depicts the Linux network stack. As an example
of how data traverses it, consider an arriving TCP packet. The device
layer copies the packet contents to an sk_buff message, inspects and
strips the link-level header, and passes the message up to
the IP layer. This layer inspects and strips the IP header, notes
that the packet is a TCP packet addressed to the local host, and
passes the message to the TCP layer. The TCP protocol code
demultiplexes the message to the appropriate socket and
places the data it contains in the socket's receive buffer. All of
this occurs in the context of a soft interrupt. Subsequently, when a
process calls read() on the file descriptor corresponding to
this TCP socket, the system call traverses layers implementing generic
operations on file descriptors and socket-specific operations, enters
the TCP layer, and copies the data in the receive buffer to user
space.
The picture on the right shows the Linux stack
with the SILK module loaded, in particular highlighting the Path
objects and shims. Again, we consider an arriving TCP packet. The
packet is copied to an sk_buff message and processed by the
IP layer as before. After IP reassembly has taken place, the message
is intercepted by the netfilter shim and pushed to the root
demux, an instance of a DemuxSkbuff object. The root demux
matches the message to either a PathLinuxNetStack or a PathRawSocket
and pushes it to the path; if the push succeeds, the netfilter shim
reports to the Linux network stack that it has stolen the
sk_buff and that Linux should stop processing it. Assume as
before that the message belongs to a TCP socket and therefore is
pushed to a PathLinuxNetStack and deposited in its input
QueueSock.
When the process calls read() on the TCP socket, the
resulting system call traverses the file operations layer and is
intercepted by the socket shim. This shim finds the
PathLinuxNetStack that corresponds to the socket and calls into its
ThreadEntryPoint, which causes the process to pull the message from
the input queue and push it into the TCP layer via the
ProcLinuxReinject object. Now TCP processes the message in the
context of the process that called read() rather than in a
soft interrupt. After the message is processed by TCP, its data is
placed in the receive buffer as before; conceptually the message has
been pushed to the output QueueSock that wraps this receive buffer in
a Queue object. Finally, execution returns from the ThreadEntryPoint
back to the socket shim, and continues down into the TCP layer where
the data is copied from the receive buffer to user space.
The socket shim connects socket system calls to operations on Path objects. It does this by wrapping all socket calls within the kernel, allowing the shim to intercept these calls, take some PathKit-related action, and then continue the call within Linux. For instance, the bind call is used to bind a socket to a local port. An application that calls bind first invokes the socket shim's bind wrapper; the shim checks its own port mapping to make sure the local port is available, enters a demultiplexing key for the socket into a Demux object, and then calls the bind function for that socket type in the kernel.
As implied by the bind example above, SILK maintains its own port usage map. One function of this map is to allow ports to be reserved for a particular user, meaning only that user can bind to that port. For this reason SILK takes over the assignment of free ports from Linux. That is, connect and the send* operations will assign a free local port if the socket is not already bound. In order to ensure that Linux does not choose a reserved port, the socket shim chooses the port itself and binds it to the socket within its connect or sendmsg wrappers.
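The port map's role can be sketched as follows. This is a simplified user-space model, not SILK's data structure: the tiny port range, the integer user IDs, and every function name are hypothetical. The sketch also assumes that automatic assignment skips every reserved port, which is one straightforward way to keep reserved ports out of automatic selection.

```c
#include <assert.h>

#define PORT_MIN 1024   /* hypothetical, deliberately tiny range */
#define PORT_MAX 1031
#define NPORTS   (PORT_MAX - PORT_MIN + 1)
#define NO_OWNER (-1)

struct port_entry { int in_use; int reserved_for; };  /* uid or NO_OWNER */
struct port_map { struct port_entry e[NPORTS]; };

/* Return the entry for a port, or a null pointer if out of range. */
static struct port_entry *port_lookup(struct port_map *pm, int port)
{
    assert(pm);
    if (port < PORT_MIN || port > PORT_MAX)
        return 0;
    return &pm->e[port - PORT_MIN];
}

void port_map_init(struct port_map *pm)
{
    assert(pm);
    for (int i = 0; i < NPORTS; i++) {
        pm->e[i].in_use = 0;
        pm->e[i].reserved_for = NO_OWNER;
    }
}

/* Reserve a free, unreserved port for one user. */
int port_reserve(struct port_map *pm, int port, int uid)
{
    struct port_entry *e = port_lookup(pm, port);
    if (!e || e->in_use || e->reserved_for != NO_OWNER)
        return -1;
    e->reserved_for = uid;
    return 0;
}

/* Explicit bind: allowed if the port is free and not reserved
 * for a different user. */
int port_bind(struct port_map *pm, int port, int uid)
{
    struct port_entry *e = port_lookup(pm, port);
    if (!e || e->in_use)
        return -1;
    if (e->reserved_for != NO_OWNER && e->reserved_for != uid)
        return -1;
    e->in_use = 1;
    return 0;
}

/* Automatic assignment (as in connect or send on an unbound socket):
 * take the first free port that nobody has reserved. */
int port_autobind(struct port_map *pm)
{
    assert(pm);
    for (int i = 0; i < NPORTS; i++) {
        if (!pm->e[i].in_use && pm->e[i].reserved_for == NO_OWNER) {
            pm->e[i].in_use = 1;
            return PORT_MIN + i;
        }
    }
    return -1;
}

/* Release a bound port; any reservation stays in place. */
void port_release(struct port_map *pm, int port)
{
    struct port_entry *e = port_lookup(pm, port);
    if (e)
        e->in_use = 0;
}
```

In this model a reserved port can only be claimed through an explicit bind by its owner, which is why the shim, rather than Linux, has to pick the port when a socket is bound implicitly.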
The PathKit-specific operations performed in the socket wrappers are described below.
| Socket call | PathKit action |
|---|---|
| socket | Creates a PathLinuxNetStack corresponding to the socket, and changes all of the operations on that socket to point to wrapper functions in the shim. |
| bind | Checks to see if the local port is available; if so, adds a key to the root Demux object to demultiplex incoming packets on that local port to the appropriate PathLinuxNetStack object. |
| connect | Binds socket to an available local port if necessary. Adds a key containing the local and remote ports to the Demux object. |
| send* | If necessary, binds socket to an available local port and adds a key to the Demux object. |
| recv* | Calls into the ThreadEntryPoint of the PathLinuxNetStack object corresponding to the socket. |
| close | Destroys the PathLinuxNetStack for that socket, releasing port references and demux keys. |
The raw socket shim intercepts operations on safe raw sockets, which allow non-privileged users to send and receive IP datagrams. Before a safe raw socket can be used, it must be bound to an unused local TCP or UDP port, like a standard TCP/UDP socket. The raw socket shim implements this addition to the Linux raw socket interface, and performs header checks on outgoing packets to ensure that the bound local port is the source port in the transport header.
The PathKit-specific operations performed in the raw socket wrappers are described below.
| Socket call | PathKit action |
|---|---|
| socket | Creates a PathRawSocket corresponding to the socket, and changes all of the operations on that socket to point to wrapper functions in the shim. |
| bind | Checks to see if the local port is available; if so, adds a key to the root Demux object to demultiplex incoming packets on that local port to the appropriate PathRawSocket. |
| connect | Not supported for raw sockets |
| send* | Checks the transport header to make sure that the local port is the same one to which the socket was bound. Also verifies the protocol field in the IP header and sets the header's source address to the local IP address (to prevent spoofing). |
| recv* | No-op |
| close | Destroys the PathRawSocket for that socket, releasing port references and demux keys |
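The send* checks for safe raw sockets can be illustrated on a raw IPv4 packet buffer. The sketch below is a user-space approximation and not the shim's code: the function name, its parameters, and the choice to compare the IP protocol field against the protocol the socket was bound with are assumptions. It relies only on the standard IPv4 layout (header length in the low nibble of byte 0, protocol in byte 9, source address in bytes 12 to 15) and on TCP and UDP both carrying the source port in the first two bytes of the transport header. Checksum handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { PROTO_TCP = 6, PROTO_UDP = 17 };   /* IANA protocol numbers */

/* Validate an outgoing packet for a safe raw socket.
 * Returns 0 and rewrites the source address on success, -1 otherwise.
 * bound_proto and bound_port describe the socket's earlier bind
 * (hypothetical parameters for this sketch). */
int raw_send_check(uint8_t *pkt, size_t len, uint8_t bound_proto,
                   uint16_t bound_port, const uint8_t local_addr[4])
{
    assert(pkt && local_addr);

    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;                        /* not a complete IPv4 header */

    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;   /* header length in bytes */
    if (ihl < 20 || len < ihl + 2)
        return -1;                        /* malformed or truncated */

    if (bound_proto != PROTO_TCP && bound_proto != PROTO_UDP)
        return -1;
    if (pkt[9] != bound_proto)
        return -1;                        /* protocol field mismatch */

    /* Source port is big-endian at the start of the transport header. */
    uint16_t sport = (uint16_t)((pkt[ihl] << 8) | pkt[ihl + 1]);
    if (sport != bound_port)
        return -1;                        /* not the port that was bound */

    memcpy(pkt + 12, local_addr, 4);      /* force the local source address */
    return 0;
}
```

Overwriting the source address, rather than only checking it, means a packet with a forged address is sent with the correct one instead of being rejected.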
The netfilter shim intercepts both incoming and outgoing packets by registering hooks with Linux's netfilter interface. Incoming packets are intercepted at the NF_IP_LOCAL_IN hook, after IP reassembly has occurred. The incoming packet hook pushes packets to the root Demux object; if the push succeeds, the hook returns NF_STOLEN, SILK assumes responsibility for the packet, and Linux ceases processing it. If the push fails because of a full Path input queue, then the input hook returns NF_DROP and Linux drops the packet. Otherwise, the input hook returns NF_ACCEPT, indicating that this packet does not belong to a socket under control of SILK and that Linux can continue processing it.
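The three outcomes of the input hook amount to a small mapping from the result of the root Demux push to a verdict. The sketch below models that mapping in user space. It does not use the kernel netfilter API: the enums, the callback type, and the stub push function are hypothetical stand-ins, with comments noting which netfilter verdict each case corresponds to in the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of pushing a packet to the root Demux. */
enum push_result { PUSH_OK, PUSH_QUEUE_FULL, PUSH_NO_MATCH };

/* Sketch-local verdicts standing in for NF_DROP, NF_ACCEPT, NF_STOLEN. */
enum verdict { VERDICT_DROP, VERDICT_ACCEPT, VERDICT_STOLEN };

/* Callback standing in for the root Demux push. */
typedef enum push_result (*root_push_fn)(const void *pkt, void *ctx);

/* Decide what to tell the network stack about an incoming packet. */
enum verdict input_hook(const void *pkt, root_push_fn push, void *ctx)
{
    assert(push);
    switch (push(pkt, ctx)) {
    case PUSH_OK:
        return VERDICT_STOLEN;   /* a Path owns the packet now */
    case PUSH_QUEUE_FULL:
        return VERDICT_DROP;     /* Path matched but its input queue is full */
    case PUSH_NO_MATCH:
    default:
        return VERDICT_ACCEPT;   /* not ours: let the stack continue */
    }
}

/* Stub for experimentation: returns whatever result ctx points to. */
enum push_result stub_root_push(const void *pkt, void *ctx)
{
    (void)pkt;
    assert(ctx);
    return *(enum push_result *)ctx;
}
```

Keeping "queue full" separate from "no match" matters here: a packet that belongs to a SILK-managed socket is dropped on overload instead of falling back to the regular stack.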
All outgoing packets, including those sent on sockets not under SILK's control, are intercepted at the NF_IP_POST_ROUTING hook. This hook accounts an outgoing packet to the vserver that sent it using the vserver ID field in the struct sock associated with the packet, and pushes copies of the packet to any raw sniffer sockets that are listening on the packet's local port.