Towards Understanding Application Semantics of Network Traffic (thesis)
This dissertation explores the problem of building a semantic network traffic analysis system and using it
to investigate various aspects of network traffic. Semantic traffic analysis uncovers the application-layer
semantics conveyed in packets so that one can examine the specific requests, responses, status messages,
error codes, and data items embedded in a connection dialog. Analyzing these at the application layer, as
opposed to the syntactic byte-string layer, opens up much greater insight into the nature and context of the
exchange between two hosts. For this reason, semantic traffic analysis is a cornerstone for precise network
intrusion detection and also has broad applications in measurements of networking systems.
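
To illustrate the distinction drawn above, the following is a minimal sketch in Python contrasting a byte-string match with a match against a parsed request field. It is not taken from the dissertation's tools: the function names (`syntactic_match`, `parse_http_request`, `uri_contains`) and the sample payloads are hypothetical, and the parser handles only a simplified HTTP/1.x request line and headers.

```python
def syntactic_match(payload: bytes, pattern: bytes) -> bool:
    """Byte-string view: report a hit if the pattern occurs anywhere."""
    return pattern in payload


def parse_http_request(payload: bytes) -> dict:
    """Application-layer view: split a simplified HTTP/1.x request into fields.

    Assumes a well-formed request line and CRLF line endings (illustrative only).
    """
    head, _, body = payload.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    method, uri, version = lines[0].split(b" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("latin-1")] = value.strip().decode("latin-1")
    return {
        "method": method.decode("latin-1"),
        "uri": uri.decode("latin-1"),
        "version": version.decode("latin-1"),
        "headers": headers,
        "body": body,
    }


def uri_contains(payload: bytes, needle: str) -> bool:
    """Semantic check: look for the needle only in the request URI."""
    return needle in parse_http_request(payload)["uri"]


# Hypothetical payloads: one benign request whose Referer header happens to
# contain the suspicious string, and one request that actually asks for it.
benign = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Referer: http://example.com/docs/etc/passwd-howto\r\n\r\n"
)
attack = b"GET /../../etc/passwd HTTP/1.0\r\n\r\n"

benign_syntactic = syntactic_match(benign, b"/etc/passwd")  # flags the Referer
benign_semantic = uri_contains(benign, "/etc/passwd")       # URI is /index.html
attack_semantic = uri_contains(attack, "/etc/passwd")       # URI requests the file
```

In this sketch the byte-string match flags the benign request because the pattern appears in a header, while the field-level check flags only the request whose URI actually names the file. This is the kind of context that analysis at the application layer makes available.
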
This dissertation advances semantic traffic analysis in two respects. First, we present tools and techniques
for building traffic analyzers and creating shared traces, including the design and implementation of (1)
a declarative language, binpac, for writing application protocol parsers, and (2) a programming environment
for packet trace transformation and anonymization. Second, we characterize two types of previously
unstudied network traffic: Internet background radiation and enterprise internal traffic. Both studies focus
on traffic semantics, aiming to understand the network applications that generate the traffic and to uncover
underlying causes of network usage patterns.