M. Zhang, C. Zhang, V. Pai, L. Peterson, and R. Wang. Proc. Sixth Symposium on Operating Systems Design and Implementation. December 2004.
Detecting network path anomalies generally requires examining large volumes of traffic data to find misbehavior. We observe that wide-area services, such as peer-to-peer systems and content distribution networks, exhibit large traffic volumes spread over large numbers of geographically dispersed endpoints. This makes them ideal candidates for observing wide-area network behavior. Specifically, we can combine passive monitoring of wide-area traffic, which detects anomalous network behavior, with active probes from multiple nodes, which quantify and characterize the scope of these anomalies.
This approach provides several advantages over other techniques: (1) we obtain more complete and finer-grained views of failures, since the wide-area nodes already provide geographically diverse vantage points; (2) we incur limited additional measurement cost, since most active probing is initiated only when passive monitoring detects anomalies; and (3) we detect failures at a much higher rate than other researchers have reported, since the services provide large volumes of traffic to sample. This paper shows how to exploit this combination of wide-area traffic, passive monitoring, and active probing, both to understand path anomalies and to provide optimization opportunities for the host service.