For simulation, we used the 14-node Worldnet backbone topology. Among these nodes, one is the centralized hosting center (content server), and four are border routers (or IGRs) through which external traffic enters the Worldnet backbone. The remaining nodes serve as entry points for traffic from/to AT&T clients and are called Points of Presence (POPs). This reflects the current EasyWWW architecture.
The sizes of the various documents were obtained from the trace, and bandwidth consumption was computed by multiplying the document size by the number of hops it traversed. Response latency was calculated as the sum of the link latencies on the request and response paths, the document transfer time, the delay at the proxy server(s), and, if needed, the delay at the content server. The latency and bandwidth of the various links in the topology were determined using delay measurements for various packet sizes. Proxy delays were obtained by modeling each proxy as a FIFO queue with a fixed service time per request. This latency model for the backbone is less realistic than our bandwidth model, but it suffices to give an idea of the latency penalty.
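As a concrete illustration, the sketch below computes these two per-request metrics; the function and parameter names (e.g., request_path_latencies, proxy_delays) are ours and not taken from the simulator.

    def bandwidth_consumed(doc_size_bytes, hops_traversed):
        """Bandwidth consumption: document size times the number of backbone hops traversed."""
        return doc_size_bytes * hops_traversed

    def response_latency(request_path_latencies, response_path_latencies,
                         transfer_time, proxy_delays, server_delay=0.0):
        """Latency: link latencies on both paths, plus the document transfer time,
        the FIFO queueing delay at each proxy visited, and the server delay if the
        request reaches the content server."""
        return (sum(request_path_latencies) + sum(response_path_latencies)
                + transfer_time + sum(proxy_delays) + server_delay)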
In our simulation, client requests are directed to the proxy server that is co-located with the exit point of traffic from the Worldnet backbone to the client. Determining this exit point is easy for AT&T clients, since each client is connected to a unique Worldnet node. For external clients, the exit point is determined using actual Worldnet BGP tables; inconclusive BGP data are resolved using geographical proximity information from the Merit database [10]. Selecting a reverse proxy based on BGP tables is a technique used by many existing DNS load balancers, such as Distributed Director [5].
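This proxy-selection logic can be summarized by the following sketch; the lookup tables att_client_node and bgp_exit_point and the function nearest_node_by_geography are hypothetical stand-ins for the per-client POP mapping, the Worldnet BGP tables, and the Merit proximity data.

    def select_proxy(client_addr, att_client_node, bgp_exit_point, nearest_node_by_geography):
        """Map a client to the Worldnet node (and co-located proxy) where its traffic exits."""
        if client_addr in att_client_node:
            # AT&T clients attach to exactly one Worldnet node.
            return att_client_node[client_addr]
        exit_node = bgp_exit_point.get(client_addr)
        if exit_node is not None:
            # External clients: exit point taken from the Worldnet BGP tables.
            return exit_node
        # Inconclusive BGP data: fall back to geographical proximity (Merit database).
        return nearest_node_by_geography(client_addr)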
Since every AT&T client connects at a unique POP, proxies did not forward requests from these clients. As already mentioned, the response from a remote proxy would have come back to the client POP anyway, so request forwarding would add latency without any bandwidth savings. For requests from outside clients, we deployed the forwarding mechanism at all four IGRs/proxies, assuming each of them had infinite cache capacity. Since the total size of all documents is less than 3 GB, this assumption about cache capacity is reasonable for this trace. Section 4.2 assumes that each cache has perfect, instant knowledge of the contents of remote caches. Section 4.3 then explicitly studies the effect of delayed and imprecise knowledge on the performance of our schemes.
We investigated the two configurations described in the preceding section. The first configuration, which we call IGR, forwards requests among reverse proxies. The second, which we call Server-Forwardable, can also forward requests to the server or web hosting complex, with the response sent back directly to the client.
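In code, the difference between the two configurations reduces to the set of nodes a proxy may forward a request to; the sketch below uses our own names for these candidate sets.

    def forwarding_targets(configuration, reverse_proxies, content_server):
        """Candidate nodes a proxy may forward a request to, per configuration."""
        if configuration == "IGR":
            # Forwarding happens only among the reverse proxies (IGRs).
            return list(reverse_proxies)
        if configuration == "Server-Forwardable":
            # The hosting complex is also a valid target; its response is
            # returned directly to the client.
            return list(reverse_proxies) + [content_server]
        raise ValueError("unknown configuration: " + configuration)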
A proxy needs to decide whether to forward a request or fetch and cache the page. To this end, we use a forwarding threshold. Each proxy maintains a reference counter for each document and forwards a request when the corresponding document's reference counter does not exceed the threshold. Otherwise, the proxy fetches and stores the object. When there are multiple cached copies, a proxy picks the nearest remote proxy for forwarding or fetching.
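A minimal sketch of this threshold rule follows, assuming each proxy keeps a per-document reference counter and a (hypothetical) distance map to the other proxies.

    class Proxy:
        def __init__(self, name, threshold, distance_to):
            self.name = name
            self.threshold = threshold
            self.distance_to = distance_to   # hypothetical map: proxy name -> distance
            self.ref_count = {}              # document -> requests seen at this proxy
            self.cache = set()               # documents cached locally

        def handle(self, doc, remote_holders):
            """Forward the request or fetch and cache the document, per the threshold rule."""
            if doc in self.cache:
                return ("hit", self.name)
            self.ref_count[doc] = self.ref_count.get(doc, 0) + 1
            # Among remote proxies holding a cached copy, pick the nearest one.
            nearest = min(remote_holders, key=lambda p: self.distance_to[p], default=None)
            if self.ref_count[doc] <= self.threshold and nearest is not None:
                return ("forward", nearest)
            # Reference count exceeds the threshold: fetch the document (from the
            # nearest copy, or the content server if none exists) and store it locally.
            self.cache.add(doc)
            return ("fetch-and-cache", nearest)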
To reflect document dynamics, we introduce a new parameter, the document time-to-live. Our assumption here is that document content changes periodically; old cached copies then become out-of-date and should be discarded. The time-to-live gives the interval between consecutive changes. We assume the time-to-live is the same for all documents.
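A minimal sketch of the resulting staleness check is given below; aligning document changes to fixed epochs of length ttl is our simplification for illustration, not something stated above.

    def is_fresh(fetched_at, now, ttl):
        """A cached copy is fresh only if no change epoch has passed since it was fetched."""
        return int(now // ttl) == int(fetched_at // ttl)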