Quasi-Copies: Efficient Data Sharing for Information Retrieval Systems
Currently, a variety of information retrieval systems are available to potential users. Some of these services are provided by commercial enterprises (such as Dow Jones and The Source), while others are research efforts (such as the Boston Community Information System). Although in many cases these systems are accessed
from personal computers, typically no advantage is taken of the computing resources of those machines (such as local processing and storage). In this paper we explore the possibility of using the user's local storage capabilities to cache data at the user's site. This would improve the response time of user queries, albeit at the cost of the overhead required to maintain multiple copies. To reduce this overhead, it may be appropriate to allow copies to diverge in a controlled fashion. This would not only make caching less costly, but would also make it possible to propagate updates to the copies more efficiently, e.g., when the system is lightly loaded, when communication tariffs are lower, or by batching updates together. Just as importantly, it also makes it possible to access the copies even when the
communication lines or the central site are down. We therefore introduce the notion of quasi-copies, which embodies the ideas sketched above. We also define the types of deviations that seem useful, and discuss the available implementation strategies.
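
To make the idea of controlled divergence concrete, the following is a minimal sketch of a local cache that tolerates a bounded lag behind the central copy. It is an illustration under stated assumptions, not the definition or implementation given in the paper: the class name QuasiCopyCache, the fetch callback, and the max_age bound are all hypothetical, and an age bound is only one conceivable kind of deviation.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedEntry:
    value: str
    fetched_at: float  # clock reading when the value was copied from the central site


class QuasiCopyCache:
    """Hypothetical local cache whose entries may lag the central copy.

    Assumption: the allowed deviation is expressed as a maximum age in
    seconds. This is one illustrative choice, not the paper's definition.
    """

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch        # retrieves the current value from the central site
        self._max_age = max_age    # how stale a local copy is allowed to become
        self._clock = clock
        self._entries: Dict[str, CachedEntry] = {}

    def read(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry.fetched_at <= self._max_age:
            # Within the allowed deviation: answer from local storage.
            return entry.value
        try:
            value = self._fetch(key)
        except ConnectionError:
            # Assumption: if the central site is unreachable, a stale copy
            # is preferable to no answer at all.
            if entry is not None:
                return entry.value
            raise
        self._entries[key] = CachedEntry(value, now)
        return value
```

In this sketch a read contacts the central site only when the local copy is missing or older than max_age, so a larger bound trades freshness for fewer remote requests. Serving the stale copy when the fetch fails is a design choice made here for illustration; an application that cannot tolerate unbounded staleness would need to report the failure instead.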