Using User-Provided Information to Improve Internet Services (Thesis)

June 2004



Web users are not passive: they provide valuable information throughout
the process of information generation, delivery, and access. This user-provided
information is openly available in various novel web services and can often be a
valuable resource for constructing improved and enhanced services. This thesis
explores the application of user-provided information in two important services,
one at the application level and the other at the middleware level.

Our first study proposes and investigates a new reputation framework for
improving rating services. Rating services allow users to harvest the collective
wisdom of the broad community when making decisions. However, the difficulty with
Internet ratings is that little is known about the people providing them. This
thesis presents a methodology that automatically computes the reputation of each
online rater according to the quality and quantity of the ratings that rater has
given. This reputation information can be used to weight the ratings when
aggregating multiple users' opinions on a product and to guide readers to
high-quality opinions. Using data collected from real rating sites, our experiments
demonstrate that our system possesses a set of important properties and has the
potential to greatly enhance the effectiveness of rating services.
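
As a rough illustration of weighting ratings by rater reputation, the sketch below
computes a reputation-weighted mean. It is not the reputation framework of the
thesis: the function name, the dictionary inputs, and the fallback to a plain mean
are all assumptions made for this example, and the reputation scores are taken as
given rather than computed.

```python
def weighted_rating(ratings, reputation, default_reputation=0.0):
    """Aggregate one product's ratings, weighting each by its rater's reputation.

    ratings:    dict mapping rater id -> numeric score (hypothetical input format)
    reputation: dict mapping rater id -> non-negative weight (assumed precomputed)
    """
    if not ratings:
        return None  # nothing to aggregate

    total_weight = 0.0
    weighted_sum = 0.0
    for rater, score in ratings.items():
        # Raters with no known reputation get the default weight.
        weight = reputation.get(rater, default_reputation)
        total_weight += weight
        weighted_sum += weight * score

    if total_weight == 0:
        # Assumption: with no reputation information, fall back to a plain mean.
        return sum(ratings.values()) / len(ratings)
    return weighted_sum / total_weight


# A rater with reputation 3.0 counts three times as much as one with 1.0.
print(weighted_rating({"a": 5, "b": 1}, {"a": 3.0, "b": 1.0}))
```

With these inputs the weighted mean is (3.0 * 5 + 1.0 * 1) / 4.0 = 4.0, whereas the
unweighted mean would be 3.0.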

This thesis also proposes and investigates the use of user-provided information in
middleware design for distributed content delivery and caching. The information needs
of content consumers are the key driver of content delivery over the Internet.
Typically, these information needs are inferred from access patterns. This thesis
explores a set of novel content placement approaches that additionally use the
interests users state through subscriptions. Our algorithms deliver content to edge
servers close to end users, proactively at publishing time and on demand at access
time, based on subscription and access information. We studied the algorithms'
performance using a simulator and workloads that we built to mimic the content and
access dynamics of a busy news site. The results demonstrate that incorporating
subscription information judiciously can substantially improve the hit rate at the
local servers compared to access-based approaches, even when the subscription
information does not perfectly reflect users' actual accesses.
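
The following minimal sketch shows one way push-at-publish and pull-on-access
placement could be combined at a single edge server. It is an illustration under
stated assumptions, not the algorithms evaluated in the thesis: the EdgeCache class,
the push_threshold parameter, and the use of LRU eviction are all hypothetical
choices made for this example.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy edge-server cache (hypothetical; LRU eviction is an assumption)."""

    def __init__(self, capacity, push_threshold=1):
        self.capacity = capacity              # maximum number of cached items
        self.push_threshold = push_threshold  # min local subscribers to push
        self.items = OrderedDict()            # item id -> True, in LRU order
        self.hits = 0
        self.requests = 0

    def _store(self, item_id):
        # Insert or refresh the item, then evict least recently used items.
        self.items[item_id] = True
        self.items.move_to_end(item_id)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)

    def on_publish(self, item_id, local_subscribers):
        # Proactive placement: push at publishing time if enough users
        # served by this edge server have subscribed to the content.
        if local_subscribers >= self.push_threshold:
            self._store(item_id)

    def on_access(self, item_id):
        # On-demand placement: a miss fetches the item and caches it.
        self.requests += 1
        if item_id in self.items:
            self.hits += 1
            self.items.move_to_end(item_id)
            return True
        self._store(item_id)
        return False

    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0


cache = EdgeCache(capacity=2, push_threshold=1)
cache.on_publish("story-1", local_subscribers=3)  # pushed: has subscribers
cache.on_publish("story-2", local_subscribers=0)  # not pushed
print(cache.on_access("story-1"), cache.on_access("story-2"))
```

In this sketch the first access to "story-1" is a hit because the item was pushed at
publishing time, while "story-2" misses once and is then cached on demand.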

