Overview
Digital data volume has grown at a phenomenal rate over the past
decade. The "Moore's law curve" (doubling every 18 months) no longer
describes only the exponential improvement of processor performance,
storage density, and network bandwidth; it also describes the data
growth rates of many disciplines. The dominant data types are
feature-rich data such as audio, digital photos, videos, and scientific
sensor data. As we move into a digital society in which all information
is digitized and the world is interconnected by digital means, it is
highly desirable for next-generation systems to give users the ability
to access, search, explore, and manage feature-rich data. A key
component of our research is a general-purpose
similarity search engine. To deliver high-quality similarity search
results at minimal CPU and memory cost, we have developed novel
techniques based on dimension-reduction ideas introduced recently in
the theory community. We use these techniques to construct sketches
from feature vectors. A sketch is a tiny data structure that can be
used to estimate properties of the original data, and it serves as
highly compact metadata for the similarity search engine. This approach
allows us to attack the "curse of dimensionality" problem in the design
of a similarity search engine for feature-rich data.
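
To make the idea of a sketch concrete, here is a minimal illustration in Python. It is not the project's own construction, which this overview does not specify. It substitutes a well-known technique, sign random projections, in which each bit of the sketch records which side of a random hyperplane a feature vector falls on. The function names (make_hyperplanes, sketch, hamming, estimated_angle) and the parameter choices are assumptions made for this example only.

```python
import math
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def sketch(vector, hyperplanes):
    """Pack one bit per hyperplane into an int: 1 if the dot product is >= 0."""
    bits = 0
    for i, plane in enumerate(hyperplanes):
        dot = sum(p * x for p, x in zip(plane, vector))
        if dot >= 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of bit positions in which two sketches differ."""
    return bin(a ^ b).count("1")


def estimated_angle(a, b, n_bits):
    """Estimate the angle (radians) between the original vectors.

    A random hyperplane separates two vectors with probability angle / pi,
    so the fraction of differing bits, scaled by pi, estimates the angle.
    """
    return math.pi * hamming(a, b) / n_bits


def true_angle(u, v):
    """Exact angle between two vectors, for comparison with the estimate."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))


if __name__ == "__main__":
    # Hypothetical 64-dimensional feature vectors, sketched into 256 bits each.
    planes = make_hyperplanes(dim=64, n_bits=256, seed=42)
    rng = random.Random(7)
    u = [rng.gauss(0.0, 1.0) for _ in range(64)]
    v = [x + 0.3 * rng.gauss(0.0, 1.0) for x in u]  # a noisy copy of u
    print("estimated angle:", estimated_angle(sketch(u, planes), sketch(v, planes), 256))
    print("true angle:     ", true_angle(u, v))
```

In this illustration, comparing two items costs one XOR and a bit count on two small integers, regardless of the dimension of the original feature vectors. That is the sense in which a sketch can act as compact metadata for similarity search.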
Copyright © 2005