What makes Big Visual Data hard?
Alexei (Alyosha) Efros, Carnegie Mellon University
There are an estimated 3.5 trillion photographs in the world, of which
10% have been taken in the past 12 months. Facebook alone reports 6
billion photo uploads per month. Every minute, 72 hours of video are
uploaded to YouTube. Cisco estimates that in the next few years, visual
data (photos and video) will account for over 85% of total internet
traffic. Yet we currently lack effective computational methods for
making sense of this mass of visual data. Unlike easily indexed
content, such as text, visual content is not routinely searched or
mined; it's not even hyperlinked. Visual data is the Internet's "digital
dark matter" [Perona, 2010] -- it's just sitting there!
In this talk, I will first discuss some of the unique challenges that
make Big Visual Data difficult compared to other types of content. In
particular, I will argue that the central problem is the lack of a good
measure of similarity for visual data. I will then present some of our
recent work that aims to address this challenge in the context of visual
matching, image retrieval and visual data mining. As an application of
the latter, we used Google Street View data for an entire city in an
attempt to answer that age-old question which has been vexing poets (and
poets-turned-geeks): "What makes Paris look like Paris?"