Information Society Technologies

Search In Audio Visual Content Using Peer-to-peer IR

IST FP6 Project


Technological Objectives


SAPIR proposes a new technological infrastructure for next-generation multimedia search engines. Existing multimedia search is limited to the metadata annotations attached to the multimedia content. SAPIR will instead offer true content-based search that follows the "query by example" paradigm, in which the user supplies multimedia content (images, speech, etc.) as the query input. Combined with optional metadata annotations and the user's personal and social networking contexts, this content-based search will provide the next level of search capability and result precision.
The SAPIR research effort should lead towards a distributed, peer-to-peer (P2P) search engine architecture, as opposed to today's search engines, which rely on a centralized web data warehouse.

This multidisciplinary research project will specifically focus on the following innovative research areas:

  1. Media-specific automatic feature extraction and content classification/understanding
    SAPIR will include media-specific adapters for feature extraction from various types of audio-visual content. This includes text summarization and feature extraction from audio, video, music, and images. The extracted features will be represented in standard formats such as XML, MPEG-7, MPEG-21, MXF, PMETA, and DC, allowing complex queries across multiple media types.

  2. Scalable and distributed (P2P) index structures supporting similarity search
    A centralized architecture cannot scale to the amount of multimedia content, its dynamic nature, and the computing power required for media-specific feature extraction. In SAPIR, we will develop a P2P architecture where feature extraction can occur in one peer and the extracted features can be pushed to an indexing peer. The P2P architecture will provide a scalable indexing structure that can be used for multi-feature search. Caching techniques will be developed to increase system performance.
    Research will be conducted on extending Information Retrieval (IR) search and ranking to support complex queries on multiple media types over the distributed P2P network. A key goal is to develop new query languages in which the query data is of the same nature as the indexed data; for example, an image and text together describe the user's need for similar images. New ranking algorithms will be developed that combine similarity searches over several types of multimedia content with text, metadata, and context-aware search based on the user's personal and social contexts.

  3. Support for multiple devices embedding social networking in a trusted environment
    SAPIR will support uploading/pushing and searching/retrieving multimedia content from a variety of devices, including mobile phones, PDAs, and PCs. User context such as GPS position, query history, and social networking (for groups of users with similar interests) will be used to increase result precision.
    Since content providers are end users themselves, SAPIR aims to study intellectual property rights (IPR) methods (e.g., DRM, MPEG-21) for such a P2P collaborative environment and to develop methods that resolve the conflict between protecting IPR-protected digital content and the ability to analyze and retrieve this content. An additional aspect of trust to be studied is how to identify and prevent spam techniques by which peers deliberately manipulate search results.
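
The combined ranking described in area 2 can be pictured as a weighted fusion of per-feature similarity scores. The sketch below is only an illustration under simple assumptions: the function name `fuse_scores`, the feature names, and the linear weighting scheme are hypothetical and are not taken from the SAPIR design.

```python
from typing import Dict, List, Tuple


def fuse_scores(
    per_feature_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Combine per-feature similarity scores into a single ranking.

    per_feature_scores maps a feature name (e.g. "image", "text") to a
    mapping of document id -> similarity score, assumed to lie in [0, 1].
    A document absent from a feature's results contributes 0 for it.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    combined: Dict[str, float] = {}
    for feature, scores in per_feature_scores.items():
        # Normalise so the weights sum to 1; unknown features get weight 0.
        w = weights.get(feature, 0.0) / total_weight
        for doc_id, score in scores.items():
            combined[doc_id] = combined.get(doc_id, 0.0) + w * score

    # Highest combined score first; ties are broken by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))
```

With equal weights, a document that matches moderately well on both image and text can outrank one that matches strongly on a single feature. A real system would likely need score normalisation per feature before fusing.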

These elements contribute to our vision of an innovative architecture for the next generation of P2P search engines, one that addresses the main shortcomings of today's technology. Our vision is illustrated by the architecture in Figure 1.

Figure 1 - SAPIR components and functions


Figure 1 shows the general application scenario of the project. The sources of the multimedia data (web pages, video blogs, photo blogs, podcasts, etc.) are crawled and indexed by the P2P-based system. In general, there are two types of crawling methods: pull and push. The former is the more traditional: crawlers are responsible for locating, browsing, and gathering the web resources. In the latter case, the resources upload their content directly to the crawler.
The P2P system is composed of a set of peers. Some are user-peers that search and consume audio-visual data, produce their own content, and possibly use their local storage for caching. Others are super-peers that possess crawling, indexing, and searching capabilities. Figure 1 shows a zoom view of one super-peer. The crawler sends the multimedia data (such as images, music, and videos) to the indexing subsystem, which stores it in the form of key-frames, previews, thumbnails, etc. in the "audio-visual content" repository. Then, the indexing subsystem extracts the features from the data and, based on the indexing policy, indexes them in its local "audio-visual index" or distributes them over the P2P network to remote indices. Each super-peer of the P2P network provides a searching service, which either executes the query locally on its audio-visual index or submits the query over the P2P network to remote peers.
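
The super-peer behaviour described above (index locally or on a remote peer, then answer queries locally or over the network) can be sketched as follows. This is a minimal illustration, not the SAPIR implementation: the `SuperPeer` class, its method names, the one-hop forwarding, and the use of Euclidean distance over plain feature vectors are all assumptions made for the example.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


class SuperPeer:
    """Hypothetical super-peer holding a local audio-visual index."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.neighbours: List["SuperPeer"] = []
        self.index: Dict[str, Tuple[float, ...]] = {}

    def add(self, object_id: str, features: Vector,
            target: Optional["SuperPeer"] = None) -> None:
        # Stand-in for the indexing policy: store the features locally
        # unless the caller names a remote peer to receive them.
        (target or self).index[object_id] = tuple(features)

    def search_local(self, query: Vector, k: int) -> List[Tuple[float, str]]:
        # Brute-force nearest neighbours by Euclidean distance.
        hits = [(math.dist(query, f), oid) for oid, f in self.index.items()]
        return sorted(hits)[:k]

    def search(self, query: Vector, k: int = 3,
               use_network: bool = True) -> List[Tuple[float, str]]:
        # Execute locally, then optionally ask direct neighbours (one hop)
        # and merge everything into a single top-k list.
        hits = self.search_local(query, k)
        if use_network:
            for peer in self.neighbours:
                hits.extend(peer.search_local(query, k))
        return sorted(hits)[:k]
```

A distributed similarity index would replace the brute-force scan and the one-hop forwarding used here; the sketch only shows where the local and remote paths diverge.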

SAPIR technology can be used in various areas. Consider, for example, a tourist visiting a European city. In this scenario, the super-peers are large content providers in the city, each of which holds city-specific audio-visual content such as the main attractions, calendar of events, etc. Our tourist is equipped with a PDA or cell phone and is connected through a local carrier to the closest super-peer in that city. Upon approaching a monument, the tourist can take a picture of the artefact with her device and use it as a query input to the closest SAPIR node. The image, together with the user's GPS location and her social context as tracked by SAPIR (subject to privacy issues), is then used to find information about the monument. The SAPIR peer executes the query and, if needed, consults the P2P system for additional peers with relevant information. The results are then adapted to the user's device and returned to the user.
Our visiting tourist can take further pictures and upload them to the closest SAPIR peer for indexing and retrieval by other peers.
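
One way to picture how the GPS position could narrow such a query is to filter candidates by distance before ranking them by image similarity. The sketch below is purely illustrative: the `Candidate` type, the `answer_query` function, the 0.5 km radius, and the feature vectors are hypothetical, and the great-circle distance uses the standard haversine formula with a mean Earth radius of 6371 km.

```python
import math
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    name: str
    lat: float
    lon: float
    features: Sequence[float]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def answer_query(query_features: Sequence[float], lat: float, lon: float,
                 candidates: List[Candidate],
                 radius_km: float = 0.5) -> List[Candidate]:
    # Keep only candidates near the user, then rank by feature distance.
    nearby = [c for c in candidates
              if haversine_km(lat, lon, c.lat, c.lon) <= radius_km]
    return sorted(nearby, key=lambda c: math.dist(query_features, c.features))
```

Filtering by location first means a visually identical object in another city is never returned, which is one plausible reading of how context could increase result precision.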


 
 

Partners

  
IBM Haifa Research Laboratory
ISTI - CNR
Max-Planck Institute for Informatics
University of Padova
Eurix
Xerox Research Centre Europe
Masaryk University
Telefonica Investigacion y Desarrollo
Telenor