Multimedia Communication Systems (3.4.4)

3.4.4 Content-Based Image Retrieval
To meet the challenges they face, multimedia signal-processing methods must support efficient access to, and processing and retrieval of, content in general and visual content in particular. This is required across a wide range of applications in medicine, entertainment, consumer industry, broadcasting, journalism, art and e-commerce. Methods originating from numerous research areas, namely signal processing, pattern recognition, computer vision, database organization, human-computer interaction and psychology, must therefore contribute to achieving the image-retrieval goal. An example of image retrieval is:
Given: A query
Retrieve: All images that have similar content to that of the query.
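The Given/Retrieve formulation can be read as a ranking problem over feature vectors. The sketch below is a minimal illustration in Python, not a method taken from the text: the image names, the three-component feature vectors and the choice of Euclidean distance are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, database, k=3):
    """Return the names of the k database entries closest to the query.

    `database` maps a (hypothetical) image name to its feature vector.
    """
    ranked = sorted(database, key=lambda name: euclidean(query, database[name]))
    return ranked[:k]


# Toy feature vectors with illustrative values only (e.g. mean R, G, B in [0, 1]).
database = {
    "sunset.png": [0.9, 0.4, 0.1],
    "forest.png": [0.1, 0.7, 0.2],
    "ocean.png": [0.1, 0.3, 0.8],
    "desert.png": [0.8, 0.6, 0.3],
}

print(retrieve([0.9, 0.45, 0.1], database, k=2))  # ['sunset.png', 'desert.png']
```

In a real system the feature vectors would come from descriptors such as those discussed in the rest of this section, and the distance measure would be chosen to suit them.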
Image-retrieval methods face several challenges when addressing this goal [3.68]. These challenges, which are summarized in Table 3.1, cannot be addressed by text-based image retrieval systems, whose performance has so far been unsatisfactory. In these systems, the query keywords are matched with keywords that have been associated with each image. Because relevant keywords are difficult to select automatically, time-consuming and subjective manual annotation is required. Moreover, the vocabulary is limited and must be expanded as new applications emerge.
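The keyword matching described above, and its dependence on the annotator's vocabulary, can be illustrated with a short Python sketch. The file names and annotations are hypothetical, and exact case-insensitive matching is an assumption chosen for simplicity.

```python
def keyword_search(query_keywords, annotations):
    """Return the names of images whose annotations share a keyword with the query."""
    query = {word.lower() for word in query_keywords}
    return sorted(
        name
        for name, words in annotations.items()
        if query & {w.lower() for w in words}
    )


# Hypothetical manual annotations.
annotations = {
    "img_001.jpg": ["beach", "sunset"],
    "img_002.jpg": ["dog", "park"],
    "img_003.jpg": ["seashore", "evening"],
}

print(keyword_search(["beach"], annotations))  # ['img_001.jpg']
```

The third image shows similar content but was annotated with "seashore", so the query "beach" misses it. This is the subjectivity and limited-vocabulary problem in miniature.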
To improve performance and address these problems, content-based image retrieval methods have been proposed. These methods have generally focused on low-level features such as color, texture and shape layout, mainly because such features can be extracted automatically or semiautomatically.
Texture-Based Methods
Statistical and syntactic texture description methods have been proposed. Methods based on spatial frequencies, co-occurrence matrices and multiresolution analysis have been frequently employed for texture description because of their efficiency [3.69]. Methods based on spatial frequencies evaluate the coefficients of the autocorrelation function of the texture. Co-occurrence matrices identify repeated occurrences of gray-level pixel configurations within the texture.
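As an illustration of the co-occurrence idea, the following Python sketch counts how often a gray level g1 is found with gray level g2 at a fixed pixel offset. The sparse dictionary representation, the offset convention (dx along columns, dy along rows) and the striped test image are assumptions made for this example.

```python
from collections import Counter


def cooccurrence(image, dx=1, dy=0):
    """Count pairs (g1, g2) where g2 lies at offset (dx, dy) from g1.

    `image` is a list of rows of gray levels; pairs that would fall
    outside the image are skipped.
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    return counts


# Vertical stripes: gray levels alternate along each row.
stripes = [
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
]

horizontal = cooccurrence(stripes, dx=1, dy=0)
vertical = cooccurrence(stripes, dx=0, dy=1)
print(dict(horizontal))  # {(0, 1): 6, (1, 0): 3}
print(dict(vertical))    # {(0, 0): 4, (1, 1): 4}
```

For the striped pattern, horizontal neighbors always differ while vertical neighbors are always equal, so the two matrices capture the orientation of the texture.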
Table 3.1 Image retrieval challenges [3.68].
Challenges               Remarks
Query types              Color based/shape based/color and shape based
Query forms              Quantitative, for example, find all images with 30% red
                         Query by example, for example, image region/image/sketch/other examples
Various content          For example, natural scenes/head-and-shoulder images/MRIs
Matching types           Object to object/image to image/object to image
Precision levels         Application specific
                         Exact versus similarity-based match
Presentation of results  Application specific
Multiresolution methods describe the texture characteristics at coarse-to-fine resolutions. A major problem with most texture description methods is their sensitivity to scale: the texture characteristics may disappear at low resolutions or may contain a significant amount of noise at high resolutions [3.70, 3.71, 3.72].
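The loss of texture at low resolution can be demonstrated with a small Python sketch. It builds a coarse-to-fine pyramid by averaging 2x2 blocks, which is only one simple way to obtain lower resolutions and is assumed here for illustration, and uses the gray-level variance as a crude stand-in for a texture measure.

```python
def downsample(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, len(image[0]) - 1, 2)
        ]
        for r in range(0, len(image) - 1, 2)
    ]


def variance(image):
    """Variance of all gray levels in the image."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


# A one-pixel checkerboard: the finest texture an 8x8 image can hold.
checker = [[255 * ((r + c) % 2) for c in range(8)] for r in range(8)]

level = checker
for depth in range(3):
    print(depth, len(level), variance(level))
    level = downsample(level)
# 0 8 16256.25
# 1 4 0.0
# 2 2 0.0
```

After a single halving step, every block averages to the same gray level and the checkerboard texture is gone, which is the scale sensitivity noted above.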
Shape-Based Methods
Quantitatively describing the shape of an object is a difficult task. Several contour-based and region-based shape description methods have been proposed. Chain codes, geometric border representations, Fourier transforms of the boundaries, polygonal representations and deformable (active) models are some of the contour-based methods that have been employed for shape description. Simple scalar region descriptors, moments, region decompositions and region neighborhood graphs are region-based methods that have been proposed for the same task [3.73, 3.74]. Contour-based and region-based methods are developed in either the spatial or the transform domain, which gives the resulting shape descriptors different properties. The main problems associated with shape description methods are high sensitivity to scale, the difficulty of describing object shapes and the high subjectivity of the retrieved shape results.
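One contour-based and one region-based descriptor can be sketched in a few lines of Python. The first function computes a Freeman-style 8-direction chain code from an ordered boundary, and the second computes the area and centroid of a binary mask (the zeroth- and first-order moments). The direction numbering, the axis convention and the input formats are assumptions made for this example.

```python
# 8-direction codes, with x to the right and y upward (a convention chosen here).
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(boundary):
    """Encode a closed boundary given as ordered, 8-connected (x, y) points."""
    closed = boundary[1:] + boundary[:1]  # pair each point with its successor
    return [DIRECTIONS[(x1 - x0, y1 - y0)] for (x0, y0), (x1, y1) in zip(boundary, closed)]


def area_and_centroid(mask):
    """Area and centroid of a binary mask (list of rows of 0/1 values)."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(points)
    cx = sum(x for x, _ in points) / area
    cy = sum(y for _, y in points) / area
    return area, (cx, cy)


square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(chain_code(square))  # [0, 0, 2, 2, 4, 4, 6, 6]

mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(area_and_centroid(mask))  # (4, (1.5, 0.5))
```

Doubling the side of the square doubles the length of its chain code, a small instance of the scale sensitivity mentioned above.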
Color-Based Methods
Color description methods are generally based on color histograms, dominant colors or color moments [3.75, 3.76]. Description methods that employ color histograms use a quantitative representation of the distribution of color intensities. Description methods that employ dominant colors use a small number of color ranges to construct an approximate representation of the color distribution. Description methods that use color moments employ statistical measures of the image characteristics in terms of color.
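The three families of color descriptors can be compared on a toy example. The Python sketch below assumes 8-bit RGB pixels, uniform quantization into a few levels per channel, and the mean and standard deviation as the color moments; these choices are illustrative and are not prescribed by the text.

```python
from collections import Counter


def quantize(pixel, levels=2):
    """Map an (R, G, B) pixel with 8-bit channels to a coarse color bin."""
    return tuple(channel * levels // 256 for channel in pixel)


def color_histogram(pixels, levels=2):
    """Normalized histogram over the quantized color bins."""
    counts = Counter(quantize(p, levels) for p in pixels)
    return {color_bin: n / len(pixels) for color_bin, n in counts.items()}


def dominant_colors(pixels, n=2, levels=2):
    """The n most frequent color bins, most frequent first."""
    hist = color_histogram(pixels, levels)
    return sorted(hist, key=hist.get, reverse=True)[:n]


def color_moments(pixels):
    """Per-channel (mean, standard deviation) for the R, G and B channels."""
    moments = []
    for ch in range(3):
        values = [p[ch] for p in pixels]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        moments.append((mean, var ** 0.5))
    return moments


# Ten pixels: six red, three blue, one green.
pixels = [(255, 0, 0)] * 6 + [(0, 0, 255)] * 3 + [(0, 255, 0)]

print(color_histogram(pixels))  # {(1, 0, 0): 0.6, (0, 0, 1): 0.3, (0, 1, 0): 0.1}
print(dominant_colors(pixels))  # [(1, 0, 0), (0, 0, 1)]
print(color_moments(pixels)[0][0])  # 153.0 (mean of the red channel)
```

The histogram keeps the full distribution, the dominant colors keep only its largest bins, and the moments summarize each channel with a few statistics.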
The performance of these methods typically depends on the color space, the quantization and the distance measures employed for evaluating the retrieved results. The main problem with histogram-based and dominant-color-based methods is that they cannot localize an object within the image. One solution to this problem is to apply color segmentation, which allows both image-to-image matching and object localization. The main problem of color-moment-based methods is their complexity, which makes them difficult to apply to browsing or other image-retrieval functionalities.
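Both the role of the distance measure and the localization problem can be seen in a small Python sketch. Histogram intersection and the L1 distance are used here as two common examples of histogram comparison measures; they are assumptions of this sketch rather than measures named in the text, and the color labels stand in for quantized color bins.

```python
from collections import Counter


def histogram(image):
    """Normalized histogram of the pixel values in a 2-D image (list of rows)."""
    pixels = [p for row in image for p in row]
    return {value: n / len(pixels) for value, n in Counter(pixels).items()}


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(h1.get(v, 0.0), h2.get(v, 0.0)) for v in set(h1) | set(h2))


def l1_distance(h1, h2):
    """Sum of absolute bin differences: 0.0 for identical histograms."""
    return sum(abs(h1.get(v, 0.0) - h2.get(v, 0.0)) for v in set(h1) | set(h2))


# The same red object on the left of one image and on the right of the other.
object_left = [["red", "red", "sky", "sky"], ["red", "red", "sky", "sky"]]
object_right = [["sky", "sky", "red", "red"], ["sky", "sky", "red", "red"]]
only_sky = [["sky"] * 4, ["sky"] * 4]

h_left, h_right, h_sky = histogram(object_left), histogram(object_right), histogram(only_sky)
print(intersection(h_left, h_right), l1_distance(h_left, h_right))  # 1.0 0.0
print(intersection(h_left, h_sky), l1_distance(h_left, h_sky))      # 0.5 1.0
```

The first two images are indistinguishable by their histograms even though the object has moved, which is why segmentation is needed before an object can be localized.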
Examples of content-based image and video-retrieval systems are included in Table 3.2. These systems have some or all of the following limitations [3.68]:
• Few query types are supported
• Limited set of low-level features
• Difficult access to visual objects
• Results only partially match the user's expectations
• Limited interactivity with the user
• Limited system interoperability
• Scalability problems
Table 3.2 Examples of content-based image and video-retrieval systems [3.68].
Features                                    System              Image/Video  Provider
Color and text                              WebSeek             I, V         Columbia University
                                            Picasso             I            University of Florence
                                            Chabot              I            University of California, Berkeley
                                            *                   I            University of Toronto
Color, texture and shape                    QBIC                I            IBM
                                            PhotoBook           I            MIT
                                            BlobWorld           I            University of California, Berkeley
                                            VIR                 I, V         Virage
Color, shape and scale                      Nefertiti           I            National Research Council of Canada
Color, texture, shape and spatial location  NeTra               I            University of California, Santa Barbara
                                            Digital storyboard  I            Kodak
Color, texture and motion                   WebClip             V            Columbia University
                                            Jacob               I, V         University of Palermo
                                            *                   V            IMAX
N/A                                         *                   V            NASA
* No name has been adopted for the corresponding system.