Climbing the Tang Gongling Mausoleum

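On a work trip, I stopped by the Tang Gongling Mausoleum. It was late autumn: the west wind whistled and the withered grass rustled. I climbed the steps and stood on top of the high royal tomb, looking out in all directions. The setting sun, red as blood, was close to the horizon. To the south rose the towering Songshan range, to the north flowed the long Yiluo River, and in the open fields a handful of farmers were working the land. Seen from closer range, the tomb rises abruptly from the plain, massive and imposing. The tombs of the consorts buried alongside, the corner towers that guarded the mausoleum, and the four gate towers to the east, west, south and north all still leave visible remains. The statues lining both sides of the tomb passage, after a thousand years of wind and frost, still stand as they did at first. I could not help but sigh: this truly is the grandeur of the High Tang!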

Multimedia Communication Systems (3.4.4)

3.4.4 Content-Based Image Retrieval
To address their challenges, multimedia signal-processing methods must allow efficient access to processing and retrieval of content in general, and visual content in particular. This is required across a large range of applications, in medicine, entertainment, consumer industry, broadcasting, journalism, art and e-commerce. Therefore, methods originating from numerous research areas, that is, signal processing, pattern recognition, computer vision, database organization, human-computer interaction and psychology, must contribute to achieving the image-retrieval goal. An example of image retrieval is:
Given: A query
Retrieve: All images that have similar content to that of the query.
为应付挑战,多媒体信号处理方法必须有效地接入以处理和复现常规和特殊的视频内容。对于很大范围的应用这都是必须的,例如医学、娱乐、消费工业、广播、新闻业、艺术以及电子商务。因此,源于各个研究领域的方法,信号处理、模式识别、计算机视觉、数据库组织、人机交互和心理学,都为实现图像复现的目标作出了贡献。图像复现的一个例子是:
给定:一个查询
检索:所有与查询具有相似内容的图像。
Image-retrieval methods face several challenges when addressing this goal [3.68]. These challenges, which are summarized in Table 3.1, cannot be addressed by text-based image retrieval systems, which have had an unsatisfactory performance so far. In these systems, the query keywords are matched with keywords that have been associated with each image. Because relevant keywords are difficult to select automatically, time-consuming and subjective manual annotation is required. Moreover, the vocabulary is limited and must be expanded as new applications emerge.
对于给定的这个目标,图像复现法面临几个挑战。这些挑战列在表3.1中,基于文本的图像复现系统无法解决,迄今为止,它的性能不能令人满意。在这些系统中,查询关键词与已经对每一图像建立关联的关键词相匹配。由于自动选择相应关键词的困难,因此要耗费时间并需要个人人工注解。此外,当有新的应用时,还必须对有限的词汇进行扩展。
To improve performance and address these problems, content-based image retrieval methods have been proposed. These methods have generally focused on using low-level features such as color, texture and shape layout, for image retrieval, mainly because such features can be extracted automatically or semiautomatically.
为改善性能解决问题,已经提出了基于内容的图像复现法。这些方法通常聚焦于低级特征,例如色彩、纹理和外形轮廓,对于图像复现,这些特征能够自动或半自动地提取。
Texture-Based Methods
Statistical and syntactic texture description methods have been proposed. Methods based on spatial frequencies, co-occurrence matrices and multiresolution methods have been frequently employed for texture description because of their efficiency [3.69]. Methods based on spatial frequencies evaluate the coefficients of the autocorrelation function of the texture. Co-occurrence matrices identify repeated occurrences of gray-level pixel configurations within the texture.
已经提出统计和合成纹理描述法。这些方法基于空间频率,共生矩阵及多分辨率法由于效率高而常被纹理描述运用。基于空间频率的方法估计纹理的自相关函数。共生矩阵确定纹理的灰度级象素组态。
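As a rough illustration of the co-occurrence idea described above, the following Python sketch counts how often a gray level i has a gray level j as its neighbor at a fixed pixel offset, and then derives one simple statistic (contrast) from those counts. The function names, the offset parameters dx and dy, and the choice of contrast as the feature are assumptions made for this example; they are not taken from [3.69].

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count gray-level pairs (i, j) where j lies at offset (dx, dy) from i.

    `image` is a list of rows of integer gray levels in range(levels).
    The offset (dx, dy) is an illustrative parameter, not from the source.
    """
    rows, cols = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= ny < rows and 0 <= nx < cols:
                matrix[image[y][x]][image[ny][nx]] += 1
    return matrix


def contrast(matrix):
    """Average squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total
```

A smooth texture concentrates its counts near the diagonal of the matrix and therefore yields a low contrast value, whereas a busy texture spreads them away from the diagonal.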
Table 3.1 Image retrieval challenges [3.68].
Challenges               Remarks
Query types              Color based/shape based/color and shape based
Query forms              Quantitative, for example, find all images with 30% amount of red
                         Query by example, for example, image region/image/sketch/other examples
Various content          For example, natural scenes/head-and-shoulder images/MRIs
Matching types           Object to object/image to image/object to image
Precision levels         Application specific
                         Exact versus similarity-based match
Presentation of results  Application specific
Multiresolution methods describe the texture characteristics at coarse-to-fine resolutions. A major problem that is associated with most texture description methods is their sensitivity to scale, that is, the texture characteristics may disappear at low resolutions or may contain a significant amount of noise at high resolutions [3.70, 3.71, 3.72].
多分辨率法以由粗到细的分辨率描述纹理特征。主要问题与大多数纹理描述法的敏感度有关,即,在低分辨率时纹理特征可能消失,在高分辨率时又可能包含大量噪声。
Shape-Based Methods
Describing quantitatively the shape of an object is a difficult task. Several contour-based and region-based shape description methods have been proposed. Chain codes, geometric border representations, Fourier transforms of the boundaries, polygonal representations and deformable (active) models are some of the boundary-based shape methods that have been employed for shape description. Simple scalar region descriptors, moments, region decompositions and region neighborhood graphs are region-based methods that have been proposed for the same task [3.73, 3.74]. Contour-based and region-based methods are developed in either the spatial or transform domains, yielding different properties of the resulting shape descriptors. The main problems that are associated with shape description methods are high sensitivity to scale, difficult shape description of objects and high subjectivity of the retrieved shape results.
量化描述一个对象的形状是一个困难的任务。已经提出了几个基于轮廓和基于区域的形状描述法。链码、几何边框表示法、边界的傅立叶变换、多边形表示法以及可变形(主动的)模型是一些基于边界的形状法,已经用于形状描述。简单梯形区域描述符、矩、区域分解和邻域图是已经使用的基于区域法。
Color-Based Methods
Color description methods are generally color histogram based, dominant color based and color moment based [3.75, 3.76]. Description methods that employ color histograms use a quantitative representation of the distribution of color intensities. Description methods that employ dominant colors use a small number of color ranges to construct an approximate representation of color distribution. Description methods that use color moments employ statistical measures of the image characteristics in terms of color.
色彩描述法通常基于色彩直方图、基于支配色、基于色矩。使用色彩直方图的描述法用色强度分布的定量表示。使用支配色的描述法用少量的色彩范围构造色彩分布的近似表示。用色矩的描述法用图像特征在色彩上的统计度量。
The performance of these methods typically depends on the color space, quantization, and distance measures employed for evaluation of the retrieved results. The main problem that is associated with histogram-based and dominant-color-based methods is their inability to allow the localization of an object within the image. A solution to address this problem is to apply color segmentation, which allows both image-to-image matching and object localization. The main problem of color-moment-based methods is their complexity, which makes their application to browsing or other image-retrieval functionalities difficult.
这些方法的性能特别依赖于色彩空间、量化和检索结果的评估使用的距离测量。基于直方图和基于支配色法的主要问题是它们无法对图像中的对象定位。解决这个问题的办法是采用色彩分割,它既考虑图像对图像的匹配,又考虑对象定位。基于色矩法的主要问题是它们的复杂性,使它们在浏览及其它图像复现应用中产生功能性困难。
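The dependence on color space, quantization and distance measure can be made concrete with a small Python sketch of histogram-based retrieval. It assumes 8-bit RGB pixels, a uniform quantization into 4 bins per channel, and histogram intersection as the similarity measure; all three choices, and the function names, are assumptions for illustration rather than the methods of [3.75, 3.76].

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a non-empty list of (r, g, b) pixels."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map the quantized (r, g, b) triple to a single bin index.
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query_pixels, database, top_k=3):
    """Return the names of the top_k database images most similar to the query."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]
```

Because the histogram discards pixel positions, two images with the same color proportions score as identical regardless of where the colors appear, which is the localization problem noted above.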
Examples of content-based image and video-retrieval systems are included in Table 3.2. Some or all of the limitations of these systems are the following [3.68]:
• Few query types are supported
• Limited set of low-level features
• Difficult access to visual objects
• Results partially match user's expectations
• Limited interactivity with the user
• Limited system interoperability
• Scalability problems
基于内容的图像和视频复现系统的例子列于表3.2中。这些系统的局限如下:
• 支持的查询类型少
• 低级特征量有限
• 难以接入视觉对象
• 结果与使用者期望部分匹配
• 与使用者互动有限
• 系统互操作性有限
• 可扩缩性问题
Table 3.2 Examples of content-based image and video-retrieval systems [3.68].
Features                                    System              Image/Video  Provider
Color and text                              WebSeek             I, V         Columbia University
                                            Picasso             I            University of Florence
                                            Chabot              I            University of California, Berkeley
                                            *                   I            University of Toronto
Color, texture and shape                    QBIC                I            IBM
                                            PhotoBook           I            MIT
                                            BlobWorld           I            University of California, Berkeley
                                            VIR                 I, V         Virage
Color, shape and scale                      Nefertiti           I            National Research Council of Canada
Color, texture, shape and spatial location  NeTra               I            University of California, Santa Barbara
                                            Digital storyboard  I            Kodak
Color, texture and motion                   WebClip             V            Columbia University
                                            Jacob               I, V         University of Palermo
                                            *                   V            IMAX
N/A                                         *                   V            NASA
* No name has been adopted for the corresponding system.

Multimedia Communication Systems (3.4.3)

3.4.3 Video Signal Processing
Digital video has many advantages over conventional analog video, including bandwidth compression, robustness against channel noise, interactivity and ease of manipulation. Digital-video signals come in many formats. Broadcast TV signals are digitized with ITU-R 601 format, which has 30/25 fps, 720 pixels by 488 lines per frame, 2:1 interlaced, 4:3 aspect ratio, and 4:2:2 chroma sampling. With the advent of high-definition digital video, standardization efforts between the TV and PC industries have resulted in the approval of 18 different digital video formats in the United States. Exchange of video signals between TVs and PCs requires effective format conversion. Some commonly used intraframe/field filters for format conversion, for example, ITU-R 601 to the Source Input Format (SIF) and vice versa and 3:2 pull-down to display 24 Hz motion pictures in 60 Hz format, have been reviewed [3.57]. As for video filters, they can be classified as intraframe/field (spatial), motion-adaptive and motion-compensated filters [3.58]. Spatial filters are easiest to implement. However, they do not make use of the high temporal correlation in the video signals. Motion-compensated filters require highly accurate motion estimation between successive views. Other more sophisticated format conversion methods include motion-adaptive field-rate doubling and deinterlacing [3.59] as well as motion-compensated frame-rate conversion [3.58].
与传统的模拟视频相比,数字视频有很多优点,包括带宽压缩、抗信道噪声、交互性和易于操作。数字视频信号有很多格式。广播电视信号以ITU-R 601格式数字化,帧频为30/25 fps,每帧720象素、488线,2:1隔行,4:3宽高比,4:2:2色度抽样。随着高清晰度数字视频的出现,美国TV和PC行业之间的标准化努力的结果是批准了18种数字视频格式。TV和PC之间视频信号交换需要进行格式转换。一些共用的格式转换帧间/场滤波器已经接受评审,例如,ITU-R 601到SIF(源输入格式)及其反向转换、在60 Hz格式中3:2下降到24 Hz动画显示。视频滤波器可以分类为帧间/场滤波器(空间)、运动自适应和运动补偿滤波器。空间滤波器最容易实现。但是,它们不能用于时间关联度高的视频信号。运动补偿滤波器需要相邻图像之间非常精确的运动估计。其它更复杂的格式转换方法包括运动自适应场频倍增和去隔行以及运动补偿帧频转换。
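The 3:2 pull-down mentioned above can be sketched in a few lines of Python. The sketch assumes that film frames are alternately held for three and two fields, starting with three, and that field parity simply alternates between top and bottom; real equipment may start the cadence at a different phase, and the function name is hypothetical.

```python
def pulldown_3_2(frames):
    """Map 24 Hz film frames to a 60 Hz field sequence.

    Frames are repeated for 3, 2, 3, 2, ... fields (assumed cadence phase),
    so every 4 film frames yield 10 video fields.
    """
    fields = []
    for index, frame in enumerate(frames):
        repeat = 3 if index % 2 == 0 else 2
        for _ in range(repeat):
            # Parity alternates over the whole output sequence.
            parity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, parity))
    return fields
```

One second of film (24 frames) produces 60 fields, which matches the 60 Hz field rate of the display format.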
Video signals suffer from several degradations and artifacts. Some of these degradations may be acceptable under certain viewing conditions. However, they become objectionable for freeze-frame or printing-from-video applications. Some filters are adaptive to scene content in that they aim to preserve spatial and temporal edges while removing the noise. Examples of edge-preserving filters include median, weighted median, adaptive linear mean square error and adaptive weighted-averaging filtering [3.58]. Deblurring filters can be classified as those that require a model of the degradation process (inverse, constrained least squares, and Wiener filtering) and those that do not (contrast adjustment by histogram specification and unsharp masking). Deblocking filters smooth intensity variations across block boundaries. Video signals also contain large amounts of temporal redundancy. Namely, successive frames generally have large overlaps with each other. Assuming that frames are shifted by subpixel amounts with respect to each other, it is possible to exploit this redundancy to obtain a high-resolution reference image (mosaic) of the regions covered in multiple views [3.60]. High-resolution reconstruction methods employ least-squares estimation, back projection, or projection onto convex sets (POCS) methods based on a simple instantaneous camera model or a more sophisticated camera model including motion blur [3.61].
视频信号受到劣化和认为干扰。某些劣化在一定条件下可以接受。但是,凝结帧就令人讨厌了。某些滤波器适用于景物内容,在那种情况下,它们是用来在保持空间和时间边沿的同时去除噪声。边沿保持滤波器的例子包括中值、加权中值、自适应线性均方差以及自适应加权平均滤波。分解块滤波器分为需要劣化过程模型的(反转、受迫、最小平方、维纳滤波)和不需要劣化过程模型的(直方图规范调节对比度和模糊掩蔽)。分解块滤波器平滑时间冗余总量的变化强度。一般地,连续帧通常有大量的相互重叠。假设子象素数量相互关联的帧移动,利用这种冗余就能够获得多重图像区域覆盖下的图像(马赛克)的高分辨率基准。高分辨率重构法采用最小平方估计、后向投影以及基于简单的即时摄像机模型或更复杂的包括运动模糊的摄像机模型的投影自弯曲调整法。
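To illustrate what "edge preserving" means for the median filter named above, here is a minimal one-dimensional Python sketch applied to a single scanline. The window length of 3 and the edge-replication padding are assumptions chosen for brevity, and the function name is hypothetical.

```python
from statistics import median


def median_filter_1d(signal, window=3):
    """Sliding-window median with edge replication; `window` must be odd."""
    half = window // 2
    # Repeat the first and last samples so every output has a full window.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(signal))]
```

For the scanline [10, 10, 200, 10, 10, 90, 90, 90], the isolated impulse 200 is removed while the step from 10 to 90 stays sharp; a moving average of the same length would smear both.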
One of the challenges in digital video processing is to decompose a video sequence into its elementary parts (shots and objects). A video sequence is a collection of shots, a shot is a group of frames and each frame is composed of synthetic or natural visual objects. Thus, temporal segmentation generally refers to finding shot boundaries, spatial segmentation corresponds to extraction of visual objects in each frame and object tracking means establishing correspondences between the boundaries of objects in successive frames.
数字视频处理的课题之一是把图像序列分解为基本单元(镜头与对象)。图像序列是镜头的集合,一个镜头是一组帧,每一帧是由合成或自然的视频对象组成的。因此,时间分割一般涉及寻找镜头边界,空间分割对应于从每一帧里提取视频对象,对象跟踪就是使相继帧中对象的边界相互一致。
Temporal segmentation methods detect edit effects such as cuts, dissolves, fades and wipes. Thresholding and clustering using histogram-based similarity methods have been found effective for detection of cuts [3.62]. Detection of special effects with high accuracy requires customized methods in most cases and is a current research topic. Segmentation of objects by means of chroma keying is relatively easy and is commonly employed. However, automatic methods based on color, texture and motion similarity often fail to capture semantically meaningful objects [3.63]. Semiautomatic methods, which aim to help a human operator perform interactive segmentation by tracking boundaries of a manual initial segmentation, are usually required for object-based video editing applications. Object-tracking algorithms, which can be classified as boundary-based, region-based or model-based tracking methods, can be based on 2D or 3D object representations. Effective motion analysis is an essential part of digital video processing and remains an active research topic.
时间分割法编辑特技如切换、叠化、淡变和划变。已经发现用基于直方图的相似法确定阈值和分组可以有效地检测切换。对特技的高准确度检测在大多数情况下需要定制法,这是目前的研究课题。用色键分割对象相对容易,是目前普遍采用的方法。但是,基于色彩、纹理和运动相似性的自动法捕获有意义的对象时常常失败。半自动法,它的目标是通过跟踪人工初步分割的边界帮助人们完成交互分割,通常需要基于对象的视频编辑软件。对象跟踪算法,可以分为边界区域或基于模型跟踪法,可以以2D及3D对象表示为基础。有效地运动分析是数字视频分析的基本部分并仍是活跃的研究课题。
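A minimal Python sketch of cut detection by thresholding a histogram-based distance between successive frames follows. It assumes grayscale frames of equal size given as lists of rows, 8 histogram bins, and a threshold of 0.5 on the normalized absolute histogram difference; these values and the function names are illustrative assumptions, not parameters from [3.62].

```python
def gray_histogram(frame, bins=8):
    """Histogram of 8-bit gray levels for a frame given as a list of rows."""
    step = 256 // bins
    hist = [0] * bins
    for row in frame:
        for value in row:
            hist[value // step] += 1
    return hist


def histogram_distance(h1, h2):
    """Normalized absolute difference in [0, 1] for equally sized frames."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / (2.0 * sum(h1))


def detect_cuts(frames, threshold=0.5):
    """Return indices i where a cut is declared between frame i-1 and frame i."""
    cuts = []
    previous = gray_histogram(frames[0])
    for index in range(1, len(frames)):
        current = gray_histogram(frames[index])
        if histogram_distance(previous, current) > threshold:
            cuts.append(index)
        previous = current
    return cuts
```

A fixed threshold of this kind responds to abrupt cuts; gradual effects such as dissolves and fades change the histogram slowly and would need the customized methods mentioned above.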
Storage and archiving of digital video in shared disks and servers in large volumes, browsing of such databases in real time and retrieval across switched and packet networks pose many new challenges, one of which is efficient and effective description of content. The simplest method to index content is by assigning manually or semiautomatically the content to programs, shots and visual objects [3.64]. It is of interest to browse and search for content using compressed data because almost all video data will likely be stored in compressed format [3.65]. Video-indexing systems may employ a frame-based, scene-based or object-based video representation. The basic components of a video-indexing system are temporal segmentation, analysis of indexing features and visual summarization. The temporal-segmentation step extracts shots, scenes and/or video objects. The analysis step computes content-based indexing features for the extracted shots, scenes, or objects. Content-based features may be generic or domain dependent. Commonly used generic indexing features include color histograms, type of camera motion, direction and magnitude of dominant object motion, entry and exit instances of objects of interest, and shape features for objects [3.66, 3.67]. Domain-dependent feature extraction requires a priori knowledge about the video source, such as news programs, particular sitcoms, sportscasts and particular movies. Content-based browsing can be facilitated by a visual summary of the contents of a program, much like a visual table of contents. Among the proposed visual summarization methods are story boards, visual posters and mosaic-based summaries.
数字视频的存档在共享盘和服务器中要占据大量空间,对这样的数据库的实时浏览和通过包交换网络的重现引出许多新的挑战,其中之一是内容描述的效率和有效性。检索内容最简单的办法是人工或半人工地给节目、镜头和视频对象做目录。重要的是,用压缩的数据对内容进行浏览和检索,因为几乎所有的视频数据可能都是用压缩格式保存的。视频索引系统可以用基于帧、基于景物以及基于对象的视频表示。视频索引系统的基本组成是时间分割、索引特征分析和画面摘要。时间分割阶段是提取镜头、景物和/或视频对象。分析阶段是对提取的镜头、景物以及视频对象计算基于内容的索引特征。基于内容的特征可以随类或域而定。一般用类索引特征包括色彩直方图、摄像机运动的类型、主要对象的运动方向和幅度、重要对象进出的情况、对象的形状特征。与域有关的特征摘要需要关于视频源的知识,例如新节目、一部连续剧、比赛实况转播和一部电影。基于内容的浏览可以借助于节目内容的视频摘要,它很像内容的视频表格。被推荐的视频摘要法有故事板、视频海报和基于马赛克的摘要。

Multimedia Communication Systems (3.4.2)

3.4.2 Speech, Audio and Acoustic Processing for Multimedia
The primary advances in speech and audio signal processing that contributed to multimedia applications are in the areas of speech and audio signal compression, speech synthesis, acoustic processing, echo control and network echo cancellation.
语音和音频信号处理的改进对多媒体应用的贡献在下述范围:语音和音频信号压缩、语音合成、声学处理、回声控制以及网络回声消除。
Figure 3.2 Block diagram for audio-assisted head and shoulder video [3.36]. ©1998 IEEE.
Speech and audio signal compression. Signal compression techniques aim at efficient digital representation and reconstruction of speech and audio signals for storage and playback as well as transmission in telephony and networking.
语音和音频信号压缩 信号压缩技术的目标是为电话和网络中的存储、重放和传输进行语音和音频信号的有效的数字表示和重建。
Signal-analysis techniques such as Linear Predictive Coding (LPC) [3.37], all-pole autoregressive modeling [3.38] and Fourier analysis [3.39] have played a central role in signal representation. For compression, vector quantization (VQ) [3.40, 3.41] marks a major advance. These techniques are built upon rigorous mathematical frameworks that have become part of the important bases of digital signal processing. Incorporation of knowledge and models of psychophysics in hearing has proven beneficial for speech and audio processing. Techniques such as noise shaping [3.42] and explicit use of auditory masking in the perceptual audio coder [3.43] have been found very useful. Today, excellent speech quality can be obtained at less than 8 Kb/s, which forms the basis for cellular as well as Internet telephony. The fundamental structure of the Code-Excited Linear Prediction (CELP) coder is ubiquitous in supporting speech coding at 4 to 16 Kb/s, encompassing such standards as G.728 [3.44], G.729 [3.45], G.723.1, IS-54 [3.46], IS-136 [3.47], GSM [3.48] and FS-1016 [3.49]. CD or near-CD-quality stereo audio can be achieved at 64 to 128 Kb/s, less than one twelfth of the original CD rate, and is ready for such applications as Internet audio (streaming and multicasting) and digital radio (digital audio broadcast). Advances in audio-coding standards are supported in MPEG activities.
信号分析技术例如线性预测编码(LPC)、全极点自回归模型和傅立叶分析在信号表示中扮演着主要角色。对于压缩,VQ标志着一个重要进步。这些技术都建立在严格的数学框架之上,并已成为数字信号处理的重要基础的一部分。语音和音频信号处理已经从听觉的精神物理学知识与模型的结合中获得了益处。噪声频谱成型一类的技术和听觉屏蔽在知觉音频编码器中的直接应用已被发现非常有用。今天,在低于8 Kb/s的条件下已能获得极好的音质,这已成为蜂窝以及因特网电话的基础。码激励线性预测编码器的基本结构已经普遍用于支持4~16 Kb/s速率的语音编码,包括G.728 [3.44], G.729 [3.45], G.723.1, IS-54 [3.46], IS-136 [3.47], GSM [3.48] and FS-1016 [3.49]等标准。在64 ~128 Kb/s可达到CD或接近CD质量的立体声,速率低于CD码率的十二分之一,已经用于因特网音频(流和多播)和数字广播(数字音频广播)。MPEG支持音频编码标准的改进。
Speech synthesis. The area of speech synthesis includes generation of speech from unlimited text, voice conversion and modification of speech attributes such as time scaling and articulatory mimic [3.50]. Text-to-speech conversion takes text as input and generates human-like speech as output [3.51]. Key problems in this area include conversion of text into a sequence of speech inputs (in terms of phonemes, dyads or syllables), generation of the associated prosodic structure and intonation and methods to concatenate and reconstruct the sound waveform. Voice conversion refers to the technique of changing one person's voice to another, from person A to person B or from male to female and vice versa. It is useful to be able to change the time scale of a signal (to speed up or slow down the speech signal without changing the pitch) or to change the mode of the speech (making it sound happy or sad) [3.52]. Many of these signal-processing techniques have appeared in animation and computer graphics applications.
语音合成 语音合成的范围包括来自无约束文本语音的产生、话音语音特征例如时间尺度的转换和修改以及拟声。文本到语音转换以文本为输入,以产生的类人语音为输出。这个领域的关键问题包括文本变换到语音输入序列(术语叫音素或音节)、建立语法结构与音调的关联以及连接和重建声音波形的方法。话音转换涉及到把一个人的声音变为另一个人的技术,从人A到人B以及从男到女等等。能够改变信号的时间尺度(语音信号的快速或慢速以改变音调)或者语音的模式(欢快或悲愁的声音)是非常有用的。许多这些信号处理技术已经出现在动画和计算机图形这些应用中。
Acoustic processing and echo control. Sound pickup and playback is an important area of multimedia processing. In sound recording, interference, such as ambient noise and reverberation, degrades the quality. The idea of acoustic signal processing and echo control is to allow straightforward high-quality sound pickup and playback in applications, such as a duplex device like a speakerphone, a sound source-tracking apparatus like microphone arrays, teleconferencing systems with stereo input and output, hands-free cellular phones and home theatre with 3D sound.
声学处理和回声控制 拾音和重放是多媒体处理的一个重要领域。录音时,环境噪声和回响之类的干扰使录音质量劣化。声学信号处理和回声控制是想在应用中能够获得高质量的拾音和重放,这些应用包括耳麦之类的双工设备、麦克风阵列之类的声源跟踪设备、立体声输入输出的远程会议系统、不用手的蜂窝电话以及3D声音的家庭影院等。
Signal processing for acoustic echo control includes modeling of reverberation, design of dereverberation algorithms, echo suppression, double-talk detection and adaptive acoustic echo cancellation, which is still a challenging problem in stereo full-duplex communication environments [3.53].
声学回声控制的信号处理包括回响模型、消回响算法设计、回声抑制、双方讲话检测以及适应回声消除,这仍然是立体声全双工通信环境中富有挑战性的问题。
Example 3.3 For typical environments, the system modeling time for reverberation is of the order of 100 ms. At a sampling rate of 16 kHz, this translates into an echo-canceling filter of 1600 taps, requiring seconds to converge.
例3.3 在典型环境中,回响的系统建模时间是100ms量级。抽样频率16 KHz转换为1600个抽头的消回声滤波器,这需要若干秒汇聚。
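The arithmetic in Example 3.3 is simply the echo span multiplied by the sampling rate. A tiny Python sketch, with a hypothetical function name and integer milliseconds assumed as the unit for the span:

```python
def echo_filter_taps(echo_span_ms, sampling_rate_hz):
    """Number of FIR taps needed to cover `echo_span_ms` of echo path.

    One tap is needed per sample period, so taps = span * rate.
    Integer arithmetic avoids floating-point rounding.
    """
    return echo_span_ms * sampling_rate_hz // 1000


# Example 3.3: 100 ms of reverberation sampled at 16 kHz.
acoustic_taps = echo_filter_taps(100, 16000)
```

Here acoustic_taps evaluates to 1600, the filter length quoted in the example.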
For sound pickup, acoustic processing aims at the design of transducers or transducer arrays to achieve a desirable directionality (beam steering and width control) as well as noise resistance. Understanding of near-field and far-field acoustics is important in achieving the required response in specific applications [3.54]. Various 1D and 2D microphone arrays have been shown in teleconferencing and auditorium applications with good results [3.55].
对于拾音,声学处理的目标是设计耐用的指向性(束调整和宽度控制)及抗噪声换能器或换能器阵列。在特殊应用中为得到所需要的响应必须掌握近场和远场声学特征。各种1D和2D麦克风阵列已经在远程会议和礼堂中获得良好的应用。
Network echo cancellation. In telephony, both near-end and far-end echo exist due to the hybrid coil that is necessary for two-wire and four-wire conversions. Network echo can be so severe that it hampers telephone conversation. Network echo cancellers were invented to correct the problem in the late 1960s, based on the Least Mean Squares (LMS) adaptive echo cancellation algorithm [3.56]. The network echo delay is of the order of 16 ms, typically requiring a filter with 128 taps at a sampling rate of 8 kHz.
电话中的网络回声消除,包括由于二四线变换需要的混合线圈而产生的近端和远端回声。严重的网络回声将影响电话交谈。解决该问题的网络回声消除器发明于1960晚期,基于最小均方(LMS)适应回声消除算法。网络回声延迟大约16ms,在抽样频率8 KHz时需要128个抽头(典型值)的滤波器。
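The LMS adaptation behind such an echo canceller can be sketched in Python as follows. This is a textbook-style LMS update written for illustration, not the specific algorithm of [3.56]; the function name, the fixed step size and the noise-free setting are assumptions. At 8 kHz with a 16 ms echo delay, taps would be 128 as stated above.

```python
def lms_echo_canceller(far_end, microphone, taps, step_size):
    """Adapt an FIR estimate of the echo path with the LMS rule.

    far_end:    samples sent toward the echo path (reference signal)
    microphone: samples received back, containing the echo
    Returns (weights, residual) where residual is the echo-cancelled signal.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, microphone):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # what remains after subtracting the echo estimate
        # LMS update: move each weight along the error-weighted input sample.
        weights = [w + step_size * error * h for w, h in zip(weights, history)]
        residual.append(error)
    return weights, residual
```

The step size trades convergence speed against stability and steady-state error, which is one reason long filters such as the 1600-tap acoustic case take seconds to converge.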

Establishing the Concept of a Program Integration Platform (unfinished draft)

Establishing the Concept of a Program Integration Platform

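The technical system of radio and television is generally thought to consist of several links: acquisition, production, playout, transmission and coverage. From the viewpoint of information dissemination, all of these links belong to the information source (Figure 1); the information sink (the recipient of the information) is basically left out. This is a habit of thought formed over the long era of analog terrestrial over-the-air coverage, and its crux is disregard for the audience. In that era, because of technical limits and a scarcity of resources (sources and channels), supply fell short of demand in the production, circulation, consumption and distribution of radio and television programs. The audience was in a completely passive position, and there were reasons for it to be disregarded.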

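But consumer demand is always the prime mover of development. In the era of analog terrestrial television, viewers tried everything to receive more programs. Those who knew a little about the technology built high-gain antennas and antenna amplifiers, while many ordinary households, on hearing that something might bring in even one or two more stations, pressed all sorts of things into service: soft-drink cans, fluorescent tubes, anything at all. Back then, the roof of every urban apartment building was a forest of television antennas. That is why the community antenna system came into being, and that is why cable television came into being. Why were community antenna systems and cable television networks able to emerge when they did? Because they did the work of program integration for radio and television viewers.
Before community antenna systems appeared, program integration for over-the-air television took place at the user's receiving antenna. The antenna gathered every television signal capable of inducing a certain voltage in it and delivered them to the television set, which, as the information terminal, selected, decoded and reproduced the information as the user required. This fact is plain, with no intermediate links to obscure it. Whatever the station and whatever the means, microwave relay or satellite relay, a station had to deliver its programs to the user's antenna by over-the-air rebroadcast. Only then could a program enter the television set and have the chance of being selected and watched by the user, completing the task of information dissemination. For any station, however good its programs, if they could not reach the user's receiving antenna, everything was in vain. So it is not that there was no program integration platform in those days; rather, the program integration platform was hidden inside the television receiving antenna.
The driving force and the goal behind the rise and growth of community antenna systems and cable television was to strengthen and extend the function of the user's receiving antenna so as to integrate more programs. In a sense, the coaxial and optical cables of a cable network merely lengthen the short piece of cable behind the television set, and the role of the cable headend is to widen the antenna's bandwidth, from the metric-wave and decimetric-wave bands down to baseband video and up to the satellite bands. It is still the user's receiving antenna in a broad sense. For a cable network, therefore, the network headend is the program integration platform. Central, provincial and municipal programs, local programs and programs from elsewhere all converge at the network headend. If this is not program integration, what is?
Digitization of radio and television has raised channel utilization: the analog-era pattern of one program per channel has been replaced by the digital streams of several programs multiplexed into one channel. Can multiplexing be called integration? I think not. First, "multiplexing" is already an accurate and apt term, and there is no need to blur the concept with another word. Second, to integrate is to gather into a great whole; only assembling parts into a complete whole deserves to be called integration. Multiplexing, by contrast, is generally a multi-level process of gradual aggregation, from the zero-order group to the primary group, then to the secondary group, the tertiary group and so on. In radio and television, one channel carries only part of the programs that a broadcaster puts out, so combining several programs into one channel still goes from part to part and cannot be called integration. Third, 多路复用 corresponds to the English "multiplex" while 集成 corresponds to "integrate", so it is better to keep the two terms distinct.
What is the significance of establishing the concept of a program integration platform?
First, the concept accurately reflects things that actually exist. Whether it is the user's receiving antenna, a community antenna or cable television, what each does is the job of a program integration platform.
Second, the concept gives cable television networks an accurate position. It reveals that cable television networks grew out of the user's receiving antenna and are part of the information sink, not of the information source. A cable television network can also be regarded as part of the channel, but regarding it as part of the sink reflects more accurately, and stresses more strongly, its innate and natural direct link and historical ties with the audience.
Third, establishing the concept has practical significance. Dividing the digital cable television technical system into a program playout platform, a program transmission platform, a program integration platform, and a supervision and settlement platform is more reasonable from the standpoints of information flow, clear division of labor, and industry management.
In the analog era, the status of the cable television network as the program integration platform was beyond doubt. What will change in the digital era?
Under China's "three-step" strategy for digitizing radio and television, a direct broadcast satellite is to be launched in 2005 and digital terrestrial television is to begin broadcasting in 2008. When that time comes, will the cable television network's status as the program integration platform be shaken?
My suggestion is that management should distinguish the program playout platform from the program integration platform: production, playout, transmission and integration, each with a clear division of labor and each attending to its own duties. It is often the simplest arrangement that accords with the laws of nature. Do not make the playout platform do the integration platform's job, and do not forcibly move the downstream integration platform in front of the transmission platform that sits upstream of it. Either would go against objective laws and be a thankless effort.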

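Viewed historically, reception developed step by step with advances in transmission technology: from individual over-the-air reception, to community antennas and the small networks of enterprises and institutions, then to networks covering county-level and city-level administrative areas, and now to a large, nationally interconnected network. Why do the central and provincial levels have no direct subscriber networks of their own? One reason is technical: in the early days there was no optical fiber transmission, and a coaxial trunk could reach a few tens of kilometers at most, so townships, counties and cities could build cable television networks while provinces and the central level could not. The deeper reason is that the driving force of development lies in user demand. It was users' demand to see more programs that pushed radio and television reception along a natural path, from the individual to the collective and from small scale to large scale, to where it stands today.
Radio and television are about to enter the digital era, yet the "one nationwide network" that many have long hoped for remains hard to achieve. Why? However many reasons are found, in the end two questions probably have to be answered: is it technically possible, and does it meet users' needs?
Technically, covering the whole country with digital optical fiber transmission is no longer a problem, but it cannot reach the user terminal, and the same holds at the provincial level. Digital optical transmission can span the globe, yet it cannot get into the user's television set. This unforgiving fact forces even the greatest of figures to accept, however reluctantly, the cold reality that subscriber access networks can exist only at the city, county and township levels. A trunk is a trunk: it can serve only as a transmission platform and cannot do program integration. Only the subscriber access network is the natural extension of the television set's receiving antenna, and only its headend is the program integration platform. This is determined by the laws of nature and cannot be changed by the subjective will of certain people. If it were rigidly decreed that access networks may not act as program integration platforms, I believe administrative power could enforce it, but the only result would be to hinder development.
As for user demand, communication studies tell us that audience needs today are diversified, targeted and personalized. Should everyone in the country receive the same programs, or should people in different regions receive their own different programs? Suppose someone says: "I will deliver the whole country's programs to every viewer in the country." The wish is a good one, but it is neither possible nor workable. From the user's standpoint, the closer to the user the response to user demand is made, the better.
Therefore, even in the digital era of radio and television, the program integration platform for cable television remains the headend of the subscriber access network. Once the direct broadcast satellite is in orbit, although it delivers directly to users, it still cannot be called an integration platform, because it is not the only satellite in the sky.
Whether at the national, provincial, municipal or county level, no one should forget that the purpose of program integration is to deliver more programs to users. The best solution is therefore the one that delivers the most programs at a lower cost. The inherent nature of the over-the-air channel means that terrestrial broadcasting cannot shoulder the role of program integration platform. For cable networks in the analog era, the number of programs available for integration in any one place was generally about forty to fifty; weighing cost against benefit, this was feasible at the township level in economically developed regions and at the county level in ordinary regions. The national and provincial levels, limited by the transmission technology available, were unable to build subscriber networks directly. This is how the present situation of cable television networks at three levels, city, county and township, came about.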

Multimedia Communication Systems (3.4)

3.4 Challenges of Multimedia Information Processing
Novel communications and networking technologies are critical for a multimedia database system to support interactive dynamic interfaces. A truly integrated media system must connect with individual users and content-addressable multimedia databases. This will be a logical connection through computer networks and data transfer.
新的通信和网络技术必须支持交互动态接口的多媒体数据库系统。一个真正的集成媒体系统必须连接各个用户和内容可检索的多媒体数据库。这将是一种通过计算机网络和数据转移的逻辑连接。
To advance the technologies of indexing and retrieval of visual information in large archives, multimedia content-based indexing would complement the text-based search. Multimedia systems must successfully combine digital video and audio, text, animation, graphics and knowledge about such information units and their interrelationships in real time.
为了改进大量档案中视觉信息的检索与重现技术,多媒体的基于内容检索将补充基于文本的检索。多媒体系统要能够实时地顺利组合数字视频、音频、字幕、图形和有关这些信息单元及其相互关系的知识。
The operations of filtering, sampling, spectrum analysis and signal representation are basic to all of signal processing. Understanding these operations in the multidimensional (mD) case has been a major activity since 1975 [3.15, 3.16, 3.17]. More key results since that time have been directed at the specific applications of image and video processing, medical imaging, and array processing. Unfortunately, there remains considerable cross-fertilization among the application areas.
滤波、抽样、频谱分析以及信号表示等操作全部基于信号处理。1975年以来,理解多维(mD)情况下的这些操作已经成为主要活动。从那时以来,很多关键结果已经指导着图像和视频处理、医学图像以及阵列处理等专业应用。不幸的是在这些应用中残存了大量的杂交。
Algorithms for processing mD signals can be grouped into four categories:
• Separable algorithms that use 1D operators to process the rows and columns of a multidimensional array
• Nonseparable algorithms that borrow their derivation from their 1D counterparts
• mD algorithms that are significantly different from their 1D counterparts
• mD algorithms that have no 1D counterparts.
mD信号处理算法可以归纳为四类:
• 用1D算子处理多维阵列的行和列的分离算法
• 借用1D算法的非分离算法
• 显著不同于1D的mD算法
• 非1D的mD算法
Separable algorithms operate on the rows and columns of an mD signal sequentially. They have been widely used for image processing because they invariably require less computation than nonseparable algorithms. Examples of separable procedures include mD Discrete Fourier Transforms (DFTs), DCTs and Fast Fourier Transform (FFT)-based spectral estimation using the periodogram. In addition, separable Finite Impulse Response (FIR) filters can be used in separable filter banks, wavelet representations for mD signals and decimators and interpolators for changing the sampling rate.
分离算法继续用于对mD信号的行和列的运算。由于与非分离算法相比它们总是可以用较少的计算,所以一直广泛的用于图像处理。分离规程的例子包括mD离散傅立叶变换(DFT)、DCT和以快速傅立叶变换(FFT)为基础的用周期图的频谱估计。另外,离散有限冲激响应(FIR)滤波器可用于离散滤波器单元、mD信号的小波表示和改变抽样速率的抽值器和内插器。
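The row/column idea can be sketched in Python for the 2D DFT: a 1D DFT is applied to every row, then to every column of the result. The direct 2D sum is included only to check that both routes agree. The function names are hypothetical, and a plain O(N^2) 1D DFT is used instead of an FFT to keep the sketch short.

```python
import cmath


def dft_1d(values):
    """Direct 1D DFT of a sequence of real or complex numbers."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def dft_2d_separable(image):
    """2D DFT computed as 1D DFTs over rows, then over columns."""
    row_pass = [dft_1d(row) for row in image]
    # zip(*...) transposes, so each item is one column of the row-transformed data.
    column_pass = [dft_1d(list(column)) for column in zip(*row_pass)]
    # Transpose back so the result is indexed [vertical freq][horizontal freq].
    return [list(row) for row in zip(*column_pass)]


def dft_2d_direct(image):
    """Reference 2D DFT evaluated from the defining double sum."""
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(
                image[y][x] * cmath.exp(-2j * cmath.pi * (u * x / cols + v * y / rows))
                for y in range(rows)
                for x in range(cols)
            )
            for u in range(cols)
        ]
        for v in range(rows)
    ]
```

For an R-by-C image the direct double sum costs on the order of (RC)^2 operations, while the separable route costs on the order of RC(R + C) even with this naive 1D DFT, which illustrates why separable procedures are attractive.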
The second category contains algorithms that are uniquely mD in that they cannot be decomposed into a repetition of 1D procedures. These can usually be derived by repeating the corresponding 1D derivation in an mD setting. Upsampling and downsampling are some examples. As in the 1D case, bandlimited multidimensional signals can be sampled on periodic lattices with no loss of information. Most 1D FIR filtering and FFT-based spectrum analysis algorithms also generalize straightforwardly to any mD lattice [3.18]. Convolutions can be implemented efficiently using the mD DFT either on whole arrays or on subarrays. The window method for FIR filter design can be easily extended, and the FFT algorithm can be decomposed into a vector-radix form, which is slightly more efficient than the separable row/column approach for evaluating multidimensional DFTs [3.19, 3.20]. Nonseparable decimators and interpolators have also been derived that may eventually be used in subband image and video coders [3.21]. Another major area of research has been spectral estimation. Most of the modern spectral estimators, such as the maximum entropy method, require a new formulation based on constrained optimization. This is because their 1D counterparts depend on factorization properties of polynomials [3.22]. An interesting case is the maximum likelihood method, where the 2D version was developed first and then adapted to the 1D situation [3.23].
第二类是唯一不能分解为重复1D规程的mD算法。它们通常通过在一个mD框架内重复相应的1D推导而推导出来。升抽样和降抽样就是它们的例子。如同1D的情况下,带限多维信号可以信息无损地按照周期晶格抽样。大多数1D FIR滤波和基于FFT的频谱分析算法也直接归为任一mD晶格。用mD DFT对阵列或子阵列都可以有效地进行卷积运算。FIR滤波器设计的窗口法易于扩展,FRI的算法可以分解为矢量基数形式,它比用分离的行/列逼近法求多维DFT的值效率稍高一些。不可分离的抽值器和内插器也已被导出,可最终用于子带图像和视频编码器。研究的另一个主要领域已经是频谱估计。最新的频谱估计器,例如最大熵法,需要一种基于强迫优化的新的表述。这是因为它们的1D副本依赖于多项式的因数分解性质。一种有趣的情况是最大似然法,首先开发出来的是2D版本,然后才被采用于1D。
There are also mD algorithms that have no 1D counterparts, especially algorithms that perform inversion and computer imaging. One of these is the operation of recovering an mD distribution from a finite set of its projections, equivalently inverting a discretized Radon transform. This is the mathematical basis of computed tomography and positron emission tomography.
也有mD算法不存在1D副本,特别是进行反转和计算机图像的算法。其中之一是从它的投影的有限集合复原一个mD分布,等价地反转一个离散Radon变换。这是计算机层析成像和正电子层析成像的数学基础。
Another imaging method, developed first for geophysical applications, is Fourier migration. Finally, signal recovery methods that have no counterpart in the 1D case are possible: mD signals with finite support can be recovered from the amplitudes of their Fourier transforms or from threshold crossings [3.24].
另一种首先为地球物理学应用开发的图像法是傅立叶综合。最终,信号恢复法不像1D情况是可能的,具有有限支持的mD信号能够从它们的傅立叶变换的幅值或者从阈值交叉中恢复。
3.4.1 Pre and Postprocessing
In multimedia applications, the equipment used for capturing data, such as the camera, should be cheap, making it affordable for a large number of users. The quality of such equipment drops when compared to their more expensive and professional counterparts. It is mandatory to use a preprocessing step prior to coding in order to enhance the quality of the final pictures and to remove the noise that will affect the performance of compression algorithms. Solutions have been proposed in the field of image processing to enhance the quality of images for various applications [3.25, 3.26]. A more appropriate approach would be to take into account the characteristics of the coding scheme when designing such operators. In addition, pre- and postprocessing operators are extensively used in order to render the input or output images in a more appropriate format for the purpose of coding or display.
在多媒体应用中,用于采集数据的设备,例如摄像机,或许很便宜,很多人都能买得起。这样的设备与那些价格高的专业设备相比质量差。必须在编码之前进行处理,以提高最终图片的质量和去掉噪波,否则将影响压缩算法的性能。用于改善各种应用中图像质量的图像处理领域已经有了解决方案。适当的办法是在设计这样的处理器时考虑编码方案的特性。另外,为了在编码或显示时以比较适当的格式输入或输出图像,也广泛地使用预处理器和后处理器。
Mobile communications is an important class of applications in multimedia. Terminals in such applications are usually subject to different motions, such as tilting and jitter, translating into a global motion in the scene due to the motion of the camera. This component of the motion can be extracted by appropriate methods detecting the global motion in the scene and can be seen as a preprocessing stage. Results reported in the literature show an important improvement of the coding performance when a global motion estimation is used [3.27].
在多媒体中移动通信是一类重要应用。这类应用终端一般处于不同的运动中,例如倾斜和抖动,由于摄像机运动而转化过来的景物的全向运动。可以用适当的方法通过检测现场的全向运动提取这个运动分量, 并把它作为预处理步骤。文献报告显示,采用全向运动估计时编码性能得到重大改善。
It is normal to expect a certain degree of distortion of the decoded images for very low-bit-rate applications. However, an appropriate coding scheme introduces the distortions in areas that are less annoying to the users. An additional stage could be added as a postprocessing operator to further reduce the distortion due to compression. Solutions were proposed in order to reduce the blocking artifacts appearing at high compression ratios [3.28, 3.29, 3.30, 3.31, 3.32, 3.33]. The same types of approaches have been used in order to improve the quality of decoded signals in other coding schemes, reducing different kinds of artifacts, such as ringing, blurring and mosquito noise [3.34, 3.35].
一般认为,在很低比特率场合中解码图像会有一定程度的失真。然而,在某些情况下,为了减少用户的烦恼以适当的编码方案引入失真。可以附加一个步骤作为后处理器,以进一步减小压缩带来的失真。已经提出了解决高压缩比时出现阻塞问题的方法。同样的方法也已经在其它编码方案中用于改善解码信号的质量,减小各类噪声,例如振铃、斑点和哼声。
Recently, advances in postprocessing mechanisms have been studied to improve lip synchronization of head-and-shoulder video coding at a very low bit rate by using the knowledge of decoded audio in order to correct the positions of the lips of the speaker [3.36]. Figure 3.2 shows an example of the block diagram of such a postprocessing operation.
最近,对改善在很低比特率时头肩像视频编码的唇同步问题的后处理机制的研究已经取得进展,这种机制运用解码音频的知识校正讲话者的唇位,图3.2显示了一例这类后处理过程的框图。

Where Is the Program Integration Platform for Digital Cable TV?

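Regarding the technical architecture of digital cable TV, one now rarely hears anyone call for a single nationwide CA (conditional access) system. Yet some still want to run a single CA across an entire province with several million subscribers. I do not think this is feasible. The proper place for CA is the program integration platform, not the playout platform or the transmission platform.
Where is the program integration platform?
In the early days, program integration for over-the-air television took place at the viewer's receiving antenna; the television set, as the information terminal, selected, decoded, and reproduced information according to the viewer's wishes. This fact is plain, with no intermediate stage to obscure it: whichever station it was and whatever means it used, microwave relay or satellite, it had to deliver its programs to the viewer's antenna by over-the-air rebroadcast. Only then did it have a chance of being selected and watched, completing the task of communicating information. For any station, however good its programs, everything was wasted if they could not reach the viewer's antenna.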
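Later, master antenna (shared antenna) systems and cable television gradually emerged and grew. Their driving force and their goal were to strengthen and extend the function of the viewer's receiving antenna so as to integrate more programs. In a sense, the coaxial and optical cables of a cable network merely lengthen the piece of cable behind the television set, and the cable headend merely widens the antenna's bandwidth, from the metric and decimetric wave bands down to baseband video and up to the satellite bands. It is still, in the broad sense, the viewer's receiving antenna. For a cable network, therefore, the network headend is the program integration platform. National, provincial, and municipal programs, local and out-of-town programs alike, all converge at the headend. If that is not program integration, what is?
Viewed historically, the path from individual over-the-air reception, to master antenna systems and the small networks of enterprises and institutions, to county- and city-level networks, and on to today's interconnected nationwide network followed the progress of transmission technology step by step. Why do the central and provincial levels have no subscriber networks of their own? One reason is technical: at the time there was no optical fiber transmission, and a cable trunk could reach a few tens of kilometers at most, so townships, counties, and cities could build cable networks while provinces and the central level could not. The deeper reason is that the driving force of development lies in user demand. It was viewers' demand to see more programs that pushed broadcast reception along its natural course, from individual to collective and from small scale to large scale, to where it stands today.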
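Broadcasting is about to enter the digital era, yet the "one nationwide network" that many have long hoped for remains hard to realize. Why? However many reasons are found, in the end two questions must be answered: Is it technically possible? Does it meet users' needs?
Technically, covering the whole country with digital optical fiber transmission is no longer a problem, but it cannot reach the user terminal, and the same holds at the provincial level. Digital optical transmission can span the globe, yet it cannot get into the viewer's television set. This unforgiving fact forces even the greatest figures to accept the hard reality that subscriber access networks can exist only at the city, county, and township levels. A trunk is a trunk: it can serve only as a transmission platform, not as a program integrator. Only the subscriber access network is the natural extension of the television set's receiving antenna, and only its headend is the program integration platform. This is determined by natural law and cannot be changed by anyone's subjective will. If it were decreed that access networks must not serve as program integration platforms, administrative power could no doubt enforce it, but the result would only be to hinder development.
As for user demand, communication studies tell us that audiences today want diversity, targeting, and personalization. Should everyone in the country receive the same programs, or should people in different regions receive their own different programs? One might say, "I will deliver all the nation's programs to every viewer in the country." The wish is a good one, but it is neither possible nor practical. From the user's standpoint, the response to user demand should of course be as close to the user as possible.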
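Therefore, even in the digital era of broadcasting, the program integration platform for cable television remains the headend of the subscriber access network. Once a direct-broadcast satellite is in orbit, it does deliver directly to users, yet it still does not qualify as an integration platform, because it is not the only satellite in the sky.
I suggest that regulation distinguish between program playout platforms and program integration platforms, so that production, playout, transmission, and integration are clearly divided, each attending to its own duties. The simplest arrangement is often the one that accords with natural law. Do not make the playout platform do the work of the integration platform, and do not forcibly move the downstream integration platform ahead of the transmission platform that sits above it. Both go against objective laws and are thankless efforts.
Conversely, transmitting only one's own programs is not program integration. To be the only program integration platform in a region, there are two ways: either you are the only one producing programs, or you are the only one with the right to transmit them. Whether the region is a country, a province, a city, or a county, it should not be forgotten that the purpose of program integration is to deliver more programs to users. The best solution is therefore to transmit the most programs at relatively low cost. The inherent nature of the wireless channel means that terrestrial over-the-air broadcasting cannot carry the burden of being a program integration platform. For cable networks in the analog era, a locality generally had some forty to fifty programs available to integrate; on a cost-benefit calculation, this was viable at the township level in economically developed areas and at the county level elsewhere. The national and provincial levels, limited by transmission technology, were unable to build subscriber networks directly. Hence the present reality of cable television networks at three levels: city, county, and township.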

Multimedia Communication Systems (3.3)

3.3 Signal-Processing Elements
Many classical signal-processing procedures have become deeply embedded in the multidimensional fields. A key driver is optimization for representation of multimedia components, as well as the associated storage and delivery requirements. The optimization procedures range from very simple to sophisticated. Some of the principal techniques are the following:
 Nonlinear analog (video and audio) mapping
 Quantization of the analog signal
 Statistical characterization
 Motion representation and models
 3D representations
 Color processing
许多经典的信号处理规程已经深深地植入多维领域。为了表示多媒体分量,对关键的驱动器以及相关的存储和传输设备进行了优化。优化规程的范围涵盖了从非常简单的到精密复杂的。一些主要技术如下所示:
 非线性模拟(视频和音频)映射
 模拟信号的量化
 统计描述
 运动表示和模型
 3D表示
 色彩处理
A nonlinear analog (video and audio) mapping procedure may be purely analog. Its intention may be the desire to enhance the delivery process. It could also be introduced to mask the limitations of various components of the overall multimedia chain. Typical constraints are introduced by bandwidth limitations and constrained dynamic range in the display terminal.
非线性模拟(视频和音频)映射过程可以是完全模拟的。其目的可以是期望提高传输性能。也可能是用于掩饰整个多媒体链中各个方面的缺陷。典型的约束是显示终端的带宽限制和动态范围的限制。
Quantization of the analog signal is fundamental to any digital representation that has originated in the analog world. The quantization process is an inherently lossy procedure and fundamentally noninvertible. This classical signal-processing element still remains the basic constraint in limiting performance, although not very exciting compared with other multimedia issues [3.5]. Quantization techniques comprise a whole field by themselves. The major relevant issues include uniform and nonuniform techniques and adaptive and nonadaptive procedures [3.6].
模拟信号的量化对任何发源于模拟世界的数字表示来说都是基本的。量化处理是一种固有的有损过程,而且是根本不可逆转的。这个经典的信号处理方法仍然是对性能限制的基本约束,尽管与其它多媒体难题相比它还不是非常令人激动的[3.5]。量化技术有它自己的领域。主要相关课题包含统一的和非统一的技术以及适应的和非适应的规程[3.6]。
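The lossy, noninvertible nature of quantization can be seen in a minimal sketch of a uniform, nonadaptive quantizer. The step size and sample values below are hypothetical, chosen only for illustration.

```python
import math

def quantize(x, step):
    """Map a sample to the index of the nearest reconstruction level."""
    return math.floor(x / step + 0.5)

def dequantize(index, step):
    """Map an index back to its reconstruction level."""
    return index * step

step = 0.25                           # hypothetical step size
samples = [0.12, 0.49, -0.73, 0.90]   # hypothetical analog sample values

indices = [quantize(s, step) for s in samples]
reconstructed = [dequantize(i, step) for i in indices]
errors = [abs(s - r) for s, r in zip(samples, reconstructed)]

# Two different inputs collapse onto the same index, so the
# original value cannot be recovered from the index alone.
same_index = quantize(0.49, step) == quantize(0.51, step)
print(indices, reconstructed, same_index)
```

For a uniform quantizer the reconstruction error is bounded by half the step size, which is why the step size directly trades bit rate against distortion.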
Statistical concepts and applications are directly and indirectly strongly embedded in processing components associated with multimedia. This relevant field is part of classical signal processing, and we can only highlight the major categories. A spectral analysis is fundamental to the entire range of image models for filtering and algorithm design. The procedures are critical to both visual and audio data components [3.7, 3.8]. Statistical redundancy is the basic concept upon which the entire field of data compression is based. Mathematical extension of the concept leads to optimum transform for decorrelation. This in turn leads to the entire field of modern transform-coding technology [3.9]. Model-based representations, primarily for compression, are determined from assumed or derived statistical models. The classes of transform-coding algorithms are based on this technology [3.10]. The utility of the Fourier transform and its discrete extensions, such as the Discrete Cosine Transform (DCT), wavelets and others, is based on the principle that these transforms asymptotically approach the optimum transform, assuming a reasonable statistical behavior [3.11]. Visual and audio models are fundamental to the relevant multimedia representations, primarily compression procedures. These models are based on fundamental statistical representations of the elementary components, including their evaluation by the human observer [3.12, 3.13].
统计概念及其应用直接或间接地深入于与多媒体有关的处理。这一相关领域是经典的信号处理的一部分,我们只需突出它的主要范畴。对于滤波和算法设计的图像模型的整个范围,谱分析是基本原理。对于视频和音频两者的数据分量这个规程都是必不可少的。在整个基于数据压缩的全部领域中统计冗余都是基本概念。这个概念的数学扩展导出了去相关的最优变换。进而导出了整个调制解调器的变换编码技术。基于模型的表示法(压缩的基础)正是源于统计模型。上层变换编码算法就是基于这项技术。傅立叶变换以及它的离散扩展,例如离散余弦变换、小波变换等基于该原理的变换,这些变换逐渐接近于最优变换,它们的应用采用一种合理的统计行为。相应的多媒体表示法,基本压缩规程,视频和音频模型是基本原理。这些模型基于基本分量包括它们的主观评价的统计表示的基本原理。
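How a transform exploits statistical redundancy can be illustrated with a small sketch. It applies an orthonormal DCT-II to a short row of highly correlated pixel values and measures how much of the signal energy lands in the first coefficient. The pixel row is hypothetical, and the direct O(N^2) implementation is chosen for clarity rather than speed.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II of a sequence, computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out

# Hypothetical smooth row of pixels: neighbouring samples are strongly correlated.
row = [100, 102, 104, 107, 110, 112, 113, 115]
coeffs = dct_ii(row)

total_energy = sum(c * c for c in coeffs)
dc_share = coeffs[0] ** 2 / total_energy
print([round(c, 2) for c in coeffs], round(dc_share, 4))
```

Because the transform is orthonormal, the total energy is preserved; it is only redistributed so that a few coefficients carry almost all of it, and the remaining ones can be coarsely quantized or dropped.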
The models are:
 Implementation of motion detection and associated compensation in subsequent image frames can significantly reduce the required bandwidth. Successful prediction of image segment locations in future frames reduces the required information update to the required motion vectors. Thus, under this condition, the associated update information is dramatically reduced.
 Combining the presence of motion in video segments with the limitations for human visual systems provides additional bandwidth-reduction potentials. Because the human vision deteriorates when observing moving areas, image blur associated with these regions becomes significantly less noticeable. Consequently, additional image compression can be introduced in segments that contain motion, with minimal noticeable effect.
这些模型是:
 运动检测和在后继图像帧的相关补偿的实现,能够显著地减少所需带宽。成功预言在未来帧中图像块的位置减少了所需信息对所需运动矢量的更新。因而,在这个条件下,显著的减少了相关的更新信息。
 视频块中运动的存在,结合人类视觉系统的局限,提供了进一步减小带宽的可能性。由于人类视觉在观看运动区域时变得更糟,与这些区域相关的图像模糊就显著的变得不那么引人注意。因此,在包含运动的块中就能够以最小的可觉察效果进一步压缩图像。
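The first model, motion detection with compensation, can be sketched with exhaustive block matching. This is a minimal illustration under stated assumptions: the frames are tiny synthetic arrays, the matching cost is the sum of absolute differences (SAD), and all names are hypothetical.

```python
def sad(prev, cur, top, left, dy, dx, size):
    """Sum of absolute differences between a block of the current frame
    and the block displaced by (dy, dx) in the previous frame."""
    return sum(
        abs(cur[top + r][left + c] - prev[top + r + dy][left + c + dx])
        for r in range(size)
        for c in range(size)
    )

def best_vector(prev, cur, top, left, size, search):
    """Exhaustive search; returns (cost, dy, dx) of the best match."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue  # candidate block would fall outside the frame
            cost = sad(prev, cur, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best

# Synthetic 8x8 frames: the current frame is the previous one shifted right by 2 pixels.
prev = [[10 * r + c * c for c in range(8)] for r in range(8)]
cur = [[prev[r][max(c - 2, 0)] for c in range(8)] for r in range(8)]

uncompensated = sad(prev, cur, 2, 4, 0, 0, 4)                   # predict with no motion
match = best_vector(prev, cur, top=2, left=4, size=4, search=2)
print(uncompensated, match)
```

The vector points from the current block back to its source in the previous frame, so a rightward shift of 2 pixels appears as dx = -2. With the correct vector the prediction residual vanishes, and only the vector itself needs to be transmitted for this block.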
Human vision is basically 3D. Efficient representation of a 3D signal is a major challenge of multimedia. The most common 3D techniques are based on 2D display techniques. The 3D scene is projected onto two dimensions in the rendering phase of the multimedia chain. The proper hierarchy of object elements and behavior maintains the 3D illusion. The relevant processes include shadowing consideration and preserving the proper hidden body behavior. The required processing resources are still significant. A substantial industry produces various processing components, such as chip sets and graphics boards, to develop solutions for many diverse applications including desktop computing. The associated technology is very effective in high-end applications. Virtual reality models using large screens are impressive even though the presentation remains 2D. In 3D representations, the stereo projection is the best known. The same 3D scene is recorded from two slightly different perspectives, essentially replicating our eyes. The two separate recordings are subsequently presented to the eyes separately. Unlike the early stereo film-based recordings, modern techniques are heavily dependent on digital processing, which corrects for camera-projection inaccuracies, resulting in significantly enhanced stereo display.
人类视觉基本上是三维的。3D信号的有效表示法是多媒体的一个主要课题。大多数3D技术基于2D显示技术。在多媒体的重现阶段,3D景物被投射为二维。对象元素和行为的恰当的层次保持着3D幻觉。有关的处理包括适当的遮蔽和保留身体形态的适当隐藏。所需的处理资源依然重要。制造业生产了各种处理器件,例如芯片和图像板卡,为许多不同的应用包括桌面计算机开发了解决方案。在高端应用中有关技术非常有效。尽管表达依然是2D的,使用大屏幕的虚拟现实模型仍给人留下深刻印象。在3D表示法中,最著名的是立体投影。它模拟我们的眼睛,以两个略微不同的透视点记录同一3D景物。然后这两个记录分别呈现给我们的眼睛。与早期立体电影不同,现代技术强烈地依赖数字处理校正镜头误差,显著地增强了立体显示的效果。
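The projection of a 3D scene onto two dimensions, and the two slightly different perspectives of a stereo pair, can be sketched with a simple pinhole camera model. The model, the baseline of 0.065 (roughly an eye separation in metres), and the focal length are assumptions made for illustration and are not taken from the text.

```python
def project(point, camera_x, focal_length):
    """Pinhole projection of a 3D point (X, Y, Z) for a camera located at
    (camera_x, 0, 0) and looking along the +Z axis."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal_length * (x - camera_x) / z, focal_length * y / z)

def stereo_pair(point, baseline, focal_length):
    """Project the same point from two horizontally offset viewpoints."""
    left = project(point, -baseline / 2, focal_length)
    right = project(point, +baseline / 2, focal_length)
    return left, right

def disparity(point, baseline=0.065, focal_length=1.0):
    """Horizontal offset between the left and right image positions."""
    left, right = stereo_pair(point, baseline, focal_length)
    return left[0] - right[0]

near = disparity((0.0, 0.0, 2.0))    # point 2 units away
far = disparity((0.0, 0.0, 10.0))    # point 10 units away
print(near, far)
```

Nearer points produce a larger disparity between the two views, which is the depth cue a stereo display presents to the eyes.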
Projection techniques comprise an effective group to recreate multidimensionality from individual projections through the original object. Although this technology has been used very effectively in medical applications, its utility to multimedia applications is not likely to be useful in the near future. The primary limitations are complexity and lack of easy real-time implementation [3.14].
投影技术是通过对原初对象的各个投影的有效组合再现多维影像。尽管这项技术已经在医学上得到非常广泛的应用,但是在近期内在多媒体方面的应用还不会那么多。主要限制在于复杂和难以实时运行。
For efficient representation of color processing, modeling and communication applications, color plays a very important role. The correlation properties among color planes are used in image and video compression algorithms.
对于色彩处理、模型和通信应用的有效表示,色彩具有非常重要的作用。在图像和视频压缩算法中运用了彩色平面中的相关特性。
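One common way compression algorithms exploit the correlation among color planes is to convert RGB into a luminance plane and two chrominance planes. The sketch below uses the full-range BT.601 coefficients as used in JFIF; that choice, like the sample pixels, is an assumption and is not specified in the text.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion (JFIF convention) for 8-bit values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def value_range(values):
    """Spread of a plane, used here as a crude measure of its activity."""
    return max(values) - min(values)

# Hypothetical near-gray pixels: R, G and B rise and fall together.
pixels = [(52, 50, 48), (121, 120, 118), (203, 200, 199)]
converted = [rgb_to_ycbcr(*p) for p in pixels]

rgb_ranges = [value_range([p[i] for p in pixels]) for i in range(3)]
ycc_ranges = [value_range([c[i] for c in converted]) for i in range(3)]
print(rgb_ranges, [round(v, 2) for v in ycc_ranges])
```

In RGB all three planes vary by about the same amount, whereas after conversion nearly all the variation sits in the luminance plane, so the chrominance planes can be represented with far fewer bits.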

Changes in the Form of Broadcast Communication in the Digital Era (First Draft, Full Text)

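The widespread use of digital technology in broadcasting has brought radio and television, as electronic media, into the digital era. The United States long ago announced that it would end analog television broadcasting in 2006. China has also announced that it will end analog television broadcasting in 2015. The analog era of broadcasting is waving us goodbye.
From the standpoint of communication, how will broadcasting in the digital era differ from broadcasting in the analog era?
Digitization, this new mode of production for broadcasting, has brought a dazzling array of gifts. Once the surprise wears off, three turn out to be the most valuable: first, the utilization of transmission channels has multiplied; second, information can be copied through an almost unlimited number of generations; third, there is now a direct channel for interactive feedback between the information source and the information sink.
First, the multiplied utilization of transmission channels means that one television channel, which could carry only one analog program, can now carry 4 to 8 digital television programs. For China's more than one hundred million cable subscribers, the number of programs they can receive will jump from a few dozen to several hundred. Once China's direct-broadcast television satellite goes up in 2005, people will be able to receive hundreds of satellite radio and television programs with a single "cat's ear" (a satellite dish antenna, which some vividly call a "wok").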
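The road has widened. A rutted country lane, barely wide enough for one vehicle to lurch along, has become a broad, level expressway, and this is bound to change the form of communication fundamentally. Viewers have long been tired of the sameness, repetition, and clumsy imitation in radio and television programming. In a country of 9.6 million square kilometers with hundreds of millions of viewers, in an age of "information explosion", each day's information depends on a mere half-hour "Lianbo" network newscast, of which who knows how many tens of percent is taken up by meetings large and small and by leaders' activities. In "prime time" even the remote control is useless: every channel carries either the "Lianbo", with who knows how few items that count as news, or a screen full of posturing emperors, weeping concubines, and blustering princesses. The viewer's only choice is to watch or not to watch. This situation results from the combined effect of scarce channel resources and the planned-economy system. First, broadcast frequencies are a scarce resource. A government holding public power finds them insufficient even for its own exclusive use, let alone to release to the private sector. So the government has monopolized all broadcast frequencies and with them the rights to produce programs, to integrate them (that is, to establish stations or channels), to transmit them, and to broadcast them, and viewers see only the "mouthpieces" of the organs of power at each level. Second, in such a context only the general-interest channel that performs the "mouthpiece" function is the rightful, self-assured "number one". The few specialized stations for economics, arts, traffic, and the like are seen by everyone, themselves included, as sideshows outside the "main theme", second-rate and subordinate foils. That is why specialized channels compete to turn themselves into general-interest ones. Every station (channel) does all it can to join the "mouthpiece" mainstream, and making programs means doing all one can to please officials. The most direct way is to put officials on one's channel as often as possible. Whatever the specialty, the station must run news; whatever the meeting or ribbon-cutting, if an official is present my reporter must go and shoot it, and what is shot must be aired. Would you dare not air it? Should a leader fail to see himself on your channel, one phone call, "What's going on?", is enough to leave you anxious for days. Whether certain people appear on television is sometimes a barometer of the political climate. If you damaged some dignitary's career or affected the social and political stability of some region, could you bear the responsibility? So it is not that the people who run television know nothing about news, nor that they fail to understand audience psychology. It is that there exists a special audience group, and television belongs to them!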
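Digital technology is broadcasting's new productive force, and digital program production is its new mode of production. The foremost advantage of the new productive force is that it greatly raises the utilization of resources. Digital technology has thoroughly changed the scarcity of channel resources, and the government has lost the natural justification for monopolizing frequencies, which are by nature a public resource. Society's demand for the communication of information in the information age likewise calls for breaking the government's monopoly over frequencies and over information channels. The telecommunications industry, which serves interpersonal communication, has moved from state operation to the market, in step with the trend of the times. Broadcasting, which serves mass communication, must follow the same trend sooner or later. If it does so sooner, its development will be faster and smoother; if later, it will merely be more passive. Because a new mode of production necessarily demands new relations of production, the digitization of broadcasting is by no means only a matter of building a new technical system. Diversity, targeting, and personalization have been called for for years. Now the foundation and the means exist. What is lacking? The environment and the conditions; the laws, regulations, and policies that would advance digital broadcasting; and new relations of production that fit and liberate digitization as a new productive force and a new mode of production.
As channel resources turn from scarce to abundant, digital technology has laid the material foundation for diverse, targeted, and personalized programming and has provided the first condition for realizing it. Diversity means having not only the "mouthpiece" the government needs but, far more, the varied content and forms that meet the different needs of society's many groups; that is what targeting and personalization mean. With more channels, viewers will certainly not accept the same old faces, with hundreds of channels singing "the same song". As for how these hundreds of channels will survive: first, the government will certainly not bankroll them, having neither the money nor the need; second, advertising revenue alone will not do either, because advertising spending in a given region over a given period is fixed in total, and the pie can only be spread thinner. An advertising market already squeezed nearly dry by a few dozen channels cannot support several hundred. Broadcast programs themselves must become commodities and must bring in profit directly. Hence the general laws of commodity production, such as consumption determining production, identifying the market and the consumer group, and producing to meet sales, will become required learning and everyday talk for broadcasters. This is the first change in broadcast media in the digital era: broadcasting shifts from a propaganda tool that chiefly serves as a "mouthpiece" toward a mass medium in a truer sense; the role of the media gatekeeper fades; the content communicated becomes a commodity for mass consumption; and the communication process shifts from being media-centered to being audience-centered. Accordingly, the state must establish by law a licensing and market-entry system for broadcasting, manage the content and modes of broadcast communication by category and by rating, return public resources such as broadcast frequencies to the public, and give effect to citizens' constitutional right to freedom of speech.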
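Another major result of digitization is that information can be copied an almost unlimited number of times, and an almost unlimited number of copies can easily be made at once. This means, first, that through repeated relaying and regeneration digital information can be sent over long distances to every corner of the earth; and second, that the cost of communicating information falls so low that almost anyone can become a source or relay of information. The former not only makes the telecommunications network a globally connected network but will likewise connect broadcast networks worldwide, so that globalization will inevitably reach the cultural level. As for the latter, one need only look at how many people have realized their dreams of directing and producing thanks to handy, inexpensive DV digital production technology to grasp its far-reaching effect on broadcast media. It blurs the boundary between program makers and audiences, and it gives ordinary people more chances to speak and to become leading figures on screen, without depending on anyone's favor in "pointing more cameras at them". Another typical product of this kind of technology is the personal video recorder (PVR), which records to a hard disk. A PVR is a large-capacity digital video recorder. Unlike an analog tape recorder, the pictures and sound it records and plays back are of almost the same quality as the program source, even the producer's master. Its advantages over optical disc media such as CD, VCD, and DVD are that content can easily be accessed and erased at will, that selected content can be recorded automatically under program control and replayed at a time and in a manner and order chosen by the user, and that the capacity already reached, more than a hundred gigabits, far exceeds that of optical discs, by a factor of a hundred or more. Before long its market price will be comparable to that of today's DVD players. Working with a digital broadcast network, a PVR lets users watch whatever they want whenever they want. It used to be "you broadcast, I watch": whatever you aired and whenever you aired it, I could only endure the torment of trashy advertisements and wait patiently for the little bit I wanted to see. Now what I watch and when I watch it is up to me. Emperors and princesses, please go back to your Great Qing, back to your imperial tombs! Perhaps, for the sake of a past that must not be forgotten, and so that people breathing free air can get a vivid look at the face of feudal autocracy, someone will invite you back for another performance, but whether I watch will depend on my mood. The gatekeeper's role has been dispersed and the audience holds more power of choice; or, one might say, the gatekeeper has disappeared.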
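Digitization combined with networking makes direct feedback from sink to source possible and strengthens the feedback function of broadcasting as a mass medium. The first direct result is on-demand viewing of programs, such as ordering blockbuster films. The second is interactive television, in which viewers take part in creating the program, as in a drama whose plot offers different choices, or choose the angle and manner of viewing as they wish, as in live coverage of a football or boxing match. The third is "three-network convergence": broadcast networks can likewise offer information services, online games, and other services, turning broadcasters into providers of integrated multimedia information services. Of course, "three-network convergence" does not mean "three networks merged into one". As a powerful mass medium, broadcasting will always keep its own identity and independence, and point-to-multipoint communication will remain its basic model. Fully two-way information services will become a lucrative "sideline" for broadcasters, and the so-called "convergence" will amount to little more than a blurring of industry boundaries.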
