3.3 Signal-Processing Elements
Many classical signal-processing procedures have become deeply embedded in the multidimensional fields. A key driver is optimization for representation of multimedia components, as well as the associated storage and delivery requirements. The optimization procedures range from very simple to sophisticated. Some of the principal techniques are the following:
Nonlinear analog (video and audio) mapping
Quantization of the analog signal
Motion representation and models
A nonlinear analog (video and audio) mapping procedure may be purely analog. It may be intended to enhance the delivery process, or it may be introduced to mask the limitations of various components of the overall multimedia chain. Typical constraints are bandwidth limitations and the constrained dynamic range of the display terminal.
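As one illustration of a nonlinear mapping used to cope with constrained dynamic range, the following Python sketch implements mu-law companding, a logarithmic curve long used for audio. The choice of mu-law, the constant MU = 255 (the value used in North American telephony) and the function names compress and expand are assumptions made for this example. The text does not prescribe a particular mapping.

```python
import math

MU = 255.0  # assumed companding constant; not specified in the text


def compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1], stretching small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse mapping: recover the original amplitude from compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# A quiet sample is lifted well above its original level before delivery,
# and the receiver undoes the mapping.
quiet = 0.01
sent = compress(quiet)
received = expand(sent)
```

The mapping itself is invertible. Its benefit appears only when a lossy stage, such as a noisy channel or a quantizer, sits between compress and expand, because small amplitudes then occupy a larger share of the available range.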
Quantization of the analog signal is fundamental to any digital representation that has originated in the analog world. The quantization process is inherently lossy and fundamentally noninvertible. Although less exciting than other multimedia issues, this classical signal-processing element remains the basic constraint on performance [3.5]. Quantization techniques constitute a whole field in themselves. The major relevant issues include uniform versus nonuniform techniques and adaptive versus nonadaptive procedures [3.6].
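The lossy, noninvertible nature of quantization can be seen in a minimal Python sketch of a uniform, nonadaptive quantizer. The input range [-1, 1], the mid-rise reconstruction rule and the function names quantize and dequantize are assumptions chosen for illustration.

```python
def quantize(x, bits):
    """Uniform mid-rise quantizer on [-1, 1]: return an integer level index."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = int((x + 1.0) // step)          # floor to the enclosing interval
    return min(max(index, 0), levels - 1)   # clamp the end points


def dequantize(index, bits):
    """Reconstruct the midpoint of the interval selected by the index."""
    step = 2.0 / (2 ** bits)
    return -1.0 + (index + 0.5) * step


# Two different inputs fall into the same interval, so the original
# values cannot be told apart after quantization.
a = quantize(0.10, bits=3)
b = quantize(0.20, bits=3)
same_level = (a == b)
```

With this rule the reconstruction error never exceeds half a step, and each additional bit halves the step size. A nonuniform or adaptive design would change how the intervals are placed, not the fact that information is lost.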
Statistical concepts and applications are strongly embedded, both directly and indirectly, in the processing components associated with multimedia. This field is part of classical signal processing, and we can only highlight the major categories. Spectral analysis is fundamental to the entire range of image models for filtering and algorithm design. The procedures are critical to both visual and audio data components [3.7, 3.8]. Statistical redundancy is the basic concept upon which the entire field of data compression is based. Mathematical extension of the concept leads to the optimum transform for decorrelation. This in turn leads to the entire field of modern transform-coding technology [3.9]. Model-based representations, primarily for compression, are determined from assumed or derived statistical models. The classes of transform-coding algorithms are based on this technology [3.10]. The utility of the Fourier transform and its discrete extensions, such as the Discrete Cosine Transform (DCT), wavelets and others, is based on the principle that these transforms asymptotically approach the optimum transform, assuming reasonable statistical behavior [3.11]. Visual and audio models are fundamental to the relevant multimedia representations, primarily compression procedures. These models are based on fundamental statistical representations of the elementary components, including their evaluation by the human observer [3.12, 3.13].
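The link between statistical redundancy and transform coding can be illustrated with a small Python sketch that applies an orthonormal DCT (type II) to a smooth, highly correlated sequence. The test signal, the function name dct2 and the choice to inspect only the first two coefficients are assumptions made for this example. A practical codec would use a fast library transform, not this direct summation.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a one-dimensional sequence (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


# Assumed test data: a slow ramp, so neighboring samples are strongly correlated.
signal = [10.0 + 0.5 * i for i in range(8)]
coeffs = dct2(signal)

total_energy = sum(c * c for c in coeffs)
low_energy = sum(c * c for c in coeffs[:2])   # energy in the two lowest coefficients
fraction = low_energy / total_energy
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the samples. For this correlated input nearly all of it is packed into the lowest coefficients, which is the property a transform coder exploits when it spends few or no bits on the remainder.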
The models are:
Implementing motion detection and the associated compensation in subsequent image frames can significantly reduce the required bandwidth. When the locations of image segments in future frames are predicted successfully, the information update reduces to the motion vectors alone, so the amount of update information drops dramatically.
Combining the presence of motion in video segments with the limitations of the human visual system provides additional potential for bandwidth reduction. Because human visual acuity deteriorates when observing moving areas, image blur in these regions becomes significantly less noticeable. Consequently, additional image compression can be introduced in segments that contain motion, with minimal noticeable effect.
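A minimal Python sketch of how a motion vector might be found is exhaustive block matching with the sum of absolute differences (SAD) as the cost. The block size, the search radius, the toy 8 x 8 frames and all function names are assumptions made for illustration. The text does not specify a particular motion model or search strategy.

```python
def sad(frame_a, frame_b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + r][ax + c] - frame_b[by + r][bx + c])
        for r in range(size)
        for c in range(size)
    )


def find_motion_vector(prev, curr, y, x, size, radius):
    """Search prev for the best match of the block at (y, x) in curr.

    Returns ((dy, dx), cost), where (dy, dx) points from the block in the
    current frame to its best match in the previous frame.
    """
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), sad(curr, prev, y, x, y, x, size)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            py, px = y + dy, x + dx
            if 0 <= py <= height - size and 0 <= px <= width - size:
                cost = sad(curr, prev, y, x, py, px, size)
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
    return best, best_cost


def make_frame(top, left, size=8):
    """Toy frame: a black background with one 2 x 2 textured patch."""
    frame = [[0] * size for _ in range(size)]
    patch = [[50, 60], [70, 80]]
    for r in range(2):
        for c in range(2):
            frame[top + r][left + c] = patch[r][c]
    return frame


prev_frame = make_frame(2, 2)   # patch at row 2, column 2
curr_frame = make_frame(3, 4)   # same patch moved down 1 and right 2
vector, cost = find_motion_vector(prev_frame, curr_frame, 3, 4, size=2, radius=2)
```

When the match is exact, as in this toy case, the block in the current frame can be described by the two numbers of the vector alone, with no pixel data.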
Human vision is basically 3D. Efficient representation of a 3D signal is a major challenge of multimedia. The most common 3D techniques are based on 2D display techniques: the 3D scene is projected onto two dimensions in the rendering phase of the multimedia chain, and the proper hierarchy of object elements and behavior maintains the 3D illusion. The relevant processes include shadowing and preserving the proper behavior of hidden bodies. The required processing resources are still significant. A substantial industry produces various processing components, such as chip sets and graphics boards, to develop solutions for many diverse applications, including desktop computing. The associated technology is very effective in high-end applications. Virtual reality models using large screens are impressive even though the presentation remains 2D.
Among 3D representations, stereo projection is the best known. The same 3D scene is recorded from two slightly different perspectives, essentially replicating our eyes. The two recordings are subsequently presented to the eyes separately. Unlike the early film-based stereo recordings, modern techniques depend heavily on digital processing, which corrects for camera-projection inaccuracies and results in a significantly enhanced stereo display.
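The geometry of recording a scene from two slightly different perspectives can be sketched in Python with an idealized pinhole camera model. The parallel camera axes, the 0.065 baseline (roughly a human eye separation in meters), the unit focal length and the function names are all assumptions made for illustration. Real systems must also correct for the camera and projection inaccuracies mentioned above.

```python
def project(point, camera_x, focal_length=1.0):
    """Pinhole projection of (x, y, z) for a camera at (camera_x, 0, 0) looking along +z."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal_length * (x - camera_x) / z, focal_length * y / z)


def stereo_pair(point, baseline=0.065, focal_length=1.0):
    """Project one 3D point into the left and right views."""
    left = project(point, -baseline / 2.0, focal_length)
    right = project(point, +baseline / 2.0, focal_length)
    return left, right


def disparity(point, baseline=0.065, focal_length=1.0):
    """Horizontal offset between the two views; it equals focal_length * baseline / z."""
    left, right = stereo_pair(point, baseline, focal_length)
    return left[0] - right[0]


near = disparity((0.0, 0.0, 1.0))
far = disparity((0.0, 0.0, 4.0))
```

In this model the vertical coordinates of the two views agree, and the horizontal disparity shrinks in inverse proportion to depth. That difference between the views is what conveys depth when each view is presented to the corresponding eye.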
Projection techniques comprise an effective group of methods for recreating multidimensionality from individual projections through the original object. Although this technology has been used very effectively in medical applications, it is not likely to be useful for multimedia applications in the near future. The primary limitations are complexity and the lack of an easy real-time implementation [3.14].
Color plays a very important role in efficient representation for processing, modeling and communication applications. The correlation properties among color planes are exploited in image and video compression algorithms.
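One common way to exploit the correlation among color planes is to convert RGB into one luminance plane and two color-difference planes before compression. The Python sketch below uses the ITU-R BT.601 luma weights with rounded chroma scale factors and no offsets. The choice of this particular color space and the function names are assumptions made for illustration, since the text does not name a specific transform.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert RGB to luminance (Y) and two color-difference signals (Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr (exact up to floating-point rounding)."""
    r = y + cr / 0.713
    b = y + cb / 0.564
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


# For a gray pixel the three RGB planes carry the same value (fully correlated);
# after conversion all of it sits in Y and both difference signals vanish.
gray = rgb_to_ycbcr(128, 128, 128)
```

After such a conversion most of the signal energy in natural images lies in the Y plane, so the two color-difference planes can usually be represented more coarsely.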