Compressed Video Communications
Abdul Sadka
Copyright © 2002 John Wiley & Sons Ltd
ISBNs: 0-470-84312-8 (Hardback); 0-470-84671-2 (Electronic)

6 Video Transcoding for Inter-network Communications

S. Dogan, A.H. Sadka

6.1 Introduction

Due to the expansion and diversity of multimedia applications and the underlying networking platforms with their associated communication protocols, there has been a growing need for inter-network communications and media gateways. Eventually, these applications will encounter compatibility problems. Not only will asymmetric networks run different sets of communication protocols, but they will also operate various kinds of incompatible source coding algorithms that are characterised by different target bit rates and compression techniques. Therefore, the interoperability of these source coders necessitates the presence of a control unit which acts as a media traffic gateway lying on the borders of the underlying networking platforms. This chapter is dedicated to the investigation of various methods which achieve the interoperability of compressed video streams while taking into consideration the application-driven constraints and the varying network conditions. The video transcoding algorithms are examined and analysed, and their performances are evaluated using both subjective and objective methods.

6.2 What is Transcoding?

Video transcoding comprises the necessary operations for the conversion of a compressed video stream from one syntax to another for inter-network communications. Thus, the tool that makes use of this algorithm to perform the necessary conversions is called a video transcoder. The original idea behind video transcoding was the scaleability of video coding techniques (Ghanbari, 1989; Radha and Chen, 1999). These techniques comprise a layered video encoder structure that provides different layers of compressed video, with each layer coded at a different bit rate.
Scaleability allows the video coder to produce different video streams at different bit rates and QoS levels using only a single video source. At the time, this was necessary due to the wide deployment of video-on-demand (VoD) applications, where high-resolution high-quality video was required for delivery to network subscribers with bandwidth-limited or congested links. In such cases, the most appropriate low bit rate version of the bit stream could be chosen at the expense of smaller resolution and lower perceptual quality. Layering was accomplished with one base layer providing the minimum requirements for the reconstruction of low bit rate video and several enhancement layers (on top of the base layer) for enhanced quality resulting in increased bit rates. According to the varying network conditions, adequate bit rates were achieved by selecting either the base layer only or the base plus one or more enhancement layers. However, scaleable encoding required the use of complex scaleability techniques, leading to extra processing power requirements and additional delays resulting in complex and sub-optimal video encoder and decoder implementations. Besides complexity, the frequent changes in network conditions and constraints require necessary actions to be taken at a different location (other than encoder and decoder) within the network. This specific location, as seen in Figure 6.1, is referred to as a video proxy, which enables faster network responses.
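The layer selection described above (base layer only, or base plus one or more enhancement layers, depending on network conditions) can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the function name `select_layers` and the layer rates are hypothetical, and a real scaleable coder would also account for delay and loss.

```python
def select_layers(layer_rates_kbps, available_kbps):
    """Return (number of layers sent, total rate used).

    layer_rates_kbps[0] is the base layer; later entries are enhancement
    layers, each of which is only useful if all lower layers are also sent.
    """
    total = 0
    chosen = 0
    for rate in layer_rates_kbps:
        if total + rate > available_kbps:
            break  # this layer no longer fits, so stop adding layers
        total += rate
        chosen += 1
    return chosen, total


# Hypothetical rates: a 64 kbit/s base layer plus two enhancement layers.
layers = [64, 64, 128]
for bandwidth in (50, 64, 200, 256):
    count, used = select_layers(layers, bandwidth)
    print(bandwidth, count, used)
```

With these assumed rates, a 200 kbit/s link carries the base layer and one enhancement layer (128 kbit/s in total), while a 50 kbit/s link cannot carry even the base layer.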
The video proxy helps the video encoders and decoders remain free of unnecessary complexities incurred by the scaleability algorithms. A video proxy can consist of a single or a group of video transcoders operating simultaneously.

Figure 6.1 A heterogeneous multimedia networking scenario using a transcoder at the video proxy (diagram: a transmitting source and a video proxy interconnecting Network-1 to Network-4; the streams shown are MPEG-2, H.263 and MPEG-4 at CIF or QCIF resolution, 20 or 25 fr/s, and rates from 64 kbit/s to 4 Mbit/s or more, over error-prone and congested channels)

Therefore, video transcoding is a process whereby an incoming compressed video stream is converted to a different video format, size, transmission rate or simply translated to a new syntax without the need for the full decoding/re-encoding operations, as depicted in Figure 6.2. Using transcoding, the complexity, processing power and delay incurred by the necessary conversion operations are kept minimal while achieving an improvement to the decoded video quality (Bjork and Christopoulos, 1998; Kan and Fan, 1998; Keesman et al., 1996).

Figure 6.2 Video transcoding (diagram: a stream from network-1 with compression algorithm A1, bit rate BR1, frame rate FR1 and resolution RES1 enters the video transcoder and leaves for network-2 with compression algorithm A1 or A2, bit rate BR1 or BR2, frame rate FR1 or FR2 and resolution RES1 or RES2)

Four major types of video transcoding algorithms have been proposed and presented (Assuncao and Ghanbari, 1996; Kan and Fan, 1998; Keesman et al., 1996; de los Reyes et al., 1998; Warabino et al., 2000; Youn, Sun and Xin, 1999; Youn and Sun, 2000).
The most commonly discussed type is homogeneous video transcoding, which comprises bit rate, frame rate and/or resolution reduction algorithms for varying transmission conditions. Heterogeneous video transcoding has become popular as diverse multimedia networks have emerged and become operational. Moreover, the third and fourth types are gaining increasing attention for error resilience applications and multimedia traffic planning purposes.

6.3 Homogeneous Video Transcoding

Homogeneous video transcoding algorithms aim to reduce the bit rate, frame rate and/or resolution of the pre-encoded video stream. The reason they are called homogeneous transcoding methods is that they do not involve any kind of syntax modifications to coded video data. Therefore, the incoming compressed video stream preserves its format and compression characteristics after it has been converted to a lower rate or resolution, as illustrated in Figure 6.3. By using the incoming video bit stream as input to the video transcoder, it is possible to transmit the transcoded video data onto the communication channels that have different bandwidth requirements, and at various output bit rates. This very important feature gives support for multipoint video conferencing scenarios.

Figure 6.3 Homogeneous video transcoding (diagram: video coded with Video Coding Standard-X at a higher bit rate, higher frame rate and higher resolution is converted to video in the same standard at a lower bit rate, lower frame rate and lower resolution)

There are two methods for combining multiple video streams to achieve successful video conferencing, namely the coded domain combiner and transcoding. The former is a simpler, less complex process, whereby the outgoing video stream is obtained by concatenating the incoming multiple video streams. Thus, the combined bit rate is the sum of the bit rates of all the incoming video streams. This method distributes the available bandwidth evenly among all the participants of a videoconferencing session.
Therefore, the input/output bit rates for each user become highly asymmetric, and bandwidth is allocated to video sources regardless of their activity. On the other hand, the latter method, namely transcoding, partially decodes each of the incoming video streams, combines them in the pixel domain and re-encodes the video data in the form of a single video stream. This method provides every user with full bandwidth and uniform video quality due to the re-encoding of high motion areas of active conference participants with higher bit rates. Obviously, this second method incurs a higher complexity than the simpler combination method (Sun, Wu and Hwang, 1998). Similarly, Lin, Liou and Chen (2000) present a dynamic rate control method that operates in the video transcoder to enhance the visual quality and allow region of interest (ROI) coding in multipoint video conferencing. This method firstly identifies the active conference participants from the multiple incoming video streams. Then the motion active streams are transcoded with a more optimised bit allocation approach at the expense of relatively reduced qualities provided to inactive users.

Research into homogeneous video transcoding has been boosted by the increasing popularity of VoD applications. Since VoD data is encoded as a high quality, high resolution and high bit rate MPEG-2 stream (i.e. a few Mbit/s), reducing the rate is at times necessary, particularly when an end-user cannot handle the rate of the original video stream. This rate reduction is also necessary in bandwidth-limited networks or even at congested network nodes. Not only the original rate, but also sometimes the original spatial video resolution needs to be reduced (such as CIF to QCIF) as end-users are equipped with smaller resolution displays.
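To make the contrast between the two conferencing approaches concrete, the sketch below compares the coded domain combiner, whose output rate is simply the sum of the incoming rates, with an activity-weighted bit allocation of the kind a transcoder could apply. The proportional weighting, the per-user floor and the names `combined_rate` and `allocate_by_activity` are assumptions made for illustration; they are not the rate control algorithm of Lin, Liou and Chen (2000).

```python
def combined_rate(stream_rates_kbps):
    """Coded domain combiner: the outgoing rate is the sum of incoming rates."""
    return sum(stream_rates_kbps)


def allocate_by_activity(total_kbps, activity, floor_kbps=0.0):
    """Split total_kbps among participants in proportion to their activity.

    activity   -- non-negative motion-activity scores, one per participant
    floor_kbps -- minimum rate guaranteed to every participant (assumed knob)
    """
    n = len(activity)
    spare = total_kbps - floor_kbps * n
    if spare < 0:
        raise ValueError("total rate cannot cover the per-user floor")
    weight_sum = sum(activity)
    if weight_sum == 0:
        # Nobody is active: fall back to an even split.
        return [total_kbps / n] * n
    return [floor_kbps + spare * a / weight_sum for a in activity]


# Four participants sending 64 kbit/s each; only the first two are moving.
print(combined_rate([64, 64, 64, 64]))                       # 256
print(allocate_by_activity(256, [3, 1, 0, 0], floor_kbps=16))
# [160.0, 64.0, 16.0, 16.0]
```

In this hypothetical example the same 256 kbit/s budget is redistributed so that the most active participant receives 160 kbit/s, at the expense of the two inactive ones.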
6.4 Bit Rate Reduction

Bit rate reduction algorithms have been the most popular research topic among all the different video transcoding schemes available so far, due to considerable interest in VoD applications. Examples of standard rate conversions can easily be found in the literature for high bit rate video transmissions, such as conversions from a few Mbit/s down to a few hundred kbit/s. However, due to the deployment of mobile wireless interfaces and satellite links, conversions from high to low rates and from low to very low bit rates (i.e. from a few Mbit/s to a few hundred kbit/s or from a few hundred kbit/s to a few tens of kbit/s) have also become increasingly important.

As described in Chapter 3, the incoming bit rates can be down-scaled either by arbitrarily selecting the high-frequency discrete cosine transform (DCT) coefficients first and then simply discarding (truncating) them (Assuncao and Ghanbari, 1997) or by performing a re-quantisation process with a coarser quantisation step-size (Nakajima, Hori and Kanoh, 1995; Sun, Wu and Hwang, 1998; Werner, 1999). Both methods reduce the number of DCT coefficients by causing a number of them to become zero coefficients, thereby reducing the number of non-zero coefficients to be coded. This gives rise to a lower bit rate at the output of the transcoder.

One of the bit rate reduction methods is the re-quantisation of the transform coefficients, as already discussed in Chapter 3. Re-quantisation is achieved by the use of built-in scalar quantisers in MPEG video standards. A second approach has been introduced by Lois and Bozoki (1998). Instead of using the scalar quantisation in the transcoder, a lattice vector quantiser (LVQ) is applied to exceed the MPEG compression capabilities while providing acceptable quality. LVQ is a multidimensional generalisation of uniform step scalar quantisers which produces minimal distortion for a certain input of uniform distribution.
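As a rough illustration of the two scalar approaches described in this section (discarding high-frequency coefficients and re-quantising with a coarser step size), the sketch below operates on one block of quantised DCT levels listed in zigzag order. It assumes a plain uniform quantiser with a single step size; real MPEG quantisers also use weighting matrices and dead zones, and the function names and sample values are hypothetical.

```python
def dequantise(levels, step):
    """Reconstruct coefficient values from quantised levels."""
    return [lv * step for lv in levels]


def quantise(coeffs, step):
    """Uniform quantiser; magnitudes are rounded to the nearest level."""
    out = []
    for c in coeffs:
        mag = int(abs(c) / step + 0.5)
        out.append(mag if c >= 0 else -mag)
    return out


def requantise(levels, step_in, step_out):
    """De-quantise with the incoming step, re-quantise with a coarser one."""
    return quantise(dequantise(levels, step_in), step_out)


def truncate_high_freq(levels, keep):
    """Keep the first `keep` zigzag-ordered levels and zero the rest."""
    return levels[:keep] + [0] * (len(levels) - keep)


def nonzero(levels):
    """Count the non-zero levels that would still have to be coded."""
    return sum(1 for lv in levels if lv != 0)


# Hypothetical block: ten quantised levels in zigzag order, step size 4.
block = [12, -5, 3, 0, 2, -1, 1, 0, 0, 1]
coarse = requantise(block, step_in=4, step_out=12)
cut = truncate_high_freq(block, keep=4)
print(nonzero(block), coarse, nonzero(coarse))  # 7 [4, -2, 1, 0, 1, 0, 0, 0, 0, 0] 4
print(cut, nonzero(cut))                        # [12, -5, 3, 0, 0, 0, 0, 0, 0, 0] 3
```

Both operations leave fewer non-zero levels to be entropy coded (4 and 3 instead of 7 in this example), which is the mechanism by which the output bit rate drops.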
The codebook storage is not required and the search complexity is simplified. LVQ allows the quantisation errors to be more uniform in the transcoded pictures, and hence smaller artefacts are visible on the edges. However, the drawback of the algorithm is that LVQ transcoding leads to MPEG-incompatible bit streams. Therefore, a low complexity and low cost user interface is also needed, which involves the LVQ decoder and the MPEG entropy encoding engine. The output can then be directly fed into an MPEG video decoder at the very end of the telecommunication system.

The DCT is a widely used method in most of the current image and video compression standards, such as JPEG, MPEG and the H.26X series. Guo, Au and Letaief (2000) present three distribution parameter estimation methods based on the de-quantised values of DCT coefficients used in the transcoding schemes. The methods achieve good transcoding qualities even for fixed rate scenarios.

Bit rate reduction can be accomplished using one of five different schemes. The first one is the conventional cascaded fully decoding/re-encoding scheme. The ...