Wednesday, 30 January 2013

Video Compression Standards


In the earlier post, the basics of digital video compression were discussed. This post covers five broad application areas of video, along with the H.26x and MPEG-4 families of video compression standards.
  In all these applications, video frames are acquired by a camera, compressed, and stored in non-volatile memory. Based on the requirement, the compressed video is then transmitted elsewhere. The quality of video depends on the following factors: the number of frames per second, the frame resolution, and the pixel depth. A High Definition (HD) video runs at 25 frames per second, and each frame has 1920 x 1080 pixels. A pixel consumes 24 bits to represent its RGB values. Video quality has a direct bearing on cost, so one has to understand the requirement first and fix the video quality accordingly to keep the cost at a minimum.
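To see why compression is essential, it helps to work out the raw data rate of the HD stream described above. Here is a minimal sketch in Python, using only the figures quoted in this post:

```python
# Raw (uncompressed) data rate of an HD video stream,
# using the figures quoted above.
WIDTH, HEIGHT = 1920, 1080   # pixels per frame
FPS = 25                     # frames per second
BITS_PER_PIXEL = 24          # 8 bits each for R, G, B

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bits_per_second = bits_per_frame * FPS

print(f"Raw bitrate: {bits_per_second / 1e6:.0f} Mbit/s")         # ~1244 Mbit/s
print(f"One hour:    {bits_per_second * 3600 / 8 / 1e9:.0f} GB")  # ~560 GB
```

At roughly 1.2 Gbit/s raw, an hour of uncompressed HD needs over half a terabyte, which is why every application below relies on compression.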

1. Studio:
In video and film production, the video shot on a set or location is called raw footage. The footage is then taken up for editing. Here, video processing operations such as colour space conversion, noise reduction, motion compensation and frame rate conversion are carried out, based on the requirement. After this, the director and the editor sit together, remove unwanted portions, and rearrange the footage into a movie. Some quality is lost at each editing step; to compensate for this, the raw footage should be of the highest possible quality.

2. Television:
         Digital television signals are broadcast through terrestrial transmitters or by satellite transponders. Digital terrestrial broadcast is popular in the USA and Europe. Digital video is economically stored and distributed on Video Compact Disc (VCD) and Digital Versatile Disc (DVD). In news clips the frame-to-frame transition is small; in sports and action movies it is high. Digital video signals were originally optimized for standard-definition TVs (the old NTSC, PAL and SECAM systems). Earlier, the MPEG-1 video compression standard was used; now MPEG-2 is used to get HD (HD720, HD1080) quality.

Figure 1.  Frame sizes used by different TV standards. 
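As a rough quantitative companion to Figure 1, the sketch below compares the pixel counts of some common TV frame sizes. The resolutions listed are the usual published values for each format, not taken from the figure itself:

```python
# Pixels per frame for common TV frame sizes.
# Resolutions are the usual published values for each format.
frame_sizes = {
    "VCD (NTSC SIF)": (352, 240),
    "DVD (PAL)":      (720, 576),
    "HD720":          (1280, 720),
    "HD1080":         (1920, 1080),
}

for name, (w, h) in frame_sizes.items():
    print(f"{name:16s} {w}x{h} = {w * h / 1e6:.2f} Mpixels")
```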
3. Internet streaming:
 In video streaming, data is sent continuously to the client over the Internet, and the user is able to decode and view the video in near real time. The Internet is slowly becoming a giant video server; YouTube, Metacafe and Dailymotion are a few examples of popular video servers. The file formats used by these servers include MOV, FLV, MPEG-4 and 3GP. These are called containers (or wrappers), and they carry meta-data along with the coded video. The video codecs used are MPEG-4, H.264, Sorenson Spark, VC-1, etc. [1]. The available video resolutions are 240p, 360p and HD. In streaming, latency (time delay) is the greatest problem, a problem unheard of in broadcast technologies. Streaming servers normally do not allow the content to be stored, but online tools are available to save it to a local hard disk.
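One visible source of streaming latency is the start-up buffer: the player must download a few seconds of video before playback can begin. A minimal sketch of that arithmetic follows; the bitrate, buffer length and throughput figures are illustrative assumptions, not values from this post:

```python
# Start-up delay for a streaming player, as a function of
# stream bitrate, buffer target, and network throughput.
# All figures below are illustrative assumptions.
stream_bitrate = 2.0e6   # 2 Mbit/s coded video
buffer_target = 5.0      # seconds of video buffered before playback starts
throughput = 4.0e6       # 4 Mbit/s network throughput

bits_needed = stream_bitrate * buffer_target
startup_delay = bits_needed / throughput
print(f"Start-up delay: {startup_delay:.1f} s")   # 2.5 s

# Playback stalls whenever throughput drops below the stream
# bitrate for longer than the buffer can cover.
```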

4. Video conferencing:
        The next big thing is video conferencing, which may be one-to-one or a conference call. The foundation for video telephony was laid about 40 years ago. Integrated Services Digital Network (ISDN) technology was built to handle video telephony, and a new compression standard, H.261, was developed for it. At that time video telephony was not commercially successful and stood only as a technological feat. The advent of third-generation (3G) wireless technologies revived interest in video telephony and conferencing. A video conferencing system has much more stringent latency requirements: humans can tolerate a loss in visual quality, but not latency. Now the H.264 protocol is widely used. The video resolution is typically 352 x 288 (CIF), i.e. one quarter the size of a PAL TV frame.
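To get a feel for how much compression a video call needs, one can compare the raw bitrate of a CIF stream against a typical channel. A small sketch follows; the 15 fps frame rate and the 384 kbit/s channel (six 64 kbit/s ISDN channels) are illustrative assumptions:

```python
# How much compression does a CIF video call need?
# Frame rate and channel capacity are illustrative assumptions.
WIDTH, HEIGHT = 352, 288   # CIF, as quoted above
FPS = 15                   # common conferencing frame rate (assumed)
BITS_PER_PIXEL = 24        # raw RGB

raw_bitrate = WIDTH * HEIGHT * BITS_PER_PIXEL * FPS
channel = 384e3            # e.g. six ISDN channels of 64 kbit/s each

print(f"Raw CIF bitrate:    {raw_bitrate / 1e6:.1f} Mbit/s")   # ~36.5 Mbit/s
print(f"Compression needed: {raw_bitrate / channel:.0f}:1")    # ~95:1
```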

5. Surveillance:
     Falling prices of surveillance video systems, along with their proven ability in crime prevention and detection, have led to wider deployment. The video should be of high enough quality to recognize a suspect's face, and the video content should not be altered; if it is altered, it will not be accepted as evidence in a court of law. Motion JPEG, H.264 and MPEG standards are used for recording surveillance video: real-time monitoring systems use the H.264 and MPEG video codecs, while Motion JPEG (MJPEG) codecs are employed to capture individual frames. The requirements of the entertainment and surveillance industries are totally different; poor lighting and 24x7 storage requirements are unique to surveillance applications [2].
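The 24x7 storage requirement translates directly into disk capacity. A minimal sketch of the sizing arithmetic follows; the per-camera bitrate, camera count and retention period are illustrative assumptions:

```python
# Disk capacity needed for continuous surveillance recording.
# Bitrate, camera count, and retention period are illustrative assumptions.
bitrate = 1.0e6      # 1 Mbit/s per camera after compression
cameras = 8
retention_days = 30

seconds = retention_days * 24 * 3600
bytes_total = bitrate / 8 * seconds * cameras
print(f"Storage needed: {bytes_total / 1e12:.1f} TB")   # ~2.6 TB
```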

Video compression standard interrelationships:

          There is a long list of video compression standards available. A careful study of the various standards will reveal a lot of commonality among them. MPEG and H.26x stand out as the top contenders.

I. MPEG:
            The Moving Picture Experts Group (MPEG) is a working group that develops digital video standards for the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). These standards were built for the entertainment industry. In 1993 MPEG-1 was introduced to store digital video with a quality equal to VHS tape. In 1994 MPEG-2 was developed to handle HD video; MPEG-3 was merged into MPEG-2. MPEG-4 was introduced in 1999; its core video coding is still based on the Discrete Cosine Transform (DCT), with a wavelet-based mode only for still textures. Later variants like MPEG-7 and MPEG-21 are also available [3],[4].

II. H.26x:
           The International Telecommunication Union's telecommunication standardization sector (ITU-T) was responsible for developing the H.26x series of standards. These standards were built to handle video calls, where the frame-to-frame transition is small: most of the time the picture is a human face that moves only mildly. H.26x is network resilient and has very low latency. To reduce latency, 'B' frames are avoided in the coded stream. As the family evolved from telephone systems, it works in 64 kbit/s chunks (H.261 was designed for p x 64 kbit/s ISDN channels). Here also the DCT is used.
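Why do B-frames add latency? A B-frame depends on a *future* reference frame, so the encoder must reorder frames before sending them, and the decoder must wait for the forward reference to arrive. A small sketch of the reordering; the I-B-B-P pattern shown is a common illustrative example, not prescribed by any one standard:

```python
# Display order vs. transmission (decode) order for a simple
# I-B-B-P pattern. Each B-frame needs the *following* I/P frame
# as a reference, so that frame must be sent first.
display_order = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]

# Transmission order: each forward reference (I/P) is moved
# ahead of the B-frames that depend on it.
transmit_order = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]

print("Display: ", " ".join(display_order))
print("Transmit:", " ".join(transmit_order))
# The decoder cannot show B1 until P3 has arrived -- that waiting
# time is exactly the latency a video call avoids by skipping B-frames.
```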
            Later, H.262 was developed, which is the same standard as MPEG-2 video. Then came H.263. In 2001 the developers of H.26x, the Video Coding Experts Group (VCEG), joined with ISO/IEC MPEG to form the Joint Video Team (JVT). They built H.264/MPEG-4 Part 10, otherwise called Advanced Video Coding (AVC). MPEG-4 has 16 parts, and the 10th part deals with video coding. 4:2:0 sampling is used, and both progressive and interlaced scanning are permitted.
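The 4:2:0 sampling mentioned above stores the chroma (colour) components at half resolution both horizontally and vertically, which alone halves the raw data. A minimal sketch of that arithmetic for one HD1080 frame at 8 bits per sample:

```python
# Bytes per frame for 4:4:4 vs 4:2:0 sampling, 8 bits per sample.
W, H = 1920, 1080

full = W * H * 3                       # 4:4:4 -- Y, Cb, Cr at full resolution
sub = W * H + 2 * (W // 2) * (H // 2)  # 4:2:0 -- chroma halved both ways

print(f"4:4:4 frame: {full / 1e6:.2f} MB")   # 6.22 MB
print(f"4:2:0 frame: {sub / 1e6:.2f} MB")    # 3.11 MB
```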

III. Motion JPEG:
          It was developed in 1992. Only intra-frame coding is used: put simply, each frame is a JPEG image, and inter-frame coding is never used. Because of this, the compression efficiency is poor, but the latency is relatively low and the stream is more resilient to errors. One may wonder how MJPEG differs from JPEG: in MJPEG, 16 JPEG frames are shown within a second to create an illusion of motion. It consumes more storage but it carries more information, and its frames can be used as evidence in a court of law; MPEG systems, by contrast, send only two to four full frames (I-frames) per second to the receiver. Now MJPEG2000, which is similar to JPEG2000, has been introduced. It uses the Wavelet transform instead of the DCT; it is computationally tedious, but its compression efficiency is high [4].
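Because every MJPEG frame is a standalone JPEG, writing such a stream is straightforward. A minimal sketch using OpenCV, whose 'MJPG' FourCC selects the Motion JPEG codec; OpenCV itself, the synthetic grey frames, and the output file name are assumptions for illustration:

```python
import cv2
import numpy as np

# Write one second of Motion JPEG video: 16 independent JPEG frames,
# matching the frame rate quoted above. Frames here are synthetic.
fourcc = cv2.VideoWriter_fourcc(*"MJPG")
writer = cv2.VideoWriter("out.avi", fourcc, 16.0, (352, 288))

for i in range(16):
    # Each frame is coded on its own -- no reference to its neighbours.
    frame = np.full((288, 352, 3), i * 16, dtype=np.uint8)
    writer.write(frame)

writer.release()
```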

Sources:
 [1] Yassine Bechqito, High Definition Video Streaming Using H.264 Video Compression, Master's Thesis, Helsinki Metropolia University of Applied Sciences, pp. 18, 21 (PDF, 3642 KB)
[2] http://www.initsys.net/attachments/Compression and DigitisationPDF.pdf (PDF, 242 KB)
[3] Iain E. G. Richardson, “H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia,” John Wiley & Sons Ltd, 2003, ISBN 0-470-84837-5. (The examples are very good.)
[4] Salent Compression Report, http://www.salent.co.uk/downloads/Salent-Compression-Report.pdf (PDF, 1921 KB)