Deck 5: Lossy Compression Algorithms, Image Compression Standards and Basic Video Compression Techniques

Full screen (f)
exit full mode
Question
Explain why the factorization steps in Eqs. (8.24) and (8.25) save CPU cycles. About how much faster is the factorized approach to the DCT than the straightforward definition in (8.17)? (In both approaches, assume that of course the cosine matrix values are pre-stored outside any loop.)
Use Space or
up arrow
down arrow
to flip the card.
Question
The top part of Figure 8.1 shows the DCT F(u)F(u) for a 1-D signal f(i)f(i) .
i. What does F(5)F(5) tell you in terms of the image content in f(i)f(i) ?
ii. What is the reason that F(5)F(5) is less than 0 ?
(b) The Inverse Discrete Cosine Transform (IDCT) can be viewed as a reconstruction process in which the Cosine basis functions are scaled and added one by one to reconstruct f~(i)\tilde{f}(i) . Assume F(0)=500F(0)=500 and F(1)=70F(1)=-70 ; in the space provided at the bottom part of the figure,
i. draw f~(i)\tilde{f}(i) after only the F(0)F(0) is used,
ii. draw f~(i)\tilde{f}(i) after F(0)F(0) and F(1)F(1) are used.
Show all details including magnitudes, and explain why they look like that.
Question
Suppose we have 32 samples from a mono audio file.
(a) If the audio was sampled at 8kHz8 \mathrm{kHz} , how many seconds does this represent?
(b) If we carry out a DCT with blocksize equal to 32, how many DCT coefficients do we end up with?
(c) Numbering the DCT coefficients from 0,
i) What frequency, in Hz\mathrm{Hz} , does coefficient number 1 represent?
ii) What frequency, in Hz\mathrm{Hz} , does coefficient number 16 represent?
Question
Another name for zig-zag coding is "zonal coding". Suppose you invent a new zonal coding scheme for JPEG that simply discards diagonals above the first few. Suppose we keep the first six zig-zag lines.
(a) How many coefficients are we keeping?
(b) How will we do, compared to keeping all the zig-zags (still using run-length encoding)? Comment on both compression capability and image quality.
Question
Briefly explain why DCT was chosen for JPEG compression.
Question
Why does JPEG give compression? At which points in the algorithm does compression occur? Please estimate (as a general guess) how much compression occurs at each of these points.
Question
Explain why the block DCT is preferred to taking the whole image DCT in JPEG compression.
Question
Describe how H.261 deals with temporal and spatial redundancies in video.
Question
Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).)
To simplify the data, we assume that the size of macroblocks is 8×88 \times 8 instead of 16×1616 \times 16 .
Intensity Values for the First Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:  <div style=padding-top: 35px>
Intensity Values for the Second Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:  <div style=padding-top: 35px>
Question
We have seen that a logarithmic-based block search strategy is fast: it is "suboptimal but still usually effective." That being the case, why should we be bothered carrying out a full search in any situation? Why not just stick to the fast method - what is the advantage to be gained, if any, from a comprehensive, optimal search strategy?
Unlock Deck
Sign up to unlock the cards in this deck!
Unlock Deck
Unlock Deck
1/10
auto play flashcards
Play
simple tutorial
Full screen (f)
exit full mode
Deck 5: Lossy Compression Algorithms, Image Compression Standards and Basic Video Compression Techniques
1
Explain why the factorization steps in Eqs. (8.24) and (8.25) save CPU cycles. About how much faster is the factorized approach to the DCT than the straightforward definition in (8.17)? (In both approaches, assume that of course the cosine matrix values are pre-stored outside any loop.)
Not Answer
2
The top part of Figure 8.1 shows the DCT F(u)F(u) for a 1-D signal f(i)f(i) .
i. What does F(5)F(5) tell you in terms of the image content in f(i)f(i) ?
ii. What is the reason that F(5)F(5) is less than 0 ?
(b) The Inverse Discrete Cosine Transform (IDCT) can be viewed as a reconstruction process in which the Cosine basis functions are scaled and added one by one to reconstruct f~(i)\tilde{f}(i) . Assume F(0)=500F(0)=500 and F(1)=70F(1)=-70 ; in the space provided at the bottom part of the figure,
i. draw f~(i)\tilde{f}(i) after only the F(0)F(0) is used,
ii. draw f~(i)\tilde{f}(i) after F(0)F(0) and F(1)F(1) are used.
Show all details including magnitudes, and explain why they look like that.
Not Answer
3
Suppose we have 32 samples from a mono audio file.
(a) If the audio was sampled at 8kHz8 \mathrm{kHz} , how many seconds does this represent?
(b) If we carry out a DCT with blocksize equal to 32, how many DCT coefficients do we end up with?
(c) Numbering the DCT coefficients from 0,
i) What frequency, in Hz\mathrm{Hz} , does coefficient number 1 represent?
ii) What frequency, in Hz\mathrm{Hz} , does coefficient number 16 represent?
(a) 32/8000sec==4msec32/8000 \mathrm{sec}==4 \mathrm{msec}
(b) 32
(c) i) 1/21/2 cycle per 32 samples
=1/2=1/2 cycle per 32/800032 / 8000 sec
=1/2/(32/8000)=1/2 /(32/8000) cycles-per-sec =4000/32=4000 / 32 cycles-per-sec =125 Hz=125 \mathrm{~Hz}
 (a)  32/8000 \mathrm{sec}==4 \mathrm{msec}  (b) 32 (c) i)  1/2  cycle per 32 samples  =1/2  cycle per  32 / 8000  sec  =1/2 /(32/8000)  cycles-per-sec  =4000 / 32  cycles-per-sec  =125 \mathrm{~Hz}    Fig. 8.1: DCT for a 1-D signal; reconstruction using only  F(0)  as opposed to only  F(0)  and  F(1) . ii)  =8  cycles per  32/8000 \mathrm{sec}   =8 /(32/8000)  cycles-per-sec  =2 \mathrm{kHz}  (Note that the max frequency we analyze for is  4 \mathrm{kHz} .) BEGIN not\_on\_exam:  \begin{aligned} & \# \text { S-plus language script. } \\ & \# \text { 1Ddct.s } \\ & \text { blocksize }=32 \\ & \text { pi }=3.14159265 \\ & \text { factor }=2.0 / \text { blocksize } \\ & \text { denom }=\left(2.0^{*} \text { blocksize }\right) / p i \\ & \text { Lambda }=\text { function }(u)\{ \\ & \text { if }(u==0) \text { return }(1 / \operatorname{sqrt}(2))\\ & \text{else return(1)}\\ & \}~\# \text{end of Lambda} \end{aligned}    END not on exam. Fig. 8.1: DCT for a 1-D signal; reconstruction using only F(0)F(0) as opposed to only F(0)F(0) and F(1)F(1) .
ii) =8=8 cycles per 32/8000sec32/8000 \mathrm{sec}
=8/(32/8000)=8 /(32/8000) cycles-per-sec =2kHz=2 \mathrm{kHz}
(Note that the max frequency we analyze for is 4kHz4 \mathrm{kHz} .)
BEGIN not\_on\_exam:
# S-plus language script. # 1Ddct.s  blocksize =32 pi =3.14159265 factor =2.0/ blocksize  denom =(2.0 blocksize )/pi Lambda = function (u){ if (u==0) return (1/sqrt(2))else return(1)} #end of Lambda\begin{aligned}& \# \text { S-plus language script. } \\& \# \text { 1Ddct.s } \\& \text { blocksize }=32 \\& \text { pi }=3.14159265 \\& \text { factor }=2.0 / \text { blocksize } \\& \text { denom }=\left(2.0^{*} \text { blocksize }\right) / p i \\& \text { Lambda }=\text { function }(u)\{ \\& \text { if }(u==0) \text { return }(1 / \operatorname{sqrt}(2))\\& \text{else return(1)}\\& \}~\# \text{end of Lambda}\end{aligned}
 (a)  32/8000 \mathrm{sec}==4 \mathrm{msec}  (b) 32 (c) i)  1/2  cycle per 32 samples  =1/2  cycle per  32 / 8000  sec  =1/2 /(32/8000)  cycles-per-sec  =4000 / 32  cycles-per-sec  =125 \mathrm{~Hz}    Fig. 8.1: DCT for a 1-D signal; reconstruction using only  F(0)  as opposed to only  F(0)  and  F(1) . ii)  =8  cycles per  32/8000 \mathrm{sec}   =8 /(32/8000)  cycles-per-sec  =2 \mathrm{kHz}  (Note that the max frequency we analyze for is  4 \mathrm{kHz} .) BEGIN not\_on\_exam:  \begin{aligned} & \# \text { S-plus language script. } \\ & \# \text { 1Ddct.s } \\ & \text { blocksize }=32 \\ & \text { pi }=3.14159265 \\ & \text { factor }=2.0 / \text { blocksize } \\ & \text { denom }=\left(2.0^{*} \text { blocksize }\right) / p i \\ & \text { Lambda }=\text { function }(u)\{ \\ & \text { if }(u==0) \text { return }(1 / \operatorname{sqrt}(2))\\ & \text{else return(1)}\\ & \}~\# \text{end of Lambda} \end{aligned}    END not on exam. END not on exam.
4
Another name for zig-zag coding is "zonal coding". Suppose you invent a new zonal coding scheme for JPEG that simply discards diagonals above the first few. Suppose we keep the first six zig-zag lines.
(a) How many coefficients are we keeping?
(b) How will we do, compared to keeping all the zig-zags (still using run-length encoding)? Comment on both compression capability and image quality.
Unlock Deck
Unlock for access to all 10 flashcards in this deck.
Unlock Deck
k this deck
5
Briefly explain why DCT was chosen for JPEG compression.
Unlock Deck
Unlock for access to all 10 flashcards in this deck.
Unlock Deck
k this deck
6
Why does JPEG give compression? At which points in the algorithm does compression occur? Please estimate (as a general guess) how much compression occurs at each of these points.
Unlock Deck
Unlock for access to all 10 flashcards in this deck.
Unlock Deck
k this deck
7
Explain why the block DCT is preferred to taking the whole image DCT in JPEG compression.
Unlock Deck
Unlock for access to all 10 flashcards in this deck.
Unlock Deck
k this deck
8
Describe how H.261 deals with temporal and spatial redundancies in video.
Unlock Deck
Unlock for access to all 10 flashcards in this deck.
Unlock Deck
k this deck
9
Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).)
To simplify the data, we assume that the size of macroblocks is 8×88 \times 8 instead of 16×1616 \times 16 .
Intensity Values for the First Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:
Intensity Values for the Second Image:
 Given the following two consecutive frames in an H.261 video, if the first image is treated as an Iframe and the second a P-frame. Show in detail how the second image is encoded. (Work out the numbers at each step. You can assume any quantizer and stop after the RLC (Run-length Coding).) To simplify the data, we assume that the size of macroblocks is  8 \times 8  instead of  16 \times 16 . Intensity Values for the First Image:   Intensity Values for the Second Image:
Unlock Deck
Unlock for access to all 10 flashcards in this deck.
Unlock Deck
k this deck
10
We have seen that a logarithmic-based block search strategy is fast: it is "suboptimal but still usually effective." That being the case, why should we be bothered carrying out a full search in any situation? Why not just stick to the fast method - what is the advantage to be gained, if any, from a comprehensive, optimal search strategy?
Unlock Deck
Unlock for access to all 10 flashcards in this deck.
Unlock Deck
k this deck
locked card icon
Unlock Deck
Unlock for access to all 10 flashcards in this deck.