(ELEC360)E360-03F-Q01 Paper.pdf

Hong Kong University of Science and Technology
Department of Electrical & Electronic Engineering

ELEC 360 C Digital Media and Multimedia Applications By Dr. Sam Chiu
Fall 2003 Quiz #1
Name:
ID:

Email:
Answer all questions. Time allowed: 45 minutes
1. Audio Coding and Processing
The GSM (European cellular) standard encodes speech signals in 20 ms frames. Each frame contains 160 samples and is encoded into 260 bits.
(a)
What is the sampling rate of the signal before compression? [1 mark]

(b)
The original signal is low-pass filtered before sampling. What is the maximum allowable cut-off frequency? [1 mark]

(c)
What is the data rate of the encoded signal? [1 mark]
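The arithmetic behind parts (a) to (c) can be sketched as follows. This is only a minimal illustration using the frame figures quoted in the question; the function name gsm_frame_figures is hypothetical, and the cut-off is taken as the Nyquist limit (half the sampling rate), which assumes an ideal low-pass filter.

```python
# Frame figures quoted in the question statement.
FRAME_DURATION_MS = 20     # one frame lasts 20 ms
SAMPLES_PER_FRAME = 160    # samples per frame before compression
BITS_PER_FRAME = 260       # bits per frame after encoding


def gsm_frame_figures(frame_ms=FRAME_DURATION_MS,
                      samples=SAMPLES_PER_FRAME,
                      bits=BITS_PER_FRAME):
    """Return (sampling rate in Hz, Nyquist cut-off in Hz, encoded rate in bit/s)."""
    sampling_rate_hz = samples * 1000 / frame_ms   # samples per second
    max_cutoff_hz = sampling_rate_hz / 2           # Nyquist limit (ideal filter assumed)
    encoded_rate_bps = bits * 1000 / frame_ms      # encoded bits per second
    return sampling_rate_hz, max_cutoff_hz, encoded_rate_bps


if __name__ == "__main__":
    fs, fc, rate = gsm_frame_figures()
    print(f"sampling rate: {fs} Hz, cut-off: {fc} Hz, data rate: {rate} bit/s")
```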

(d)
The speech signal is usually non-linearly transformed at the transmitter end before quantization, such that small values are enhanced and large values are compressed. In the following figure, the slanted dashed line represents the case of Vout = Vin, i.e. no transformation. Sketch the effect of the non-linear transformation as described.

[1 mark]
[Figure: Vout plotted against Vin, with the slanted dashed line Vout = Vin]
(e)
Suggest two reasons why the operation described in (d) above is performed. Explain your answer. [2 marks]
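One curve with the shape described in (d) is the mu-law compression characteristic. The sketch below is only an illustration: the quiz does not name a specific law, so the choice of mu-law, the value mu = 255, and the function name mu_law_compress are all assumptions. Inputs are taken to be normalised to the range -1 to 1.

```python
import math


def mu_law_compress(v_in, mu=255.0):
    """Map v_in in [-1, 1] to v_out in [-1, 1] using the mu-law curve.

    Assumption: mu-law with mu = 255 is used purely as an example of a
    curve that boosts small values and compresses large ones.
    """
    sign = 1.0 if v_in >= 0 else -1.0
    return sign * math.log1p(mu * abs(v_in)) / math.log1p(mu)


if __name__ == "__main__":
    # Compare each input with its transformed value; Vout = Vin would
    # print identical columns.
    for v in (0.0, 0.01, 0.1, 0.5, 1.0):
        print(f"Vin = {v:5.2f}  Vout = {mu_law_compress(v):.3f}")
```

For inputs strictly between 0 and 1 the curve lies above the Vout = Vin line and flattens out as Vin approaches 1.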

(f)
Additionally, a technique known as noise gating is used to eliminate background noise in the quiet gaps between spoken words. In the following waveform, indicate an appropriate threshold level for effective noise gating. [1 mark]

[Figure: waveform of signal amplitude against time]
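The idea of a gating threshold in (f) can be illustrated with a very simple per-sample gate. This is a sketch under stated assumptions, not how any particular codec does it: the function name noise_gate and the sample values are hypothetical, and practical gates typically act on a smoothed signal envelope rather than on individual samples.

```python
def noise_gate(samples, threshold):
    """Zero every sample whose magnitude is below the threshold.

    Assumption: a hard, per-sample gate with no envelope smoothing.
    """
    return [s if abs(s) >= threshold else 0.0 for s in samples]


if __name__ == "__main__":
    # Hypothetical waveform: low-level background noise around two louder samples.
    waveform = [0.02, -0.03, 0.5, -0.6, 0.01]
    print(noise_gate(waveform, threshold=0.1))
```

The threshold needs to sit above the background noise level but below the quietest speech that should be kept.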

(g) The GSM coder has a Mean Opinion Score (MOS) of 3.6. Explain what is meant by MOS. [1 mark]

2. MPEG-1 Audio
MPEG-1 Audio Layer III (more popularly known as MP3) uses both subband coding and perceptual coding. The signal is analyzed by a filter bank of 32 equally spaced subbands that are 750 Hz wide at a sampling rate of 48 kHz.
(a)
Explain the relationship between these numbers: 32 subbands, 750 Hz bandwidth, and 48 kHz sampling rate. [1 mark]

(b)
Encoding is performed on a window of 3 frames of 12 samples each. What is the window length in milliseconds? [1 mark]
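The numbers in (a) and (b) can be checked with a few lines of arithmetic. This sketch makes two assumptions that the question does not spell out: the filter bank covers the band from 0 Hz up to half the sampling rate, and the 12 samples are per subband, with each subband sample standing for 32 input samples because of decimation by the number of subbands. The function names are hypothetical.

```python
SAMPLING_RATE_HZ = 48_000   # from the question statement
NUM_SUBBANDS = 32           # from the question statement


def subband_width_hz(fs=SAMPLING_RATE_HZ, num_subbands=NUM_SUBBANDS):
    """Width of each equally spaced subband across 0 .. fs/2."""
    return (fs / 2) / num_subbands


def window_length_ms(frames=3, samples_per_frame=12,
                     num_subbands=NUM_SUBBANDS, fs=SAMPLING_RATE_HZ):
    """Window length in ms, assuming 12 samples per subband per frame."""
    input_samples = frames * samples_per_frame * num_subbands
    return input_samples * 1000 / fs


if __name__ == "__main__":
    print(f"subband width: {subband_width_hz()} Hz")
    print(f"window length: {window_length_ms()} ms")
```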

(c)
MP3 encoding uses both frequency masking and temporal masking in its psychoacoustic model. Briefly explain the two phenomena with simple sketches. [2 marks]

(d)
Which of the following techniques are NOT used by MP3 encoding to achieve further data compression? (Circle all that apply) [2 marks]

A. Modified discrete cosine transform
B. Non-uniform quantization
C. Huffman coding
D. Run-length coding
E. Time-domain aliasing cancellation
F. Left/Right channel mixing and differencing
G. Variable bit rate recoding
H. Direct Stream Digital process

(e)
A song is recorded on a 4-minute stereo CD track (44.1 kHz, 16 bit) without compression. What is the storage required (in megabytes)? Suppose MP3 encoding achieves a compression ratio of 8:1. What is the storage required for the compressed song? [2 marks]
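The storage calculation in (e) is a product of duration, sampling rate, sample size, and channel count. The sketch below uses the figures from the question; the function names are hypothetical, and one megabyte is taken as 10^6 bytes, which is an assumption (using 2^20 bytes would give slightly smaller numbers).

```python
def cd_track_bytes(minutes=4, sample_rate_hz=44_100,
                   bits_per_sample=16, channels=2):
    """Uncompressed PCM size in bytes for a track of the given length."""
    seconds = minutes * 60
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate_hz * bytes_per_sample * channels


def compressed_bytes(raw_bytes, ratio=8):
    """Size after compression at ratio:1 (8:1 in the question)."""
    return raw_bytes // ratio


if __name__ == "__main__":
    raw = cd_track_bytes()
    mp3 = compressed_bytes(raw)
    # Assumption: 1 MB = 10**6 bytes.
    print(f"uncompressed: {raw / 10**6:.3f} MB")
    print(f"compressed:   {mp3 / 10**6:.3f} MB")
```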

3. Digital Image and Color Represent