Weekly projects, December 5th
Project 1
Find/produce a sound file representing some approximately-periodic
signal such as one instrument playing a single note for a long time.
Plot the discrete Fourier transform of a block of n samples, with n
chosen arbitrarily (say, 1024), and compare it to a plot of the
discrete Fourier transform of a block of n' samples, where n' is
chosen to match the actual period of the signal (you may have to
compute the DFT naively in the latter case).
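As a starting point, here is a minimal sketch of the comparison in
Python with NumPy. It uses a synthetic signal with a period of exactly
100 samples as a stand-in for a recorded note; the signal, the names
naive_dft and significant_bins, and the threshold are all assumptions
made for illustration. With a real recording you would first have to
estimate the period yourself.

```python
import numpy as np

def naive_dft(x):
    """O(n^2) DFT straight from the definition; works for any block length n."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n)
    # W[k, m] = exp(-2*pi*i*k*m/n)
    W = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return W @ x

# Synthetic stand-in for a recorded note (assumption): a fundamental plus
# its third harmonic, with a period of exactly 100 samples.
period = 100
t = np.arange(4096)
signal = np.sin(2 * np.pi * t / period) + 0.5 * np.sin(2 * np.pi * 3 * t / period)

n_arbitrary = 1024         # arbitrary block length, not a multiple of the period
n_matched = 10 * period    # block length covering exactly 10 periods

spec_arbitrary = np.abs(np.fft.fft(signal[:n_arbitrary]))
spec_matched = np.abs(naive_dft(signal[:n_matched]))

def significant_bins(spec, rel=1e-6):
    """Count the bins whose magnitude exceeds rel times the largest magnitude."""
    return int(np.sum(spec > rel * spec.max()))

print("significant bins, arbitrary n:", significant_bins(spec_arbitrary))
print("significant bins, matched n': ", significant_bins(spec_matched))
```

Plotting spec_arbitrary and spec_matched (for instance with
matplotlib) gives the two plots the project asks for. For a real
recording the period is only approximate, so the matched spectrum will
not be as clean as in this synthetic case.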
Project 2
Experiment with the naive transform coding strategy of throwing away
the high-frequency components of a signal, using the Fourier
transform, the cosine transform, and/or the DWHT (not covered in the
lectures, but defined in Kieffer), on images and sound files.
What happens, in terms of the rate/distortion tradeoff (distortion
measured by SNR or observed), when you
- Vary the block size (it may range from, say, 8 up to the length of
  the entire signal)?
- Vary the transform?
- Vary the range of frequencies that are cut off?
Can you get an improvement by replacing the "throw away" strategy by
an (ad hoc) bit allocation strategy assigning fewer bits to higher
frequencies?
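The "throw away" strategy for a one-dimensional signal can be sketched
as follows, here with a blockwise orthonormal cosine transform (DCT-II)
built directly in NumPy. The function names dct_matrix, truncate_code
and snr_db are made up for this sketch, and dropping an incomplete
final block is a simplifying assumption. The fraction keep/block of
coefficients kept serves as a crude proxy for the rate.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x transforms, C.T @ y inverts."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    m = np.arange(n)[None, :]   # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)     # rescale the constant row for orthonormality
    return C

def truncate_code(signal, block, keep):
    """Blockwise DCT, zero all but the `keep` lowest frequencies, invert.

    Returns (reconstruction, reference), where reference is the part of
    the signal that was actually coded (whole blocks only).
    """
    signal = np.asarray(signal, dtype=float)
    C = dct_matrix(block)
    usable = len(signal) - len(signal) % block   # drop an incomplete final block
    blocks = signal[:usable].reshape(-1, block)
    coeffs = blocks @ C.T          # each row becomes C @ x
    coeffs[:, keep:] = 0.0         # throw away the high frequencies
    return (coeffs @ C).ravel(), signal[:usable]

def snr_db(original, reconstructed):
    """Signal-to-noise ratio in dB."""
    noise = np.sum((original - reconstructed) ** 2)
    return np.inf if noise == 0 else 10 * np.log10(np.sum(original ** 2) / noise)

# Synthetic test signal (assumption): a slow sinusoid plus a little noise.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * np.arange(1024) / 64) + 0.05 * rng.standard_normal(1024)
for keep in (1, 2, 4, 8):
    rec, ref = truncate_code(x, block=8, keep=keep)
    print(f"block=8 keep={keep} SNR={snr_db(ref, rec):.1f} dB")
```

Swapping dct_matrix for another orthonormal transform matrix, or
changing block and keep, covers the variations listed above. Images
need the transform applied along both rows and columns of each block.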
Project 3
According to Kieffer's notes, the normalization matrix is built into
the JPEG standard. Is this the whole truth? How does the quality
setting of JPEG affect the matrix and is there some way to define
it yourself?
Experiment with bit allocation strategies in the following idealized
version of JPEG: follow the strategy of JPEG by dividing the image
into 8 by 8 blocks, transforming each block, and quantizing the
coefficients according to some normalization matrix, but assume
zeroth-order entropy coding of the resulting quantized values, with a
separate code for each coefficient position (for instance, all
topmost-leftmost coefficients are collected into a "band" and
entropy-encoded separately).
Compare the built-in matrix of JPEG with an "optimum" one for the
empirical zeroth-order model of the blocked and transformed image at
hand. That is, for a given band, allocate a number of bits equal to
half of the log (base 2) of the variance of the band, plus a constant,
with negative values replaced by zero, and find a step size of the
uniform quantizer corresponding to this bit rate. Does the "optimum"
matrix look very different from the built-in one? Does it perform
better assuming the idealized entropy coding? Would it still perform
better if the actual encoding strategy of JPEG were used?
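One possible way to set up the bit allocation and the step-size search
is sketched below. It assumes the transformed blocks are already
available as an array of shape (number of blocks, 8, 8). The function
names are invented for this sketch. Finding the step size by bisection
on the empirical entropy is one choice among several, and it relies on
the entropy decreasing, at least roughly, as the step grows.

```python
import numpy as np

def band_variances(blocks):
    """blocks has shape (num_blocks, 8, 8); returns the 8x8 per-band variances."""
    return blocks.var(axis=0)

def allocate_bits(variances, c):
    """Per band: 0.5 * log2(variance) + c, with negative values replaced by zero."""
    with np.errstate(divide="ignore"):   # a zero-variance band gets zero bits
        return np.maximum(0.0, 0.5 * np.log2(variances) + c)

def empirical_entropy(values):
    """Zeroth-order empirical entropy, in bits per symbol."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def quantize(band, step):
    """Uniform quantizer: index of the nearest multiple of step."""
    return np.round(band / step)

def step_for_rate(band, target_bits, iters=40):
    """Bisect (geometrically) for a step whose entropy is at most target_bits."""
    lo = 1e-6                               # very fine step: high entropy
    hi = 4 * (np.abs(band).max() + 1e-12)   # everything quantizes to 0
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if empirical_entropy(quantize(band, mid)) > target_bits:
            lo = mid    # still too many bits: the step must be coarser
        else:
            hi = mid    # within budget: try a finer step
    return hi

def step_matrix(blocks, c):
    """8x8 matrix of step sizes playing the role of the normalization matrix."""
    bits = allocate_bits(band_variances(blocks), c)
    steps = np.empty((8, 8))
    for i in range(8):
        for j in range(8):
            steps[i, j] = step_for_rate(blocks[:, i, j], bits[i, j])
    return steps
```

The constant c controls the overall rate: sweeping c and recording the
total entropy and the distortion gives a rate/distortion curve that
can be compared with the one obtained from scaled versions of the
built-in matrix.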