Weekly projects, December 5th

Project 1

Find or produce a sound file representing an approximately periodic signal, such as one instrument playing a single note for a long time. Plot the discrete Fourier transform of a block of n samples, with n an arbitrarily chosen number (say, 1024), and compare it to a plot of the discrete Fourier transform of a block of n' samples, where n' is chosen to match the actual period of the signal (you may have to compute the DFT naively in the latter case).
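As a starting point, the comparison can be sketched on a synthetic signal before moving to a real recording. Everything specific here is an assumption made for illustration: the stand-in signal (period of 100 samples, three harmonics), the helper names naive_dft and count_significant, and the threshold used to call a coefficient "significant". With a real sound file the period would have to be estimated first.

```python
import numpy as np

def naive_dft(x):
    """O(n^2) DFT computed straight from the definition; works for any block length."""
    n = len(x)
    k = np.arange(n)
    # W[j, k] = exp(-2*pi*i*j*k/n)
    W = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return W @ x

def count_significant(mag, threshold=1e-3):
    """Number of coefficients larger than threshold times the largest magnitude."""
    return int(np.sum(mag > threshold * mag.max()))

# Hypothetical stand-in for a recorded note: period of 100 samples, three harmonics.
period = 100
t = np.arange(4096)
signal = (np.sin(2 * np.pi * t / period)
          + 0.5 * np.sin(2 * np.pi * 2 * t / period)
          + 0.25 * np.sin(2 * np.pi * 3 * t / period))

n_arbitrary = 1024          # arbitrary block size
n_matched = 10 * period     # block size that is a whole number of periods

mag_arbitrary = np.abs(naive_dft(signal[:n_arbitrary])) / n_arbitrary
mag_matched = np.abs(naive_dft(signal[:n_matched])) / n_matched

sig_arbitrary = count_significant(mag_arbitrary)
sig_matched = count_significant(mag_matched)
print("significant coefficients, n = 1024:", sig_arbitrary)
print("significant coefficients, n' = 1000:", sig_matched)
```

In the matched case the energy sits in a handful of isolated coefficients, while the arbitrary block size smears it over neighbouring ones. For plotting, mag_arbitrary and mag_matched can be passed to any plotting tool. Note that np.fft.fft also accepts arbitrary lengths, so the naive version is mainly useful where only a power-of-two FFT is available.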

Project 2

Experiment with the naive transform coding strategy of throwing away the high-frequency components of a signal, using the Fourier transform, the cosine transform, and/or the DWHT (not covered in the lectures, but defined in Kieffer), on images and sound files. What happens to the rate/distortion tradeoff (distortion measured by SNR or by observation) when you do so? Can you get an improvement by replacing the "throw away" strategy with an (ad hoc) bit allocation strategy that assigns fewer bits to higher frequencies?
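A minimal sketch of the "throw away" strategy on a one-dimensional signal, using an orthonormal cosine transform built by hand with NumPy. The test signal, the helper names (dct_matrix, snr_db, keep_low_frequencies) and the list of cut-offs are assumptions chosen for illustration; the bit allocation variant is not covered here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x transforms, C.T @ y inverts."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    j = np.arange(n).reshape(1, -1)   # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def snr_db(original, approx):
    """Signal-to-noise ratio in decibels."""
    noise = original - approx
    return 10 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))

def keep_low_frequencies(x, keep, C):
    """Transform, zero every coefficient from index `keep` upwards, transform back."""
    y = C @ x
    y[keep:] = 0.0
    return C.T @ y

# Hypothetical test signal: two smooth components plus weak noise.
rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
x = (np.sin(2 * np.pi * 3 * t / n)
     + 0.5 * np.cos(2 * np.pi * 7 * t / n)
     + 0.05 * rng.standard_normal(n))

C = dct_matrix(n)
snrs = {keep: snr_db(x, keep_low_frequencies(x, keep, C)) for keep in (16, 32, 64, 128)}
for keep, value in snrs.items():
    print(f"kept {keep:3d} of {n} coefficients: SNR = {value:.1f} dB")
```

Because the transform is orthonormal, the squared error equals the energy of the discarded coefficients, so the SNR can only improve as more coefficients are kept. For images the same matrix can be applied along rows and columns.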

Project 3

According to Kieffer's notes, the normalization matrix is built into the JPEG standard. Is this the whole truth? How does the quality setting of JPEG affect the matrix, and is there some way to define it yourself? Experiment with bit allocation strategies in the following idealized version of JPEG: follow the strategy of JPEG by dividing the image into 8 by 8 blocks, transforming each block, and quantizing the coefficients according to some normalization matrix, but assume zeroth-order entropy coding of the resulting quantized values, with a separate code for each coefficient position (for instance, all top-left coefficients are collected into a "band" and entropy-encoded separately).
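The idealized scheme can be sketched as follows. This is only an illustration under stated assumptions: the synthetic 64 by 64 image, the ramp-shaped normalization matrix Q (which is not the JPEG table), and the helper names are all made up for the example, and the image dimensions are assumed to be multiples of 8.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C: C @ x transforms, C.T @ y inverts."""
    k = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def blockwise_dct(image, b=8):
    """Split into b-by-b blocks and apply the 2-D DCT to each; shape (num_blocks, b, b)."""
    h, w = image.shape
    C = dct_matrix(b)
    blocks = image.reshape(h // b, b, w // b, b).transpose(0, 2, 1, 3).reshape(-1, b, b)
    return C @ blocks @ C.T

def entropy_bits(symbols):
    """Empirical zeroth-order entropy in bits per symbol."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def ideal_jpeg(image, Q):
    """Rate (bits per pixel) and MSE with one ideal entropy code per band."""
    coeffs = blockwise_dct(image - 128.0)      # level shift, as JPEG does
    q = np.round(coeffs / Q).astype(int)       # uniform quantization, step Q[u, v]
    band_entropy = np.array([[entropy_bits(q[:, u, v]) for v in range(8)]
                             for u in range(8)])
    rate_bpp = band_entropy.sum() / 64.0       # 64 pixels per block
    mse = float(np.mean((coeffs - q * Q) ** 2))  # equals pixel-domain MSE (orthonormal DCT)
    return rate_bpp, mse, band_entropy

# Hypothetical test image and hypothetical normalization matrix (NOT the JPEG one).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
image = 128 + 60 * np.sin(xx / 9.0) * np.cos(yy / 13.0) + 4 * rng.standard_normal((64, 64))
u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
Q = 8.0 + 4.0 * (u + v)

rate_bpp, mse, band_entropy = ideal_jpeg(image, Q)
print(f"rate = {rate_bpp:.3f} bits/pixel, MSE = {mse:.2f}")
```

Swapping in a different matrix Q and comparing the resulting (rate, MSE) pairs gives the rate/distortion comparison the project asks for. With only 64 blocks the entropy estimates are rough; a real image provides many more samples per band.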

Compare the built-in matrix of JPEG with an "optimum" one for the empirical zeroth-order model of the blocked and transformed image at hand (that is, for a given band, allocate a number of bits equal to half the base-2 logarithm of the variance of the band, plus a constant, with negative values replaced by zero, and find a step size of the uniform quantizer corresponding to this bit rate). Does the "optimum" matrix look very different from the built-in one? Does it perform better assuming the idealized entropy coding? Would it still perform better if the actual encoding strategy of JPEG were used?
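The allocation rule in parentheses can be sketched as below. Two things are assumptions and not part of the project text: the constant is found by bisection so that the average allocation hits a chosen target rate, and the step size is derived from the high-resolution approximation for a Gaussian band, H ≈ 0.5*log2(2*pi*e*variance) - log2(step). The band variances are hypothetical too; in the project they would be measured from the transformed blocks. An empirical alternative is to search, per band, for the step size whose measured entropy matches the allocated bits.

```python
import numpy as np

def allocate_bits(variances, c):
    """b = max(0, 0.5*log2(variance) + c) for every band (variances must be > 0)."""
    return np.maximum(0.0, 0.5 * np.log2(variances) + c)

def constant_for_target(variances, target_mean_bits, lo=-40.0, hi=40.0, iters=60):
    """Bisection on c; the mean allocation is nondecreasing in c."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if allocate_bits(variances, mid).mean() < target_mean_bits:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def step_sizes(variances, bits):
    """ASSUMPTION: Gaussian bands, high-resolution approximation
    H ~ 0.5*log2(2*pi*e*var) - log2(step), solved for step."""
    return np.sqrt(2 * np.pi * np.e * variances) * 2.0 ** (-bits)

# Hypothetical band variances that decay with frequency index u + v.
u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
variances = 1000.0 * 0.5 ** (u + v)

c = constant_for_target(variances, target_mean_bits=1.0)
bits = allocate_bits(variances, c)
Q_opt = step_sizes(variances, bits)

np.set_printoptions(precision=2, suppress=True)
print("bits per band:\n", bits)
print("step sizes:\n", Q_opt)
```

Under these assumptions every band that receives a positive number of bits ends up with the same step size, sqrt(2*pi*e) * 2^(-c), which is one way to see how the "optimum" matrix can differ in shape from a perceptually tuned one. Bands allocated zero bits get a step several standard deviations wide, so most of their coefficients quantize to zero.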