23/3/00, pm-n.

timbreAnalysis.

Peter Møller-Nielsen
( pmn@daimi.au.dk )
Department of Computer Science
University of Aarhus
Ny Munkegade build. 540
8000 Århus C
Denmark

TimbreAnalysis is a set of MATLAB functions for analysing the temporal development of the spectral contents of a digitized sound.
The main purpose of timbreAnalysis is to plot the temporal development of the amplitudes in the spectra under various tunings and calibrations.

The section Getting started explains the most common commands by means of a sample session. The section Reference contains a comprehensive and more formal description of the commands.

You can download the MATLAB functions and a few sounds from THIS page.


Getting started:

You start the program by typing:

matlab
path('/users/pmn/public/timbreAnalysis',path);
timbreAnalysis

Suppose your working directory ( i.e. the directory from which you started the program ) contains a digitized sound in wave-format in a file named s1.wav. Then the sound is loaded and prepared for analysis by typing:

loadSound('s1.wav')

If the file happens not to be in wave-format but in e.g. aiff-format and named s1.aiff, then you must convert it into wave-format by typing:

! sfconvert s1.aiff s1.wav format wave

The first few seconds of the sound can now be analysed and plotted by typing:

firstPage

( click HERE to see a sample plot )

If the sound is longer than a few seconds, a plot of the complete sound cannot be fitted within the screen. So, the plot is divided into pages. Only one page is shown at a time. A page shows the spectral intensities in a certain time interval. Each spectrum is shown as a horizontal line in the plot. For each tone in a spectrum a little rectangle is shown. The rectangle contains a colour and a number, both reflecting the intensity in the frequency interval associated with that tone. The relation between spectral amplitude/energy and intensity is established by a process called calibration. The relation between frequencies and tones is established by a process called tuning.
The vertical axis is the time axis, starting from the top of the plot. To improve readability, the time axis is divided into bars. The bars are separated by lines showing the tone names on a white background.

A lot of parameters control the analysis and the presentation of the results. When the program is started default values are assigned to all parameters. You can change the values of the parameters whenever needed during the analysis of the sound. A new sound ( e.g. s2.wav ) can be loaded for analysis by typing:

loadSound('s2.wav')

Notice! When you change one or more of the parameters that control the analysis or presentation, the plot does not change right away. The changes will appear in subsequent plots. If you want to plot the current page with the new parameters, type:

thisPage

When you look at the plot of a page you can see that the tonal range ( horizontally ) is divided into a number of octaves. The tuning dictates the mapping from frequencies in the spectrum to tones in an octave. E.g. by typing:

octave('equal12', 261.6)

an equal tempered octave with 12 tones is established and the center frequency of the first tone ( the 'c' ) is set to 261.6 Hz. This implies that the 'a' will be approximately 440 Hz. The frequency ranges of the tones in lower octaves are obtained by dividing the frequencies of the initial octave by 2, 4, etc. Similarly, those of the octaves above are obtained by multiplying by 2, 4, etc. The number of octaves below the initial one is set ( to 2 ) by typing:

setGlobal('octavesBelow', 2)

Similarly, the number of octaves above is set ( to 4 ) by typing:

setGlobal('octavesAbove',4)
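
As a rough illustration of the tuning arithmetic above, the following sketch computes the center frequencies of an equal tempered 12-tone octave starting at 261.6 Hz, together with 2 octaves below and 4 octaves above it. It is not part of timbreAnalysis; all variable names are made up for the example.

```matlab
% Sketch only: center frequencies for octave('equal12', 261.6).
% All variable names are illustrative, not timbreAnalysis internals.
firstFreq    = 261.6;   % center frequency of the first tone ( the 'c' )
tones        = 12;      % tones per octave
octavesBelow = 2;
octavesAbove = 4;

k = 0:tones-1;                           % tone index within the octave
baseOctave = firstFreq * 2.^(k/tones);   % equal temperament: ratio 2^(1/12) per tone

shifts   = (-octavesBelow:octavesAbove)';  % -2 .. 4
allFreqs = (2.^shifts) * baseOctave;       % one row per octave, one column per tone

aFreq = baseOctave(10);                  % the tenth tone is the 'a', about 439.96 Hz
fprintf('a = %.2f Hz, lowest c = %.2f Hz\n', aFreq, allFreqs(1,1));
```

Each row of allFreqs is the initial octave multiplied or divided by a power of 2, which is the rule stated above.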

The first, the next or the previous page is shown by typing: firstPage, nextPage or prevPage.

Vertically the page is divided into frames that are grouped into bars for readability. Each frame reflects the spectral contents of a short time interval of the sound. The length of this time interval is set ( to 0.2 seconds ) by typing:

setGlobal('frameSize', 0.2)

The time distance between frames is set ( to 0.35 seconds ) by typing:

setGlobal('hopSize', 0.35 )

The number of frames per bar is set ( to 6 ) by typing:

setGlobal('framesPrBar', 6 )

and the number of bars ( vertically ) in the plot is set ( to 8 ) by typing:

setGlobal('bars', 8)
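
With the values used above, the time covered by one page can be estimated as follows. The sketch assumes that a page holds bars times framesPrBar frames and that consecutive frames start hopSize seconds apart; that is a reading of the description above, not a documented formula.

```matlab
% Sketch only: time covered by one page, assuming consecutive frames
% start hopSize seconds apart ( an assumption, see the text above ).
frameSize   = 0.2;    % seconds
hopSize     = 0.35;   % seconds
framesPrBar = 6;
bars        = 8;

framesPrPage = bars * framesPrBar;              % 48 frames on a page
frameStarts  = (0:framesPrPage-1) * hopSize;    % start time of each frame
pageSpan     = frameStarts(end) + frameSize;    % start of first frame to end of last frame
nextPageAt   = framesPrPage * hopSize;          % where a following page would start

fprintf('%d frames, covering %.2f s; next page starts at %.2f s\n', ...
        framesPrPage, pageSpan, nextPageAt);
```

Note that with these values hopSize is larger than frameSize, so 0.15 seconds between consecutive frames are not analysed.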

The numbers to the left of the vertical axis indicate the start time of the frame measured in seconds from the start of the sound. By typing:

gotoPage( 17.2 )

you plot a page starting 17.2 seconds from the start of the sound.

If you want to hear the sound you are analyzing, type:

playSound

to hear everything, or:

playSound( 10, 20 )

to hear the sound in the interval from 10 seconds to 20 seconds.

By typing:

playChord( 13.2, 10 );

you can generate and play a new sound based on the timbre of the sound around the time 13.2 seconds. Based on two closely separated spectra from the sound around 13.2 seconds the new sound is generated by repeating the spectra for 10 seconds. This corresponds to infinite time-scaling. If you want to save the new sound in a file named newsound.wav, you must type:

newsound = playChord( 13.2, 10);
wavwrite(newsound, 44100, 'newsound.wav')

For each tone in each frame the plot shows a little rectangle with a colour and a number. Colour and number both reflect the intensity of the spectral contents around that particular tone in that particular frame. The intensities for the tones of a frame ( i.e. the colours and numbers for a row of rectangles in the plot ) are calculated as follows: A 'window' is applied to the samples within the frame. Then a Fourier-analysis is used to determine the amplitudes ( or energies ) for a number of ( short ) frequency intervals of equal length. For each tone an amplitude for the tone is calculated based on the overlaps of the frequency range of the tone and the frequency intervals from the Fourier-analysis. For each tone the amplitude is now converted into the intensity ( the number in the rectangle ) depending on the current choice of calibration. Suppose you type:

calibrate('log', 9, 1.5 )

Then you choose a logarithmic calibration ( i.e. a logarithmic relation between amplitude and intensity ), 9 intensity levels and a factor 1.5 between intensity levels. The precise mapping between amplitude and intensity is determined as follows: First you look for the maximum amplitude over all tones and frames in the current page. Suppose this happens to be 4638. Then the mapping is:

If the amplitude is higher than 4638 then the intensity is set to 10. If the amplitude is between 4638 and 3092 ( = 4638/1.5 ) then the intensity is set to 9. If the amplitude is between 3092 and 2061 ( = 4638/( 1.5 * 1.5 ) ) then the intensity is set to 8, and so on down to intensity level 1. Amplitudes smaller than this are also mapped to 1. If you type:

calibrate('lin', 10)

you choose a linear calibration with 10 levels. That is: The range of amplitudes from 4638 to 0 is divided into 10 intervals of equal size ( 463.8 ). The mapping is:

If the amplitude is higher than ( or equal to ) 4638 then the intensity is set to 11. If the amplitude is between 4638 and 4174.2 ( = 4638 - 463.8 ) then the intensity is set to 10. If the amplitude is between 4174.2 and 3710.4 ( = 4638 - 2*463.8 ) then the intensity is set to 9, etc., down to intensity equal to 1.
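
The two mappings described above can be written down compactly. The following is a minimal sketch, not the code used by calibrate; logIntensity and linIntensity are illustrative names, and the behaviour exactly on an interval boundary is a guess.

```matlab
% Sketch only: the two amplitude-to-intensity mappings described above.
% logIntensity and linIntensity are illustrative names, not package functions.
% What happens exactly on an interval boundary is a guess.
maxAmp = 4638;                       % maximum amplitude on the current page

% Logarithmic: each level lies a factor <factor> below the previous one.
logIntensity = @(amp, levels, factor) ...
    min(levels + 1, max(1, levels - floor(log(maxAmp ./ amp) ./ log(factor))));

% Linear: the range 0 .. maxAmp is cut into <levels> equal intervals.
linIntensity = @(amp, levels) ...
    min(levels + 1, max(1, floor(amp ./ (maxAmp / levels)) + 1));

amps = [5000 4000 3000 1];
logLevels = logIntensity(amps, 9, 1.5);   % corresponds to calibrate('log', 9, 1.5)
linLevels = linIntensity(amps, 10);       % corresponds to calibrate('lin', 10)
disp([amps; logLevels; linLevels]);
```

For the amplitudes 5000, 4000, 3000 and 1 this gives the intensities 10, 9, 8, 1 ( log ) and 11, 9, 7, 1 ( lin ), in line with the intervals listed above.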

Calibrating relative to the maximum amplitude on the current page introduces a kind of deadlock at start up. You need a plot in order to calibrate, but you need a calibration in order to produce a plot. The deadlock is broken by using a default calibration for the first plot. This default corresponds to calibrate('log',9,1.5).
A calibration is retained from page to page until a new calibration is explicitly invoked.
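
The calculation of tone amplitudes for a single frame, as described earlier in this section ( window, Fourier-analysis, overlap between tone ranges and frequency intervals ), might look roughly like the sketch below. The choice of window, the edges of the tone ranges and the weighting by overlap are assumptions made for the example; the text does not specify them.

```matlab
% Sketch only: one frame -> one amplitude per tone.
% Window type, tone band edges and overlap weighting are assumptions.
fs        = 8000;                      % sample rate in Hz ( illustrative )
frameSize = 0.125;                     % seconds
n         = round(frameSize * fs);     % samples in the frame ( 1000 )
t         = (0:n-1)' / fs;
frame     = sin(2*pi*440*t);           % test signal: a pure 440 Hz tone

% 1. Apply a window ( here a Hann window, written out to avoid toolboxes ).
w        = 0.5 - 0.5*cos(2*pi*(0:n-1)'/(n-1));
spectrum = abs(fft(frame .* w));

% 2. Keep the bins up to the Nyquist frequency. Bin j ( counted from 0 )
%    is taken to cover [ (j-0.5)*df , (j+0.5)*df ] with df = fs/n.
df     = fs / n;
nBins  = floor(n/2) + 1;
binAmp = spectrum(1:nBins);
binLo  = ((0:nBins-1)' - 0.5) * df;
binHi  = binLo + df;

% 3. Tone ranges for an equal tempered octave starting at 261.6 Hz;
%    each range reaches half a tone step to either side of its center.
centers = 261.6 * 2.^((0:11)/12);
toneLo  = centers * 2^(-1/24);
toneHi  = centers * 2^( 1/24);

% 4. Tone amplitude = bin amplitudes weighted by the fraction of each
%    bin that overlaps the tone range.
toneAmp = zeros(1, 12);
for k = 1:12
    overlap    = max(0, min(binHi, toneHi(k)) - max(binLo, toneLo(k)));
    toneAmp(k) = sum(binAmp .* overlap / df);
end

[~, strongest] = max(toneAmp);         % tone 10 is the 'a'
```

For the 440 Hz test tone the largest amplitude falls on the tenth tone of the octave, the 'a'.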

You stop the program by typing:

quit

Most commands ( i.e. the lines you type ) have an optional number of parameters. The parameters are the numbers or quoted character strings you type between ( and ) separated by commas. You can find more precise information about this in the section Reference below.


Printing plots on paper:

Before printing plots on paper a "Page Setup" must be done as follows:

 
     Click on:   File ---> Page Setup

in the window where the plot appears. Then click:

     Landscape, Color, Fill and OK

Now the command:

         print -Pr212

prints the current page ( i.e. the page in the window ).

The command:

 
         printPages( 15, 26 );

prints the pages covering the time interval from 15 seconds to 26 seconds.

In both cases the printing will be done by the printer named 'r212' which is situated on R2 in building 540.


Reference:

This section contains a comprehensive list of commands and the optional parameters for each.


    setGlobal(<parameter>,<new value>);
                        / assign a new value to one of the parameters
                        / that control the analysis or presentation.
                        / <parameter> is a quoted character
                        / string. It selects the parameter to which
                        / the value is assigned, as follows:
                        / 'octavesBelow': the number of octaves below
                        / the one established by the tuning ( default
                        / is 3 ).
                        / 'octavesAbove': the number of octaves above
                        / the one established by the tuning ( default
                        / is 3 ).
                        / 'bars': the number of bars ( default is 6 ).
                        / 'framesPrBar': the number of frames in each
                        / bar ( default is 8 ).
                        / 'frameSize': the length ( in seconds ) of a
                        / frame ( default is 0.125 ).
                        / 'hopSize': the distance ( in seconds )
                        / between consecutive frames ( default is
                        / 0.125 ).




    octave( .... );     / the tuning ( i.e. the mapping of frequency
                        / intervals to tones ) is established by calling
                        / 'octave'. If more than a single octave is
                        / needed, the frequency intervals for these
                        / are created by multiplying or dividing by
                        / 2, 4, 8 etc.

      octave(<tuning>,<first freq.>);
       
                        / <tuning> can be:
                        /     'equal12' , an Equal Temperament with 12
                        /                 tones per octave.
                        /     'equal24' , an Equal Temperament with 24
                        /                 tones per octave.
                        / <first freq.> is the center frequency of the
                        /     first tone in the octave.
             Default is:  octave('equal12', 261.6);


    loadSound( ... );   / loads a sound file for analysis. 

      loadSound('<file name>');
                        / loads the contents of the file named 
                        / <file name>. The name may include a path.
                        / If the sound has more than one channel,
                        / the channels are added.

      loadSound('<file name>',<channel>);
                        / as above, but only channel number <channel>
                        / is loaded. If <channel> is 0, the channels
                        / are added ( as above ).

      loadSound('<file name>',<channel>,<duration>);
                        / as above, but only the first <duration> sec.
                        / are loaded.

      loadSound('<file name>',<channel>,<start time>,<stop time>);
                        / only the interval between <start time> and
                        / <stop time> ( both in sec. ) is
                        / loaded.
             There is no useful default sound.
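
What the channel and time arguments select can be pictured on a small matrix of samples. The sketch uses synthetic data instead of reading a file, and the variable names are illustrative; how loadSound itself is implemented is not stated here.

```matlab
% Sketch only: what the <channel>, <start time> and <stop time> arguments
% select, shown on synthetic stereo data ( one column per channel ).
fs = 1000;                               % sample rate in Hz ( illustrative )
t  = (0:2*fs-1)' / fs;                   % 2 seconds of samples
y  = [sin(2*pi*5*t), 0.5*ones(size(t))]; % channel 1 and channel 2

channel   = 0;                           % 0: add the channels
startTime = 0.5;                         % seconds
stopTime  = 1.5;                         % seconds

if channel == 0
    mono = sum(y, 2);                    % the channels are added
else
    mono = y(:, channel);                % only the chosen channel
end

first   = floor(startTime * fs) + 1;     % MATLAB indices start at 1
last    = min(floor(stopTime * fs), size(mono, 1));
segment = mono(first:last);              % samples between startTime and stopTime
```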

    gotoPage(<start time>);  
                        / plot a page starting at time
                        / <start time> ( in sec. ).

    thisPage;           / plot the current page again.

    firstPage;          / plot the first page.
    
    nextPage;           / plot the next page.

    prevPage;           / plot the previous page.


    playSound( ... )    / the sound analyzed ( or part of it ) is
                        / played.

      playSound;        / the sound is played from beginning to end.
 
      playSound(<duration>);
                        / only the first <duration> sec. are played.

      playSound(<start time>,<stop time>);
                        / only the interval between <start time> and
                        / <stop time> is played.

    playChord( ... );   / generate a new sound by infinite time-scaling.    

      playChord(<time>,<duration>);
                        / based on two closely separated frames 
                        / extracted from the sound around <time>
                        / ( in sec. ) a new sound, lasting <duration>
                        / sec., is generated having constant
                        / spectral contents. This corresponds to
                        / infinite time-scaling.

      playChord(<time>,<duration>,<frame size>,
                  <overlap>,<eps>,<max. peaks>,
                    <min. width>);
                        / the function and the meaning of the first
                        / two parameters are as stated above. 
                        / <frame size> is the length of each
                        / frame ( in sec. ). <overlap>
                        / controls the separation between the two
                        / frames. The separation is:
                        / <frame size>/<overlap>.
                        / Synthesis of spectra includes restoration
                        / of phases under peaks in the amplitudes.
                        / The last three parameters control which
                        / peaks to include in this process.
                        / The amplitude of a peak must be greater
                        / than <eps> times the maximum amplitude
                        / in the spectrum as a whole. Only the
                        / <max. peaks> highest peaks are
                        / included. A peak must have a width greater
                        / than ( or equal to ) <min. width> to
                        / be included.
           Default for <frame size>, <overlap>,
              <eps>, <max. peaks> and <min. width>
              is: 0.5,  4, 0.05, 10 and 3.    
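
With the default values, the separation between the two frames works out as shown below. The sketch also places the two frames symmetrically around <time>; that placement is a guess made for the example, only the separation formula comes from the description above.

```matlab
% Sketch only: separation of the two frames used by playChord.
% The symmetric placement around <time> is an assumption.
time      = 13.2;   % seconds
frameSize = 0.5;    % default <frame size> in seconds
overlap   = 4;      % default <overlap>

separation = frameSize / overlap;     % 0.125 s between the two frames
starts = time + [-0.5, 0.5] * separation - frameSize/2;   % frame start times
ends   = starts + frameSize;                              % frame end times

fprintf('separation %.3f s, frames [%.4f, %.4f] and [%.4f, %.4f]\n', ...
        separation, starts(1), ends(1), starts(2), ends(2));
```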

    calibrate( ... );   / establish a new calibration, i.e. a new
                        / relation between the amplitude for the
                        / tones in a spectrum and the intensity
                        / shown in the plot.
        
      calibrate('log',<levels>,<scale factor>,<max.amplitude>);
                        / establishes a logarithmic mapping from amplitudes
                        / to intensities as follows:
                        /   if  amplitude > <max. amplitude> 
                        /      then the intensity is <levels>+1
                        /   if  <max. amplitude> > amplitude AND
                        /         amplitude > <max. amplitude>/<scale factor>
                        /      then intensity is <levels>
                        /   if  <max. amplitude>/<scale factor> > amplitude AND
                        /         amplitude > <max. amplitude>/(<scale factor>^2)
                        /      then intensity is <levels>-1
                        /  etc., and finally:
                        /   if <max. amplitude>/(<scale factor>^(<levels>-1)) > amplitude
                        /      then intensity is 1
                        /
                        / The parameter <max. amplitude> is optional. The
                        / default value is the maximum amplitude in the current
                        / page.

      calibrate('lin',<levels>,<max. amplitude>);
                        / establishes a linear mapping from amplitudes
                        / to intensities as follows:
                        /   The interval from <max. amplitude> to 0 is divided
                        / into <levels> intervals of equal size:
                        /  <max. amplitude> = amp(<levels>) > amp(<levels>-1) >
                        /     ... > amp(1) > 0.
                        /   if amplitude > amp(<levels>)
                        /     then intensity is <levels>+1  
                        /   if amp(k-1) < amplitude AND amplitude < amp(k)
                        /     then intensity is k
                        /   if amplitude < amp(1) 
                        /     then intensity is 1 
                        /
                        / The parameter <max. amplitude> is optional. The
                        / default value is the max. amplitude in the current
                        / page.
           Default is calibrate('log', 9, 1.5)