Chapter 4: Physics & Biology of Audition
• Vision is restricted to the immediate field of vision (approximately 200°) and relies on the presence of
an adequate light source. Sound can be sensed in all directions, and through opaque occluding bodies.
Sound as a Physical Stimulus
• Sound consists of pressure waves carried by vibrating air molecules, produced by a vibrating surface.
Areas where air pressure is increased are compressions, and areas where pressure is decreased are
rarefactions.
• Distal stimulus: Sounds from the outside world, e.g. car, music, speech. Proximal Stimulus: The
combined sounds which the brain must decipher, e.g. combined signals of car, music, and speech.
• The medium through which sound travels determines the speed of sound: approximately 334
m/second in air.
o Denser = slower; elastic = faster
o Water is denser than air, but sound travels faster at 1400 m/s. This is because water is more
elastic than air. This is also the case for solids, like hearing a train through the ground.
o Even at short distances, because sound is so much slower than light, information reaches the
eyes much faster than the ears (3 nanoseconds vs. 3 milliseconds).
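The travel-time comparison above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a viewing distance of 1 m (the notes do not state the distance) and rounded speeds of 340 m/s for sound in air and 3 x 10^8 m/s for light.

```python
SPEED_OF_SOUND_AIR = 340.0  # m/s, rounded value for air (assumption)
SPEED_OF_LIGHT = 3.0e8      # m/s, rounded value (assumption)


def travel_time(distance_m, speed_m_per_s):
    """Return the time in seconds a wave needs to cover distance_m."""
    return distance_m / speed_m_per_s


distance = 1.0  # metres; assumed for illustration only
light_ns = travel_time(distance, SPEED_OF_LIGHT) * 1e9
sound_ms = travel_time(distance, SPEED_OF_SOUND_AIR) * 1e3
print(f"light: {light_ns:.1f} ns, sound: {sound_ms:.1f} ms")
```

At this assumed distance the two delays come out at roughly 3 ns and 3 ms, a ratio of about a million.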
• Longitudinal Waves: Displacement of air particles is parallel to
the direction of propagation, as in sound. There is no net motion of any
individual particle; they simply oscillate.
• Transverse Waves: Displacement is perpendicular to the direction of
propagation, as in light.
                     Sound                      Light
Type of wave         Longitudinal               Transverse
Speed                340 m/s                    3 x 10^8 m/s
Wavelength           0.017 - 17 m               400 - 700 x 10^-9 m
Frequency            20 - 20,000 Hz             4.3 - 7.5 x 10^14 Hz
Pressure             0.00002 - 20,000 Pa        0.0001 - 100,000 lux
                     (0 - 180 dB SPL)
Amplitude is…        Loudness                   Brightness
λ or F is…           Pitch                      Colour
Spatial Variation    Small (poor sensitivity)   Large (high sensitivity)
Temporal Variation   Large (high sensitivity)   Small (poor sensitivity)
• Waves can have both transverse and longitudinal components, e.g. waves on the surface of water.
• For an object to create sound, it must vibrate. Its resonance frequency is determined by:
o Inertia: Size and density of the object. The smaller and lighter the object,
the higher the resonance frequency.
o Elasticity: The stiffer the object, the higher the resonance frequency.
• A pure tone can be described mathematically as a sine wave, with cyclical alternation between
compression and rarefaction. This wave has three important features: frequency, amplitude, and phase.
• These waves can be depicted in either a waveform manner (pressure vs. time) or in a spectrum
(frequency vs. amplitude)
• Frequency: F = 1/T = C/λ, where T is the period and C is the speed of sound. Frequency corresponds
to the number of alternations between compression and rarefaction generated in one second, or the
number of cycles per second (hertz, Hz).
• λ is wavelength; longer wavelength = lower frequency, more time
required for each cycle.
• The rate of vibration of the sound source determines the frequency of
the resulting sound pressure wave.
• Frequency relates perceptually to pitch: low frequencies are perceived
as deep bass, high frequencies as high treble pitches.
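The relation F = 1/T = C/λ can be turned into a small calculation. The sketch below assumes C = 340 m/s for air, and the function names are illustrative only.

```python
SPEED_OF_SOUND_AIR = 340.0  # m/s, rounded value for air (assumption)


def wavelength_m(frequency_hz, speed=SPEED_OF_SOUND_AIR):
    """Wavelength from F = C / lambda, rearranged to lambda = C / F."""
    return speed / frequency_hz


def period_s(frequency_hz):
    """Period from F = 1 / T, rearranged to T = 1 / F."""
    return 1.0 / frequency_hz


for f in (20, 1000, 20000):
    print(f"{f} Hz: wavelength {wavelength_m(f):.3f} m, "
          f"period {period_s(f) * 1e3:.3f} ms")
```

With this value of C, the limits of human hearing (20 Hz and 20,000 Hz) correspond to wavelengths of 17 m and 0.017 m.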
• When sound interacts with matter, three things can happen:
absorption, reflection, diffraction.
o High frequency: Occluding object is much larger than
wavelength. Absorption and reflection happen.
o Low frequency: Occluding object is small relative to the wavelength. Mostly diffraction occurs.
• Amplitude: Corresponds to amount of change in pressure created by a wave. It is usually expressed
on the decibel (dB) scale, named after Alexander Graham Bell. A decibel is 1/10 of a Bel.
• The dB sound pressure level (SPL) scale measures sound pressure relative to a fixed reference
pressure, the minimal audible sound detectable by humans – the scale is thus relative.
• dB = 20 * log10(P/P_r), where P_r is the reference sound pressure = 2 x 10^-5 Pa.
• Intensity is proportional to pressure squared (I ∝ P^2), so dB = 10 * log10(I/I_r), where I_r is the
reference intensity.
• Intensity is the energy held in the pressure of air; doubling pressure quadruples intensity.
• A logarithmic scale allows a very wide range of pressure levels to be expressed in a compact range of
values.
• A 20 dB SPL sound has 10 times the pressure of the threshold sound, and is 10^2 = 100 times more
intense.
• Each doubling of intensity adds 3 dB. Since doubling sound pressure is quadrupling intensity, each
doubling of pressure adds 6 dB.
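The decibel definitions above can be sketched in code. This is an illustrative example only: the function names are made up, and the reference pressure is taken to be 2 x 10^-5 Pa.

```python
import math

P_REF = 2e-5  # Pa, reference sound pressure for dB SPL


def db_spl(pressure_pa):
    """Sound pressure level relative to P_REF: 20 * log10(P / P_r)."""
    return 20 * math.log10(pressure_pa / P_REF)


def db_from_intensity_ratio(ratio):
    """Level change for an intensity ratio: 10 * log10(I / I_r)."""
    return 10 * math.log10(ratio)


print(db_spl(2e-5))                # threshold pressure
print(db_spl(2e-4))                # ten times the threshold pressure
print(db_spl(4e-5) - db_spl(2e-5)) # effect of doubling pressure
print(db_from_intensity_ratio(2))  # effect of doubling intensity
```

The "3 dB" and "6 dB" figures are rounded: doubling intensity adds about 3.01 dB and doubling pressure about 6.02 dB.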
• How many dB do you add to the sound level if you have two singers instead of one?
o Two singers produce double the intensity of one singer (double the energy) = 3 dB.
o The same holds for two jet engines rather than one: from 140 to 143 dB.
o Having 16 singers is 16x the intensity of one singer. Since 16 = 2^4, this is four doublings
= 4 x 3 = 12 dB (10 * log10(16) ≈ 12 dB).
• A conversation is about 60 dB, and loud rock music
about 120 dB; the pressure amplitude of rock music is in
fact 10^3, or 1000 times, greater than that of normal
conversation.
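The level arithmetic in these examples can be reproduced with a short sketch. It assumes identical, independent sources whose intensities simply add; the function names are illustrative.

```python
import math


def level_gain_db(n_sources):
    """dB added when one source is replaced by n equal sources.

    Assumes intensities add, so the gain is 10 * log10(n).
    """
    return 10 * math.log10(n_sources)


def pressure_ratio(db_difference):
    """Pressure amplitude ratio for a level difference in dB."""
    return 10 ** (db_difference / 20)


for n in (2, 4, 16):
    print(f"{n} sources: +{level_gain_db(n):.2f} dB")

print(f"two jet engines: {140 + level_gain_db(2):.0f} dB")
print(f"rock vs conversation: {pressure_ratio(120 - 60):.0f}x pressure")
```

Because each doubling adds about 3 dB, sixteen sources (four doublings) add about 12 dB.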
• One complete cycle of a sine wave spans 360°,
with 0° as resting level, maximum pressure at 90°, and minimum pressure at 270°.
• Phase: The part of the cycle that a sound wave has reached at a given
point in time, in degrees; often used to compare the timing of two sound
waves by referring to one at a phase relative to the other (phase-
• If two identical tones have phase difference 0°, then there is
constructive interference; if two tones are out of phase with phase
difference 180°, there is destructive interference.
• If the two tones are of slightly different frequencies, they drift in
and out of phase, so their maxima and minima alternately coincide and
cancel. The result is a beat, at a frequency equal to the difference
between the two frequencies. E.g. 330 and 331 Hz produce a 1 Hz
beat, while 330 and 340 Hz simply interfere.
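Constructive and destructive interference can be illustrated by summing two sampled sine tones. The sketch below uses only the standard library; the sampling rate and duration are arbitrary choices made for the example.

```python
import math


def tone(freq_hz, t, phase_deg=0.0, amplitude=1.0):
    """Instantaneous value of a sine tone at time t (seconds)."""
    return amplitude * math.sin(
        2 * math.pi * freq_hz * t + math.radians(phase_deg))


def peak_of_sum(freq_a, freq_b, phase_b_deg, duration_s=1.0, rate_hz=8000):
    """Largest absolute value of the sum of two tones over the duration."""
    n = int(duration_s * rate_hz)
    return max(
        abs(tone(freq_a, i / rate_hz) + tone(freq_b, i / rate_hz, phase_b_deg))
        for i in range(n))


def beat_frequency(freq_a, freq_b):
    """Beat rate of two tones: the difference of their frequencies."""
    return abs(freq_a - freq_b)


print(peak_of_sum(330, 330, 0))    # in phase: amplitudes add
print(peak_of_sum(330, 330, 180))  # 180 degrees apart: tones cancel
print(beat_frequency(330, 331))    # beats per second
```

In phase, the summed peak is twice the single-tone amplitude; at 180° the sum is zero apart from floating-point rounding.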
• Phase differences matter for sound localization, due to the distance between left vs. right ear (about 0.7
• Complex waves of complex sounds can be treated as a large collection of simple sine waves, with
individual frequencies, amplitudes, and phases determining its overall form.
• Some complex waves are aperiodic, and are just
treated as noise – no clear pitch.
• Fundamental Frequency: The lowest frequency in a
complex wave. The higher frequencies are harmonics,
each having a frequency which is an integer multiple of the
fundamental frequency (e.g. the 5th harmonic has a frequency 5 times the fundamental).
• The fundamental frequency relates to a sound’s
perceived pitch, while harmonics relate to timbre.
• Many natural sounds are not periodic, and do not
contain a harmonic series of frequency components;
instead, they contain a continuous “spectrum” of
components.
• Fourier Analysis: A mathematical procedure that allows the
breakdown of a complex sound into its component sine
functions. Developed by Joseph Fourier in the early 19th
century.
• Fourier analysis yields a Fourier magnitude spectrum with information about the amplitude, energy,
and power in the signal at each frequency (amplitude vs. frequency).
• It also generates a phase spectrum with information about the phase of each component (phase vs.
frequency).
• The original signal can be reconstituted perfectly by combining the components represented in these
spectra by the process of Fourier synthesis.
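Fourier analysis and synthesis can be sketched with a direct discrete Fourier transform. This is an illustrative, deliberately slow implementation using only the standard library; the test signal (a fundamental at 4 cycles per window plus a weaker harmonic at 12 cycles) is an assumption made for the example.

```python
import cmath
import math


def dft(signal):
    """Fourier analysis: complex spectrum of a real sampled signal."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def inverse_dft(spectrum):
    """Fourier synthesis: rebuild the signal from its spectrum."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


N = 64
# Complex wave: fundamental (4 cycles) plus 3rd harmonic (12 cycles).
signal = [math.sin(2 * math.pi * 4 * t / N)
          + 0.5 * math.sin(2 * math.pi * 12 * t / N + math.pi / 4)
          for t in range(N)]

spectrum = dft(signal)
magnitude = [2 * abs(c) / N for c in spectrum[:N // 2]]  # magnitude spectrum
phase = [cmath.phase(c) for c in spectrum[:N // 2]]      # phase spectrum
rebuilt = inverse_dft(spectrum)

print(round(magnitude[4], 3), round(magnitude[12], 3))
```

The magnitude spectrum recovers the two component amplitudes, and the rebuilt signal matches the original to within rounding error.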
• Fourier theory assumes that the sound signal remains unchanged over an infinite period of time, but
this does not occur for real signals, e.g. human speech, or musical instruments, Shepard’s ever
• Attack: Complex variation in harmonic amplitude at the
start of a note produced by a musical instrument. This
• Spectrogram: Time-varying spectrum depicting temporal
variations in frequency of a complex sound (time vs.
frequency). Magnitude is represented by the colour or
darkness of the plot.
• The spectrogram represents a series of Fourier spectra of
the acoustic signal, taken over successive time windows.
• Frequency resolution is equal to the reciprocal of the duration of the
sampling window, so there is a trade-off between temporal and frequency accuracy; e.g. 3.3 ms (good
time resolution) and 300 Hz (poor frequency resolution) in wideband spectrograms.
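The trade-off between time and frequency resolution follows directly from the reciprocal relation. In the sketch below, the 3.3 ms window is the wideband case from the notes, while the 30 ms window is an assumed value chosen only to show the opposite extreme.

```python
def frequency_resolution_hz(window_s):
    """Frequency resolution as the reciprocal of the window duration."""
    return 1.0 / window_s


# 3.3 ms is the wideband example; 30 ms is an assumed longer window.
for window_ms in (3.3, 30.0):
    resolution = frequency_resolution_hz(window_ms / 1000.0)
    print(f"{window_ms} ms window: about {resolution:.0f} Hz resolution")
```

Shortening the window sharpens timing but blurs frequency, and lengthening it does the reverse.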
• Fourier theory can be used to investigate how a particular transmitting device or medium modifies
acoustic signals, causing changes in the amplitude of certain components in the signal’s spectrum.
• Fourier synthesis is applied to the modified spectrum, and the resulting signal is compared to the
original.
• This includes signal attenuation, such as when sounds from a source pass over one’s head to reach the
other ear. The signal is measured at the source and at the ear, and Fourier spectra show that the
obstruction provided by the head removes the higher frequency components from the signal.
• In this case, the head acts as a filter to modify the frequency content of the signal passing through it.
• Transfer Function: Function that describes a linear filter’s frequency response, in terms of the degree
of attenuation at each frequency (i.e. relative amplitude of signal transmitted through filter). Values close
to 1 indicate very little attenuation, while values close to zero indicate little transmission.
• When the transfer function is known, its effect on any complex signal can be calculated by multiplying
the spectrum by the function, and applying Fourier synthesis to the output spectrum.
• The function can be derived by presenting the filter with a simple sinusoidal signal at a known
frequency and amplitude, and measuring the frequency and amplitude of the transmitted signal. The ratio
of output to input amplitude defines the filter characteristic at that frequency. This is repeated at a wide
range of frequencies.
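The probing procedure described above can be sketched as follows. The filter here is a hypothetical stand-in (a two-point moving average, which attenuates high frequencies), not a model of any real device or of the head; the function names and probe frequencies are also illustrative.

```python
import cmath
import math


def moving_average(signal):
    """Hypothetical linear filter: mean of each sample and the previous
    one, wrapping around at the start so that no transient appears."""
    return [(signal[i] + signal[i - 1]) / 2 for i in range(len(signal))]


def amplitude_at(signal, cycles):
    """Amplitude of the component with `cycles` cycles per window."""
    n = len(signal)
    c = sum(signal[t] * cmath.exp(-2j * math.pi * cycles * t / n)
            for t in range(n))
    return 2 * abs(c) / n


def measure_transfer(filter_fn, cycles_list, n=256):
    """Probe the filter with one sinusoid per frequency and record the
    ratio of output amplitude to input amplitude."""
    transfer = {}
    for cycles in cycles_list:
        probe = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
        transfer[cycles] = (amplitude_at(filter_fn(probe), cycles)
                            / amplitude_at(probe, cycles))
    return transfer


transfer = measure_transfer(moving_average, [1, 16, 64, 120])
for cycles, gain in transfer.items():
    print(f"{cycles:3d} cycles/window: gain {gain:.3f}")
```

Gains near 1 at low frequencies and near 0 at high frequencies mark this stand-in filter as low-pass. Once measured, the transfer function can be multiplied with any input spectrum to predict the output.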
• The use of Fourier analysis depends on the assumption that the filter is linear. B = F(A)
o Output never contains a frequency component not present in the input.
o If amplitude of the input is changed by a factor, the output should change by the same factor:
F(cA) = c F(A)
o If a number of inputs are applied to the filter simultaneously, the output should match the output
that would be produced if the inputs had been applied separately, and their individual outputs
summed. F(A1 + A2) = F(A1) + F(A2)
• Nonlinear filters do not obey at least one of these rules; they often add distortions in the form of
additional frequencies and their outputs cannot be predicted straightforwardly.
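The linearity rules can be tested numerically. The sketch below compares a hypothetical linear filter (a simple gain) with a hypothetical nonlinear one (amplitude clipping); both filters, the test tones, and the function names are assumptions made for illustration.

```python
import cmath
import math


def gain_filter(signal):
    """Hypothetical linear filter: halves every sample."""
    return [0.5 * x for x in signal]


def clipping_filter(signal, limit=0.5):
    """Hypothetical nonlinear filter: clips samples to +/- limit."""
    return [max(-limit, min(limit, x)) for x in signal]


def is_homogeneous(f, a, c=3.0, tol=1e-9):
    """Check F(cA) = c F(A) sample by sample."""
    scaled_in = f([c * x for x in a])
    return all(abs(y - c * z) < tol for y, z in zip(scaled_in, f(a)))


def is_additive(f, a1, a2, tol=1e-9):
    """Check F(A1 + A2) = F(A1) + F(A2) sample by sample."""
    joint = f([x + y for x, y in zip(a1, a2)])
    separate = [x + y for x, y in zip(f(a1), f(a2))]
    return all(abs(p - q) < tol for p, q in zip(joint, separate))


def amplitude_at(signal, cycles):
    """Amplitude of the component with `cycles` cycles per window."""
    n = len(signal)
    c = sum(signal[t] * cmath.exp(-2j * math.pi * cycles * t / n)
            for t in range(n))
    return 2 * abs(c) / n


N = 64
a1 = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
a2 = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

for name, f in (("gain", gain_filter), ("clipping", clipping_filter)):
    print(name, is_homogeneous(f, a1), is_additive(f, a1, a2),
          round(amplitude_at(f(a1), 12), 3))
```

The last printed value is the output amplitude at three times the input frequency. It is zero for the gain filter but not for the clipper, which adds a frequency component that was absent from the input.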
• One form of nonlinearity is a failure to respond to low-intensity parts of a signal, with input waveforms
truncated. The frequency spectrum will contain many other frequencies in addition to the original sine
wave.
The Physiology of the Auditory System
• The auditory system is a complex and
sophisticated biological system that
detects and encodes sound waves. It
extends past the peripheral system in
the ear, and into the circuits of the
central auditory system.
The Outer Ear
• The outer ear gathers sound energy
using the pinna, and focuses it down the
ear canal (meatus).
• The pinna is funnel-shaped, and made
mostly of cartilage, attached to the skull
by ligaments and muscles. In some
animals, it is mobile for orientation in the
direction of a sound source.
• The pinna acts as an amplifier, boosting sound pressure for frequencies between 1500-1700Hz. It also
acts as an acoustic filter, attenuating high-frequency sound components; the extent of attenuation
depends on the elevation of the sound source relative to the head.
• The tympanic membrane is a thin, flexible membrane stretched across the canal, connecting the outer
to the middle ear. Air pressure waves cause the membrane to vibrate, transmitting this vibration to the
middle ear.
The Middle Ear
• Middle Ear: The air-filled cavity containing the bones and associated structures that transfer sound
energy from the outer ear, to the fluid-filled inner ear.
• Pressure in the middle ear is maintained at atmospheric pressure by the Eustachian tube, which
connects the middle ear chamber to the nasal cavity.
• The three middle ear bones are the malleus (hammer), incus
(anvil), and stapes (stirrup, smallest bone in body).
• The malleus is attached to the inner surface of the tympanic
membrane; the stapes is connected to the oval window of the
cochlea.
• The ossicles are held in position by two tiny muscles: tensor
tympani + stapedius. These help attenuate some tension
generated by very loud sounds, protecting the middle and inner ear; however, this reflex needs 200 ms to
occur, so cannot protect against abrupt sounds like gunshots.
• The outer and middle ear cavities are filled with air, while the inner ear is filled with fluid.
• Air offers much less resistance to movement (small force, large amplitude), i.e. it has lower acoustic
impedance than fluid (large force, small amplitude). Sound energy from air, if transmitted directly to the
oval window, would largely be reflected back out.
• The ossicles maximize transmission of sound from middle to inner ear by impedance matching.
• This is achieved in two ways: force concentration, i.e. increasing the force per unit area (much higher
at the stapes than at the tympanic membrane, due to the smaller area), and lever action, with the ossicles
acting as levers that increase the force from the tympanic membrane by a factor of 1.3.
• Combined, these actions increase pressure by a factor of 44, or 33 dB, counteracting fluid’s high
impedance.
• The middle ear is considered an approximately linear transmitter of sound.
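The 33 dB figure for the middle ear follows from the pressure-ratio form of the decibel formula. A minimal check, using the factors quoted in the notes (44 overall, 1.3 for the lever action):

```python
import math


def pressure_gain_db(pressure_ratio):
    """Convert a pressure ratio to decibels: 20 * log10(ratio)."""
    return 20 * math.log10(pressure_ratio)


print(f"overall (x44): {pressure_gain_db(44):.1f} dB")
print(f"lever action alone (x1.3): {pressure_gain_db(1.3):.1f} dB")
```

The lever action alone contributes only about 2 dB, so most of the gain comes from force concentration.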
The Inner Ear
• Inner Ear: Fluid-filled organ in temporal bone, containing
mechanoreceptors for hearing and balance.
• The semicircular canals and otolith organs form the vestibular organs,
while the cochlea is the sense organ for hearing, converting sound energy
into neural impulses.
Structure of the Cochlea
• The cochlea is a small coiled tube, 4mm in diameter and 34mm long in humans.
• It is divided into two chambers, the scala vestibuli (connects
to oval window) and the scala tympani (connects to round
window), separated by the cochlear partition.
• A small opening called the helicotrema between the two
chambers allows them to share the same fluid, perilymph,
which has the same composition as cerebrospinal fluid.
• A third canal, the cochlear duct, contains a different fluid
(endolymph), and is not connected to the other two.
Reissner’s membrane separates the scala vestibuli from the
cochlear duct.
• The cochlear partition houses the basilar membrane, the structure which contains the sensory hair
cells responsible for transducing fine changes in sound pressure into neural signals. The basilar
membrane also separates the cochlear duct from the scala tympani.
Mechanical Properties of the Cochlea
• Sound vibrations cause the stapes to push back and forth on the oval window at the same frequency as
the sound wave. This displaces the fluid in the scala vestibuli, which transmits the pressure across the
cochlear partition, deforming the basilar membrane.
• The pressure waves transmit into the scala tympani, and cause displacement of the round window.
• Bekesy: The displacement of the basilar membrane takes the form of a traveling wave from the basal
end to the apical end. The wave’s envelope is drawn through all points of maxi