PSYC 212 Lecture Notes - Lecture 15: Inverse-Square Law, Azimuth, Trachea

Auditory distance perception
Problem: ITDs and ILDs can't tell us how far away an object
is on a given azimuth
Ex.: for an azimuth of -60 degrees, the ITD will always
be -480 µs, even though in absolute terms a sound takes longer
to arrive from a farther distance
10 meters: left ear = 400 µs, right ear = 880 µs
100 meters: left ear = 4400 µs, right ear = 4880 µs
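The point that the ITD stays fixed while absolute arrival times grow can be checked with a small geometric sketch. It is only an illustration under assumptions that do not come from the lecture: the ears are treated as two points 0.18 m apart (`EAR_SPACING_M`), sound travels in straight lines at 343 m/s (`SPEED_OF_SOUND_M_S`), and the head casts no shadow. Under those assumptions the ITD comes out near -454 µs rather than the -480 µs quoted above, but it is the same at 10 m and at 100 m.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air
EAR_SPACING_M = 0.18        # assumed distance between the two ears


def arrival_times_us(azimuth_deg, distance_m):
    """Return (left, right) arrival times in microseconds.

    Azimuth is measured from straight ahead; negative values are to the left.
    """
    theta = math.radians(azimuth_deg)
    source_x = distance_m * math.sin(theta)
    source_y = distance_m * math.cos(theta)
    half = EAR_SPACING_M / 2.0
    left_path = math.hypot(source_x + half, source_y)   # left ear sits at x = -half
    right_path = math.hypot(source_x - half, source_y)  # right ear sits at x = +half
    to_us = 1e6 / SPEED_OF_SOUND_M_S
    return left_path * to_us, right_path * to_us


def itd_us(azimuth_deg, distance_m):
    """Interaural time difference (left minus right), in microseconds."""
    left, right = arrival_times_us(azimuth_deg, distance_m)
    return left - right


for distance in (10.0, 100.0):
    left, right = arrival_times_us(-60.0, distance)
    print(f"{distance:5.0f} m: left = {left:9.1f} us, right = {right:9.1f} us, "
          f"ITD = {left - right:7.1f} us")
```

Both arrival times grow with distance, while their difference barely moves, which is why the ITD alone carries no distance information.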
Inverse square law: the intensity of sound decreases as a
function of the inverse (i.e. "one divided by") of the square
of the distance
Intensity at distance d: $I(d) = I_{\text{source}} / d^{2}$
Sound intensity decreases with distance
It's harder to tell small differences in distance
between two objects if they are far away than if they
are close
At longer distances, intensity decreases more slowly
Intensity as a cue for distance is therefore
more useful for short distances
Distance estimates for far-away sources won't be precise
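A short numeric sketch of why intensity is a better distance cue up close. The function names and the reference intensity of 1.0 at 1 m are assumptions made for illustration; only the 1 / distance^2 relationship comes from the notes.

```python
import math


def intensity_at(distance_m, source_intensity=1.0):
    """Inverse-square law: intensity falls off as 1 / distance^2."""
    return source_intensity / distance_m ** 2


def level_drop_db(near_m, far_m):
    """Drop in level (dB) when the same source moves from near_m to far_m."""
    return 10.0 * math.log10(intensity_at(near_m) / intensity_at(far_m))


# The same 1 m step, once close to the listener and once far away.
print(f"1 m -> 2 m:     {level_drop_db(1.0, 2.0):.2f} dB")
print(f"100 m -> 101 m: {level_drop_db(100.0, 101.0):.2f} dB")
```

Moving a source from 1 m to 2 m costs about 6 dB, while moving it from 100 m to 101 m costs less than 0.1 dB, so the same change in distance is far harder to hear when the source is far away.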
Problem: how can I tell if a source is loud and far
away vs close but quiet?
Solution: the spectral composition of sounds changes
with distance
In general, long wavelengths (low frequencies) are more
resistant to obstacles
Sources that are far away are likely to have
encountered more obstacles
Air also has "sound-absorbing" qualities
Therefore the intensity of higher frequencies
decreases as a function of distance (note that
this only starts to have a real impact for
distances greater than 1000 meters)
Solution: distant sounds contain more
reverberant energy relative to direct energy
Detecting reverberant energy:
timing and change in spectral composition
Cones of confusion
Problem: the ITD for sounds coming from two different azimuths
(for example, mirror-image positions in front of and behind
the ear) can be exactly the same
ILD for the same azimuths should also be very similar
There will be small differences because the head is
not perfectly round but it will be negligible
If you consider all three dimensions this makes a "cone"
Solution #1: moving your head will change the "cones of
confusion"
The only point that will retain its ITD and ILD is the
"real" source
As we move our heads, the only point where the successive
cones intersect has to be the true sound source
It's unlikely that the same ambiguity will occur again
at the new head position
Whatever stays constant is probably the true sound
source
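The front/back ambiguity and the head-movement fix can be sketched with a far-field approximation. This is an illustration only, not the lecture's model: it assumes the ITD is ear spacing times sin(azimuth) divided by the speed of sound, with assumed values of 0.18 m and 343 m/s, and it ignores elevation and the ILD. The example azimuths (30 and 150 degrees) and the 20 degree head turn are made up.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air
EAR_SPACING_M = 0.18        # assumed distance between the two ears


def far_field_itd_us(azimuth_deg):
    """Far-field ITD approximation, in microseconds (positive = source on the right)."""
    return EAR_SPACING_M * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND_M_S * 1e6


def itd_after_head_turn_us(source_azimuth_deg, head_turn_deg):
    """ITD after the head turns; the source's azimuth relative to the head shifts."""
    return far_field_itd_us(source_azimuth_deg - head_turn_deg)


front, back = 30.0, 150.0  # mirror-image azimuths: front-right and back-right

# Before the head moves, the two positions give (numerically) the same ITD.
print(far_field_itd_us(front), far_field_itd_us(back))

# After a 20 degree head turn to the right, the two candidates no longer agree.
print(itd_after_head_turn_us(front, 20.0), itd_after_head_turn_us(back, 20.0))
```

Only the true source position predicts the ITD actually heard after the turn, which is how a head movement resolves the confusion.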
Solution #2: the pinna (and also ear canal, head and torso)
slightly distorts the amplitude of certain frequencies as a
function of elevation
Ex.: sounds coming from an elevation of -50 degrees
lose a lot of intensity (dB) between 8000 and 11000
Hz
Directional transfer function (DTF): a measure that
describes how the pinna, ear canal, head, and torso
change the intensity of sounds with different
frequencies that arrive at each ear from different
elevations
Our ears continue to change as we grow, so we have
to continuously adapt to these changes
We constantly learn to update this function
Auditory stream segregation
The perceptual organization of a complex acoustic signal into
separate auditory streams, each of which is heard as a
separate event
Segregation/grouping can be based on several acoustic cues, such
as:
The location of sounds
See the previous section
The frequency (pitch) of the sounds
Tones that have similar frequencies will tend to be
grouped together
The timing of the sounds
Tones that are close together in time will tend to be
grouped together
When we increase the time interval between the tones, they
sound like separate sounds, breaking this grouping
The timbre of the sounds
Tones that have similar timbre will tend to be
grouped together
The onset of the sounds
When sounds begin at different times they appear to
be coming from different sound sources
Rule of "good continuation" (continuity effects)
In spite of interruptions, one can still "hear" a
continuous sound if the gap is filled with noise. In that
case the sound is perceived as continuing behind the
noise. However, if the gaps aren't filled with noise,
the sound is perceived as separate chunks
As the white noise gets louder, you begin to hear the
tone behind the white noise
This occurs even if it’s a tone with a "tune"
Higher-order information (restoration effects)
In spite of interruptions, one can still "hear" a
sentence if the gaps are filled with noise. In this case,
higher-order semantic/syntactic knowledge is used to
"fill in the blanks". As with continuity effects, the effect
vanishes if the gaps are not filled with noise
Speech
The sound of speech: phonemes
Phoneme: a unit of sound that distinguishes one word from
another in a particular language (e.g.: kill vs kiss)
To get around confusing differences between sound
and spelling, we use the International Phonetic
Alphabet (IPA)
About 5000 languages are spoken today, using over
850 different speech sounds
English uses approximately 40 phonemes
Speech production
1. Respiration: the diaphragm pushes air out of the lungs through
the trachea, up to the larynx
2. Phonation: the process through which the vocal folds are made
to vibrate when air is pushed out of the lungs
At the larynx, the air must pass between the two vocal
folds (aka vocal cords)
More tension produces higher-pitched sounds
Smaller vocal folds produce higher-pitched voices
Vocal fold size: children < women < men
The sound produced at the vocal folds
has a harmonic spectrum
Harmonic structure: single source
3. Articulation: the act or manner of producing a speech sound
using the vocal tract
The vocal tract is the area above the larynx (oral +
nasal tract)
Humans can change the shape of their vocal tract by
manipulating their jaws, lips, tongue body, tongue tip,
and velum (soft palate) - this is what we call
"articulation"
Resonance characteristics are created by changing the size
and shape of the vocal tract, which affects the frequency
distribution of the sound
Changing size and shape of vocal tracts will
increase/decrease energy at different frequencies
Peaks in the speech spectrum are called formants
Formants are labelled by number, from lowest to
highest (F1, F2, F3)
Formants have higher frequencies for people who
have shorter vocal tracts. It is the relationship
between the formants that counts
Most of the time, the first three formants are
sufficient to identify the phoneme
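The link between vocal tract length and formant frequency can be illustrated with a standard idealisation that is not part of these notes: treat the vocal tract as a uniform tube closed at one end, whose resonances are F_n = (2n - 1) * c / (4 * L). The tract lengths (0.17 m and 0.14 m) and the speed of sound (343 m/s) are assumed values, and a real vowel changes the tube's shape, so this is only a rough sketch.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def tube_formants_hz(tract_length_m, count=3):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND_M_S / (4.0 * tract_length_m)
            for n in range(1, count + 1)]


# Hypothetical tract lengths, chosen only to compare a longer and a shorter tube.
for label, length_m in (("longer tract (0.17 m)", 0.17), ("shorter tract (0.14 m)", 0.14)):
    f1, f2, f3 = tube_formants_hz(length_m)
    print(f"{label}: F1 = {f1:.0f} Hz, F2 = {f2:.0f} Hz, F3 = {f3:.0f} Hz, "
          f"F2/F1 = {f2 / f1:.1f}")
```

In this model the shorter tube shifts every formant upward, yet the ratio between formants is unchanged, which fits the idea that the relationship between formants matters more than their absolute frequencies.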
The spectrum of speech sounds changes over time
Spectrograms help represent that third dimension
(time)
X: time
Y: frequency
Colour: energy (amplitude)
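A minimal sketch of what a spectrogram computes, using only the Python standard library. The test signal (a 200 Hz tone followed by a 600 Hz tone), the 8000 Hz sample rate, and the 50 ms non-overlapping frames are all made-up choices for illustration; real spectrogram tools typically use windowed, overlapping frames and a fast Fourier transform instead of the slow direct transform used here.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000
FRAME_LEN = 400  # 50 ms frames, so each frequency bin is 20 Hz wide


def make_signal():
    """0.1 s of a 200 Hz tone followed by 0.1 s of a 600 Hz tone (made-up signal)."""
    samples = []
    for i in range(1600):
        freq = 200.0 if i < 800 else 600.0
        samples.append(math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE_HZ))
    return samples


def frame_magnitudes(frame):
    """Magnitudes of a direct (naive) DFT for the lower half of the frequency bins."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(abs(total))
    return mags


def spectrogram(samples):
    """Rows are time frames (X), columns are frequency bins (Y), values are amplitude (colour)."""
    return [frame_magnitudes(samples[start:start + FRAME_LEN])
            for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN)]


def peak_frequencies_hz(spec):
    """Frequency of the strongest bin in each time frame."""
    bin_width = SAMPLE_RATE_HZ / FRAME_LEN
    return [max(range(len(row)), key=row.__getitem__) * bin_width for row in spec]


print(peak_frequencies_hz(spectrogram(make_signal())))
```

Each row of the result is one time slice of the spectrum, so the jump in the strongest frequency halfway through the signal shows up as a change along the time axis.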
Lecture 15
Thursday, March 1, 2018 1:01 PM