Lecture 10

Music and Speech Perception

Matthias Neimier

PSYB51 - Lec 10
Friday, March 21st, 2014

Hearing in the Environment
• Our advantage in hearing/speech comes from evolution
• Extremely sophisticated
• Complex Sounds
• Auditory Scene Analysis
• Music
  o A marvel why we have developed this phenomenon
  o Not something that can save our lives
  o Expressing experiences/emotions
• Speech
  o Related to music → they help each other
• Complex Sounds
• Harmonics
  o How they come to exist
  o Missing fundamentals
• Timbre
  o Power spectrum of harmonic sounds
  o Timbre aftereffect
    ▪ Aftereffect for sounds
    ▪ Crucial for speech perception
  o Attack & decay

Complex Sounds
• Harmonics
  o Sounds composed of many sounds of specific frequencies
    ▪ Several sine wave tones
    ▪ Add up pure tones of all sorts of frequencies
  o If the frequencies have a certain relationship to each other
    ▪ Integer multiples
    ▪ Multiples of some fundamental frequency, aka the 1st harmonic
    ▪ Regular patterns
  o Lowest frequency of harmonic spectrum: fundamental frequency
  o Auditory system is acutely sensitive to natural relationships between harmonics
    ▪ Very sensitive
  o E.g. vowels
    ▪ Harmonic sounds we produce
  o Missing-fundamental effect
    ▪ Helps us understand how we perceive harmonics
    ▪ The pitch we perceive is the frequency of the 1st harmonic, not that of the lowest tone actually played
    ▪ True even when the 1st harmonic is not played at all
    ▪ You perceive the sound at the pitch it would have with the fundamental present

Missing Fundamental (Part 1)
• Missing fundamental (x1) is not/barely noticed
  o More like an illusion
• More harmonics can be missing
• How come?
  o We can determine it using numbers
• You hear a sound of 20 Hz, then you hear ones of 30 Hz, 40 Hz, 50 Hz
  o Question: what is the fundamental frequency?
  o Answer: 10 Hz
  o Distance between neighbouring frequencies is 10 Hz
• Mathematically… it’s not a mystery
• Demo in Matlab
  o Example is not a pure tone
  o When all 5 tones are played at the same time, you hear the fundamental frequency
  o Taking away certain harmonics, you can still hear the fundamental frequency
  o Adding up the different sine waves results in a complex-ish pattern
  o Has peaks at certain points
    ▪ Have the same spacing as the peaks in the original harmonic sound
    ▪ The overlapping peaks recur at the rate of the fundamental frequency

Missing Fundamental (Part 3)
• Missing fundamental harmonic: 250 Hz
• 2nd, 3rd & 4th harmonics overlap in peaks every 4 ms
  o Added up they yield a fluctuation in energy at 250 Hz
  o Temporal coding of frequency
    ▪ Picked up by temporal code in cochlea

Complex Sounds (cont’d)
• Sounds can sound different even though they have the same frequency
• Timbre: Psychological sensation by which a listener can judge that two sounds that have the same loudness and pitch are dissimilar; conveyed by harmonics and other high frequencies
  o Perception of timbre depends on context in which sound is heard
    ▪ Sounds can sound different due to the context in which they are heard
  o Experiment by Summerfield et al. (1984)
    ▪ “Timbre contrast” or “timbre aftereffect”
    ▪ Played sounds to people, and asked people to name the sound
    ▪ You may think it is the vowel ‘e’ but what you hear is a harmonic sound
    ▪ Power declines as frequency increases
    ▪ Created a sound that is the opposite of the vowel ‘e’ (artificial sound created by a synthesizer)
      • Similar harmonics… but has some harmonics cut out
    ▪ Listen to something… then the next time you hear it again, it is reduced
  o Fundamental frequency reduced when you listen to it a second time… certain ranges in which harmonics are reduced… adaptation
    ▪ Will get peaks
    ▪ Fingerprint for vowel ‘e’
    ▪ Due to adaptation… they will think they hear the vowel ‘e’ even when they don’t
  o Causes additional challenge for perception because we hear different
    vowels together at a time
  o Sounds change in time
• Singing a song… everyone sings at different pitches
• Vowels sound the same even when different people say them
• Attack and decay of sound
  o Attack: part of a sound during which amplitude increases (onset)
    ▪ Plucking
  o Decay: part of a sound during which amplitude decreases (offset)
    ▪ Energy decreases more quickly
• Violin pluck vs. bow
  o Onset of sound is much more immediate when you pluck it
• Same with speech: baa vs. waa

Hearing in the Environment
• Auditory Scene Analysis
  o Segregating sound sources
    ▪ Attack & decay
    ▪ Separate sounds that don’t belong together
  o Grouping
  o The ventriloquist effect
    ▪ Beyond vision

Auditory Scene Analysis
• What happens in natural situations?
  o We hear a lot of things around us
  o E.g. frog in pond, splashes, bird chirping → all three sounds together
  o A challenge to tell things apart
• Acoustic environment can be a busy place
• Multiple sound sources
• How does auditory system sort out these sources?
  o How does the system tell things apart?
  o Certain tricks
• Source segregation, or auditory scene analysis
• A number of strategies segregate sound sources:
• Spatial separation between sounds; motion parallax
  o Walk around: seeing things pass by (motion parallax)
  o Angle changes for people at different distances
  o Move around to segregate where sound comes from
  o Sensory phenomenon
• Separation on basis of sounds’ spectral or temporal qualities
• Auditory stream segregation: Perceptual organization of a complex acoustic signal into separate auditory events for which each stream is heard as a separate event
  o Graphically, when you hear sound, you are more likely to group tones with similar frequencies together than tones whose frequencies are further apart
  o Have tones that have similar frequencies
  o Sound segregation in different frequencies
• Gestalt law: “similarity”
  o Sounds are grouped based on this
  o Frequencies far apart
  o Group frequencies together based on the law
• Gestalt: German for “form”.
  In perception, a term introduced by a school of thought stressing that the perceptual whole could be greater than the sum of its parts
• Grouping by timbre
  o Tones that have increasing and decreasing frequencies, or tones that deviate from rising/falling pattern “pop out” of sequence
    ▪ Group things together using timbre of sound
    ▪ Sounds sharing similar timbres will be grouped together
    ▪ Depends on how timbre is chosen
  o On an organ you have different pipes for the same pitch that sound different
• Grouping by onset
  o Harmonics of speech sound or music
    ▪ Consist of multiple pure tones
  o Grouping different harmonics into a single complex tone
  o Rasch showed that it is much easier to distinguish two notes from one another when onset of one precedes onset of other by a very short time
    ▪ Can tell right away if they don’t have the same onset
  o Gestalt law of common fate
  o Does the bottle break?
    ▪ Can you hear if the bottle is bouncing and/or breaking?
• Spectrogram: A pattern for sound analysis that provides a 3D display of intensity as a function of time and frequency
    ▪ Power spectrum, by comparison: power (how loud it is) on the vertical axis, frequency on the horizontal axis
    ▪ Shows how it sounds regardless of time
    ▪ Spectrogram plots things in 3D
    ▪ Energy is plotted using colour, with time on the horizontal axis and frequency on the vertical axis
  o E.g., bouncing/breaking bottle
    ▪ First graph is bouncing bottle, quite a bit of energy in between
    ▪ Fundamental frequency is much more powerful
    ▪ Idealization
    ▪ Depends on vibrational physical properties of bottle
  o Harmonics are the same, same spikes of power
    ▪ Essentially same bottle
  o Breaking bottle is much messier
    ▪ Shards that hit the floor at different times
    ▪ Harmonic sounds not the same
  o Second graph… there is more than one thing… can tell using timbre
• Multisensory integration: vision (usually) helps audition
  o Audition is not an island by itself
  o Functions with all perceptual systems together
  o Uses whatever is useful
• Ventriloquist effect: An audio-visual illusion in which sound is
  misperceived as emanating from a source that can be seen to be moving appropriately when it actually emanates from a different invisible source
  o Visual dominance for location
    ▪ Localizing where things are
  o Everyone uses lip-reading to a certain degree
  o Puppet seems to be speaking… but it’s actually the ventriloquist who is making the sounds
    ▪ Puppet moves its “lips”
    ▪ See lip movements
  o Visual system is much more acute in terms of telling what direction things are coming from
  o Multisensory form of perception
    ▪ Reassigns direction of sounds towards the lips of the puppet
  o Visual system determines what you actually are hearing, not in terms of direction, but what the sound is actually about
• Ventriloquist effect illustrates how we are able to put things together in real life
  o Use visual information to segregate who is talking

Hearing in the Environment (cont’d)
• Ability to fill in information
  o With so many sounds around us, we can miss things
• Continuity and restoration effects
• Gestalt principle of good continuation for simple sounds…
• … and for complex ones

Continuity and Restoration Effects
• How do we know that listeners really hear a sound as continuous?
• Principle of good continuation: In spite of interruptions, one can still “hear” sound
  o Fill in things whether they are there or not
• Experiments that use a signal detection task (e.g., Kluender and Jenison) suggest that at some point restored missing sounds are encoded in brain as if they were actually present!
  o Bar demonstrates frequency over time
  o There could be white noise…
  o Ask people whether they heard the tone during the white noise
    ▪ They couldn’t tell whether the tone was played or not at all
  o Very difficult to tell the difference
• Simple filling in of information
• Restoration of complex sounds (e.g., music, speech)
• “Higher-order” sources of information, not just auditory information
  o “The *eel fell off the car.” (wheel)
    ▪ Will automatically fill in the ‘wh’
  o “… the table” (meal)
    ▪ If you talk about a table... you would fill it in with meal rather than wheel
• Can use semantics to fill in information

The Restoration Effect
• Example of filling in
• Add something to occlude it, the weird shapes become a cube
• Perceptual system sees boundaries that separate the shapes
• Amodal completion
• Noise helps comprehension.
  o “The mailman brought the letter.”
    ▪ Spectrum that codes this sentence
• A) Spectrogram: frequency vs. time w/ colour coding
• B) Same spectrogram, but with blue spaces
  o Blue = less energy = silence
  o Missing pieces, but no sensory evidence; you have great difficulty understanding it
• C) Another version of the 1st spectrogram
  o But with high-energy white noise
  o Clear words of the sentence are still missing
  o Yet you can fill it in using perceptual system

Music
• What’s special about music?
  o What’s the difference between tone height and tone chroma?
    ▪ Important concept
  o Chords
  o Melody
• Music is a way to express thoughts and emotions
    ▪ Main purpose to induce happiness
    ▪ Fundamentally important
  o Pythagoras: Numbers and music intervals
  o Some clinical psychologists practice music therapy
• Musical notes
  o Sounds of music extend across a frequency range from about 25 to 4500 Hz
    ▪ Instruments designed to play pitches with these frequencies
  o Pitch: The psychological aspect of sounds related mainly to the fundamental frequency
• The sounds of music extend across a frequency range from about 25 to 4200 Hz
  o Illustrates frequencies of different instruments
  o Piano has quite a large range
  o Auditory threshold curve: lower end of where we perceive sound; below that, energy is too little to perceive things
  o Region where threshold is best: → hearing there is the best
  o No musical instruments that play around 10 000 Hz
    ▪ Best frequency for us to hear
    ▪ Humans designed instruments that way
    ▪ Enjoy music better if these frequencies did not exist
    ▪ Works only up to 5000 Hz… beyond that becomes place coding
      • Can’t tell difference between music, can’t enjoy it
• Octave: The interval between two sound frequencies having a ratio of 2:1
    ▪ Ratios are always the same
  o Example: middle C (C4) has a fundamental frequency of 261.6 Hz; notes that are one octave from middle C are 130.8 Hz (C3) and 523.2 Hz (C5)
    ▪ Octaves create the sound C… but frequencies are different
  o C3 (130.8) sounds more similar
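
Side note (not from the lecture): the 2:1 octave ratio can be checked with a few lines of Python. This is only a minimal sketch; the function name shift_octaves and the constant MIDDLE_C_HZ are made up for illustration, and the only value taken from the notes is 261.6 Hz for middle C (C4).

```python
# Octave arithmetic: one octave is a 2:1 ratio between two frequencies.
# MIDDLE_C_HZ and shift_octaves are illustrative names, not from the lecture.

MIDDLE_C_HZ = 261.6  # fundamental frequency of middle C (C4), as quoted in the notes


def shift_octaves(freq_hz, n_octaves):
    """Return the frequency n_octaves above (positive) or below (negative) freq_hz."""
    return freq_hz * 2.0 ** n_octaves


c3 = shift_octaves(MIDDLE_C_HZ, -1)  # one octave below middle C
c5 = shift_octaves(MIDDLE_C_HZ, 1)   # one octave above middle C

# The ratio between neighbouring octaves is always 2:1, whatever the starting note.
print("C3:", round(c3, 1), "Hz")
print("C5:", round(c5, 1), "Hz")
print("C5 / C4 ratio:", c5 / MIDDLE_C_HZ)
```

Halving and doubling 261.6 Hz should land on the C3 and C5 values given in the octave example above; the same function works for any other starting note.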
