Chapter 13- Speech Perception.docx

Psychology 2115A/B
Christine Tsang

Chapter 13: Speech Perception

By the end of this section, you should know:
• The acoustic signal of speech
• How variability affects speech perception
• Categorical perception
• Multimodal issues in speech
• Word segmentation
• Brain areas involved in speech

What is speech?
• An acoustic signal or stimulus
• Picture the inside of our heads: the acoustic signal is essentially air pushed up from the lungs through the vocal cords
• The air passes the vocal cords into the vocal tract, and sound is created by the vibration of the vocal cords in our throats
• All the other pieces of anatomy further up are the articulators; they change the shape of the vocal tract: narrowing it gives a higher pitch, and opening it up lowers the pitch
• Figure in textbook
• It shows the shape of the vocal tract for two different vowel sounds: a short I sound
• And another vowel, the U sound as in "put"
• You can see how the tract becomes narrow for the I sound and bigger for the U sound
• The shape of the vocal tract changes to create the different sounds that we can produce
• These sounds form the basic elements of the acoustics of speech
• Speech is a collection of ordered sounds to which meaning is attached; this is the predominant way of thinking about speech
• That's what makes it special, whereas music and sounds from the environment don't carry meaning attached to sounds
• Speech can be understood at a rate of 50 units per second; normal speech occurs at about 12 units per second
• We can compare that rate to non-speech sounds, which are limited to .67 seconds per unit
• Our ability to perceive speech sounds persists even when the sounds are distorted in specific ways
• We lose a lot of resolution in sounds that come through a phone line, but we are still able to perceive those speech sounds and make sense of what they mean

Spectrogram
- Graphs of the amount of energy at each frequency over time
- In speech they are helpful to us
- The darkness of the image = how intense the sound is and how much energy is at that frequency
- You end up with bands that run horizontally across the figure and get lighter as you go higher up
- The 4 dark bands are formants: formants are bands of energy in the spectrogram
- We number our formants: the lowest is always F1, and the highest is the last, F4
- Formants are useful information providers because they tell us which vowel sounds are being emitted
- In fact, of the formants, the first two are the ones we pay the most attention to
- The difference between F1 and F2 tells us which vowel is being sounded
- Vowels**
- The transitions between formants (T1, T2, T3), the transitions of frequencies between formants, tell us about the consonants being sounded
- Transitions are most correlated with consonants

Speech Formants
- the spectrogram shows the continuousness of speech
- it tells us more than waveforms do
- most of the energy is centered in the first two bands

Vowels
- produced by vibrating vocal cords as the air moves in and out
- for example, for all the vowels your mouth is fairly open
• Produced by vibrating the vocal folds as air moves out of the lungs through the open mouth
• The sound produced depends on the positions of structures of the vocal tract
• Example: tongue position is a critical part
• E as in "beat": you notice that your tongue is relatively high and at the front of your mouth
• In the center: A as in "sofa", produced with the tongue in the middle of your mouth at middle height
• Then the low sound of "hot": the tongue is back and low in your mouth
• The shape of your mouth is also a critical piece
• Degree of rounding of your lips
• Ex: OU sounds, with the lips in a kissing position, as in "WHO are you"
• Ex: E as in "HE" is unrounded; you have a smiley kind of face, and the lips are flat
- vowel sounds are produced in conjunction with the vocal cords moving in and out and the various articulators

Consonants
• Produced by closing or constricting the vocal tract
1) Place of articulation: where is it constricted?
2) Manner of articulation: how is it constricted?
3) Voiced vs. unvoiced: relates to place of articulation and reflects how the air is being pushed through the opening of the vocal tract
- Voiced: constricting the air out of your mouth
- Unvoiced: refers to consonants in which voicing does not begin until a longer time after constriction
o Refers to before 30 ms

What is the basic unit of speech, and how do we measure speech? We need to figure out the smallest unit and then go up from there. This is not an easy task, because speech is continuous. Do we break up the signal at the word? At the syllable? Per letter?

The Phoneme
• The smallest unit of speech sound that affects meaning
• In English we have 47 phonemes. Our vocal tract can produce 100-120 different sounds
• We can produce a lot more sounds than we have phonemes in any given language
• There are 13 major vowel sounds
• And 24 major consonant sounds
• Why?
• Vowels have different meanings → short and long
• The number of phonemes differs from language to language
• Hawaiian has few consonants but many vowel sounds, whereas some African dialects can have up to 60 phonemes
• Contains sound energy at a number of different frequencies, creating an acoustic signal

The Variability Problem

Context
• There is no simple 1:1 correspondence between the acoustic speech signal and the actual phonemes
• Context changes the relationship between the acoustic signal and an individual phoneme
• This is a problem because of co-articulation: the overlap between the articulation of one phoneme and a neighbouring phoneme in a word, which causes variations depending on what the neighbouring phoneme is
• The B sounds in "bat" and "boot" are different
• In "bat" your lips are flat around the B sound
• In "boot" your lips are rounded
• The two B's are different from a signal point of view, but from a perceptual point of view we hear them as the same
• "I owe you a yo-yo"
• If you look at the spectrogram, the "owe" does not look the same as the O's in "yo-yo"
• Yet the perception of O is constant
- Figure in textbook
o Shows us the formant bands
o It shows someone saying "di" and someone saying "du"
o Consonants are defined by the transitions of formants
o If you look at "di" and "du" you can see the formant transitions very clearly
o The "d" in "du" is still perceived as the same "d" as in "di"
o The actual consonant signal varies depending on which vowel follows the consonant sound

Speaker Variation
- this creates a problem in terms of the constancy of sounds
- all our voices sound different
- we can change the pitch of our voices and people can still make sense of what we are saying
- accent and pronunciation
• Pitch
• Speed
• Pronunciation

Categorical Perception
How do we solve this variability problem?
- we categorically perceive sounds
- this constrains the variability
- we do this with colour too
• Categorical perception occurs when a wide range of acoustic cues results in the perception of a limited number of sound categories
• It is a common solution the brain uses when it deals with long ranges of continuous values
• In speech, consonants are categorically perceived
• The classic set of studies has settled on one particular measure → voice onset time (VOT)
• VOT: the time between when a sound begins and when we start voicing it (the vocal folds vibrate)

The Voice Onset Time
- to study VOT:
• Manipulate VOT
• Phonemic boundaries: listeners describe and label which phoneme they hear
• Change the amount of time from voicing to unvoicing in the stimulus
• The shorter the VOT, the more likely listeners are to say "ba"; the longer it is, the more likely they are to hear "pa"
• Figure in textbook
• Psychophysical curve
• We're looking at "da" and "ta"
• We present VOTs from 0 all the way to 35 ms: listeners hear "da"
• Around 45 ms we get "ta"
• There is a very short range of voice onset times where people are guessing
• We never hear a mixture of "da" and "ta", only one or the other
• Everything on the left side of the phonetic boundary is perceived as "da", and everything on the other side is perceived as "ta"
• Listeners don't hear incremental changes in voicing
• Instead they hear a sudden change in the stimulus from "da" to "ta"
• Voicing refers to how quickly we add the buzz: quickly for "da", less quickly for "ta"
• People might pronounce "t" with different VOTs depending on which accent they speak with; categorical perception allows us to categorize those VOTs as being the same
• This is the way to get around articulation and variability: we ignore any small variability around that consonant
• If you think about context variability ("bat", "boot"), we probably hear the same B because the voice onset time has not changed significantly
• Another interesting thing is that you can shift the phonetic boundary; this happens through exposure
• Example: when I am away at school all year my A's are different, but when I'm here all year my A's start to sound like how Ontarians say them

Is Speech Special?
• Phonemic boundaries are perceived by non-human species
• Speech may just be a case of auditory perception
• These are species that do not have language and speech
• This is the case when it looks like your dog understands you
• Adaptation effects: the fact that we can shift the phonetic boundaries suggests that speech is a specialized, highly practiced stimulus
• We have categorical perception of vowels as well
• Not special: it's practiced

March 28

Speech is not just auditory processing. Although we talk about it as an auditory or acoustic signal, there is a lot more to its perception than that and than what is present in the stimulus.

McGurk Effect
• Speech perception is not just a product of auditory processing
• It is multimodal: auditory processing and visual processing
• Video
• Auditory illusion
• The senses work together to perceive our world
• First, we watched a boy say DA DA DA; you hear something in the middle