CMD 276 Lecture 10: Speech Perception


University of Rhode Island
Communicative Disorders
CMD 276
Milner Bethany

Perception of Speech
● How we assign meaning to what the auditory system receives
● Occurs by perceiving the whole signal rather than the acoustic features of each sound individually
○ This is especially difficult given that speech is dynamic and occurs rapidly
■ Phonemes vary according to coarticulation and prosody
● There is no single accepted theory of speech perception

Issues in Speech Perception
● The primary recognition problem addressed by investigators is:
○ How the form of a spoken word is recognized from the acoustic information in the speech waveform
● Three basic issues are addressed repeatedly in the speech perception literature:
○ 1. Acoustic-phonetic invariance:
■ A distinct set of acoustic features corresponds to each phoneme
■ Each time a certain phoneme is produced, the same acoustic features are identifiable, regardless of context
○ 2. Linearity:
■ In a word, a specific sound corresponds to each phoneme
■ The units of sound that correspond to phonemes are discrete and ordered in a particular sequence
○ 3. Segmentation:
■ A speech signal can be divided and recombined into independent units that correspond to specific phonemes
● The phonemes can be moved around → cats, scat, tacs
○ These principles imply a 1:1 connection between the acoustic and phonemic properties of sounds in words that allows for speech perception; research has demonstrated this is not exactly true:
■ There are more acoustic cues than phonemes in words
■ The acoustic properties of a phoneme vary based on phonetic context
■ The acoustic properties of one phoneme overlap with those of adjacent phonemes
● Articulators move continuously during conversational speech, so vocal tract shape is influenced by the phonemes before and after the target
■ Temporal (time) boundaries between phonemes are inconsistent in a spoken message
■ Coarticulation shows the lack of clear segmentation in speech
● There is a great amount of sound production variability between and within speakers
○ How can listeners still pick out the relevant features of speech and understand the message?
● Because so much information comes in during speech, how do we decide how auditory information is encoded, and what is the best or most natural coding unit to select?

Vowel Perception
● Theories suggest we perceive ever-changing vowels by their F2:F1 ratio (the relationship between the first two formant frequencies)
○ Remains relatively stable across speakers (a small numeric sketch follows the Consonant Perception notes below)
● Listeners use additional cues to correctly perceive vowels
○ F0 and the formants of preceding sounds
○ Context of ongoing speech
■ Knowledge of the speaker, topic, grammar, social context, and environment

Consonant Perception
● Accurate consonant perception is more difficult to achieve than accurate vowel perception
○ Articulators move more rapidly for consonant production
○ There are more consonants than vowels
○ Consonants have more complex acoustic cues
● Listeners use categorical perception
○ Each sound has a set of features that differentiate it from other sounds
○ The listener categorizes speech stimuli such that stimuli within a particular category sound alike; perception is noncontinuous
■ Example: voice onset time (VOT) and the frequency of F1 at onset serve as cues for differentiating a voiced from a voiceless stop (see the second sketch below)
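The formant-ratio idea from the Vowel Perception notes can be illustrated with a toy calculation. The formant values below are hypothetical numbers chosen only for illustration (they are not measurements from the lecture); the point is that absolute F1 and F2 differ between speakers while the F2:F1 ratio stays comparatively stable.

```python
# Hypothetical F1/F2 values (Hz) for the same vowel produced by two speakers.
# These numbers are illustrative only, not data from the lecture.
vowels = {
    "speaker A /a/": {"F1": 710, "F2": 1100},
    "speaker B /a/": {"F1": 850, "F2": 1310},
}

for speaker, formants in vowels.items():
    ratio = formants["F2"] / formants["F1"]
    print(f"{speaker}: F1={formants['F1']} Hz, F2={formants['F2']} Hz, "
          f"F2:F1 ratio = {ratio:.2f}")

# Although the absolute formant frequencies differ between the two speakers,
# the F2:F1 ratios come out close to each other (about 1.55 in both cases),
# which is one proposed cue listeners could use to normalize across talkers.
```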
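Categorical perception along a VOT continuum can be sketched in the same spirit. This is a minimal illustration that assumes a hypothetical category boundary of about 30 ms for a /b/–/p/ type contrast; the boundary value and the stimulus list are assumptions for demonstration, not figures given in the lecture.

```python
# Minimal sketch of categorical perception along a voice onset time (VOT) continuum.
# The 30 ms boundary for the /b/ vs. /p/ decision is an assumed, illustrative value.
VOT_BOUNDARY_MS = 30

def perceive_stop(vot_ms: float) -> str:
    """Map a VOT value to a phoneme category (noncontinuous perception)."""
    return "/b/ (voiced)" if vot_ms < VOT_BOUNDARY_MS else "/p/ (voiceless)"

# A continuum of stimuli that differ only in VOT (ms).
continuum = [0, 10, 20, 25, 35, 40, 60, 80]

for vot in continuum:
    print(f"VOT = {vot:2d} ms -> heard as {perceive_stop(vot)}")

# Stimuli on the same side of the boundary are heard as the same phoneme even
# though their VOT values differ; the percept jumps abruptly at the boundary
# rather than changing gradually, which is the hallmark of categorical perception.
```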
Speech Perception Theories
● There are 3 categories of speech perception theories:
○ Active vs. passive
○ Bottom-up vs. top-down
○ Autonomous vs. interactive

Active vs. Passive
● Active theories:
○ Emphasize the link between speech production and perception
■ Knowing how a sound is produced helps the listener recognize it
○ Speech sounds are sensed → analyzed for phonetic properties by referencing how such sounds are produced → recognized
● Passive theories:
○ Stress the sensory aspects of speech perception
■ The sensory consequences of the articulators hitting their targets
○ Place less stress on knowledge of speech production

Bottom-Up vs. Top-Down
● Because speech perception is complex, no one theory is entirely bottom-up or top-down
● Bottom-up theories:
○ Data-driven
○ All of the information needed to recognize a sound is contained in the acoustic signal
■ Everything you need to identify the sound is in the sound itself
■ Perception is built up from small bits of information
● Top-down theories:
○ Higher-level (brain) linguistic and cognitive processes are important in the perception and interpretation of the speech signal