PSY384H5 Lecture Notes - Lecture 9: Diphone, Speech Perception, Speech Synthesis
Document Summary
Text-to-speech (TTS) systems turn written text into synthesized auditory speech. Making a good TTS system is harder than it sounds. To begin with, the text has to be normalized. All abbreviations must be found and converted to their full form (ex. in "St. John St.", "St." is either "Saint" or "Street"). Symbols must be converted to text. Acronyms have to be spelled out (ex. TTS; note, however, USA vs. UNESCO: USA is read letter by letter, while UNESCO is read as a word). Numbers must also be dealt with (note: "the year 1999" is read differently from "1999 cars", even though the digits are the same).

Spelling does not map consistently onto sound. "Ph" in "phone" must become "f", but "ph" in "haphazard" does not. The words "this" and "thin" start with different sounds. The letters "ough" are pronounced differently in "tough", "though", "bough", "cough", and "through". The same spelling can even have two pronunciations in one sentence: "The dog lead was not made of lead."

Computers reading long passages (or even short utterances!) never seem to get the intonation right (ex. "Paul is a dog", "Monday is good"); imagine a bad TTS system reading Hamlet!
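
The normalization problems above (abbreviations, acronyms, numbers) can be made concrete with a small sketch. This is only an illustration, not how any real TTS system works: the function names, the acronym tables, and the heuristics ("St." before a capitalized word is "Saint"; a four-digit number right after the word "year" is read as a year) are all assumptions made up for this example, and each one is easy to break.

```python
import re

# Hypothetical lookup table, invented for this sketch.
WORD_ACRONYMS = {"UNESCO"}  # read as a word; every other all-caps token is spelled out

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Read 0..99 aloud."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("" if ones == 0 else "-" + ONES[ones])


def as_cardinal(n):
    """Read 0..9999 as a quantity ("1999 cars")."""
    parts = []
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest or not parts:
        parts.append(two_digits(rest))
    return " ".join(parts)


def as_year(n):
    """Read a four-digit year in pairs; only sensible for roughly 1100..1999."""
    high, low = divmod(n, 100)
    if low == 0:
        return two_digits(high) + " hundred"
    if low < 10:
        return two_digits(high) + " oh " + ONES[low]
    return two_digits(high) + " " + two_digits(low)


def expand_st(text):
    """Crude heuristic: "St." before a capitalized word is "Saint", else "Street"."""
    text = re.sub(r"\bSt\.(?=\s+[A-Z])", "Saint", text)
    return re.sub(r"\bSt\.", "Street", text)


def _read_acronym(match):
    token = match.group(0)
    return token if token in WORD_ACRONYMS else " ".join(token)


def _read_number(match):
    prefix, digits = match.group(1) or "", match.group(2)
    n = int(digits)
    if prefix and len(digits) == 4:  # "year 1999" -> year reading
        return prefix + as_year(n)
    return prefix + as_cardinal(n)


def normalize(text):
    """Apply the three toy rules in order: abbreviations, acronyms, numbers."""
    text = expand_st(text)
    text = re.sub(r"\b[A-Z]{2,}\b", _read_acronym, text)
    return re.sub(r"(\byear\s+)?\b(\d{1,4})\b", _read_number, text)


print(normalize("St. John St."))    # Saint John Street
print(normalize("the year 1999"))   # the year nineteen ninety-nine
print(normalize("1999 cars"))       # one thousand nine hundred ninety-nine cars
print(normalize("USA and UNESCO"))  # U S A and UNESCO
```

Each rule fails quickly outside these examples ("On St. John St." would expand the first "St." correctly but only by luck, and any shouted all-caps word would be spelled out), which is the point of the notes: every step needs context that simple rules do not capture.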