Introduction & Applications

The first multimedia software project that I’m working on is an application that changes the playback rate of recorded speech, called time scale modification of digitized speech in the technical literature.  This is more complicated than just playing back digitized speech at a faster or slower rate than it was originally recorded, because simply changing the playback rate changes both the speech rate and the pitch of the speech.  The object of this development is to play digitized speech at different rates with no change in pitch and no loss of intelligibility.
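A minimal Python sketch makes the pitch problem concrete.  It is an illustration only, not the project’s code: the function names `make_tone`, `naive_speedup`, and `estimate_pitch` are hypothetical, a pure 200 Hz tone stands in for real speech, and the 8000 Hz sample rate is an assumed value.  It speeds up the signal the naive way (keeping every second sample and playing the result at the original sample rate) and then estimates the pitch by counting zero crossings.

```python
import math


def make_tone(freq_hz, sample_rate, duration_s):
    """Generate a sine tone as a list of floats (a stand-in for voiced speech)."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def naive_speedup(samples, factor):
    """Keep every factor-th sample.

    Played back at the original sample rate, the result is `factor` times
    shorter, which is the same as playing the recording `factor` times faster.
    """
    return samples[::factor]


def estimate_pitch(samples, sample_rate):
    """Rough frequency estimate: upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


if __name__ == "__main__":
    sample_rate = 8000  # assumed sample rate, in Hz
    tone = make_tone(200.0, sample_rate, 1.0)
    fast = naive_speedup(tone, 2)

    print("original duration (s):", len(tone) / sample_rate)
    print("sped-up duration (s): ", len(fast) / sample_rate)
    print("original pitch (Hz):  ", estimate_pitch(tone, sample_rate))
    print("sped-up pitch (Hz):   ", estimate_pitch(fast, sample_rate))
```

The sped-up signal is half as long, but its estimated pitch is roughly double the original.  That doubling is the side effect that time scale modification has to avoid.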

I’ll give an overview of the techniques used to accomplish this in a later post, but first I want to describe why we might be interested in doing this at all.  Here are a few possible applications:

  1. Voice Mail: Have you ever gotten one of those long-winded voice mail messages that rambles on forever, only to miss the return phone number at the end?  Or had someone rush through their name and phone number so quickly that you couldn’t quite catch it?  This application would let you fast-forward through long-winded messages by speeding up the speaking rate.  You could still understand the message, but it wouldn’t take as long to listen to.  You could also slow down important information such as names and contact numbers, either to make it more intelligible or just to give yourself enough time to remember it.
  2. Speech Therapy: By slowing down the speaking rate, students would hear the correct pronunciation of words better.  If the student’s speech is recorded and played back at a slow rate, they would better understand the difference between how they are pronouncing words and the correct pronunciation.
  3. Audio Books for the Blind: Audio books are a real boon for people who can’t see well enough to read, but they still have one major disadvantage.  Sighted people can skim over parts of a book that are not interesting or important, then slowly read complex writing or important information.  Being able to change the speaking rate when listening to an audio book could give audio book users a similar capability.
  4. Changing the speaking rate of digitized speech is a good learning experience because it requires understanding the component sounds that make up speech and techniques for detecting these component sounds.
  5. It’s fun!

So, that’s the why of this project.  Next I’ll start to get into the how.
