Quick note: Publishing equations on the Web

I’m transcribing my paper notes on short-time measurement of the speech signal and needed a way to insert some equations into my blog post.  An easy way to do this is to enter the equations in GNU TeXmacs, which can produce PostScript files.  Then I use GIMP, the GNU Image Manipulation Program [...]

Zooming in on speech

The plot from last time doesn’t reveal much about the nature of speech so I’ve been looking at some smaller bits.  First I isolated the first phrase from the sample speech: “Mister Maryk”:

Plot of the phrase "Mister Maryk"

This is a good chunk of speech for experimenting with classification algorithms, but it is still too [...]

Handling Speech Samples with GNU Octave

The first step is to get some sample speech to work with.  I found this clip on the Web.  The first problem is that all of the speech samples that I could find on the Web were encoded in MP3 format.  Speech processing requires linear encoding.  I also wanted to sample at 8 kHz, which is [...]
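As a rough illustration of what linear encoding at 8 kHz looks like in practice, here is a minimal sketch in Python (the post itself works in GNU Octave) that writes and reads back mono 16-bit linear PCM using only the standard library.  The function names, the 440 Hz test tone, and the 16-bit sample width are assumptions made for this example; they are not taken from the original workflow, and the sketch does not decode MP3.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # 8 kHz, the sampling rate mentioned in the post


def write_linear_pcm(samples, dest, sample_rate=SAMPLE_RATE):
    """Write floats in [-1.0, 1.0] as mono 16-bit linear PCM WAV data."""
    frames = b"".join(
        # Clip to [-1, 1], scale to the signed 16-bit range, pack little-endian.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(dest, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes per sample = 16-bit linear PCM
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


def read_linear_pcm(src):
    """Return (sample_rate, list of ints) from a mono 16-bit WAV source."""
    with wave.open(src, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit linear PCM")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    return rate, list(struct.unpack("<%dh" % count, raw))


# Hypothetical test signal: 0.1 s (800 samples) of a 440 Hz tone.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(800)]
buffer = io.BytesIO()
write_linear_pcm(tone, buffer)
buffer.seek(0)
rate, pcm = read_linear_pcm(buffer)
```

Each value in `pcm` is directly proportional to the signal amplitude at that instant, which is the property the word "linear" refers to.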

Algorithm Prototyping: GNU Octave

Signal processing algorithms are usually prototyped and tested using high-level math tools such as MATLAB or Mathcad. These tools use a high-level language that closely models actual mathematical equations. Tools like these include many built-in math functions and utilities to plot results. MATLAB even has add-on packages that can [...]

Beagle Board Contest Entry

The Beagle Board is a low-cost single-board computer that is well suited to prototyping multimedia embedded systems applications.  It is based on the Texas Instruments OMAP3530 application processor.  The TI OMAP3530 integrates an ARM Cortex-A8 RISC processor and a TI C64x+ DSP, making it an ideal processor for digital audio and video [...]

Basic Technique

Basically, time-scale modification of speech is accomplished by first dividing the speech into segments.  Then segments are deleted to speed up the speaking rate, or segments are repeated to slow it down.  The two key issues are how to segment the speech and which segments can be deleted or repeated without degrading intelligibility.

Speech [...]
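The delete-or-repeat idea can be sketched in a few lines of Python.  This is a deliberately naive illustration, not the method developed in these posts: it assumes fixed-length segments and picks segments uniformly, so it sidesteps both key issues named above (how to segment, and which segments are safe to drop or repeat).  The names `split_into_segments` and `change_rate` are made up for this example.

```python
def split_into_segments(samples, segment_len):
    """Split a list of samples into consecutive fixed-length segments."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [samples[i:i + segment_len]
            for i in range(0, len(samples), segment_len)]


def change_rate(samples, segment_len, rate):
    """Naive time-scale modification on whole segments.

    rate > 1 speeds playback up by skipping segments;
    rate < 1 slows it down by repeating segments;
    rate == 1 returns the samples unchanged.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    segments = split_into_segments(samples, segment_len)
    out = []
    k = 0  # index of the output segment being produced
    while True:
        source = int(k * rate)  # which input segment to copy next
        if source >= len(segments):
            break
        out.extend(segments[source])
        k += 1
    return out


samples = list(range(8))                # stand-in for speech samples
faster = change_rate(samples, 2, 2.0)   # keeps every other segment
slower = change_rate(samples, 2, 0.5)   # plays every segment twice
```

At an 8 kHz sampling rate, a `segment_len` of 160 would correspond to 20 ms segments; that figure is only an assumed starting point for experiments.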

Introduction & Applications

The first multimedia software project that I’m working on is an application that changes the playback rate of recorded speech, called time-scale modification of digitized speech in the technical literature.  Obviously this is more complicated than just playing back digitized speech at a faster or slower rate than it was originally recorded.  Doing [...]