Abiro  - Personal Pages
 

Speech Technology

In my full-time job I'm involved in specifying user interfaces for mobile phones (among many other things), and man/machine interface technologies have always interested me. The most intuitive method for user input and output, though still very hard to achieve with quality, is to have the device speak and listen. This is especially useful for handsfree use and for people with disabilities. What is obvious to a human being, though, requires quite complex mathematics and powerful CPUs to realize (even a shadow of) in a computer. Nowadays a normal desktop computer is powerful enough to handle continuous speech recognition in software without any special hardware (except sound capabilities, of course). Mobile phones are just starting to get there (Samsung has introduced such a phone).

Speech synthesis is much less demanding on the CPU than speech recognition. Its most typical application nowadays is text-to-speech: you feed the application text in a given language and it speaks the text, optionally with added inflection and the like, controlled by marks embedded in the text. On mobile phones speech recognition is the buzz at the moment, but having messages, news and directions spoken automatically while, for example, driving would no doubt be very nice. If you could then also place calls by speaking names or numbers, and perform other actions the same way, you would have a complete handsfree solution.
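The idea of text with embedded marks controlling how it is spoken can be sketched roughly as follows. This is only an illustration of the principle, not the tag set of any particular product: the mark syntax (`\pause=MS\` and `\emph\`), the function name `parse_marked_text` and the event names are all assumptions made up for this example. The sketch just splits marked-up text into a sequence of events that a speech engine could then act on.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical mark syntax, for illustration only:
#   \pause=500\  -> pause for 500 milliseconds
#   \emph\       -> emphasize what follows
MARK = re.compile(r"\\(pause=(\d+)|emph)\\")

Event = Tuple[str, Optional[object]]


def parse_marked_text(text: str) -> List[Event]:
    """Split marked-up text into ('say', str), ('pause', int) and ('emph', None) events."""
    events: List[Event] = []
    pos = 0
    for match in MARK.finditer(text):
        # Plain text before the mark is something to be spoken.
        chunk = text[pos:match.start()].strip()
        if chunk:
            events.append(("say", chunk))
        if match.group(2) is not None:
            # The mark was \pause=N\; keep the duration as an integer.
            events.append(("pause", int(match.group(2))))
        else:
            # The mark was \emph\.
            events.append(("emph", None))
        pos = match.end()
    # Whatever remains after the last mark is spoken as well.
    tail = text[pos:].strip()
    if tail:
        events.append(("say", tail))
    return events


if __name__ == "__main__":
    for event in parse_marked_text(r"New message. \pause=500\ It is \emph\ urgent."):
        print(event)
```

A real text-to-speech engine would of course do far more (pronunciation, prosody, audio output); the point here is only that the marks travel inside the text and are separated from the words before speaking.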

Check the comp.speech FAQ first for up-to-date information about algorithms, products, etc. The sites listed here are the ones I've used myself.

General

News Groups

Research and Toolkits

Product Information

Microsoft

Microsoft now includes speech recognition in Office, and you can very easily write your own speech-enabled applications, for both input and output, in Visual Basic or similar. My own applications Notify and Agent are very simple examples of text-to-speech via MS Agent, which is extremely easy to use (see the code for Agent).

(c) 2004-2008 Abiro. All rights reserved.
Site design, programming and information by Anders Borg.