Abiro  - Personal Pages

 

Speech Technology

Man/machine interface design and technologies have always interested me. The most intuitive method for user input/output, yet still a very hard one to achieve with quality, is to have the device speak and listen. This is especially useful for hands-free use and for people with disabilities. What is obvious to a human being, though, takes quite complex mathematics and powerful CPUs to realize (even a shadow of it) in a computer.

Nowadays a normal desktop computer or smartphone is powerful enough to handle continuous speech recognition in software without any special hardware (except sound capabilities, of course).

Speech synthesis is much less demanding on the CPU than speech recognition. The most typical application for speech synthesis nowadays is text-to-speech: you feed the application text in a given language and it speaks the text, with inflection and similar effects added based on marks placed in the text.
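To make the idea of marks in the text a bit more concrete, here is a minimal sketch in Python that builds marked-up text for a text-to-speech engine. It assumes the marks are written as SSML (the W3C Speech Synthesis Markup Language), which is only one possible convention and not necessarily what any of the products mentioned on this page use. The function name `to_ssml` and the `(kind, value)` part format are made up for the illustration; the sketch only produces the marked-up string and does not speak it.

```python
from xml.sax.saxutils import escape


def to_ssml(parts):
    """Build an SSML string from a list of (kind, value) parts.

    kind is "text", "emphasis", or "pause" (value in milliseconds).
    This part format is a hypothetical convention for this sketch.
    """
    out = ["<speak>"]
    for kind, value in parts:
        if kind == "text":
            # Plain text is escaped so that &, < and > cannot break the markup.
            out.append(escape(value))
        elif kind == "emphasis":
            out.append('<emphasis level="strong">' + escape(value) + "</emphasis>")
        elif kind == "pause":
            out.append('<break time="%dms"/>' % int(value))
        else:
            raise ValueError("unknown part kind: %r" % (kind,))
    out.append("</speak>")
    return "".join(out)


if __name__ == "__main__":
    print(to_ssml([
        ("text", "Turn left in "),
        ("emphasis", "200 meters"),
        ("pause", 300),
        ("text", "then keep right."),
    ]))
```

The resulting string would then be handed to whatever speech engine is in use, provided that engine accepts SSML input.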

On mobile phones speech recognition is the buzz at the moment, but having messages, news and directions spoken automatically while, for example, driving would no doubt be very nice. If you could then also place calls by speaking names or numbers, and perform other actions the same way, that would make for a complete hands-free solution, and a less risky one than trying to read or key in messages.
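Calling by spoken name or number can be sketched as a small step that runs after recognition. The Python example below assumes the recognizer has already turned the speech into a text string; the function `parse_command`, the `contacts` dictionary (lower-case name to number) and the returned action tuples are all hypothetical names chosen for this illustration, not part of any real phone API.

```python
# Spoken digit words mapped to digits ("oh" is commonly said for zero).
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def parse_command(utterance, contacts):
    """Turn recognized text into an (action, argument) tuple.

    contacts maps lower-case names to phone numbers (an assumption
    of this sketch). Anything not understood yields ("unknown", text).
    """
    words = utterance.lower().split()
    if len(words) < 2 or words[0] not in ("call", "dial"):
        return ("unknown", utterance)

    rest = words[1:]
    # Case 1: the caller spoke a number, as digit words or digit groups.
    if all(w in DIGIT_WORDS or w.isdigit() for w in rest):
        number = "".join(DIGIT_WORDS.get(w, w) for w in rest)
        return ("dial_number", number)

    # Case 2: the caller spoke a name that is in the contact list.
    name = " ".join(rest)
    if name in contacts:
        return ("dial_number", contacts[name])

    return ("unknown", utterance)


if __name__ == "__main__":
    contacts = {"anna": "0701234567"}
    print(parse_command("Call Anna", contacts))
    print(parse_command("dial five five five one two", contacts))
```

A real hands-free solution would also need confirmation prompts and handling of misrecognized names, which this sketch leaves out.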

Start with the comp.speech FAQ for up-to-date information about algorithms, products, etc. The sites listed here are the ones I've made use of myself.

General

News Groups

Research and Toolkits

Product Information

Microsoft

Microsoft now includes speech recognition in Office, and you can very easily write your own speech-enabled applications in Visual Basic or similar, for both input and output. My own applications Notify and Agent are very simple examples of text-to-speech via MS Agent, which is extremely easy to use (see the code for Agent).

© 2004-2010 Abiro. All rights reserved. Terms of Service | Privacy Statement
Site design, programming and information by Anders Borg.