Speak with your desktop


Multimodal interfaces are a very active area of research. Recent advances let you use speech and other input methods in your everyday work, making interaction with the computer more productive. Free software has also made great progress in this domain in recent years. This talk will discuss speech-related desktop issues and demonstrate use cases for this natural and very promising interaction method.


Speech is a very powerful method of information exchange. People speak much faster than they type and understand the spoken word better than text. Although speech is an active area of research and many issues remain, the GNOME desktop already allows you to control applications with speech. The traditional use of speech in the accessibility framework is also important.

The whole area of speech is probably still too complicated for an average user, but we hope that the situation will change soon. Stable and reliable speech recognition is really possible today; it just needs some work in this direction. Speech-related infrastructure such as CMUSphinx [2] and Voxforge [3] is in very active development, but these projects really need our involvement to create applications for end users.

The gnome-voice-control [1] application, developed during Google Summer of Code 2007, is thought of as just a small part of the application suite. We will demonstrate its capabilities, such as command and control and multilingual speech recognition. We will also highlight some directions for the development of this application, such as work on spoken language understanding.

Command and control is certainly not the only area where speech can be applied. Others include dictation and voice search, both of which can be used successfully. Voice notes in Tomboy would be a nice capability demonstrating this. Even a use case as hard as dictation can be implemented with a proper adaptation technique and a little work.

One day speech input will be as important as traditional input methods are today. We hope to give a brief overview of all directions of development in this area.

[1] gnome-voice-control: http://live.gnome.org/GnomeVoiceControl

[2] CMUSphinx: http://cmusphinx.sourceforge.net

[3] Voxforge: http://voxforge.org

