Gnome-Voice-Control is a dialogue system to control the GNOME Desktop. It was developed during Google Summer of Code 2007.
The idea is to develop features that improve the usability of the GNOME Desktop. The goal is to implement a Desktop Voice Control System. The system consists of an application that monitors the audio input (microphone). When a significant audio signal is detected, the software captures, processes and recognizes the signal and then executes the desired action on the GNOME Desktop. The set of actions could include maximizing, minimizing or closing the active window; opening a specific program; and switching from one desktop to another, among others. GnomeVoiceControl is implemented in C in conjunction with CMU Sphinx, an open source tool created to convert speech to text.
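The last step of the pipeline described above, turning recognized text into a desktop action, can be sketched as a simple lookup table. This is only a minimal illustration, not code from GnomeVoiceControl: the phrases, the voice_action enum and the lookup_action function are all hypothetical names, and the sketch assumes the recognizer hands back an upper-case phrase as plain text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical action identifiers; not taken from the project's source. */
typedef enum {
    ACTION_NONE = 0,
    ACTION_MAXIMIZE_WINDOW,
    ACTION_MINIMIZE_WINDOW,
    ACTION_CLOSE_WINDOW,
    ACTION_NEXT_DESKTOP
} voice_action;

/* One entry maps a recognized phrase to the action it should trigger. */
typedef struct {
    const char  *phrase;
    voice_action action;
} voice_command;

/* Assumed command phrases, chosen only for illustration. */
static const voice_command command_table[] = {
    { "MAXIMIZE WINDOW", ACTION_MAXIMIZE_WINDOW },
    { "MINIMIZE WINDOW", ACTION_MINIMIZE_WINDOW },
    { "CLOSE WINDOW",    ACTION_CLOSE_WINDOW },
    { "NEXT DESKTOP",    ACTION_NEXT_DESKTOP }
};

/* Return the action for the recognized text, or ACTION_NONE when the
 * text is NULL or does not match any known phrase exactly. */
voice_action lookup_action(const char *recognized_text)
{
    size_t i;
    size_t count = sizeof command_table / sizeof command_table[0];

    if (recognized_text == NULL)
        return ACTION_NONE;

    for (i = 0; i < count; i++) {
        if (strcmp(recognized_text, command_table[i].phrase) == 0)
            return command_table[i].action;
    }
    return ACTION_NONE;
}
```

In such a design the caller would pass the recognizer's text result to lookup_action and ignore anything that maps to ACTION_NONE, so unrecognized speech never triggers a desktop action.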
If you'd like a quick start, look at the screencast here
Where to get it?
It requires pocketsphinx, gstreamer and accessibility development libraries. Please note that your interface should be in English. It's possible to work with more languages like German, Russian or Spanish, but it will require additional effort from you.
If you have any feature request or problem, please submit a bug in GNOME Bugzilla.
- You can always check out the latest development code from the SVN repository. Just run the following command:
svn checkout http://svn.berlios.de/svnroot/repos/festlang/trunk/gnome-voice-control
Alternatively you can browse it with ViewCVS
- Join the #cmusphinx channel on freenode.net if you have a quick question.
How to build
Unfortunately, the build is quite complicated since there are no packages in common distros. Please follow the instructions on the /Build page or in the README file in the tarball.
It's recommended to build a version from trunk. Also please note that there are important accessibility issues: you should either patch pocketsphinx or disable accessibility and restrict yourself to several basic commands. See the discussion on SourceForge for details.
The pocketsphinx-patch.diff is attached to this page
Your voice is not recognized properly/Your language is not supported yet?
Submit your voice to the free GPL speech corpus at http://voxforge.org. We are also considering databases for different languages, but that will require very close cooperation.
For now we are at a very early stage, but we are targeting many interesting things like support for text item navigation, mouse events, models for different languages like French, Spanish and Russian, and many more.
Slides from Raphael N. Motta and Nickolay V. Shmyrev's presentation "Speaking with your Desktop" at GUADEC 2008