Speechlab

KeyVoice

Voice instead of Keyboard (1996)


Modern computer systems offer a rapidly growing number of applications and services (for example, telephoning, faxing, or controlling devices such as radios, TVs or home appliances) that can be very helpful, particularly for people with various kinds of disabilities. Unfortunately, not all of these people can take advantage of them, simply because their handicap does not allow them to use a keyboard or a mouse. For them, a voice-controlled computer could be one of the most appropriate options.

The approach to voice control developed at our lab differs from those used in similar systems. Instead of designing special new software for the handicapped or adding a speech recogniser to each particular application, we have implemented our voice input as an interface to the MS Windows operating system. The interface allows the user to replace keyboard and mouse actions with voice commands, or to alternate between the two. Since the number of these standard actions is limited (as is the number of keys and mouse buttons), the control-word vocabulary can also be limited to an acceptable size. Thanks to this design, virtually any application written for the MS Windows platform can be controlled by voice without updating or retraining the system vocabulary.

The first experimental version of the voice input has been designed for Czech users. The interface takes the form of a Windows process that runs in the background, so that the main application can still be started and manipulated. The interface's key component is an isolated-word HMM-based recogniser, and a lot of effort has been devoted to making its implementation fast and reliable. Although the recognition system must handle some 120 vocabulary items, it is capable of operating in speaker-independent, real-time mode on a common personal computer (486DX4 or better). The only additional hardware required is a 16-bit sound card.
The command-word set includes the spelling names of all the letters, the numbers 0 to 12, the modifier keys (ALT, SHIFT, CTRL), Czech names for the control keys (such as "Vezmi" for Enter, "Smaž" for Delete and "Vlož" for Insert), words for non-letter characters (such as +, - and /) and several system commands (for example, Keep, Release, Repeat and Finish).

Preliminary tests performed with several tens of users showed that the voice interface might be practically applicable. The users were able to use the Program Manager to start any program, and they managed to control the File Manager, the Control Panel, the Calculator, the Scheduler and even Write or Word. Of course, they first had to learn the standard keyboard control of Windows and of the applications; the same keyboard actions could then be replaced by equivalent voice commands. Since many Windows applications offer well-structured menus, access keys and shortcuts, the tests demonstrated in practice that voice control does not require much speaking effort or extra time. It is even possible to dictate text into a word processor and format it in the same way as with the keyboard and mouse. In such an application, an experienced user can achieve an input rate of about 60 characters per minute.
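The idea of replacing keyboard actions with command words can be sketched as a small dispatcher. This is only an illustration under stated assumptions. The class `VoiceKeyboard`, its method `hear` and the rule that a spoken modifier applies to the next key are hypothetical. Only the three Czech control-key words and the modifier names come from the text above. The system commands (Keep, Release, Repeat, Finish) are left out because their exact behaviour is not described here.

```python
from dataclasses import dataclass, field

# Czech control-key words quoted in the text, mapped to key names.
CONTROL_KEYS = {"Vezmi": "ENTER", "Smaž": "DELETE", "Vlož": "INSERT"}
MODIFIERS = {"ALT", "SHIFT", "CTRL"}


@dataclass
class VoiceKeyboard:
    """Turns a stream of recognised words into key chords (hypothetical)."""

    pending: list = field(default_factory=list)  # modifiers waiting for a key
    events: list = field(default_factory=list)   # key chords emitted so far

    def hear(self, word):
        # Assumption: a spoken modifier is held until the next key word.
        if word in MODIFIERS:
            self.pending.append(word)
            return
        # Assumption: any other word (letter, digit, symbol) is its own key.
        key = CONTROL_KEYS.get(word, word)
        self.events.append("+".join(self.pending + [key]))
        self.pending.clear()


keyboard = VoiceKeyboard()
for spoken in ["ALT", "F", "Vezmi"]:
    keyboard.hear(spoken)
```

After these three words, `keyboard.events` holds the chords `ALT+F` and `ENTER`, the same sequence a keyboard user might type to open a menu and confirm an item. In a real interface the chords would be passed to the operating system as synthetic key events rather than collected in a list.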