The World Wide Web Consortium (W3C) has just issued VoiceXML 2.0, a markup language for voice applications on the World Wide Web. VoiceXML 2.0 allows developers to create audio dialogs that feature synthesised speech, digitised audio, recognition of spoken and DTMF (touch-tone) key input, recording of spoken input, telephony, and mixed-initiative conversations.
“VoiceXML 2.0 has the power to change the way phone-based information and customer services are developed. No longer will we have to press ‘one’ for this or ‘two’ for that. Instead, we will be able to make selections and provide information by speech,” explained Dave Raggett, W3C Voice Browser Activity Lead.
“In addition, VoiceXML 2.0 creates opportunities for people with visual impairments or those needing web access while keeping their hands and eyes free for other things, such as getting directions while driving,” Mr Raggett added.
In the W3C Speech Interface Framework, VoiceXML controls how the application interacts with the user, while the Speech Synthesis Markup Language (SSML) is used for spoken prompts and the Speech Recognition Grammar Specification (SRGS) for guiding the speech recognisers via grammars that describe the expected user responses.
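To illustrate the division of labour described above, here is a minimal sketch of a VoiceXML 2.0 dialog embedded in a short Python script. The form id, field name, prompt wording and grammar items are all hypothetical and chosen purely for illustration. The script only checks that the markup is well-formed and extracts the prompt and the expected replies; it does not run a voice browser.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.0 documents.
VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical dialog: one form with one field (all names are illustrative).
DOCUMENT = """<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-GB">
  <form id="order">
    <field name="drink">
      <!-- Spoken prompt: may carry SSML elements such as emphasis -->
      <prompt>Would you like <emphasis>coffee</emphasis> or tea?</prompt>
      <!-- Inline SRGS grammar: the replies the recogniser should expect -->
      <grammar type="application/srgs+xml" version="1.0" root="drink">
        <rule id="drink">
          <one-of>
            <item>coffee</item>
            <item>tea</item>
          </one-of>
        </rule>
      </grammar>
      <!-- VoiceXML dialog logic: runs once the field has a value -->
      <filled>
        <prompt>You chose <value expr="drink"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def summarise(document: str) -> dict:
    """Parse the dialog and report its prompt and expected replies."""
    root = ET.fromstring(document)
    ns = {"v": VXML_NS}
    field = root.find("v:form/v:field", ns)
    prompt = field.find("v:prompt", ns)
    grammar = field.find("v:grammar", ns)
    # Collect every grammar item by its fully qualified tag name.
    expected = [item.text for item in grammar.iter(f"{{{VXML_NS}}}item")]
    return {
        "version": root.get("version"),
        "field": field.get("name"),
        # Join the prompt text across child elements and normalise spaces.
        "prompt": " ".join("".join(prompt.itertext()).split()),
        "expected": expected,
    }


if __name__ == "__main__":
    print(summarise(DOCUMENT))
```

In this sketch the `<form>`, `<field>` and `<filled>` elements are the VoiceXML dialog logic, the `<prompt>` content is what the speech synthesiser renders, and the `<grammar>` block tells the recogniser which answers to listen for.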
Other specifications in the framework include Voice Browser Call Control (CCXML), which provides telephony call control support for VoiceXML or other dialog systems, and Semantic Interpretation for Speech Recognition, which defines the syntax and semantics of the contents of tags in SRGS.
There is also an extensive set of test suites publicly available with the VoiceXML 2.0 Candidate Recommendation. While the initial version contains over 300 tests, the final version is expected to have more than 500 tests.
The W3C Voice Browser Working Group is amongst the largest and most active in W3C. Its participants include Canon, Comverse, France Telecom, Genesys Telecommunications Laboratories, HP, HeyAnita, Hitachi, IBM, Intel, Loquendo, Microsoft, Mitsubishi, Motorola, Nokia, Philips, and SAP.
Since 1999, W3C has been working on its Speech Interface Framework to expand access to the web by allowing people to interact via keypads and spoken commands, and by listening to pre-recorded speech, synthetic speech and music.
The W3C was created to lead the web to its full potential by developing common protocols that promote its evolution and ensure its interoperability.