Analysis: speech technology future trends

Mar 1, 2003 | English Content

In the effort to maintain a global market presence, companies face some tough challenges, such as the cost-effective management of multilingual and translingual processes. Speech technology can help, but according to a survey by the VDI/VDE Center for Information Technology (Germany), it currently faces two main obstacles: first, a lack of industry awareness of the significance of speech-based technologies and, second, an unwillingness of end customers to invest. However, these barriers are being broken down at an accelerating pace.

The exponential rate of progress, both in the area of speech technology and computer technology as a whole (storage capacities and processor performance), combined with falling prices, is providing the necessary impetus.

Additional drivers cited in the report include the increasing number of mobile end devices and the ever more widespread use of the internet and e-commerce. Predictions issued by Frost & Sullivan (Frankfurt am Main, Germany) see the European market volume in speech technology alone increasing fivefold within the next four years.

Linking speech with the processing of digital information is regarded as a fundamental technological goal. “The years 2003 and 2004 will see the emergence of a wide range of speech technology applications,” confirms Alexander Fries, marketing director at SVOX in Zurich, Switzerland. These will include speech synthesis, speech recognition (speech control, dialogue and dictation systems), translation and OCR (text recognition) systems.

Talking devices take over

Thanks to innovations in natural speech synthesis and articulation, experts look forward to a future full of talking devices: cell phones that relay SMS messages, PDAs that read out emails, VCRs that explain their own instructions, or handheld navigation devices that give directions. Such computer-generated voices promise to be of particular benefit to the blind and visually impaired.

It’s also already possible to get the same voice to talk in different languages. Before long, speech servers with text-to-speech capabilities will have invaded the market. The Giga Information Group asserts such servers could bring about a 20 to 30 percent reduction in call centre and hotline costs.

Fit the above-mentioned examples with additional speech recognition software and the devices react, even in dialogue mode, to spoken commands. Another top trend: speaker-independent, remote-controlled speech applications that work in noisy surroundings. By 2005, it is expected that 50 percent of all cell phones will feature speech recognition as standard.

Accessing websites from standard telephones using voice portals is no longer science fiction. The Giga Information Group estimates that 70 percent of American companies will be accessing the internet or their intranet via natural speech recognition by 2005. According to a study of the dictation systems market, current moves to integrate speech recognition into standard text processing software look set to galvanise an area which to date has shown slow growth.

Automatic translation

The experts at VDI/VDE are predicting a boom in the automatic translation market. The reason: for many globally active companies, translation costs account for two percent of their revenue. Moreover, 12 percent of all commercial websites are based in countries that aren’t English-speaking. A few major companies have already started using translation servers as translation aids for their employees.

With the release of suitable interpreting software, minicomputers and organizers (PDAs) look set to lead the way. According to a user study by the Fraunhofer Institute, a personal electronic translation aid could bring time savings of as much as 41.4 percent.

Multidimensional document analysis allows high-quality processing of multilingual projects. The combination of OCR and speech articulation systems, enabling for instance scanned texts to be read out loud to the blind and visually impaired, is a major development.

Future combinations

The current trend appears to be for modular concepts. The authors of the VDI/VDE report predict that “the demand for combined solutions – such as speech recognition software featuring synthesis and translation, or the integration of speech technology in other applications, e.g. operating systems or leisure consoles – will inevitably lead to further progress in the area of speech technology implementation”.

Helmut Schöbel from EteX AG (Germany) goes even further: “Long-term, I can see speech technologies being integrated into areas such as artificial intelligence, knowledge management, or biometrics.”

Filipe Samora
