584 — Special Report: Human Language Technology benchmarking in Europe

Mar 28, 2003 | English-language content

According to a report issued this week by Euromap, a European Commission-supported initiative dedicated to fostering the uptake of Human Language Technologies (HLT), Germany is the leading country on HLT performance, with the highest opportunity index, followed by the Netherlands and the UK. Furthermore, Euromap concluded that R&D in HLT is a very complex process that needs public sector support.

National approaches to HLT research have mirrored local priorities and structures: in Germany, for example, large comprehensive programmes with a single focus (e.g. Verbmobil) linked industry to the research community in a very structured way. In the UK a relatively early Speech and Language Technology Programme solidified a strong network of national researchers.

Behind Germany, the Netherlands and the UK is a ‘Strong Potential’ group that scored near or below average on the Opportunity Index, but above average on the HLT Benchmark. This group includes France, Belgium and Spain.

These countries have well-developed research communities, and a significant depth of HLT research, so they are in a strong position to exploit HLT as opportunity factors improve, e.g. as rates of internet use rise and greater support for business creation is forthcoming.

A third group is rated ‘Promising’, with Ireland and Denmark ranked near average on both scores, just behind Sweden, which scored highest on opportunity factors, and Finland, which is above average on HLT.

Finally, there is a group of four countries (Greece, Italy, Portugal and Austria) which have reached the ‘Structural Limits’ of their existing HLT market situation, and require a new approach to catch up with the leaders. They all scored below average on both measures, though with different profiles. Both Greece and Portugal scored low on Opportunity factors, though Greece scored higher on HLT measures, due to its strong R&D base.

“Both these countries may need to look beyond their borders for opportunities to exploit their HLT research, and will benefit from enhanced EU collaboration. Portugal, in particular, could improve its research opportunities with more cross-border collaboration.

“Italy has a slightly stronger research base than most of its fellow countries. Austria has the advantage of sharing a language with the leading HLT research country, but this very fact might also act as a disincentive when it comes to expanding its own HLT activities,” the report stated.

Euromap also suggested the creation of an autonomous language technology agency: “As the HLT research community in Europe becomes ever more integrated, language expertise migrates across the whole of the EU, while naturally retaining its roots in national language communities. It is essential that language technology expertise and linguistic expertise be free to migrate and integrate across the EU research community”.

The report also concludes that there has been no direct link between the robustness of the HLT research effort in any particular language community and the actual effectiveness of its transfer to market, suggesting that HLT development is still somewhat hermetic at this stage.

The transfer of HLT to market is strongly influenced by three factors: the size of the linguistic community, the business environment and infrastructure, and the sharpness of research focus.

All HLT relies on core language processing components that digitally emulate the way humans process language. These components can be based on linguistic rules (such as grammar), on statistical analysis (e.g. to measure the probability that a text or an utterance has a particular meaning), or on a mix of the two.

In addition, all HLT techniques need a source of linguistic data as a reference, such as a lexicon (a dictionary coded with grammatical information), or a ‘corpus’ that provides a large database of the raw material of language, either text or speech.
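
To make the distinction between rule-based, statistical and mixed components more concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the Euromap report: the tiny lexicon, the tagged corpus, the single grammar rule and all function names are hypothetical assumptions. It labels each word with a grammatical category by first applying a rule and then falling back on corpus counts.

```python
from collections import Counter

# Hypothetical toy lexicon: each word maps to the grammatical categories it may take.
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "book": {"NOUN", "VERB"},
    "flight": {"NOUN"},
    "we": {"PRON"},
    "read": {"VERB"},
}

# Hypothetical toy corpus of (word, tag) pairs, standing in for a large annotated corpus.
CORPUS = [
    ("we", "PRON"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN"),
    ("we", "PRON"), ("book", "VERB"), ("the", "DET"), ("flight", "NOUN"),
    ("we", "PRON"), ("read", "VERB"), ("the", "DET"), ("book", "NOUN"),
]

# How often each (word, tag) pair occurs in the corpus.
TAG_COUNTS = Counter(CORPUS)


def rule_filter(previous_tag, candidates):
    """Rule-based step (assumed rule): a determiner is not followed by a verb."""
    if previous_tag == "DET":
        allowed = candidates - {"VERB"}
        # Keep the original candidates if the rule would eliminate everything.
        return allowed or candidates
    return candidates


def statistical_choice(word, candidates):
    """Statistical step: pick the tag seen most often with this word in the corpus."""
    return max(sorted(candidates), key=lambda tag: TAG_COUNTS[(word, tag)])


def tag_sentence(words):
    """Mixed approach: apply the rule first, then let corpus counts decide."""
    tags = []
    previous = None
    for word in words:
        candidates = set(LEXICON.get(word, {"UNKNOWN"}))
        candidates = rule_filter(previous, candidates)
        previous = statistical_choice(word, candidates)
        tags.append(previous)
    return tags


print(tag_sentence(["we", "book", "a", "flight"]))
print(tag_sentence(["we", "read", "the", "book"]))
```

In this toy setup the corpus counts alone favour reading "book" as a verb, so the rule is what yields the noun reading after "the". Real systems work with far larger lexicons and corpora and far richer rules.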

Datamonitor puts the worldwide speech technology market for 2003 near 1 billion euros. IDC estimates the current NLP (Natural Language Processing) market at around 400 million euros. By 2005, the combined speech/NLP market is forecast to exceed 2 billion euros.

Filipe Samora
