ELRA's WLR Validation Centre at CST


Validation of Written Language Resources

In 2000, ELRA’s Board set up a Validation Committee whose aim was to 'maximize the “ease of use” and “suitability” of the language resources (LR) which may be needed for LE-systems'. The term ‘validation’ refers to the quality evaluation of a database against one or more checklists of relevant criteria. These criteria are typically the technical and linguistic specifications on which the resource is based. The purpose of validation is to improve the quality of language resources and to support their adherence to standards.

In the context of ELRA, the validation of written language resources (WLR) covers both lexica and corpora.

In November 2002, ELRA’s Validation Centre for Written Language Resources became operational at CST, together with its network of experts. Its main task is to develop a methodology for the validation of WLR and to write validation manuals for lexical resources and for corpora. The methodology is to be based on the experience gained from validating resources from the ELRA/ELDA catalogue.

In 2006, the following tasks have been completed or are underway:

  • Quick Quality Check (QQC) checks and reports for 7 resources, together with the development of tools for QQC production
  • CST and SPEX presented their experience with the QQC methodology for spoken and written resources at a workshop at LREC 2006. Hanne Fersøe was an invited speaker at the same workshop, talking about CST's experience with the validation of written resources in general.

In addition, the following tasks were carried over from 2005 and completed:

  • QQC checks and reports for 6 resources

In 2005, the following tasks were carried out:

  • Updates to the QQC methodology
  • QQC checks and reports for 6 resources
  • Preparation of an abstract about QQC for LREC 2006

In addition, the following tasks were carried over from 2004:

  • Update of the Validation Manual for Lexica (full validation)
  • Consolidation of inputs on corpus validation into Version 1 of a Validation Manual for Corpora
  • Establishment of a draft bug reporting procedure for WLR, and a test of it on a resource

In 2004, the following tasks were carried out:

In 2003, the following tasks were carried out:

In 2002, the following tasks were carried out:

  • The validation web site at CST was established
  • A first draft version of a validation manual for WLR-lexica was created
  • Sub-tasks were agreed with members of the network of experts