ISCA - International Speech
Communication Association



ISCApad #201

Wednesday, March 11, 2015 by Chris Wellekens

5-2-1 ELRA - Language Resources Catalogue - Update (2015-02)
  
We are happy to announce that 1 new Written Corpus and 3 Evaluation Packages are now available in our catalogue.

ELRA-W0082 88milSMS. A corpus of authentic text messages in French
ISLRN: 024-713-187-947-8
A multidisciplinary team of linguists and computer scientists collected more than 88,000 authentic French text messages in Montpellier in 2011 as part of the sud4science LR project. The text messages were semi-automatically anonymised, then partially transcoded into standardised French and annotated.


ELRA-E0043 CLEFeHealth 2014 Task 3 Evaluation Package
ISLRN: 725-020-897-275-7
The CLEFeHealth 2014 Task 3 Evaluation Package contains the data used for the shared task on user-centred health information retrieval at the CLEFeHealth Lab conducted in 2014. Task 3 aimed to evaluate information retrieval for addressing questions patients may have when reading clinical reports.

ELRA-E0044 REPERE Evaluation Package
ISLRN: 360-758-359-485-0
The REPERE Evaluation Package contains the visual annotation of 60 hours of French news TV shows, for the purpose of person recognition within TV programs. This annotation concerns both persons and written information appearing on screen.
The data provided consists of:
- video files with indexes and with manual transcriptions in XGTF format (Viper),
- audio files compressed in WAV format with transcriptions in TRS format (Transcriber).

ELRA-E0045 MAURDOR Evaluation Package
ISLRN: 364-018-517-901-2
The MAURDOR project evaluates systems for the automatic processing of written documents. The collected documents are scans of printed, typewritten or handwritten documents. This package contains 8,129 documents, which were manually annotated after collection. It contains the material provided to the evaluation campaign participants:
- Consistent development and test data corresponding to the application concerned;
- Tools for the automatic measurement of system performance;
- A common assessment protocol applicable to each processing stage, along with a complete automatic processing chain for written documents.
The documents are provided in TIFF format and the annotations are provided in XML format.


For more information on the catalogue, please contact Valérie Mapelli (mapelli@elda.org).

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/en/catalogues/language-resources-announcements/
Follow us on Twitter @ELRANews
