ISCA - International Speech Communication Association



ISCApad #206

Thursday, August 20, 2015 by Chris Wellekens

5-2-17 ELRA News
  

We are happy to announce that 1 new Written Corpus and 1 new Terminological Resource are now available in our catalogue.

ELRA-W0081 Khresmoi manually annotated reference corpus
ISLRN: 764-036-829-417-7
This corpus is a collection of Khresmoi English web documents annotated with key entities (such as diseases and drugs). The corpus is divided into two parts:
1. The initial corpus: 625 documents from the Genetics Home Reference data set, automatically annotated with anatomical locations and diseases, and manually corrected by 3-4 annotators. Size of documents: between 26 and 8,306 tokens each.
2. The main corpus: 6,950 English documents from the Khresmoi crawl and 5,518 English Wikipedia pages, automatically annotated through the GATE Platform for Anatomy, Disease, Drug and Investigation. Size of documents: between 200 and 2,000 tokens each.
The corpus is distributed in the GATE XML format.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1237

 

ELRA-T0375 ACL RD-TEC: A Reference Dataset for Terminology Extraction and Classification Research in Computational Linguistics
ISLRN: 699-305-362-089-6
This is a reference dataset for terminology extraction and classification research in computational linguistics. It is a set of manually annotated English terms extracted from the ACL Anthology Reference Corpus (ACL ARC). The dataset, called ACL RD-TEC, comprises more than 69,000 candidate terms, each manually annotated as a valid or an invalid term. Valid terms are further classified as technology or non-technology terms.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1236

 

 

For more information on the catalogue, please contact Valérie Mapelli (mapelli@elda.org).

 

 

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/en/catalogues/language-resources-announcements/


Follow us on Twitter:
@ELRANews




