   LREC Workshop on Cross-Language Search and Summarization of Text and Speech
   May 16, 2020 
   Palais du Pharo, Marseilles, France
 
 
   Call for Papers (http://users.umiacs.umd.edu/~oard/clssts)
    
   In today's global world, people may need information that is available only in languages other than their own. Cross-Language Information Retrieval (CLIR) enables end users to issue queries in their own language but provides results from multiple languages around the world, often using translation so that the end user can quickly understand whether the retrieved results are relevant. Cross-language summarization can make it easier for an end user to determine if a document is relevant by providing a summary of the foreign language document in the user's language, highlighting the evidence for relevance.  When the foreign language is a low-resource language, cross-language search and summarization are more difficult; translation capabilities may be poor, and the lack of resources makes it difficult to train CLIR and summarization systems.  To complicate matters further, when the collection contains speech as well as text, producing accurate search results and generating comprehensible summaries is harder still.
    
   This workshop aims to stimulate the collection and provision of resources that can improve systems that perform cross-language search and summarization.  To facilitate dissemination of information about existing resources, the workshop will feature keynote speeches and panels by people who have worked in this area, have cross-language resources to share, or can describe ongoing research programs and shared tasks. Papers are also solicited that describe recent and current research in these areas, that describe relevant resources, or that stake out positions on the directions in which the authors think the field should move.
    
   To set the stage, the organizers will provide two small spoken language test collections that include waveforms, transcriptions and possibly queries with relevance judgments. These are conversational genres, one in Somali (a very low-resource language) and the other in Bulgarian (a moderate-resource language), both of which include approximately 80 hours of speech. We will welcome papers that provide results on these test collections as well as results on any datasets that are available from ELDA, LDC, or other repositories. Participants are also encouraged to describe other datasets that they have access to and to report results on these.
    
   We solicit papers on research that broadly relates to supporting information access to lower-resource languages, addressing topics such as the following:
    
   Test collections for evaluating CLIR
   Development of new cross-lingual resources
   Datasets for cross-lingual summarization
   Methods for CLIR
   CLIR over speech
   Evidence generation for CLIR
   Methods for cross-lingual summarization
   Methods for cross-lingual query-focused summarization
   Snippet generation
   Speech summarization
   Multilingual language generation
   Zero-shot learning and domain adaptation
   Explainable methods for cross-lingual NLP
    
    
   Paper length: Both long papers (8 pages plus references) and short papers (4 pages plus references) are welcome. Papers must follow the LREC stylesheet. Papers must be submitted through START at this link: https://www.softconf.com/lrec2020/CLSSTS2020/
    
    
   Important dates: 
   Submissions due: February 15th, 11:59pm AOE
   Acceptance notifications: March 12th
   Camera ready copy due: April 1st
   Workshop date: May 16th
 
 
   Contact person: Kathy McKeown, Kathy@cs.columbia.edu
    
   Organizing Committee:
   James Allan, UMass Amherst (USA)
   Lu Wang, Northeastern University (USA)
   Kathy McKeown, Columbia University (USA)
   Douglas W. Oard, University of Maryland (USA)
   Steve Renals, University of Edinburgh (UK)
   Richard Schwartz, BBN (USA)
    
   Identify, describe and share your Language Resources (LRs): 
   Authors will have the opportunity, when submitting a paper, to upload LRs in a special LREC repository.  This effort of sharing LRs, linked to the LRE Map for their description, contributes to creating a common repository where everyone can deposit and share data. As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2020 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.
 
 