3-3-2 (2020-05-13) REPROLANG (part of LREC Conference), Marseille, France
FIRST CALL FOR PAPERS
REPROLANG 2020
Shared Task on the Reproduction of Research Results in Science and Technology of Language
(part of LREC 2020 conference)
Marseille, France
May 13-15, 2020
http://wordpress.let.vupr.nl/lrec-reproduction
We are very pleased to announce REPROLANG 2020, the Shared Task on the Reproduction of Research Results in Science and Technology of Language, organized by ELRA - European Language Resources Association with the technical support of CLARIN - European Research Infrastructure for Language Resources and Technology, as part of the LREC 2020 conference.
BACKGROUND
Scientific knowledge is grounded on falsifiable predictions, and thus its credibility and raison d'être rely on the possibility of repeating experiments and obtaining results similar to those originally obtained and reported. In many young scientific areas, including ours, the acknowledgement and promotion of the reproduction of research results need to be greatly increased.
For this reason, a special track on reproducibility is included in the regular program of the LREC 2020 conference (side by side with sessions on other topics) for papers on the reproduction of research results, and the present community-wide shared task is launched to elicit and motivate the spread of scientific work on reproduction. This initiative builds on the previous pioneering LREC workshops on reproducibility, 4REAL 2016 and 4REAL 2018.
SHARED TASK
The shared task is of a new type. It is partly similar to the usual competitive shared tasks, in the sense that all participants share a common goal, but it is partly different from previous shared tasks, in the sense that its primary focus is on seeking support and confirmation of previous results rather than on surpassing those results with superior ones. Thus, instead of a competitive shared task, with each participant striving for an individual top system that scores as far as possible above a rough baseline, this will be a cooperative shared task, with participants striving for systems that reproduce an original complex research experiment as closely as possible and thus reinforce the reliability of its results when their outcomes converge. At the same time, as with competitive shared tasks, participation in the cooperative shared task offers excellent ground for new ideas for improvement and for new advances beyond the reproduced results.
We invite researchers to reproduce the results of a selected set of articles, which have been offered by their respective authors, with their consent, for use in this shared task. Papers submitted for this task are expected to report on reproduction findings, to document how the results of the original paper were reproduced, to discuss reproducibility challenges, to report the time, space or data requirements found for training and testing, to reflect on lessons learned, to elaborate on recommendations for best practices, etc. Submissions that, in addition to the reproduction exercise, also report on results of replicating the selected tasks with other languages, domains, data sets, models, methods, algorithms, downstream tasks, etc. are also encouraged. These should also provide insight into the robustness of the replicated approaches, their learning curves and potential for incremental performance, their capacity for generalization, their transferability across experimental circumstances and into potential real-life usage scenarios, their suitability to support further progress, etc.
PUBLICATION
LREC conferences have one of the top h5-index scores of research impact among the world class venues for research on Human Language Technology.
Accepted papers for the shared task will be published in the Proceedings of the LREC 2020 main conference. LREC Proceedings are freely available from ELRA and ACL Anthology. They are indexed in Scopus (Elsevier) and in DBLP. LREC 2010, LREC 2012 and LREC 2014 Proceedings are included in the Thomson Reuters Conference Proceedings Citation Index (the other editions are being processed).
Substantially extended versions of papers selected by reviewers as the most appropriate will be considered for publication in special issues of the Language Resources and Evaluation Journal published by Springer (a SCI-indexed journal).
IMPORTANT DATES
November 25, 2019: deadline for paper submission (aligned with LREC 2020)
November 27: deadline for projects in gitlab.com to go public
February 14, 2020: notification of acceptance
May 11-16: LREC conference takes place
SELECTED TASKS
The Selection Committee has selected a broad range of papers and tasks.
Chapter A: Lexical processing
Task A.1: Cross-lingual word embeddings
Artetxe, Mikel, Gorka Labaka, and Eneko Agirre. 2018. "A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings". In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 789-798. http://aclweb.org/anthology/P18-1073 Major reproduction comparables: Accuracy scores (tables 1 to 4).
Task A.2: Named entity embeddings
Newman-Griffis, Denis, Albert M Lai, and Eric Fosler-Lussier. 2018. "Jointly Embedding Entities and Text with Distant Supervision". In Proceedings of The Third Workshop on Representation Learning for NLP, pp. 195-206. http://aclweb.org/anthology/W18-3026 Major reproduction comparables: Spearman's ρ scores for semantic similarity predictions (tables 3 and 4), and accuracy scores (table 6).
Chapter B: Sentence processing
Task B.1: POS tagging
Bohnet, Bernd, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily Pitler, and Joshua Maynez. 2018. "Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings". In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2642-2652. http://aclweb.org/anthology/P18-1246 Major reproduction comparables: f-score values (tables 2 to 8).
Task B.2: Sentence semantic relatedness
Gupta, Amulya, and Zhu Zhang. 2018. "To Attend or not to Attend: A Case Study on Syntactic Structures for Semantic Relatedness". In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2116-2125. http://aclweb.org/anthology/P18-1197 Major reproduction comparables: Pearson's r and Spearman's ρ scores for semantic relatedness (table 1), and f-score values for paraphrase detection (table 2).
Chapter C: Text processing
Task C.1: Relation extraction and classification
Rotsztejn, Jonathan, Nora Hollenstein, and Ce Zhang. 2018. "ETH-DS3Lab at SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation Classification and Extraction". In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval 2018), pp. 689-696. http://aclweb.org/anthology/S18-1112 Major reproduction comparables: precision, recall and f-score values (tables 3 and 4).
Task C.2: Privacy preserving representation
Li, Yitong, Timothy Baldwin, and Trevor Cohn. 2018. "Towards Robust and Privacy-preserving Text Representations". In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 25-30. http://aclweb.org/anthology/P18-2005 Major reproduction comparables: POS accuracy scores (tables 1 and 2), and sentiment analysis f-scores (table 3).
Task C.3: Language modelling
Howard, Jeremy, and Sebastian Ruder. 2018. "Universal Language Model Fine-tuning for Text Classification". In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 328-339. http://aclweb.org/anthology/P18-1031 Major reproduction comparables: Error rate (%) scores in sentiment analysis and question classification tasks (tables 2 and 3).
Chapter D: Applications
Task D.1: Text simplification
Nisioi, Sergiu, Sanja Stajner, Simone Paolo Ponzetto, and Liviu P. Dinu. 2017. "Exploring Neural Text Simplification Models". In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), pp. 85-91. http://aclweb.org/anthology/P/P17/P17-2014.pdf Major reproduction comparables: Averaged human evaluation scores, by 3 evaluators, in 1 to 5 and -2 to +2 scales (table 2).
Task D.2: Language proficiency scoring
Vajjala, Sowmya, and Taraka Rama. 2018. "Experiments with Universal CEFR classifications". In Proceedings of Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 147-153. http://aclweb.org/anthology/W18-0515 Major reproduction comparables: f-score values (tables 2, 3 and 4).
Task D.3: Neural machine translation
Vanmassenhove, Eva, and Andy Way. 2018. "SuperNMT: Neural Machine Translation with Semantic Supersenses and Syntactic Supertags". In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 67-73. http://aclweb.org/anthology/P18-3010 Major reproduction comparables: BLEU scores (tables 1 and 2; plots in figures 2, 3 and 4).
Chapter E: Language resources
Task E.1: Parallel corpus construction
Brunato, Dominique, Andrea Cimino, Felice Dell'Orletta, and Giulia Venturi. 2016. "PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification". In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), pp. 351-361. https://aclweb.org/anthology/D16-1034 Major reproduction comparables: data set.
Participants are expected to obtain the data and tools for the reproduction from the information provided in the paper. Using the description of the experiment is part of the reproduction exercise.
SUBMISSION
The START platform of LREC 2020 will be used for the submission of the following required elements: a paper describing the reproduction effort, and a link to the software and data used to obtain the results reported in the paper (more details below). The submitted materials and results will be checked by a CLARIN panel. Papers will be peer-reviewed.
PAPER PREPARATION
REPROLANG 2020 invites the submission of full papers of 4 to 8 pages (plus additional pages for references if needed). These submissions must strictly follow the LREC 2020 conference stylesheet, which will be available on the conference website.
MATERIALS PREPARATION
For the submission to be complete and checkable by the CLARIN panel, the software used to obtain the results reported in the paper must be made available as a docker container through a project in gitlab. Detailed instructions are available at https://gitlab.com/CLARIN-ERIC/reprolang/
For technical support, the CLARIN team can be contacted at reprolang-tc@clarin.eu, or an issue can be created under https://gitlab.com/CLARIN-ERIC/reprolang/issues.
Submissions are made via the START conference management system used by LREC 2020 and include the following elements:
- URL of your gitlab.com project
- URL of the tar.gz with the datasets
- the md5 checksum of the above tar.gz
- .pdf with the paper, which must include the above URL of your gitlab.com project and the corresponding commit hash and tag
The project in gitlab.com should be made public within 2 days after the submission deadline.
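The md5 checksum required among the submission elements can be computed with any standard tool. As a minimal sketch, assuming Python 3 is available, the following uses only the standard hashlib module; the function name md5_of_file and the file name datasets.tar.gz mentioned below are hypothetical and not part of the official instructions.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large dataset archives
    do not have to fit into memory at once.
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, print(md5_of_file("datasets.tar.gz")) would print the checksum of an archive with that (hypothetical) name in the current directory. The value should be recomputed whenever the archive changes, so that the checksum submitted in START matches the file behind the submitted URL.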
PRESENTATION
Papers accepted for publication will be presented in a specific session of the LREC main conference. There is no difference in quality between oral and poster presentations. Only the appropriateness of the type of communication (more or less interactive) to the content of the paper will be considered. The format of the presentations will be decided by the Program Committee. The proceedings will include both oral and poster papers in the same format.
REGISTRATION
For a selected paper to be included in the programme and to be published in the proceedings, at least one of its authors must register for the LREC 2020 conference by the early bird registration deadline. A single registration covers only one paper, following the general LREC policy on registration. The registration service can be found on the LREC 2020 website.
António Branco, University of Lisbon (chair of Steering Committee)
Nicoletta Calzolari, ILC, Pisa (co-chair of Steering Committee)
Gertjan van Noord, University of Groningen (chair of Task Selection Committee)
Piek Vossen, VU University Amsterdam (chair of Program Committee)
Task Selection Committee
Gertjan van Noord, University of Groningen (chair)
Tim Baldwin, University of Melbourne
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Çağrı Çöltekin, University of Tuebingen
Nancy Ide, Vassar College, New York
Malvina Nissim, University of Groningen
Stephan Oepen, University of Oslo
Barbara Plank, University of Copenhagen
Piek Vossen, VU University Amsterdam
Dan Zeman, Prague University
Program Committee
(Invitations still awaiting an answer are marked with [!].)
Piek Vossen, VU University Amsterdam (chair)
[!] Gilles Adda, LIMSI-CNRS, Paris
[!] Eneko Agirre, Basque University
Francis Bond, Nanyang Technological University, Singapore
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Kevin Cohen, University of Colorado Boulder
[!] Thierry Declerck, DFKI Saarbruecken
[!] John McCrae, Galway University
Nancy Ide, Vassar College, New York
[!] Antske Fokkens, VU University Amsterdam
Karën Fort, University of Paris-Sorbonne
[!] Cyril Grouin, LIMSI-CNRS, Paris
Mark Liberman, University of Pennsylvania
[!] Margo Mieskis
[!] Aurélie Névéol, LIMSI-CNRS, Paris
Gertjan van Noord, University of Groningen
Stephan Oepen, University of Oslo
[!] Ted Pedersen, University of Minnesota
Senja Pollak, Jozef Stefan Institute, Ljubljana
[!] Paul Rayson, Lancaster University
Martijn Wieling, University of Groningen
Technical Committee
reprolang-tc@clarin.eu
Dieter Van Uytvanck, CLARIN (chair)
André Moreira, CLARIN
Twan Goosen, CLARIN
João Ricardo Silva, CLARIN and University of Lisbon
Luís Gomes, CLARIN and University of Lisbon
Willem Elbers, CLARIN