ISCApad #261

Saturday, March 14, 2020 by Chris Wellekens

3-3-2 (2020-05-13) REPROLANG (part of LREC Conference), Marseille, France

FIRST CALL FOR PAPERS

REPROLANG 2020
Shared Task on the Reproduction of Research Results in Science and Technology of Language
(part of LREC 2020 conference)
Marseille, France
May 13-15, 2020
http://wordpress.let.vupr.nl/lrec-reproduction

We are very pleased to announce REPROLANG 2020, the Shared Task on the Reproduction of
Research Results in Science and Technology of Language, organized by ELRA - European
Language Resources Association with the technical support of CLARIN - European Research
Infrastructure for Language Resources and Technology, as part of the LREC 2020 conference.

BACKGROUND

Scientific knowledge is grounded on falsifiable predictions, and thus its credibility and
raison d'être rely on the possibility of repeating experiments and obtaining results
similar to those originally obtained and reported. In many young scientific areas,
including ours, the acknowledgement and promotion of the reproduction of research results
need to be greatly increased.

For this reason, a special track on reproducibility is included in the regular program of
the LREC 2020 conference (side by side with sessions on other topics) for papers on the
reproduction of research results, and the present community-wide shared task is launched
to elicit and motivate the spread of scientific work on reproduction. This initiative
builds on the previous pioneering LREC workshops on reproducibility, 4REAL 2016 and
4REAL 2018.

SHARED TASK

The shared task is of a new type. It is partly similar to the usual competitive shared
tasks, in that all participants share a common goal, but it differs from previous shared
tasks in that its primary focus is on seeking support and confirmation of previous
results rather than on surpassing them. Instead of a competitive shared task, in which
each participant strives for an individual top system that scores as far as possible
above a rough baseline, this will be a cooperative shared task, in which participants
strive for systems that reproduce an original complex research experiment as closely as
possible, thereby reinforcing the reliability of its results if their outcomes converge.
At the same time, as with competitive shared tasks, participation in the cooperative
shared task offers excellent ground for new ideas for improvement and new advances
beyond the reproduced results.

We invite researchers to reproduce the results of a selected set of articles, which have
been offered by the respective authors with their consent to be used for this shared
task. Papers submitted for this task are expected to report on reproduction findings, to
document how the results of the original paper were reproduced, to discuss
reproducibility challenges, to report on the time, space or data requirements found for
training and testing, to reflect on lessons learned, to elaborate on recommendations for
best practices, etc.
Submissions that, in addition to the reproduction exercise, also report on results of the
replication of the selected tasks with other languages, domains, data sets, models,
methods, algorithms, downstream tasks, etc. are also encouraged. These should offer
insight into the robustness of the replicated approaches, their learning curves and
potential for incremental performance, their capacity for generalization, their
transferability across experimental circumstances and into potential real-life usage
scenarios, their suitability to support further progress, etc.

PUBLICATION

LREC conferences have one of the highest h5-index scores among world-class venues for
research on Human Language Technology.

Accepted papers for the shared task will be published in the Proceedings of the LREC 2020
main conference. LREC Proceedings are freely available from ELRA and ACL Anthology. They
are indexed in Scopus (Elsevier) and in DBLP. LREC 2010, LREC 2012 and LREC 2014
Proceedings are included in the Thomson Reuters Conference Proceedings Citation Index
(the other editions are being processed).

Substantially extended versions of papers selected by reviewers as the most appropriate
will be considered for publication in special issues of the Language Resources and
Evaluation Journal published by Springer (a SCI-indexed journal).

IMPORTANT DATES

November 25, 2019: deadline for paper submission (aligned with LREC 2020)
November 27, 2019: deadline for projects on gitlab.com to go public
February 14, 2020: notification of acceptance
May 11-16, 2020: LREC conference takes place

SELECTED TASKS

The Selection Committee has selected a broad range of papers and tasks.

Chapter A: Lexical processing

Task A.1: Cross-lingual word embeddings

Artetxe, Mikel, Gorka Labaka, and Eneko Agirre. 2018. "A robust self-learning method for
fully unsupervised cross-lingual mappings of word embeddings". In Proceedings of the 56th
Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 789-798.
http://aclweb.org/anthology/P18-1073
Major reproduction comparables: Accuracy scores (tables 1 to 4).

Task A.2: Named entity embeddings

Newman-Griffis, Denis, Albert M Lai, and Eric Fosler-Lussier. 2018. "Jointly Embedding
Entities and Text with Distant Supervision". In Proceedings of The Third Workshop on
Representation Learning for NLP, pp. 195-206.
http://aclweb.org/anthology/W18-3026
Major reproduction comparables: Spearman's ρ scores for semantic similarity predictions
(tables 3 and 4), and accuracy scores (table 6).

Chapter B: Sentence processing

Task B.1: POS tagging

Bohnet, Bernd, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily Pitler, and Joshua
Maynez. 2018. "Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive
Token Encodings". In Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (ACL 2018), pp. 2642-2652.
http://aclweb.org/anthology/P18-1246
Major reproduction comparables: f-score values (tables 2 to 8).

Task B.2: Sentence semantic relatedness

Gupta, Amulya, and Zhu Zhang. 2018. "To Attend or not to Attend: A Case Study on
Syntactic Structures for Semantic Relatedness". In Proceedings of the 56th Annual Meeting
of the Association for Computational Linguistics (ACL 2018), pp. 2116-2125.
http://aclweb.org/anthology/P18-1197
Major reproduction comparables: Pearson's r and Spearman's ρ scores for semantic
relatedness (table 1), and f-score values for paraphrase detection (table 2).

Chapter C: Text processing

Task C.1: Relation extraction and classification

Rotsztejn, Jonathan, Nora Hollenstein, and Ce Zhang. 2018. "ETH-DS3Lab at SemEval-2018
Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation
Classification and Extraction". In Proceedings of the 12th International Workshop on
Semantic Evaluation (SemEval 2018), pp. 689-696.
http://aclweb.org/anthology/S18-1112
Major reproduction comparables: precision, recall and f-score values (tables 3 and 4).

Task C.2: Privacy preserving representation

Li, Yitong, Timothy Baldwin, and Trevor Cohn. 2018. "Towards Robust and
Privacy-preserving Text Representations". In Proceedings of the 56th Annual Meeting of
the Association for Computational Linguistics (ACL 2018), pp. 25-30.
http://aclweb.org/anthology/P18-2005
Major reproduction comparables: POS accuracy scores (tables 1 and 2), and sentiment
analysis f-scores (table 3).

Task C.3: Language modelling

Howard, Jeremy, and Sebastian Ruder. 2018. "Universal Language Model Fine-tuning for Text
Classification". In Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (ACL 2018), pp. 328-339.
http://aclweb.org/anthology/P18-1031
Major reproduction comparables: Error rate (%) scores in sentiment analysis and question
classification tasks (tables 2 and 3).

Chapter D: Applications

Task D.1: Text simplification

Nisioi, Sergiu, Sanja Stajner, Simone Paolo Ponzetto, and Liviu P. Dinu. 2017.
"Exploring Neural Text Simplification Models". In Proceedings of the 55th Annual Meeting
of the Association for Computational Linguistics (ACL 2017), pp. 85-91.
http://aclweb.org/anthology/P/P17/P17-2014.pdf
Major reproduction comparables: Averaged human evaluation scores, by 3 evaluators,
in 1 to 5 and -2 to +2 scales (table 2).

Task D.2: Language proficiency scoring

Vajjala, Sowmya, and Taraka Rama. 2018. "Experiments with Universal CEFR classifications".
In Proceedings of Thirteenth Workshop on Innovative Use of NLP for Building Educational
Applications, pp. 147-153.
http://aclweb.org/anthology/W18-0515
Major reproduction comparables: f-score values (tables 2, 3 and 4).

Task D.3: Neural machine translation

Vanmassenhove, Eva, and Andy Way. 2018. "SuperNMT: Neural Machine Translation with
Semantic Supersenses and Syntactic Supertags". In Proceedings of the 56th Annual Meeting
of the Association for Computational Linguistics (ACL 2018), pp. 67-73.
http://aclweb.org/anthology/P18-3010
Major reproduction comparables: BLEU scores (tables 1 and 2; plots in figures 2, 3 and 4).

Chapter E: Language resources

Task E.1: Parallel corpus construction

Brunato, Dominique, Andrea Cimino, Felice Dell'Orletta, and Giulia Venturi. 2016.
"PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text
Simplification". In Proceedings of the 2016 Conference on Empirical Methods in Natural
Language Processing (EMNLP 2016), pp. 351-361.
https://aclweb.org/anthology/D16-1034
Major reproduction comparables: data set.

Participants are expected to obtain the data and tools for the reproduction from the
information provided in the paper. Using the description of the experiment is part of the
reproduction exercise.

SUBMISSION
The START platform of LREC 2020 will be used for the submission of the following required
elements: a paper describing the reproduction effort, and a link to the software and data
used to obtain the results reported in the paper (more details below). The submitted
materials and results will be checked by a CLARIN panel. Papers will be peer-reviewed.

PAPER PREPARATION
REPROLANG 2020 invites the submission of full papers from 4 pages to 8 pages (plus more
pages for references if needed). These submissions must strictly follow the LREC 2020
conference stylesheet which will be available on the conference website.

MATERIALS PREPARATION
For the submission to be complete and to be checked by a CLARIN panel, the software used
to obtain the results reported in the paper must be made available as a Docker container
through a project on gitlab.com. Detailed instructions are available at
https://gitlab.com/CLARIN-ERIC/reprolang/ and, for technical support, the CLARIN team can
be contacted at reprolang-tc@clarin.eu or an issue can be created under
https://gitlab.com/CLARIN-ERIC/reprolang/issues.

Submissions are made via the START conference management system used by LREC 2020 and
include the following elements:
- URL of your gitlab.com project
- URL of the tar.gz with the datasets
- the MD5 checksum of the above tar.gz
- .pdf with the paper, which must include the above URL of your gitlab.com project,
together with the commit hash and tag of the submitted version
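
As an illustration of the checksum element in the list above, the following is a minimal
sketch, in Python, of one way to compute the MD5 checksum of a dataset archive. It is not
part of the official CLARIN instructions; the function name md5_of_file and the file name
datasets.tar.gz mentioned afterwards are assumptions chosen for this example only.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks of `chunk_size` bytes so that large
    archives do not have to be loaded into memory at once.
    (Hypothetical helper, named for illustration only.)
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("datasets.tar.gz") on a local copy of the archive returns a
32-character hexadecimal string, which should match the output of a command-line tool
such as md5sum for the same file.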

The project in gitlab.com should be made public within 2 days after the submission
deadline.

PRESENTATION
Papers accepted for publication will be presented in a specific session of the LREC main
conference. There is no difference in quality between oral and poster presentations.
Only the appropriateness of the type of communication (more or less interactive) to the
content of the paper will be considered. The format of the presentations will be decided
by the Program Committee. The proceedings will include both oral and poster papers in the
same format.

REGISTRATION
For a selected paper to be included in the programme and to be published in the
proceedings, at least one of its authors must register for the LREC 2020 conference by
the early bird registration deadline. A single registration covers only one paper,
following the general LREC policy on registration. Registration is available through the
LREC 2020 website.

CONTACTS
About the shared task:
Piek Vossen
p.t.j.m.vossen@vu.nl

About the preparation and submission of materials:
reprolang-tc@clarin.eu

REPROLANG 2020 website: http://wordpress.let.vupr.nl/lrec-reproduction

ORGANIZATION

Steering Committee

António Branco, University of Lisbon (chair of Steering Committee)
Nicoletta Calzolari, ILC, Pisa (co-chair of Steering Committee)
Gertjan van Noord, University of Groningen (chair of Task Selection Committee)
Piek Vossen, VU University Amsterdam (chair of Program Committee)

Task Selection Committee

Gertjan van Noord, University of Groningen (chair)
Tim Baldwin, University of Melbourne
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Çağrı Çöltekin, University of Tuebingen
Nancy Ide, Vassar College, New York
Malvina Nissim, University of Groningen
Stephan Oepen, University of Oslo
Barbara Plank, University of Copenhagen
Piek Vossen, VU University Amsterdam
Dan Zeman, Prague University

Program Committee

(invitations still awaiting an answer are marked with [!])

Piek Vossen, VU University Amsterdam (chair)
[!] Gilles Adda, LIMSI-CNRS, Paris
[!] Eneko Agirre, Basque University
Francis Bond, Nanyang Technical University, Singapore
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Kevin Cohen, University of Colorado Boulder
[!] Thierry Declerck, DFKI Saarbruecken
[!] John McCrae, Galway University
Nancy Ide, Vassar College, New York
[!] Antske Fokkens, VU University Amsterdam
Karën Fort, University of Paris-Sorbonne
[!] Cyril Grouin, LIMSI-CNRS, Paris
Mark Liberman, University of Pennsylvania
[!] Margo Mieskis
[!] Aurélie Névéol, LIMSI-CNRS, Paris
Gertjan van Noord, University of Groningen
Stephan Oepen, University of Oslo
[!] Ted Pedersen, University of Minnesota
Senja Pollak, Jozef Stefan Institute, Ljubljana
[!] Paul Rayson, Lancaster University
Martijn Wieling, University of Groningen

Technical Committee
reprolang-tc@clarin.eu
Dieter Van Uytvanck, CLARIN (chair)
André Moreira, CLARIN
Twan Goosen, CLARIN
João Ricardo Silva, CLARIN and University of Lisbon
Luís Gomes, CLARIN and University of Lisbon
Willem Elbers, CLARIN
