ISCApad #205
Wednesday, July 08, 2015 by Chris Wellekens
7-1 | Journal of Natural Language Engineering - Special Issue on “Machine Translation Using Comparable Corpora”
CALL FOR PAPERS
Statistical machine translation based on parallel corpora has been very successful. The major search engines' translation systems, which are used by millions of people, primarily use this approach, and it has been possible to come up with new language pairs in a fraction of the time that would be required when using more traditional rule-based methods. In contrast, research on comparable corpora is still at an earlier stage. Comparable corpora can be defined as monolingual corpora covering roughly the same subject area in different languages but without being exact translations of each other.
However, despite its tremendous success, the use of parallel corpora in MT has a number of drawbacks:
1) It has been shown that translated language is somewhat different from original language; for example, Klebanov & Flor showed that 'associative texture' is lost in translation.
2) As they require translation, parallel corpora will always be a far scarcer resource than comparable corpora. This is a severe drawback for a number of reasons:
a) Among the roughly 7000 languages of the world, of which 600 have a written form, the vast majority are of the 'low resource' type.
b) The number of possible language pairs increases with the square of the number of languages. When using parallel corpora, one bitext is needed for each language pair. When using comparable corpora, one monolingual corpus per language suffices.
c) For improved translation quality, translation systems specialized in particular genres and domains are desirable. But it is far more difficult to acquire appropriate parallel training corpora than comparable ones.
d) As language evolves over time, the training corpora should be updated on a regular basis. Again, this is more difficult in the parallel case.
For such reasons it would be a big step forward if it were possible to base statistical machine translation on comparable rather than on parallel corpora: The acquisition of training data would be far easier, and the unnatural 'translation bias' (source language shining through) within the training data could be avoided.
But is there any evidence that this is possible? Motivation for using comparable corpora in MT research comes from a cognitive perspective: Experience shows that people who have learned a second language completely independently from their mother tongue can nevertheless translate between the languages. That is, human performance shows that there must be a way to bridge the gap between languages which does not rely on parallel data. Using parallel data for MT is of course a nice shortcut. But avoiding this shortcut by doing MT based on comparable corpora may well be a key to a better understanding of human translation, and to better MT quality.
Work on comparable corpora in the context of MT has been ongoing for almost 20 years. It has turned out that this is a very hard problem to solve, but as it is among the grand challenges in multilingual NLP, interest has steadily increased. Apart from the increase in publications, this can be seen from the considerable number of research projects (such as ACCURAT and TTC) which are fully or partially devoted to MT using comparable corpora. Given also the success of the workshop series on “Building and Using Comparable Corpora” (BUCC), which is now in its seventh year, and following the publication of a related book (http://www.springer.com/computer/ai/book/978-3-642-20127-1), we think that it is now time to devote a journal special issue to this field. It is meant to bundle the latest top-class research, make it available to everybody working in the field, and at the same time give an overview of the state of the art to all interested researchers.
TOPICS OF INTEREST
We solicit contributions including but not limited to the following topics:
• Comparable corpora based MT systems (CCMTs)
• Architectures for CCMTs
• CCMTs for less-resourced languages
• CCMTs for less-resourced domains
• CCMTs dealing with morphologically rich languages
• CCMTs for spoken translation
• Applications of CCMTs
• CCMT evaluation
• Open source CCMT systems
• Hybrid systems combining SMT and CCMT
• Hybrid systems combining rule-based MT and CCMT
• Enhancing phrase-based SMT using comparable corpora
• Expanding phrase tables using comparable corpora
• Comparable corpora based processing tools/kits for MT
• Methods for mining comparable corpora from the Web
• Applying Harris' distributional hypothesis to comparable corpora
• Induction of morphological, grammatical, and translation rules from comparable corpora
• Machine learning techniques using comparable corpora
• Parallel corpora vs. pairs of non-parallel monolingual corpora
• Extraction of parallel segments or paraphrases from comparable corpora
• Extraction of bilingual and multilingual translations of single words and multi-word expressions, proper names, and named entities from comparable corpora
IMPORTANT DATES
December 1, 2014: Paper submission deadline
February 1, 2015: Notification
May 1, 2015: Deadline for revised papers
July 1, 2015: Final notification
September 1, 2015: Final paper due
GUEST EDITORS
Reinhard Rapp, Universities of Aix Marseille (France) and Mainz (Germany)
Serge Sharoff, University of Leeds (UK)
Pierre Zweigenbaum, LIMSI, CNRS (France)
FURTHER INFORMATION
Please use the following e-mail address to contact the guest editors: jnle.bucc (at) limsi (dot) fr
Further details on paper submission will be made available in due course at the BUCC website: http://comparable.limsi.fr/bucc2014/bucc-introduction.html
7-2 | IEEE/ACM Trans. ASLP: special issue on continuous space and related methods in natural language processing
7-3 | Special issue Speech Communication on Advances in Sparse Modeling and Low-rank Modeling for Speech Processing
Manuscript due: Dec. 1, 2014
Journal: Speech Communication
Description: Sparse and low-rank modeling aim to incorporate the low-dimensional structures pertaining to the geometry of the underlying problems to achieve the optimal solution. These concepts have been proven to be very effective for a wide range of applications at the intersection of multiple fields, including machine learning, signal processing and statistics. In the context of audio and speech processing, and more particularly multiparty communications in reverberant and overlapping conditions, the integration of sparse and low-rank modeling concepts has led to several interesting new directions and promising results in speech communication problems, ranging from denoising to deconvolution and from separation to recognition. Several other exciting developments include sparse linear prediction, missing data recovery, audio content analysis and inpainting.
Addressing such real applications is particularly challenging due to the complex acoustic and speech characteristics, and the need to develop new modeling strategies that meet the foundational theoretical hypotheses. In addition, speech recognition performance seems to degrade in these complex acoustic conditions, and thus research in this direction is critical from both a theoretical and an industry perspective. The goal of the proposed special issue is to consolidate the research in these diverse fields in a coherent framework and to survey the recent advances and trends where sparse and low-rank modeling and applications are converging to new fundamental and practical paradigms that could also lead to the emergence of new speech technologies.
Topics of interest include:
* Manifold learning in speech processing: single and multi-microphone speech enhancement
* Sparse modeling and low-rank modeling for separation and denoising
* Sparse regression and classification
* Sparse dimensionality reduction for feature extraction
* Structured sparsity models underlying audio and speech representation
* Auditory-inspired sparse modeling
* Sparse modeling and low-rank modeling for source localization
* Sparse representation and low-rank representation for reverberant acoustic modeling
* Sparse data processing and modeling in low-resourced languages
* Applications in speech recognition, privacy-preserving speech processing, speaker recognition and authentication, speaker diarization, microphone array calibration, audio information retrieval, speech synthesis and coding
Lead Guest Editors:
* Prof. Hervé Bourlard, herve.bourlard@idiap.ch
* Dr. Afsaneh Asaei, afsaneh.asaei@idiap.ch
* Dr. Tara N. Sainath, tsainath@us.ibm.com
* Prof. Sharon Gannot, sharon.gannot@biu.ac.il
For more information about this special issue, please visit: http://si.eurasip.org/issues/36/advances-in-sparse-modeling-and-low-rank-modeling/
7-4 | [Special Issue] Deep Learning for Speech and Language Processing Applications, EURASIP
Manuscript due: Dec. 15, 2014
Journal: EURASIP Journal on Audio, Speech, and Music Processing
Description: Deep learning techniques have enjoyed enormous success in the speech and language processing community over the past few years, beating previous state-of-the-art approaches to acoustic modeling, language modeling, and natural language processing. A common theme across different tasks is that the depth of the network allows useful representations to be learned. For example, in acoustic modeling, the ability of deep architectures to disentangle multiple factors of variation in the input, such as various speaker-dependent effects on speech acoustics, has led to excellent improvements in speech recognition performance on a wide variety of tasks. In addition, in natural language processing and language modeling tasks, integrating learned vector space models of words, which perform smoothing and clustering based on semantic and syntactic information contained in word contexts, with recurrent or recursive architectures has led to significant advances.
We as a community should continue to understand what makes deep learning successful for speech and language, and how further improvements can be achieved. For example, just as deep networks made us re-think the input feature representation pipeline used for speech recognition, we should continue to push deep learning into other areas of the speech recognition pipeline. In addition, new architectures, such as convolutional neural networks and recurrent networks using long short-term memory cells, have improved performance, and we believe alternative architectures can improve performance further. Furthermore, optimization of large neural network models remains a huge challenge, both because of the computational cost and the amount of data, which could possibly be unsupervised.
Topics of interest include:
* New deep-learning architectures and algorithms
* Optimization strategies for deep learning
* Improved adaptation methods for deep learning
* Unsupervised and semi-supervised training for deep learning
* Novel applications of deep learning for speech and language tasks
* Theoretical and empirical understanding of deep learning for speech and language
* Deep-learning toolkits and/or platforms for big data
Lead Guest Editor:
* Tara Sainath, Google Inc., USA
Guest Editors:
* Michiel Bacchiani, Google Inc., USA
* Hui Jiang, York University, Canada
* Brian Kingsbury, Thomas J. Watson Research Center, USA
* Hermann Ney, RWTH Aachen, Germany
* Frank Seide, Microsoft Research Asia, China
* Andrew Senior, Google Inc., USA
For more information about this special issue, please visit: http://si.eurasip.org/issues/38/deep-learning-for-speech-and-language-processing/
7-5 | SPECIAL ISSUE ON DIALOGUE STATE TRACKING (Extended deadline)
CALL FOR PAPERS
DIALOGUE & DISCOURSE
SPECIAL ISSUE ON DIALOGUE STATE TRACKING
GUEST EDITORS
Jason D. Williams, Microsoft Research
Antoine Raux, Lenovo
Matthew Henderson, Cambridge University
IMPORTANT DATES
Submission deadline: 15 April 2015
Notification: 26 June 2015
Final version of accepted papers due: 28 August 2015
Anticipated publication: 16 October 2015
INTRODUCTION
Conversational systems are increasingly becoming a part of daily life, with examples including Apple's Siri, Google Now, Nuance Dragon Go, Xbox and Cortana from Microsoft, and numerous new entrants. Many conversational systems include a dialogue state tracking function, which estimates relevant aspects of the interaction such as the user's goal, level of frustration, trust towards the system, etc., given all of the dialogue history so far. For example, in a tourist information system, the dialogue state might indicate the type of business the user is searching for (pub, restaurant, coffee shop), their desired price range and the type of food served.
Dialogue state tracking is difficult because automatic speech recognition (ASR) and spoken language understanding (SLU) errors are common, and can cause the system to misunderstand the user. At the same time, state tracking is crucial because the system relies on the estimated dialogue state to choose actions -- for example, which restaurants to suggest. Most commercial systems use hand-crafted heuristics for state tracking, selecting the SLU result with the highest confidence score, and discarding alternatives. In contrast, statistical approaches consider many hypotheses for the dialogue state. By exploiting correlations between turns and information from external data sources -- such as maps, knowledge bases, or models of past dialogues -- statistical approaches can overcome some SLU errors.
Although dialogue state tracking has been an active area of study for more than a decade, there has been a flurry of new work in the past two years. This has been driven in part by the availability of common corpora and evaluation measures provided by a series of three research community challenge tasks called the Dialogue State Tracking Challenge. With these resources, researchers are able to study dialogue state tracking without investing the time and effort required to build and operate a spoken dialogue system. Shared resources also allow direct comparison of methods across research groups.
Results from the Dialogue State Tracking Challenge have been presented at special sessions in SIGDIAL 2013, SIGDIAL 2014, and IEEE SLT 2014.
TOPICS OF INTEREST
The aim of this special issue is to provide a forum for in-depth, journal-level work on dialogue state tracking. This issue welcomes papers covering any topic relevant to dialogue state tracking. Specific examples include (but are not limited to):
- Algorithms for dialogue state tracking, including those based on machine learning or novel heuristics
- Adaptation and learning in dialogue state tracking, for example across domains, users, usage environments, etc.
- Analyses of dialogue state tracking methods, or analyses of characteristics of dialogue that affect dialogue state tracking
- Investigations of metrics used for dialogue state tracking, including the impact of dialogue state tracking on end-to-end dialogue systems
- Descriptions and analyses of resources for dialogue state tracking, including corpora
- Applications of dialogue state tracking to new domains or new settings, such as multi-modal systems
Submissions should report on new work, or substantially expand on previously published work with additional experiments, analysis, or important detail. Previously published aspects may be included but should be clearly indicated.
RELEVANT RESOURCES
All data from the dialogue state tracking challenge series continues to be available for use, including the dialogue data itself, scripts for evaluation and baseline trackers, raw output from trackers entered in the challenges, and performance summaries. If your work is on dialogue state tracking for information-seeking dialogues and/or you think the data is appropriate, you are strongly encouraged to report results on these data, to enable comparison.
The dialogue state tracking challenge data is available here:
- Dialogue State Tracking Challenge 1: http://research.microsoft.com/en-us/events/dstc/
- Dialogue State Tracking Challenge 2&3: http://camdial.org/~mh521/dstc/
SUBMISSIONS
Papers should be submitted on the Dialogue & Discourse journal website, following instructions and formatting guidelines given there: http://www.dialogue-and-discourse.org/submission.shtml
Submitted papers will be reviewed according to the Dialogue & Discourse reviewing criteria and appropriateness to the topic of the special issue.
CONTACT
Contact Jason Williams (jason.williams@microsoft.com) for further information about this call for papers.
7-6 | Special issue of Eurasip Journal in Adv. Signal Proc. 'Silencing the Echoes' - Processing of Reverberant Speech
Manuscript due: Feb. 1, 2015
Description: The reverberation contained in speech signals captured by distant microphones reduces both the perceptual speech quality and the performance of automatic speech recognition (ASR) systems, hampering both human-human communication and human-machine interaction. Thus, for applications that depend on distant sound capture, processing of the speech signal to mitigate the reverberation problem is essential. Such applications include voice control of consumer products, e.g. smart phones, wearable devices, interactive TVs, and appliances in smart homes; hearing aids; voice communication with robots and avatars; automatic meeting transcription; speech recognition in call centers; automatic annotation of videos; speech-to-speech translation.
Topics of interest include:
Guest Editors
7-7 | Speech Communication Special issue on Advances in Sparse Modeling and Low-rank Modeling for Speech Processing Manuscript due: Jan. 10, 2015
7-8 | Special issue of Speech Communication on 'Phase-Aware Signal Processing in
7-9 | Special issue TAL: COMPUTATIONAL LINGUISTICS AND COGNITIVE SCIENCES
7-10 | Advances in Applied Acoustics
7-11 | JAIR Special Track on Cross-language Algorithms and Applications (UPDATE)
Track Editor
Lluís Màrquez, Qatar Computing Research Institute
Associate Track Editors
Marta R. Costa-jussà, Instituto Politécnico Nacional
Srinivas Bangalore, AT&T Labs-Research
Patrik Lambert, Universitat Pompeu Fabra
Elena Montiel-Ponsoda, Universidad Politécnica de Madrid
The Journal of Artificial Intelligence Research (JAIR) is pleased to announce the launch of the Special Track on Cross-language Algorithms and Applications. The core Artificial Intelligence technologies of speech and natural language processing need to address the challenges of processing multiple languages. While the first challenge of multilingualism is to bridge the nomenclature gap for the same concepts, the next significant challenge is to develop algorithms and applications that not only scale to multiple languages but also leverage cross-lingual similarities for improved natural language processing. The goal of this special track is to serve as a home for the publication of leading research on Cross-language Algorithms and Applications, focusing on developing unified themes leading to the development of the science of multi- and cross-lingualism. Topics of interest include, but are not limited to: efforts in the direction of multilingual transliteration; multilingual document summarization; rapid prototyping of cross-language tools for low-resource languages; and machine translation.
Articles published in the Cross-language Algorithms and Applications track must meet the highest quality standards as measured by originality and significance of the contribution and clarity of presentation. Papers will be coordinated by the track editor and associate editors, and reviewed by peer reviewers drawn from the JAIR Editorial Board and the larger community. All articles should be submitted using the normal JAIR submission process. Please indicate that the submission is intended for the Special Track in the section 'Special Information for editors'.
For more information and submission instructions, please see: http://www.jair.org/specialtrack-claa.html
Timetable
24th March 2015: Deadline for Submissions *EXTENDED*
24th June 2015: Notification of Acceptance/Revision/Rejection
5th August 2015: Deadline for Re-submission of papers requiring revision
5th October 2015: Notification of Final Acceptance
24th November 2015: Final manuscript due
Contact: martaruizcostajussa@gmail.com
Submission Instructions: Use the conventional JAIR submission instructions available at http://www.jair.org/submission_info.html
7-12 | CfP Travaux interdisciplinaires sur la parole et le langage TIPA 2015
Call for papers, TIPA 31 - 2015
The impact of language contact: from structural interferences to typological convergences
Guest editor: Cyril Aslanov
The 31st issue of TIPA will be dedicated to the study of the impact of language contact on the hard core of grammatical systems. In order to counterbalance the strictly internalist approaches to diachronic evolution, we will adopt the theoretical perspective provided by the studies on contact-induced linguistic changes. The contributors are requested to cast a new light on the results of language contact, either as an occasional interference at the level of social or individual speech or as a structural convergence deeply rooted in the grammatical structure. The contact-induced linguistic changes may be considered in the dynamic perspective of the diachrony of language contact or through the study of a given state of language examined synchronically as the present result of a previous contact.
Besides the impact of language contact on the inner system of languages, it is important to also include a sociolinguistic dimension in order to grasp the continuity or the recursivity that unites the empirical modalities of language contact (code-switching; code-mixing; hybridization) with considerations more centered on the study of the linguistic systems themselves, especially as far as fusion languages like Yiddish, Romani or Swahili are concerned. Indeed, the very existence of such languages is due to language contact and multilingualism.
Lastly, the scientific debate on the impact of language contact on the systems should also take into account the individual dimension. Psycholinguists interested in interference, convergence and mimetism, specialists of individual bilingualism and didacticians dealing with Interlingua are invited to enrich this issue on the results of language contact.
The language of publication will be either English or French. Each article should contain a detailed two-page abstract in the other language, in order to make papers in French more accessible to English-speaking readers, and vice versa, thus ensuring a larger audience for all the articles.
Important dates
7-13 | EURASIP Journal on Advances in Signal Processing Manuscript due: June 30, 2015
7-14 | CfP Tipa. Travaux interdisciplinaires sur la parole et le langage
******* TIPA 31 - Deadline extended: July 25, 2015 *******