ISCApad #189 |
Saturday, March 15, 2014 by Chris Wellekens |
3-1-1 | (2014) INTERSPEECH 2014 Newsletter February 2014 At the southern tip of the Malayan Peninsula...
…INTERSPEECH 2014 will be held in Singapore, between Malaysia and Indonesia. In the constitution, Malays are acknowledged as the “indigenous people of Singapore”. Indeed, Malays are the predominant ethnic group inhabiting the Malay Peninsula, Eastern Sumatra, Brunei, coastal Borneo, part of Thailand, the Southern Burmese coast and Singapore. This explains why Malay is one of the four official languages of Singapore, and its only national language.
A Malay history of Singapore
It is said that the city of Singapore was founded in 1299 CE by a Prince from Palembang (South Sumatra, Indonesia), a descendant of Alexander the Great. According to the legend, the Prince named the city Singapura (“Lion City”) after sighting a beast on the island. While it is highly doubtful that any lion ever lived in Singapore outside the zoo, another story tells that the last surviving tiger in Singapore was shot at the bar of the Raffles Hotel in 1902. Despite this auspicious foundation, the population of Pulau Ujong (the “island at the end”) did not exceed a thousand inhabitants when, in 1819, Sir Thomas Raffles decided to establish a new port to reinforce British trade between China and India. At that time, the population consisted of different Malay groups (Orang Kallang, Orang Seletar, Orang Gelam, Orang Lauts) and a few Chinese. Nowadays, Malays account for 13.3% of Singapore's population, with origins as diverse as Johor and the Riau islands (for the Malays Proper), Java (Javanese), Bawean island (Baweanese), the Celebes islands (Bugis) or Sumatra (Batak and Minangkabaus).
Malay language
With almost 220 million speakers, the Malay language in its various forms unites the fifth largest language community in the world [1]. The origins of the Malay language can be traced to the very first Austronesian languages, back to 2000 BCE [2]. Through the centuries, the major Indian religions brought a number of Sanskrit and Persian words to the Malay vocabulary, while the Islamization of South East Asia added Arabic influences [3]. Later on, the languages of the colonial powers (mainly Dutch and British) and of migrants (Chinese and Tamil) contributed to the diversity of Malay influences [4, 5]. In return, Malay words have been loaned to other languages, e.g. in English: rice paddy, orangutan, babirussa, cockatoo, compound, durian, rambutan, etc. During the golden age of the Malay empires, Malay gained its foothold in the territories of modern Malaysia and Indonesia, where it became a vector for trade and business. Today, Malay is an official language in Malaysia, Indonesia, Brunei and Singapore and is spoken in southern Thailand, the Philippines, and the Cocos and Christmas Islands in Australia [6]. Malay counts a total of 35 phonemes: 6 vowels, 3 diphthongs and 27 consonants [1,5]. As an agglutinative language, its vocabulary can be enriched by adding affixes to the root words [7]. Affixation in Malay consists of prefixation, infixation, suffixation or circumfixation¹. Malay languages also have two proclitics, four enclitics and three particles that may be attached to an affixed word [8]. In Singapore, Malaysia, Brunei and Indonesia, Malay is officially written using the Latin alphabet (Rumi), but an Arabic alphabet called Jawi is co-official in Brunei and Malaysia.
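The four kinds of affixation mentioned above can be illustrated with a minimal Python sketch. It treats affixation as plain string concatenation and deliberately ignores the sound changes that real Malay morphology applies with some prefixes. The function names are hypothetical, and the example words are common textbook forms chosen for illustration rather than taken from the cited references.

```python
def prefix(root: str, pre: str) -> str:
    """Prefixation: attach an affix to the left of the root."""
    return pre + root


def suffix(root: str, suf: str) -> str:
    """Suffixation: attach an affix to the right of the root."""
    return root + suf


def circumfix(root: str, pre: str, suf: str) -> str:
    """Circumfixation: attach both halves of one affix around the root."""
    return pre + root + suf


def infix(root: str, inf: str) -> str:
    """Infixation: insert an affix after the first letter of the root.

    Simplifying assumption: the root starts with a single consonant letter.
    """
    return root[0] + inf + root[1:]


if __name__ == "__main__":
    print(prefix("jalan", "ber"))          # berjalan (to walk)
    print(suffix("makan", "an"))           # makanan (food)
    print(circumfix("adil", "ke", "an"))   # keadilan (justice)
    print(infix("gigi", "er"))             # gerigi (serration)
```

A real morphological analyser would also need rules for prefixes whose form depends on the first sound of the root, which this sketch leaves out.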
Bahasa Melayu in Singapore
Bahasa Melayu (or Malay Language) is one of the four official languages of Singapore and its ceremonial national language; it is used in the national anthem and for military commands. However, several creoles are still spoken across the island. Amongst them, Bahasa Melayu Pasar, or Bazaar Malay, is a creole of Malay and Chinese which used to be the lingua franca and the language of trade between communities [4, 5]. Baba Malay, another variety of Malay Creole influenced by Hokkien and Bazaar Malay, is still spoken by around 10,000 people in Singapore. Today, Bahasa Melayu is the lingua franca among Malay, Javanese, Boyanese, other Indonesian groups and some Arabs in Singapore. It is used as a means of transmitting familial and religious values within the Malay community as well as in “Madrasahs”, mosques and religious schools. However, with 35% of Malay pupils predominantly speaking English at home and a majority of Singaporeans being bilingual in English, Malay is facing competition from English, which is taught as a first language [5].
Selamat datang ke Singapura / Welcome to Singapore
Opportunities to discover Malay culture in Singapore are everywhere. Depending on time and location, you might want to taste Malay cuisine or one of the succulent Malay cookies while walking around the streets of Kampong Glam or visiting the Malay Heritage Centre.
[1] Tan, Tien-Ping, et al. 'MASS: A Malay language LVCSR corpus resource.' Speech Database and Assessments, 2009 Oriental COCOSDA International Conference on. IEEE, 2009.
[2] http://en.wikipedia.org/wiki/History_of_the_Malay_language#Modern_Malay_.2820th_century_CE.29
[3] http://en.wikipedia.org/wiki/Malay_language
[4] http://en.wikipedia.org/wiki/Comparison_of_Malaysian_and_Indonesian
[5] http://en.wikipedia.org/wiki/List_of_loanwords_in_Malay
[6] http://www.kwintessential.co.uk/resources/global-etiquette/malaysia.html
[7] B. Ranaivo-Malançon, 'Computational Analysis of Affixed Words in Malay Language,' presented at International Symposium on Malay/Indonesian Linguistics, Penang, 2004.
[8] http://www-01.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsACliticGrammar.htm
¹ Circumfixation refers here to the simultaneous addition of morphological units, expressing a single meaning or category, to the left and right sides of the root word.
3-1-2 | (2014) INTERSPEECH 2014 Newsletter January 2014
English in Singapore
If you attend INTERSPEECH 2014 in Singapore, you will probably be glad to know that English is spoken in every corner of the island. Indeed, it is one of the four official languages and the second most spoken language in Singaporean homes. According to the last census, in 2010, 89% of the population was literate in English, making Singapore a very convenient place for tourism, shopping, research or holding a conference.
Historical context of English in Singapore
The history of Singapore started when the first settlements were established in the 13th century AD [2]. Over the years, Singapore was part of different kingdoms and sultanates until the 19th century, when modern Singapore was founded under the impetus of the British Empire. In 1819, Sir Thomas Stamford Raffles landed in Singapore and established a treaty with the local rulers to develop a new trading station. From that date, the importance of Singapore grew continuously under the influence of Sir Raffles who, despite not being very present on the island, was the real builder of modern Singapore. Singapore remained under British administration until the Second World War and became a Crown Colony after the end of the conflict. A brief period followed during which Singapore was part of the Federation of Malaysia, before it became independent in 1965 and part of the Commonwealth of Nations.
From this history, Singapore has kept English as one of its four official languages, as well as many landmarks that deserve a visit alongside INTERSPEECH. Amongst them, the Singapore Botanic Gardens, founded in 1859, are internationally renowned [3]. This urban garden of 74 hectares was laid out by Sir Raffles to cultivate and preserve local plants in the tradition of the tropical colonial gardens. With a rain forest, several lakes, an orchid garden and a performance stage, the Singapore Botanic Gardens are a very popular place to enjoy free concerts on weekend afternoons.
Another green spot in the “City in a Garden”, the Padang (field in Malay), was created by Sir Raffles, once again, who planned to reserve the space for public purposes. The place is now famous for the two cricket clubs founded in 1870 and 1883 at either end of the field and for the games that can be watched on weekends.
Amongst the numerous landmarks inherited from British colonization, the most famous include St Andrew's Anglican cathedral, the Victoria Theatre, the Fullerton building, Singapore's City Hall, Old Parliament House, the Central Fire Station and many black and white bungalows built from the 19th century onwards for rich expatriate families. Some of those bungalows, now converted into restaurants, offer a peaceful atmosphere in which to enjoy a local dinner.
The role of English in Singapore
English has a special place in Singapore as it is the only official language which is not a “mother-tongue”. Indeed, Alsagoff [6] framed English as “cultureless” in that it is “disassociated from Western culture” in the Singaporean context. This cultural voiding makes English an ethnically neutral language used as a lingua franca between ethnic groups [5], after replacing the local Malay in this role [4]. Interestingly, English is the only compulsory language of education, and its status in school is that of First Language, as opposed to the Second Language status assigned to the other official languages. By promoting the use of English as the working language, the government aims neither to advantage nor to disadvantage any ethnic group.
Nevertheless, the theoretical equality stated in the constitution between the four official languages is not always present in practice. For instance, English dominates parliamentary business and some governmental websites are only available in English. Additionally, all legislation is in English only [4].
Singapore English
Standard Singapore English is very close to British English, although very cosmopolitan, with 42% of the population born outside the country. Nevertheless, a new standard of pronunciation has been emerging recently [1]. Interestingly, this pronunciation is independent of any external standard, and some aspects of it cannot be predicted by reference to British English or any other external variety of English.
The other form of English that you will hear in Singapore is known as Singlish. It is a colorful creole including words from the many languages spoken in Singapore, such as various Chinese dialects (Hokkien, Teochew, and Cantonese), Malay or Tamil. Much could be said about Singlish, and another newsletter will be especially dedicated to this local variant. Don't miss it!
[1] Deterding, David (2003). 'Emergent patterns in the vowels of Singapore English' National Institute of Education, Singapore. Retrieved 7 June 2013.
[2] http://www.yoursingapore.com/content/traveller/en/browse/aboutsingapore/a-brief-history.html
[3] http://whc.unesco.org/en/tentativelists/5786/ , (on line January 7th, 2014)
[4] Leimgruber, J. R. (2013). The management of multilingualism in a city-state: Language policy in Singapore. In I. G. Peter Siemund, Multilingualism and Language Contact in Urban Areas: Acquisition development, teaching, communication (pp. 229-258). Amsterdam: John Benjamins.
[5] Harada, Shinichi. 'The Roles of Singapore Standard English and Singlish.' 情報研究 40 (2009): 69-81.
[6] Alsagoff, L. (2007). Singlish: Negotiating culture, capital and identity. In Language, Capital, Culture: Critical studies of language and education in Singapore (pp. 25-46). Rotterdam: Sense Publishers.
3-1-3 | (2014) INTERSPEECH 2014 Newsletter March 2014 Tamil and Indian Languages in Singapore
Our fifth step to INTERSPEECH 2014 brings us to the fourth official language of Singapore: Tamil. Today, Indians constitute 9% of the population of Singaporean citizens and permanent residents. They are considered the third ethnic group in Singapore, although the origins of Singaporean Indians are diverse. Usually locally born, they are second, third, fourth or even fifth generation descendants of Punjabi-, Hindi-, Sindhi- and Gujarati-speaking migrants from Northern India and Malayalam-, Telugu- and Tamil-speaking migrants from Southern India. This latter group is the core of the Singaporean Indian population, with 58% of the Indian community [2, 5].
Indianised Kingdoms, such as Srivijaya and Majapahit, radiated over South-East Asia. Influenced by Hindu and Buddhist culture, a large area including Cambodia, Thailand, Malaysia, part of Indonesia and Singapore formed Greater India. From this period, Singapore has kept some of its most important pre-colonial artifacts, such as the Singapore Stone, and it is also reported that the hill of Fort Canning was chosen for the first settlement as a reference to the Hindu concept of Mount Meru, which was associated with kingship in Indian culture [1].
Under British colonial rule, Indian migrants arrived in Singapore from different parts of India to work as clerks, soldiers, traders or English teachers. By 1824, 7% of the population was Indian (756 residents). The Indian share of Singapore's population increased until 1860, when it overtook the Malay community and became the second largest ethnic group, at 16%. Due to the nature of this migration, Indians in Singapore were predominantly adult men. A settled community, with a more balanced gender and age ratio, only emerged by the mid-20th century [2].
Although the Indian community kept growing over the following century, its share of the Singaporean population decreased until the 1980s, especially when the British withdrew their troops after Singapore's independence in 1965. After 1980, the immigration policy aimed at attracting educated people from other Asian countries to settle in Singapore. This change made the Indian population grow from 6.4% to 9%. In addition to this residential population, many ethnic Indian migrant workers come to work temporarily in Singapore (Bangladeshis, Sri Lankans, Malaysian Indians or Indian Indians) [3].
The Tamil language is one of the longest surviving classical languages in the world [8]. Existing for over 2,000 years, Tamil has a rich literary history and was the first Indian language to be declared a classical language by the Government of India, in 2004. The earliest records of written Tamil date from around the 2nd century BC and, despite a significant amount of grammatical and syntactical change, the language demonstrates grammatical continuity across two millennia. Tamil is the most widely spoken language of the Dravidian language family, with important groups of speakers in Malaysia, the Philippines, Mauritius, South Africa, Indonesia, Thailand, Burma, Reunion and Vietnam. Significant communities can also be found in Canada, England, Fiji, Germany, the Netherlands and the United States. It is an official language in the Indian states of Tamil Nadu, Puducherry, Andaman and Nicobar Islands as well as in Sri Lanka and Singapore.
Like Malay, another local language, Tamil is agglutinative. Affixes are added to words to mark noun class, number and case, or verb tense, person, number, mood and voice [7]. Like Finnish, not a local language, Tamil sets no limit to the length and extent of agglutination. This leads to long words with a large number of affixes, whose translation might require several sentences in other languages.
The phonology of Tamil is characterized by the use of retroflex consonants and multiple rhotics. Native grammarians classify phonemes into vowels, consonants and a secondary character called āytam. The āytam is an allophone of /r/ or /s/ at the end of an utterance. Vowels are called [...] Tamil writing currently includes twelve vowels, eighteen consonants and one special character for the āytam, which combine to form a total of 247 characters.
Among all the Indian residents in Singapore, 38.8% speak Tamil daily, 39% speak English, 11% speak Malay, and the remaining 11% speak other Indian languages [2, 4]. Tamil is one of the two Indian languages taught as a second language (mother tongue) in public schools, together with Hindi. It is also used in daily newspapers, free-to-air and cable television, radio channels, cinema and theaters [5]. In the multi-cultural environment of Singapore, Tamil influences the other local languages and vice versa. There is especially strong interaction with Malay and with the colloquial Singaporean English known as Singlish. Singaporean usage of Tamil includes some words from English and Malay, while certain words or phrases that are considered archaic in India remain in use in Singapore [2].
During your stay in Singapore, you can easily get to know Tamil culture through its many aspects. A walk in Little India, whose architecture has been protected since 1989, is a great opportunity to be exposed to Tamil music and lifestyle. The two-storey shop-houses of Singapore's Indian hub host some of the best ambassadors of Indian cuisine. Here you'll find the local version of Tamil cuisine, which has evolved in response to local taste and to the influence of other cuisines present in Singapore. Other cuisines, such as Singapore-Malay cuisine or Peranakan cuisine, also include elements of Indian cuisine. Singaporean Tamil must-tries include dishes such as achar, curry fish head, rojak, Indian mee goreng, murtabak, roti john, roti prata and teh tarik.
Note that other Indian cuisines from Northern India can also be found.
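Returning briefly to the Tamil script described above, the figure of 247 characters can be checked with a short calculation. The sketch below assumes the usual breakdown, which the text does not spell out: each of the eighteen consonants combines with each of the twelve vowels to give a compound character, and these are counted alongside the standalone vowels, the standalone consonants and the āytam. The function name is hypothetical.

```python
def tamil_character_count(vowels: int = 12, consonants: int = 18, aytam: int = 1) -> int:
    """Count Tamil script characters under the assumed breakdown.

    Assumption: every consonant combines with every vowel to form one
    compound character; standalone vowels, standalone consonants and
    the aytam are counted separately.
    """
    compound = vowels * consonants  # 12 * 18 = 216 consonant-vowel characters
    return vowels + consonants + compound + aytam


if __name__ == "__main__":
    print(tamil_character_count())  # 247
```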
[1] http://en.wikipedia.org/wiki/History_of_Indians_in_Singapore
[2] http://en.wikipedia.org/wiki/Indians_in_Singapore
[3] Leow, Bee Geok (2001). Census of Population 2000: Demographic Characteristics. p.47-49.
[4] Singapore Census 2010
[5] http://en.wikipedia.org/wiki/Indian_languages_in_Singapore
[6] http://en.wikipedia.org/wiki/Dravidian_language
[7] http://en.wikipedia.org/wiki/Tamil_language
[8] http://en.wikipedia.org/wiki/Classical_language
3-1-4 | (2014-09-14) CfP INTERSPEECH 2014 Singapore URGENT Action Required Interspeech 2014 Singapore September 14-18, 2014
The INTERSPEECH 2014 paper submission deadline is 24 March 2014. There will be no extension of the deadline. Get your paper submissions ready and gear up for INTERSPEECH in Singapore.
INTERSPEECH is the world's largest and most comprehensive conference on issues surrounding the science and technology of spoken language processing, both in humans and in machines. The theme of INTERSPEECH 2014 is 'Celebrating the Diversity of Spoken Languages'. INTERSPEECH 2014 emphasizes an interdisciplinary approach covering all aspects of speech science and technology spanning basic theories to applications. In addition to regular oral and poster sessions, the conference will also feature plenary talks by internationally renowned experts, tutorials, special sessions, show & tell sessions, and exhibits. A number of satellite events will take place immediately before and after the conference. Please follow the details of these and other news at the INTERSPEECH website www.interspeech2014.org.
We invite you to submit original papers in any related area, including but not limited to:
1: Speech Perception and Production
2: Prosody, Phonetics, Phonology, and Para-/Non-Linguistic Information
3: Analysis of Speech and Audio Signals
4: Speech Coding and Enhancement
5: Speaker and Language Identification
6: Speech Synthesis and Spoken Language Generation
7: Speech Recognition - Signal Processing, Acoustic Modeling, Robustness, and Adaptation
8: Speech Recognition - Architecture, Search & Linguistic Components
9: LVCSR and Its Applications, Technologies and Systems for New Applications
10: Spoken Language Processing - Dialogue, Summarization, Understanding
11: Spoken Language Processing - Translation, Info Retrieval
12: Spoken Language Evaluation, Standardization and Resources
A detailed description of these areas is accessible at:
http://www.interspeech2014.org/public.php?page=conference_areas.html
Paper Submission
Papers for the INTERSPEECH 2014 proceedings should be up to 4 pages of text, plus one page (maximum) for references only. Paper submissions must conform to the format defined in the paper preparation guidelines and provided in the Authors’ kit, on the INTERSPEECH 2014 website, along with the Call for Papers. Optionally, authors may submit additional files, such as multimedia files, which will be included in the official conference proceedings USB drive. Authors must declare that their contributions are original and are not being submitted for publication elsewhere (e.g. another conference, workshop, or journal). Papers must be submitted via the online paper submission system, which will be opened in February 2014. The conference will be conducted in English.
Information on the paper submission procedure is available at: http://www.interspeech2014.org/public.php?page=submission_procedure.html
There will be NO extension to the full paper submission deadline.
We look forward to welcoming you to INTERSPEECH 2014 in Singapore!
Helen Meng and Bin Ma Technical Program Chairs
Contact
Email: tpc@interspeech2014.org
organizers.interspeech2014@isca-speech.org (for general enquiries)
Conference website: www.interspeech2014.org
3-1-5 | (2014-09-14) CfP Speech Technology for the Interspeech App
Call for Proposals: Speech Technology for the Interspeech App
During the past Interspeech conference in Lyon, a mobile application (app) was provided for accessing the conference program, designing personal schedules, inspecting abstracts, full papers and the list of authors, navigating through the conference center, or recommending papers to colleagues. This app was designed by students and researchers of the Quality and Usability Lab, TU Berlin, and will be made available to ISCA and to future conference and workshop organizers free of charge. It will also be used for the upcoming Interspeech 2014 in Singapore, and is available for both iOS and Android.
In its current state, the app is limited to mostly touch-based input and graphical output. However, we would like to develop the app into a useful tool for the spoken language community at large, which should include speech input and output capabilities, and potentially full spoken-language and multimodal interaction. The app could also be used for collecting speech data under realistic environmental conditions, for distributing multimedia examples or surveys during the conference, or for other research purposes. In addition, the data being collected with the app (mostly interaction usage patterns) could be analyzed further.
The Quality and Usability Lab of TU Berlin would like to invite interested parties to contribute to this development. Contributions could be made by providing ready-built modules (e.g. ASR, TTS, or the like) for integration into the app, by proposing new functionalities which would be of interest to a significant part of the community, and preferably by offering workforce for such future developments. If you are interested in contributing, please send an email with your proposals to interspeechapp@qu.tu-berlin.de by October 31, 2013.
If a sufficient number of interested parties can be found, we plan to submit a proposal for a special session around speech technology in mobile applications for the upcoming Interspeech in Singapore. More information on the current version of the app can be found in: Schleicher, R., Westermann, T., Li, J., Lawitschka, M., Mateev, B., Reichmuth, R., Möller, S. (2013). Design of a Mobile App for Interspeech Conferences: Towards an Open Tool for the Spoken Language Community, in: Proc. 14th Ann. Conf. of the Int. Speech Comm. Assoc. (Interspeech 2013), Aug. 25-29, Lyon.
3-1-6 | (2014-09-14) INTERSPEECH 2014 Singapore
It is a great pleasure to announce that the 15th edition of the Annual Conference of the International Speech Communication Association (INTERSPEECH) will be held in Singapore during September 14-18, 2014. INTERSPEECH 2014 will bring together the community to celebrate the diversity of spoken languages in the vibrant city state of Singapore. INTERSPEECH 2014 is proudly organized by the Chinese and Oriental Languages Information Processing Society (COLIPS), the Institute for Infocomm Research (I2R), and the International Speech Communication Association (ISCA).
Ten steps to Singapore
You want to know more about Singapore?
During the next ten months, the organization committee will introduce you to Singaporean culture through a series of brief newsletters featuring topics related to spoken languages in Singapore. Please stay tuned!
Workshops
Submission deadline: December 1, 2013
Satellite workshops related to speech and language research will be hosted in Singapore as well as in Phuket Island, Thailand (1 hr 20 min flight from Singapore) and in Penang, Malaysia (1 hr flight from Singapore).
Proposals must be submitted by email to workshops@interspeech2014.org before December 1, 2013. Notification of acceptance and ISCA approval/sponsorship will be announced by January 31, 2014.
Sponsorship and Exhibition
The objective of INTERSPEECH 2014 is to foster scientific exchanges in all aspects of Speech Communication sciences with a special focus on the diversity of spoken languages. We are pleased to invite you to take part in this major event as a sponsor. For more information, view the Sponsorship
Conference venue
INTERSPEECH 2014 main conference will be held in the MAX Atria @ Singapore Expo.
Organizers
Lists of the organizing, advisory and technical program committees are available online.
Follow us
Facebook: ISCA
Twitter: @Interspeech2014 follow hash tags: #is2014 or #interspeech2014
LinkedIn Interspeech
Contact
Conference website: www.interspeech2014.org
organizers.interspeech2014@isca-speech.org (for general enquiries)
3-1-7 | (2014-09-14) Interspeech 2014 special session: Speech technologies for Ambient Assisted Living
Submission deadline: 24th March 2014
Singapore, 14-18 September 2014
http://www.interspeech2014.org/public.php?page=special_sessions.html#speech-technologies-ambient
This special session focuses on the use of speech technologies for ambient assisted living, the creation of smart spaces and intelligent companions that can preserve independence and executive function, social communication and security of people with special needs. Currently, speech input for assistive technologies remains underutilized despite its potential to deliver highly informative data and serve as the primary means of interaction with the home. Speech interfaces could replace or augment obtrusive and sometimes outright inaccessible conventional computer interfaces. Moreover, the smart home context can support speech communication by providing a number of concurrent information sources (e.g., wearable sensors, home automation sensors, etc.), enabling multimodal communication. In practice, its use remains limited due to challenging real-world conditions, and because conventional speech interfaces can have difficulty with the atypical speech of many users. This, in turn, can be attributed to the lack of abundant speech material, and the limited adaptation to the user of these systems. Taking up the challenges of this domain requires a multidisciplinary approach to define the user's needs, record corpora in realistic usage conditions, and develop speech interfaces that are robust to both environment and user's characteristics and are able to adapt to specific users.
This special session aims at bringing together researchers in speech and audio technologies with people from the ambient assisted living and assistive technologies communities to meet and foster awareness between members of either community, discuss problems, techniques and datasets, and perhaps initiate common projects.
Topics of the session will include:
• Assistive speech technology
• Applications of speech technology (ASR, dialogue, synthesis) for ambient assisted living
• Understanding, modelling, or recognition of aged and atypical speech
• Multimodal speech recognition (context-aware ASR)
• Multimodal emotion recognition
• Audio scene and smart space context analysis
• Assessment of speech and language processing within the context of assistive technology
• Speech synthesis and speech recognition for physical or cognitive impairments
• Symbol languages, sign languages, nonverbal communication
• Speech and NLP applied to typing interface applications
• Language modelling for Augmentative and Alternative Communication text entry and speech generating devices
• Deployment of speech and NLP tools in the clinic or in the field
• Linguistic resources; corpora and annotation schemes
• Evaluation of systems and components
Submission instructions: Researchers who are interested in contributing to this special session are invited to submit a paper according to the regular submission procedure of INTERSPEECH 2014, and to select 'Speech technologies for Ambient Assisted Living' in the special session field of the paper submission form. Please feel free to contact the organisers if you have any questions regarding the special session.
Organizers:
Michel Vacher, michel.vacher [at] imag.fr, Laboratoire d'Informatique de Grenoble
François Portet, francois.portet [at] imag.fr, Laboratoire d'Informatique de Grenoble
Frank Rudzicz, frank [at] cs.toronto.edu, University of Toronto
Jort F. Gemmeke, jgemmeke [at] amadana.nl, KU Leuven
Heidi Christensen, h.christensen [at] dcs.shef.ac.uk, University of Sheffield
3-1-8 | (2014-09-14) Special sessions at Interspeech 2014: call for submissions
--- INTERSPEECH 2014 - SINGAPORE ---
--- September 14-18, 2014 ---
--- http://www.INTERSPEECH2014.org ---
INTERSPEECH is the world's largest and most comprehensive conference on issues surrounding the science and technology of spoken language processing, both in humans and in machines. The theme of INTERSPEECH 2014 is
--- Celebrating the Diversity of Spoken Languages ---
INTERSPEECH 2014 includes a number of special sessions covering interdisciplinary topics and/or important new emerging areas of interest related to the main conference topics. Special sessions proposed for the forthcoming edition are:
• A Re-evaluation of Robustness
• Deep Neural Networks for Speech Generation and Synthesis
• Exploring the Rich Information of Speech Across Multiple Languages
• INTERSPEECH 2014 Computational Paralinguistics ChallengE (ComParE)
• Multichannel Processing for Distant Speech Recognition
• Open Domain Situated Conversational Interaction
• Phase Importance in Speech Processing Applications
• Speaker Comparison for Forensic and Investigative Applications
• Text-dependent for Short-duration Speaker Verification
• Tutorial Dialogues and Spoken Dialogue Systems
• Visual Speech Decoding
A description of each special session is given below. For paper submission, please follow the main conference procedure and choose the Special Session track when selecting your paper area. The paper submission procedure is described at: http://www.INTERSPEECH2014.org/public.php?page=submission_procedure.html
For more information, feel free to contact the Special Session Chair, Dr. Tomi H. Kinnunen, at email tkinnu [at]cs.uef.fi
----------------------------------------------------------------------------------------------------
Special Session Description
----------------------------------------------------------------------------------------------------
A Re-evaluation of Robustness
The goal of the session is to facilitate a re-evaluation of robust speech recognition in the light of recent developments. It is a re-evaluation at two levels:
• A re-evaluation in perspective, brought by the breakthroughs in performance obtained by Deep Neural Networks, which leads to a fresh questioning of the role and contribution of robust feature extraction.
• A literal re-evaluation on common databases, to be able to present and compare the performance of different algorithms and system approaches to robustness.
Paper submissions are invited on the theme of noise robust speech recognition and are required to include results on the Aurora 4 database to facilitate cross comparison of the performance of different techniques. Recent developments raise interesting research questions that the session aims to help progress by bringing focus to, and exploration of, these issues. For example:
1. What role is there for signal processing to create feature representations to use as inputs to Deep Learning, or can deep learning do all the work?
2. What feature representations can be automatically learnt in a deep learning architecture?
3. What other techniques can give great improvement in robustness?
4. What techniques don’t work and why?
The session organizers wish to encourage submissions that bring insight and understanding to the issues highlighted above. Authors are requested not only to present the absolute performance of the whole system but also to highlight the contribution made by various components in a complex system.
Authors of papers accepted for the session are encouraged to also evaluate their techniques on new test data sets (available in July) and to submit their results at the end of August.
Session organization
The session will be structured as a combination of:
1. Invited talks
2. Oral paper presentations
3. Poster presentations
4. Summary of contributions and results on newly released test sets
5. Discussion
Organizers:
David Pearce, Audience, dpearce [at]audience.com
Hans-Guenter Hirsch, Niederrhein University of Applied Sciences, hans-guenter.hirsch [at]hs-niederrhein.de
Reinhold Haeb-Umbach, University of Paderborn, haeb [at]nt.uni-paderborn.de
Michael Seltzer, Microsoft, mseltzer [at]microsoft.com
Keikichi Hirose, The University of Tokyo, hirose [at]gavo.t.u-tokyo.ac.jp
Steve Renals, University of Edinburgh, s.renals [at]ed.ac.uk
Sim Khe Chai, National University of Singapore, simkc [at]comp.nus.edu.sg
Niko Moritz, Fraunhofer IDMT, Oldenburg, niko.moritz [at]idmt.fraunhofer.de
K K Chin, Google, kkchin [at]google.com
Deep Neural Networks for Speech Generation and Synthesis
This special session aims to bring together researchers who work actively on deep neural networks for speech research, particularly in generation and synthesis, to promote and better understand state-of-the-art DNN research in statistical learning, and to compare results with the parametric HMM-GMM model, which is well established for speech synthesis, generation, and conversion. A DNN, with its neuron-like structure, can simulate the human speech production system in a layered, hierarchical, nonlinear and self-organized network. It can transform linguistic text information into intermediate semantic, phonetic and prosodic content and finally generate speech waveforms. Many possible neural network architectures or topologies exist, e.g. feed-forward NN with multiple hidden layers, stacked RBM or CRBM, and Recurrent Neural Net (RNN), which have been used for speech/image recognition and other applications.
We would like to use this special session as a forum to present updated results in the research frontiers, algorithm development and application scenarios. Particular focus areas will be parametric TTS synthesis, voice conversion, speech compression, de-noising and speech enhancement.
Organizers:
Yao Qian, Microsoft Research Asia, yaoqian [at]microsoft.com
Frank K. Soong, Microsoft Research Asia, frankkps [at]microsoft.com
Exploring the Rich Information of Speech Across Multiple Languages
Spoken language is the most direct means of communication between human beings. However, speech communication often demonstrates language-specific characteristics because of, for instance, linguistic differences (e.g., tonal vs. non-tonal, monosyllabic vs. multisyllabic) across languages. Our knowledge of the diversity of speech science across languages is still limited, including speech perception, linguistic and non-linguistic (e.g., emotion) information, etc. This knowledge is of great significance for the future design of language-specific applications of speech techniques (e.g., automatic speech recognition, assistive hearing devices). This special session will provide an opportunity for researchers from various communities (including speech science, medicine, linguistics and signal processing) to stimulate further discussion and new research in the broad cross-language area, and to present their latest research on understanding the language-specific features of speech science and their applications in the speech communication of machines and human beings. This special session encourages contributions from all fields of speech science, e.g., production and perception, but with a focus on presenting language-specific characteristics and discussing their implications, in order to improve our knowledge of the diversity of speech science across multiple languages. Topics of interest include, but are not limited to:
1. characteristics of acoustic, linguistic and language information in speech communication across multiple languages;
2. diversity of linguistic and non-linguistic (e.g., emotion) information among multiple spoken languages;
3. language-specific speech intelligibility enhancement and automatic speech recognition techniques; and
4. comparative cross-language assessment of speech perception in challenging environments.
Organizers:
Junfeng Li, Institute of Acoustics, Chinese Academy of Sciences, junfeng.li.1979 [at]gmail.com
Fei Chen, The University of Hong Kong, feichen1 [at]hku.hk
INTERSPEECH 2014 Computational Paralinguistics ChallengE (ComParE)
The INTERSPEECH 2014 Computational Paralinguistics ChallengE (ComParE) is an open Challenge dealing with speaker characteristics as manifested in their speech signal's acoustic properties. This year, it introduces new tasks with the Cognitive Load Sub-Challenge, the Physical Load Sub-Challenge, and a Multitask Sub-Challenge. For these Challenge tasks, the COGNITIVE-LOAD WITH SPEECH AND EGG database (CLSE), the MUNICH BIOVOICE CORPUS (MBC), and the ANXIETY-DEPRESSION-EMOTION-SLEEPINESS audio corpus (ADES), with high diversity of speakers and different languages covered (Australian English and German), are provided by the organizers. All corpora provide fully realistic data in challenging acoustic conditions and feature rich annotation such as speaker meta-data. They are given with distinct definitions of test, development, and training partitions, incorporating speaker independence as needed in most real-life settings. Benchmark results of the most popular approaches are provided as in the years before. Transcription of the train and development sets will be known. All Sub-Challenges allow contributors to find their own features with their own machine learning algorithm. However, a standard feature set that may be used will be provided per corpus.
Participants will have to stick to the definition of training, development, and test sets. They may report on results obtained on the development set, but have only five trials to upload their results on the test sets, whose labels are unknown to them. Each participation will be accompanied by a paper presenting the results, which undergoes peer review and has to be accepted for the conference in order to participate in the Challenge. The results of the Challenge will be presented in a Special Session at INTERSPEECH 2014 in Singapore. Further, contributions using the Challenge data or related to the Challenge but not competing within the Challenge are also welcome.
More information is also given on the Challenge homepage: http://emotion-research.net/sigs/speech-sig/is14-compare
Organizers:
Björn Schuller, Imperial College London / Technische Universität München, schuller [at]IEEE.org
Stefan Steidl, Friedrich-Alexander-University, stefan.steidl [at]fau.de
Anton Batliner, Technische Universität München / Friedrich-Alexander-University, batliner [at]cs.fau.de
Jarek Krajewski, Bergische Universität Wuppertal, krajewsk [at]uni-wuppertal.de
Julien Epps, The University of New South Wales / National ICT Australia, j.epps [at]unsw.edu.au
Multichannel Processing for Distant Speech Recognition
Distant speech recognition in real-world environments is still a challenging problem: reverberation and dynamic background noise represent major sources of acoustic mismatch that heavily decrease ASR performance, which, by contrast, can be very good in close-talking microphone setups.
In this context, a particularly interesting topic is the adoption of distributed microphones for the development of voice-enabled automated home environments based on distant-speech interaction: microphones are installed in different rooms and the resulting multichannel audio recordings capture multiple audio events, including voice commands or spontaneous speech, generated in various locations and characterized by a variable amount of reverberation as well as possible background noise. The focus of the proposed special session will be on multichannel processing for automatic speech recognition (ASR) in such a setting. Unlike other robust ASR tasks, where static adaptation or training with noisy data noticeably improves performance, the distributed microphone scenario requires full exploitation of multichannel information to reduce the highly variable dynamic mismatch. To facilitate better evaluation of the proposed algorithms, the organizers will provide a set of multichannel recordings in a domestic environment. The recordings will include spoken commands mixed with other acoustic events occurring in different rooms of a real apartment. The data is being created within the framework of the EC project DIRHA (Distant speech Interaction for Robust Home Applications), which addresses the challenges of speech interaction for home automation. The organizers will release the evaluation package (datasets and scripts) on February 17; the participants are asked to submit a regular paper reporting speech recognition results on the evaluation set and comparing their performance with the provided reference baseline.
Further details are available at: http://dirha.fbk.eu/INTERSPEECH2014
Organizers:
Marco Matassoni, Fondazione Bruno Kessler, matasso [at]fbk.eu
Ramon Fernandez Astudillo, Instituto de Engenharia de Sistemas e Computadores, ramon.astudillo [at]inesc-id.pt
Athanasios Katsamanis, National Technical University of Athens, nkatsam [at]cs.ntua.gr
Open Domain Situated Conversational Interaction
Robust conversational systems have the potential to revolutionize our interactions with computers. Building on decades of academic and industrial research, we now talk to our computers, phones, and entertainment systems on a daily basis. However, current technology typically limits conversational interactions to a few narrow domains/topics (e.g., weather, traffic, restaurants). Users increasingly want the ability to converse with their devices over broad web-scale content. Finding something on your PC or the web should be as simple as having a conversation. A promising approach to address this problem is situated conversational interaction. The approach leverages the situation and/or context of the conversation to improve system accuracy and effectiveness. Sources of context include visual content being displayed to the user, geo-location, prior interactions, multi-modal interactions (e.g., gesture, eye gaze), and the conversation itself. For example, while a user is reading a news article on their tablet PC, they initiate a conversation to dig deeper on a particular topic. Or a user is reading a map and wants to learn more about the history of events at mile marker 121. Or a gamer wants to interact with a game’s characters to find the next clue in their quest. All of these interactions are situated – rich context is available to the system as a source of priors/constraints on what the user is likely to say. This special session will provide a forum to discuss research progress in open domain situated conversational interactions.
Topics of the session will include:
• Situated context in spoken dialog systems
• Visual/dialog/personal/geo situated context
• Inferred context through interpretation and reasoning
• Open domain spoken dialog systems
• Open domain spoken/natural language understanding and generation
• Open domain semantic interpretation
• Open domain dialog management (large-scale belief state/policy)
• Conversational Interactions
• Multi-modal inputs in situated open domains (speech/text + gesture, touch, eye gaze)
• Multi-human situated interactions
Organizers:
Larry Heck, Microsoft Research, larry [at]ieee.org
Dilek Hakkani-Tür, Microsoft Research, dilek [at]ieee.org
Gokhan Tur, Microsoft Research, gokhan [at]ieee.org
Steve Young, Cambridge University, sjy [at]eng.cam.ac.uk
Phase Importance in Speech Processing Applications
In past decades, the amplitude of the speech spectrum has been considered the most important feature in different speech processing applications, and the phase of the speech signal has received less attention. Recently, several findings have justified the importance of phase in the speech and audio processing communities. The most prominent reported works, scattered across different communities of speech and audio processing, concern the importance of phase estimation along with amplitude estimation in speech enhancement, complementary phase-based features in speech and speaker recognition, and phase-aware acoustic modeling of the environment. These examples suggest that incorporating phase information can push the limits of the state-of-the-art phase-independent solutions long employed in different aspects of audio and speech signal processing. This Special Session aims to explore recent advances and methodologies for exploiting knowledge of signal phase information in different aspects of speech processing.
Without a dedicated effort to bring together researchers from different communities, rapid progress in investigating the usefulness of phase in speech processing applications is difficult to achieve. Therefore, as the first step in this direction, we aim to promote 'phase-aware speech and audio signal processing' and to form a community of researchers to organize the next steps. Our initiative is to unify these efforts to better understand the pros and cons of using phase and the degree of feasibility of phase estimation/enhancement in different areas of speech processing, including: speech enhancement, speech separation, speech quality estimation, speech and speaker recognition, voice transformation, and speech analysis and synthesis. The goal is to promote phase-based signal processing, study its importance, and share interesting findings from different speech processing applications.
Organizers:
Pejman Mowlaee, Graz University of Technology, pejman.mowlaee [at]tugraz.at
Rahim Saeidi, University of Eastern Finland, rahim.saeidi [at]uef.fi
Yannis Stylianou, Toshiba Labs Cambridge UK / University of Crete, yannis [at]csd.uoc.gr
Speaker Comparison for Forensic and Investigative Applications
In speaker comparison, speech/voice samples are compared by humans and/or machines for use in investigation or in court to address questions that are of interest to the legal system. Speaker comparison is a high-stakes application that can change people’s lives and it demands the best that science has to offer; however, methods, processes, and practices vary widely. These variations are not necessarily for the better and, though recognized, are not generally appreciated and acted upon. Methods, processes, and practices grounded in science are critical for the proper application (and non-application) of speaker comparison to a variety of international investigative and forensic applications.
This special session will contribute to scientific progress through:
1) understanding speaker comparison for investigative and forensic application (e.g., describe what is currently being done and critically analyze performance and lessons learned);
2) improving speaker comparison for investigative and forensic applications (e.g., propose new approaches/techniques, understand the limitations, and identify challenges and opportunities);
3) improving communications between communities of researchers, legal scholars, and practitioners internationally (e.g., directly address some central legal, policy, and societal questions such as allowing speaker comparisons in court, requirements for expert witnesses, and requirements for specific automatic or human-based methods to be considered scientific);
4) using best practices (e.g., reduction of bias and presentation of evidence);
5) developing a roadmap for progress in this session and future sessions; and
6) producing a documented contribution to the field.
Some of these objectives will need multiple sessions to fully achieve and some are complicated due to differing legal systems and cultures. This special session builds on previous successful special sessions and tutorials in forensic applications of speaker comparison at INTERSPEECH beginning in 2003. Wide international participation is planned, including researchers from the ISCA SIGs for the Association Francophone de la Communication Parlée (AFCP) and the Speaker and Language Characterization (SpLC).
Organizers:
Joseph P. Campbell, PhD, MIT Lincoln Laboratory, jpc [at]ll.mit.edu
Jean-François Bonastre, l'Université d'Avignon, jean-francois.bonastre [at]univ-avignon.fr
Text-dependent for Short-duration Speaker Verification
In recent years, speaker verification engines have reached maturity and have been deployed in commercial applications.
The ergonomics of such applications is especially demanding and imposes a drastic limitation on speech duration during authentication. A well-known tactic to address the lack of data caused by short duration is to use text-dependency. However, recent breakthroughs in accuracy and robustness achieved in text-independent speaker verification do not benefit text-dependent applications. Indeed, the large amount of development data required by the recent approaches is not available in the text-dependent context. The purpose of this special session is to gather the research efforts from both academia and industry toward a common goal of establishing a new baseline and exploring new directions for text-dependent speaker verification. The focus of the session is on robustness with respect to duration and modeling of lexical information. To support the development and evaluation of text-dependent speaker verification technologies, the Institute for Infocomm Research (I2R) has recently released the RSR2015 database, including 150 hours of data recorded from 300 speakers. Papers submitted to the special session are encouraged, but not required, to provide results based on the RSR2015 database in order to enable comparison of algorithms and methods. For this purpose, the organizers strongly encourage the participants to report performance on the protocol delivered with the database in terms of EER and minimum cost (in the sense of the NIST 2008 Speaker Recognition Evaluation). To get the database, please contact the organizers.
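As an illustration of the two figures of merit named above, the following minimal sketch computes an equal error rate (EER) and a minimum detection cost from lists of target and non-target trial scores. It is not the official scoring tool of the RSR2015 protocol. The function names are hypothetical, and the default cost parameters (C_miss = 10, C_fa = 1, P_target = 0.01) and the normalization by the best trivial system are assumptions based on the usual reading of the NIST 2008 evaluation plan; they should be checked against the protocol delivered with the database.

```python
import numpy as np


def error_rates(target_scores, nontarget_scores):
    """Miss and false-alarm rates at every candidate threshold (hypothetical helper)."""
    tgt = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    # Candidate thresholds: every observed score, plus +inf ("reject everything").
    thresholds = np.concatenate([np.unique(np.concatenate([tgt, non])), [np.inf]])
    # A trial is accepted when its score is >= the threshold.
    p_miss = np.array([(tgt < t).mean() for t in thresholds])
    p_fa = np.array([(non >= t).mean() for t in thresholds])
    return thresholds, p_miss, p_fa


def eer(target_scores, nontarget_scores):
    """Equal error rate: the operating point where miss and false-alarm rates are closest."""
    _, p_miss, p_fa = error_rates(target_scores, nontarget_scores)
    i = np.argmin(np.abs(p_miss - p_fa))
    return (p_miss[i] + p_fa[i]) / 2.0


def min_dcf(target_scores, nontarget_scores, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Minimum normalized detection cost (default parameters are an assumption)."""
    _, p_miss, p_fa = error_rates(target_scores, nontarget_scores)
    dcf = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
    # Normalize by the cost of the best trivial system (always accept or always reject).
    return dcf.min() / min(c_miss * p_target, c_fa * (1.0 - p_target))


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    print(eer([1.0, 3.0], [0.0, 2.0]))
    print(min_dcf([1.0, 3.0], [0.0, 2.0]))
```

The sketch sweeps every observed score as a threshold, which is simple but quadratic in the number of trials; for large trial lists a sort-based sweep would be preferable.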
Further details are available at: http://www1.i2r.a-star.edu.sg/~kalee/is2014/tdspk.html
Organizers:
Anthony LARCHER (alarcher [at]i2r.a-star.edu.sg) Institute for Infocomm Research
Hagai ARONOWITZ (hagaia [at]il.ibm.com) IBM Research – Haifa
Kong Aik LEE (kalee [at]i2r.a-star.edu.sg) Institute for Infocomm Research
Patrick KENNY (patrick.kenny [at]crim.ca) CRIM – Montréal
Tutorial Dialogues and Spoken Dialogue Systems
The growing interest in educational applications that use spoken interaction and dialogue technology has boosted research and development of interactive tutorial systems. In recent years, advances have been achieved in both the spoken dialogue community and the education research community, with sophisticated speech and multi-modal technology which allows functionally suitable and reasonably robust applications to be built. The special session combines spoken dialogue research, interaction modeling, and educational applications, and brings together the two INTERSPEECH SIG communities: SLaTE and SIGdial. The session focuses on methods, problems and challenges that are shared by both communities, such as sophistication of speech processing and dialogue management for educational interaction, integration of the models with theories of emotion, rapport, and mutual understanding, as well as application of the techniques to novel learning environments, robot interaction, etc. The session aims to survey issues related to the processing of spoken language in various learning situations, modeling of the teacher-student interaction in MOOC-like environments, as well as evaluating tutorial dialogue systems from the point of view of natural interaction, technological robustness, and learning outcome. The session encourages interdisciplinary research and submissions related to the special focus of the conference, 'Celebrating the Diversity of Spoken Languages'.
For further information, see http://junionsjlee.wix.com/INTERSPEECH
Organizers:
Maxine Eskenazi, max+ [at]cs.cmu.edu
Kristiina Jokinen, kristiina.jokinen [at]helsinki.fi
Diane Litman, litman [at]cs.pitt.edu
Martin Russell, M.J.RUSSELL [at]bham.ac.uk
Visual Speech Decoding
Speech perception is a bi-modal process that takes into account both the acoustic (what we hear) and visual (what we see) speech information. It has been widely acknowledged that visual cues play a critical role in automatic speech recognition (ASR), especially when audio is corrupted by, for example, background noise or voices from untargeted speakers, or is even inaccessible. Decoding visual speech is critically important if ASR technologies are to be widely deployed to realize truly natural human-computer interaction. Despite the advances in acoustic ASR, visual speech decoding remains a challenging problem. The special session aims to attract more effort to tackle this important problem. In particular, we would like to encourage researchers to focus on some critical questions in the area. As a starting point, we propose four questions:
1. How to deal with the speaker dependency in visual speech data?
2. How to cope with the head-pose variation?
3. How to encode temporal information in visual features?
4. How to automatically adapt the fusion rule when the quality of the two individual (audio and visual) modalities varies?
Researchers and participants are encouraged to raise more questions related to visual speech decoding. We expect the session to draw a wide range of attention from both the speech recognition and machine vision communities to the problem of visual speech decoding.
Organizers:
Ziheng Zhou, University of Oulu, ziheng.zhou [at]ee.oulu.fi
Matti Pietikäinen, University of Oulu, matti.pietikainen [at]ee.oulu.fi
Guoying Zhao, University of Oulu, gyzhao [at]ee.oulu.fi
3-1-9 | (2015) INTERSPEECH 2015, Dresden, Germany
Interspeech 2015
September 6-10, 2015, Dresden, Germany
SPECIAL TOPIC Speech Beyond Speech: Towards a Better Understanding of the Most Important Biosignal
MOTIVATION Speech is the most important biosignal humans can produce and perceive. It is the most common means of human-human communication, and therefore research and development in speech and language are paramount not only for understanding humans, but also for facilitating human-machine interaction. Still, not all characteristics of speech are fully understood, and even fewer are used for developing successful speech and language processing applications. Speech can exploit its full potential only if we consider the characteristics which are beyond the traditional (and still important) linguistic content. These characteristics include other biosignals that are directly accessible to human perception, such as muscle and brain activity, as well as articulatory gestures.
INTERSPEECH 2015 will therefore be organized around the topic “Speech beyond Speech: Towards a Better Understanding of the Most Important Biosignal”. Our conviction is that spoken language processing can make a substantial leap if it caters for the full information which is available in the speech signal. By opening our prestigious conference to researchers in other biosignal communities, we expect that substantial advances can be made discussing ideas and approaches across discipline and community boundaries.
ORGANIZERS The following preliminary list of principal organizers plan INTERSPEECH 2015:
LOCATION The event will be staged in the recently built Maritim International Congress Center (ICD) in Dresden, Germany. As the capital of Saxony, an up-and-coming region located in the former eastern part of Germany, Dresden combines a glorious and painful history with a strong dedication to the future and to technology. It is located in the heart of Europe, easily reached via two airports, and will offer a great deal of history and culture to INTERSPEECH 2015 delegates. Guests are well catered for in a variety of hotels of different standards and price ranges, making INTERSPEECH 2015 an exciting as well as an affordable event.
CONTACT Prof. Dr.-Ing. Sebastian Möller, Quality and Usability Lab, Telekom Innovation Laboratories, TU Berlin Sekr. TEL-18, Ernst-Reuter-Platz 7, D-10587 Berlin, Germany Web: www.interspeech2015.org
3-1-10 | (2016) INTERSPEECH 2016, San Francisco, CA, USA
Interspeech 2016 will take place on September 8-12, 2016 in San Francisco, CA, USA. The General Chair is Nelson Morgan.