
ISCApad #231

Sunday, September 10, 2017 by Chris Wellekens

6-1 (2017-04-10) 3 Funded PhD Research Studentships at CSTR, Edinburgh, Scotland, UK
  

Three Funded PhD Research Studentships at the Centre for Speech Technology Research,
University of Edinburgh.

Please see http://www.cstr.ed.ac.uk/opportunities for full details, eligibility
requirements, application procedure and deadlines.

1. Embedding enhancement information in the speech signal

Speech becomes harder to understand in the presence of noise and other distortions, such
as those introduced by telephone channels. This is especially true for people with a
hearing impairment. It is difficult to enhance the intelligibility of a received
speech+noise mixture, or of distorted speech, even with the relatively sophisticated
enhancement algorithms that modern hearing aids are capable of running. A clever way
around this problem might be for the sender to add extra information to the original
speech signal, before noise or distortion is introduced. The receiver (e.g., a hearing
aid) would then use this information to assist speech enhancement.
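
To make the idea concrete, here is a minimal toy sketch (our own illustration, not the
project's method): the sender transmits the clean speech's short-time magnitude spectrum
as side information, and the receiver uses it to build a Wiener-style gain for the noisy
mixture. All signal parameters and names here are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy "speech" (a sum of tones) plus additive white noise, 1 s at 16 kHz.
    fs = 16000
    t = np.arange(fs) / fs
    speech = 0.6 * np.sin(2 * np.pi * 220 * t) + 0.4 * np.sin(2 * np.pi * 440 * t)
    mixture = speech + 0.5 * rng.standard_normal(fs)

    def spectra(x, frame=512, hop=256):
        # Windowed FFT frames of a signal.
        win = np.hanning(frame)
        starts = range(0, len(x) - frame + 1, hop)
        return np.stack([np.fft.rfft(x[i:i + frame] * win) for i in starts])

    # Sender: compute the clean magnitude spectrum BEFORE noise is added,
    # and send it alongside the audio as low-rate side information.
    side_info = np.abs(spectra(speech))

    # Receiver (e.g., a hearing aid): a Wiener-style gain built from the
    # received side information rather than from a blind noise estimate.
    Y = spectra(mixture)
    noise_psd = np.maximum(np.abs(Y) ** 2 - side_info ** 2, 1e-8)
    gain = side_info ** 2 / (side_info ** 2 + noise_psd)
    enhanced = gain * Y  # enhanced frames; overlap-add would resynthesise audio

In practice the side information would have to be embedded robustly in the signal itself
(or sent over a parallel channel) and heavily compressed, which is part of what makes the
research problem interesting.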

Funding: Marie Skłodowska-Curie fellowship


2. Broadcast Quality End-to-end Speech Synthesis

Advances in neural networks, made jointly in the fields of automatic speech recognition
and speech synthesis amongst others, have led to a new understanding of their
capabilities as generative models. Neural networks can now directly generate synthetic
speech waveforms, without the quality limitations of a vocoder. We have made separate
advances, using neural networks to discover representations of spoken and written
language that have applications in lightly-supervised text processing for almost any
language, and in adaptation of speaker identity and style. The project will combine
these techniques into a single end-to-end model for speech synthesis. This will require
new techniques for learning from both text and speech data, which may have other
applications, such as automatic speech recognition.
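
As a rough illustration of what 'end-to-end, without a vocoder' means, the toy skeleton
below maps character IDs directly to raw waveform samples. It is our own sketch: the
architecture, names and dimensions are placeholders, not the project's design.

    import torch
    import torch.nn as nn

    class TinyEndToEndTTS(nn.Module):
        # Toy text-to-waveform model: characters in, raw samples out.
        def __init__(self, n_chars=64, d=128, samples_per_char=256):
            super().__init__()
            self.embed = nn.Embedding(n_chars, d)
            self.encoder = nn.GRU(d, d, batch_first=True)  # reads the text
            self.decoder = nn.GRU(d, d, batch_first=True)  # generates frames
            self.to_wave = nn.Linear(d, samples_per_char)  # frames -> samples

        def forward(self, char_ids):              # (batch, n_positions)
            h, _ = self.encoder(self.embed(char_ids))
            h, _ = self.decoder(h)
            return self.to_wave(h).flatten(1)     # raw waveform, no vocoder

    model = TinyEndToEndTTS()
    wave = model(torch.randint(0, 64, (1, 20)))   # 20 characters -> 5120 samples

A real system would need attention or another alignment mechanism between text and audio,
and an autoregressive or probabilistic output model; the point here is only that a single
network, trained on paired text and speech, replaces the separate text-processing,
acoustic-model and vocoder stages.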

Funding: EPSRC Industrial CASE award (in collaboration with the BBC)


3. Automatic Extraction of Rich Metadata from Broadcast Speech (in collaboration with the
BBC)

The research studentship will be concerned with automatically learning to extract rich
metadata from broadcast television recordings, using speech recognition and natural
language processing techniques. We will build on recent advances in convolutional and
recurrent neural networks, using architectures which learn representations jointly from
both acoustic and textual data. The project will build on our current work in the rich
transcription of broadcast speech using neural network based speech recognition systems,
along with neural network approaches to machine reading and summarisation. In
particular, we are interested in developing approaches to transcribing broadcast speech
in a way appropriate to the particular context. This may include compression or
distillation of the content (perhaps to fit within the constraints of subtitling),
transforming conversational speech into a form that is easier to read as text, or
transcribing broadcast speech in a way appropriate for a particular reading age.
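
As a toy sketch of what 'learning representations jointly from acoustic and textual
data' could look like (our own illustrative assumptions, not the project's design), the
encoder below maps mel-spectrogram frames and word IDs into a shared embedding space,
where an alignment objective (here a toy cosine loss) could pull matching (audio,
transcript) pairs together.

    import torch
    import torch.nn as nn

    class JointEncoder(nn.Module):
        # Toy joint encoder: acoustic frames and word IDs -> one shared space.
        def __init__(self, n_mels=80, vocab=10000, d=256):
            super().__init__()
            self.audio_rnn = nn.GRU(n_mels, d, batch_first=True)
            self.word_embed = nn.Embedding(vocab, d)
            self.text_rnn = nn.GRU(d, d, batch_first=True)

        def forward(self, mel_frames, word_ids):
            _, h_audio = self.audio_rnn(mel_frames)               # (1, batch, d)
            _, h_text = self.text_rnn(self.word_embed(word_ids))
            return h_audio[-1], h_text[-1]                        # shared d-dim space

    enc = JointEncoder()
    a, t = enc(torch.randn(2, 300, 80), torch.randint(0, 10000, (2, 40)))
    loss = 1 - nn.functional.cosine_similarity(a, t).mean()  # toy alignment loss

Downstream summarisation or reading-age models could then consume either modality
through the same shared representation.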

Funding: EPSRC Industrial CASE award (in collaboration with the BBC)

