[Adapt] FW: Talk announcement: Applications of deep learning to large-scale language models
kzhu at cs.sjtu.edu.cn
Tue Nov 15 22:29:15 CST 2016
Please attend this talk on Friday.
From: Yanmin Qian [mailto:yanminqian at sjtu.edu.cn]
Sent: Tuesday, November 15, 2016 10:27 PM
To: all at cs.sjtu.edu.cn
Cc: xunying.liu at gmail.com
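This Friday morning (November 18), Prof. Xunying Liu from the Department of Systems Engineering at the Chinese University of Hong Kong will visit [...]
[...] faculty and students working on deep learning and big data are invited to come and exchange ideas.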
Scalable Deep Language Models
Statistical language models (LMs) form key parts of many human language
technology applications including speech recognition, machine translation,
natural language processing and handwriting recognition. Key research
problems are modelling long range context dependencies and handling data
sparsity. Deep language modelling approaches represented by recurrent neural
networks (RNNs) are becoming increasingly popular for current speech and
language technology applications due to their inherently strong sequence
modelling ability and generalization performance.
This talk presents a series of recent research efforts aiming to improve the
scalability and performance of RNN language models (RNNLMs) on large data
sets. A noise contrastive estimation (NCE) based RNNLM training criterion,
combined with an efficient GPU-based bunch mode training algorithm, obtained
a speedup of over 50 times in training and evaluation time over the publicly
available RNNLM toolkit. Efficient RNNLM lattice rescoring approaches based
on two history clustering schemes produced decoding networks over 70% more
compact than tree-structured 10k-best lists with comparable
performance. Novel approaches modelling multiple paraphrase alternatives and
topic variation increased the total RNNLM improvements over baseline n-gram
LMs by a factor of 2.5.
Experimental results are presented for multiple state-of-the-art large
vocabulary speech recognition tasks.
Xunying Liu received his PhD degree in speech recognition and his MPhil degree
in computer speech and language processing, both from the University of
Cambridge, after his undergraduate study at Shanghai Jiao Tong University.
He was a Senior Research Associate at the Machine Intelligence Laboratory of
the Cambridge University Engineering Department, prior to joining the
Department of Systems Engineering and Engineering Management, Chinese
University of Hong Kong, as an Associate Professor in 2016. He received
a best paper award at ISCA Interspeech 2010 for his paper titled
"Language Model Cross Adaptation For LVCSR System Combination". He is a
co-author of the widely used HTK toolkit
and has continued to contribute to its current development in deep neural
network based acoustic and language modelling. His research outputs led to
several large scale speech recognition systems that were top ranked in a
series of international research evaluations. These include the Cambridge
Mandarin Chinese broadcast and conversational telephone speech recognition
systems developed for the DARPA-sponsored GALE and BOLT speech translation
evaluations from 2006 to 2014, and the Cambridge 2015 multi-genre broadcast
speech transcription system. His current research interests include large
vocabulary continuous speech recognition, machine learning, statistical
language modelling, noise robust speech recognition, speech synthesis, and
speech and language processing.
He is a regular reviewer for journals including IEEE/ACM Transactions on
Audio, Speech and Language Processing, Computer Speech and Language, Speech
Communication, the Journal of the Acoustical Society of America Express
Letters, Language Resources and Evaluation, and Natural Language
Engineering. He has served as a member of the scientific committee and
session chair for conferences including IEEE ICASSP and ISCA Interspeech.
Dr. Xunying Liu is a member of IEEE and ISCA.