Файл:Very deep multilingual convolutional neural networks for LVCSR 1509.08967.pdf

Материал из Материалы по машинному обучению
Перейти к: навигация, поиск
Very_deep_multilingual_convolutional_neural_networks_for_LVCSR_1509.08967.pdf(0 × 0 пикселей, размер файла: 269 КБ, MIME-тип: application/pdf)

Tom Sercu1,2 Christian Puhrsch1 Brian Kingsbury2 Yann LeCun1 1 Center for Data Science, Courant Institute of Mathematical Sciences, New York University 2 IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598, U.S.A. 2{tsercu,bedk}@us.ibm.com, 1cpuhrsch@nyu.edu,yann@cs.nyu.edu


Convolutional neural networks(CNNs) are a standard component of many current state-of-the-art Large Vocabulary Continuous Speech Recognition (LVCSR) systems. However, CNNs in LVCSR have not kept pace with recent advances in other domains where deeper neural networks provide superior performance. In this paper we propose a number of architectural advances in CNNs for LVCSR. First, we introduce a very deep convolutional network architecture with up to 14 weight layers. There are multiple convolutional layers before each pooling layer, with small 3×3 kernels, inspired by the VGG Imagenet 2014 architecture. Then, we introduce multilingual CNNs with multiple untied layers. Finally, we introduce multi-scale input features aimed at exploiting more context at negligible computational cost. We evaluate the improvements first on a Babel task for low resource speech recognition, obtaining an absolute 5.77% WER improvement over the baseline PLP DNN by training our CNN on the combined data of six different languages. We then evaluate the very deep CNNs on the Hub5’00 benchmark (using the 262 hours of SWB-1 training data) achieving a word error rate of 11.8% after cross-entropy training, a 1.4% WER improvement (10.6% relative) over the best published CNN result so far.

Index Terms — Convolutional Networks, Multilingual, Acoustic Modeling, Speech Recognition, Neural Networks

История файла

Нажмите на дату/время, чтобы просмотреть, как тогда выглядел файл.

текущий12:16, 22 декабря 20160 × 0 (269 КБ)Slikos (обсуждение | вклад)
  • Вы не можете перезаписать этот файл.

Следующая 1 страница ссылается на данный файл: