File:Lexicon-Free Conversational Speech Recognition with Neural Networks 2015.pdf
Andrew L. Maas∗, Ziang Xie∗, Dan Jurafsky, Andrew Y. Ng
Stanford University, Stanford, CA 94305, USA
{amaas, zxie, ang}@cs.stanford.edu, jurafsky@stanford.edu
Abstract
We present an approach to speech recognition that uses only a neural network to map acoustic input to characters, a character-level language model, and a beam search decoding procedure. This approach eliminates much of the complex infrastructure of modern speech recognition systems, making it possible to directly train a speech recognizer using errors generated by spoken language understanding tasks. The system naturally handles out of vocabulary words and spoken word fragments. We demonstrate our approach using the challenging Switchboard telephone conversation transcription task, achieving a word error rate competitive with existing baseline systems. To our knowledge, this is the first entirely neural-network-based system to achieve strong speech transcription results on a conversational speech task. We analyze qualitative differences between transcriptions produced by our lexicon-free approach and transcriptions produced by a standard speech recognition system. Finally, we evaluate the impact of large context neural network character language models as compared to standard n-gram models within our framework.
Keywords: beam search decoding, hidden Markov model, Nesterov’s accelerated gradient (NAG), large vocabulary continuous speech recognition (LVCSR)
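The decoding procedure named in the abstract combines per-frame character probabilities from the network with a character-level language model inside a beam search. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the network emits, for each frame, a probability distribution over characters plus a CTC-style blank symbol. The names `prefix_beam_search`, `BLANK`, `char_lm`, `alpha`, and `beam_width` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

BLANK = "_"  # assumed blank symbol; not taken from the paper


def prefix_beam_search(probs, alphabet, char_lm=None, alpha=1.0, beam_width=8):
    """Simplified prefix beam search over per-frame character probabilities.

    probs      -- list of frames; each frame is a list of probabilities
                  aligned with `alphabet`
    alphabet   -- list of symbols, including BLANK
    char_lm    -- optional callable (prefix, char) -> probability (assumed API)
    alpha      -- exponent that weights the language model
    beam_width -- number of prefixes kept after each frame
    """
    # Each prefix stores (probability of ending in blank,
    #                     probability of ending in a non-blank character).
    beams = {"": (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for ch, p in zip(alphabet, frame):
                if ch == BLANK:
                    # A blank keeps the prefix unchanged.
                    next_beams[prefix][0] += p * (p_b + p_nb)
                    continue
                lm = char_lm(prefix, ch) ** alpha if char_lm else 1.0
                if prefix and ch == prefix[-1]:
                    # Repeated character: a new copy is emitted only after a
                    # blank; otherwise the repeat collapses into the prefix.
                    next_beams[prefix + ch][1] += p * lm * p_b
                    next_beams[prefix][1] += p * p_nb
                else:
                    next_beams[prefix + ch][1] += p * lm * (p_b + p_nb)
        # Keep only the most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(score) for prefix, score in ranked[:beam_width]}
    return max(beams.items(), key=lambda kv: sum(kv[1]))[0]


# Two frames where blank is the single most likely symbol in each frame,
# yet the string "a" has the highest total probability over all alignments.
frames = [[0.6, 0.4], [0.6, 0.4]]
best = prefix_beam_search(frames, [BLANK, "a"])
```

Because the search scores character prefixes rather than entries from a word list, any character sequence can be produced, which is one way to read the abstract's claim about handling out of vocabulary words and word fragments. This sketch omits refinements such as a length bonus and special handling of word boundaries.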
File history
Date/time | Dimensions | User | Comment |
---|---|---|---|
17:17, 22 December 2016 (current) | 0 × 0 (650 KB) | Slikos | |