MODELING_OUT-OF-VOCABULARY_WORDS_FOR_ROBUST_SPEECH_RECOGNITION.pdf (0 × 0 pixels, file size: 87 KB, MIME type: application/pdf)

Issam Bazzi and James R. Glass
Spoken Language Systems Group
Laboratory for Computer Science
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139, USA

{issam,jrg}@sls.lcs.mit.edu


In this paper we present an approach for modeling and recognizing out-of-vocabulary (OOV) words in a single-stage recognizer. A word-based recognizer is augmented with an extra OOV word model, which enables the OOV word to be predicted by a word-based language model. The OOV model itself is phone-based, so that an OOV word can be realized as an arbitrary sequence of phones. A phone bigram is used to provide phonotactic constraints within the OOV model. A recognizer with this configuration can recognize words in the original vocabulary as well as any potential new words of arbitrary pronunciation. In our preliminary investigation of this framework, we have evaluated the recognizer on a weather information domain with one test set containing only in-vocabulary (IV) data, and another containing OOV words. On the IV test set, the recognizer had an OOV insertion rate of only 1.3%, and degraded the baseline WER from 10.4% to 10.7%. On the OOV test set, the recognizer was able to detect nearly half of the OOV words (47% detection rate).
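
To make the role of the phone bigram concrete, the following is a minimal sketch of how such a model might score a candidate phone sequence inside an OOV model. It is not the authors' implementation: the toy pronunciations, the phone labels, the add-one smoothing, and the names `train_phone_bigram` and `log_prob` are all assumptions introduced for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy pronunciations (not from the paper), used only to
# estimate bigram counts for this illustration.
TRAINING_PRONUNCIATIONS = [
    ["b", "ao", "s", "t", "ax", "n"],
    ["d", "eh", "n", "v", "er"],
    ["t", "ao", "k"],
    ["s", "t", "ao", "r", "m"],
]

START, END = "<s>", "</s>"


def train_phone_bigram(pronunciations):
    """Count phone bigrams, including sequence start and end markers."""
    counts = defaultdict(lambda: defaultdict(int))
    phones = {END}
    for pron in pronunciations:
        seq = [START] + pron + [END]
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
            phones.add(cur)
    return counts, phones


def log_prob(phone_seq, counts, phones):
    """Add-one smoothed log-probability of a phone sequence.

    Smoothing is an assumption of this sketch; it lets unseen phone
    pairs receive a small nonzero probability. Phones outside the
    training inventory are not handled.
    """
    seq = [START] + phone_seq + [END]
    vocab_size = len(phones)
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        context_total = sum(counts[prev].values())
        p = (counts[prev][cur] + 1) / (context_total + vocab_size)
        total += math.log(p)
    return total


counts, phones = train_phone_bigram(TRAINING_PRONUNCIATIONS)
plausible = log_prob(["s", "t", "ao", "k"], counts, phones)
implausible = log_prob(["k", "k", "k", "k"], counts, phones)
print(plausible > implausible)
```

In this sketch, a sequence built from phone pairs seen in training scores higher than one built from unseen pairs, which is the kind of phonotactic constraint the abstract attributes to the phone bigram. In a full recognizer this score would be combined with acoustic scores and with the word-level language model probability of entering the OOV model; those components are outside the scope of this example.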
