File:Recognising Conversational Speech - What an Incremental ASR Should Do for a Dialogue System and How to Get There BaumannEtAl16IWSDS.pdf

Recognising_Conversational_Speech_-_What_an_Incremental_ASR_Should_Do_for_a_Dialogue_System_and_How_to_Get_There_BaumannEtAl16IWSDS.pdf (0 × 0 pixels, file size: 3.77 MB, MIME type: application/pdf)

Timo Baumann, Casey Kennington, Julian Hough and David Schlangen


Automatic speech recognition (ASR) is not only becoming increasingly accurate, but also increasingly adapted for producing timely, incremental output. However, overall accuracy and timeliness alone are insufficient for interactive dialogue systems, which require stable output and responsiveness to the utterance as it unfolds. Furthermore, if a dialogue system is to achieve deep understanding of user utterances, phenomena such as disfluencies should be preserved or marked up for use by downstream components, such as language understanding, rather than filtered out. Similarly, word timing can be informative for analyzing deictic expressions in a situated environment and should be available for analysis. Here we investigate the overall accuracy and incremental performance of three widely used systems and discuss their suitability with respect to these requirements. From their differing performance along these measures, we provide a picture of the requirements for incremental ASR in dialogue systems and describe freely available tools for using and evaluating incremental ASR.

Keywords: Automatic speech recognition (ASR), Spoken dialogue system (SDS), Sphinx-4, Google’s web-based ASR API, Kaldi

File history

current: 17:04, 22 December 2016, 0 × 0 (3.77 MB), uploaded by Slikos

The following page links to this file: