Affiliation : Department of Computer Science and Engineering, Dongguk University
Country : Korea
Category : Computer Science & Information Technology
Volume, Issue, Month, Year : 8, 2, January, 2018
Abstract
This paper aims to reduce the inaccuracy of existing informatized captions in
noisy environments by using additional caption information. The IBM Watson API
can automatically generate an informatized caption, including timing
information and speaker ID information, from voice input. However, when the
voice signal contains noise, the recognition results degrade, causing errors in
the informatized caption. Such errors are especially common in movies, where
background music and special sound effects are present. To reduce caption
errors, additional captions and voice information are input at the same time,
and the informatized caption that the IBM Watson API produces from the voice
information is compared with the original text to automatically detect and
correct the erroneous parts. In this process, each corrected word is converted
into an informatized caption based on a database containing the average
pronunciation time of each word for each speaker. In this way, more precise
informatized captions can be generated based on the IBM Watson API.
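
The correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the recognizer output has already been parsed into (word, start, end) tuples, uses Python's standard difflib for the comparison with the original text, and relies on a hypothetical AVG_TIME table standing in for the database of average pronunciation times. The function name correct_caption, the speaker label, and the fallback duration are likewise assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical database: average pronunciation time (seconds) per speaker, per word.
AVG_TIME = {"speaker0": {"the": 0.10, "ship": 0.30, "is": 0.10, "sinking": 0.50}}
DEFAULT_TIME = 0.30  # assumed fallback for words missing from the database


def correct_caption(recognized, original_words, speaker, avg_time=AVG_TIME):
    """Align recognized (word, start, end) tuples with the original text and
    return corrected (word, start, end) tuples."""
    rec = [w.lower() for w, _, _ in recognized]
    ref = [w.lower() for w in original_words]
    table = avg_time.get(speaker, {})
    matcher = SequenceMatcher(a=rec, b=ref, autojunk=False)
    corrected = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Recognition agrees with the original text: keep the recognizer's timing.
            for k in range(i2 - i1):
                _, start, end = recognized[i1 + k]
                corrected.append((original_words[j1 + k], start, end))
            continue
        if tag == "delete":
            # Words that only the recognizer produced (e.g. from noise): drop them.
            continue
        durations = [table.get(w, DEFAULT_TIME) for w in ref[j1:j2]]
        total = sum(durations)
        if tag == "replace":
            # Misrecognized span: reuse its time window for the original words.
            start, end = recognized[i1][1], recognized[i2 - 1][2]
        else:
            # "insert": words the recognizer missed; start after the previous word
            # and estimate the window from the average pronunciation times.
            start = recognized[i1 - 1][2] if i1 > 0 else 0.0
            end = start + total
        # Split the window in proportion to each word's average pronunciation time.
        t = start
        for word, d in zip(original_words[j1:j2], durations):
            span = (end - start) * d / total
            corrected.append((word, round(t, 3), round(t + span, 3)))
            t += span
    return corrected


if __name__ == "__main__":
    noisy = [("the", 0.0, 0.1), ("sheep", 0.1, 0.5), ("is", 0.5, 0.6), ("thinking", 0.6, 1.2)]
    for entry in correct_caption(noisy, ["The", "ship", "is", "sinking"], "speaker0"):
        print(entry)
```

In this sketch, words that match the original text keep the recognizer's timing, while a misrecognized span has its time window divided among the original words in proportion to their average pronunciation times. How the paper itself assigns timing to corrected words may differ in detail.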
Keywords : Informatized caption, Speaker Pronunciation Time, IBM Watson API, Speech to Text Translation
For More Details : https://airccj.org/CSCP/vol8/csit88211.pdf