Repository at Hanyang University: Large-Vocabulary Speech Recognition Based on Diphone Models


Large-Vocabulary Speech Recognition Based on Diphone Models

Abstract: This paper reports experiments on large-vocabulary speech recognition based on context-dependent phoneme-like unit (PLU) models. Starting from 50 monophone PLU models chosen to reflect the pronunciation and phonological characteristics of Korean, we constructed a total of 593 diphones and used them as the basic recognition units. The pronunciation lexicon was built from the PLU and language-model databases. A bigram was applied during basic-unit recognition to reduce the search space and improve performance, and a unigram was applied to sentence recognition, where sentences are segmented into phrase units. In the experiments, PLU-level recognition reached a recognition rate of 69.16%, and the final recognition rate after sentence recognition was 87.36%. In sentence recognition, the diphone models performed 9.14% better than monophone models.
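As a rough illustration of the two ideas named in the abstract (deriving diphone units from a monophone lexicon, and estimating a bigram over those units), the following minimal Python sketch may help. It is not the authors' implementation: the romanized phone labels, the toy two-word lexicon, the "sil" padding symbol, and the function names to_diphones and bigram_probs are all hypothetical, and a diphone is simplified here to a pair of adjacent phones.

```python
from collections import Counter


def to_diphones(phones):
    """Pair each phone with its right neighbour, padding both ends with silence.

    Assumption: a diphone is modelled simply as an adjacent phone pair "a-b".
    """
    padded = ["sil"] + list(phones) + ["sil"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def bigram_probs(sequences):
    """Maximum-likelihood bigram estimates P(next | current) over unit sequences."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            pair_counts[(cur, nxt)] += 1
            context_counts[cur] += 1
    return {pair: n / context_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical toy lexicon with made-up romanized phone labels.
lexicon = {
    "hakkyo": ["h", "a", "k", "k", "yo"],
    "hana": ["h", "a", "n", "a"],
}

# Rewrite every pronunciation in terms of diphone units.
diphone_lexicon = {word: to_diphones(phones) for word, phones in lexicon.items()}

# The unit inventory contains only the diphones that actually occur.
inventory = sorted({d for units in diphone_lexicon.values() for d in units})

# Bigram over diphone units, which a decoder could use to prune unlikely successors.
probs = bigram_probs(diphone_lexicon.values())
```

In this sketch only phone pairs that occur in the lexicon enter the inventory, so the inventory can be far smaller than the full product of the phone set with itself. The abstract does not say how its 593 diphones were selected, so this is only one plausible reading.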

URI: https://www.earticle.net/Article/A106055 https://repository.hanyang.ac.kr/handle/20.500.11754/158700

Appears in Collections: COLLEGE OF SCIENCE AND CONVERGENCE TECHNOLOGY[E](과학기술융합대학) > ETC
