
Full metadata record

DC Field | Value | Language
dc.contributor.author | 장준혁 | -
dc.date.accessioned | 2022-03-29T02:35:01Z | -
dc.date.available | 2022-03-29T02:35:01Z | -
dc.date.issued | 2020-07 | -
dc.identifier.citation | APPLIED SCIENCES-BASEL, v. 10, no. 13, article no. 4602 | en_US
dc.identifier.issn | 2076-3417 | -
dc.identifier.uri | https://www.mdpi.com/2076-3417/10/13/4602 | -
dc.identifier.uri | https://repository.hanyang.ac.kr/handle/20.500.11754/169489 | -
dc.description.abstract | Speech recognition for intelligent robots seems to suffer from performance degradation due to ego-noise. The ego-noise is caused by the motors, fans, and mechanical parts inside the intelligent robots, especially when the robot moves or shakes its body. To overcome the problems caused by the ego-noise, we propose a robust speech recognition algorithm that uses motor-state information of the robot as an auxiliary feature. For this, we use two deep neural networks (DNNs) in this paper. First, we design the latent features using a bottleneck layer, an internal layer with fewer hidden units than the other layers, to represent whether the motor is operating or not. The latent features maximizing the representation of the motor-state information are generated by taking the motor data and acoustic features as the input of the first DNN. Second, once the motor-state-dependent latent features are designed at the first DNN, the second DNN, accounting for acoustic modeling, receives the latent features as the input along with the acoustic features. We evaluated the proposed system on the LibriSpeech database. The proposed network enables efficient compression of the acoustic and motor-state information, and the resulting word error rate (WER) is superior to that of a conventional speech recognition system. | en_US
dc.description.sponsorship | This work was supported by the research fund of Signal Intelligence Research Center supervised by the Defense Acquisition Program Administration and the Agency for Defense Development of Korea. | en_US
dc.language.iso | en | en_US
dc.publisher | MDPI | en_US
dc.subject | automatic speech recognition | en_US
dc.subject | human-robot interaction | en_US
dc.subject | deep learning | en_US
dc.subject | bottleneck layer | en_US
dc.subject | latent feature | en_US
dc.subject | bottleneck network | en_US
dc.title | Augmented Latent Features of Deep Neural Network-Based Automatic Speech Recognition for Motor-Driven Robots | en_US
dc.type | Article | en_US
dc.relation.no | 13 | -
dc.relation.volume | 10 | -
dc.identifier.doi | 10.3390/app10134602 | -
dc.relation.page | 1-10 | -
dc.relation.journal | APPLIED SCIENCES-BASEL | -
dc.contributor.googleauthor | Lee, Moa | -
dc.contributor.googleauthor | Chang, Joon-Hyuk | -
dc.relation.code | 2020047168 | -
dc.sector.campus | S | -
dc.sector.daehak | COLLEGE OF ENGINEERING[S] | -
dc.sector.department | SCHOOL OF ELECTRONIC ENGINEERING | -
dc.identifier.pid | jchang | -
dc.identifier.orcid | https://orcid.org/0000-0003-2610-2323 | -
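
The two-network design summarized in the abstract — a first DNN that compresses motor data and acoustic features through a bottleneck layer into latent features, and a second DNN that consumes those latent features alongside the acoustic features for acoustic modeling — can be sketched roughly as below. All dimensions, layer sizes, activations, and the random weights are illustrative assumptions for the sketch, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b, act=np.tanh):
    """A single fully connected layer with an activation."""
    return act(x @ w + b)

# Hypothetical dimensions (not taken from the paper): 40-dim acoustic
# features, a 4-dim motor-state vector, a 16-unit bottleneck layer,
# 128 hidden units, and 100 acoustic-model output targets.
D_ACOUSTIC, D_MOTOR, D_BOTTLENECK, D_HIDDEN, N_STATES = 40, 4, 16, 128, 100

# First DNN: maps [acoustic; motor] to bottleneck latent features.
w1 = rng.standard_normal((D_ACOUSTIC + D_MOTOR, D_HIDDEN)) * 0.1
b1 = np.zeros(D_HIDDEN)
w_bn = rng.standard_normal((D_HIDDEN, D_BOTTLENECK)) * 0.1
b_bn = np.zeros(D_BOTTLENECK)

def latent_features(acoustic, motor):
    """Bottleneck activations encoding motor-state information."""
    h = dense(np.concatenate([acoustic, motor], axis=-1), w1, b1)
    return dense(h, w_bn, b_bn)

# Second DNN: acoustic model over the augmented [acoustic; latent] input.
w2 = rng.standard_normal((D_ACOUSTIC + D_BOTTLENECK, D_HIDDEN)) * 0.1
b2 = np.zeros(D_HIDDEN)
w_out = rng.standard_normal((D_HIDDEN, N_STATES)) * 0.1
b_out = np.zeros(N_STATES)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def acoustic_model(acoustic, motor):
    """Posterior over acoustic states given one frame of features."""
    z = latent_features(acoustic, motor)
    h = dense(np.concatenate([acoustic, z], axis=-1), w2, b2)
    return softmax(h @ w_out + b_out)

posteriors = acoustic_model(rng.standard_normal(D_ACOUSTIC),
                            rng.standard_normal(D_MOTOR))
print(posteriors.shape)  # (100,)
```

In a real system both networks would be trained (the first on a motor-state objective so the bottleneck captures motor information, the second on senone or character targets); the sketch only shows the forward data flow of the augmented-feature idea.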



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
