Repository at Hanyang University: DNN based multi-speaker speech synthesis with temporal auxiliary speaker ID embedding


Full metadata record

DC Field	Value	Language
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2020-01-20T06:07:41Z	-
dc.date.available	2020-01-20T06:07:41Z	-
dc.date.issued	2019-01	-
dc.identifier.citation	ICEIC 2019 - International Conference on Electronics, Information, and Communication, 8706390	en_US
dc.identifier.isbn	978-899500444-9	-
dc.identifier.uri	https://ieeexplore.ieee.org/document/8706390	-
dc.identifier.uri	https://repository.hanyang.ac.kr/handle/20.500.11754/122095	-
dc.description.abstract	This paper proposes multi-speaker speech synthesis using a speaker embedding. The proposed model is based on the Tacotron network, but its post-processing network is modified with dilated convolution layers, as used in the WaveNet architecture, to make it more adaptive to speech. By feeding a speaker embedding to the network as auxiliary input, a single neural network model can generate the voices of multiple speakers. The model successfully generates the voices of two speakers without significant deterioration of speech quality. © 2019 Institute of Electronics and Information Engineers (IEIE).	en_US
dc.description.sponsorship	This work was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00474, Intelligent Signal Processing for AI Speaker Voice Guardian). This research was supported by Projects for Research and Development of Police science and Technology under the Center for Research and Development of Police science and Technology and the Korean National Police Agency, funded by the Ministry of Science and ICT (PA-J000001-2017-101).	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE/ICEIC	en_US
dc.subject	Deep learning	en_US
dc.subject	Sequence to sequence	en_US
dc.subject	Speech synthesis	en_US
dc.subject	Multi speaker speech synthesis	en_US
dc.title	DNN based multi-speaker speech synthesis with temporal auxiliary speaker ID embedding	en_US
dc.type	Article	en_US
dc.identifier.doi	10.23919/ELINFOCOM.2019.8706390	-
dc.relation.page	61-64	-
dc.contributor.googleauthor	Lee, Junmo	-
dc.contributor.googleauthor	Song, Kwangsub	-
dc.contributor.googleauthor	Noh, Kyoungjin	-
dc.contributor.googleauthor	Park, Tae-Jun	-
dc.contributor.googleauthor	Chang, Joon-Hyuk	-
dc.sector.campus	S	-
dc.sector.daehak	COLLEGE OF ENGINEERING[S]	-
dc.sector.department	DEPARTMENT OF ELECTRONIC ENGINEERING	-
dc.identifier.pid	jchang	-
dc.identifier.orcid	https://orcid.org/0000-0003-2610-2323	-

Appears in Collections: COLLEGE OF ENGINEERING[S] > ELECTRONIC ENGINEERING > Articles
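
The abstract names two mechanisms: a speaker embedding supplied as auxiliary input, and dilated convolution layers in the post-processing network. This record does not say how either is implemented, so the following is only a minimal sketch under stated assumptions. The function names, the choice to append the same speaker vector to every time step, and the causal (left-padded) form of the convolution are all hypothetical illustrations, not the authors' method.

```python
from typing import List

Frame = List[float]  # feature vector for one time step


def tile_speaker_embedding(frames: List[Frame], speaker_vec: Frame) -> List[Frame]:
    """Append the same speaker vector to every time step.

    Assumption: the auxiliary input is concatenated per frame. The record
    does not state how the embedding actually enters the network.
    """
    return [frame + speaker_vec for frame in frames]


def dilated_conv1d(signal: List[float], kernel: List[float], dilation: int) -> List[float]:
    """1-D convolution whose kernel taps are `dilation` steps apart.

    Assumption: causal form with implicit zero padding on the left, so the
    output has the same length as the input.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation steps
            if idx >= 0:
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Number of input steps seen by one output of a stack of dilated layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical two-speaker lookup table (one-hot vectors for illustration).
SPEAKER_TABLE = {0: [1.0, 0.0], 1: [0.0, 1.0]}

frames = [[0.5], [0.7], [0.9]]
conditioned = tile_speaker_embedding(frames, SPEAKER_TABLE[1])
smoothed = dilated_conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], dilation=2)
```

Doubling the dilation at each layer, as in `receptive_field(2, [1, 2, 4, 8])`, widens the context a stack can see without adding parameters per layer, which is the usual motivation for dilated convolutions.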
