Sequential Deep Neural Networks Ensemble for Speech Bandwidth Extension
- Title
- Sequential Deep Neural Networks Ensemble for Speech Bandwidth Extension
- Author
- Joon-Hyuk Chang (장준혁)
- Keywords
- Bandwidth extension; sequential deep neural network; ensemble; log-power spectra; regression; voiced/unvoiced classification
- Issue Date
- 2018-05
- Publisher
- IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
- Citation
- IEEE Access, vol. 6, pp. 27039-27047
- Abstract
- In this paper, we propose a subband-based ensemble of sequential deep neural networks (DNNs) for speech bandwidth extension (BWE). First, the narrow-band spectra are folded into the high-band (HB) region to generate the HB spectra, and the energy levels of the HB spectra are then adjusted using DNNs that operate on log-power spectra features. To this end, we build multiple DNNs, each responsible for one subband of the HB, and connect them sequentially from the lower to the higher subbands. This sequential structure performs denoising and HB regression to better estimate the HB energy levels. In addition, we use voiced/unvoiced (V/UV) classification so that the DNN ensemble is applied differently to voiced and unvoiced sounds. To demonstrate the performance of the proposed BWE algorithm, we compare it with a speech production model-based BWE system and with a DNN-based BWE system that estimates the HB log-power spectra directly. The experimental results show that the proposed approach provides better speech quality than these conventional approaches.
- URI
- https://ieeexplore.ieee.org/document/8355864
- https://repository.hanyang.ac.kr/handle/20.500.11754/118648
- ISSN
- 2169-3536
- DOI
- 10.1109/ACCESS.2018.2833890
- Appears in Collections:
- COLLEGE OF ENGINEERING[S](공과대학) > ELECTRONIC ENGINEERING(융합전자공학부) > Articles
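
The fold-then-adjust pipeline summarized in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the mirror-style folding, the equal-width subband split, and the way each stage sees the outputs of the lower subbands are all assumptions made for illustration. The DNNs are replaced by plain callables, and the V/UV classification and denoising steps are left out.

```python
def fold_spectrum(nb_log_power):
    """Mirror the narrow-band log-power spectrum into the high band.

    Assumption: folding is modeled as a simple mirror image of the NB bins.
    """
    return list(reversed(nb_log_power))


def split_subbands(values, num_subbands):
    """Split a list into contiguous, equal-sized subbands.

    Assumption: len(values) is divisible by num_subbands.
    """
    size = len(values) // num_subbands
    return [values[i * size:(i + 1) * size] for i in range(num_subbands)]


def sequential_adjust(nb_log_power, num_subbands, estimators):
    """Adjust folded HB subband levels from the lowest to the highest subband.

    estimators[i] is a hypothetical stand-in for the i-th DNN. It maps the
    context (NB spectrum plus all already-adjusted lower subbands) to a
    target mean log-power for subband i.
    """
    folded = fold_spectrum(nb_log_power)
    context = list(nb_log_power)
    high_band = []
    for subband, estimate in zip(split_subbands(folded, num_subbands), estimators):
        target_level = estimate(context)
        current_level = sum(subband) / len(subband)
        # A constant shift in the log domain rescales the subband energy.
        adjusted = [v + (target_level - current_level) for v in subband]
        high_band.extend(adjusted)
        context.extend(adjusted)  # later stages see the earlier outputs
    return high_band


# Toy usage: four NB bins, two HB subbands, and a placeholder estimator
# that predicts a level 1.0 below the mean of its context.
nb = [0.0, -1.0, -2.0, -3.0]
toy_estimator = lambda ctx: sum(ctx) / len(ctx) - 1.0
hb = sequential_adjust(nb, 2, [toy_estimator, toy_estimator])
```

In this sketch only the mean level of each folded subband is changed, so the fine spectral shape still comes from the narrow band; a real system would replace `toy_estimator` with trained regressors.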