Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 장준혁 | - |
dc.date.accessioned | 2022-09-05T01:19:48Z | - |
dc.date.available | 2022-09-05T01:19:48Z | - |
dc.date.issued | 2020-11 | - |
dc.identifier.citation | SENSORS, v. 20, no. 22, article no. 6493, pp. 1-17 | en_US |
dc.identifier.issn | 1424-8220 | - |
dc.identifier.uri | https://www.mdpi.com/1424-8220/20/22/6493 | - |
dc.identifier.uri | https://repository.hanyang.ac.kr/handle/20.500.11754/172755 | - |
dc.description.abstract | In this paper, we propose a multi-channel cross-tower network with attention mechanisms in the latent domain (Multi-TALK) that suppresses both acoustic echo and background noise. The proposed approach consists of a cross-tower network, a parallel encoder with an auxiliary encoder, and a decoder. For multi-channel processing, the parallel encoder extracts latent features from each microphone, and these latent features, which include spatial information, are compressed by a 1D convolution operation. In addition, the latent features of the far-end signal are extracted by the auxiliary encoder and effectively provided to the cross-tower network through the attention mechanism. The cross-tower network iteratively estimates the latent features of the acoustic echo and background noise in its respective towers. To improve performance at each iteration, the outputs of each tower are passed as input to the next iteration of the neighboring tower. Finally, before the decoder estimates the near-end speech, attention mechanisms are further applied to remove the estimated acoustic echo and background noise from the compressed mixture, preventing the speech distortion caused by over-suppression. Compared to conventional algorithms, the proposed algorithm effectively suppresses acoustic echo and background noise while significantly lowering speech distortion. | en_US |
dc.description.sponsorship | This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00474, Intelligent Signal Processing for AI Speaker Voice Guardian). | en_US |
dc.language.iso | en | en_US |
dc.publisher | MDPI | en_US |
dc.subject | acoustic echo suppression | en_US |
dc.subject | noise suppression | en_US |
dc.subject | attention mechanism | en_US |
dc.subject | temporal convolutional network | en_US |
dc.subject | cross-tower | en_US |
dc.title | Multi-TALK: Multi-Microphone Cross-Tower Network for Jointly Suppressing Acoustic Echo and Background Noise | en_US |
dc.type | Article | en_US |
dc.relation.no | 22 | - |
dc.relation.volume | 20 | - |
dc.identifier.doi | 10.3390/s20226493 | - |
dc.relation.page | 1-17 | - |
dc.relation.journal | SENSORS | - |
dc.contributor.googleauthor | Park, Song-Kyu | - |
dc.contributor.googleauthor | Chang, Joon-Hyuk | - |
dc.relation.code | 2020053568 | - |
dc.sector.campus | S | - |
dc.sector.daehak | COLLEGE OF ENGINEERING[S] | - |
dc.sector.department | SCHOOL OF ELECTRONIC ENGINEERING | - |
dc.identifier.pid | jchang | - |
dc.identifier.orcid | https://orcid.org/0000-0003-2610-2323 | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.