Frequency-Domain Nonlinear Acoustic Echo Reduction Techniques Using Single-Channel and Multi-Channel Microphones

Title
Frequency-Domain Nonlinear Acoustic Echo Reduction Techniques Using Single-Channel and Multi-Channel Microphones
Author
Jihwan Park
Advisor(s)
Joon-Hyuk Chang
Issue Date
2018-08
Publisher
Hanyang University
Degree
Doctor
Abstract
In general, acoustic echo is caused by the acoustic coupling between the loudspeaker and the microphone. This echo degrades the performance of various applications, including speech communication and teleconferencing. To eliminate this interference, many acoustic echo reduction algorithms have been studied. For decades, the acoustic link has been assumed to be linear, so many algorithms estimate the acoustic echo path with a simple finite impulse response (FIR) filter, and they work well in simulated environments. In practical environments, however, the loudspeaker is frequently over-driven at its maximum voltage, which introduces nonlinear distortion. This nonlinearity prevents a simple FIR filter from accurately estimating the acoustic echo path and degrades the performance of echo reduction methods built on the linear assumption. In addition, conventional acoustic echo reduction methods remain limited in highly reverberant environments. As reverberation increases, the correlation between the far-end signal and the microphone input signal decreases, which reduces echo path estimation performance. A method that effectively reduces acoustic echo in nonlinear and highly reverberant environments is therefore essential. In this regard, this dissertation presents two algorithms for reducing nonlinear acoustic echo: 1) frequency-domain power filter-based acoustic echo suppression (AES) with a single-channel microphone; and 2) frequency-domain Kalman filter-based acoustic echo cancellation (AEC) with a microphone array, which reduces acoustic echo more efficiently in the presence of nonlinearity and severe reverberation. In the first part of the dissertation, we propose a novel nonlinear acoustic echo suppression algorithm based on a frequency-domain power filter. The proposed algorithm is divided into three steps. First, the nonlinear acoustic echo power is estimated using the frequency-domain power filter, which can capture the memoryless saturation-type nonlinearity of loudspeakers.
The coefficients of the frequency-domain power filter are obtained by a multi-tap least squares (MTLS) estimator, which allocates a finite impulse response filter to each subband to exploit the temporal evolution of the signal. Second, we derive the near-end speech absence probability (NSAP) from a statistical model of the near-end speech in each frequency bin, and use it to adjust the minimum mean square error (MMSE) gain in the AES framework. We further devise an optimized, data-driven way of estimating the NSAP, which finds the ratio of the a priori probabilities of near-end speech presence and absence over a wide range of signal-to-echo ratios (SERs). As a result, the proposed method shows improved speech quality compared with conventional algorithms. In the second part of the dissertation, we propose a multiple-input multiple-output (MIMO) nonlinear AEC (NAEC) that spatially filters out nonlinear acoustic echo in reverberant conditions. We extend single-channel nonlinear acoustic echo cancellation to the multichannel case and propose modeling the nonlinear acoustic transfer function (ATF) vector with a state-space equation. A Kalman filter is adopted to estimate the nonlinear ATF vector optimally and recursively. Furthermore, a low-rank approximation and spatial filtering are applied to estimate the power spectral density (PSD) matrix of the near-end speech, which allows the MIMO NAEC to deliver predictable performance under severe SER and highly reverberant conditions. Our approach is shown to outperform conventional methods in terms of echo reduction and near-end speech quality over a wide range of SER and reverberation conditions. The proposed nonlinear acoustic echo reduction techniques perform considerably better than conventional methods in both single-channel and multi-channel microphone conditions.
The techniques can serve as a pre-processor that improves speech quality in speech communication environments where nonlinear acoustic echo is present.
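The abstract's point that a linear FIR filter cannot track a saturating loudspeaker, while a power filter can, may be easier to see in a toy experiment. The following is only a minimal sketch under stated assumptions, not the dissertation's method: it works in the time domain rather than in frequency-domain subbands, uses a plain batch least-squares fit in place of the MTLS estimator, and assumes a `tanh` curve as the saturation. The helper names (`delayed_matrix`, `fit_echo`, `erle_db`), the toy echo path, and all parameter values are hypothetical.

```python
import numpy as np

def delayed_matrix(x, taps):
    """Stack delayed copies of x as columns; column k is x delayed by k samples."""
    n = len(x)
    cols = [np.concatenate([np.zeros(k), x[:n - k]]) for k in range(taps)]
    return np.column_stack(cols)

def fit_echo(x, d, taps, orders):
    """Least-squares echo estimate built from FIR filters on the powers x**p."""
    A = np.hstack([delayed_matrix(x ** p, taps) for p in orders])
    coeffs, *_ = np.linalg.lstsq(A, d, rcond=None)
    return A @ coeffs

def erle_db(d, residual):
    """Echo return loss enhancement: echo energy over residual energy, in dB."""
    return 10.0 * np.log10(np.sum(d ** 2) / np.sum(residual ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(8000)                 # far-end signal (white-noise stand-in)
h = rng.standard_normal(16) * np.exp(-0.3 * np.arange(16))  # toy echo path (assumed)
speaker_out = np.tanh(1.5 * x)                # assumed memoryless saturation
d = np.convolve(speaker_out, h)[:len(x)]      # echo picked up by the microphone
d = d + 1e-3 * rng.standard_normal(len(x))    # small sensor noise

# Linear model: a single FIR filter on x (power order 1 only).
erle_linear = erle_db(d, d - fit_echo(x, d, taps=16, orders=(1,)))
# Power filter: one FIR filter per odd power of x.
erle_power = erle_db(d, d - fit_echo(x, d, taps=16, orders=(1, 3, 5)))

print(f"linear FIR ERLE:   {erle_linear:.1f} dB")
print(f"power filter ERLE: {erle_power:.1f} dB")
```

Because the linear model is a special case of the power-filter model, the power filter can only match or improve the in-sample fit; how large the gain is depends on how hard the assumed saturation is driven.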
URI
https://repository.hanyang.ac.kr/handle/20.500.11754/75887
http://hanyang.dcollection.net/common/orgView/200000433388
Appears in Collections:
GRADUATE SCHOOL[S](대학원) > ELECTRONICS AND COMPUTER ENGINEERING(전자컴퓨터통신공학과) > Theses (Ph.D.)

