Estimation the Number of Speakers Based on Adaptive Wavelet Transform by Generalized Eigenvalue Decomposition and K-means Clustering

Abstract
The aim of this paper is to estimate the number of simultaneous speakers from overlapped speech signals. The proposed method is based on spectrum estimation with an adaptive wavelet transform, combined with generalized eigenvalue decomposition (GEVD) and K-means clustering. First, spectral estimation is applied to all microphone signals to select the best part of the signal spectrum. Next, the microphone signals are divided into subbands using the adaptive wavelet transform. The GEVD algorithm is then applied to each microphone pair in each subband to estimate the room impulse response and the time difference of arrival (TDOA). Finally, K-means clustering with the silhouette criterion is used to estimate the number of speakers (the value of K). The proposed algorithm is evaluated on simulated and real data to show its superiority over previous methods.
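The final step, choosing K by the silhouette criterion, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each GEVD estimate has already been reduced to a single scalar TDOA value, whereas the paper's actual feature space may be multidimensional. The function names (`kmeans_1d`, `silhouette`, `estimate_num_speakers`) and the TDOA values are hypothetical and chosen only for illustration. Because the silhouette score is undefined for a single cluster, the search starts at K = 2.

```python
from statistics import mean


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns one cluster label per value."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start (an assumption of this sketch): one centre taken
    # from the middle of each of k equal-count slices of the sorted data.
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centre for every value.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def silhouette(values, labels):
    """Mean silhouette coefficient for a labelling of scalar values."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return -1.0  # not defined for one cluster; treat as worst score
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(mean(abs(v - u) for u in other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


def estimate_num_speakers(tdoas, k_max=6):
    """Return the K in [2, k_max] with the highest mean silhouette."""
    k_max = min(k_max, len(tdoas) - 1)
    scores = {k: silhouette(tdoas, kmeans_1d(tdoas, k))
              for k in range(2, k_max + 1)}
    return max(scores, key=scores.get)


# Hypothetical TDOA estimates in milliseconds (illustrative, not from the paper).
tdoas_ms = [-0.52, -0.50, -0.48, -0.51, -0.49,
            0.08, 0.10, 0.12, 0.09, 0.11,
            0.58, 0.60, 0.62, 0.59, 0.61]
estimated = estimate_num_speakers(tdoas_ms, k_max=5)
```

In this sketch, each candidate K is scored by how compact and well separated its clusters are, and the K with the highest score is taken as the speaker count. A single-speaker case would need a separate test, since the silhouette criterion cannot score K = 1.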
Keywords
Estimation, Wavelet transforms, Clustering algorithms, Microphone arrays, Eigenvalues and eigenfunctions