Deep Neural Network based Robust Text independent Speaker Identification

  • S.M. Jagdale, A.A. Shinde, J.S. Chitode
Keywords: DNN, Robust, Text Independent, MFCC.

Abstract

The successful application of deep neural networks (DNNs) to speech recognition has encouraged researchers to apply them to speaker recognition as well. A DNN is an artificial neural network (ANN) with more than two layers between the input and the output. In this work, the DNN takes a speech feature vector as input and outputs the estimated probability of each target speaker. Experiments are conducted on the ELSDSR database. Inter-session variation, acoustic variability, and the presence of noise are the major challenges in speaker recognition, and they severely reduce recognition performance. DNNs show remarkable improvements in recognition accuracy under these conditions [1]. Most prior work in speaker recognition has relied only on low-level features such as MFCC. In this work, both acoustic and prosodic features are given as input to the DNN classifier: the feature vector combines MFCC with prosodic features such as pitch, energy, and formants, because feature-level fusion improves recognition accuracy and prosodic features improve performance in the presence of noise. Results obtained with low-level features alone and with high-level features alone are compared with those of the combined system using feature-level fusion. With the DNN classifier, the combined features improve recognition accuracy by approximately 5% over low-level features alone.
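
The feature-level fusion and DNN classification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The feature dimensions, layer sizes, speaker count, random untrained weights, and the choice to average frame-level probabilities over an utterance are all assumptions made for illustration; the abstract specifies none of them. All function names are hypothetical.

```python
import math
import random

def fuse(mfcc, prosodic):
    """Feature-level fusion: concatenate the per-frame MFCC and prosodic vectors."""
    return list(mfcc) + list(prosodic)

def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(z):
    """Turn output scores into probabilities that sum to 1."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(rng, n_in, n_out):
    """Random, untrained weights (assumption: stands in for a trained model)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def speaker_posteriors(frame, layers):
    """Forward pass: ReLU hidden layers, softmax output over speakers."""
    h = frame
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(h, weights, biases))

def identify(frames, layers):
    """Average frame-level probabilities (an assumed decision rule) and pick the best speaker."""
    n_speakers = len(layers[-1][1])
    avg = [0.0] * n_speakers
    for frame in frames:
        probs = speaker_posteriors(frame, layers)
        avg = [a + p / len(frames) for a, p in zip(avg, probs)]
    best = max(range(n_speakers), key=lambda i: avg[i])
    return best, avg

rng = random.Random(0)
N_MFCC, N_PROSODIC, N_SPEAKERS = 13, 5, 4  # assumed sizes, not from the paper
# Three hidden layers, i.e. more than two layers between input and output.
sizes = [N_MFCC + N_PROSODIC, 16, 16, 16, N_SPEAKERS]
layers = [init_layer(rng, a, b) for a, b in zip(sizes, sizes[1:])]

# Synthetic frames stand in for features extracted from real speech.
frames = [fuse([rng.gauss(0, 1) for _ in range(N_MFCC)],
               [rng.gauss(0, 1) for _ in range(N_PROSODIC)])
          for _ in range(20)]
best, avg = identify(frames, layers)
print("predicted speaker index:", best)
```

With untrained weights the predicted index carries no meaning; the sketch only shows how the fused vector flows through the network to a probability per speaker.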

Published
2021-09-24
How to Cite
Jagdale, S. M., Shinde, A. A., & Chitode, J. S. (2021). Deep Neural Network based Robust Text independent Speaker Identification. Design Engineering, 14156-14162. Retrieved from http://www.thedesignengineering.com/index.php/DE/article/view/4686
Section
Articles