Hybrid PSO-ANFIS for Speaker Recognition

Samiya Silarbi

doi:10.4018/IJCINI.20210401.OA7

International Journal of Cognitive Informatics and Natural Intelligence

https://doi.org/10.4018/IJCINI.20210401.OA7

Abstract

This paper introduces an evolutionary approach for training the adaptive network-based fuzzy inference system (ANFIS). Previous work relies on gradient descent (GD), which converges slowly and can become trapped in poor local minima. This study instead applies particle swarm optimization (PSO), a swarm intelligence technique: the premise parameters of the rules are optimized by PSO, and the conclusion (consequent) parameters are estimated by least-squares estimation (LSE). The hybrid PSO-ANFIS model is evaluated on speaker recognition using the CHAINS speech dataset. The hybrid model achieved higher accuracy than a comparable ANFIS trained with gradient descent.
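The division of labour described in the abstract (PSO for the premise parameters, LSE for the conclusion parameters) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a first-order Sugeno model with Gaussian membership functions, a plain regression target, and illustrative PSO settings (`w`, `c1`, `c2`, swarm size). All function and parameter names are hypothetical.

```python
import numpy as np

def firing_strengths(X, centers, sigmas):
    """Normalized rule firing strengths, assuming Gaussian membership functions.
    X: (n, d); centers, sigmas: (rules, d). Returns (n, rules)."""
    diff = X[:, None, :] - centers[None, :, :]
    w = np.exp(-0.5 * np.sum((diff / sigmas[None, :, :]) ** 2, axis=2))
    return w / (w.sum(axis=1, keepdims=True) + 1e-12)

def design_matrix(X, wn):
    """First-order Sugeno regressors: wn_k * [x, 1] for every rule k."""
    n = X.shape[0]
    X1 = np.hstack([X, np.ones((n, 1))])
    return (wn[:, :, None] * X1[:, None, :]).reshape(n, -1)

def fit_consequents(X, y, centers, sigmas):
    """LSE step: solve for the conclusion parameters with the premises fixed."""
    A = design_matrix(X, firing_strengths(X, centers, sigmas))
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rmse = np.sqrt(np.mean((A @ theta - y) ** 2))
    return theta, rmse

def decode(p, n_rules, d):
    """Split a particle position into centers and strictly positive widths."""
    centers = p[: n_rules * d].reshape(n_rules, d)
    sigmas = np.abs(p[n_rules * d:]).reshape(n_rules, d) + 1e-3
    return centers, sigmas

def train_pso_anfis(X, y, n_rules=3, n_particles=20, n_iter=40,
                    w=0.7, c1=1.5, c2=1.5, seed=0):
    """PSO searches the premise parameters; each candidate is scored by the
    training RMSE obtained after the LSE step. Settings are illustrative."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    half = n_rules * d
    lo, hi = X.min(), X.max()
    pos = np.hstack([rng.uniform(lo, hi, (n_particles, half)),
                     rng.uniform(0.1, 1.0, (n_particles, half)) * (hi - lo)])
    vel = np.zeros_like(pos)

    def cost(p):
        return fit_consequents(X, y, *decode(p, n_rules, d))[1]

    pbest = pos.copy()
    pbest_cost = np.array([cost(p) for p in pos])
    g = np.argmin(pbest_cost)
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        costs = np.array([cost(p) for p in pos])
        improved = costs < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        g = np.argmin(pbest_cost)
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
        history.append(gbest_cost)
    centers, sigmas = decode(gbest, n_rules, d)
    theta, _ = fit_consequents(X, y, centers, sigmas)
    return centers, sigmas, theta, history

def predict(X, centers, sigmas, theta):
    """Model output for new inputs using the trained parameters."""
    return design_matrix(X, firing_strengths(X, centers, sigmas)) @ theta
```

Because the conclusion parameters are re-solved in closed form for every particle, the swarm only has to search the nonlinear premise parameters, and no gradient is needed. For a recognition task, one plausible adaptation would be to fit one output per speaker class and pick the largest, though the abstract does not specify how the paper handles this.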

