Hybrid PSO-ANFIS for Speaker Recognition
International Journal of Cognitive Informatics and Natural Intelligence
https://doi.org/10.4018/IJCINI.20210401.OA7
Abstract
This paper introduces an evolutionary approach for training the adaptive network-based fuzzy inference system (ANFIS). Previous work relies on gradient descent (GD), which converges very slowly and tends to get stuck in poor local minima. This study instead applies particle swarm optimization (PSO), a branch of swarm intelligence: the premise parameters of the rules are optimized by PSO, and the conclusion (consequent) parameters are optimized by least-squares estimation (LSE). The hybrid PSO-ANFIS model is applied to speaker recognition on the CHAINS speech dataset. The hybrid model achieved higher accuracy than a comparable ANFIS trained with gradient descent.
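The division of labor described in the abstract (PSO for the premise parameters, LSE for the conclusion part) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gaussian membership functions, a first-order Sugeno rule base, a global-best PSO with an inertia weight, and a regression target rather than speaker labels. All function names and hyperparameter values (`pso_anfis`, `inertia`, `c1`, `c2`, the number of rules `R`) are hypothetical choices made for illustration.

```python
import numpy as np

def firing_strengths(X, centers, sigmas):
    # X: (N, D); centers, sigmas: (R, D). Returns normalized rule strengths (N, R).
    diff = X[:, None, :] - centers[None, :, :]
    w = np.exp(-0.5 * np.sum((diff / sigmas[None, :, :]) ** 2, axis=2))
    return w / (w.sum(axis=1, keepdims=True) + 1e-12)

def design_matrix(X, wbar):
    # First-order Sugeno consequent: each rule contributes wbar * [x, 1].
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return (wbar[:, :, None] * X1[:, None, :]).reshape(X.shape[0], -1)

def unpack(p, R, D):
    # A particle stores all centers followed by all widths (assumed encoding).
    centers = p[:R * D].reshape(R, D)
    sigmas = np.abs(p[R * D:]).reshape(R, D) + 1e-3  # keep widths positive
    return centers, sigmas

def fit_consequents(X, y, p, R):
    # With the premise parameters fixed, the output is linear in the
    # consequent parameters, so they are solved by least squares (LSE).
    centers, sigmas = unpack(p, R, X.shape[1])
    A = design_matrix(X, firing_strengths(X, centers, sigmas))
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    mse = float(np.mean((A @ theta - y) ** 2))
    return theta, mse

def pso_anfis(X, y, R=3, n_particles=20, n_iter=50,
              inertia=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    lo, hi = X.min(), X.max()
    # Initialize centers inside the data range and widths as a fraction of it.
    pos = rng.uniform(lo, hi, size=(n_particles, 2 * R * D))
    pos[:, R * D:] = rng.uniform(0.2, 1.0, size=(n_particles, R * D)) * (hi - lo)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([fit_consequents(X, y, p, R)[1] for p in pos])
    g = int(np.argmin(pbest_cost))
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([fit_consequents(X, y, p, R)[1] for p in pos])
        improved = cost < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = cost[improved]
        g = int(np.argmin(pbest_cost))
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    theta, mse = fit_consequents(X, y, gbest, R)
    return gbest, theta, mse
```

Each particle's fitness is the training error obtained after the least-squares step, so PSO searches only the nonlinear premise space while the linear conclusion parameters are solved exactly at every evaluation. For a classification task such as speaker recognition, the target `y` would have to be replaced by some encoding of speaker identity, which this sketch does not attempt.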