Papers by Deepak Gala

Moving Sound Source Localization and Tracking Using a Self Rotating Bi-Microphone Array
ASME 2019 Dynamic Systems and Control Conference, Nov 26, 2019
In this paper, we present three approaches to localizing and tracking a sound source moving in a three-dimensional (3D) space using a bi-microphone array rotating at a fixed angular velocity. The motion of the sound source, along with the rotation of the bi-microphone array, results in a sinusoidal inter-channel time difference (ICTD) signal with time-varying amplitude and phase. Two state-space models were employed to develop extended Kalman filters (EKFs) that identify the instantaneous amplitude and phase of the signal. Observability analysis of the two state-space models was conducted to reveal singularities. We also developed a method based on the Hilbert transform, which compares the analytic signal of the true ICTD signal with that of a virtual signal having zero elevation and azimuth angles. A moving-average filter is then applied to reduce the noise and the artifacts at the beginning and end of the estimates. The effectiveness of the proposed methods was tested, and comparison studies were conducted in simulation.
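The Hilbert-transform route described above can be sketched in a few lines: form the analytic signal of a synthetic ICTD trace, read off the instantaneous amplitude and phase, and smooth the amplitude with a moving-average filter. All parameters (rotation rate, sampling rate, amplitude drift) are assumed for illustration, not taken from the paper; `analytic_signal` reproduces what `scipy.signal.hilbert` computes, using NumPy only.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal (same result as scipy.signal.hilbert):
    zero out negative frequencies and double the positive ones."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def instantaneous_amp_phase(x, smooth_win=51):
    """Instantaneous amplitude and phase from the analytic signal; the
    amplitude is smoothed with a moving-average filter to suppress
    noise and edge artifacts."""
    a = analytic_signal(x)
    amp = np.abs(a)
    phase = np.unwrap(np.angle(a))
    amp = np.convolve(amp, np.ones(smooth_win) / smooth_win, mode="same")
    return amp, phase

# Synthetic ICTD: 2 Hz rotation, amplitude drifting as the source moves
fs = 200.0
t = np.arange(0.0, 10.0, 1.0 / fs)
true_amp = 1.0 + 0.3 * t / 10.0
ictd = true_amp * np.cos(2 * np.pi * 2.0 * t)
amp_est, phase_est = instantaneous_amp_phase(ictd)
```

Away from the edges, `amp_est` tracks the drifting envelope and `phase_est` increases at the rotation rate; the ends of the record show the artifacts the moving-average filter is meant to tame.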

Speech Signal Enhancement Techniques for Microphone Arrays
In all speech communication settings, the quality and intelligibility of speech is of utmost importance for ease and accuracy of information exchange. The speech processing systems used to communicate or store speech are usually designed for a noise-free environment, but in a real-world environment the presence of background interference, in the form of additive background and channel noise, drastically degrades the performance of these systems, causing inaccurate information exchange and listener fatigue. The Spectral Subtraction Technique can be used to reduce stationary noise, but non-stationary noise still passes through it. Spectral Subtraction also introduces a musical noise that is very annoying to human ears. Beamforming is another possible method of speech enhancement, and it can reduce the musical noise left by Spectral Subtraction. Beamforming by itself, however, does not appear to provide enough improvement, and its performance becomes worse if the noise comes from many directions or the speech has strong reverberation. Therefore, a system has been designed that combines the Spectral Subtraction Technique followed by the Beamforming Technique, reducing stationary as well as residual, musical noise. Algorithms and associated software have been developed for 1) Spectral Subtraction, 2) the Beamforming Technique, and 3) Spectral Subtraction followed by Beamforming. The last technique yields noise-free speech, free of musical noise and reverberation, making the speech intelligible and of good quality.
Processing of the signal for Spectral Subtraction, Delay-and-Sum Beamforming, and the combined technique was carried out individually for three different experiments (with 3, 6, and 10 microphones) and for four cases: three different speech signals and a fourth signal with Gaussian white noise. The SNR was calculated in each case.
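As a rough sketch of the first stage, the following implements basic magnitude spectral subtraction: the noise magnitude spectrum is estimated from an assumed noise-only lead-in, subtracted frame by frame, floored to limit musical noise, and the result resynthesized by overlap-add. The frame size, hop, floor, and noise-segment length are illustrative choices, not the thesis's settings.

```python
import numpy as np

def spectral_subtraction(x, fs, noise_dur=0.5, frame=256, hop=128, beta=0.01):
    """Magnitude spectral subtraction: estimate the noise magnitude
    spectrum from an assumed noise-only lead-in, subtract it per
    windowed frame, floor the result, and resynthesize by overlap-add."""
    win = np.hanning(frame)
    n_noise = int(noise_dur * fs)
    # Noise magnitude estimate from the leading noise-only frames
    noise_frames = [np.abs(np.fft.rfft(x[s:s + frame] * win))
                    for s in range(0, n_noise - frame, hop)]
    noise_mag = np.mean(noise_frames, axis=0)

    out = np.zeros(len(x))
    norm = np.zeros(len(x))
    for s in range(0, len(x) - frame, hop):
        X = np.fft.rfft(x[s:s + frame] * win)
        mag = np.abs(X) - noise_mag                # subtract noise magnitude
        mag = np.maximum(mag, beta * np.abs(X))    # spectral floor vs musical noise
        y = np.fft.irfft(mag * np.exp(1j * np.angle(X)), frame)
        out[s:s + frame] += y * win
        norm[s:s + frame] += win ** 2
    return out / np.maximum(norm, 1e-8)

# Synthetic test: 0.5 s noise-only lead-in, then a tone buried in white noise
fs = 8000
rng = np.random.default_rng(0)
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s test "speech"
clean = np.concatenate([np.zeros(fs // 2), tone])
x = clean + 0.3 * rng.standard_normal(clean.size)
y = spectral_subtraction(x, fs)
```

The residual error in the tone region drops noticeably after subtraction; as the abstract notes, what remains is the fluctuating "musical" residue that motivates the beamforming stage.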

Speech enhancement combining spectral subtraction and beamforming techniques for microphone array
Proceedings of the International Conference and Workshop on Emerging Trends in Technology, 2010
ABSTRACT: In all speech communication settings, the quality and intelligibility of speech is of utmost importance for ease and accuracy of information exchange. The Spectral Subtraction Technique is one method of reducing stationary noise, but non-stationary noise still passes through it. Spectral Subtraction also introduces a musical noise that is very annoying to human ears. Beamforming is another possible method of speech enhancement. Beamforming by itself, however, does not appear to provide enough improvement, and its performance becomes worse if the noise comes from many directions or the speech has strong reverberation. A combined technique using the advantages of both is proposed, where the Spectral Subtraction Technique followed by the Beamforming Technique reduces stationary as well as residual, musical noise. It can be observed that Spectral Subtraction followed by Beamforming gives a better SNR value than either technique individually, thereby improving the quality of speech. Numerous simulation results are used to illustrate the reasoning.
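The Beamforming stage can be illustrated with a minimal delay-and-sum sketch, assuming the steering delays are already known as whole samples (a real system would estimate fractional delays from the array geometry): aligning the channels makes the look-direction signal add coherently while uncorrelated noise is averaged down.

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Advance each channel by its known steering delay (in samples)
    and average: the look-direction signal adds coherently while
    uncorrelated noise is averaged down by the channel count."""
    n = min(len(m) - d for m, d in zip(mics, delays))
    return np.mean([m[d:d + n] for m, d in zip(mics, delays)], axis=0)

# Six microphones hear the same signal at different delays, plus noise
rng = np.random.default_rng(0)
n_samp = 4000
sig = np.sin(2 * np.pi * 0.03 * np.arange(n_samp + 10))   # desired signal
delays = [0, 2, 4, 5, 7, 9]                               # per-mic delays (samples)
mics = [sig[10 - d:10 - d + n_samp] + 0.5 * rng.standard_normal(n_samp)
        for d in delays]
out = delay_and_sum(mics, delays)
```

With six channels and unit-gain averaging, the noise power in `out` is roughly one sixth of a single channel's, which is the SNR gain the abstract attributes to beamforming.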

SNR improvement with speech enhancement techniques
Proceedings of the International Conference & Workshop on Emerging Trends in Technology - ICWET '11, 2011
ABSTRACT: Speech enhancement aims to improve speech quality using various techniques. The Spectral Subtraction Technique is one of the earliest and longest-standing popular approaches to noise compensation and speech enhancement. It reduces stationary noise, but non-stationary noise still passes through it. Further, it also introduces a musical noise that is very annoying to human ears. Beamforming is another possible method of speech enhancement. Beamforming by itself, however, does not appear to provide enough improvement, and its performance becomes worse if the noise comes from many directions or the speech has strong reverberation. A combined technique using the Spectral Subtraction Technique followed by the Beamforming Technique reduces stationary as well as residual, musical noise. It can be observed that Spectral Subtraction followed by Beamforming gives a better SNR value than either technique individually, thereby improving the quality of speech. Numerous simulation results are used to illustrate the reasoning.
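The SNR comparison underlying these claims can be reproduced with a simple global SNR measure; the definition below (clean-signal power over residual power, in dB) is a common convention, assumed here rather than taken from the paper.

```python
import numpy as np

def snr_db(clean, processed):
    """Global SNR in dB: power of the clean reference divided by the
    power of the residual (processed minus clean)."""
    residual = processed - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))

# Example: a constant offset of 0.1 on a unit signal gives 20 dB SNR
clean = np.ones(100)
val = snr_db(clean, clean + 0.1)
```

Comparing `snr_db(clean, noisy)` against `snr_db(clean, enhanced)` for each technique is the kind of before/after tabulation the abstract describes.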

Proceedings of the 5th International Conference of Control, Dynamic Systems, and Robotics (CDSR'18), Jun 1, 2018
This paper presents a novel three-dimensional (3D) sound source localization (SSL) technique based only on interaural time difference (ITD) signals, acquired by a self-rotational two-microphone array on an unmanned ground vehicle. Both the azimuth and elevation angles of a stationary sound source are identified using the phase angle and amplitude of the acquired ITD signal. An SSL algorithm based on an extended Kalman filter (EKF) is developed. The observability analysis reveals the singularity of the state when the sound source is placed above the microphone array. A means of detecting this singularity is then proposed and incorporated into the proposed SSL algorithm. The proposed technique is tested in both a simulated environment and on two hardware platforms, i.e., a KEMAR dummy binaural head and a robotic platform. All results show the fast and accurate convergence of estimates.
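The core idea — that the ITD trace of a rotating two-microphone array is a sinusoid whose phase encodes azimuth and whose amplitude encodes elevation — can be sketched with a far-field model and a linear least-squares fit in place of the paper's EKF. The spacing `d`, speed of sound `c`, and source angles are assumed values for illustration.

```python
import numpy as np

d, c = 0.30, 343.0                          # mic spacing (m), speed of sound (m/s)
az_true, el_true = np.deg2rad(60.0), np.deg2rad(35.0)   # assumed source angles

# Far-field model: ITD(theta) = (d/c) * cos(elevation) * cos(theta - azimuth)
theta = np.linspace(0.0, 4.0 * np.pi, 800)  # array heading over two rotations
itd = (d / c) * np.cos(el_true) * np.cos(theta - az_true)
itd += 1e-6 * np.random.default_rng(1).standard_normal(theta.size)

# Linear least squares: itd ~ a*cos(theta) + b*sin(theta),
# so a = A*cos(az), b = A*sin(az) with amplitude A = (d/c)*cos(el)
M = np.column_stack([np.cos(theta), np.sin(theta)])
a, b = np.linalg.lstsq(M, itd, rcond=None)[0]
azimuth = np.arctan2(b, a)                     # phase of the sinusoid -> azimuth
elevation = np.arccos(np.hypot(a, b) * c / d)  # amplitude -> elevation
```

The batch fit recovers both angles; the paper's EKF does the same estimation recursively and, unlike this sketch, handles the singularity when the source sits directly above the array (where the sinusoid's amplitude vanishes).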

Journal of Intelligent & Robotic Systems
While vision-based localization techniques have been widely studied for small autonomous unmanned vehicles (SAUVs), sound-source localization capabilities have not been fully enabled for SAUVs. This paper presents two novel approaches for SAUVs to perform three-dimensional (3D) multi-sound-source localization (MSSL) using only the inter-channel time difference (ICTD) signal generated by a self-rotating bi-microphone array. The two proposed approaches are based on two machine learning techniques, viz., the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Random Sample Consensus (RANSAC) algorithms, respectively, whose performances are tested and compared in both simulations and experiments. The results show that both approaches are capable of correctly identifying the number of sound sources along with their 3D orientations in a reverberant environment.
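The RANSAC flavor of MSSL can be sketched under a simplifying assumption (not necessarily the paper's exact formulation): each ICTD sample is dominated by one source, so the data form a mixture of cosine curves A·cos(θ − ψ), and sources are peeled off one consensus set at a time. Amplitudes, angles, and thresholds below are illustrative.

```python
import numpy as np

def fit_cos(theta, y):
    """Least-squares fit of y ~ a*cos(theta) + b*sin(theta),
    i.e. one model A*cos(theta - psi)."""
    M = np.column_stack([np.cos(theta), np.sin(theta)])
    return np.linalg.lstsq(M, y, rcond=None)[0]

def ransac_sources(theta, itd, tol=5e-5, min_inliers=60, trials=300, seed=0):
    """Peel off one cosine model at a time: hypothesize (a, b) from a
    random pair of samples, keep the hypothesis with the largest
    consensus set, refit on its inliers, remove them, and repeat."""
    rng = np.random.default_rng(seed)
    theta, itd = theta.copy(), itd.copy()
    sources = []
    while len(itd) >= min_inliers:
        best = None
        for _ in range(trials):
            i = rng.choice(len(itd), 2, replace=False)
            a, b = fit_cos(theta[i], itd[i])
            resid = np.abs(itd - (a * np.cos(theta) + b * np.sin(theta)))
            inliers = resid < tol
            if best is None or inliers.sum() > best.sum():
                best = inliers
        if best.sum() < min_inliers:
            break
        a, b = fit_cos(theta[best], itd[best])
        sources.append((np.hypot(a, b), np.arctan2(b, a)))  # (amplitude, azimuth)
        theta, itd = theta[~best], itd[~best]
    return sources

# Two sources: ICTD samples interleaved, each dominated by one source
rng = np.random.default_rng(3)
theta = rng.uniform(0, 4 * np.pi, 400)
which = rng.integers(0, 2, 400)
A = np.where(which == 0, 8e-4, 6e-4)
psi = np.where(which == 0, np.deg2rad(30.0), np.deg2rad(120.0))
itd = A * np.cos(theta - psi) + 5e-6 * rng.standard_normal(400)
sources = ransac_sources(theta, itd)
```

The loop terminates when too few samples remain to support another model, which is how the source count falls out of the procedure rather than being supplied in advance.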
arXiv, 2018
While vision-based localization techniques have been widely studied for small autonomous unmanned vehicles (SAUVs), sound-source localization capability has not been fully enabled for SAUVs. This paper presents two novel approaches for SAUVs to perform multi-sound-source localization (MSSL) using only the interaural time difference (ITD) signal generated by a self-rotating bi-microphone array. The two proposed approaches are based on the DBSCAN and RANSAC algorithms, respectively, whose performances are tested and compared in both simulations and experiments. The results show that both approaches are capable of correctly identifying the number of sound sources along with their three-dimensional orientations in a reverberant environment.

Journal of Intelligent & Robotic Systems
This work presents a novel technique that performs both orientation and distance localization of a sound source in a three-dimensional (3D) space using only the interaural time difference (ITD) cue, generated by a newly developed self-rotational bi-microphone robotic platform. The system dynamics are established in the spherical coordinate frame using a state-space model. The observability analysis of the state-space model shows that the system is unobservable when the sound source is placed at elevation angles of 90 and 0 degrees. The proposed method uses the difference between the azimuth estimates resulting from the 3D and two-dimensional models, respectively, to check the zero-degree-elevation condition, and further estimates the elevation angle using a polynomial curve-fitting approach. The method is also capable of detecting a 90-degree elevation by extracting the zero-ITD signal 'buried' in noise. Additionally, distance localization is performed by first rotating the microphone array to face the sound source and then shifting the microphone perpendicular to the source-robot vector by a predefined distance in a fixed number of steps. The integrated rotational and translational motions of the microphone array provide complete orientation and distance localization using only the ITD cue. A novel robotic platform using a self-rotational bi-microphone array was also developed for unmanned ground robots performing sound source localization. The proposed technique was first tested in simulation and then verified on the newly developed robotic platform. Experimental data collected by the microphones installed on a KEMAR dummy head were also used to test the proposed technique. All results show the effectiveness of the proposed technique.
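The distance-localization step reduces to one triangulation: after the array has rotated to face the source (zero bearing), a sideways shift of s meters makes the source reappear at a small bearing δ, giving range r = s / tan δ. The numbers below are assumed for illustration, not taken from the paper's experiments.

```python
import math

def distance_from_shift(shift, bearing_change):
    """Range from triangulation: after facing the source (zero bearing),
    a perpendicular shift of `shift` meters makes the source reappear at
    bearing delta, so r = shift / tan(delta)."""
    return shift / math.tan(bearing_change)

# Assumed geometry: source 5 m ahead, robot steps 0.5 m sideways
r_true, shift = 5.0, 0.5
delta = math.atan2(shift, r_true)        # bearing the shifted array would measure
r_est = distance_from_shift(shift, delta)
```

In practice δ comes from a second orientation estimate after the shift, so the range accuracy is limited by the bearing accuracy; a larger shift makes δ larger and the estimate less sensitive to bearing noise.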


Sound Source Localization and Tracking Using a Self-Rotating Bi-Microphone Array
Dissertation, 2019
While vision-based localization techniques have been widely studied, sound-source localization capabilities have not been fully enabled. In this dissertation, I present novel three-dimensional (3D) sound source localization (SSL) techniques based only on inter-channel time difference (ICTD) signals, acquired by a self-rotating bi-microphone array on a ground robot.
The rest of the dissertation is organized as follows. Chapter 2 presents the preliminaries. In Chapter 3, I present the localization of a single stationary sound source in a 3D environment. Both the azimuth and elevation angles of a stationary sound source are identified using the phase angle and amplitude of the acquired ICTD signal. An SSL algorithm based on an extended Kalman filter (EKF) is developed. The observability analysis reveals the singularity of the state estimates when the sound source is placed above the microphone array. A means of detecting this singularity is then proposed and incorporated into the proposed SSL algorithm. The proposed technique is tested in both a simulated environment and on two hardware platforms, i.e., a KEMAR dummy binaural head and a robotic platform. All results show the fast and accurate convergence of estimates.
Chapter 4 presents a novel technique that performs both orientation and distance localization of a sound source in a 3D space using only the ICTD cue, generated by the self-rotating bi-microphone array mounted on the robotic platform. The system dynamics are established in the spherical coordinate frame using a state-space model. The observability analysis of the state-space model shows that the system is unobservable when the sound source is placed at elevation angles of 90 and 0 degrees. The proposed method uses the difference between the azimuth estimates resulting from the 3D and two-dimensional (2D) models, respectively, to check the zero-degree-elevation condition, and further estimates the elevation angle using a polynomial curve-fitting approach. The method is also capable of detecting a 90-degree elevation by extracting the zero-ICTD signal 'buried' in noise. Additionally, distance localization is performed by first rotating the microphone array to face the sound source and then shifting the microphone perpendicular to the source-robot vector by a predefined distance in a fixed number of steps. The integrated rotational and translational motions of the microphone array provide complete orientation and distance localization using only the ICTD cue. The proposed technique is first tested in simulation and then verified on the robotic platform. Experimental data collected by the microphones installed on a KEMAR dummy head are also used to test the proposed technique. All results show the effectiveness of the proposed technique.
In Chapter 5, I present two novel approaches to perform 3D multi-sound-source localization (MSSL) using only the ICTD signal generated by a self-rotating bi-microphone array. The two approaches are based on two machine learning techniques, viz., the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Random Sample Consensus (RANSAC) algorithms, respectively, whose performances are tested and compared in both simulations and experiments. The results show that both approaches are capable of correctly identifying the number of sound sources along with their 3D orientations in a reverberant environment.
Chapter 6 presents three approaches to localizing and tracking a sound source moving in a 3D space using a bi-microphone array rotating at a fixed angular velocity. The motion of the sound source, along with the rotation of the bi-microphone array, results in a sinusoidal ICTD signal with time-varying amplitude and phase. Four state-space models are employed to develop EKFs that identify the instantaneous amplitude and phase of the signal. Observability analysis of the four state-space models is conducted to reveal singularities. A method based on the Hilbert transform is also developed, which compares the analytic signal of the true ICTD signal with that of a virtual signal having zero elevation and azimuth angles. A moving-average filter is then applied to reduce the noise and the artifacts at the beginning and end of the estimates. The effectiveness of the proposed methods is tested, and comparison studies are conducted in simulation.
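One possible EKF formulation for the Chapter 6 tracking problem — a sketch under assumed parameters, not the dissertation's actual state-space models — keeps a random-walk state x = [A, ψ] with measurement z = A·cos(ωt − ψ), linearizing the measurement at each step to track the instantaneous amplitude and phase.

```python
import numpy as np

def ekf_amp_phase(z, t, omega, Q=np.diag([1e-9, 1e-6]), R=4e-10,
                  x0=(5e-4, 0.0)):
    """EKF with random-walk state x = [A, psi] and measurement model
    z = A*cos(omega*t - psi): tracks the instantaneous amplitude and
    phase of a sinusoidal ICTD signal."""
    x = np.array(x0, dtype=float)
    P = np.diag([1e-6, 1.0])                    # initial uncertainty
    est = np.zeros((len(z), 2))
    for k, (zk, tk) in enumerate(zip(z, t)):
        P = P + Q                               # predict: identity dynamics
        A, psi = x
        h = A * np.cos(omega * tk - psi)        # predicted measurement
        H = np.array([np.cos(omega * tk - psi),      # dh/dA
                      A * np.sin(omega * tk - psi)]) # dh/dpsi
        S = H @ P @ H + R
        K = P @ H / S                           # Kalman gain
        x = x + K * (zk - h)                    # update
        P = P - np.outer(K, H @ P)
        est[k] = x
    return est

# Synthetic ICTD from a 2 Hz rotating array, fixed amplitude and phase
rng = np.random.default_rng(2)
omega = 2 * np.pi * 2.0
t = np.arange(0.0, 10.0, 1.0 / 200.0)
z = 8e-4 * np.cos(omega * t - 0.7) + 2e-5 * rng.standard_normal(t.size)
est = ekf_amp_phase(z, t, omega)
```

With a moving source, A and ψ drift over time and the same filter tracks them, which is where the random-walk process model earns its keep; the observability singularities the dissertation analyzes show up here as the gain collapsing when the sinusoid's amplitude vanishes.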
