Machine Learning with Feature Extractions for Regression Estimation of Binaural Sound Source Localization

Abstract
  • Binaural sound source localization determines the position of a sound source from two sensors (microphones) that mimic the human auditory system. Many audio processing systems in daily work and life rely on sound source localization, such as speech enhancement, speech recognition, and human-robot interaction. However, accurate localization remains difficult to ensure under adverse acoustic conditions. This thesis proposes machine learning with feature extraction to estimate sound source location by processing and analyzing data from public head-related transfer function (HRTF) databases. The two proposed methods are a wavelet scattering long short-term memory (LSTM) network and a wavelet scattering convolutional neural network (CNN). Both methods are studied as classification and regression approaches across different scenarios. The results demonstrate that the proposed methods achieve excellent performance in multiple noisy environments compared with recent literature, especially for regression-based binaural sound source localization.

Rights Notes
  • Copyright © 2022 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.
Date Created
  • 2022

