A spectrogram of the sound the car's microphone picks up when the driver is speaking [left]. A system developed using machine learning lets through only the person's voice [right]. Photo: John Boyd
Mitsubishi Quiets Car Noise With Machine Learning
Mitsubishi Electric is claiming a breakthrough with its development of noise suppression technology to aid hands-free phone calls in the car and elsewhere. The technology improves call quality by filtering out almost all of the unwanted ambient sound that enters a far-field microphone while the user is speaking.
Noises removed include rapidly changing sounds that were, until now, difficult to deal with, such as passing cars, windshield wipers, and turn signals. In tests, the system cancelled out 96 percent of the ambient noise, compared with just 78 percent for conventional methods.
“Previously, only stationary noises such as road noise or the sound of the air conditioner were really dealt with, because the noise mixed with the speech could be easily predicted from past observations when the driver was not talking,” says Jonathan Le Roux, a principal researcher at Mitsubishi Electric Research Labs in Cambridge, Mass. “It is much harder to reduce noise when its characteristics are largely unpredictable.”
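The conventional approach Le Roux describes can be illustrated with a toy spectral-subtraction sketch. This is an assumption for illustration only: the article does not say which conventional method was used, the function names are hypothetical, the magnitudes are made up, and treating magnitudes as simply additive is a simplification.

```python
def estimate_noise(noise_only_frames):
    """Average the magnitude in each frequency bin over speech-free frames."""
    n_frames = len(noise_only_frames)
    n_bins = len(noise_only_frames[0])
    return [sum(frame[k] for frame in noise_only_frames) / n_frames
            for k in range(n_bins)]


def spectral_subtract(frame, noise_estimate):
    """Subtract the predicted noise from each bin, never going below zero."""
    return [max(m - n, 0.0) for m, n in zip(frame, noise_estimate)]


# Made-up magnitudes for four frequency bins (hypothetical data).
noise_only = [[1.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.5, 0.5]]  # driver silent
noise_est = estimate_noise(noise_only)

speech = [0.0, 3.0, 2.0, 0.0]
steady_mix = [s + n for s, n in zip(speech, noise_est)]

# A short click (say, a turn signal) that the past frames never contained.
click = [0.0, 0.0, 0.0, 4.0]
click_mix = [m + c for m, c in zip(steady_mix, click)]

steady_out = spectral_subtract(steady_mix, noise_est)
click_out = spectral_subtract(click_mix, noise_est)

print(steady_out)  # the steady noise is removed
print(click_out)   # the click in the last bin survives
```

Because the noise estimate comes only from past speech-free frames, the steady component is removed but the unseen click passes straight through, which is the limitation the quote points to.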
To better distinguish human speech from other sounds, the researchers are developing speech-enhancement systems that learn to exploit spectral and dynamic characteristics of human speech such as pitch and timbre.
These systems employ machine-learning methods based on deep neural networks. (Facebook’s AI chief, Yann LeCun, has explained deep neural networks for IEEE Spectrum.) The networks are trained on massive amounts of noise-contaminated speech data to distinguish and suppress the noise while retaining the clean speech. The systems have millions of parameters that are optimized during training in order to reduce the difference between the output of the system and the original clean speech.
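The training idea, adjusting parameters to shrink the gap between the system's output and the clean speech, can be sketched at toy scale. The example below is not a deep neural network and is not Mitsubishi's method: it assumes a hypothetical model with one gain per frequency bin (two parameters rather than millions), made-up data, and plain gradient descent on the squared error.

```python
def mean_squared_error(outputs, targets):
    """Average squared difference between system output and clean speech."""
    total, count = 0.0, 0
    for out_frame, tgt_frame in zip(outputs, targets):
        for o, t in zip(out_frame, tgt_frame):
            total += (o - t) ** 2
            count += 1
    return total / count


def apply_gains(gains, frames):
    """Toy 'system': multiply each frequency bin by its learned gain."""
    return [[g * x for g, x in zip(gains, frame)] for frame in frames]


def train_gains(noisy_frames, clean_frames, steps=200, learning_rate=0.01):
    n_bins = len(noisy_frames[0])
    gains = [1.0] * n_bins  # start by passing everything through unchanged
    for _ in range(steps):
        grads = [0.0] * n_bins
        for noisy, clean in zip(noisy_frames, clean_frames):
            for k in range(n_bins):
                error = gains[k] * noisy[k] - clean[k]
                # Gradient of the per-bin squared error, averaged over frames.
                grads[k] += 2.0 * error * noisy[k] / len(noisy_frames)
        gains = [g - learning_rate * d for g, d in zip(gains, grads)]
    return gains


# Hypothetical training pairs: bin 0 is noisy, bin 1 is already clean.
noisy_frames = [[2.0, 4.0], [4.0, 2.0]]
clean_frames = [[1.0, 4.0], [2.0, 2.0]]

loss_before = mean_squared_error(noisy_frames, clean_frames)
gains = train_gains(noisy_frames, clean_frames)
loss_after = mean_squared_error(apply_gains(gains, noisy_frames), clean_frames)

print(loss_before, loss_after)
print(gains)
```

A real system replaces the two fixed gains with a network whose output depends on the input, but the loop has the same shape: compute the output, compare it with the clean target, and nudge the parameters to reduce the difference.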
In order to reconstruct the clean speech, the neural networks construct special time-varying filters on the fly and apply them to the contaminated speech.
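One common way to picture a time-varying filter is as a grid of gains, one per time frame and frequency bin, multiplied into the noisy spectrogram. The sketch below assumes that masking view, which the article does not confirm. It also cheats by computing the gains from known speech and noise values (an "oracle" ratio mask with made-up data and hypothetical function names); a deployed network would have to predict the gains from the noisy input alone.

```python
def ratio_mask(speech, noise, eps=1e-12):
    """One gain in [0, 1] per time-frequency cell: speech / (speech + noise)."""
    return [[s / (s + n + eps) for s, n in zip(s_row, n_row)]
            for s_row, n_row in zip(speech, noise)]


def apply_mask(noisy, mask):
    """Scale every time-frequency cell of the noisy input by its gain."""
    return [[g * x for g, x in zip(g_row, x_row)]
            for g_row, x_row in zip(mask, noisy)]


# Rows are successive time frames, columns are frequency bins (made-up values).
speech = [[0.0, 3.0, 2.0], [0.0, 3.0, 2.0], [0.0, 3.0, 2.0]]
noise = [[1.0, 1.0, 0.0], [1.0, 1.0, 6.0], [1.0, 1.0, 0.0]]  # click in frame 1
noisy = [[s + n for s, n in zip(s_row, n_row)]
         for s_row, n_row in zip(speech, noise)]

mask = ratio_mask(speech, noise)
enhanced = apply_mask(noisy, mask)

for frame_mask in mask:
    print([round(g, 2) for g in frame_mask])
```

The gain for the last bin drops only in the frame that contains the click and recovers in the next one, which is what "constructed on the fly" amounts to in this simplified picture.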
“The frequency contents of the speech and the noise can be intricately intermingled, and change abruptly,” says Le Roux. “Transient noises may last only tens of milliseconds, while speech changes from one phoneme to another every 100 to 200 milliseconds. So to effectively remove the noise, the filter needs to have a fine frequency resolution and be updated very rapidly.”
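The tension between fine frequency resolution and rapid updates can be made concrete with some arithmetic. The sketch assumes a short-time Fourier transform front end and a 16 kHz sampling rate; neither is stated in the article, and the window and hop sizes are illustrative choices only.

```python
def stft_resolution(sample_rate_hz, window_samples, hop_samples):
    """Frequency spacing and timing of a short-time Fourier analysis."""
    return {
        "bin_spacing_hz": sample_rate_hz / window_samples,
        "window_ms": 1000.0 * window_samples / sample_rate_hz,
        "update_ms": 1000.0 * hop_samples / sample_rate_hz,
    }


# Assumed 16 kHz sampling rate; window and hop sizes are hypothetical.
long_window = stft_resolution(16000, 2048, 512)
short_window = stft_resolution(16000, 512, 128)

print(long_window)   # finer frequency bins, but each frame spans 128 ms
print(short_window)  # coarser bins, but each frame spans 32 ms
```

A 128 ms window is as long as a phoneme and would smear a transient lasting tens of milliseconds, while a 32 ms window updated every 8 ms tracks such a transient at the cost of coarser frequency bins.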
In tests, Le Roux says they were able to cancel out 96 percent of the ambient noise compared to just 78 percent achieved by conventional methods.
This technology fundamentally differs in approach and aim from active noise-cancellation methods such as those in anti-noise headphones, which try to physically remove ambient noise in a user’s environment. Examples of these methods applied in the car are Bose’s engine-noise cancellation and Harman’s road noise suppression.
Mitsubishi’s goal is to eliminate the noise picked up by the microphone while the user is speaking during telephone calls. Although active noise-cancellation methods could indirectly help with this problem by reducing noise in the cabin, Mitsubishi says they can only suppress low-frequency noise.
“We want to make the driver’s speech more clear and intelligible to the person on the other end of the call by cancelling as much noise as possible, not just low-frequency noise,” says Le Roux. “Our technology will also be useful for hands-free command and control situations, such as when using Apple’s Siri or Google’s Voice Search in smart phones, as well as in call centers that use speech recognition to handle common requests.”
Mitsubishi plans to launch the technology in 2018 in its line of automotive navigation and communication devices.
ORIGINAL: IEEE Spectrum
By John Boyd
Posted 9 Mar 2015