Cancelling Ambient Noise in Telephony
- On February 13, 2011
What is ambient noise?
Ambient noise is any sound that is picked up by the microphone but is not part of the conversation. Usually, every sound that does not originate from the speaker's vocal tract is considered ambient noise.
Types of ambient noise
Stationary Noise – a steady sound whose acoustic characteristics stay similar over a long period of time. A good example of stationary noise is the broadband, roughly white noise generated by a computer fan.
Non-Stationary Noise – all other sounds. Most ambient noise is non-stationary. Non-stationary noises are everywhere and you usually cannot avoid them. Even in a quiet office, you generate non-stationary noise when you type on your keyboard or close a door.
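The difference between the two types can be illustrated with a small sketch. It is not a detector used by any real product; it only measures how much the energy of short frames varies over time, assuming synthetic signals and a hypothetical frame length of 160 samples. Energy alone ignores the spectral shape of the sound, so treat it as a rough indicator only.

```python
import random
import statistics

def frame_energies(samples, frame_len=160):
    """Mean energy of each consecutive, non-overlapping frame."""
    return [sum(v * v for v in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def energy_variation(samples, frame_len=160):
    """Coefficient of variation of frame energy.

    Low values suggest a steady (stationary) sound; high values
    suggest that the sound changes over time.
    """
    energies = frame_energies(samples, frame_len)
    return statistics.pstdev(energies) / statistics.mean(energies)

rng = random.Random(1)

# Synthetic stand-in for a fan: steady low-level hiss.
fan = [rng.gauss(0, 0.1) for _ in range(8000)]

# Same hiss plus one short loud burst, a stand-in for a closing door.
door = list(fan)
for i in range(4000, 4160):
    door[i] += rng.gauss(0, 1.0)

fan_variation = energy_variation(fan)
door_variation = energy_variation(door)
```

In this toy setup the steady hiss yields a much smaller variation value than the signal with the burst, which is the property that makes stationary noise easier to track.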
Noise cancellation techniques – basics
Standard noise cancellation techniques use sophisticated signal processing algorithms to improve the SNR (signal-to-noise ratio) while adding minimal distortion to the voice. These techniques can easily detect stationary noise because its acoustic characteristics are constant. The problem starts with non-stationary noise. How can the algorithm decide whether a sound is non-stationary noise or a legitimate part of the conversation? In most cases, the noise reduction algorithm must be given additional information in order to distinguish between the voice of the speaker and the non-stationary ambient noise.
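To make the SNR figure concrete, here is a minimal sketch that computes it in decibels as ten times the base-10 logarithm of the ratio between signal power and noise power. The signals are synthetic sine waves chosen only for illustration, and the function names are assumptions rather than part of any library.

```python
import math

def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(power(signal) / power(noise))

# Synthetic stand-ins for the voice and the ambient noise.
voice = [math.sin(2 * math.pi * i / 50) for i in range(1000)]
noise = [0.1 * math.sin(2 * math.pi * i / 7 + 1.0) for i in range(1000)]

# A noise reducer that halves the noise amplitude without touching the voice.
quieter_noise = [0.5 * v for v in noise]

improvement_db = snr_db(voice, quieter_noise) - snr_db(voice, noise)
```

Halving the noise amplitude cuts its power to a quarter, so `improvement_db` comes out at about 6 dB. A real algorithm also has to keep the voice undistorted, which this figure does not capture.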
Adding Information using dedicated hardware
A common practice is to use special hardware that lets the noise cancellation algorithm distinguish between the speaker and the ambient noise. For example:
- Adding a second microphone (reference microphone) or even an array of microphones aimed at capturing ambient noise from multiple directions.
- Placing a sensor in physical contact with your face.
- Using a directional microphone that focuses on the location of your mouth.
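As an illustration of how a reference microphone can help, the sketch below uses a normalized LMS (NLMS) adaptive filter, a textbook technique that is swapped in here as an example and is not necessarily what any particular device uses. It assumes the reference microphone hears only the noise, and that the noise reaches the primary microphone through a short hypothetical acoustic path. All signals, names, and parameter values are made up for the demonstration.

```python
import math
import random

def nlms_cancel(primary, reference, taps=8, mu=0.1, eps=1e-8):
    """Remove from `primary` whatever is predictable from `reference`."""
    weights = [0.0] * taps
    recent = [0.0] * taps  # latest reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        recent = [x] + recent[:-1]
        # Estimate of the noise as it appears at the primary microphone.
        estimate = sum(w * r for w, r in zip(weights, recent))
        error = d - estimate  # this is the cleaned output sample
        norm = eps + sum(r * r for r in recent)
        weights = [w + mu * error * r / norm
                   for w, r in zip(weights, recent)]
        cleaned.append(error)
    return cleaned

def power(samples):
    return sum(v * v for v in samples) / len(samples)

rng = random.Random(0)
n = 4000
noise = [rng.gauss(0, 1) for _ in range(n)]  # what the reference mic hears

# Hypothetical acoustic path from the noise source to the primary mic.
path = [0.6, 0.3, -0.1]
noise_at_primary = [
    sum(path[k] * noise[i - k] for k in range(len(path)) if i - k >= 0)
    for i in range(n)
]

# A slow sine wave stands in for the speaker's voice.
speech = [0.5 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
primary = [s + v for s, v in zip(speech, noise_at_primary)]

cleaned = nlms_cancel(primary, noise)

# Compare the leftover noise in the second half, after the filter has adapted.
tail = slice(n // 2, n)
residual_before = power([p - s for p, s in zip(primary[tail], speech[tail])])
residual_after = power([c - s for c, s in zip(cleaned[tail], speech[tail])])
```

In this idealized setup the leftover noise after adaptation is far smaller than before. In practice the reference microphone also picks up some of the voice, which is one reason real designs are more involved.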
Adding Information using prior knowledge
Most communication devices today, and especially mobile phones, are personal devices that are used, most of the time, by a single person. A personalized noise reduction algorithm learns the voice of the owner of the device and uses this information to distinguish between the owner's voice and the ambient noise. This technique has two main advantages over the previous hardware-based techniques:
1. Lower cost. A software solution is typically much cheaper than additional hardware.
2. No physical limitation. Unlike the hardware-based solutions, which assume a specific setup (e.g. your mouth facing the microphone or your face touching the sensor), personalized noise reduction is free from these limitations.
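The article does not describe how the owner's voice is learned, so the following is only a toy sketch of the general idea: store a spectral profile during enrollment, then score new frames by their similarity to that profile. The probe frequencies, the sample rate, the synthetic "voices", and all function names are assumptions made for the example; real systems use far richer voice models.

```python
import math

RATE = 8000                                  # assumed sample rate in Hz
PROBES = [100 * k for k in range(1, 11)]     # probe frequencies, 100..1000 Hz

def band_energies(frame, rate=RATE, freqs=PROBES):
    """Energy of `frame` at each probe frequency (single-bin DFT)."""
    energies = []
    for f in freqs:
        re = sum(v * math.cos(2 * math.pi * f * i / rate)
                 for i, v in enumerate(frame))
        im = sum(v * math.sin(2 * math.pi * f * i / rate)
                 for i, v in enumerate(frame))
        energies.append(re * re + im * im)
    return energies

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def tone(freqs, n=400):
    """Sum of sine waves, a crude stand-in for a voiced sound."""
    return [sum(math.sin(2 * math.pi * f * i / RATE) for f in freqs)
            for i in range(n)]

# Enrollment: learn the owner's profile from a sample of the owner's voice.
owner_profile = band_energies(tone([200, 400, 600]))

# Later: score incoming frames against the stored profile.
owner_frame = [0.8 * v for v in tone([200, 400, 600])]  # owner, quieter
other_frame = tone([300, 900])                          # some other sound

score_owner = cosine_similarity(owner_profile, band_energies(owner_frame))
score_other = cosine_similarity(owner_profile, band_energies(other_frame))
```

A frame that resembles the enrolled profile scores close to 1 and an unrelated sound scores close to 0, so a noise reducer could keep the former and attenuate the latter. No extra microphone or fixed position of the device is needed for this comparison.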