Echo Cancellation and Silence
- On March 11, 2015
- Acoustic Echo Cancellation, AEC
Background
Following our previous post on synchronization, this post covers one specific consequence of unsynchronized input: the speakers end up playing silence that is not monitored and is therefore unknown to the AEC.
Case Study
The following graph shows the number of mic frames minus the number of spk frames over time.
As you can see, there is a drift caused by a constant loss of spk frames. There may be a good reason for the loss, such as network problems, but if it is not controlled it will cause problems for the AEC. The logic is simple: during a call, the same number of audio packets is captured from the mic on both sides (for example, 1,000 packets), and the speakers on both sides should play 1,000 packets' worth of audio. If you give the speakers only 980 packets of audio data, they will play silence for the time that the 20 missing packets would have filled. As a result, the speakers "play silence" from time to time. The AEC does not know when this happens, and that prevents it from providing high-quality echo cancellation. In other words, the AEC needs to know exactly what is being played on the speakers, so we must not let the speakers run out of data and play silence at random moments.
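To make the arithmetic concrete, here is a minimal Python sketch of the drift described above. The function names, the 10 ms packet duration, and the per-interval packet counts are assumptions chosen for illustration; they do not come from a real call or from any particular audio API.

```python
from itertools import accumulate


def drift_series(mic_frames, spk_frames):
    """Cumulative (mic frames - spk frames) after each interval.

    A steadily growing series is the kind of drift the graph shows.
    """
    return list(accumulate(m - s for m, s in zip(mic_frames, spk_frames)))


def unmonitored_silence_ms(mic_packets, spk_packets, packet_ms=10):
    """Silence the speakers play on their own when packets are missing.

    packet_ms is an assumed packet duration, not a value from the post.
    """
    missing = max(mic_packets - spk_packets, 0)
    return missing * packet_ms


# Ten intervals of 100 packets each; the speaker side loses 2 per interval.
mic = [100] * 10
spk = [98] * 10

print(drift_series(mic, spk)[-1])         # 20 packets of accumulated drift
print(unmonitored_silence_ms(1000, 980))  # 200 ms of silence unknown to the AEC
```

With these assumed numbers, the 20 missing packets add up to 200 ms of silence scattered through the call at moments the AEC cannot predict.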
The Solution
Always monitor the speaker buffers and make sure they are never empty. If they are about to run out of data, inject a short period of silence yourself, so that your application, and therefore the AEC, knows exactly what is being played on the speakers at any given point in time.
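One way to picture this is a small feeder that sits between the network and the speakers. The sketch below is only an illustration: `SpeakerFeeder`, its method names, and the frame size (10 ms of 16 kHz, 16-bit mono audio, so 320 bytes) are all hypothetical, and the actual speaker output and AEC reference input are left out.

```python
from collections import deque


class SpeakerFeeder:
    """Hypothetical helper that never lets the speaker side run dry."""

    def __init__(self, frame_bytes=320):
        # Assumed frame size: 160 samples * 2 bytes = 10 ms at 16 kHz mono.
        self.frame_bytes = frame_bytes
        self.queue = deque()
        self.injected = 0  # number of silence frames the application inserted

    def push(self, frame):
        """Queue one frame of received audio for playback."""
        if len(frame) != self.frame_bytes:
            raise ValueError("unexpected frame size")
        self.queue.append(frame)

    def next_frame(self):
        """Return the frame to play now.

        If no audio is queued, return an all-zero frame and count it, so the
        silence is chosen by the application instead of by the audio device.
        The caller should hand this same frame to the speaker and to the AEC
        as its reference signal (both are outside this sketch).
        """
        if self.queue:
            return self.queue.popleft()
        self.injected += 1
        return bytes(self.frame_bytes)


feeder = SpeakerFeeder()
feeder.push(b"\x01" * 320)                         # one real frame arrives
played = [feeder.next_frame() for _ in range(3)]   # the device asks for three
print(feeder.injected)                             # 2
```

The point of the design is that every frame passes through `next_frame`, whether it is real audio or injected silence, so the reference signal given to the AEC always matches what the speakers actually play.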