Time manipulation and its impact on echo cancellation
- On December 12, 2016
- echo cancellation, time manipulation, WSOLA
Time manipulation is a well-known technique in audio processing. It can either compress the audio (speeding it up) or expand the audio (slowing it down). This is usually done without altering the pitch, so the fundamental signature of the voice is preserved.
Why manipulate time?
In real-time voice communication, there are usually two cases where time manipulation is applied:
1. When a buffer overrun is expected, the audio may be sped up.
2. When a buffer underrun is expected, or when packets are lost in the network, the audio may be slowed down.
While these techniques try to keep voice quality as high as possible, they can do damage if they get in the way of the echo cancellation.
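The two cases above come down to a decision based on how much audio is waiting in the playout buffer. The following is a minimal sketch of such a decision. The function name `choose_speed_factor`, the target level, the tolerance, and the 15% limit are illustrative assumptions, not values taken from any particular product.

```python
def choose_speed_factor(buffered_ms: float,
                        target_ms: float = 60.0,
                        tolerance_ms: float = 20.0,
                        max_change: float = 0.15) -> float:
    """Return a playback speed factor for the next audio frame.

    > 1.0 compresses the audio (speed-up), < 1.0 expands it (slow-down),
    1.0 leaves it untouched. All thresholds are hypothetical defaults.
    """
    error_ms = buffered_ms - target_ms
    if abs(error_ms) <= tolerance_ms:
        return 1.0  # buffer level is healthy, no manipulation needed

    # Scale the correction with the size of the error, then clamp it so the
    # change in speed stays small.
    change = 0.1 * error_ms / target_ms
    change = max(-max_change, min(max_change, change))
    return 1.0 + change
```

A buffer that is filling up (risk of overrun) yields a factor above 1.0, and a buffer that is draining (risk of underrun) yields a factor below 1.0. The factor would then be handed to whatever time-scaling algorithm the system uses.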
Conflict with echo cancellation
As discussed in our previous posts on echo cancellation, the basic principle is:
1. Receive two streaming signals: (1) the reference signal, which is the audio that is about to be rendered to the speakers, and (2) the captured audio, which is recorded by the microphone and contains the echo.
2. Find the correlation between the reference signal and the captured audio.
3. Use the detected correlation to cancel the echo.
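The three steps above can be illustrated with a deliberately simplified sketch. It assumes the echo is a single delayed, attenuated copy of the reference, which real rooms never are (production echo cancellers use adaptive filters with many taps). The function names and the synthetic signals are assumptions made for the example.

```python
import random


def estimate_delay(reference, captured, max_delay):
    """Step 2: return the lag (in samples) with the highest cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        score = sum(reference[n - lag] * captured[n]
                    for n in range(lag, len(captured)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def cancel_echo(reference, captured, lag):
    """Step 3: subtract the best-fitting scaled, delayed copy of the reference."""
    pairs = [(reference[n - lag], captured[n])
             for n in range(lag, len(captured))]
    energy = sum(r * r for r, _ in pairs)
    # Least-squares gain for a single-tap echo model.
    gain = sum(r * c for r, c in pairs) / energy if energy else 0.0
    return captured[:lag] + [c - gain * r for r, c in pairs]


# Step 1: two synthetic streams standing in for the real signals.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_delay, echo_gain = 40, 0.6
captured = [0.0] * true_delay + [echo_gain * s for s in reference[:-true_delay]]

lag = estimate_delay(reference, captured, max_delay=100)
residual = cancel_echo(reference, captured, lag)
```

Because the captured signal here really is a delayed copy of the reference, the correlation peak is sharp and the residual is close to silence. That is exactly the property that time manipulation can break.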
If time manipulation is applied to the audio just before it is rendered to the speakers, and the reference signal is not modified to match, the echo canceller works with a reference that is not the signal actually played on the speakers.
If time manipulation is applied to the captured audio before the echo canceller analyzes it, the echo canceller does not see the original signal that the microphone captured.
In both cases, the echo canceller may fail to find a good correlation between the reference signal and the captured audio, and will therefore be unable to cancel the echo properly.
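The loss of correlation can be shown with a small experiment. The sketch below uses a crude stand-in for time compression that simply discards a few samples out of every block; it ignores the pitch-preserving part of a real algorithm, but it produces the same kind of drifting misalignment. All names and parameters are assumptions made for the example.

```python
import math
import random


def peak_normalized_correlation(reference, captured, max_lag):
    """Return the highest normalized correlation over lags 0..max_lag."""
    best = 0.0
    for lag in range(max_lag + 1):
        n = min(len(reference), len(captured) - lag)
        ref = reference[:n]
        cap = captured[lag:lag + n]
        num = sum(r * c for r, c in zip(ref, cap))
        den = math.sqrt(sum(r * r for r in ref) * sum(c * c for c in cap))
        if den > 0.0:
            best = max(best, abs(num) / den)
    return best


def compress_by_dropping(signal, period, drop):
    """Crude time compression: discard `drop` samples out of every `period`."""
    out = []
    for start in range(0, len(signal), period):
        out.extend(signal[start:start + period - drop])
    return out


rng = random.Random(1)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_delay = 30  # hypothetical speaker-to-microphone delay in samples

# Case A: the speakers play exactly what the echo canceller was given.
captured_same = [0.0] * echo_delay + reference
# Case B: the audio was compressed after the reference was taken.
played = compress_by_dropping(reference, period=200, drop=20)
captured_changed = [0.0] * echo_delay + played

matched = peak_normalized_correlation(reference, captured_same, max_lag=100)
mismatched = peak_normalized_correlation(reference, captured_changed, max_lag=100)
```

In case A the peak is essentially 1.0. In case B no single lag lines up more than a small part of the signal, so the peak collapses and the echo canceller has very little to work with.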
If your system may apply time manipulation to the audio, make sure that manipulation is synchronized with your echo cancellation engine, so that the echo canceller can still find the correlation between the reference signal and the captured signal.
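One simple way to achieve this on the render side is to take the reference signal after the time manipulation stage rather than before it. The sketch below shows that ordering. `RenderPath` and its three callbacks are hypothetical names introduced for illustration and do not correspond to any specific echo cancellation API.

```python
from typing import Callable, List

Frame = List[float]


class RenderPath:
    """Hypothetical render pipeline that keeps the echo canceller in sync."""

    def __init__(self,
                 time_scaler: Callable[[Frame], Frame],
                 feed_reference: Callable[[Frame], None],
                 play: Callable[[Frame], None]) -> None:
        self._time_scaler = time_scaler
        self._feed_reference = feed_reference
        self._play = play

    def render(self, decoded: Frame) -> None:
        # Compress or expand first...
        adjusted = self._time_scaler(decoded)
        # ...then hand the echo canceller exactly the frame that is played.
        self._feed_reference(adjusted)
        self._play(adjusted)
```

On the capture side the same idea applies in reverse: the microphone signal would be passed to the echo canceller first, and any time manipulation would be applied only to the echo canceller's output.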