AI, can you identify echo?
- On March 15, 2021
- AEC, AI, echo cancellation
Acoustic echo in phone calls occurs when audio played on the loudspeaker is picked up by the microphone and transmitted back to the far-end talker. As a result, the far-end talker hears their own voice after a delay. Echo is a significant disturbance in a phone call: it can lead to a howling effect and may even cause the call to be terminated.
To prevent the echo from being played back to the far-end talker, an echo cancellation algorithm is needed. Its purpose is to remove the echo from the audio signal picked up by the microphone, so that clean audio can be safely transmitted to the far end.
Echo cancellation is a mathematical algorithm that has to take many parameters into account. In a nutshell, the algorithm compares the audio played on the loudspeaker with the audio captured by the microphone in order to identify the mathematical transformation from the played audio to the captured echo. Once this transformation is identified, the echo can be estimated and subtracted from the microphone signal. For more details about the internals of echo cancellation algorithms, you are invited to take a look at the blog series titled "Echo cancellation – art or science".
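To make the idea of "identifying the transformation" more concrete, here is a minimal sketch using a normalized least-mean-squares (NLMS) adaptive filter, a common textbook technique. It is not necessarily the algorithm used in any particular product. The function name, the parameters (taps, mu, eps) and the synthetic echo path are all assumptions made for illustration, and the demo contains no near-end speech or noise.

```python
import random


def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of far_end from mic.

    Hypothetical helper for illustration: taps is the assumed length of the
    echo path, mu the adaptation step size, eps a small regularization term.
    """
    weights = [0.0] * taps   # running estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    cleaned = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        # Predict the echo from the audio that was played on the loudspeaker.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # Whatever is left after removing the predicted echo.
        error = d - echo_estimate
        # Normalize the update by the far-end power so the step stays stable.
        power = sum(h * h for h in history) + eps
        step = mu * error / power
        weights = [w + step * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights


def energy(samples):
    return sum(s * s for s in samples)


# Synthetic demo: the "room" is an assumed short delay-and-attenuate filter.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.0, 0.0, 0.6, 0.3, -0.1]  # hypothetical echo path
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]

cleaned, weights = nlms_echo_canceller(far_end, mic)
echo_energy = energy(mic[-500:])
residual_energy = energy(cleaned[-500:])
print("echo energy:", echo_energy)
print("residual energy after cancellation:", residual_energy)
```

Note that the function needs both signals: the far-end audio is the reference from which the echo is predicted, and the microphone signal is what the prediction is subtracted from. Real cancellers also have to cope with near-end speech, noise, much longer echo paths and non-linear loudspeakers, none of which this sketch attempts.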
Given that analytic echo cancellation is complex, one might ask: can we implement echo cancellation simply by having Artificial Intelligence (AI) listen to the captured audio? That is a very good question, since we know that AI can sometimes be effective at solving problems that are too complex for conventional algorithms. This brings us to the title of this post: "AI, can you identify echo?" When you listen to an audio track captured from the microphone, can you identify when the signal is the near-end talker and when it is the echo of the far end? The answer is: sometimes. Sometimes the echo is distorted and can be easily identified, but sometimes the echo is very clean and sounds just like the near-end talker. Since the answer is not conclusive, listening to the microphone alone does not always let you distinguish the far-end echo from the near-end talker. As a result, no AI can do it either, and therefore AI cannot remove the echo by listening to the microphone track only. The only way to do proper echo cancellation is by comparing both audio tracks, and for this job the analytic approach has proven to be the most effective and efficient one.