Interactive Audio Optimization

Standard optimization procedures for audio enhancement require the definition of an objective error measure to be minimized (or maximized). This creates an 'open loop' in which the user is not involved in the optimization process, even though satisfying the user's preferences should be the final objective of the system.

This can be solved by considering interactive optimization techniques, which are guided by the user (e.g. through ratings of the current solution) so as to reach a solution with the highest possible quality in subjective terms. The simplest approach of this kind is the so-called interactive genetic algorithm (IGA) [1], where the user's ratings are used in place of the standard fitness function.
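As a rough illustration of the idea, the following is a minimal sketch of a genetic algorithm whose fitness values come from a rating callback rather than from an objective error measure. The function name `interactive_ga`, its parameters, and the specific operators (tournament selection, uniform crossover, Gaussian mutation) are assumptions chosen for brevity; they are not taken from the referenced code or from [1].

```python
import random
from typing import Callable, List, Sequence, Tuple

Candidate = List[float]


def interactive_ga(
    rate: Callable[[Candidate], float],
    bounds: Sequence[Tuple[float, float]],
    pop_size: int = 6,
    generations: int = 5,
    mutation_scale: float = 0.1,
    seed: int = 0,
) -> Candidate:
    """Return the best-rated candidate seen (hypothetical helper).

    `rate` stands in for the user: it receives a parameter vector and
    returns a rating, where higher means better.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best, best_score = None, float("-inf")

    for _ in range(generations):
        # The only place the "fitness" is obtained: one rating per candidate.
        scores = [rate(c) for c in pop]
        for cand, score in zip(pop, scores):
            if score > best_score:
                best, best_score = list(cand), score

        def pick() -> Candidate:
            # Binary tournament: the better-rated of two random candidates.
            a, b = rng.sample(range(pop_size), 2)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            child = []
            for (lo, hi), g1, g2 in zip(bounds, p1, p2):
                gene = g1 if rng.random() < 0.5 else g2          # uniform crossover
                gene += rng.gauss(0.0, mutation_scale * (hi - lo))  # mutation
                child.append(min(hi, max(lo, gene)))             # keep in bounds
            children.append(child)
        pop = children

    return best
```

In an actual interactive session, `rate` would play the processed audio for a candidate and ask the listener for a score; for offline experiments it can be replaced by a simulated user. The small default population and generation count reflect the fact that every rating costs the user time.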

[Figure: Open loop optimization (standard)]

[Figure: Closed loop optimization (interactive)]

Use case: interactive acoustic echo cancellation

The code for this example is available under an open-source license: https://bitbucket.org/ispamm/iga-audio/

Acoustic echo cancellation (AEC) is the problem of removing unwanted echo effects (e.g. due to distortions in speech equipment) from a signal [2]. An echo canceller should remove the echo while preserving the perceptual quality of the speech. In the standard setting, a far-end signal x[n] is reproduced by a loudspeaker and reacquired by a microphone, producing the so-called echo signal. At the same time, the near-end signal s[n] is also acquired by the microphone, possibly with some superimposed background noise υ[n]:

Standard AEC setting.
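In symbols, the setting can be written as follows. This is a sketch under the assumption of a linear echo path of finite length M; the symbols d[n] (microphone signal), \mathbf{h} (true impulse response), \hat{\mathbf{h}} (its estimate) and e[n] (canceller output) are introduced here for illustration and do not appear in the original text.

```latex
\begin{align}
\mathbf{x}_n &= \bigl[x[n],\, x[n-1],\, \dots,\, x[n-M+1]\bigr]^{\mathsf T}, \\
d[n] &= \mathbf{h}^{\mathsf T}\mathbf{x}_n + s[n] + \upsilon[n], \\
e[n] &= d[n] - \hat{\mathbf{h}}^{\mathsf T}\mathbf{x}_n .
\end{align}
```

When the estimate \hat{\mathbf{h}} is close to \mathbf{h}, the echo term cancels and e[n] is approximately s[n] + \upsilon[n], i.e. the near-end signal plus background noise.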


The task is to estimate the acoustic impulse response of the room, so that the echo can be removed from the microphone signal while the near-end signal is preserved. This can be achieved with standard linear filters, which however depend on a number of free parameters. With an IGA, these parameters can be adapted subjectively so that they match the user's preference [3,4].
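To make the notion of "free parameters" concrete, here is a minimal sketch of one common linear adaptive filter, the normalized LMS (NLMS) algorithm, used as an echo canceller. The choice of NLMS is an assumption made for illustration, and the function name and parameters (`num_taps`, `mu`, `delta`) are hypothetical; the filters and parameters actually used in [3,4] may differ.

```python
import numpy as np


def nlms_echo_canceller(x, d, num_taps=32, mu=0.5, delta=1e-6):
    """Hypothetical NLMS echo canceller.

    x : far-end signal sent to the loudspeaker
    d : microphone signal (echo + near-end signal + noise)
    Returns (e, w): the echo-reduced output and the final filter weights,
    which act as an estimate of the room impulse response.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w = np.zeros(num_taps)      # impulse response estimate
    buf = np.zeros(num_taps)    # most recent far-end samples, newest first
    e = np.zeros(len(d))

    for n in range(len(d)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        y = w @ buf                       # estimated echo
        e[n] = d[n] - y                   # microphone signal minus echo estimate
        # Normalized update: step size mu, regularization delta.
        w += mu * e[n] * buf / (delta + buf @ buf)

    return e, w
```

In this sketch the step size `mu`, the filter length `num_taps` and the regularization `delta` are examples of parameters that could be exposed to an interactive search: each candidate setting yields a different output `e`, which the listener would then rate.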

Research at the ISPAMM Lab

We have several research projects available in the field of interactive audio enhancement, including:

  1. The extension of IGA techniques to other real-world problems, including echo cancellation in immersive environments, signal restoration, musical queries, and others.
  2. The use of alternative evolutionary (or non-evolutionary) techniques with respect to a standard genetic algorithm.
  3. The implementation of machine learning algorithms able to 'learn' the user's preference, so as to automatically assign preferences to a given number of hypotheses. In this way, it is possible to reduce the problem of user fatigue and converge faster to a perceptual optimum.
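The third item can be illustrated with a very small preference model. The sketch below stores the ratings collected so far and predicts the rating of an unseen parameter vector from its nearest rated neighbours, so that only some candidates need to be presented to the listener. The class name `RatingSurrogate` and the distance-weighted nearest-neighbour rule are assumptions chosen for simplicity, not the method pursued in the projects listed above.

```python
import numpy as np


class RatingSurrogate:
    """Hypothetical k-nearest-neighbour model of a user's ratings."""

    def __init__(self, k=3):
        self.k = k
        self.X = []   # rated parameter vectors
        self.y = []   # corresponding user ratings

    def add(self, params, rating):
        """Record one rating given by the user."""
        self.X.append(np.asarray(params, dtype=float))
        self.y.append(float(rating))

    def predict(self, params):
        """Estimate the rating of an unrated candidate."""
        if not self.X:
            raise ValueError("no ratings collected yet")
        X = np.stack(self.X)
        dist = np.linalg.norm(X - np.asarray(params, dtype=float), axis=1)
        nearest = np.argsort(dist)[: self.k]
        # Inverse-distance weights; the small constant avoids division by zero.
        weights = 1.0 / (dist[nearest] + 1e-9)
        ratings = np.asarray(self.y)[nearest]
        return float(np.dot(weights, ratings) / weights.sum())
```

One possible use is to pre-screen a population with `predict` and ask the user to rate only the most promising or most uncertain candidates, feeding each new rating back through `add`.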

References

[1] Takagi, H. (2001). Interactive evolutionary computation: Fusion of the capabilities of EC optimization and human evaluation. Proceedings of the IEEE, 89(9), 1275-1296.
[2] Comminiello, D., Scarpiniti, M., Azpicueta-Ruiz, L. A., Arenas-Garcia, J., & Uncini, A. (2013). Functional link adaptive filters for nonlinear acoustic echo cancellation. IEEE Transactions on Audio, Speech, and Language Processing, 21(7), 1502-1512.
[3] Comminiello, D., Scardapane, S., Scarpiniti, M., & Uncini, A. (2013, May). User-Driven Quality Enhancement for Audio Signal Processing. In Audio Engineering Society Convention 134. Audio Engineering Society.
[4] Comminiello, D., Scardapane, S., Scarpiniti, M., & Uncini, A. (2013, July). Interactive quality enhancement in acoustic echo cancellation. In 2013 36th International Conference on Telecommunications and Signal Processing (TSP) (pp. 488-492). IEEE.