A tool for editing audio signals in the spectrogram is presented. It allows the spectrogram of a signal to be manipulated directly at any chosen time-frequency resolution, and the edited signal to be reconstructed in HiFi quality, a capability that is usually not available with the Fourier or wavelet transform. Image processing and computer vision methods are applied to the spectrogram in order to identify, separate, eliminate and/or modify audio objects visually. Because spectrograms give descriptive information about the sound, the tool allows audio to be edited in a "what you see is what you hear" style. This is enabled by a thorough investigation and exploitation of Gabor analysis and synthesis. We further propose a kind of zooming, as in visual painting tools, which changes the time and frequency resolution and can be adapted to the task at hand. Results of applying this tool to erase audio objects such as whistles, music, clapping and the like from audio tracks are presented.
Hence audio objects are automatically identified as visual objects in the spectrogram and eliminated therein. The cleaned signal is then reconstructed from the spectrogram in HiFi quality.…
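The analyse, edit and resynthesise loop described above can be illustrated with a minimal sketch. It is not the tool's implementation: it assumes a truncated Gaussian window (a discrete Gabor-type transform), a hand-picked rectangular time-frequency region in place of the image-processing-based object detection, and weighted overlap-add as the synthesis step. The names `gabor_analysis`, `gabor_synthesis` and `erase_box`, and the parameters `n` and `hop`, are hypothetical and chosen for illustration only.

```python
import numpy as np


def gabor_analysis(x, win, hop):
    """Windowed Fourier transform of x; a Gaussian `win` gives a Gabor-type transform."""
    n = len(win)
    # Zero-pad by one window length on each side so the signal edges are well covered.
    xp = np.concatenate([np.zeros(n), x, np.zeros(n)])
    n_frames = 1 + (len(xp) - n) // hop
    frames = np.stack([xp[i * hop:i * hop + n] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T  # shape: (frequency bins, frames)


def gabor_synthesis(Z, win, hop, length):
    """Weighted overlap-add; returns the input exactly when Z is unmodified."""
    n = len(win)
    frames = np.fft.irfft(Z.T, n=n, axis=1)
    out = np.zeros((Z.shape[1] - 1) * hop + n)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n] += frame * win
        norm[i * hop:i * hop + n] += win ** 2
    out /= np.maximum(norm, 1e-12)  # guard against division by zero
    return out[n:n + length]  # drop the padding added during analysis


def erase_box(x, fs, t_range, f_range, n=1024, hop=128):
    """Zero a time-frequency rectangle (seconds, Hz) and resynthesise the signal."""
    # Assumption: Gaussian window truncated at +/- 4 standard deviations.
    sigma = n / 8.0
    win = np.exp(-0.5 * ((np.arange(n) - (n - 1) / 2) / sigma) ** 2)
    Z = gabor_analysis(x, win, hop)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Frame centre times, corrected for the n samples of leading padding.
    times = (np.arange(Z.shape[1]) * hop + n / 2 - n) / fs
    f_sel = (freqs >= f_range[0]) & (freqs <= f_range[1])
    t_sel = (times >= t_range[0]) & (times <= t_range[1])
    Z[np.ix_(f_sel, t_sel)] = 0.0  # "erase" the selected audio object
    return gabor_synthesis(Z, win, hop, len(x))


if __name__ == "__main__":
    fs = 8000
    t = np.arange(fs) / fs
    # Synthetic test signal: a 440 Hz tone plus a 3000 Hz "whistle".
    x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)
    y = erase_box(x, fs, t_range=(-1.0, 2.0), f_range=(2500.0, 3500.0))
    spectrum = np.abs(np.fft.rfft(y))
    print("440 Hz bin:", spectrum[440], "3000 Hz bin:", spectrum[3000])
```

In this sketch the window length `n` and the hop size play the role of the zoom level: a longer window gives finer frequency resolution and coarser time resolution, and a shorter window the reverse. Hard zeroing of a rectangle is the crudest possible edit; a smooth mask would typically leave fewer audible artefacts at the region boundary.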

