Sound Noise Reduction Based on Deep Neural Network
DOI:
https://doi.org/10.62647/
Keywords:
Variational U-Net, Speech Enhancement, Audio Denoising, Spectro-temporal Features, Signal-to-Distortion Ratio (SDR), Probabilistic Modeling, Spectral Reconstruction, Impulsive Noise Suppression
Abstract
We investigate the viability of a variational U-Net architecture for denoising single-channel audio data. Deep-network speech enhancement systems commonly estimate filter masks or operate directly on the waveform, potentially neglecting relationships across higher-dimensional spectro-temporal features. We study the introduction of a probabilistic bottleneck into the classic U-Net architecture for direct spectral reconstruction. Several ablated network variants are evaluated using signal-to-distortion ratio and perceptual measures, on audio data that includes known and unknown noise types as well as reverberation. Our experiments show that the skip connections in the proposed system are a prerequisite for successful spectral reconstruction, i.e., reconstruction without filter mask estimation. On average, the proposed variational U-Net outperforms its classic, non-variational counterpart in signal enhancement under reverberant conditions. The variational U-Net also appears to improve the suppression of impulsive noise sources.
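As a rough illustration of what a probabilistic bottleneck involves, the sketch below assumes the common VAE-style formulation: the encoder outputs a mean and a log-variance per latent dimension, a latent vector is sampled via the reparameterization trick, and a KL term regularizes the latent distribution towards a standard normal. The abstract does not specify these details, so the function names `sample_latent` and `kl_to_standard_normal`, the diagonal Gaussian assumption, and the latent size are all hypothetical.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).

    Assumes a diagonal Gaussian latent (an assumption, not stated in the abstract).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical bottleneck with 8 latent dimensions.
rng = random.Random(0)
mu = [0.0] * 8
log_var = [0.0] * 8

z = sample_latent(mu, log_var, rng)       # one stochastic latent sample
kl = kl_to_standard_normal(mu, log_var)   # zero when the latent already matches N(0, 1)
```

In a setup like this, the decoder would receive `z` together with the skip-connection features, and the KL term would be added to the spectral reconstruction loss during training.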
License
Copyright (c) 2026 Amtul Shanaz, C Sakshi, P Sharon Rose, E Sravanthi, T Sreeja (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.











