DOIONLINE

DOIONLINE NO - IJEEDC-IRAJ-DOIONLINE-3271

Publish In
International Journal of Electrical, Electronics and Data Communication (IJEEDC)
Issue
Volume-3, Issue-11 (Nov, 2015)
Paper Title
Leveraging Jointly Spatial, Temporal And Modulation Enhancement In Creating Noise-Robust Features For Speech Recognition
Author Name
Hsin-Ju Hsieh, Hao-Teng Fan, Jeih-Weih Hung
Affiliation
Dept of Electrical Engineering, National Chi Nan University, Taiwan, Republic of China
Pages
15-19
Abstract
This paper proposes combining speech feature enhancement techniques from the spatial, temporal and modulation domains in order to achieve superior speech recognition performance in noise-corrupted environments. With mel-frequency cepstral coefficients (MFCC) as the standard speech feature representation, the spatial-domain techniques perform short-time intra-frame feature enhancement, while the temporal-domain techniques compensate for the noise distortion present in the long-term inter-frame MFCC time stream. The modulation-domain techniques, in turn, operate on the Fourier transform of an MFCC time stream. Evaluation experiments on the connected-digit Aurora-2 database reveal that each of the spatial and temporal enhancement techniques adopted here outperforms the unprocessed MFCC baseline, and that integrating the methods for spatial-, temporal- and modulation-domain features yields even better recognition accuracy than any individual component method across a wide range of noise-corrupted environments. These results demonstrate that the methods in the three domains address different aspects of the noise and are therefore complementary to one another.
Keywords
Noise Robustness, Speech Recognition, Spatial Processing, Temporal Processing, Modulation Domain
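The three domains named in the abstract can be pictured as three ways of reading one MFCC matrix (frames by coefficients). The sketch below is only an illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and mean and variance normalization is used as a generic stand-in for temporal-domain processing. It is not the set of enhancement methods evaluated in the paper, which the abstract does not specify.

```python
import numpy as np

# Synthetic stand-in for an MFCC matrix: 200 frames x 13 coefficients (assumed sizes).
rng = np.random.default_rng(0)
mfcc = rng.normal(loc=2.0, scale=3.0, size=(200, 13))

def spatial_frame(feats, t):
    """Spatial (intra-frame) view: all coefficients of a single frame t."""
    return feats[t, :]

def temporal_mvn(feats):
    """Temporal (inter-frame) processing, illustrated with per-coefficient
    mean and variance normalization along the time axis."""
    mean = feats.mean(axis=0, keepdims=True)
    std = feats.std(axis=0, keepdims=True)
    return (feats - mean) / np.maximum(std, 1e-8)  # guard against zero variance

def modulation_spectrum(feats):
    """Modulation-domain view: magnitude of the Fourier transform of each
    coefficient's time stream (one spectrum per MFCC dimension)."""
    return np.abs(np.fft.rfft(feats, axis=0))

frame0 = spatial_frame(mfcc, 0)          # shape (13,)
normalized = temporal_mvn(mfcc)          # shape (200, 13)
mod_spec = modulation_spectrum(normalized)  # shape (101, 13)
```

Because the temporal step here removes each coefficient's mean, the zero-frequency bin of the resulting modulation spectrum is close to zero, which shows how an operation in one domain changes what the next domain sees.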