Applied Acoustics (《应用声学》), 2023, No. 3 (May 2023), p. 220
[3] Boll S. Suppression of acoustic noise in speech using spectral subtraction[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1979, 27(2): 113–120.
[4] Weiss M R, Aschkenasy E, Parsons T W. Study and development of the INTEL technique for improving speech intelligibility[R]. Northvale, NJ: Nicolet Scientific Corp, 1975.
[5] Chen J, Benesty J, Huang Y, et al. New insights into the noise reduction Wiener filter[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(4): 1218–1234.
[6] Lim J S, Oppenheim A V. Enhancement and bandwidth compression of noisy speech[J]. Proceedings of the IEEE, 1979, 67(12): 1586–1604.
[7] Hu Y, Loizou P. Incorporating a psychoacoustical model in frequency domain speech enhancement[J]. IEEE Signal Processing Letters, 2004, 11(2): 270–273.
[8] Dendrinos M, Bakamidis S, Carayannis G. Speech enhancement from noise: a regenerative approach[J]. Speech Communication, 1991, 10(1): 45–57.
[9] Virtanen T. Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(3): 1066–1074.
[10] Schmidt M N, Larsen J. Reduction of non-stationary noise using a non-negative latent variable decomposition[C]//2008 IEEE Workshop on Machine Learning for Signal Processing. IEEE, 2008: 486–491.
[11] Mohammadiha N, Smaragdis P, Leijon A. Supervised and unsupervised speech enhancement using nonnegative matrix factorization[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(10): 2140–2151.
[12] Ou S, Song P, Gao Y. Laplacian speech model and soft decision based MMSE estimator for noise power spectral density in speech enhancement[J]. Chinese Journal of Electronics, 2018, 27(6): 1214–1220.
[13] Wang Z, Sha F. Discriminative non-negative matrix factorization for single-channel speech separation[C]//2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014: 3749–3753.
[14] Kwon K, Shin J W, Sonowat S, et al. Speech enhancement combining statistical models and NMF with update of speech and noise bases[C]//2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014: 7053–7057.
[15] Kwon K, Shin J W, Kim N S. NMF-based speech enhancement using bases update[J]. IEEE Signal Processing Letters, 2014, 22(4): 450–454.
[16] Wang D L, Chen J. Supervised speech separation based on deep learning: an overview[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(10): 1702–1726.
[17] Yang Y, Bao C. DNN-based AR-Wiener filtering for speech enhancement[C]//2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018: 2901–2905.
[18] Pandey A, Wang D L. TCNN: temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019: 6875–6879.
[19] Qian K, Zhang Y, Chang S, et al. Speech enhancement using Bayesian wavenet[C]//Interspeech, 2017: 2013–2017.
[20] Kim S, Lee S, Song J, et al. FloWaveNet: a generative flow for raw audio[J]. arXiv Preprint, arXiv: 1811.02155, 2018.
[21] Rethage D, Pons J, Serra X. A wavenet for speech denoising[C]//2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018: 5069–5073.
[22] Yuan W. Incorporating group update for speech enhancement based on convolutional gated recurrent network[J]. Speech Communication, 2021, 132: 32–39.
[23] Hu Y, Liu Y, Lyu S, et al. DCCRN: deep complex convolution recurrent network for phase-aware speech enhancement[J]. arXiv Preprint, arXiv: 2008.00264, 2020.
[24] Tu Y H, Du J, Lee C H. Speech enhancement based on teacher–student deep learning using improved speech presence probability for noise-robust speech recognition[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(12): 2080–2091.
[25] Shi Yunlong, Yuan Wenhao, Hu Shaodong, et al. Convolutional quasi-recurrent network for real-time speech enhancement[J]. Journal of Xidian University, 2022, 49(3): 183–190. (in Chinese)
[26] Li Jianghe, Wang Mei. A gated recurrent neural network for causal speech enhancement[J]. Computer Engineering, 2022, 48(11): 77–82. (in Chinese)
[27] Dauphin Y N, Fan A, Auli M, et al. Language modeling with gated convolutional networks[C]//International Conference on Machine Learning. PMLR, 2017: 933–941.
[28] Garofolo J S, Lamel L F, Fisher W M, et al. TIMIT acoustic-phonetic continuous speech corpus[EB/OL]. [2018-09-10]. https://catalog.ldc.upenn.edu/LDC93S1.
[29] Hu G. 100 Nonspeech environmental sounds[EB/OL]. [2018-09-03]. http://web.cse.ohio-state.edu/pnl/corpus/HuNonspeech/HuCorpus.html.
[30] Varga A, Steeneken H J M. Assessment for automatic speech recognition: II. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems[J]. Speech Communication, 1993, 12(3): 247–251.