speech using auditory-inspired features[J]. arXiv Preprint, arXiv: 1510.04620, 2015.
[10] Gamper H, Tashev I J. Blind reverberation time estimation using a convolutional neural network[C]//2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC), 2018: 136–140.
[11] Shuku T, Ishihara K. The analysis of the acoustic field in irregularly shaped rooms by the finite element method[J]. Journal of Sound and Vibration, 1973, 29(1): 67–IN1.
[12] Kirkup S. The boundary element method in acoustics: a survey[J]. Applied Sciences, 2019, 9(8): 1642.
[13] Allen J B, Berkley D A. Image method for efficiently simulating small-room acoustics[J]. The Journal of the Acoustical Society of America, 1979, 65(4): 943–950.
[14] Krokstad A, Strøm S, Sørsdal S. Calculating the acoustical room response by the use of a ray tracing technique[J]. Journal of Sound and Vibration, 1968, 8(1): 118–125.
[15] Ratnarajah A, Tang Z, Manocha D. IR-GAN: room impulse response generator for far-field speech recognition[C]//Interspeech 2021, 2021.
[16] Ratnarajah A, Zhang S X, Yu M, et al. FAST-RIR: fast neural diffuse room impulse response generator[J]. arXiv Preprint, arXiv: 2110.04057, 2021.
[17] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in Neural Information Processing Systems, 2014: 27.
[18] Mirza M, Osindero S. Conditional generative adversarial nets[J]. arXiv Preprint, arXiv: 1411.1784, 2014.
[19] Donahue C, McAuley J, Puckette M. Adversarial audio synthesis[J]. arXiv Preprint, arXiv: 1802.04208, 2018.
[20] Arjovsky M, Chintala S, Bottou L. Wasserstein generative adversarial networks[C]//International Conference on Machine Learning, 2017: 214–223.
[21] Antsalo P, Mäkivirta A, Välimäki V, et al. Estimation of modal decay parameters from noisy response measurements[C]//Audio Engineering Society Convention 110, 2001.
[22] Murphy D T, Shelley S. OpenAIR: an interactive auralization web resource and database[C]//Audio Engineering Society Convention 129, 2010.
[23] Kinoshita K, Delcroix M, Yoshioka T, et al. The REVERB challenge: a common evaluation framework for dereverberation and recognition of reverberant speech[C]//2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013: 1–4.
[24] Nakamura S, Hiyane K, Asano F, et al. Acoustical sound database in real environments for sound scene understanding and hands-free speech recognition[C]//LREC 2000: the 2nd International Conference on Language Resources and Evaluation, 2000.
[25] Zheng K, Zheng C, Sang J, et al. Noise-robust blind reverberation time estimation using noise-aware time-frequency masking[J]. arXiv Preprint, arXiv: 2112.04726, 2021.
[26] Garofolo J S. TIMIT acoustic phonetic continuous speech corpus[J]. Linguistic Data Consortium, 1993.
[27] Schroeder M R. Integrated-impulse method measuring sound decay without using impulses[J]. The Journal of the Acoustical Society of America, 1979, 66(2): 497–500.
[28] Farina A. Simultaneous measurement of impulse response and distortion with a swept-sine technique[C]//Audio Engineering Society Convention 108, 2000.