Page 213 - 《应用声学》 (Applied Acoustics), 2023, Issue 5

Vol. 42, No. 5    Luo Yu et al.: A clustering-based gated convolutional network method for speech separation    1105


IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(8): 1256–1266.
[9] Liu Wenju, Nie Shuai, Liang Shan, et al. Deep learning based speech separation technology and its developments[J]. Acta Automatica Sinica, 2016, 42(6): 819–833. (in Chinese)
[10] Lea C, Vidal R, Reiter A, et al. Temporal convolutional networks: a unified approach to action segmentation[C]//European Conference on Computer Vision. Springer, Cham, 2016: 47–54.
[11] Dauphin Y N, Fan A, Auli M, et al. Language modeling with gated convolutional networks[C]//International Conference on Machine Learning. PMLR, 2017: 933–941.
[12] Hao Min, Liu Hang, Li Yang, et al. Speech tracking based on cluster analysis and speaker recognition[J]. Computer and Modernization, 2020(4): 7–13. (in Chinese)
[13] Han C, O'Sullivan J, Luo Y, et al. Speaker-independent auditory attention decoding without access to clean speech sources[J]. Science Advances, 2019, 5(5): eaav6134.
[14] Huang Yating, Shi Jing, Xu Jiaming, et al. Research advances and perspectives on the cocktail party problem and related auditory models[J]. Acta Automatica Sinica, 2019, 45(2): 234–251. (in Chinese)
[15] Bahmaninezhad F, Zhang S X, Xu Y, et al. A unified framework for speech separation[J]. arXiv Preprint, arXiv: 1912.07814, 2019.
[16] Liu Hang, Li Yang, Yuan Haoqi, et al. Speech signal separation based on generative adversarial networks[J]. Computer Engineering, 2020, 46(1): 302–308. (in Chinese)
[17] Kingma D P, Ba J. Adam: a method for stochastic optimization[J]. arXiv Preprint, arXiv: 1412.6980, 2014.
[18] Le Roux J, Wisdom S, Erdogan H, et al. SDR–half-baked or well done?[C]//ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019: 626–630.
[19] Gu W, Tandon A, Ahn Y Y, et al. Principled approach to the selection of the embedding dimension of networks[J]. Nature Communications, 2021, 12(1): 1–10.