Page 72 - 《应用声学》2020年第2期
P. 72

230                                                                                  2020 年 3 月


             [10] Graves A, Fernández S, Gomez F. Connectionist temporal  [22] Zhang Y, Pezeshki M, Brakel P, et al.  Towards end-
                 classification: labelling unsegmented sequence data with  to-end speech recognition with deep convolutional neural
                 recurrent neural networks[C]// International Conference  networks[J]. arXiv: 1701.02720, 2017.
                 on Machine Learning. Pittsburgh, 2006: 369–376.  [23] Hu J, Li S, Samuel A, et al. Squeeze-and-excitation net-
             [11] Graves A. Sequence transduction with recurrent neural  works[J]. arXiv: 1709.01507, 2018.
                 networks[J]. Computer Science, 2012, 58(3): 235–242.  [24] 张顺, 龚怡宏, 王进军. 深度卷积神经网络的发展及其在计算
             [12] Kim S, Hori T, Watanabe S. Joint CTC-attention based  机视觉领域的应用 [J]. 计算机学报, 2019, 42(3): 453–482.
                 end-to-end speech recognition using multi-task learn-  Zhang Shun, Gong Yihong, Wang Jinjun. Development of
                 ing[J]. arXiv: 1609.06773, 2017.                  deep convolutional neural networks and its application in
             [13] 于重重, 陈运兵, 孙沁瑶, 等. 基于动态 BLSTM 和 CTC 的             computer vision[J]. Chinese Journal of Computers, 2019,
                 濒危语言语音识别研究 [J]. 计算机应用研究, 2019, 36(11):            42(3): 453–482.
                 3334–3337.                                     [25] 吴仁彪, 赵婷, 屈景怡. 基于深度 SE-DenseNet 的航班延误预
                 Yu Chongchong, Chen Yunbing, Sun Qinyao, et al. Re-  测模型 [J]. 电子与信息学报, 2019, 41(6): 1510–1517.
                 search on endangered language speech recognition based  Wu Renbiao, Zhao Ting, Qu Jingyi. Flight delay pre-
                 on dynamic BLSTM and CTC[J]. Application Research  diction model based on deep SE-DenseNet[J]. Journal
                 of Computers, 2019, 36(11): 3334–3337.            of Electronics and Information Technology, 2019, 41(6):
             [14] 姚煜, Ryad Chellali. 基于双向长短时记忆 -联结时序分类和            1510–1517.
                 加权有限状态转换器的端到端中文语音识别系统 [J]. 计算机                 [26] 仇利克, 郭忠文, 刘青, 等. 基于冗余分析的特征选择算法 [J].
                 应用, 2018, 38(9): 2495–2499.                       北京邮电大学学报, 2017, 40(1): 36–41.
                 Yao Yu, Ryad C. End-to-end Chinese speech recogni-  Qiu Like, Guo Zhongwen, Liu Qing, et al. Feature selec-
                 tion system based on bidirectional long-term memory-  tion algorithm based on redundancy analysis[J]. Journal
                 join timing classification and weighted finaite-state trans-  of Beijing University of Posts and Telecommunications,
                 ducer[J]. Journal of Computer Applications, 2018, 38(9):  2017, 40(1): 36–41.
                 2495–2499.                                     [27] Wang D, Zhang X. THCHS-30: a free chinese speech cor-
             [15] 周飞燕, 金林鹏, 董军. 卷积神经网络研究综述 [J]. 计算机学               pus[J]. arXiv: 1512.01882, 2015.
                 报, 2017, 40(6): 1229–1251.                     [28] Li Jie, Zhang Heng, Cai Xinyuan, et al. Towards end-to-
                 Zhou Feiyan, Jin Linpeng, Dong Jun. A review of convolu-  end speech recognition for Chinese mandarin using long
                 tional neural networks[J]. Chinese Journal of Computers,  short-term memory recurrent neural networks[C]// Inter-
                 2017, 40(6): 1229–1251.                           Speech. Dresden, 2015: 615–3619.
             [16] Karen S, Andrew Z. Very deep convolutional networks for  [29] Kingma D, Ba J. Adam: a method for stochastic opti-
                 large-scale image recognition[J]. arXiv: 1409.1556, 2014.  mization[J]. arXiv: 1412.6980, 2015.
             [17] He K, Zhang X, Ren S, et al.  Deep residual learning  [30] Sergey I, Christian S. Batch normalization: accelerat-
                 for image recognition[C]// Computer Vision and Pattern  ing deep network training by reducing internal covariate
                 Recognition, Las Vegas, 2016: 770–778.            shift[J]. arXiv: 1502.03167, 2015.
             [18] Huang G, Liu Z, Laurens V D M, et al. Densely con-  [31] Srivastava N, Hinton G, Krizhevsky A, et al. Dropout:
                 nected convolutional networks[C]// Computer Vision and  a simple way to prevent neural networks from over-
                 Pattern Recognition. Hawaii, 2017: 2261–2269.     fitting[J]. Journal of Machine Learning Research, 2014,
             [19] 王珂, 武军, 周天相, 等. 一种融合全局时空特征的 CNNs                 15(1): 1929–1958.
                 动作识别方法 [J]. 华中科技大学学报 (自然科学版), 2018,            [32] Tan T, Qian Y, Zhou Y, et al. Adaptive very deep convo-
                 46(12): 36–41.                                    lutional residual network for noise robust speech recogni-
                 Wang Ke, Wu Jun, Zhou Tianxiang, et al. A CNNs mo-  tion[J]. IEEE/ACM Transactions on Audio, Speech, and
                 tion recognition method based on global spatiotemporal  Language Processing, 2018, 26(8): 1393–1405.
                 features[J]. Journal of Huazhong University of Science and  [33] 杨洋, 汪毓铎. 基于改进卷积神经网络算法的语音识别 [J]. 应
                 Technology, 2018, 46(12): 36–41.                  用声学, 2018, 37(6): 940–946.
             [20] Abdel H O, Mohamed A R, Jiang H, et al. Applying  Yang Yang, Wang Yuduo. Speech recognition based on
                 convolutional neural networks concepts to hybrid NN-  improved convolutional neural network[J]. Journal of Ap-
                 HMM model for speech recognition[C]// IEEE Interna-  plied Acoustics, 2018, 37(6): 940–946.
                 tional Conference on Acoustics, Speech and Signal Pro-  [34] 张立民, 王彦哲, 张兵强, 等. 基于 CTC 准则的普通话识别及
                 cessing. Kyoto, 2012: 4277–4280.                  改进 [J]. 计算机工程, 2019,45(6): 249–253, 266.
             [21] Sainath T N, Mohamed A R, Kingsbury B, et al. Deep  Zhang Limin, Wang Yanzhe, Zhang Bingqiang, et al.
                 convolutional neural networks for LVCSR[C]// IEEE In-  Mandarin recognition and improvement based on CTC
                 ternational Conference on Acoustics, Speech and Signal  criterion[J]. Computer Engineering, 2019, 45(6): 249–253,
                 Processing. Vancouver, 2013: 8614–8618.           266.
   67   68   69   70   71   72   73   74   75   76   77