Page 92 - Applied Acoustics (《应用声学》), 2025, Issue 3

626    May 2025


Tao Yuang. Application of MFCC feature training technology in voiceprint recognition[J]. Application of IC, 2024, 41(2): 386–387.
[28] 李磊, 朱永同, 杨琦, 等. 基于多任务学习与注意力机制的多层次音频特征情感识别研究[J]. 智能计算机与应用, 2024, 14(1): 85–94, 101.
     Li Lei, Zhu Yongtong, Yang Qi, et al. Multilevel emotion recognition of audio features based on multitask learning and attention mechanism[J]. Intelligent Computer and Applications, 2024, 14(1): 85–94, 101.
[29] 杨蕊檄. 基于时频特征信息的声学事件检测算法研究[D]. 成都: 西南交通大学, 2019.
[30] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[J]. arXiv Preprint, arXiv: 2010.11929, 2020.
[31] 余正涛, 董凌, 高盛祥. 低资源语音识别研究进展[J]. 昆明理工大学学报 (自然科学版), 2024, 49(3): 86–102.
     Yu Zhengtao, Dong Ling, Gao Shengxiang. Research progress of low-resource speech recognition[J]. Journal of Kunming University of Science and Technology (Natural Science), 2024, 49(3): 86–102.
[32] 殷铭旸, 乔亦诚, 张德霄龙, 等. 基于风格迁移的数据增强方法[J]. 信息技术与信息化, 2023(12): 127–130.
[33] Zhang P J, Zheng X Q, Zhang W Q, et al. A deep neural network for modeling music[C]//Proceedings of the 5th ACM on International Conference on Multimedia Retrieval. Shanghai, China. ACM, 2015.
[34] Karunakaran N, Arya A. A scalable hybrid classifier for music genre classification using machine learning concepts and spark[C]//2018 International Conference on Intelligent Autonomous Systems (ICoIAS). March 1–3, 2018. Singapore. IEEE, 2018.
[35] Yu Y, Luo S, Liu S L, et al. Deep attention based music genre classification[J]. Neurocomputing, 2020, 372: 84–91.
[36] 连子宽, 姚力, 刘晟源, 等. 基于 t-SNE 降维和 BIRCH 聚类的单相用户相位及表箱辨识[J]. 电力系统自动化, 2020, 44(8): 176–184.
     Lian Zikuan, Yao Li, Liu Shengyuan, et al. Phase and meter box identification for single-phase users based on t-SNE dimension reduction and BIRCH clustering[J]. Automation of Electric Power Systems, 2020, 44(8): 176–184.
[37] Peng N, Chen A B, Zhou G X, et al. Environment sound classification based on visual multi-feature fusion and GRU-AWS[J]. IEEE Access, 2020, 8: 191100–191114.
[38] Zhu W J, Li X. Speech emotion recognition with global-aware fusion on multi-scale feature representation[C]//ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). May 23–27, 2022. Singapore. IEEE, 2022.
[39] Kong Q Q, Cao Y, Iqbal T, et al. PANNs: Large-scale pretrained audio neural networks for audio pattern recognition[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020, 28: 2880–2894.
[40] Liu Y L, Chen A B, Zhou G X, et al. Combined CNN LSTM with attention for speech emotion recognition based on feature-level fusion[J]. Multimedia Tools and Applications, 2024, 83(21): 59839–59859.