Page 91 - 《应用声学》 (Journal of Applied Acoustics), 2025, No. 3

Vol. 44, No. 3. Lin Yi et al.: Music style recognition method using a self-attention mechanism and data augmentation strategies. p. 625

Processing, 2002, 10(5): 293–302.
[2] 周良风. 基于深度学习的实时音乐节拍识别[D]. 上海: 东华大学, 2023.
[3] Ghosh P, Mahapatra S, Jana S, et al. A study on music genre classification using machine learning[J]. International Journal of Engineering Business and Social Science, 2023, 1(4): 308–320.
[4] Ndou N, Ajoodha R, Jadhav A. Music genre classification: A review of deep-learning and traditional machine-learning approaches[C]//2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS). April 21–24, 2021. Toronto, ON, Canada. IEEE, 2021.
[5] Zhang X Z. Music genre classification by machine learning algorithms[J]. Highlights in Science, Engineering and Technology, 2023, 38: 215–219.
[6] 秦丹, 马光志. 基于挖掘技术的音乐风格识别系统[J]. 计算机工程与设计, 2005, 26(11): 3094–3096.
Qin Dan, Ma Guangzhi. Music style identification system based on mining technology[J]. Computer Engineering and Design, 2005, 26(11): 3094–3096.
[7] Kumar D P, Sowmya B J, Chetan, et al. A comparative study of classifiers for music genre classification based on feature extractors[C]//2016 IEEE Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER). August 13–14, 2016. Mangalore, India. IEEE, 2016.
[8] Sharma A, Tomar A. Music genre classification using amplitude and frequency variants of MFCC[J]. International Journal of Research, 2015, 2: 648–655.
[9] Ghildiyal A, Singh K, Sharma S. Music genre classification using machine learning[C]//2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA). November 5–7, 2020. Coimbatore, India. IEEE, 2020.
[10] Baniya B K, Ghimire D, Lee J. A novel approach of automatic music genre classification based on timbral texture and rhythmic content features[C]//16th International Conference on Advanced Communication Technology. February 16–19, 2014. Pyeongchang, Korea (South). Global IT Research Institute (GIRI), 2014.
[11] Arabi Foroughmand A, Lu G J. Enhanced polyphonic music genre classification using high level features[C]//2009 IEEE International Conference on Signal and Image Processing Applications. November 18–19, 2009. Kuala Lumpur, Malaysia. IEEE, 2009.
[12] 陆阳, 郭滨, 白雪梅. 基于高斯混合模型的音乐情绪四分类研究[J]. 长春理工大学学报 (自然科学版), 2015, 38(5): 107–111.
Lu Yang, Guo Bin, Bai Xuemei. Music emotion four classification research based on Gaussian mixture model[J]. Journal of Changchun University of Science and Technology (Natural Science Edition), 2015, 38(5): 107–111.
[13] Benetos E, Kotropoulos C. Non-negative tensor factorization applied to music genre classification[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2010, 18(8): 1955–1967.
[14] 焦佳辉, 马思远, 宋玉, 等. 基于卷积注意力机制的双模态音乐流派分类模型 MGTN[J]. 计算机工程与科学, 2023, 45(12): 2226–2236.
Jiao Jiahui, Ma Siyuan, Song Yu, et al. Bi-modal music genre classification model MGTN based on convolutional attention mechanism[J]. Computer Engineering & Science, 2023, 45(12): 2226–2236.
[15] Wyse L. Audio spectrogram representations for processing with convolutional neural networks[EB/OL]. 2017: 1706.09559. https://arxiv.org/abs/1706.09559v1.
[16] Lidy T, Schindler A. Parallel convolutional neural networks for music genre and mood classification[J]. MIREX2016, 2016, 3.
[17] Bahuleyan H. Music genre classification using machine learning techniques[EB/OL]. 2018: 1804.01149. https://arxiv.org/abs/1804.01149v1.
[18] Zhang W B, Lei W K, Xu X M, et al. Improved music genre classification with convolutional neural networks[C]//Interspeech 2016. ISCA: ISCA, 2016.
[19] Wen Z F, Chen A B, Zhou G X, et al. Parallel attention of representation global time–frequency correlation for music genre classification[J]. Multimedia Tools and Applications, 2024, 83(4): 10211–10231.
[20] Gong Y, Chung Y A, Glass J. AST: Audio spectrogram transformer[EB/OL]. 2021: 2104.01778. https://arxiv.org/abs/2104.01778v3.
[21] Liu Z W, Bian T, Yang M L. Locally activated gated neural network for automatic music genre classification[J]. Applied Sciences, 2023, 13(8): 5010.
[22] 路双双. 基于偏序结构表示原理的乐曲结构可视化及分类的研究[D]. 秦皇岛: 燕山大学, 2019.
[23] Zhu W T, Omar M. Multiscale audio spectrogram transformer for efficient audio classification[C]//ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). June 4–10, 2023. Rhodes Island, Greece. IEEE, 2023.
[24] 曾援, 李剑, 马明星, 等. 基于改进 Transformer 模型的多声源分离方法[J]. 计算机技术与发展, 2024, 34(5): 60–65.
Zeng Yuan, Li Jian, Ma Mingxing, et al. Multi-source separation method based on improved transformer model[J]. Computer Technology and Development, 2024, 34(5): 60–65.
[25] Wang Y N, Chen A B, Li H C, et al. A hierarchical birdsong feature extraction architecture combining static and dynamic modeling[J]. Ecological Indicators, 2023, 150: 110258.
[26] 张凯, 王舒蕾, 齐婷婷, 等. 基于功率谱的美声发声特征提取[J]. 振动测试与诊断, 2023, 43(6): 1205–1210, 1249.
Zhang Kai, Wang Shulei, Qi Tingting, et al. Voice feature extraction of bel canto based on power spectrum[J]. Journal of Vibration, Measurement & Diagnosis, 2023, 43(6): 1205–1210, 1249.
[27] 陶雨昂. MFCC 特征训练技术在声纹识别中的应用[J]. 集成电路应用, 2024, 41(2): 386–387.
