杨春勇, 祁宏达, 彭焱秋, 尹滨, 侯金, 舒振宇, 陈少平. Research on the application of energy spectrum with voiceprint information in bird recognition [J].,2020,39(3):453-463 |
融合声纹信息的能量谱图在鸟类识别中的研究 |
Research on the application of energy spectrum with voiceprint information in bird recognition |
Submitted: 2019-07-12 Revised: 2020-04-26 |
Chinese abstract: |
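The widely used bird-song recognition approach that combines Mel-frequency cepstral coefficients with a Gaussian mixture model (MFCC+GMM) adapts poorly to noisy environments, its models are hard to converge, and its computational complexity is high. This paper proposes a bird recognition method based on energy spectrograms fused with voiceprint information (VPS-BR), which exploits the multi-dimensional differences that bird vocalizations exhibit on the energy spectrogram to quantitatively identify their voiceprint features. The energy spectrogram is obtained by color-mapping the decibel energy; the acoustic features expressed by its visual features are then extracted, and the vocalization patterns specific to each species are derived by analysis and induction. In the feature extraction step, two descriptors characterize the edge voiceprint of the bird-song spectrogram: the local binary pattern, chosen for its fast recognition, and the histogram of oriented gradients, chosen for its robust recognition. In the recognition step, each of the two features is paired with each of three classifiers (support vector machine, K-nearest neighbor, and random forest) to build and test recognition models. In cross-validation on a data set of original noisy vocalizations from 15 bird species, the average recognition rate of the VPS-BR models is 11.3% higher than that of the MFCC+GMM model, and the model combining the histogram of oriented gradients with the K-nearest-neighbor classifier reaches a recognition rate of 90.5%, showing good noise robustness and recognition performance. Finally, to address the shortage of sample data, a generative adversarial network is used for image augmentation, which raises the recognition rate by a further 1.48%. |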
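The step of turning decibel energy into a spectrogram image can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the frame length, hop size, Hann window, dB floor, and the linear gray-level mapping (a stand-in for the color map) are all hypothetical choices, and a synthetic tone replaces a real recording.

```python
import cmath
import math

def db_spectrogram(signal, frame_len=64, hop=32, floor_db=-80.0):
    """Return one list of dB energies (one value per frequency bin) per frame."""
    # Hann window to reduce spectral leakage (an assumed choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for bin k; adequate for a sketch, use an FFT in practice.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            power = abs(acc) ** 2
            # Convert energy to decibels and clip at an assumed floor.
            row.append(max(10.0 * math.log10(power + 1e-12), floor_db))
        frames.append(row)
    return frames

def to_gray_levels(spec_db):
    """Linearly map dB values to 0..255 intensities (stand-in for a color map)."""
    lo = min(min(row) for row in spec_db)
    hi = max(max(row) for row in spec_db)
    span = (hi - lo) or 1.0
    return [[int(round(255 * (v - lo) / span)) for v in row] for row in spec_db]

# A synthetic 1 kHz tone sampled at 8 kHz stands in for a bird recording.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(512)]
spec = db_spectrogram(tone)
image = to_gray_levels(spec)
# Strongest frequency bin of the first frame (bin spacing is 125 Hz here).
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The resulting `image` is a 2-D grid of intensities on which image descriptors such as LBP or HOG could then be computed.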
English abstract: |
Bird-song recognition based on Mel-frequency cepstral coefficients combined with a Gaussian mixture model (MFCC+GMM) adapts poorly to noisy environments, and its computational complexity is high. This paper proposes a novel bird recognition method, VPS-BR, that uses the voice-power spectrum to express acoustic features. It exploits the multi-dimensional differences that bird sounds show on the power spectrum to quantitatively identify the texture features of the sound. In the feature extraction step, the edge texture of the bird's voice-power spectrum is characterized by the local binary pattern (LBP) and the histogram of oriented gradients (HOG); in the recognition step, VPS-BR models are built by pairing LBP and HOG with support vector machine, K-nearest neighbor (KNN), and random forest classifiers. Cross-validation on a data set of original noisy recordings of 15 bird species from the Xeno-Canto website shows that the VPS-BR models achieve a higher recognition rate than the MFCC+GMM model; the combined HOG and KNN model reaches a recognition rate of 90.5%, showing good noise robustness and recognition performance. Finally, to address the shortage of sample data, a generative adversarial network (GAN) is used for image augmentation, and the recognition rate increases by a further 1.48%. |
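One feature and classifier pairing of the kind described above can be sketched in a few lines. This is a minimal illustration, not the paper's pipeline: it uses the basic 3x3 LBP histogram with a K-nearest-neighbor vote under L1 distance and leave-one-out cross-validation, and omits HOG, the support vector machine, and the random forest for brevity. The striped textures, the labels "A" and "B", the distance metric, and `k=3` are all hypothetical choices standing in for real spectrogram images and settings.

```python
from collections import Counter

# Clockwise neighbor offsets, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes over interior pixels."""
    hist = [0.0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # A neighbor at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[code] += 1.0
    total = (rows - 2) * (cols - 2)
    return [h / total for h in hist]

def knn_predict(train, query, k=3):
    """Majority vote among the k training histograms closest in L1 distance."""
    ranked = sorted(train,
                    key=lambda item: sum(abs(a - b) for a, b in zip(item[0], query)))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def stripes(size, shift, horizontal):
    """Synthetic two-level striped texture standing in for a spectrogram image."""
    return [[200 if (((r if horizontal else c) + shift) // 2) % 2 == 0 else 50
             for c in range(size)] for r in range(size)]

# Two hypothetical "species", four samples each.
data = [(lbp_histogram(stripes(12, s, h)), "A" if h else "B")
        for h in (True, False) for s in range(4)]

# Leave-one-out cross-validation: classify each sample using all the others.
hits = sum(knn_predict(data[:i] + data[i + 1:], feat) == label
           for i, (feat, label) in enumerate(data))
accuracy = hits / len(data)
```

On these noise-free toy textures `accuracy` evaluates to 1.0, which only confirms that the sketch runs end to end and says nothing about performance on real recordings.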
DOI:10.11684/j.issn.1000-310X.2020.03.019 |
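Chinese keywords: bird recognition, energy spectrogram, local binary pattern, histogram of oriented gradients, generative adversarial network |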
English keywords: Bird recognition, Power spectrogram, HOG, LBP, GAN |
Funding: |
|