| 臧强, 马刚, 吴文宇, 还红华, 刘云平. Bird sound recognition method based on multi-model integration*[J].,2025,44(4):1008-1017 |
| Bird sound recognition method based on multi-model integration* |
| Submitted: 2024-03-08 Revised: 2025-06-25 |
| Abstract: |
| Bird sound recognition helps monitor dynamic changes in bird populations and habitats, and plays an important role in bird monitoring, ecological protection, and ecological research. To further improve recognition accuracy, a method based on multi-model integration is proposed. First, bird sound feature maps are extracted by Mel spectrogram conversion followed by decibel conversion, and the Mixup operation is used to increase the diversity of the training data. Second, five pretrained convolutional neural network models (Tf_efficientnetv2_s_in21k, Se_resnext50_32x4d, Cspdarknet53, Eca_nfnet_l0, and Resnet34) are integrated, generalized mean pooling is introduced to extract the key features of bird sounds, and the models are trained on the data. Then, the recognition results of the five models are combined by an exponential smoother and a weighted averager, which effectively reduces noise interference and model variance. Finally, the normalized exponential (softmax) function converts the ensemble output into the bird sound recognition result. In experiments on 20 Chinese bird species from the Beijing Hundred Birds database, the method reaches a recognition accuracy of 97.93% under the same conditions, 2.7% higher than a single model, and outperforms existing methods. |
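The abstract names several building blocks without giving formulas: Mixup, generalized mean pooling, exponential smoothing, weighted averaging, and the normalized exponential (softmax) function. The following is a minimal pure-Python sketch of those operations in their textbook forms, not the authors' implementation. All function names, the pooling exponent `p`, the smoothing factor `alpha`, the model weights, and the choice to smooth per-frame logits over time are assumptions made for illustration; the abstract does not specify them.

```python
import math

def mixup(x1, y1, x2, y2, lam):
    """Blend two feature vectors and their one-hot labels; lam in [0, 1]."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y

def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized mean pooling: (mean(v**p)) ** (1/p).
    p = 1 gives average pooling; large p approaches max pooling."""
    clamped = [max(v, eps) for v in values]  # keep the power well defined
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)

def exp_smooth(frames, alpha=0.5):
    """Exponentially smooth a sequence of per-frame logit vectors.
    Smoothing over time frames is an assumption, not stated in the source."""
    smoothed = [list(frames[0])]
    for frame in frames[1:]:
        prev = smoothed[-1]
        smoothed.append([alpha * f + (1 - alpha) * s for f, s in zip(frame, prev)])
    return smoothed

def weighted_average(model_logits, weights):
    """Combine one logit vector per model using normalized weights."""
    total = sum(weights)
    n_classes = len(model_logits[0])
    return [
        sum(w * logits[i] for w, logits in zip(weights, model_logits)) / total
        for i in range(n_classes)
    ]

def softmax(logits):
    """Normalized exponential function, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Usage with two hypothetical models, two frames each, three classes.
# The numbers are made up purely to exercise the functions.
model_a = [[2.0, 0.5, 0.1], [1.8, 0.7, 0.2]]
model_b = [[1.5, 1.0, 0.3], [2.2, 0.4, 0.1]]
per_model = [exp_smooth(m)[-1] for m in (model_a, model_b)]
probs = softmax(weighted_average(per_model, weights=[0.6, 0.4]))
predicted = probs.index(max(probs))
```

In this sketch the smoother damps frame-to-frame fluctuations within each model and the weighted average reduces variance across models, which matches the roles the abstract assigns to the two steps. In practice the weights and `alpha` would be tuned on validation data.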
| DOI:10.11684/j.issn.1000-310X.2025.04.021 |
| Keywords: Bird sound recognition; Multi-model integration; Convolutional neural networks; Mel spectrum |
| Funding: National Natural Science Foundation of China (61973170); National Key Research and Development Program of China (2017YFD0701201-02) |