Small-sample optimized bird sound recognition network based on a bridging Transformer
Received: 2023-01-05  Revised: 2024-04-27
      Bird sound audio data collected in real-world bird monitoring are unevenly distributed, which leads to insufficient neural network training and low classification accuracy at test time. To address this, a bridging Transformer neural network model is designed. The network first computes a short-time Fourier transform (STFT) spectrogram from the raw birdsong audio signal and uses it as the input feature. The spectrogram is then fed into a Transformer network composed of an attention module and a convolution module, which exchanges information between the global and local features in the spectrogram. Finally, a single-layer Transformer encoder is used to optimize the loss within each batch of samples and produce the final classification result. Small-sample experiments on the Birdsdata and xeno-canto bird sound datasets yielded average accuracies of 91.34% and 82.63%, respectively. Comparative experiments against other bird sound recognition networks verify the effectiveness of the proposed network.
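As an illustration of the input feature named in the abstract, the following is a minimal sketch of computing a log-magnitude STFT spectrogram with NumPy. The function name `stft_spectrogram`, the window type, the frame length (`n_fft=512`), the hop size (`hop=128`), the 16 kHz sample rate, and the synthetic test tone are all assumptions chosen for the example; the abstract does not state the authors' actual settings.

```python
import numpy as np

def stft_spectrogram(signal, n_fft=512, hop=128):
    """Return a log-magnitude STFT spectrogram of shape (n_fft // 2 + 1, n_frames).

    All parameter values are illustrative assumptions, not the paper's settings.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and apply the window to each one.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # One-sided FFT of every frame: rows are frames, columns are frequency bins.
    spectrum = np.fft.rfft(frames, axis=1)
    # Log compression of the magnitude, transposed to (frequency, time).
    return np.log1p(np.abs(spectrum)).T

# Synthetic stand-in for a birdsong clip: one second of a 2 kHz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 2000 * t)
spec = stft_spectrogram(tone)
print(spec.shape)
```

The resulting two-dimensional array, with frequency on one axis and time on the other, is the kind of image-like input that attention and convolution modules can then process.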
Keywords: bird sound recognition  attention mechanism  convolution module  Transformer network
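Wang Jihao (王基豪), Nanjing University of Information Science and Technology, 1090947435@qq.com
Zhou Xiaoyan* (周晓彦), School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, xiaoyan_zhou@nuist.edu.cn
Han Zhichao (韩智超), School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, xiaoyan_zhou@nuist.edu.cn
Wang Lili (王丽丽), School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, xiaoyan_zhou@nuist.edu.cn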