《应用声学》 (Journal of Applied Acoustics), 2020, Issue 3

Vol. 39, No. 3
             Journal of Applied Acoustics                      May, 2020

             ⋄ Research Report ⋄



                Abnormal sound recognition in public places using a Bayesian-optimized convolutional neural network ∗






                                                   ZENG Yu †   HU Wencheng


                                             (Beijing Municipal Institute of Labour Protection, Beijing 100054)

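                Abstract: To address the perception and recognition of abnormal sounds in public places, a recognition
                method based on a Bayesian-optimized convolutional neural network is proposed. The Gammatone cepstral
                coefficients, octave-band power spectrum, short-term energy, and spectral centroid of the sound signal
                are extracted and combined into a feature map of the signal. A convolutional neural network is built as
                the classifier, using progressively increasing convolution kernel settings and pooling operations to
                handle features at different scales. The model parameters of the network are optimized with a Bayesian
                optimization algorithm, and five kinds of abnormal sounds in public places are recognized: fire
                crackling, infant crying, fireworks, glass breaking, and alarms. The recognition results are compared
                with those obtained from other feature extraction and classifier schemes, and the comparison shows that
                the proposed method outperforms them. Finally, the recognition results under noise interference at
                different signal-to-noise ratios are analyzed, verifying the effectiveness of the method.
                Keywords: public places; abnormal sound recognition; Gammatone cepstral coefficients; Bayesian
                optimization; convolutional neural network
                CLC number: TP391.4; TP183          Document code: A          Article ID: 1000-310X(2020)03-0409-08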
                DOI: 10.11684/j.issn.1000-310X.2020.03.013
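
                As an illustration of two of the features named in the abstract, the following minimal sketch
                computes per-frame short-term energy and spectral centroid and stacks them into a small feature
                array. The sampling rate, frame length, hop size, Hann window, and the helper names
                (`frame_signal`, `short_term_energy`, `spectral_centroid`) are assumptions made for this example
                and are not taken from the paper; the Gammatone cepstral coefficients and octave-band power
                spectrum are omitted.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (the incomplete tail is dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_term_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def spectral_centroid(frames, fs):
    """Magnitude-weighted mean frequency (Hz) of each Hann-windowed frame."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / fs)
    total = np.sum(mag, axis=1)
    total = np.where(total > 0, total, 1.0)  # avoid division by zero on silent frames
    return np.sum(mag * freqs[None, :], axis=1) / total

# Assumed test signal: one second of a 1 kHz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)

frames = frame_signal(tone, frame_len=512, hop=256)
energy = short_term_energy(frames)
centroid = spectral_centroid(frames, fs)

# Rows are features, columns are frames; further features would be stacked the same way.
features = np.stack([energy, centroid], axis=0)
```

                For a pure tone, the centroid of every frame sits close to the tone frequency, which gives a
                quick sanity check before such a function is applied to recorded sounds.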



                 Recognition of abnormal sound in public places based on Bayesian optimal

                                            convolutional neural network



                                                 ZENG Yu HU Wencheng

                                  (Beijing Municipal Institute of Labour Protection, Beijing 100054, China)

                 Abstract: To address the perception and recognition of abnormal sounds in public places, a recognition
                 method based on a Bayesian-optimized convolutional neural network is proposed. The Gammatone cepstral
                 coefficients, octave-band power spectrum, short-term energy, and spectral centroid of the sound signal
                 are extracted and combined into a feature map of the signal. A convolutional neural network serves as
                 the classifier, with different convolution kernel settings and pooling operations used to handle
                 features at different scales. The model parameters of the network are optimized with a Bayesian
                 optimization algorithm. Five kinds of abnormal sounds in public places are recognized: fire crackling,
                 infant crying, fireworks, glass breaking, and alarms. The recognition results are compared with those
                 of other feature extraction and classifier schemes, illustrating the advantages of the method. Finally,
                 the recognition results under noise interference are analyzed, verifying the validity of the method.
                 Keywords: Public place; Abnormal sound recognition; Gammatone cepstral coefficients; Bayesian
                 optimization; Convolutional neural network
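
                 The noise-interference analysis mentioned in the abstract requires test signals at a known
                 signal-to-noise ratio. The sketch below shows one common way to build such a mixture by scaling the
                 noise to a target SNR. The function name `mix_at_snr`, the white Gaussian noise, the 440 Hz tone,
                 and the 10 dB value are assumptions for illustration; this section does not state the noise type or
                 SNR levels used in the paper.

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal power / noise power equals `snr_db`, then add it."""
    noise = noise[: len(signal)]
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power follows from SNR_dB = 10 * log10(p_signal / p_noise_target).
    target_p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    scaled_noise = noise * np.sqrt(target_p_noise / p_noise)
    return signal + scaled_noise, scaled_noise

# Assumed example: a 440 Hz tone mixed with white Gaussian noise at 10 dB SNR.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)

noisy, scaled = mix_at_snr(clean, noise, snr_db=10.0)
measured_snr = 10 * np.log10(np.mean(clean ** 2) / np.mean(scaled ** 2))
```

                 Recomputing the SNR from the clean signal and the scaled noise, as in the last line, confirms that
                 the mixture matches the requested level.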


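             Received 2019-07-11; finalized 2019-11-28
             ∗ Beijing Municipal Finance Project (PXM2019_178304_000003); self-initiated project of the Beijing Municipal Institute of Labour Protection (H194)
             About the author: ZENG Yu (1979– ), male, from Taiyuan, Shanxi; assistant research fellow; research interests: noise and vibration control.
             † Corresponding author. E-mail: zengyu@bmilp.com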