Editorial Office of Applied Acoustics (《应用声学》)

Article Abstract

Yang Yang, Wang Yuduo. Speech recognition based on improved convolutional neural network algorithm[J]. , 2018, 37(6): 940-946

Speech recognition based on improved convolutional neural network algorithm

Received: 2018-01-25 Revised: 2018-11-02

Abstract:

To address the poor recognition performance of traditional convolutional neural networks (CNNs) on continuous speech data, an improved CNN algorithm is proposed. The method introduces the Fisher criterion and an L2 regularization constraint. During the parameter-adjustment phase of backpropagation, it minimizes the parameter error while ensuring that, after classification, samples are more widely scattered between classes and more tightly concentrated within each class; it also keeps the network weights at an appropriate order of magnitude, which effectively alleviates overfitting. To further improve recognition accuracy, the CNN is optimized with a new log activation function that is more consistent with the activation characteristics of biological neurons. Experiments on the TIMIT and THCHS30 speech corpora show that, compared with the traditional CNN algorithm, the proposed algorithm achieves a higher speech recognition rate and generalizes better.
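The abstract does not state the exact formulas, so the following is only a minimal sketch of how a training objective might combine an error term, a Fisher-style discriminant term, and an L2 weight penalty, together with one possible log-shaped activation. Every name here (`log_activation`, `fisher_terms`, `regularized_loss`) and every coefficient (`alpha`, `beta`, `lam`) is a hypothetical choice for illustration, and the specific functional forms are assumptions rather than the authors' definitions.

```python
import numpy as np


def log_activation(x):
    """Assumed log-shaped activation: sign(x) * log(1 + |x|).

    It is odd-symmetric, passes through zero, and grows slowly for large
    inputs. The paper's actual log activation may differ.
    """
    return np.sign(x) * np.log1p(np.abs(x))


def fisher_terms(features, labels):
    """Return (within-class scatter, between-class scatter) of the features."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Within-class scatter: mean squared distance of samples to their class mean.
    within = sum(
        np.sum((features[labels == c] - means[i]) ** 2)
        for i, c in enumerate(classes)
    ) / len(features)
    # Between-class scatter: mean squared distance over ordered pairs of class means.
    diffs = means[:, None, :] - means[None, :, :]
    n_pairs = max(len(classes) * (len(classes) - 1), 1)
    between = np.sum(diffs ** 2) / n_pairs
    return within, between


def regularized_loss(outputs, targets, labels, weights,
                     alpha=0.1, beta=0.1, lam=1e-3):
    """Assumed objective: error + alpha*within - beta*between + (lam/2)*||W||^2."""
    error = 0.5 * np.mean(np.sum((outputs - targets) ** 2, axis=1))
    within, between = fisher_terms(outputs, labels)
    l2 = sum(np.sum(w ** 2) for w in weights)
    return error + alpha * within - beta * between + 0.5 * lam * l2


# Toy usage: two perfectly separated classes and a single 2x2 weight matrix.
outputs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
labels = np.array([0, 0, 1, 1])
loss = regularized_loss(outputs, outputs.copy(), labels, [np.ones((2, 2))])
```

In this sketch, minimizing the objective pulls same-class outputs together, pushes class means apart, and discourages large weights. Because the between-class term is subtracted, the value can be negative; a ratio of within-class to between-class scatter would be another plausible formulation.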

DOI: 10.11684/j.issn.1000-310X.2018.06.016

Keywords: speech recognition, convolutional neural network, Fisher criterion, L2 regularization, log activation function

Funding:

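Author	Affiliation	E-mail
Yang Yang^*	School of Information and Communication Engineering, Beijing Information Science and Technology University	18811536735@163.com
Wang Yuduo	School of Information and Communication Engineering, Beijing Information Science and Technology University	wangyuduo@bistu.edu.cn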
