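Article Abstract
Yin Faming, Zhao Yan, Zhao Li. Improved deep belief network recognition algorithm for continuous phonemes*[J]., 2019, 38(1): 39-44
Improved deep belief network recognition algorithm for continuous phonemes*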
Phoneme recognition based on deep belief network
Received: 2018-04-25  Revised: 2018-12-29
Chinese abstract:
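      To improve the phoneme recognition rate in continuous speech recognition, a phoneme recognition algorithm is proposed that is based on restricted Boltzmann machines trained with improved parallel tempering. First, the algorithm trains the restricted Boltzmann machine using improved parallel tempering with equal-energy partitioning. The restricted Boltzmann machines are then stacked to form a deep belief network, which serves as the base model for pre-training a deep neural network, and a softmax output layer is added to obtain a deep neural network that estimates the posterior probabilities of phoneme states. Next, a small amount of labeled data is used to fine-tune the network weights with the backpropagation algorithm. Finally, the resulting posterior probabilities are used as the emission probabilities of a hidden Markov model, and a Viterbi decoder performs the phoneme recognition. Experiments on the TIMIT corpus show an improvement of about 4.5% over traditional contrastive-divergence algorithms, and of about 1% over the original parallel tempering algorithm without increasing the computational cost.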
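As an illustration of the sampling idea behind the training step, the following is a minimal sketch of standard parallel tempering for a binary restricted Boltzmann machine. It is not the improved equal-energy variant proposed in the paper, whose details the abstract does not give. The function names (`rbm_energy`, `swap_probability`, `gibbs_step`, `parallel_tempering_sweep`) and the convention that each chain samples from exp(-beta * E) are assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))


def rbm_energy(v, h, W, b, c):
    """Energy of a binary RBM: E(v, h) = -b.v - c.h - v.W.h."""
    return -(v @ b) - (h @ c) - (v @ W @ h)


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis-Hastings probability of exchanging the states of two chains.

    energy_i is the energy of the state currently held by the chain at
    inverse temperature beta_i, and likewise for j.
    """
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    # Clamp at 0 so the result is min(1, exp(log_ratio)) without overflow.
    return float(np.exp(min(0.0, log_ratio)))


def gibbs_step(v, W, b, c, beta, rng):
    """One Gibbs sweep at inverse temperature beta: sample h | v, then v | h."""
    p_h = sigmoid(beta * (v @ W + c))
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(beta * (W @ h + b))
    v = (rng.random(p_v.shape) < p_v).astype(float)
    return v, h


def parallel_tempering_sweep(vs, hs, betas, W, b, c, rng):
    """Advance every tempered chain once, then try swaps between neighbours.

    vs and hs are Python lists holding one state vector per temperature.
    """
    for k, beta in enumerate(betas):
        vs[k], hs[k] = gibbs_step(vs[k], W, b, c, beta, rng)
    for k in range(len(betas) - 1):
        e_k = rbm_energy(vs[k], hs[k], W, b, c)
        e_next = rbm_energy(vs[k + 1], hs[k + 1], W, b, c)
        if rng.random() < swap_probability(betas[k], betas[k + 1], e_k, e_next):
            vs[k], vs[k + 1] = vs[k + 1], vs[k]
            hs[k], hs[k + 1] = hs[k + 1], hs[k]
    return vs, hs
```

In a training loop of this kind, the state of the chain at beta = 1 would typically supply the negative-phase sample for the gradient estimate, while the chains at lower beta mix more freely and feed it new states through the swaps.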
English abstract:
      To improve the accuracy of phoneme recognition in continuous speech recognition, this paper proposes a modified parallel tempering (PT) algorithm for training the restricted Boltzmann machine (RBM). First, the RBM is trained using Metropolis-Hastings-based parallel tempering sampling. The RBMs are then stacked to form a deep belief network (DBN), which serves as the basis for pre-training a deep neural network (DNN), and a softmax output layer is added so that the resulting DNN estimates the posterior probabilities of phonemes. Subsequently, the backpropagation algorithm is applied to fine-tune the weights discriminatively with a small amount of labeled data. Finally, the sequence of predicted probability distributions is fed into a standard Viterbi decoder. Experiments show that the proposed method performs better on the TIMIT dataset than traditional methods: its recognition rate is about 4.5% higher than that of contrastive divergence (CD) and about 1% higher than that of the original PT, without additional computation.
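The final decoding step can be sketched as follows: a minimal log-domain Viterbi decoder that treats per-frame network posteriors as HMM emission scores. The function names, the optional division by state priors (a common choice in hybrid DNN-HMM systems that the abstract does not mention), and the toy two-state model in the usage note are assumptions for illustration only.

```python
import numpy as np


def posteriors_to_log_emission(posteriors, priors=None, eps=1e-12):
    """Turn per-frame state posteriors (T x S) into log emission scores.

    With priors=None the posteriors are used directly, as the abstract
    describes. Passing priors gives scaled likelihoods instead (assumption).
    """
    log_emission = np.log(np.clip(posteriors, eps, None))
    if priors is not None:
        log_emission = log_emission - np.log(np.clip(priors, eps, None))
    return log_emission


def viterbi(log_emission, log_transition, log_initial):
    """Return the most likely state sequence as an integer array of length T.

    log_emission:   (T, S) emission scores per frame and state
    log_transition: (S, S) entry [i, j] scores the move from state i to j
    log_initial:    (S,)   scores of starting in each state
    """
    T, S = log_emission.shape
    delta = np.empty((T, S))             # best score of any path ending in each state
    backptr = np.zeros((T, S), dtype=int)
    delta[0] = log_initial + log_emission[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_transition   # scores[i, j]
        backptr[t] = np.argmax(scores, axis=0)            # best predecessor per state
        delta[t] = scores[backptr[t], np.arange(S)] + log_emission[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 1, 0, -1):        # follow the back-pointers
        path[t - 1] = backptr[t, path[t]]
    return path
```

For example, with posteriors `[[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]`, a transition matrix `[[0.7, 0.3], [0.3, 0.7]]`, and a uniform initial distribution, `viterbi` returns the state sequence `[0, 0, 1]`.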
DOI:10.11684/j.issn.1000-310X.2019.01.006
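Chinese keywords: Parallel tempering  Restricted Boltzmann machine  Deep belief network  Phoneme recognition
English keywords: Parallel tempering  Restricted Boltzmann Machine  Deep belief network  Phoneme recognition
Funding: National Natural Science Foundation of China (61571106)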
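Author  Affiliation  E-mail
Yin Faming*  Nanjing Vocational College of Information Technology  yinfm@njcit.cn
Zhao Yan  Southeast University  xie.yue@seu.edu.cn
Zhao Li  Southeast University  zhaoli@seu.edu.cn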
Abstract views: 1942
Full-text downloads: 1605