Editorial Office of 《应用声学》 (Journal of Applied Acoustics)

Article Abstract

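Yin Faming (阴法明), Zhao Yan (赵焱), Zhao Li (赵力). Recognition algorithm for continuous phonemes based on an improved deep belief network*[J]., 2019, 38(1): 39-44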

Recognition algorithm for continuous phonemes based on an improved deep belief network*

Phoneme recognition based on deep belief network

Submitted: 2018-04-25  Revised: 2018-12-29

Chinese abstract (translated):

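To improve the phoneme recognition rate in continuous speech recognition, a phoneme recognition algorithm is proposed based on restricted Boltzmann machines (RBMs) trained with improved parallel tempering. First, the algorithm trains the RBMs using improved parallel tempering with equal-energy partitioning. The RBMs are then stacked to form a deep belief network (DBN), which serves as the base model for pre-training a deep neural network (DNN); adding a softmax output layer yields a DNN that estimates the posterior probabilities of phoneme states. Next, the network weights are fine-tuned with the backpropagation algorithm using a small amount of labeled data. Finally, the resulting posterior probabilities are used as the emission probabilities of a hidden Markov model (HMM), and a Viterbi decoder performs phoneme recognition. Experiments on the TIMIT corpus show an improvement of about 4.5% over traditional contrastive divergence (CD) type algorithms, and of about 1% over the original parallel tempering algorithm without increasing the computational cost.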
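The replica-exchange step at the heart of standard parallel tempering can be sketched as follows. This is a minimal illustration of the textbook Metropolis swap rule for a binary RBM, not the equal-energy-partitioned variant proposed in the paper, whose details the abstract does not give. The function names `rbm_energy` and `swap_acceptance` and the parameter layout are assumptions made for this example.

```python
import math


def rbm_energy(v, h, W, a, b):
    """Energy of a binary RBM: E(v, h) = -a.v - b.h - v^T W h.

    v, h are lists of 0/1 unit states; W[i][j] couples visible unit i
    to hidden unit j; a and b are the visible and hidden biases.
    """
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(bj * hj for bj, hj in zip(b, h))
    interaction = sum(
        v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h))
    )
    return -(visible_term + hidden_term + interaction)


def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability of exchanging the states of two replicas.

    Replica i sits at inverse temperature beta_i with energy energy_i,
    replica j at beta_j with energy_j. The standard rule is
    min(1, exp((beta_i - beta_j) * (energy_i - energy_j))).
    """
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)
```

A swap is always accepted when the colder replica (larger beta) currently holds the higher-energy state, which is how low-energy samples found by the hot chains migrate toward the chain at beta = 1 that is used for the gradient estimate.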

English abstract:

To improve the accuracy of phoneme recognition in continuous speech recognition, this paper proposes a modified parallel tempering (PT) algorithm for training the restricted Boltzmann machine (RBM). First, the RBM is trained using Metropolis-Hastings-based parallel tempering sampling. The RBMs are then stacked to form a deep belief network (DBN), which serves as the basis for DNN pre-training, and a softmax output layer is added to obtain a deep neural network (DNN) that estimates the posterior probabilities of phonemes. Subsequently, the backpropagation algorithm is applied to fine-tune the weights discriminatively with a small amount of labeled data. Finally, the sequence of predicted probability distributions is fed into a standard Viterbi decoder. Experiments on the TIMIT dataset show that the proposed method outperforms traditional approaches: its recognition rate is about 4.5% higher than that of contrastive divergence (CD) and about 1% higher than that of the original PT, without additional computation.
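The final decoding step can be illustrated with a small Viterbi sketch in log space. This is a generic textbook decoder, not the authors' implementation; the two-state model, the transition matrix, and the per-frame scores below are made-up numbers chosen only to show how frame-level network outputs act as emission scores.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Return the most likely state sequence under an HMM.

    log_init[s]      log probability of starting in state s
    log_trans[p][s]  log probability of moving from state p to state s
    log_emit[t][s]   log emission score of state s at frame t
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best_prev = max(
                range(n_states), key=lambda p: score[p] + log_trans[p][s]
            )
            new_score.append(
                score[best_prev] + log_trans[best_prev][s] + log_emit[t][s]
            )
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def to_log(rows):
    """Convert a nested list of probabilities to log probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical two-state example: four frames of network outputs.
init = [math.log(0.6), math.log(0.4)]
trans = to_log([[0.9, 0.1], [0.1, 0.9]])
frames = to_log([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.1, 0.9]])
path = viterbi(init, trans, frames)
print(path)  # → [0, 0, 1, 1]
```

In hybrid DNN-HMM systems the posteriors are commonly divided by the state priors before decoding; the abstract states only that the posteriors serve as emission probabilities, so this sketch uses the scores as given.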

DOI: 10.11684/j.issn.1000-310X.2019.01.006

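Chinese keywords: 并行回火 (parallel tempering); 受限玻尔兹曼机 (restricted Boltzmann machine); 深信度网络 (deep belief network); 音素识别 (phoneme recognition)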

English keywords: Parallel tempering; Restricted Boltzmann Machine; Deep belief network; Phoneme recognition

Funding: National Natural Science Foundation of China (61571106)

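Author	Affiliation	E-mail
Yin Faming (阴法明)^*	Nanjing Vocational College of Information Technology (南京信息职业技术学院)	yinfm@njcit.cn
Zhao Yan (赵焱)	Southeast University (东南大学)	xie.yue@seu.edu.cn
Zhao Li (赵力)	Southeast University (东南大学)	zhaoli@seu.edu.cn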
