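Li Peng (李鹏), Yang Yuanwei (杨元维), Du Lihui (杜李慧), Gao Xianjun (高贤君), Zhou Yi (周意), Jiang Mengyue (蒋梦月), Zhang Jingbo (张净波). Study of Chinese speech recognition based on Bi-RNN*[J].,2020,39(3):464-471 |
Chinese speech recognition based on a bidirectional recurrent neural network* |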
Study of Chinese speech recognition based on Bi-RNN |
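Received: 2019-03-19 Revised: 2020-05-03 |
Chinese abstract (translated): |
In current deep neural network (DNN) models, the hidden part can be stacked to multiple layers and adapts well to complex problems, but the node connections between layers are independent of one another. This structural property prevents the model from using contextual information in a speech sequence to improve recognition. The conventional recurrent neural network (RNN) improves on this, but it can use only the preceding context. To address these problems, this paper combines a bidirectional recurrent neural network (Bi-RNN) model, which can use both the preceding and the following context in a speech sequence, with a DNN model, and applies the combination to speech recognition. A model with five hidden layers is built, in which the third layer has a Bi-RNN structure and the other layers have a DNN structure. The experimental results show the following. The model that includes the Bi-RNN structure achieves higher recognition accuracy than the other models. Noise has an important effect on Bi-RNN Chinese speech recognition; in particular, when the training set and the test set contain different types of added noise, a model trained on speech with a single noise type cannot adapt to recognizing speech with other noise types. After the number of hidden-layer neurons is adjusted, recognition accuracy does not keep rising as the number of neurons grows; once the number of neurons passes a certain point, accuracy shows a downward trend. |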
English abstract: |
Deep neural network (DNN) models can have multiple hidden layers and adapt well to complex problems, but the node connections between layers are independent of each other. Because of this structure, a DNN cannot use contextual information in a speech sequence to improve recognition. A traditional recurrent neural network (RNN) improves on this, but it can use only the preceding context. To solve these problems, this paper combines a bidirectional RNN (Bi-RNN) model, which can use the preceding and following context in a speech sequence at the same time, with a DNN model, and applies the combined model to speech recognition. A model with five hidden layers was constructed, in which the third layer has a Bi-RNN structure and the other layers have a DNN structure. The experimental results show that, compared with other models, the model with the Bi-RNN structure improves recognition accuracy. Noise has an important influence on Bi-RNN Chinese speech recognition: in particular, when the training set and the test set contain different types of added noise, a model trained on speech with a single noise type cannot adapt to speech with other noise types. After the number of neurons in the hidden layers is adjusted, recognition accuracy does not always increase with the number of neurons; it begins to decrease once the number of neurons grows beyond a certain point. |
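The layer arrangement described in the abstract (five hidden layers, the third bidirectional recurrent, the rest feed-forward) can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the abstract gives no activations, layer widths, input features, output units, or rule for merging the two directions, so the ReLU and tanh activations, the sizes in the usage note, the summation of forward and backward states, and every function name below are hypothetical choices.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def init_dense(rng, n_in, n_out):
    # Small random weights and zero bias (initialisation scheme is an assumption).
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

def rnn_pass(x, w_in, w_rec, b, reverse=False):
    """Simple tanh RNN over a sequence x of shape (T, n_in).

    With reverse=True the recurrence runs from the last frame to the first,
    so the state at frame t summarises frames t..T-1 (the following context).
    """
    T = x.shape[0]
    h = np.zeros(w_rec.shape[0])
    out = np.zeros((T, w_rec.shape[0]))
    steps = range(T - 1, -1, -1) if reverse else range(T)
    for t in steps:
        h = np.tanh(x[t] @ w_in + h @ w_rec + b)
        out[t] = h
    return out

def build_params(rng, n_feat, n_hidden, n_out):
    p = {
        "d1": init_dense(rng, n_feat, n_hidden),    # hidden layer 1 (feed-forward)
        "d2": init_dense(rng, n_hidden, n_hidden),  # hidden layer 2 (feed-forward)
        "d4": init_dense(rng, n_hidden, n_hidden),  # hidden layer 4 (feed-forward)
        "d5": init_dense(rng, n_hidden, n_hidden),  # hidden layer 5 (feed-forward)
        "out": init_dense(rng, n_hidden, n_out),    # output projection (assumed)
    }
    # Hidden layer 3: one recurrent weight set per direction.
    for name in ("fw", "bw"):
        w_in, b = init_dense(rng, n_hidden, n_hidden)
        w_rec = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p[name] = (w_in, w_rec, b)
    return p

def forward(p, x):
    """Map a feature sequence (T, n_feat) to per-frame scores (T, n_out)."""
    h = relu(x @ p["d1"][0] + p["d1"][1])
    h = relu(h @ p["d2"][0] + p["d2"][1])
    # Bidirectional layer: sum of forward and backward states (assumed merge rule).
    h = rnn_pass(h, *p["fw"]) + rnn_pass(h, *p["bw"], reverse=True)
    h = relu(h @ p["d4"][0] + p["d4"][1])
    h = relu(h @ p["d5"][0] + p["d5"][1])
    return h @ p["out"][0] + p["out"][1]
```

For example, with `rng = np.random.default_rng(0)`, `p = build_params(rng, 13, 16, 10)` and a random input of shape `(6, 13)`, `forward(p, x)` returns one score vector per frame. Perturbing only the last input frame changes the output at the first frame, which is the use of following context that a purely feed-forward stack or a forward-only RNN lacks.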
DOI:10.11684/j.issn.1000-310X.2020.03.020 |
Chinese keywords (translated): speech recognition, deep learning, deep neural network, recurrent neural network |
English keywords: speech recognition, deep learning, deep neural network, recurrent neural network |
Funding: |
|