Editorial Office, Applied Acoustics (《应用声学》)

Article abstract

Wu Qing, Hu Weiping, Chen Dandan, Xiao Ting. Speech depression recognition based on deep learning*[J]. , 2022, 41(5): 837-842

Speech depression recognition based on deep learning*

Submitted: 2021-07-25  Revised: 2022-08-22

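Chinese abstract (translated):

The number of people with depression is rising worldwide, and diagnosis and treatment are constrained by a shortage of doctors. To address this problem, a feature fusion model combining a CNN with an attention-based BLSTM is proposed. The study covers both feature selection and network architecture: several classical speech features are compared, and Mel-frequency cepstral coefficients (MFCC) are found to give the best depression classification results. The MFCCs are then fed separately into the CNN and the attention-based BLSTM network to perform depression classification. In experiments on the DAIC-WOZ dataset, the proposed method reaches a classification accuracy of 78.06% and an F1 score of 74.68%.

Keywords: depression recognition; speech analysis; classification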

English abstract:

The number of patients with depression is increasing around the world, and there is a shortage of doctors to diagnose and treat it. In response to this problem, a feature fusion model that combines a CNN with an attention-based BLSTM is proposed. The research addresses both feature selection and network architecture. A comparison of several classical speech features shows that Mel-frequency cepstral coefficients (MFCC) give the best depression classification results; the MFCCs are then fed separately into the CNN and the attention-based BLSTM network to perform depression classification. Experiments on the DAIC-WOZ dataset show that the proposed method achieves a classification accuracy of 78.06% and an F1 score of 74.68%.
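The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the attention formula or the fusion rule, so the dot-product scoring, the `query` vector, the concatenation in `fuse`, and all toy values below are assumptions made for illustration, not the authors' implementation. Plain Python lists stand in for the BLSTM outputs and the CNN feature vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse T per-frame vectors into one vector by weighted averaging.

    hidden_states: list of T vectors (stand-ins for BLSTM outputs).
    query: hypothetical learned vector of the same dimension (assumed).
    """
    # Dot-product relevance score for each time step (assumed scoring rule).
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


def fuse(cnn_features, blstm_features):
    """Feature fusion by concatenation (an assumed fusion rule)."""
    return list(cnn_features) + list(blstm_features)


# Toy values only; real inputs would be derived from MFCC frames.
hidden_states = [[0.1, 0.2], [0.9, 0.4], [0.3, 0.8]]
query = [1.0, 0.5]
cnn_features = [0.7, 0.1, 0.5]

pooled, weights = attention_pool(hidden_states, query)
fused = fuse(cnn_features, pooled)
```

In a full model the fused vector would feed a classifier layer that outputs the depressed / not-depressed decision; that layer is omitted here because the abstract gives no detail about it.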

DOI: 10.11684/j.issn.1000-310X.2022.05.020

Keywords: depression recognition; speech analysis; classification

Funding:

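Author	Affiliation	E-mail
Wu Qing (吴情)	Guangxi Normal University	wq15277396121@163.com
Hu Weiping (胡维平)	School of Electronic Engineering, Guangxi Normal University
Chen Dandan (陈丹丹)	School of Electronic Engineering, Guangxi Normal University
Xiao Ting (肖婷)	School of Electronic Engineering, Guangxi Normal University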

