Class-Imbalanced Pathological Voice Detection with Data Augmentation and Complex Feature Optimization
Submitted: 2024-08-14  Revised: 2024-12-25
This paper aims to improve the accuracy of pathological voice classification by developing a detection system for class-imbalanced data, based on data augmentation and complex feature optimization. First, thirty-two speech features are analyzed and grouped into two categories: time-domain features and frequency-domain features. Second, an improved synthetic minority over-sampling technique is employed to augment and balance the dataset. Third, the efficient correlation-based feature selection algorithm and the boxplot method are applied together to optimize and integrate the multidimensional speech features, providing a comprehensive evaluation of the discriminative ability of each feature. Finally, the classification performance of different feature combinations is analyzed and verified in detail using a Random Forest classifier. Experimental results demonstrate that the optimized feature set (To, Fatr, Jita, sAPQ, vAm, NHR) classifies four voice disorders (vocal nodules, polyps, edema, and paralysis) with an accuracy of 88.6%, a recall of 88.4%, an F1 score of 88.4%, and an AUC of 99.7%.
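To illustrate the augmentation step, the following is a minimal sketch of the standard synthetic minority over-sampling technique (SMOTE), not the improved variant used in the paper, whose details the abstract does not give. The function name `smote`, its parameters, and the two-dimensional toy feature vectors are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a
    minority-class sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: four minority-class samples with two features each.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote(minority, n_new=6, k=3)
balanced_minority = minority + new_points
```

Each synthetic sample lies on a line segment between two real minority samples, so the minority class grows without simply duplicating existing recordings.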
Keywords: Pathological voice  Data augmentation  Complex features  Efficient Correlation-based Feature Selection  Box plot
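Wu Yaqin* (武雅琴), School of Software, Shanxi Agricultural University, wyq0902@sxau.edu.cn
Zhang Jiaqing (张佳庆), School of Software, Shanxi Agricultural University, zhangjq2486@163.com
Zhang Tao (张涛), School of Electrical and Information Engineering, Tianjin University, zhangtao@tju.edu.cn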