Vol. 38, No. 4                          Journal of Applied Acoustics                      July 2019

             ⋄ Academic paper in honor of Academician LI Qihu's 80th birthday ⋄



               Review of single-channel speech enhancement algorithms based on
                        dictionary learning and sparse representation ∗

                           YE Zhongfu †   ZHU Yuanyuan   JIA Xiangyu

      (Department of Electronic Engineering and Information Science, School of Information Science and Technology,
                       University of Science and Technology of China, Hefei 230027, China)

                 Abstract  Recovering a clean speech signal from a noisy one has long been an active problem in
                 signal processing. In recent years, a number of single-channel speech enhancement algorithms based
                 on dictionary learning and sparse representation have been proposed. These algorithms exploit the
                 sparsity of speech signals in the time-frequency domain: they construct dictionaries by learning the
                 structural characteristics and regularities of training samples, and then estimate the clean speech
                 by projecting the noisy speech signal onto the learned dictionaries. For cases in which the training
                 samples and the test data are mismatched, supervised non-negative matrix factorization methods have
                 been combined with conventional statistical-model-based speech enhancement methods, so that the
                 speech and noise dictionaries are updated during the enhancement stage before the clean speech is
                 estimated. This paper first introduces the signal model for single-channel speech enhancement, then
                 describes four typical enhancement methods, and finally discusses possible future research directions.
                 Key words  Single-channel speech enhancement, Sparse representation, Dictionary learning
                 CLC number: TN912.3           Document code: A          Article ID: 1000-310X(2019)04-0645-08
                 DOI: 10.11684/j.issn.1000-310X.2019.04.022
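
The pipeline summarized in the abstract (learn a dictionary per source from training data, project the noisy signal onto the dictionaries, estimate the clean speech) can be illustrated with a minimal sketch. It is a generic supervised non-negative matrix factorization example on toy magnitude spectrograms, not a reproduction of any of the four methods the paper reviews. The function names (`nmf`, `enhance`, `outer`, `sq_error`), the Frobenius-norm multiplicative updates, the Wiener-like soft mask, and the toy data are all assumptions made for illustration. The short-time Fourier transform and the reuse of the noisy phase for resynthesis are omitted.

```python
import random

EPS = 1e-12  # small constant that guards against division by zero


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def mult_update(X, num, den):
    """Element-wise multiplicative update X * num / den."""
    return [[x * n / (d + EPS) for x, n, d in zip(xr, nr, dr)]
            for xr, nr, dr in zip(X, num, den)]


def nmf(V, rank, n_iter=200, W_fixed=None, seed=0):
    """Approximate V (bins x frames) by W H with non-negative factors,
    using multiplicative updates for the Frobenius-norm cost.
    If W_fixed is given, the dictionary is kept and only H is fitted."""
    rng = random.Random(seed)
    if W_fixed is None:
        W = [[rng.random() + 0.1 for _ in range(rank)] for _ in V]
    else:
        W = W_fixed
    H = [[rng.random() + 0.1 for _ in V[0]] for _ in range(rank)]
    for _ in range(n_iter):
        Wt = transpose(W)
        H = mult_update(H, matmul(Wt, V), matmul(matmul(Wt, W), H))
        if W_fixed is None:
            Ht = transpose(H)
            W = mult_update(W, matmul(V, Ht), matmul(W, matmul(H, Ht)))
    return W, H


def enhance(Y, W_speech, W_noise, n_iter=200):
    """Estimate the clean magnitude spectrogram from the noisy one, Y."""
    k = len(W_speech[0])
    W = [ws + wn for ws, wn in zip(W_speech, W_noise)]  # [W_speech | W_noise]
    _, H = nmf(Y, len(W[0]), n_iter=n_iter, W_fixed=W)
    S = matmul(W_speech, H[:k])  # speech part of the reconstruction
    N = matmul(W_noise, H[k:])   # noise part of the reconstruction
    # Wiener-like soft mask applied to the noisy magnitudes.
    return [[y * s / (s + n + EPS) for y, s, n in zip(yr, sr, nr)]
            for yr, sr, nr in zip(Y, S, N)]


def outer(w, h):
    """Rank-one spectrogram: spectral pattern w scaled by activations h."""
    return [[wi * hj for hj in h] for wi in w]


def sq_error(A, B):
    """Sum of squared element-wise differences between two matrices."""
    return sum((a - b) ** 2 for ar, br in zip(A, B) for a, b in zip(ar, br))


# Toy data (hypothetical): one spectral pattern per source, 4 frequency bins.
w_s, w_n = [1.0, 0.5, 0.0, 0.0], [0.0, 0.2, 1.0, 0.6]
W_speech, _ = nmf(outer(w_s, [1.0, 2.0, 0.5, 1.5]), rank=1)  # training stage
W_noise, _ = nmf(outer(w_n, [0.8, 0.3, 1.2, 0.6]), rank=1)
S_true = outer(w_s, [0.7, 1.4, 0.2])
N_true = outer(w_n, [1.0, 0.5, 0.9])
Y = [[s + n for s, n in zip(sr, nr)] for sr, nr in zip(S_true, N_true)]
S_hat = enhance(Y, W_speech, W_noise)  # enhancement stage
err_noisy, err_enhanced = sq_error(Y, S_true), sq_error(S_hat, S_true)
print(err_noisy, err_enhanced)
```

In this sketch the dictionaries stay fixed during enhancement, which corresponds to the matched-training case. The mismatched case described in the abstract would additionally update the speech and noise dictionaries in the enhancement stage, which this example does not attempt.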


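             Received 2019-01-29; accepted 2019-04-09
             ∗ Supported by the National Natural Science Foundation of China (61671418)
             About the author: YE Zhongfu (1959- ), male, born in Tongcheng, Anhui; professor and doctoral supervisor; research interests: signal and information processing.
             † Corresponding author. E-mail: yezf@ustc.edu.cn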