111–119.
Wang Guoliang, Chen Mengnan, Chen Lei. An end-to-end Chinese speech synthesis scheme based on Tacotron 2[J]. Journal of East China Normal University (Natural Science), 2019(4): 111–119.
[22] 张亚强. 基于迁移学习和自学习情感表征的情感语音合成[D]. 北京: 北京邮电大学, 2019.
Zhang Yaqiang. Emotional speech synthesis based on transfer learning and self-learned emotion representation[D]. Beijing: Beijing University of Posts and Telecommunications, 2019.
[23] Skerry-Ryan R J, Battenberg E, Ying X, et al. Towards end-to-end prosody transfer for expressive speech synthesis with Tacotron[C]//International Conference on Machine Learning. PMLR, 2018: 4693–4702.
[24] Cho K, Merrienboer B V, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. Computer Science, 2014, arXiv: 1406.1078.
[25] Tits N, Haddad K E, Dutoit T. Exploring transfer learning for low resource emotional TTS[C]//Proceedings of SAI Intelligent Systems Conference. Springer, Cham, 2019.
[26] Zhou K, Sisman B, Liu R, et al. Emotional voice conversion: theory, databases and ESD[J]. Speech Communication, 2022, 137: 1–18.
[27] 应雨婷. 基于循环神经网络的中文语音合成研究与应用[D]. 南京: 东南大学, 2019.
Ying Yuting. Research and application of Chinese speech synthesis based on recurrent neural networks[D]. Nanjing: Southeast University, 2019.
[28] 曹欣怡. 基于韵律参数优化的情感语音合成[D]. 南京: 南京师范大学, 2020.
Cao Xinyi. Emotional speech synthesis based on prosodic parameter optimization[D]. Nanjing: Nanjing Normal University, 2020.
[29] Pan S J, Yang Q. A survey on transfer learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345–1359.
[30] 庄福振, 罗平, 何清, 等. 迁移学习研究进展[J]. 软件学报, 2015, 26(1): 26–39.
Zhuang Fuzhen, Luo Ping, He Qing, et al. Survey on transfer learning research[J]. Journal of Software, 2015, 26(1): 26–39.
[31] Gibiansky A, Arik S Ö, Diamos G F, et al. Deep voice 2: multi-speaker neural text-to-speech[C]//31st Conference on Neural Information Processing Systems (NIPS). Long Beach, 2017.
[32] 都格草, 才让卓玛, 南措吉, 等. 基于神经网络的藏语语音合成[J]. 中文信息学报, 2019, 33(2): 75–80.
Dou Gecao, Cai Rangzhuoma, Nan Cuoji, et al. Neural network based Tibetan speech synthesis[J]. Journal of Chinese Information Processing, 2019, 33(2): 75–80.
[33] Wu X, Cao Y, Wang M, et al. Rapid style adaptation using residual error embedding for expressive speech synthesis[C]//Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2018: 3072–3076.
[34] Kubichek R. Mel-cepstral distance measure for objective speech quality assessment[C]//Communications, Computers and Signal Processing, 1993 IEEE Pacific Rim Conference on. IEEE, 19–21 May, 1993.
[35] Yan C, Zhang G, Ji X, et al. The feasibility of injecting inaudible voice commands to voice assistants[J]. IEEE Transactions on Dependable and Secure Computing, 2019, 18(3): 1108–1124.
[36] 赵力, 黄程韦. 实用语音情感识别中的若干关键技术[J]. 数据采集与处理, 2014, 29(2): 157–170.
Zhao Li, Huang Chengwei. Key technologies in practical speech emotion recognition[J]. Journal of Data Acquisition and Processing, 2014, 29(2): 157–170.