AiCon Global Artificial Intelligence and Machine Learning Technology Conference

The Path of Speech Technology in Practice at Xiaomi, by Yujun Wang (王育军)

2. 2000-2002: speaker recognition; 1996-2000: automation
5. What Xiaomi is doing
10. 8 billion+ times; 34 million+
11. The growth path of Xiaomi speech
13. 0 → 2; 67 → 97.2%
14. In ……, achieved
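22. Breaking through the speech ceiling, never stopping
23. Multi-channel wake-up; competing speakers; speech synthesis front end; speech recognition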
24. 93.2%; 2 m; ≈ 88.5%; 93.3%; 96.3%
25. 5%, 9%, 12%, 8% relative
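26. Speech denoising; speaker adaptation; deep language model
27. [Chart: vertical axis 0% to 30%; series: simulated, real recordings; categories: clean, noisy, feature mapping, mapping + posterior]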
28.
  Changhao Shan, Junbo Zhang, Yujun Wang, Lei Xie, "Attention-Based End-to-End Speech Recognition on Voice Search," ICASSP 2018.
  Ke Wang, Junbo Zhang, Yujun Wang, Lei Xie, "Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model," Interspeech 2018.
  Ke Wang, Junbo Zhang, Sining Sun, Yujun Wang, Fei Xiang, Lei Xie, "Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition," Interspeech 2018.
  Changhao Shan, Junbo Zhang, Yujun Wang, Lei Xie, "Attention-based End-to-End Models for Small-Footprint Keyword Spotting," Interspeech 2018.
  Ming Liu, Yujun Wang, Jin Wang, Jing Wang, Xiang Xie, "Speech Enhancement Method Based on LSTM Neural Network for Speech Recognition," ICSP 2018.
  Zhang H, Zhang J, Wang Y, "Sequence-to-sequence Models for Small-Footprint Keyword Spotting," arXiv preprint, 2018.
  Zhang H, Zhang J, Wang Y, "End-to-End Models with Auditory Attention in Multi-Channel Keyword Spotting," arXiv preprint, 2018.
  Liu M, Wang Y, Wang J, Wang J, Shan Y, Xie X, "Speech Enhancement for Recognition Based on Multi-Objective Learning Tasks with GRU Network."
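30. TV is the mother of speech; put human effort where human effort is most needed; cross-shaped development, staying true to the original aspiration; bring the old techniques at the bottom of the trunk out into the light more often; there is no reason to stop innovating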
