Including expressive and controllable text-to-speech synthesis, voice conversion, singing voice synthesis, singing voice conversion ...
Including deep learning approaches, keyword spotting, unsupervised pretraining, data augmentation, mispronunciation detection and diagnosis ...
Including speaker verification, speaker representation learning, speaker diarization, adversarial attack and defense, anti-spoofing ...
Including speech enhancement, speech separation, target speaker extraction, singing voice separation ...
Including speech emotion recognition, emotion recognition in conversations, speech emphasis detection, user intention understanding in speech interactive systems ...
Including audio-visual bimodal modeling, talking avatar, audio-visual speech separation, natural language understanding and generation ...