Department of Computer Science and Technology
Bachelor of Computer Science & Technology, Tsinghua University, Beijing, China, 1995;
Master of Computer Application Technology, Tsinghua University, Beijing, China, 1999;
Ph.D. in Computer Application Technology, Tsinghua University, Beijing, China, 1999.
Speech Recognition, Speaker Recognition, Speech Affective Computing
National Natural Science Foundation of China: Affective Computing Theories and Methods (2005-2008);
International Joint Research Project: Design and Implementation of Voice Spam Filter (2007-2008);
International Joint Research Project: Research on Robustness of Speaking Style for Speaker Recognition (2008-2009);
International Joint Research Project: Research on Speech Emotion Recognition (2008-2010).
As one of the most convenient and effective means of human-computer communication, speech conveys rich information such as meaning, voice biometrics, and emotional state. My research group aims to extract and identify this information from speech signals. We have built a universal framework for spontaneous speech understanding, speaker verification/identification, and emotion recognition. I am also exploring the underlying mechanisms of speech communication, in particular the structured information of speech.
My research group also addresses robustness problems in speaker recognition caused by inter- and intra-speaker variations, such as mismatched channels, emotional states, and speaking styles. We have published a cohort-based speaker model synthesis method and an emotion attribute projection algorithm. We have also proposed a GMM supervector-based SVM with spectral features for speech emotion recognition, as well as an ANN-based decision fusion method.
 Wei Wu, Thomas Fang Zheng, Mingxing Xu, Frank Song. A Cohort-based Speaker Model Synthesis for Mismatched Channels in Speaker Verification. IEEE Trans. on Audio, Speech and Language Processing, vol. 15, no. 6, pp. 1893-1903, 2007.
 Lu Xu, Mingxing Xu, Dali Yang. ANN based decision fusion for speech emotion recognition. Proc. 12th Euro. Conf. on Speech Communication and Technology (InterSpeech 2009), Brighton, UK, 2009, pp. 2035-2038.
 Hao Hu, Mingxing Xu, Wei Wu. GMM supervector based SVM with spectral features for speech emotion recognition. Proc. 32nd IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, USA, 2007, pp. 413-416.
 Hao Hu, Mingxing Xu, Wei Wu. Fusion of global statistical and segmental spectral features for speech emotion recognition. Proc. 10th Euro. Conf. on Speech Communication and Technology (InterSpeech 2007), Antwerp, Belgium, 2007, pp. 2269-2272.
 Huanjun Bao, Mingxing Xu, Fang Zheng. Emotion attribute projection for speaker recognition on emotional speech. Proc. 10th Euro. Conf. on Speech Communication and Technology (InterSpeech 2007), Antwerp, Belgium, 2007, pp. 758-761.
 Wei Wu, Fang Zheng, Mingxing Xu, et al. Study on speaker verification on emotional speech. Proc. 9th Intl. Conf. on Spoken Language Processing (ICSLP 2006), Pittsburgh, Pennsylvania, USA, 2006, pp. 2102-2105.