Research

Topic 1. Video Understanding: Action Recognition, Detection, Prediction

Understanding human actions is a fundamental problem in video understanding. We mainly focus on action recognition, anomaly detection, spatio-temporal detection, temporal localization, and action prediction.

  1. Muchao Ye, Xiaojiang Peng, Yu Qiao, et al. AnoPCN: Video Anomaly Detection via Deep Predictive Coding Network. ACM Multimedia, 2019.

  2. Xiaojiang Peng and Cordelia Schmid. Multi-region two-stream R-CNN for action detection. European Conference on Computer Vision (ECCV), 2016. [PDF, code]

  3. Xiaojiang Peng, Limin Wang, Yu Qiao, et al. Bag of Visual Words and Fusion Methods for Action Recognition: Comprehensive Study and Good Practice. Computer Vision and Image Understanding (CVIU), 2016. [PDF, code]

  4. Xiaojiang Peng, Yu Qiao, et al. Action Recognition with Stacked Fisher Vectors. European Conference on Computer Vision (ECCV), 2014. [PDF]

  5. Xiaojiang Peng, Yu Qiao, et al. Boosting VLAD with Supervised Dictionary Learning and High-Order Statistics. European Conference on Computer Vision (ECCV), 2014. [PDF]

Topic 2. Face Recognition and 3D Reconstruction

Facial analysis is one of the most successful real-world applications of computer vision. We mainly focus on cross-modal face recognition and 3D face reconstruction.

  1. Zhongying Deng, Xiaojiang Peng, Yu Qiao, Zhifeng Li. Mutual Component Convolutional Neural Networks for Heterogeneous Face Recognition. IEEE Transactions on Image Processing, 2019. [PDF]

  2. Xiaoxing Zeng, Xiaojiang Peng, Yu Qiao. DF2Net: A Dense-Fine-Finer Network for Detailed 3D Face Reconstruction. ICCV, 2019.

  3. Zhongying Deng, Xiaojiang Peng, Yu Qiao. Residual Compensation Networks for Heterogeneous Face Recognition. AAAI, 2019. [PDF]

  4. Guosheng Hu, Xiaojiang Peng, et al. Frankenstein: Learning Deep Face Representations using Small Data. IEEE Transactions on Image Processing, 2018. [PDF]

Topic 3. Affective Computing

Emotions are usually reflected in facial expressions. Facial emotion analysis aims to recognize human emotions from images or videos.

  1. Debin Meng, Xiaojiang Peng, Yu Qiao, et al. Frame Attention Networks for Facial Expression Recognition in Videos. ICIP, 2019. [PDF]

  2. Jin Ye, Xiaojiang Peng, Yu Qiao, et al. Visual-Textual Sentiment Analysis in Product Reviews. ICIP, 2019.

  3. Debin Meng, Xiaojiang Peng, Yu Qiao, et al. Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition. International Conference on Multimodal Interaction (ICMI’19), ACM.

  4. Kai Wang, Xiaojiang Peng, Yu Qiao, et al. Cascade Attention Networks For Group Emotion Recognition with Face, Body and Image Cues. International Conference on Multimodal Interaction (ICMI’18), ACM. [PDF]

  5. Lianzhi Tan, Xiaojiang Peng, Yu Qiao, et al. Group Emotion Recognition with Individual Facial Emotion CNNs and Global Image Based CNNs. International Conference on Multimodal Interaction (ICMI’17), ACM. [PDF]

Topic 4. Weakly-Supervised Deep Learning

Deep learning requires big data, but what about annotations? Do we really need every sample to be annotated? How far can we go with a reduced set of annotations? Can we compensate for the lack of annotations with more computation? Is it better to use a few clean annotations or many noisy ones?

  1. Qing Li, Xiaojiang Peng, Yu Qiao, et al. Product Image Recognition with Guidance Learning and Noisy Supervision. arXiv, 2019. [PDF]