Emotion recognition in video using deep learning method with subtract pre-processing

Student thesis: Master's Thesis

Abstract

The main purpose of this thesis is to recognise human facial expressions using a deep learning method. It presents a new preprocessing method for extracting expression features from video, and then applies deep learning methods to analyse human emotion. A facial expression is usually regarded as a snapshot of a person's disposition at a fixed moment; recently, however, researchers have recognised the importance of temporal information for expression recognition, and the feature-extraction capability of deep learning has also attracted attention. The method used in this thesis exploits temporal information to distinguish expressions.

The thesis is organised into several chapters describing the background technology of expression analysis, covering: image and video representation; the fundamentals of convolutional neural networks; the basic principles of recurrent neural networks; face features and traditional classification methods; the application of deep learning methods to video; object detection; and a number of classical deep learning models. Together, this background supports a complete video facial expression analysis scheme. Chapter 1 sets out the overall plan of the thesis. Chapter 2 systematically introduces the relevant deep learning knowledge, including the structure of deep convolutional networks, the structure of deep recurrent networks, the use of activation functions, classical classification networks and object detection networks. Facial expression recognition also requires preprocessing, including face detection and face alignment; Chapter 2 therefore introduces face processing algorithms, such as the Viola-Jones face detection model and multi-face detection models, along with practical model-training knowledge and experience in analysing experimental results. The chapter closes with the classical datasets used in this work.
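The convolutional building blocks reviewed in Chapter 2 can be illustrated with a minimal sketch. This is not the thesis's own network, only a hand-rolled "valid" convolution followed by a ReLU activation in NumPy; the kernel and input below are made-up examples.

```python
import numpy as np

def conv2d_valid(x, k):
    """Minimal 2-D 'valid' convolution (strictly cross-correlation,
    as commonly implemented in deep learning frameworks)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot product of the kernel with one sliding window
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    """Rectified linear unit: zero out negative responses."""
    return np.maximum(x, 0)

img = np.array([[1., 2., 3.],
                [4., 5., 6.],
                [7., 8., 9.]])
edge = np.array([[-1., 1.]])          # horizontal-difference kernel
feat = relu(conv2d_valid(img, edge))  # one feature map
print(feat.shape)                     # (3, 2)
```

In a real classification network many such kernels are learned from data and stacked in layers, but the per-layer computation is exactly this windowed product plus a nonlinearity.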
In Chapter 3, a new preprocessing method for extracting temporal features from video is proposed. The complete expression recognition pipeline comprises face detection, face alignment and the training of a deep learning classifier. The RAVDESS video dataset is used to train the deep learning model. After suitable training, the model was tested on recorded and real-time video and achieved acceptable results. Chapter 4 summarises and assesses the practical work and, combining the experience gained with the problems encountered, discusses directions for future work.
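The thesis does not specify its subtraction scheme in this abstract, but the general idea of subtract preprocessing can be sketched as frame differencing: subtracting consecutive frames so that static content cancels and only temporal change survives. The toy "video" below is a made-up example.

```python
import numpy as np

def frame_difference(frames):
    """Subtract consecutive frames to isolate temporal changes.

    frames: array of shape (T, H, W), grayscale video frames (uint8).
    Returns an array of shape (T-1, H, W) of absolute differences.
    """
    frames = frames.astype(np.int16)         # avoid uint8 wrap-around
    diffs = np.abs(np.diff(frames, axis=0))  # frame[t+1] - frame[t]
    return diffs.astype(np.uint8)

# Tiny toy "video": three 2x2 frames
video = np.array([
    [[10, 10], [10, 10]],
    [[10, 50], [10, 10]],   # one pixel brightens between frames 0 and 1
    [[10, 50], [10, 10]],   # frame 2 is static relative to frame 1
], dtype=np.uint8)

d = frame_difference(video)
print(d.shape)       # (2, 2, 2)
print(d[0, 0, 1])    # 40 : the changing pixel survives
print(d[1].sum())    # 0  : static content is removed
```

Feeding such difference maps, rather than raw frames, into a classifier is one way to hand the network temporal information directly.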
Date of Award: 31 Mar 2020
Original language: English
Awarding Institution
  • University of Strathclyde
Sponsors: University of Strathclyde
Supervisors: John Soraghan & Gaetano Di Caterina