Hidden Conditional Random Fields for action recognition
In this thesis we apply Hidden Conditional Random Fields (HCRF) to action recognition. HCRF is a classification method that models the structure among local observations. In our system, an image is modelled as a set of hidden part labels conditioned on their local features. For each action class, the probability of an assignment of part labels to local patch features is modelled by a Conditional Random Field (CRF). These class-conditional CRFs are combined into a unified HCRF framework, which treats the assignment of part labels as hidden variables. Within this framework, the model also combines the local patch features with the global feature of an image. The model parameters are trained with a maximum likelihood criterion. We have also evaluated a baseline for HCRF, called the root model, which uses only the global feature and does not include the hidden part labels. The root model is likewise trained with the maximum likelihood criterion.

An extension of HCRF, the Max-Margin Hidden Conditional Random Field (MMHCRF), has also been applied to action recognition. MMHCRF extends HCRF by training with a maximum margin criterion: it sets the model parameters so that the margin between the score of the correct action label and the scores of the other labels is maximized. We have also evaluated a baseline for MMHCRF. Like the root model, this baseline uses only the global feature, but it trains the model parameters with the max-margin criterion.

Based on HCRF and the root model, we have proposed a Part Labels method. This method infers the hidden part labels of each image using the model parameters trained by HCRF. It uses these part labels as a new set of local features, combines them with the global feature, and trains on the combined features in the same way as the root model. We have implemented and evaluated these five models on two datasets: the Weizmann human action dataset and an animal behaviour dataset, the Noldus ABR dataset.
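To make the difference between the two ways of scoring an action label concrete, the following is a minimal sketch, not the thesis implementation. It rests on assumptions that the abstract does not state. The potential is taken to be a sum of a global (root) term, per-patch terms, class/part compatibility terms and pairwise terms. The patches are taken to form a chain. The hidden part labels are enumerated by brute force, which is only feasible for toy sizes. The max-margin score is taken to be the best score over hidden assignments. All names (`score`, `class_posterior`, `max_margin_scores`, `theta`) and all numeric values are hypothetical.

```python
import itertools
import math


def score(y, h, patches, root, theta):
    """Potential for class label y and part-label assignment h (assumed form)."""
    # Global (root) feature term: depends only on the class label.
    s = sum(w * f for w, f in zip(theta["root"][y], root))
    for hj, xj in zip(h, patches):
        # Compatibility between a patch feature and its hidden part label.
        s += sum(w * f for w, f in zip(theta["patch"][hj], xj))
        # Compatibility between the class label and the part label.
        s += theta["class_part"][y][hj]
    # Pairwise terms over neighbouring patches (chain structure is an assumption).
    for hj, hk in zip(h, h[1:]):
        s += theta["edge"][y][hj][hk]
    return s


def all_scores(y, patches, root, theta, n_parts):
    """Scores of every part-label assignment for class y (brute force)."""
    assignments = itertools.product(range(n_parts), repeat=len(patches))
    return [score(y, h, patches, root, theta) for h in assignments]


def class_posterior(patches, root, theta, n_classes, n_parts):
    """HCRF-style posterior: hidden part labels are summed out."""
    totals = [
        sum(math.exp(s) for s in all_scores(y, patches, root, theta, n_parts))
        for y in range(n_classes)
    ]
    z = sum(totals)
    return [t / z for t in totals]


def max_margin_scores(patches, root, theta, n_classes, n_parts):
    """Max-margin-style class scores: best hidden assignment per class (assumed)."""
    return [
        max(all_scores(y, patches, root, theta, n_parts))
        for y in range(n_classes)
    ]


# Toy example: 2 classes, 2 part labels, 2 patches with 2-D features.
theta = {
    "root": [[1.0, 0.0], [0.0, 1.0]],
    "patch": [[0.5, -0.5], [-0.5, 0.5]],
    "class_part": [[0.2, 0.0], [0.0, 0.2]],
    "edge": [[[0.1, 0.0], [0.0, 0.1]], [[0.0, 0.1], [0.1, 0.0]]],
}
patches = [[1.0, 0.0], [0.0, 1.0]]
root = [0.0, 2.0]

posterior = class_posterior(patches, root, theta, n_classes=2, n_parts=2)
margin_scores = max_margin_scores(patches, root, theta, n_classes=2, n_parts=2)
```

In this sketch, maximum likelihood training would adjust `theta` to raise the posterior of the correct class, whereas max-margin training would adjust it to widen the gap between the correct class's entry in `margin_scores` and the others. Dropping the patch, class/part and edge terms leaves only the root term, which corresponds to the global-feature-only baselines.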
Our experiments show that modelling spatial structure in 2D alone is not sufficient for action recognition. The classification results of simpler models, such as the root model and the multi-class SVM, are comparable to those of more complex models such as HCRF. We have also found that the performance of MMHCRF depends heavily on the initialization of its model parameters and on other parameter settings, which makes it less robust than HCRF. The Part Labels method is also less robust than HCRF, but it can be an option for improving performance because it explicitly uses the information in the learned part labels. One of the goals of this project is to investigate alternatives for the automatic rodent behaviour recognition module developed at Noldus IT. We have improved its action classification performance by 15%, promising a more robust action recognition tool for rodent behaviour.