Deep temporal architectures for activity recognition

Grobler, H.
2018-12-05
2009/07/18
2017
Luvhengo, T 2017, Deep temporal architectures for activity recognition, MEng Dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/67779>
S2018
Dissertation (MEng)--University of Pretoria, 2017.

The amount of video content generated increases daily: three hundred hours of video are uploaded to YouTube every 60 seconds. There is a need to sort, summarise, describe, categorise and retrieve video data based on its content (i.e. the activities occurring in the video). Activity recognition (i.e. automatically naming activities) is an important area of video analysis, with applications in robotics, video surveillance, multimedia retrieval, behaviour analysis, disaster warning systems and content-based browsing. Automatically categorising the activities in a video clip poses two main challenges, namely object detection and motion learning: an activity recognition system must detect and localise the agent, and must also learn to categorise the action the agent is performing. This research hypothesises that models incorporating both spatial and temporal aspects of video data should outperform models that learn only spatial or only temporal features on activity recognition tasks. The hypothesis is investigated by developing two deep learning architectures for activity recognition that learn temporally independent and temporally dependent features respectively. [...] minima do not exist. A recurrent network, the structurally constrained gated recurrent unit (SCGRU), which adds contextual feature learning to gated recurrent units (GRUs), is proposed. Adding contextual features stabilises the hidden state of a GRU layer.
The architectures were examined on four benchmark datasets, and the results were analysed to 1) find the best model for activity recognition, 2) examine the model's ability to learn salient temporal features, and 3) examine the model's computational complexity. SCGRU-based models outperform GRU-based models on the majority of the investigated activity recognition models and datasets.

© 2018 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
Unrestricted