A Data Augmentation Method for Skeleton-Based Action Recognition with Relative Features

2021 ◽  
Vol 11 (23) ◽  
pp. 11481
Author(s):  
Junjie Chen ◽  
Wei Yang ◽  
Chenqi Liu ◽  
Leiyue Yao

In recent years, skeleton-based human action recognition (HAR) approaches using convolutional neural network (CNN) models have made tremendous progress in computer vision applications. However, using relative features to depict human actions, in addition to preventing overfitting when the CNN model is trained on a few samples, is still a challenge. In this paper, a new motion image is introduced to transform spatial-temporal motion information into image-based representations. For each skeleton sequence, three relative features are extracted to describe human actions. The three relative features consist of relative coordinates, immediate displacement, and immediate motion orientation. In particular, the relative coordinates introduced in our paper not only depict the spatial relations of human skeleton joints but also provide long-term temporal information. To address the problem of small sample sizes, a data augmentation strategy consisting of three simple but effective data augmentation methods is proposed to expand the training samples. Because the generated color images are small in size, a shallow CNN model is suitable to extract the deep features of the generated motion images. Two small-scale but challenging skeleton datasets were used to evaluate the method, scoring 96.59% and 97.48% on the Florence 3D Actions dataset and UTkinect-Action 3D dataset, respectively. The results show that the proposed method achieved a competitive performance compared with the state-of-the-art methods. Furthermore, the augmentation strategy proposed in this paper effectively solves the overfitting problem and can be widely adopted in skeleton-based action recognition.
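The abstract names the three relative features but does not define them, so the following is only a minimal sketch under stated assumptions: relative coordinates are taken as joint positions relative to a reference joint in the first frame, immediate displacement as the frame-to-frame distance moved by each joint, and immediate motion orientation as the unit direction of that movement. The function name `relative_features` and the parameter `ref_joint` are hypothetical and do not come from the paper.

```python
import math

def relative_features(seq, ref_joint=0):
    """Compute three illustrative relative features for a skeleton sequence.

    seq: list of frames; each frame is a list of (x, y, z) joint tuples.
    ref_joint: index of the reference joint (an assumption, not from the paper).
    """
    # Assumed definition: coordinates relative to the reference joint in frame 0,
    # so later frames also carry information about motion since the start.
    origin = seq[0][ref_joint]
    rel = [[tuple(c - o for c, o in zip(joint, origin)) for joint in frame]
           for frame in seq]

    disp, orient = [], []
    for prev, cur in zip(seq, seq[1:]):
        d_frame, o_frame = [], []
        for p, c in zip(prev, cur):
            delta = tuple(ci - pi for ci, pi in zip(c, p))
            norm = math.sqrt(sum(v * v for v in delta))
            # Assumed definition: displacement is the distance moved per frame.
            d_frame.append(norm)
            # Assumed definition: orientation is the unit direction of movement;
            # a stationary joint gets the zero vector.
            o_frame.append(tuple(v / norm for v in delta) if norm > 0
                           else (0.0, 0.0, 0.0))
        disp.append(d_frame)
        orient.append(o_frame)
    return rel, disp, orient

# Two frames, two joints: the second joint moves by (0, 3, 4).
seq = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 3, 4)]]
rel, disp, orient = relative_features(seq)
```

Each of the three outputs could then be mapped to one color channel of a motion image, although the actual encoding used in the paper is not described in the abstract.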

Data ◽  
2020 ◽  
Vol 5 (4) ◽  
pp. 104
Author(s):  
Ashok Sarabu ◽  
Ajit Kumar Santra

The two-stream convolutional neural network (CNN) has proven to be a great success in action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately, and the two scores are combined to obtain the final scores. In the literature, we observed that most of the methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture with different CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) is applied in order to retrieve long-range temporal features, and to differentiate similar types of sub-actions in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced to the proposed architecture in order to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The results for the proposed architecture show a significant performance increase, and it outperforms the existing methods.
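The score combination described above can be sketched as late fusion of per-class scores. The abstract does not state how the two scores are combined, so the weighted average below, the equal default weights, and the function names `fuse_scores` and `predict` are assumptions made for illustration.

```python
def fuse_scores(spatial, temporal, w_spatial=0.5, w_temporal=0.5):
    """Weighted average of per-class scores from the two streams.

    The weights are hypothetical; the paper may use a different fusion rule.
    """
    if len(spatial) != len(temporal):
        raise ValueError("both streams must score the same set of classes")
    return [w_spatial * s + w_temporal * t for s, t in zip(spatial, temporal)]

def predict(spatial, temporal, **weights):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(spatial, temporal, **weights)
    return max(range(len(fused)), key=fused.__getitem__)

# The spatial stream prefers class 0, the temporal stream prefers class 1.
spatial_scores = [0.6, 0.3, 0.1]
temporal_scores = [0.2, 0.7, 0.1]
fused = fuse_scores(spatial_scores, temporal_scores)
best = predict(spatial_scores, temporal_scores)
```

With equal weights the fused scores favor class 1 here, because the temporal stream is more confident about it than the spatial stream is about class 0.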


2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Qiulin Wang ◽  
Baole Tao ◽  
Fulei Han ◽  
Wenting Wei

The extraction and recognition of human actions has always been a research hotspot in the field of state recognition. It has a wide range of application prospects in many fields. In sports, it can reduce the occurrence of accidental injuries and improve the training level of basketball players. How to extract effective features from the dynamic body movements of basketball players is of great significance. In order to improve the fairness of the basketball game, realize the accurate recognition of the athletes’ movements, and simultaneously improve the level of the athletes and regulate the movements of the athletes during training, this article uses deep learning to extract and recognize the movements of the basketball players. This paper implements a human action recognition algorithm based on deep learning. This method automatically extracts image features through convolution kernels, which greatly improves the efficiency compared with traditional manual feature extraction methods. This method uses the deep convolutional neural network VGG model on the TensorFlow platform to extract and recognize human actions. On the Matlab platform, the KTH and Weizmann datasets are preprocessed to obtain the input image set. Then, the preprocessed dataset is used to train the model to obtain the optimal network model and corresponding data by testing the two datasets. Finally, the two datasets are analyzed in detail, and the specific cause of each action confusion is given. Simultaneously, the recognition accuracy and average recognition accuracy rates of each action category are calculated. The experimental results show that the human action recognition algorithm based on deep learning obtains a higher recognition accuracy rate.


2021 ◽  
Author(s):  
Akila.K

Abstract Background: Human action recognition provides scope for the automatic analysis of current events from video and has varied applications in many fields. Recognizing and understanding human actions from videos remains a difficult problem because of the large variations in human appearance, posture, and body size within the same category. Objective: This paper focuses on a specific issue related to inter-class variation in human action recognition. Approach: To discriminate the human actions among the category, a novel approach based on wavelet packet transformation is used for feature extraction. As we are concentrating on classifying similar actions, non-linearity among the features is analyzed and discriminated by Deterministic Normalized - Linear Discriminant Analysis (DN-LDA). However, the major part of the recognition system relies on the classification part, and the dynamic feeds are classified by a Hidden Markov Model at the final stage based on a rule set. Conclusion: Experimental results have shown that the proposed approach is discriminative for similar human action recognition and well adapted to the inter-class variation.


Drones ◽  
2019 ◽  
Vol 3 (4) ◽  
pp. 82 ◽  
Author(s):  
Asanka G. Perera ◽  
Yee Wei Law ◽  
Javaan Chahl

Aerial human action recognition is an emerging topic in drone applications. Commercial drone platforms capable of detecting basic human actions such as hand gestures have been developed. However, a limited number of aerial video datasets are available to support increased research into aerial human action analysis. Most of the datasets are confined to indoor scenes or object tracking and many outdoor datasets do not have sufficient human body details to apply state-of-the-art machine learning techniques. To fill this gap and enable research in wider application areas, we present an action recognition dataset recorded in an outdoor setting. A free flying drone was used to record 13 dynamic human actions. The dataset contains 240 high-definition video clips consisting of 66,919 frames. All of the videos were recorded from low-altitude and at low speed to capture the maximum human pose details with relatively high resolution. This dataset should be useful to many research areas, including action recognition, surveillance, situational awareness, and gait analysis. To test the dataset, we evaluated the dataset with a pose-based convolutional neural network (P-CNN) and high-level pose feature (HLPF) descriptors. The overall baseline action recognition accuracy calculated using P-CNN was 75.92%.


Inventions ◽  
2020 ◽  
Vol 5 (3) ◽  
pp. 49
Author(s):  
Nusrat Tasnim ◽  
Md. Mahbubul Islam ◽  
Joong-Hwan Baek

Human action recognition has turned into one of the most attractive and demanding fields of research in computer vision and pattern recognition for facilitating easy, smart, and comfortable ways of human-machine interaction. With the massive improvements in research witnessed in recent years, several methods have been suggested for the discrimination of different types of human actions using color, depth, inertial, and skeleton information. Despite having several action identification methods using different modalities, classifying human actions using skeleton joints information in 3-dimensional space is still a challenging problem. In this paper, we conceive an efficacious method for action recognition using 3D skeleton data. First, large-scale 3D skeleton joints information was analyzed and some meaningful pre-processing was performed. Then, a simple straight-forward deep convolutional neural network (DCNN) was designed for the classification of the desired actions in order to evaluate the effectiveness and embonpoint of the proposed system. We also evaluated prior DCNN models such as ResNet18 and MobileNetV2, which outperform existing systems using human skeleton joints information.


Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 1993
Author(s):  
Malik Ali Gul ◽  
Muhammad Haroon Yousaf ◽  
Shah Nawaz ◽  
Zaka Ur Rehman ◽  
HyungWon Kim

Human action recognition has emerged as a challenging research domain for video understanding and analysis. Subsequently, extensive research has been conducted to achieve improved performance for recognition of human actions. Human activity recognition has various real-time applications, such as patient monitoring, in which patients are monitored among a group of normal people and then identified based on their abnormal activities. Our goal is to provide multi-class abnormal action detection in individuals as well as in groups from video sequences to differentiate multiple abnormal human actions. In this paper, the You Only Look Once (YOLO) network is utilized as a backbone CNN model. For training the CNN model, we constructed a large dataset of patient videos by labeling each frame with a set of patient actions and the patient’s positions. We retrained the backbone CNN model with 23,040 labeled images of patients’ actions for 32 epochs. Across each frame, the proposed model allocated a unique confidence score and action label for video sequences by finding the recurrent action label. The present study shows that the accuracy of abnormal action recognition is 96.8%. Our proposed approach differentiated abnormal actions with an improved F1-Score of 89.2%, which is higher than state-of-the-art techniques. The results indicate that the proposed framework can be beneficial to hospitals and elder care homes for patient monitoring.
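One plausible reading of "finding the recurrent action label" is a majority vote over per-frame predictions. The sketch below assumes that reading, and it also assumes that the video-level confidence is the mean confidence of the frames carrying the winning label. The function name `video_label` and the example labels are hypothetical, not taken from the paper.

```python
from collections import Counter

def video_label(frame_preds):
    """Aggregate per-frame (label, confidence) pairs into one video-level result.

    Assumption: the most frequent frame label wins, and its confidence is the
    mean confidence over the frames that carry that label.
    """
    if not frame_preds:
        raise ValueError("at least one frame prediction is required")
    counts = Counter(label for label, _ in frame_preds)
    label, _ = counts.most_common(1)[0]
    confs = [conf for lbl, conf in frame_preds if lbl == label]
    return label, sum(confs) / len(confs)

# Three frames: two vote for "fall", one for "walk".
preds = [("fall", 0.9), ("walk", 0.6), ("fall", 0.7)]
label, confidence = video_label(preds)
```

Voting across frames smooths over isolated per-frame misclassifications, at the cost of ignoring when in the sequence each label occurred.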


2013 ◽  
Vol 859 ◽  
pp. 498-502 ◽  
Author(s):  
Zhi Qiang Wei ◽  
Ji An Wu ◽  
Xi Wang

To identify human daily actions, this paper presents a method that transforms human action recognition into the analysis of feature sequences. The feature sequences, combined with an improved LCS algorithm, are then used to recognize human actions. Data analysis and experimental results show that the method achieves a high recognition rate and fast speed, and that this applied technology has broad prospects.
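The idea of matching feature sequences can be sketched with the standard algorithm, assuming that LCS here stands for longest common subsequence. The code below is the textbook dynamic-programming version, not the improved variant the paper refers to, and the normalized similarity, the `classify` helper, and the template sequences are illustrative assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            # Extend the match on equal symbols, otherwise keep the best so far.
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def classify(seq, templates):
    """Return the template name whose sequence is most similar to seq.

    Similarity is the LCS length divided by the longer sequence length
    (a hypothetical normalization chosen for this sketch).
    """
    def similarity(name):
        t = templates[name]
        return lcs_length(seq, t) / max(len(seq), len(t), 1)
    return max(templates, key=similarity)

# Hypothetical quantized feature sequences for two daily actions.
templates = {"sit": list("aabbcc"), "wave": list("ababab")}
observed = list("aabcc")
action = classify(observed, templates)
```

Because a subsequence need not be contiguous, this kind of matching tolerates dropped or inserted frames in the observed sequence.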


2015 ◽  
Vol 713-715 ◽  
pp. 2152-2155 ◽  
Author(s):  
Shao Ping Zhu

To address the problem of achieving robust human action recognition from image sequences in computer vision, an improved Multiple Instance Learning (MIL) method, guided by the Iterative Querying Heuristic algorithm, is proposed for human action recognition in video image sequences. Experiments show that the new method can quickly recognize human actions and achieve high recognition rates, and results on the Weizmann database validate our analysis.


2014 ◽  
Vol 2014 ◽  
pp. 1-18 ◽  
Author(s):  
Maria J. Santofimia ◽  
Jesus Martinez-del-Rincon ◽  
Jean-Christophe Nebel

Smart Spaces, Ambient Intelligence, and Ambient Assisted Living are environmental paradigms that strongly depend on their capability to recognize human actions. While most solutions rest on sensor value interpretations and video analysis applications, few have realized the importance of incorporating common-sense capabilities to support the recognition process. Unfortunately, human action recognition cannot be successfully accomplished by only analyzing body postures. On the contrary, this task should be supported by profound knowledge of human agency nature and its tight connection to the reasons and motivations that explain it. The combination of this knowledge and the knowledge about how the world works is essential for recognizing and understanding human actions without committing common-senseless mistakes. This work demonstrates the impact that episodic reasoning has in improving the accuracy of a computer vision system for human action recognition. This work also presents formalization, implementation, and evaluation details of the knowledge model that supports the episodic reasoning.


Author(s):  
Wanqing Li ◽  
Zhengyou Zhang ◽  
Zicheng Liu ◽  
Philip Ogunbona

This chapter first presents a brief review of the recent development in human action recognition. In particular, the principle and shortcomings of the conventional Hidden Markov Model (HMM) and its variants are discussed. We then introduce an expandable graphical model that represents the dynamics of human actions using a weighted directed graph, referred to as action graph. Unlike the conventional HMM, the action graph is shared by all actions to be recognized with each action being encoded in one or multiple paths and, thus, can be effectively and efficiently trained from a small number of samples. Furthermore, the action graph is expandable to incorporate new actions without being retrained and compromised. To verify the performance of the proposed expandable graphic model, a system that learns and recognizes human actions from sequences of silhouettes is developed and promising results are obtained.
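The structure described above can be illustrated with a toy weighted directed graph whose nodes are shared across actions while each action keeps its own edge weights. This is only a sketch under assumptions: the chapter's actual model, training procedure, and decoding schemes are not given in the abstract, and the class `ActionGraph`, its methods, the posture node names, and the smoothing constant `eps` are all hypothetical.

```python
import math

class ActionGraph:
    """Toy action graph: shared posture nodes, per-action edge counts."""

    def __init__(self):
        self.counts = {}  # action -> {(from_node, to_node): count}

    def add_path(self, action, path):
        """Add one training path; other actions' weights are left untouched."""
        edges = self.counts.setdefault(action, {})
        for u, v in zip(path, path[1:]):
            edges[(u, v)] = edges.get((u, v), 0) + 1

    def log_score(self, action, path, eps=1e-6):
        """Sum of log transition probabilities of path under one action."""
        edges = self.counts[action]
        total = 0.0
        for u, v in zip(path, path[1:]):
            out = sum(c for (a, _), c in edges.items() if a == u)
            p = edges.get((u, v), 0) / out if out else 0.0
            total += math.log(max(p, eps))  # eps avoids log(0) on unseen edges
        return total

    def recognize(self, path):
        """Return the action whose weights best explain the path."""
        return max(self.counts, key=lambda a: self.log_score(a, path))

graph = ActionGraph()
graph.add_path("wave", ["s0", "s1", "s2", "s1"])
graph.add_path("clap", ["s0", "s3", "s0", "s3"])
result = graph.recognize(["s0", "s1", "s2"])

# A new action can be added later without touching the existing ones.
graph.add_path("bow", ["s0", "s4", "s0"])
```

In this sketch, adding "bow" only creates a new entry of edge weights, which mirrors the expandability claim in the abstract in a simplified form.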

