Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition

Computational Intelligence and Neuroscience ◽

10.1155/2021/8890808 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Zhengkui Weng ◽

Zhipeng Jin ◽

Shuangxi Chen ◽

Quanquan Shen ◽

Xiangyang Ren ◽

...

Keyword(s):

Action Recognition ◽

Geodesic Distance ◽

Weighted Graph ◽

High Dimensionality ◽

Motion Data ◽

Minimal Geodesic ◽

Temporal Encoding ◽

Temporal Redundancy ◽

Independent Motion

Convolutional neural networks (CNNs) have advanced rapidly in recent years. However, the high dimensionality of video, the rich dynamics of human motion, and varied background interference make it difficult for traditional CNNs to capture complex motion data in videos. We propose a novel framework for video action recognition, the attention-based temporal encoding network (ATEN) with a background-independent motion mask (BIMM). First, we introduce a motion segmentation approach based on a boundary prior, using the minimal geodesic distance in an undirected weighted graph. Next, we propose a dynamic contrast segmentation strategy for segmenting moving objects in complex environments. We then build the BIMM to enhance the moving object by suppressing the irrelevant background in each frame. Furthermore, we design a long-range attention mechanism inside ATEN that captures the long-term dependencies of complex, non-periodic actions by automatically focusing on semantically vital frames rather than processing all sampled frames equally. The attention mechanism can thus suppress temporal redundancy and highlight discriminative frames. Finally, the framework is evaluated on the HMDB51 and UCF101 datasets. Experimental results show that ATEN with BIMM achieves 94.5% and 70.6% accuracy, respectively, outperforming a number of existing methods on both datasets.
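The boundary-prior idea in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a 4-connected grid graph over a per-pixel motion-magnitude map, edge weights equal to the absolute difference between neighbouring values, and a fixed threshold. The names `geodesic_to_boundary`, `motion_mask`, and `flow` are hypothetical. Each cell receives its minimal geodesic distance to the image border, so cells that are "far" from the border in this sense are treated as moving foreground.

```python
import heapq


def geodesic_to_boundary(grid):
    """Minimal geodesic distance from every cell to the image boundary.

    Cells are nodes of an undirected 4-connected graph; the weight of an
    edge is the absolute difference between the two cell values
    (an assumption made for this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    # Boundary prior: every border cell is a zero-distance source.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                dist[r][c] = 0.0
                heapq.heappush(heap, (0.0, r, c))
    # Multi-source Dijkstra over the grid graph.
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + abs(grid[nr][nc] - grid[r][c])
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist


def motion_mask(grid, threshold):
    """Binary mask: 1 where the geodesic distance exceeds the threshold."""
    dist = geodesic_to_boundary(grid)
    return [[1 if d > threshold else 0 for d in row] for row in dist]


# Hypothetical 5x5 motion-magnitude map with a moving blob in the centre.
flow = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
mask = motion_mask(flow, threshold=0.5)
```

In this toy map only the centre cell is separated from the border by a large weight, so it is the only cell marked in `mask`. A real pipeline would likely operate on superpixels and learned or adaptive thresholds rather than raw cells and a constant.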

Modeling Long-Term Interactions to Enhance Action Recognition

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9412148 ◽

2021 ◽

Author(s):

Alejandro Cartas ◽

Petia Radeva ◽

Mariella Dimiccoli

Keyword(s):

Action Recognition

Wearable Wireless Physiological Monitoring System Based on Multi-Sensor

Electronics ◽

10.3390/electronics10090986 ◽

2021 ◽

Vol 10 (9) ◽

pp. 986

Author(s):

Hongru Li ◽

Guiling Sun ◽

Yue Li ◽

Runzhuo Yang

Keyword(s):

Monitoring System ◽

Health Management ◽

Wearable Technology ◽

Physiological Monitoring ◽

Motion Data ◽

Low Load ◽

Measure Heart Rate ◽

Body Surface Temperature ◽

Continuous Work

The purpose of wearable technology is to use multimedia, sensors, and wireless communication to integrate specific technology into users' clothes or accessories. With the help of various sensors, a physiological monitoring system can collect, process, and transmit physiological signals without causing harm. Wearable technology is widely used in patient monitoring and personal health management because it is low-load, mobile, and easy to use; it also supports long-term continuous operation and wireless transmission. In this paper, we establish a Wi-Fi-based physiological monitoring system that accurately measures heart rate, body surface temperature, and motion data, and that quickly detects abnormal heart rates and alerts the user.

Timing, Storage, and Comparison of Stimulus Duration Engage Discrete Anatomical Components of a Perceptual Timing Network

Journal of Cognitive Neuroscience ◽

10.1162/jocn.2008.20153 ◽

2008 ◽

Vol 20 (12) ◽

pp. 2185-2197 ◽

Cited By ~ 99

Author(s):

Jennifer T. Coull ◽

Bruno Nazarian ◽

Franck Vidal

Keyword(s):

Stimulus Duration ◽

Brain Activity ◽

Temporal Discrimination ◽

Superior Temporal Gyrus ◽

Motor Preparation ◽

Long Term Memory ◽

Temporal Encoding ◽

Perceptual Timing ◽

The Right

The temporal discrimination paradigm requires subjects to compare the duration of a probe stimulus to that of a sample previously stored in working or long-term memory, thus providing an index of timing that is independent of a motor response. However, the estimation process itself comprises several component cognitive processes, including timing, storage, retrieval, and comparison of durations. Previous imaging studies have attempted to disentangle these components by simply measuring brain activity during early versus late scanning epochs. We aim to improve the temporal resolution and precision of this approach by using rapid event-related functional magnetic resonance imaging to time-lock the hemodynamic response to presentation of the sample and probe stimuli themselves. Compared to a control (color-estimation) task, which was matched in terms of difficulty, sustained attention, and motor preparation requirements, we found selective activation of the left putamen for the storage (“encoding”) of stimulus duration into working memory (WM). Moreover, increased putamen activity was linked to enhanced timing performance, suggesting that the level of putamen activity may modulate the depth of temporal encoding. Retrieval and comparison of stimulus duration in WM selectively activated the right superior temporal gyrus. Finally, the supplementary motor area was equally active during both sample and probe stages of the task, suggesting a fundamental role in timing the duration of a stimulus that is currently unfolding in time.

Learning Long-Term Dependencies for Action Recognition with a Biologically-Inspired Deep Network

2017 IEEE International Conference on Computer Vision (ICCV) ◽

10.1109/iccv.2017.84 ◽

2017 ◽

Cited By ~ 11

Author(s):

Yemin Shi ◽

Yonghong Tian ◽

Yaowei Wang ◽

Wei Zeng ◽

Tiejun Huang

Keyword(s):

Action Recognition ◽

Biologically Inspired ◽

Deep Network

3D Action Recognition and Long-Term Prediction of Human Motion

Lecture Notes in Computer Science - Computer Vision Systems ◽

10.1007/978-3-540-79547-6_3 ◽

2008 ◽

pp. 23-32 ◽

Cited By ~ 15

Author(s):

Markus Hahn ◽

Lars Krüger ◽

Christian Wöhler

Keyword(s):

Action Recognition ◽

Human Motion ◽

Term Prediction ◽

Long Term Prediction

Kinect-Based Limb Rehabilitation Methods

Data Analytics in Medicine ◽

10.4018/978-1-7998-1204-3.ch052 ◽

2020 ◽

pp. 1006-1022

Author(s):

Yongji Yang ◽

Zhiguo Xiao ◽

Furen Jiang

Keyword(s):

Real Time ◽

Action Recognition ◽

Health Informatics ◽

Role Model ◽

Suggested Approach ◽

Motion Data ◽

Time Simulation ◽

Limb Rehabilitation ◽

Real Time Information ◽

Time Information

Within the context of health informatics, this article discusses how real-time information on human skeleton movements can be conveniently captured with a Kinect-based depth sensor. It highlights an effective action recognition method that uses Unity3D to build virtual models of characters and scenes. Kinect somatosensory cameras can identify a user's motion data and feed it back to the virtual role model to drive the avatar. The aim is to achieve real-time simulation modeling and to drive operation instructions, based on the design and identification verification, in support of limb rehabilitation. The hardware of such a health informatics system is simple and inexpensive, and it meaningfully improves the user experience. In conclusion, the suggested approach can contribute significantly to physical rehabilitation therapy.

Deep spectral feature pyramid in the frequency domain for long-term action recognition

Journal of Visual Communication and Image Representation ◽

10.1016/j.jvcir.2019.102650 ◽

2019 ◽

Vol 64 ◽

pp. 102650

Author(s):

Gaoyun An ◽

Zhenxing Zheng ◽

Dapeng Wu ◽

Wen Zhou

Keyword(s):

Frequency Domain ◽

Action Recognition ◽

Spectral Feature ◽

Feature Pyramid

Action recognition based on 2D skeletons extracted from RGB videos

MATEC Web of Conferences ◽

10.1051/matecconf/201927702034 ◽

2019 ◽

Vol 277 ◽

pp. 02034

Author(s):

Sophie Aubry ◽

Sohaib Laraba ◽

Joëlle Tilmanne ◽

Thierry Dutoit

Keyword(s):

Neural Networks ◽

Image Classification ◽

Action Recognition ◽

State Of The Art ◽

Video Stream ◽

Motion Data ◽

RGB Images ◽

Human Pose ◽

2D Images ◽

Made In

This paper proposes a methodology for recognizing actions from RGB videos that takes advantage of recent breakthroughs in deep learning. Following the development of convolutional neural networks (CNNs), research has been conducted on transforming skeletal motion data into 2D images. Building on multiple works that study the conversion of RGB-D data into 2D images, the proposed solution requires only RGB videos instead of RGB-D videos. From a video stream (RGB images), a two-dimensional skeleton of 18 joints is extracted for each detected body with a DNN-based human pose estimator called OpenPose. The skeleton data are encoded into the red, green, and blue channels of images, and different ways of encoding motion data into images were studied. We successfully use state-of-the-art deep neural networks designed for image classification to recognize actions. Based on a study of related work, we chose the image classification models SqueezeNet, AlexNet, DenseNet, ResNet, Inception, and VGG and retrained them to perform action recognition. The NTU RGB+D database is used for all tests. The highest accuracy is obtained with ResNet: 83.317% cross-subject and 88.780% cross-view, which outperforms most state-of-the-art results.
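One way to picture the skeleton-to-image encoding is the sketch below. The abstract does not say which encoding performed best, so the mapping here is purely an assumption for illustration: the x coordinate goes to the red channel, the y coordinate to green, and the detection confidence to blue, with one image row per joint and one column per frame. The function name `encode_skeleton_sequence` and the sample data are hypothetical.

```python
def encode_skeleton_sequence(frames, width, height):
    """Encode a sequence of 2D skeletons as an RGB image (nested lists).

    frames: list of frames, each a list of (x, y, confidence) per joint.
    Returns image[joint][frame] = (R, G, B) with 8-bit channel values.
    The channel assignment is an assumption made for this sketch.
    """
    num_joints = len(frames[0])
    image = []
    for j in range(num_joints):
        row = []
        for frame in frames:
            x, y, conf = frame[j]
            # Normalise to [0, 1], clamp, then scale to 0..255.
            r = round(255 * min(max(x / width, 0.0), 1.0))   # x position -> red
            g = round(255 * min(max(y / height, 0.0), 1.0))  # y position -> green
            b = round(255 * min(max(conf, 0.0), 1.0))        # confidence -> blue
            row.append((r, g, b))
        image.append(row)
    return image


# Hypothetical 2-frame sequence of an 18-joint skeleton in a 640x480 video.
frames = [
    [(320.0, 240.0, 1.0)] * 18,
    [(640.0, 0.0, 0.5)] * 18,
]
image = encode_skeleton_sequence(frames, width=640, height=480)
```

The resulting 18-row image has one column per frame, so a longer clip yields a wider image that could be resized to the fixed input size an image classifier expects.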
