gated recurrent units
Recently Published Documents


TOTAL DOCUMENTS

170
(FIVE YEARS 145)

H-INDEX

11
(FIVE YEARS 8)

2022 ◽  
Vol 16 (4) ◽  
pp. 1-55
Author(s):  
Manish Gupta ◽  
Puneet Agrawal

In recent years, the fields of natural language processing (NLP) and information retrieval (IR) have made tremendous progress thanks to deep learning models like Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs), and Long Short-Term Memory (LSTM) networks, and Transformer [121] based models like Bidirectional Encoder Representations from Transformers (BERT) [24], Generative Pre-trained Transformer 2 (GPT-2) [95], Multi-task Deep Neural Network (MT-DNN) [74], Extra-Long Network (XLNet) [135], Text-to-text transfer transformer (T5) [96], T-NLG [99], and GShard [64]. However, these models are very large, whereas real-world applications demand small model size, low response times, and low power consumption. In this survey, we discuss six types of methods (Pruning, Quantization, Knowledge Distillation (KD), Parameter Sharing, Tensor Decomposition, and Sub-quadratic Transformer-based methods) for compressing such models to enable their deployment in real industry NLP projects. Given the critical need to build applications with efficient and small models, and the large amount of recently published work in this area, we believe that this survey organizes the plethora of work done by the “deep learning for NLP” community in the past few years and presents it as a coherent story.
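As a rough illustration of two of the compression families named in this abstract, the sketch below applies magnitude pruning and symmetric 8-bit uniform quantization to a toy weight vector. It is a minimal sketch, not the survey's method: the function names (`magnitude_prune`, `quantize_uniform`, `dequantize`) and the weights are hypothetical, and real systems typically prune and quantize per layer or per channel.

```python
# Illustrative sketch only: function names and toy weights are hypothetical.

def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices sorted by absolute value; the first k are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:k])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def quantize_uniform(weights, bits=8):
    """Map floats to signed integers using one symmetric scale factor."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
sparse = magnitude_prune(weights, sparsity=0.5)  # half of the weights become 0
q, scale = quantize_uniform(sparse)              # integer codes plus one float scale
restored = dequantize(q, scale)                  # error per weight is at most scale / 2
```

Storing `q` as 8-bit integers plus a single `scale` takes roughly a quarter of the space of 32-bit floats, and the zeros introduced by pruning can be skipped or stored sparsely.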


2022 ◽  
Vol 40 (2) ◽  
pp. 1-22
Author(s):  
Yue Cui ◽  
Hao Sun ◽  
Yan Zhao ◽  
Hongzhi Yin ◽  
Kai Zheng

Accurately recommending the next point of interest (POI) has become a fundamental problem with the rapid growth of location-based social networks. However, sparse, imbalanced check-in data and diverse user check-in patterns pose severe challenges for POI recommendation tasks. Knowledge-aware models are a primary means of addressing these problems. However, because most knowledge graphs are constructed statically, sequential information has yet to be integrated. In this work, we propose a meta-learned sequential-knowledge-aware recommender (Meta-SKR), which utilizes sequential, spatio-temporal, and social knowledge to recommend the next POI for a location-based social network user. The framework contains four main modules. First, in the graph construction module, a novel type of knowledge graph (the sequential knowledge graph, which is sensitive to the check-in order of POIs) is built to model users’ check-in patterns. To deal with the problem of data sparsity, a meta-learning module based on latent embedding optimization is then introduced to generate user-conditioned parameters of the subsequent sequential-knowledge-aware embedding module, where representation vectors of entities (nodes) and relations (edges) are learned. In this embedding module, gated recurrent units are adapted to distill intra- and inter-sequential knowledge graph information. We also design a novel knowledge-aware attention mechanism to capture information surrounding a given node. Finally, POI recommendation is provided by inferring potential links of knowledge graphs in the prediction module. Evaluations on three real-world check-in datasets show that Meta-SKR can achieve high recommendation accuracy even with sparse data.
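For readers unfamiliar with the gated recurrent unit that the abstracts on this page build on, the following is a minimal sketch of one standard GRU update in plain Python. It is not the adapted unit used in Meta-SKR. The parameter names (`Wz`, `Uz`, `bz`, and so on), the helper `init_params`, and the toy input sequence are assumptions made for illustration, and the update uses the convention in which the update gate `z` weights the candidate state (some libraries use the opposite convention).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gate(W, U, b, x, h, act):
    """Compute act(W x + U h + b) element-wise."""
    return [act(a + c + d) for a, c, d in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x, h, p):
    """One GRU update; p maps hypothetical names to weights and biases."""
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h, sigmoid)   # update gate
    r = gate(p["Wr"], p["Ur"], p["br"], x, h, sigmoid)   # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]               # reset applied to old state
    cand = gate(p["Wh"], p["Uh"], p["bh"], x, rh, math.tanh)  # candidate state
    # Blend old state and candidate, weighted by the update gate.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def init_params(input_size, hidden_size, seed=0):
    """Random small weights and zero biases (illustrative initialization)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in "zrh":
        p["W" + g] = mat(hidden_size, input_size)
        p["U" + g] = mat(hidden_size, hidden_size)
        p["b" + g] = [0.0] * hidden_size
    return p


params = init_params(input_size=3, hidden_size=4)
h = [0.0] * 4
sequence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # toy inputs
for x in sequence:
    h = gru_step(x, h, params)  # h summarizes the sequence seen so far
```

The two gates are what distinguish a GRU from a plain recurrent unit: the reset gate decides how much of the previous state feeds the candidate, and the update gate decides how much of the candidate replaces the previous state.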


Author(s):  
Ali Bou Nassif ◽  
Abdollah Masoud Darya ◽  
Ashraf Elnagar

This work presents a detailed comparison of the performance of deep learning models such as convolutional neural networks, long short-term memory, gated recurrent units, and their hybrids, together with a selection of shallow learning classifiers, for sentiment analysis of Arabic reviews. The comparison also includes state-of-the-art models such as the transformer architecture and the araBERT pre-trained model. The datasets used in this study are multi-dialect Arabic hotel and book review datasets, which are some of the largest publicly available datasets for Arabic reviews. The results showed deep learning outperforming shallow learning for binary and multi-label classification, in contrast with the results of similar work reported in the literature. This discrepancy was caused by dataset size, which we found to be proportional to the performance of deep learning models. The performance of deep and shallow learning techniques was analyzed in terms of accuracy and F1 score. The best-performing shallow learning technique was Random Forest, followed by Decision Tree and AdaBoost. The deep learning models performed similarly using a default embedding layer, while the transformer model performed best when augmented with araBERT.


HNO ◽  
2022 ◽  
Author(s):  
Nam Dinh Pham ◽  
Torsten Rahne

Abstract. Background: Many people benefit from the additional visual information provided by a speaker's lip movements when lip reading, but this process is highly error-prone. Lip-reading algorithms based on artificial neural networks significantly improve word recognition but are not available for the German language. Materials and methods: A total of 1806 videos, each featuring a single German-speaking person, were selected, split into word segments, and assigned to word classes using speech recognition software. In 38,391 video segments with 32 speakers, 18 polysyllabic, visually distinguishable words were used to train and validate a neural network. The 3D Convolutional Neural Network and Gated Recurrent Units models and the combination of both (GRUConv) were compared, as were different image crops and color spaces of the videos. The correct classification rate was determined in each case within 5000 training epochs. Results: The comparison of color spaces showed no relevant differences in correct classification rates, which ranged from 69% to 72%. Cropping to the lips yielded a correct classification rate of 70%, markedly higher than cropping to the speaker's entire face (34%). With the GRUConv model, the maximum correct classification rates were 87% for known speakers and 63% in validation with unknown speakers. Conclusion: The neural network for lip reading, the first developed for the German language, shows very high accuracy, comparable to that of English-language algorithms. It also works with unknown speakers and can be generalized with more word classes.


2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Hamid Reza Tamaddon Jahromi ◽  
Igor Sazonov ◽  
Jason Jones ◽  
Alberto Coccarelli ◽  
Samuel Rolland ◽  
...  

Purpose: The purpose of this paper is to devise a tool based on computational fluid dynamics (CFD) and machine learning (ML) for the assessment of potential airborne microbial transmission in enclosed spaces. A gated recurrent unit neural network (GRU-NN) is presented to learn and predict the behaviour of droplets expelled through breaths via particle tracking data sets. Design/methodology/approach: A computational methodology is used for investigating how infectious particles that originated in one location are transported by air and spread throughout a room. High-fidelity prediction of indoor airflow is obtained by means of an in-house parallel CFD solver, which uses a one-equation Spalart–Allmaras turbulence model. Several flow scenarios are considered by varying ventilation conditions and source locations. The CFD model is used for computing the trajectories of the particles emitted by human breath. The numerical results are used for the ML training. Findings: It is shown that the developed ML model, based on the GRU-NN, can accurately predict airborne particle movement across an indoor environment for different vent operation conditions and source locations. The numerical results show that the presented methodology is able to provide accurate predictions of the time evolution of particle distribution at different locations of the enclosed space. Originality/value: This study paves the way for the development of efficient and reliable tools for predicting airborne virus movement under different ventilation conditions and different human positions within an indoor environment, potentially leading to new designs. A parametric study is carried out to evaluate the impact of system settings on the time variation of particles emitted by human breath within the space considered.


Author(s):  
Валерий Дмитриевич Олисеенко ◽  
Максим Викторович Абрамов ◽  
Александр Львович Тулупьев

This article examines two deep learning neural network architectures: long short-term memory (LSTM) and gated recurrent units (GRU). These models are applied to the multiclass classification of social network users' posts; the classification results are used to build an empirical distribution of a user's posts across classes, which in turn is used to partially automate the assessment of how strongly users' psychological traits are expressed. The aim of the study is to improve the accuracy of multiclass classification of user posts by developing and deploying new models for the second level of a hierarchical classifier. The theoretical significance of the study lies in building new, more accurate classification models that will underpin models for assessing the expression of users' personality traits. The practical significance lies in improving the automated post classification system, which will complement an existing prototype program for analyzing how well users are protected against social engineering attacks. The novelty of the result lies in a new approach to the pressing problem of automated post classification that achieves higher classification accuracy than previously existing methods. The best classification result was achieved by the model based on the LSTM architecture (F1-micro 0.766, F1-macro 0.734, accuracy 0.793).
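The abstract above relies on two computations that are easy to confuse: the empirical distribution of a user's posts across classes, and the micro- and macro-averaged F1 scores it reports. The sketch below shows one common way to compute both. The class labels and predictions are invented toy data, not the study's, and the function names are hypothetical.

```python
from collections import Counter


def empirical_distribution(predicted_labels, classes):
    """Share of a user's posts that the classifier assigned to each class."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    if total == 0:
        return {c: 0.0 for c in classes}
    return {c: counts[c] / total for c in classes}


def f1_scores(y_true, y_pred, classes):
    """Return (micro-F1, macro-F1) for single-label multiclass predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so rare classes count equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    # Micro: pool the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


classes = ["a", "b", "c"]  # hypothetical post categories
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "b", "c", "a"]

distribution = empirical_distribution(y_pred, classes)
micro_f1, macro_f1 = f1_scores(y_true, y_pred, classes)
```

In the single-label multiclass setting, micro-F1 equals plain accuracy, while macro-F1 falls when small classes are classified poorly, which is one reason a paper may report both.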

