scholarly journals Temporally Grounding Language Queries in Videos by Contextual Boundary-Aware Prediction

2020 ◽  
Vol 34 (07) ◽  
pp. 12168-12175 ◽  
Author(s):  
Jingwen Wang ◽  
Lin Ma ◽  
Wenhao Jiang

The task of temporally grounding language queries in videos is to temporally localize the best matched video segment corresponding to a given language (sentence). It requires certain models to simultaneously perform visual and linguistic understandings. Previous work predominantly ignores the precision of segment localization. Sliding window based methods use predefined search window sizes, which suffer from redundant computation, while existing anchor-based approaches fail to yield precise localization. We address this issue by proposing an end-to-end boundary-aware model, which uses a lightweight branch to predict semantic boundaries corresponding to the given linguistic information. To better detect semantic boundaries, we propose to aggregate contextual information by explicitly modeling the relationship between the current element and its neighbors. The most confident segments are subsequently selected based on both anchor and boundary predictions at the testing stage. The proposed model, dubbed Contextual Boundary-aware Prediction (CBP), outperforms its competitors with a clear margin on three public datasets.

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1589
Author(s):  
Yongkeun Hwang ◽  
Yanghoon Kim ◽  
Kyomin Jung

Neural machine translation (NMT) is one of the text generation tasks which has achieved significant improvement with the rise of deep neural networks. However, language-specific problems such as handling the translation of honorifics received little attention. In this paper, we propose a context-aware NMT to promote translation improvements of Korean honorifics. By exploiting the information such as the relationship between speakers from the surrounding sentences, our proposed model effectively manages the use of honorific expressions. Specifically, we utilize a novel encoder architecture that can represent the contextual information of the given input sentences. Furthermore, a context-aware post-editing (CAPE) technique is adopted to refine a set of inconsistent sentence-level honorific translations. To demonstrate the efficacy of the proposed method, honorific-labeled test data is required. Thus, we also design a heuristic that labels Korean sentences to distinguish between honorific and non-honorific styles. Experimental results show that our proposed method outperforms sentence-level NMT baselines both in overall translation quality and honorific translations.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Gaihua Wang ◽  
Qianyu Zhai

AbstractContextual information is a key factor affecting semantic segmentation. Recently, many methods have tried to use the self-attention mechanism to capture more contextual information. However, these methods with self-attention mechanism need a huge computation. In order to solve this problem, a novel self-attention network, called FFANet, is designed to efficiently capture contextual information, which reduces the amount of calculation through strip pooling and linear layers. It proposes the feature fusion (FF) module to calculate the affinity matrix. The affinity matrix can capture the relationship between pixels. Then we multiply the affinity matrix with the feature map, which can selectively increase the weight of the region of interest. Extensive experiments on the public datasets (PASCAL VOC2012, CityScapes) and remote sensing dataset (DLRSD) have been conducted and achieved Mean Iou score 74.5%, 70.3%, and 63.9% respectively. Compared with the current typical algorithms, the proposed method has achieved excellent performance.


2021 ◽  
Vol 2021 ◽  
pp. 1-20
Author(s):  
Gulfam Shahzadi ◽  
Fariha Zafar ◽  
Maha Abdullah Alghamdi

Fermatean fuzzy set (FFS) is a more efficient, flexible, and generalized model to deal with uncertainty, as compared to intuitionistic and Pythagorean fuzzy models. This research article presents a novel multiple-attribute decision-making (MADM) technique based on FFS. Aggregation operators (AOs), for example, Dombi, Einstein, and Hamacher, are frequently being used in the MADM process and are considered useful tools for evaluating the given alternatives. Among these, one of the most effective is the Hamacher operator. The salient feature of this operator is that it reduces the impact of negative information and provides more accurate results. Motivated by the primary characteristics of the Hamacher operator, we apply Hamacher interactive aggregation operators based on FFSs to determine the best alternative. Using Hamacher’s norm operations, we introduce some new geometric operators, namely, Fermatean fuzzy Hamacher interactive weighted geometric (FFHIWG) operator, Fermatean fuzzy Hamacher interactive ordered weighted geometric (FFHIOWG) operator, and Fermatean fuzzy Hamacher interactive hybrid weighted geometric (FFHIHWG) operator. Some important results and properties of the proposed AOs are discussed, and to achieve the optimal alternative, the proposed MADM technique is carried out in a real-life application of the medical field. An algorithm of the proposed technique is also developed. The significance of the proposed method is that Fermatean fuzzy Hamacher interactive geometric (FFHIG) operators deal with the relationship among belongingness degree (BD) and nonbelongingness degree (NBD) of the objects, which perform a crucial role in decision-making (DM). At last, to show the exactness and validity of the proposed work, a comparative analysis of the proposed model and the existing models is presented.


2010 ◽  
Vol 15 (2) ◽  
pp. 121-131 ◽  
Author(s):  
Remus Ilies ◽  
Timothy A. Judge ◽  
David T. Wagner

This paper focuses on explaining how individuals set goals on multiple performance episodes, in the context of performance feedback comparing their performance on each episode with their respective goal. The proposed model was tested through a longitudinal study of 493 university students’ actual goals and performance on business school exams. Results of a structural equation model supported the proposed conceptual model in which self-efficacy and emotional reactions to feedback mediate the relationship between feedback and subsequent goals. In addition, as expected, participants’ standing on a dispositional measure of behavioral inhibition influenced the strength of their emotional reactions to negative feedback.


Symmetry ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 753
Author(s):  
Ivan Chajda ◽  
Helmut Länger

In order to be able to use methods of universal algebra for investigating posets, we assigned to every pseudocomplemented poset, to every relatively pseudocomplemented poset and to every sectionally pseudocomplemented poset, a certain algebra (based on a commutative directoid or on a λ-lattice) which satisfies certain identities and implications. We show that the assigned algebras fully characterize the given corresponding posets. A certain kind of symmetry can be seen in the relationship between the classes of mentioned posets and the classes of directoids and λ-lattices representing these relational structures. As we show in the paper, this relationship is fully symmetric. Our results show that the assigned algebras satisfy strong congruence properties which can be transferred back to the posets. We also mention applications of such posets in certain non-classical logics.


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 708
Author(s):  
Wenbo Liu ◽  
Fei Yan ◽  
Jiyong Zhang ◽  
Tao Deng

The quality of detected lane lines has a great influence on the driving decisions of unmanned vehicles. However, during the process of unmanned vehicle driving, the changes in the driving scene cause much trouble for lane detection algorithms. The unclear and occluded lane lines cannot be clearly detected by most existing lane detection models in many complex driving scenes, such as crowded scene, poor light condition, etc. In view of this, we propose a robust lane detection model using vertical spatial features and contextual driving information in complex driving scenes. The more effective use of contextual information and vertical spatial features enables the proposed model more robust detect unclear and occluded lane lines by two designed blocks: feature merging block and information exchange block. The feature merging block can provide increased contextual information to pass to the subsequent network, which enables the network to learn more feature details to help detect unclear lane lines. The information exchange block is a novel block that combines the advantages of spatial convolution and dilated convolution to enhance the process of information transfer between pixels. The addition of spatial information allows the network to better detect occluded lane lines. Experimental results show that our proposed model can detect lane lines more robustly and precisely than state-of-the-art models in a variety of complex driving scenarios.


Author(s):  
Santosh Kumar Mishra ◽  
Rijul Dhir ◽  
Sriparna Saha ◽  
Pushpak Bhattacharyya

Image captioning is the process of generating a textual description of an image that aims to describe the salient parts of the given image. It is an important problem, as it involves computer vision and natural language processing, where computer vision is used for understanding images, and natural language processing is used for language modeling. A lot of works have been done for image captioning for the English language. In this article, we have developed a model for image captioning in the Hindi language. Hindi is the official language of India, and it is the fourth most spoken language in the world, spoken in India and South Asia. To the best of our knowledge, this is the first attempt to generate image captions in the Hindi language. A dataset is manually created by translating well known MSCOCO dataset from English to Hindi. Finally, different types of attention-based architectures are developed for image captioning in the Hindi language. These attention mechanisms are new for the Hindi language, as those have never been used for the Hindi language. The obtained results of the proposed model are compared with several baselines in terms of BLEU scores, and the results show that our model performs better than others. Manual evaluation of the obtained captions in terms of adequacy and fluency also reveals the effectiveness of our proposed approach. Availability of resources : The codes of the article are available at https://github.com/santosh1821cs03/Image_Captioning_Hindi_Language ; The dataset will be made available: http://www.iitp.ac.in/∼ai-nlp-ml/resources.html .


2021 ◽  
pp. 1-17
Author(s):  
J. Shobana ◽  
M. Murali

Text Sentiment analysis is the process of predicting whether a segment of text has opinionated or objective content and analyzing the polarity of the text’s sentiment. Understanding the needs and behavior of the target customer plays a vital role in the success of the business so the sentiment analysis process would help the marketer to improve the quality of the product as well as a shopper to buy the correct product. Due to its automatic learning capability, deep learning is the current research interest in Natural language processing. Skip-gram architecture is used in the proposed model for better extraction of the semantic relationships as well as contextual information of words. However, the main contribution of this work is Adaptive Particle Swarm Optimization (APSO) algorithm based LSTM for sentiment analysis. LSTM is used in the proposed model for understanding complex patterns in textual data. To improve the performance of the LSTM, weight parameters are enhanced by presenting the Adaptive PSO algorithm. Opposition based learning (OBL) method combined with PSO algorithm becomes the Adaptive Particle Swarm Optimization (APSO) classifier which assists LSTM in selecting optimal weight for the environment in less number of iterations. So APSO - LSTM ‘s ability in adjusting the attributes such as optimal weights and learning rates combined with the good hyper parameter choices leads to improved accuracy and reduces losses. Extensive experiments were conducted on four datasets proved that our proposed APSO-LSTM model secured higher accuracy over the classical methods such as traditional LSTM, ANN, and SVM. According to simulation results, the proposed model is outperforming other existing models.


Author(s):  
Huimin Lu ◽  
Rui Yang ◽  
Zhenrong Deng ◽  
Yonglin Zhang ◽  
Guangwei Gao ◽  
...  

Chinese image description generation tasks usually have some challenges, such as single-feature extraction, lack of global information, and lack of detailed description of the image content. To address these limitations, we propose a fuzzy attention-based DenseNet-BiLSTM Chinese image captioning method in this article. In the proposed method, we first improve the densely connected network to extract features of the image at different scales and to enhance the model’s ability to capture the weak features. At the same time, a bidirectional LSTM is used as the decoder to enhance the use of context information. The introduction of an improved fuzzy attention mechanism effectively improves the problem of correspondence between image features and contextual information. We conduct experiments on the AI Challenger dataset to evaluate the performance of the model. The results show that compared with other models, our proposed model achieves higher scores in objective quantitative evaluation indicators, including BLEU , BLEU , METEOR, ROUGEl, and CIDEr. The generated description sentence can accurately express the image content.


Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 79 ◽  
Author(s):  
Xiaoyu Han ◽  
Yue Zhang ◽  
Wenkai Zhang ◽  
Tinglei Huang

Relation extraction is a vital task in natural language processing. It aims to identify the relationship between two specified entities in a sentence. Besides information contained in the sentence, additional information about the entities is verified to be helpful in relation extraction. Additional information such as entity type getting by NER (Named Entity Recognition) and description provided by knowledge base both have their limitations. Nevertheless, there exists another way to provide additional information which can overcome these limitations in Chinese relation extraction. As Chinese characters usually have explicit meanings and can carry more information than English letters. We suggest that characters that constitute the entities can provide additional information which is helpful for the relation extraction task, especially in large scale datasets. This assumption has never been verified before. The main obstacle is the lack of large-scale Chinese relation datasets. In this paper, first, we generate a large scale Chinese relation extraction dataset based on a Chinese encyclopedia. Second, we propose an attention-based model using the characters that compose the entities. The result on the generated dataset shows that these characters can provide useful information for the Chinese relation extraction task. By using this information, the attention mechanism we used can recognize the crucial part of the sentence that can express the relation. The proposed model outperforms other baseline models on our Chinese relation extraction dataset.


Sign in / Sign up

Export Citation Format

Share Document