relevance measure
Recently Published Documents


TOTAL DOCUMENTS

46
(FIVE YEARS 10)

H-INDEX

7
(FIVE YEARS 1)

Author(s):  
Chen Zhao ◽  
Takehito Utsuro ◽  
Yasuhide Kawada

This paper addresses the automatic recognition of out-of-topic documents within a small set of similar documents that are expected to share a common topic. The objective is to remove noise documents from the set. A topic-model-based classification framework is proposed for discovering out-of-topic documents. The paper introduces the concept of annotated search engine suggests, in which the search queries used to find a page are taken as representations of that page's content. Word embeddings are used to create distributed representations of words and documents and to compare search engine suggests by similarity. The results show that search engine suggests can be highly accurate semantic representations of textual content, and that the document analysis algorithm using this representation as a relevance measure gives satisfactory in-topic content filtering performance compared with the baseline technique of topic probability ranking.
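The embedding-based similarity comparison described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each document has already been reduced to a single embedding vector, and the function names and the `threshold` value are hypothetical. A document is flagged when its cosine similarity to the centroid of the set falls below the threshold.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def flag_out_of_topic(doc_vectors, threshold=0.5):
    """Indices of documents whose similarity to the set centroid is below
    `threshold` (a hypothetical cut-off, not a value from the paper)."""
    c = centroid(doc_vectors)
    return [i for i, v in enumerate(doc_vectors) if cosine(v, c) < threshold]

# Toy 3-dimensional "embeddings": three documents on one topic, one outlier.
docs = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.85, 0.15, 0.05],
    [0.0, 0.1, 0.95],
]
print(flag_out_of_topic(docs))  # [3]
```

Because the outlier itself contributes to the centroid, a leave-one-out centroid may be a better choice for very small sets.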


Author(s):  
Jorge L. Villacís ◽  
Jesús de la Fuente ◽  
Concepción Naval

A renewed interest in the study of character and virtue has recently emerged in the fields of Education and Psychology. The latest research has confirmed the association between consistent virtuous behaviours and positive academic outcomes. However, the motivational dimension of character (the intentions underlying the patterns of observed behaviours) has received little attention. This research aims to extend the knowledge on this topic by examining the predictive relationships between the behavioural and motivational dimensions of character, with reference to the academic engagement, career self-doubt and performance of Spanish university students. A total of 183 undergraduates aged 18–30 (142 of whom were women) from the north of Spain completed specific parts of self-report questionnaires, including the Values in Action VIA-72, a translated and validated Spanish version of the Moral Self-Relevance Measure (MSR), and the Utrecht Work Engagement Student Scale UWES-S9. The collected data were analysed using Structural Equation Modelling. The behavioural dimension of character (the character strength factors of caring, self-control and inquisitiveness) showed positive associations with academic engagement and performance. The motivational dimension of character (phronesis motivation) was negatively related to career self-doubt. The present study provides the first support for the contribution of both dimensions of character to undergraduate academic outcomes.


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0252991
Author(s):  
Werner A. Stahel

The p-value has been debated extensively in recent decades, attracting fierce critique but also finding some advocates. The fundamental issue with its misleading interpretation stems from its common use for testing the unrealistic null hypothesis of an effect that is precisely zero. A meaningful question asks instead whether the effect is relevant. It is then unavoidable that a threshold for relevance is chosen. Considerations that can lead to agreeable conventions for this choice are presented for several commonly used statistical situations. Based on the threshold, a simple quantitative measure of relevance emerges naturally. Statistical inference for the effect should be based on the confidence interval for the relevance measure. A classification of results that goes beyond a simple distinction like “significant / non-significant” is proposed. Alternatively, if desired, a single number called the “secured relevance” may summarize the result, as the p-value does, but with a scientifically meaningful interpretation.
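A minimal sketch of how such a threshold-based measure might be computed follows. The abstract does not give formulas, so everything here is an assumption: relevance is taken as the effect divided by the relevance threshold, the "secured relevance" is taken as the lower confidence bound on that scale, and the three class labels are hypothetical placeholders for the paper's classification.

```python
def relevance_summary(estimate, ci_low, ci_high, threshold):
    """Rescale an effect estimate and its confidence interval by a
    relevance threshold (assumed positive) and classify the result.

    Assumptions, not taken from the abstract:
      - relevance = effect / threshold
      - secured relevance = lower confidence bound / threshold
      - the labels below are illustrative only
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    estimated = estimate / threshold
    secured = ci_low / threshold
    upper = ci_high / threshold
    if secured > 1:
        label = "relevant"      # whole interval lies above the threshold
    elif upper < 1:
        label = "negligible"    # whole interval lies below the threshold
    else:
        label = "ambiguous"     # interval straddles the threshold
    return {"estimated": estimated, "secured": secured,
            "upper": upper, "label": label}

# Effect of 6.0 with a 95% CI of [3.0, 9.0] and a relevance threshold of 2.0.
print(relevance_summary(6.0, 3.0, 9.0, 2.0))
```

Under these assumptions, a secured relevance above 1 means that even the low end of the confidence interval exceeds the chosen threshold.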


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Hongqing Fang ◽  
Pei Tang ◽  
Hao Si

In this paper, the maximal relevance measure and the minimal redundancy maximal relevance (mRMR) algorithm (under the D-R and D/R criteria) are applied to select features and to compose different feature subsets based on observed motion sensor events for human activity recognition in smart home environments. The selected feature subsets are then evaluated, and the activity recognition accuracy rates are compared using two probabilistic algorithms: the naïve Bayes (NB) classifier and the hidden Markov model (HMM). The experimental results show that not all features are beneficial to human activity recognition and that different feature subsets yield different recognition accuracy rates. Furthermore, even the same feature subset affects recognition accuracy differently for different activity classifiers. Researchers performing human activity recognition should therefore consider both the relevance between features and activities and the redundancy among features. In general, both the maximal relevance measure and the mRMR algorithm are feasible for feature selection and beneficial to activity recognition.
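The difference between maximal relevance and mRMR can be sketched on toy data. This is a generic greedy implementation for discrete features, not the authors' code: it assumes relevance and redundancy are both measured by mutual information, that D-R means relevance minus mean redundancy, and that D/R means their quotient. The function names and the small `eps` guard are illustrative choices.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def max_relevance(features, target, k):
    """Top-k features ranked by mutual information with the target only."""
    rel = {name: mutual_information(col, target) for name, col in features.items()}
    return sorted(rel, key=rel.get, reverse=True)[:k]

def mrmr(features, target, k, criterion="difference", eps=1e-12):
    """Greedy mRMR: 'difference' scores D - R, 'quotient' scores D / R."""
    rel = {name: mutual_information(col, target) for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        best_name, best_score = None, -math.inf
        for name, col in features.items():
            if name in selected:
                continue
            if not selected:
                score = rel[name]  # first pick: relevance only
            else:
                # Mean mutual information with the already selected features.
                red = sum(mutual_information(col, features[s])
                          for s in selected) / len(selected)
                score = rel[name] - red if criterion == "difference" \
                    else rel[name] / (red + eps)
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected

target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 0],  # strongly relevant
    "b": [0, 0, 0, 0, 1, 1, 1, 0],  # exact copy of "a" (fully redundant)
    "c": [0, 0, 1, 0, 1, 1, 0, 1],  # weaker, but nearly independent of "a"
}
print(max_relevance(features, target, 2))     # ['a', 'b']
print(mrmr(features, target, 2))              # ['a', 'c']
print(mrmr(features, target, 2, "quotient"))  # ['a', 'c']
```

Ranking by relevance alone keeps the duplicated feature, while both mRMR criteria replace it with the less relevant but non-redundant one.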


2020 ◽  
Vol 10 (2) ◽  
Author(s):  
Essam F. Alnatsheh

This paper describes the methodology and performance of experimental research on filtering web search results. Filtering was performed on the basis of the predicted relevance of search results, derived from users’ implicit feedback. The feedback was obtained from users’ web browsers and consisted of a set of browsing behaviour metrics, including reading time, clicks on links, mouse pointer and wheel movement patterns, bookmarking, sharing, copying, and whether the search was continued after the page was closed. A multi-layer neural network was used to infer from these behaviours how interested the user was in each filtered document. The neural network therefore performed deep learning without human supervision. The predicted relevance measure was compared with explicit feedback. The result of 89% correct relevance rating predictions suggests that the selected set of metrics was successful in predicting how relevant a web page was for the users involved in the study. More research is recommended to further advance information filtering methods.


The extraction of linguistic and statistical information is an important aspect of text processing. The extraction of Multiword Expressions (MWEs) plays a key role in text processing, as these are used to find the correct meaning of a text phrase. MWEs are lexical phrases consisting of two or more words that together convey a meaning different from that of their constituent words. The linguistic side of MWE extraction mainly concerns text information, including Part of Speech (POS) tags, grammar rules, related literature, and so on. It is important to extract the correct MWEs for a particular language, as there is variety and veracity across languages. The selection of MWEs is based on statistical analysis of the MWE extraction process. In the proposed work, MWE extraction is performed on an English dataset. Along with the existing statistical measures, i.e. Pointwise Mutual Information (PMI), Dice Coefficient (DC) and Modified Dice Coefficient (MDC), the additional measures Lexical Fixedness (LF), Syntactic Fixedness (SF) and Relevance Measure (RM) are also evaluated. The results are compared with other existing approaches applied to English MWEs. The results show that the proposed measures LF, SF and RM are more significant than the existing measures in finding the best statistics for the MWE extraction process. The process model is generic in nature and not tied to a particular language. It can also be applied to other languages by selecting POS tags for that particular language.
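Two of the existing association measures named above, PMI and the Dice coefficient, can be sketched for adjacent word pairs using their standard textbook definitions. This is only an illustration on a toy token list: the paper's preprocessing, POS filtering, and the MDC, LF, SF and RM measures are not specified in the abstract and are not reproduced here. The function name is hypothetical.

```python
import math
from collections import Counter

def bigram_scores(tokens):
    """Return {(w1, w2): (pmi, dice)} for every adjacent word pair.

    PMI  = log2( P(w1, w2) / (P(w1) * P(w2)) )
    Dice = 2 * f(w1, w2) / (f(w1) + f(w2))
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (w1, w2), f12 in bigrams.items():
        p12 = f12 / n_bi
        p1 = unigrams[w1] / n_uni
        p2 = unigrams[w2] / n_uni
        pmi = math.log2(p12 / (p1 * p2))
        dice = 2 * f12 / (unigrams[w1] + unigrams[w2])
        scores[(w1, w2)] = (pmi, dice)
    return scores

tokens = "new york is big and new york is old".split()
scores = bigram_scores(tokens)
# Candidates sorted by Dice coefficient, highest first.
for pair, (pmi, dice) in sorted(scores.items(), key=lambda kv: -kv[1][1]):
    print(pair, round(pmi, 3), round(dice, 3))
```

On small corpora PMI tends to favour rare pairs, which is one reason several measures are usually compared rather than relying on a single score.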


2019 ◽  
Author(s):  
Carlos Sevilla-Salcedo ◽  
Vanessa Gómez-Verdejo ◽  
Jussi Tohka ◽  

A fundamental problem of supervised learning algorithms for brain imaging applications is that the number of features far exceeds the number of subjects. In this paper, we propose a combined feature selection and extraction approach for multiclass problems. This method starts with a bagging procedure which calculates the sign consistency of the multivariate analysis (MVA) projection matrix feature-wise to determine the relevance of each feature. This relevance measure provides a parsimonious matrix, which is combined with a hypothesis test to automatically determine the number of selected features. Then, a novel MVA regularized with the sign and magnitude consistency of the features is used to generate a reduced set of summary components providing a compact data description. We evaluated the proposed method on two multiclass brain imaging problems: 1) the classification of elderly subjects into four classes (cognitively normal, stable mild cognitive impairment (MCI), MCI converting to AD within 3 years, and Alzheimer’s disease) based on structural brain imaging data from the ADNI cohort; 2) the classification of children into three classes (typically developing, and two types of Attention Deficit/Hyperactivity Disorder (ADHD)) based on functional connectivity. Experimental results confirmed that each brain image (defined by 29,852 features in the ADNI database and 61,425 in the ADHD database) could be represented with only 30–45% of the original features. Furthermore, this information could be condensed into two or three summary components, providing not only a gain in interpretability but also classification rate improvements when compared with state-of-the-art reference methods.


2019 ◽  
Vol 19 (2) ◽  
pp. 146-158 ◽  
Author(s):  
S. R. Mani Sekhar ◽  
G. M. Siddesh ◽  
Sunilkumar S. Manvi ◽  
K. G. Srinivasa

With the rapid growth of digital technologies, crawlers and search engines face unpredictable challenges. Focused web crawlers are essential for mining the boundless data available on the internet. Web crawlers face an indeterminate latency problem due to differences in response times. The proposed work attempts to optimize the design and implementation of focused web crawlers using a Master-Slave architecture for bioinformatics web sources. Focused crawlers should ideally crawl only relevant pages, but the relevance of a page can only be estimated after crawling the genomics pages. The paper proposes a solution for predicting page relevance based on Natural Language Processing. The frequency of keywords in the top-ranked sentences of a page determines the relevance of pages within genomics sources. The proposed solution uses the TextRank algorithm to rank the sentences and to ensure the correct classification of bioinformatics web pages. Finally, the model is validated by comparison with a breadth-first search web crawler. The comparison shows a significant reduction in run time for the same harvest rate.

