concept similarity
Recently Published Documents


TOTAL DOCUMENTS: 118 (FIVE YEARS 7)

H-INDEX: 11 (FIVE YEARS 0)

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jirapond Muangprathub ◽  
Siriwan Kajornkasirat ◽  
Apirat Wanichsombat

This paper proposes an algorithm for document plagiarism detection based on incremental knowledge construction with formal concept analysis (FCA). The incremental knowledge construction supports matching a suspect document against the source documents in storage. A new concept similarity measure is also proposed for retrieving formal concepts from the constructed knowledge; it uses the appearance frequencies recorded in that structure. The approach can be applied to retrieve relevant information because the structure represents knowledge as FCA concepts, each definable by a conjunction of properties. The measure is mathematically proven to be a formal similarity metric, and its performance is demonstrated in document plagiarism detection. The paper also provides an algorithm to build the information structure for plagiarism detection. Thai text test collections are used to evaluate the performance of the implemented web application.
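
To make the notion of a formal concept and of a similarity between concepts more concrete, the following is a minimal Python sketch. It is not the paper's algorithm or its frequency-based measure: the toy context, the function names, and the weighted Jaccard overlap of extents and intents are all assumptions chosen only for illustration, and the brute-force enumeration is suitable for toy data only.

```python
from itertools import combinations

# Hypothetical toy formal context: document -> set of terms it contains.
CONTEXT = {
    "d1": {"fca", "lattice", "similarity"},
    "d2": {"fca", "lattice"},
    "d3": {"fca", "plagiarism"},
}


def common_attributes(objects, context):
    """Derivation operator: attributes shared by every object in `objects`."""
    shared = set().union(*context.values())  # start from all attributes
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Derivation operator: objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every subset of objects.

    Exponential in the number of objects; an illustration, not an
    incremental construction.
    """
    concepts = set()
    objects = sorted(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def concept_similarity(c1, c2, weight=0.5):
    """Assumed measure: weighted Jaccard overlap of extents and intents."""
    return weight * jaccard(c1[0], c2[0]) + (1 - weight) * jaccard(c1[1], c2[1])


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

In a plagiarism-detection setting, a suspect document's term set could be compared against the intents of stored concepts in a similar way, with the highest-scoring concepts pointing to candidate source documents.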


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 146027-146038
Author(s):  
Shaohua Tao ◽  
Runhe Qiu ◽  
Yuan Ping ◽  
Woping Xu ◽  
Hui Ma

Author(s):  
Hongtao Huang ◽  
Cunliang Liang ◽  
Haizhi Ye

The probability information content-based method for computing FCA concept similarity relies on the frequency of concepts in a corpus. Because it takes only occurrence probability as the information content metric, its accuracy is lower. This article introduces a semantic information content-based method for FCA concept similarity evaluation. In addition to occurrence probability, it uses the superordinate and subordinate semantic relationships of concepts to measure information content, which captures more accurately how generic or specific a concept is. The semantic information content similarity is then calculated with the help of an ISA hierarchy derived from the domain ontology. Unlike probability information content, the evaluation of semantic information content is independent of the corpus. Semantic information content is then used for FCA concept similarity evaluation, and a weighted bipartite graph is used to improve the efficiency of the similarity evaluation. The experimental results show that the semantic information content-based method effectively improves on the accuracy of the probabilistic information content-based method without loss of time performance.
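
As a rough illustration of how information content can be derived from an ISA hierarchy rather than from corpus frequencies, the sketch below computes an intrinsic information content from descendant counts and plugs it into a Lin-style similarity. The hierarchy, the function names, and the specific formula `1 - log(descendants + 1) / log(N)` are assumptions for illustration, not the article's definitions; the weighted bipartite matching step is not sketched here.

```python
import math

# Hypothetical ISA hierarchy: child -> parent (the root has parent None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
}


def ancestors(node):
    """Return the node followed by all of its superordinates up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def descendant_count(node):
    """Number of subordinate nodes below `node` in the hierarchy."""
    return sum(1 for other in PARENT if other != node and node in ancestors(other))


def information_content(node):
    """Assumed intrinsic IC: leaves score 1.0, the root scores 0.0."""
    total = len(PARENT)
    return 1.0 - math.log(descendant_count(node) + 1) / math.log(total)


def lin_similarity(a, b):
    """Lin-style similarity: 2 * IC(most informative shared ancestor) / (IC(a) + IC(b))."""
    ancestors_of_b = set(ancestors(b))
    shared = [n for n in ancestors(a) if n in ancestors_of_b]
    denominator = information_content(a) + information_content(b)
    if denominator == 0:
        # Both nodes carry no information (e.g. both are the root).
        return 1.0 if a == b else 0.0
    best_shared_ic = max(information_content(n) for n in shared)
    return 2 * best_shared_ic / denominator


if __name__ == "__main__":
    for name in PARENT:
        print(name, round(information_content(name), 4))
    print("dog vs cat:", round(lin_similarity("dog", "cat"), 4))
    print("dog vs plant:", round(lin_similarity("dog", "plant"), 4))
```

Because the scores depend only on the shape of the hierarchy, more specific nodes receive higher information content than their superordinates without any corpus counts being involved.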

