semantic similarity measurement Latest Research Papers

An Efficient Parallelized Ontology Network-Based Semantic Similarity Measure for Big Biomedical Document Clustering

Semantic mining of big biomedical text data remains a challenge. Ontologies have been widely validated and used to extract semantic information. However, ontology-based semantic similarity calculation is so complex that it cannot measure similarity for big text data. To solve this problem, we propose a parallelized semantic similarity measurement method based on Hadoop MapReduce for big text data. First, we preprocess the documents and extract their semantic features. Then, we calculate document semantic similarity based on the ontology network structure under the MapReduce framework. Finally, based on the generated document semantic similarity, document clusters are generated via clustering algorithms. To validate the effectiveness of the method, we use two kinds of open datasets. The experimental results show that traditional methods can hardly handle more than ten thousand biomedical documents. The proposed method remains efficient and accurate for big datasets and offers high parallelism and scalability.
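
The map/shuffle/reduce pattern behind this kind of parallel pairwise similarity can be sketched in plain Python. This is only an illustration under stated assumptions: it swaps in a simple Jaccard overlap of extracted concepts for the paper's ontology network-based measure, runs in a single process rather than on Hadoop, and all function names and the toy documents are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(docs):
    """Map step: emit one (concept, doc_id) pair per distinct concept in a document."""
    for doc_id, concepts in docs.items():
        for concept in set(concepts):
            yield concept, doc_id


def shuffle(pairs):
    """Shuffle step: group the emitted doc ids by concept key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: count how many concepts each pair of documents shares."""
    shared = defaultdict(int)
    for doc_ids in groups.values():
        for a, b in combinations(sorted(doc_ids), 2):
            shared[(a, b)] += 1
    return shared


def jaccard_similarities(docs):
    """Jaccard similarity for every document pair sharing at least one concept.

    Assumption: Jaccard stands in for the ontology-based measure, which the
    abstract does not specify in detail.
    """
    shared = reduce_phase(shuffle(map_phase(docs)))
    sizes = {doc_id: len(set(concepts)) for doc_id, concepts in docs.items()}
    return {
        (a, b): n / (sizes[a] + sizes[b] - n)
        for (a, b), n in shared.items()
    }


# Hypothetical toy corpus: document id -> extracted concepts.
docs = {
    "d1": ["gene", "protein", "cell"],
    "d2": ["gene", "cell", "tumor"],
    "d3": ["liver"],
}
similarities = jaccard_similarities(docs)
```

Only pairs that share a concept are ever generated, which is one common way to avoid comparing every document with every other document.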

Capability Language Processing (CLP): Classification and Ranking of Manufacturing Suppliers Based on Unstructured Capability Data

10.1115/detc2021-71308 ◽

2021 ◽

Author(s):

Kimia Zandbiglari ◽

Farhad Ameri ◽

Mohammad Javadi

Keyword(s):

Language Processing ◽

Organizational Capabilities ◽

Automated Classification ◽

Text Analytics ◽

Natural Language Text ◽

Knowledge Organization System ◽

Semantic Similarity Measurement ◽

Overlapping Classes ◽

Ranked List ◽

Language Text

The unstructured data available on the websites of manufacturing suppliers can provide useful insights into the technological and organizational capabilities of manufacturers. However, since the data is often represented in an unstructured form using natural language text, it is difficult to efficiently search and analyze the capability data and learn from it. The objective of this work is to propose a set of text analytics techniques to enable automated classification and ranking of suppliers based on their capability narratives. The supervised classification and semantic similarity measurement methods used in this research are supported by a formal thesaurus that uses SKOS (Simple Knowledge Organization System) for its syntax and semantics. Normalized Google Distance (NGD) was used as a metric for measuring the relatedness of terms. The proposed framework was validated experimentally using a hypothetical search scenario. The results indicate that the generated ranked list shows a high correlation with human judgment, especially if the query concept vector and supplier concept vector belong to the same class. However, the correlation decreases when multiple overlapping classes of suppliers are mixed together. The findings of this research can be used to improve the precision and reliability of Capability Language Processing (CLP) tools and methods.
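
For readers unfamiliar with NGD, the standard definition can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name, the argument names, and the hit counts in the example are all assumptions, and how the paper obtains its counts is not stated in the abstract.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google Distance computed from hit counts.

    fx, fy : number of pages containing each term on its own
    fxy    : number of pages containing both terms
    n      : total number of indexed pages (an assumed constant)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never occur, or never co-occur, are treated as unrelated.
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical counts: two terms that always co-occur have distance 0.
same = normalized_google_distance(1000, 1000, 1000, 10**6)
# Hypothetical counts: terms that co-occur less often are further apart.
apart = normalized_google_distance(100, 100, 10, 10_000)
```

Smaller values mean the terms are more closely related; terms that always appear together score 0.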

A novel model for semantic similarity measurement based on wordnet and word embedding

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202337 ◽

2021 ◽

pp. 1-12

Author(s):

Fuqiang Zhao ◽

Zhengyu Zhu ◽

Ping Han

Keyword(s):

Vector Space ◽

Semantic Similarity ◽

Semantic Information ◽

Word Embedding ◽

Similarity Measurement ◽

New Model ◽

Part Of Speech ◽

Benchmark Datasets ◽

Semantic Similarity Measurement ◽

Novel Model

To measure semantic similarity between words, this paper presents a novel model, DFRVec, that encodes multiple kinds of semantic information about a word in WordNet into a vector space. First, three different sub-models are proposed: 1) DefVec: encoding the definitions of a word in WordNet; 2) FormVec: encoding the part-of-speech (POS) of a word in WordNet; 3) RelVec: encoding the relations of a word in WordNet. Then, by combining the three sub-models with an existing word embedding, the new model for generating the vector of a word is proposed. Finally, based on DFRVec and the path information in WordNet, a new method, DFRVec+Path, is presented to measure semantic similarity between words. Experiments on ten benchmark datasets show that DFRVec+Path can outperform many existing methods on semantic similarity measurement.

A Hybrid Semantic Similarity Measurement for Geospatial Entities

Microprocessors and Microsystems ◽

10.1016/j.micpro.2020.103526 ◽

2021 ◽

Vol 80 ◽

pp. 103526

Author(s):

Liangang Wang ◽

Feng Zhang ◽

Zhenhong Du ◽

Yongpei Chen ◽

Chuanrong Zhang ◽

...

Keyword(s):

Semantic Similarity ◽

Similarity Measurement ◽

Semantic Similarity Measurement

A Novel Neurofuzzy Approach for Semantic Similarity Measurement

10.1007/978-3-030-86534-4_18 ◽

2021 ◽

pp. 192-203

Author(s):

Jorge Martinez-Gil ◽

Riad Mokadem ◽

Josef Küng ◽

Abdelkader Hameurlain

Keyword(s):

Semantic Similarity ◽

Similarity Measurement ◽

Semantic Similarity Measurement

Cross-Modal Learning Based on Semantic Correlation and Multi-Task Learning for Text-Video Retrieval

Electronics ◽

10.3390/electronics9122125 ◽

2020 ◽

Vol 9 (12) ◽

pp. 2125

Author(s):

Xiaoyu Wu ◽

Tiantian Wang ◽

Shengjin Wang

Keyword(s):

Video Retrieval ◽

Feature Space ◽

Retrieval Algorithm ◽

Learning Networks ◽

Semantic Association ◽

Semantic Consistency ◽

Semantic Correlation ◽

Task Learning ◽

Feature Subspace ◽

Semantic Similarity Measurement

Text-video retrieval tasks face a great challenge in the semantic gap between cross-modal information. Some existing methods transform the text and video into the same subspace to measure their similarity. However, this kind of method does not add a semantic consistency constraint when associating the semantic encodings of the two modalities, and the resulting association is poor. In this paper, we propose a multi-modal retrieval algorithm based on semantic association and multi-task learning. First, the multi-level features of the video or text are extracted with multiple deep learning networks, so that the information of the two modalities can be fully encoded. Then, in the common feature space into which the information of the two modalities is mapped, we propose a semantic similarity measurement and a semantic consistency classification based on text-video features for a multi-task learning framework. The semantic consistency classification task constrains the learning of the semantic association task. Multi-task learning thus guides a better feature mapping of the two modalities and optimizes the construction of the unified feature subspace. Finally, the experimental results of our proposed algorithm on the Microsoft Video Description dataset (MSVD) and MSR-Video to Text (MSR-VTT) are better than those of existing research, which shows that our algorithm can improve the performance of cross-modal retrieval.

Semantic Similarity for English and Arabic Texts: A Review

Journal of Information & Knowledge Management ◽

10.1142/s0219649220500331 ◽

2020 ◽

Vol 19 (04) ◽

pp. 2050033

Author(s):

Marwah Alian ◽

Arafat Awajan

Keyword(s):

Semantic Similarity ◽

Language Processing ◽

Pearson Correlation ◽

Plagiarism Detection ◽

Embedding Technique ◽

Semantic Similarity Measurement ◽

Feature Based ◽

Arabic And English ◽

Descriptive Feature ◽

Degree Of Similarity

Semantic similarity is the task of measuring relations between sentences or words to determine the degree of similarity or resemblance. Several applications of natural language processing require semantic similarity measurement to achieve good results; these applications include plagiarism detection, text entailment, text summarisation, paraphrasing identification, and information extraction. Many researchers have proposed new methods to measure the semantic similarity of Arabic and English texts. In this research, these methods are reviewed and compared. Results show that the precision of the corpus-based approach exceeds 0.70. The precision of the descriptive feature-based technique is between 0.670 and 0.86, with a Pearson correlation coefficient of over 0.70. Meanwhile, the word embedding technique has a correlation of 0.67, and its accuracy is in the range 0.76–0.80. The best results are achieved by the feature-based approach.
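
The Pearson correlation coefficient quoted above is the usual way such methods are scored against human judgment. A minimal sketch of that computation follows; the function name and the example scores are hypothetical and do not come from the reviewed studies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical scores: a method's similarity outputs vs. human ratings.
method_scores = [0.1, 0.4, 0.5, 0.9]
human_scores = [0.2, 0.8, 1.0, 1.8]  # exactly twice the method's scores
r = pearson(method_scores, human_scores)
```

A value near 1 means the method ranks sentence or word pairs much as human annotators do; a value near 0 means no linear relationship.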

A novel method based on symbolic regression for interpretable semantic similarity measurement

Expert Systems with Applications ◽

10.1016/j.eswa.2020.113663 ◽

2020 ◽

Vol 160 ◽

pp. 113663

Author(s):

Jorge Martinez-Gil ◽

Jose M. Chaves-Gonzalez

Keyword(s):

Semantic Similarity ◽

Symbolic Regression ◽

Similarity Measurement ◽

Semantic Similarity Measurement ◽

Novel Method

Structural and Semantic Similarity Measurement of UML Use Case Diagram

Lontar Komputer Jurnal Ilmiah Teknologi Informasi ◽

10.24843/lkjiti.2020.v11.i02.p03 ◽

2020 ◽

Vol 11 (2) ◽

pp. 88

Author(s):

Mohammad Nazir Arifin ◽

Daniel Siahaan

Keyword(s):

Semantic Similarity ◽

Early Stage ◽

Structural Similarity ◽

Similarity Measurement ◽

Use Case ◽

Software Artifacts ◽

Measurement Results ◽

Semantic Similarity Measurement ◽

Coefficient Measurement ◽

Use Case Diagram

Reusing software has several benefits, including reducing cost and risk and accelerating development, and its primary purpose is improving software quality. In the early stage of software development, reusing existing software artifacts may increase the benefit of reusing software because it draws on mature artifacts from previous projects. One kind of software artifact is the diagram, and diagram reuse can be assisted by finding the level of similarity between diagrams. This paper proposes a method for measuring the similarity of use case diagrams using structural and semantic aspects. For structural similarity measurement, Graph Edit Distance is used by transforming each actor and use case into a graph, while for semantic similarity measurement, WordNet, Wu-Palmer, and Levenshtein were used. The experimentation was conducted on ten datasets from various projects. The results of the method were compared with the results of assessments from experts. The agreement between experts and the method was measured using Gwet's AC1 and the Pearson correlation coefficient. The diagram similarity agreement measured with Gwet's AC1 is 0.60, which is categorized as "moderate" agreement, and the Pearson correlation is 0.506, which means there is a significant correlation between experts and the method. The results show that the proposed method can be used to find the similarity of diagrams, so finding and reusing a diagram as a software component can be optimized.
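
Of the measures named above, Levenshtein distance is the simplest to show in full. The sketch below computes the edit distance between two labels and turns it into a similarity in [0, 1]. The normalization by the longer label's length and the lowercasing are assumptions made for illustration; the abstract does not say how the paper combines Levenshtein with WordNet and Wu-Palmer.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def label_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; normalizing by the longer label is an assumption."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty labels are identical
    return 1.0 - levenshtein(a, b) / longest


# Hypothetical use case labels taken from two different diagrams.
score = label_similarity("Login", "Log in")
```

A purely lexical measure like this catches spelling variants of a label, which is presumably why it is paired with WordNet-based measures that capture synonyms.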

SEMANTIC SIMILARITY MEASUREMENT FOR MALAY WORDS USING WORDNET BAHASA AND WIKIPEDIA BAHASA MELAYU: ISSUES AND PROPOSED SOLUTIONS

International Journal of Computer Systems & Software Engineering ◽

10.15282/ijsecs.6.1.2020.4.0067 ◽

2020 ◽

Vol 6 (1) ◽

pp. 25-40

Author(s):

Tuan Norhafizah Tuan Zakaria ◽

Mohd Juzaiddin Ab Aziz ◽

Mohd Rosmadi Mokhtar ◽

Saadiyah Darus ◽

...

Keyword(s):

Semantic Similarity ◽

Similarity Measurement ◽

Semantic Similarity Measurement

Recently Published Documents

An Efficient Parallelized Ontology Network-Based Semantic Similarity Measure for Big Biomedical Document Clustering

Capability Language Processing (CLP): Classification and Ranking of Manufacturing Suppliers Based on Unstructured Capability Data

A novel model for semantic similarity measurement based on wordnet and word embedding

A Hybrid Semantic Similarity Measurement for Geospatial Entities

A Novel Neurofuzzy Approach for Semantic Similarity Measurement

Cross-Modal Learning Based on Semantic Correlation and Multi-Task Learning for Text-Video Retrieval

Semantic Similarity for English and Arabic Texts: A Review

A novel method based on symbolic regression for interpretable semantic similarity measurement

Structural and Semantic Similarity Measurement of UML Use Case Diagram

SEMANTIC SIMILARITY MEASUREMENT FOR MALAY WORDS USING WORDNET BAHASA AND WIKIPEDIA BAHASA MELAYU: ISSUES AND PROPOSED SOLUTIONS
