Web Information Integration and Schema Matching

This article proposes a succinct data structure for zero-suppressed binary decision diagrams (ZDDs). A ZDD represents sets of combinations compactly and supports various set operations without explicitly enumerating the combinations. Thanks to these features, ZDDs have been applied to web information retrieval, information integration, and data mining. However, because an ordinary ZDD must support rich manipulation of sets of combinations and future updates, it uses more space than necessary, which leaves room for further compression. The article introduces a new succinct data structure, called DenseZDD, that compresses a ZDD further for the case where we do not need to perform set operations on it but only want to test whether a given set is included in the family it represents and to count the number of elements in that family. We also propose a hybrid method that combines DenseZDDs with ordinary ZDDs. Numerical experiments show that our data structures are one third the size of ordinary ZDDs, and that, for some datasets, membership operations and random sampling on DenseZDDs are about ten times and three times faster, respectively, than on ordinary ZDDs.
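
The two read-only queries named in the abstract, membership and counting, can be illustrated on an ordinary pointer-based ZDD. The following is a minimal sketch, not the DenseZDD encoding itself: the names `Node`, `make_node`, `build`, `member`, and `count` are hypothetical, items are assumed to be integers ordered ascending, and the naive `build` enumerates the family explicitly, which a real ZDD package would avoid.

```python
class Node:
    """One ZDD node: `lo` excludes the item `var`, `hi` includes it."""
    __slots__ = ("var", "lo", "hi")

    def __init__(self, var, lo, hi):
        self.var, self.lo, self.hi = var, lo, hi


BOT = Node(None, None, None)  # terminal: the empty family
TOP = Node(None, None, None)  # terminal: the family containing only the empty set
_unique = {}                  # unique table so that equal subgraphs are shared


def make_node(var, lo, hi):
    # Zero-suppression rule: a node whose 1-edge leads to BOT is skipped.
    if hi is BOT:
        return lo
    return _unique.setdefault((var, lo, hi), Node(var, lo, hi))


def build(family, variables):
    """Build a ZDD for `family` (a set of frozensets) over ordered `variables`."""
    if not family:
        return BOT
    if not variables:
        return TOP  # only the empty set can remain at this point
    v, rest = variables[0], variables[1:]
    lo = build({s for s in family if v not in s}, rest)
    hi = build({s - {v} for s in family if v in s}, rest)
    return make_node(v, lo, hi)


def member(node, combination):
    """Test whether `combination` belongs to the family rooted at `node`."""
    items = sorted(combination)
    i = 0
    while node is not BOT and node is not TOP:
        if i < len(items) and items[i] == node.var:
            node, i = node.hi, i + 1
        elif i < len(items) and items[i] < node.var:
            return False  # this item was zero-suppressed on the current path
        else:
            node = node.lo
    return node is TOP and i == len(items)


def count(node, memo=None):
    """Count the sets in the family without enumerating them."""
    if memo is None:
        memo = {}
    if node is BOT:
        return 0
    if node is TOP:
        return 1
    if node not in memo:
        memo[node] = count(node.lo, memo) + count(node.hi, memo)
    return memo[node]


family = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3}), frozenset()}
root = build(family, (1, 2, 3))
print(member(root, {1, 2}), member(root, {1, 3}), count(root))
```

Both queries only follow edges from the root and never create nodes, which is the access pattern a static, compressed representation can serve without keeping the machinery needed for updates.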

Web Information Integration Based on Compressed XML

Databases in Networked Information Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-540-39845-5_11 ◽

2003 ◽

pp. 122-137

Author(s):

Hongzhi Wang ◽

Jianzhong Li ◽

Zhenying He ◽

Jizhou Luo

Keyword(s):

Information Integration ◽

Web Information

RDF-Based Web Information Integration System: A Travel System Use Case

2018 14th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS) ◽

10.1109/sitis.2018.00082 ◽

2018 ◽

Author(s):

Yenukunme Pelagie Elyse Houngue ◽

Kouessi Arafat Romaric Sagbo ◽

Kokou Yetongnon

Keyword(s):

Information Integration ◽

Use Case ◽

Integration System ◽

System A ◽

Web Information ◽

System Use

AVSML: An XML-Based Markup Language for Web Information Integration in 3D Virtual Space

Intelligent Virtual Agents - Lecture Notes in Computer Science ◽

10.1007/978-3-540-74997-4_51 ◽

2007 ◽

pp. 385-386

Author(s):

Yasuhiko Kitamura ◽

Yatsuho Shibata ◽

Keisuke Tokuda ◽

Kazuki Kobayashi ◽

Noriko Nagata

Keyword(s):

Information Integration ◽

Virtual Space ◽

Markup Language ◽

Web Information

A Hybrid Model Schema Matching Using Constraint-Based and Instance-Based

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v6i3.9790 ◽

2016 ◽

Vol 6 (3) ◽

pp. 1048 ◽

Cited By ~ 1

Author(s):

Edhy Sutanta ◽

Retantyo Wardoyo ◽

Khabib Mustofa ◽

Edi Winarko

Keyword(s):

Information Integration ◽

Model Development ◽

Analysis Data ◽

Development Stage ◽

Data Type ◽

Schema Matching ◽

P Value ◽

General Description ◽

Type Conversion ◽

Matching Mechanism

Schema matching is an important back-end process in Enterprise Information Integration (EII) that resolves problems caused by schematic heterogeneity. This paper summarizes preliminary results from the model development stage of research on models and a prototype for hybrid schema matching that combines two methods, constraint-based and instance-based. The discussion includes a general description of the proposed model and its development, starting from requirement analysis and covering data type conversion, the matching mechanism, database support, constraint and instance extraction, matching and similarity computation, preliminary results, user verification, verified results, the test datasets, and performance measurement. In experiments on 36 datasets from heterogeneous RDBMSs, the highest P value was 100.00% and the lowest 71.43%; the highest R value was 100.00% and the lowest 75.00%; and the highest F-measure was 100.00% and the lowest 81.48%. The model still fails to match in some cases, including id attributes with an autoincrement data type, codes that are defined in the same way but have different meanings, and common instances that share a definition but differ in meaning.
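
To make the hybrid idea and the reported measures concrete, here is a minimal sketch. The abstract does not say how the two methods are combined, so everything here is an assumption for illustration: the constraint set (`type`, `nullable`, `unique`), the Jaccard overlap of sampled values, the equal weighting, the 0.6 threshold, and all function and column names. P and R are assumed to denote precision and recall.

```python
def constraint_similarity(a, b):
    """Fraction of the (assumed) declared constraints on which two columns agree."""
    keys = ("type", "nullable", "unique")
    return sum(a[k] == b[k] for k in keys) / len(keys)


def instance_similarity(a, b):
    """Jaccard overlap of the values sampled from two columns."""
    va, vb = set(a["values"]), set(b["values"])
    union = va | vb
    return len(va & vb) / len(union) if union else 0.0


def hybrid_match(src, dst, weight=0.5, threshold=0.6):
    """Pair each source column with its best-scoring target above the threshold."""
    def score(s, d):
        return weight * constraint_similarity(s, d) + (1 - weight) * instance_similarity(s, d)

    pairs = set()
    for s_name, s in src.items():
        best = max(dst, key=lambda d_name: score(s, dst[d_name]))
        if score(s, dst[best]) >= threshold:
            pairs.add((s_name, best))
    return pairs


def precision_recall_f1(found, reference):
    """Compare proposed matches with user-verified ones."""
    found, reference = set(found), set(reference)
    tp = len(found & reference)
    p = tp / len(found) if found else 0.0
    r = tp / len(reference) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Hypothetical columns from two schemas.
src = {
    "cust_id": {"type": "int", "nullable": False, "unique": True, "values": [1, 2, 3]},
    "cust_name": {"type": "str", "nullable": False, "unique": False, "values": ["Ann", "Bob"]},
}
dst = {
    "customer_no": {"type": "int", "nullable": False, "unique": True, "values": [2, 3, 4]},
    "full_name": {"type": "str", "nullable": True, "unique": False, "values": ["Ann", "Bob", "Cy"]},
}
verified = {("cust_id", "customer_no"), ("cust_name", "full_name")}

found = hybrid_match(src, dst)
print(sorted(found), precision_recall_f1(found, verified))
```

The sketch also hints at why the failure cases listed in the abstract are hard: two unrelated autoincrement id columns agree on every constraint and overlap heavily in their values, so neither signal can tell them apart.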
