Web Information Integration and Schema Matching

2009 ◽  
pp. 3479-3479
2013 ◽  
Vol 33 (9) ◽  
pp. 2493-2496
Author(s):  
Xueqiong LIU ◽  
Gang WU ◽  
Houping DENG

Algorithms ◽  
2018 ◽  
Vol 11 (8) ◽  
pp. 128 ◽  
Author(s):  
Shuhei Denzumi ◽  
Jun Kawahara ◽  
Koji Tsuda ◽  
Hiroki Arimura ◽  
Shin-ichi Minato ◽  
...  

In this article, we propose a succinct data structure for zero-suppressed binary decision diagrams (ZDDs). A ZDD represents sets of combinations efficiently, and various set operations can be performed on it without explicitly extracting the combinations. Thanks to these features, ZDDs have been applied to web information retrieval, information integration, and data mining. However, to support rich manipulation of sets of combinations and future updates, ordinary ZDDs use a large amount of space, which leaves room for further compression. The paper introduces a new succinct data structure, called DenseZDD, which compresses a ZDD further for cases where we do not need to conduct set operations on it but only want to test whether a given set is included in the family represented by the ZDD and to count the number of elements in the family. We also propose a hybrid method that combines DenseZDDs with ordinary ZDDs. In numerical experiments, we show that our data structures are three times smaller than ordinary ZDDs, and that membership operations and random sampling on DenseZDDs are, for some datasets, about ten times and three times faster, respectively, than on ordinary ZDDs.
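The two queries named in the abstract, membership and counting, can be illustrated on an ordinary pointer-based ZDD. The following is a minimal sketch only: it is not the DenseZDD structure from the paper, and the class `ZDD`, its methods, and the example family are hypothetical names chosen for illustration. It assumes every set in the family uses only variables from the given ordering.

```python
# Illustrative pointer-based ZDD (NOT the succinct DenseZDD encoding).
# All names here are assumptions made for this sketch.

BOT, TOP = 0, 1  # BOT: empty family; TOP: family containing only the empty set


class ZDD:
    def __init__(self):
        self.nodes = {}    # node id -> (var, lo, hi)
        self.unique = {}   # (var, lo, hi) -> node id, so equal subgraphs are shared
        self.next_id = 2

    def node(self, var, lo, hi):
        # Zero-suppression rule: a node whose 1-edge leads to BOT is removed.
        if hi == BOT:
            return lo
        key = (var, lo, hi)
        if key not in self.unique:
            self.unique[key] = self.next_id
            self.nodes[self.next_id] = key
            self.next_id += 1
        return self.unique[key]

    def build(self, family, variables):
        """Build a ZDD for a family of sets over the ordered variables."""
        family = frozenset(frozenset(s) for s in family)
        variables = tuple(variables)
        if not family:
            return BOT
        if not variables:
            return TOP  # only the empty set can remain here
        v, rest = variables[0], variables[1:]
        lo = [s for s in family if v not in s]
        hi = [s - {v} for s in family if v in s]
        return self.node(v, self.build(lo, rest), self.build(hi, rest))

    def contains(self, root, items):
        """Membership: follow the 1-edge when the node's variable is in the set."""
        remaining = set(items)
        n = root
        while n not in (BOT, TOP):
            var, lo, hi = self.nodes[n]
            if var in remaining:
                remaining.discard(var)
                n = hi
            else:
                n = lo
        # Leftover items correspond to suppressed variables, so the set is absent.
        return n == TOP and not remaining

    def count(self, root):
        """Number of sets in the family, memoized over shared nodes."""
        memo = {BOT: 0, TOP: 1}

        def go(n):
            if n not in memo:
                _, lo, hi = self.nodes[n]
                memo[n] = go(lo) + go(hi)
            return memo[n]

        return go(root)


zdd = ZDD()
root = zdd.build([{"a", "b"}, {"a", "c"}, {"c"}], ["a", "b", "c"])
print(zdd.contains(root, {"a", "b"}), zdd.count(root))
```

Both queries only read the diagram, which is the setting in which the abstract says a more compact, static encoding can replace the updatable one.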


Author(s):  
Yasuhiko Kitamura ◽  
Yatsuho Shibata ◽  
Keisuke Tokuda ◽  
Kazuki Kobayashi ◽  
Noriko Nagata

Author(s):  
Edhy Sutanta ◽  
Retantyo Wardoyo ◽  
Khabib Mustofa ◽  
Edi Winarko

Schema matching is an important process in Enterprise Information Integration (EII), operating at the back-end level to solve problems caused by schematic heterogeneity. This paper summarizes preliminary results from the model development stage of research on models and a prototype for hybrid schema matching that combines two methods, constraint-based and instance-based. The discussion covers a general description of the proposed model and its development, starting from requirement analysis and continuing through data type conversion, the matching mechanism, database support, constraint and instance extraction, matching and similarity computation, preliminary results, user verification, verified results, the testing dataset, and performance measurement. In experiments on 36 datasets from heterogeneous RDBMSs, the highest precision (P) was 100.00% and the lowest 71.43%; the highest recall (R) was 100.00% and the lowest 75.00%; and the highest F-measure was 100.00% and the lowest 81.48%. The model still fails to match in some cases, including when an id attribute uses an autoincrement data type, when codes are defined in the same way but have different meanings, and when instances share the same definition but differ in meaning.
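The P, R, and F-measure figures above are presumably the standard precision, recall, and F1 scores computed over matched attribute pairs. A minimal sketch of that computation follows, assuming matches are represented as sets of (source attribute, target attribute) pairs; the function name `match_quality` and the attribute names are hypothetical and do not come from the paper.

```python
# Sketch of precision / recall / F-measure for a schema matching result.
# The data representation and all names are assumptions for illustration.

def match_quality(predicted, reference):
    """Return (precision, recall, f_measure) for two sets of attribute pairs."""
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)  # pairs the matcher got right
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical verified matches versus matcher output.
reference = {
    ("cust.id", "customer.cust_id"),
    ("cust.name", "customer.full_name"),
    ("cust.city", "customer.city"),
    ("cust.phone", "customer.tel"),
}
predicted = {
    ("cust.id", "customer.cust_id"),
    ("cust.name", "customer.full_name"),
    ("cust.city", "customer.city"),
    ("cust.phone", "customer.fax"),  # a wrong match
}

p, r, f = match_quality(predicted, reference)
print(f"P={p:.2%} R={r:.2%} F={f:.2%}")
```

A wrong pair lowers precision, while a missed pair lowers recall; the F-measure is their harmonic mean, so it falls between the two.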

