Ext-LOUDS: A Space Efficient Extended LOUDS Index for Superset Query

2020 ◽  
Vol 10 (23) ◽  
pp. 8530
Author(s):  
Lianyin Jia ◽  
Yuna Zhang ◽  
Jiaman Ding ◽  
Jinguo You ◽  
Yinong Chen ◽  
...  

Superset queries are widely used in object-oriented databases, data mining, and many other fields. A trie is an efficient index for superset queries, but most existing trie indexes aim to improve query performance while ignoring storage overhead. To solve this problem, in this paper, we propose an efficient extended Level-Ordered Unary Degree Sequence (LOUDS) index: Ext-LOUDS. Ext-LOUDS represents a trie with one integer vector and three bit vectors and directly maps each NodeID to its corresponding position, thus accelerating some key operations needed for superset queries. Based on Ext-LOUDS, an efficient superset query algorithm, ELOUDS-Super, is designed. Experimental results on both real and synthetic datasets show that Ext-LOUDS can reduce space overhead by 50%–60% compared with a trie while maintaining relatively good query performance.
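To make the idea concrete, the following is a minimal sketch of a superset query over a plain LOUDS-encoded trie. It is not the paper's Ext-LOUDS layout or its ELOUDS-Super algorithm, whose details the abstract does not give. The function names, the dictionary-based trie, and the linear-scan rank/select are all assumptions made for illustration; a real succinct index would use constant-time rank/select structures.

```python
from collections import deque


def build_trie(sets):
    """Pointer-based trie; each set is stored as its sorted item sequence."""
    root = {"children": {}, "terminal": False}
    for s in sets:
        node = root
        for item in sorted(s):
            node = node["children"].setdefault(
                item, {"children": {}, "terminal": False})
        node["terminal"] = True
    return root


def louds_encode(root):
    """Encode the trie in level order.

    bits: '10' for a super-root, then '1' * degree + '0' for every node.
    labels[k] / terminal[k]: edge label and end-of-set flag of node k,
    where k is the node's position in breadth-first order (root = 0).
    """
    bits, labels, terminal = [1, 0], [None], [root["terminal"]]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for item in sorted(node["children"]):
            child = node["children"][item]
            bits.append(1)
            labels.append(item)
            terminal.append(child["terminal"])
            queue.append(child)
        bits.append(0)
    return bits, labels, terminal


def children(bits, k):
    """Ids of the children of node k, using naive linear-time rank/select."""
    zeros, pos = 0, 0
    while zeros < k + 1:          # move just past the (k+1)-th 0
        if bits[pos] == 0:
            zeros += 1
        pos += 1
    child_id = sum(bits[:pos])    # number of 1s before pos = id of first child
    kids = []
    while bits[pos] == 1:
        kids.append(child_id)
        child_id += 1
        pos += 1
    return kids


def superset_query(bits, labels, terminal, query):
    """Return every stored set that contains all items of `query`."""
    q = sorted(query)
    results = []

    def walk(k, i, path):
        if i == len(q) and terminal[k]:
            results.append(set(path))
        for c in children(bits, k):
            x = labels[c]
            if i < len(q) and x > q[i]:
                break             # siblings are sorted, so q[i] cannot appear
            matched = i < len(q) and x == q[i]
            walk(c, i + 1 if matched else i, path + [x])

    walk(0, 0, [])
    return results


stored = [{1, 2, 3}, {1, 3}, {2, 3}, {1, 2, 4}]
bits, labels, terminal = louds_encode(build_trie(stored))
print(superset_query(bits, labels, terminal, {1, 3}))
```

Because every set is stored in sorted order, a branch whose label already exceeds the next unmatched query item can be pruned together with all of its later siblings.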


1998 ◽  
Vol 25 (1-2) ◽  
pp. 55-97 ◽  
Author(s):  
Jiawei Han ◽  
Shojiro Nishio ◽  
Hiroyuki Kawano ◽  
Wei Wang

2008 ◽  
pp. 2121-2140
Author(s):  
Rafal Angryk ◽  
Roy Ladner ◽  
Frederick E. Petry

In this chapter, we consider the application of generalization-based data mining to fuzzy similarity-based object-oriented databases (OODBs). Attribute generalization algorithms have been most commonly applied to relational databases, and we extend these approaches. A key aspect of generalization data mining is the use of a concept hierarchy. The objects of the database are generalized by replacing specific attribute values by the next higher-level term in the hierarchy. This will then eventually result in generalizations that represent a summarization of the information in the database. We focus on the generalization of similarity-based simple fuzzy attributes for an OODB using approaches to the fuzzy concept hierarchy developed from the given similarity relation of the database. Then consideration is given to applying this approach to complex structure-valued data in the fuzzy OODB.
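The generalization step described above can be sketched as follows for the simple crisp case. This is only an illustration under stated assumptions: the hierarchy, its terms, and the `generalize` function are hypothetical, and the fuzzy similarity-based hierarchy the chapter builds is not modeled here.

```python
from collections import Counter

# Hypothetical concept hierarchy: each term maps to its next higher-level term.
HIERARCHY = {
    "sparrow": "bird", "eagle": "bird",
    "salmon": "fish", "trout": "fish",
    "bird": "animal", "fish": "animal",
}


def generalize(values, hierarchy, levels=1):
    """Replace each value by its ancestor `levels` steps up the hierarchy.

    Values that are already at the top of the hierarchy are left unchanged.
    Returns the generalized values with their counts, i.e. a summary.
    """
    generalized = []
    for value in values:
        for _ in range(levels):
            value = hierarchy.get(value, value)
        generalized.append(value)
    return Counter(generalized)


observations = ["sparrow", "eagle", "salmon"]
print(generalize(observations, HIERARCHY))            # one level up
print(generalize(observations, HIERARCHY, levels=2))  # two levels up
```

Each additional level merges more distinct values into the same term, which is what turns the raw attribute values into a progressively coarser summary.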


Data ◽  
2020 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Ahmed Elmogy ◽  
Hamada Rizk ◽  
Amany M. Sarhan

In data mining, outlier detection is a major challenge because it plays an important role in many applications, such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively and in time, on request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.


2011 ◽  
Vol 403-408 ◽  
pp. 1834-1838
Author(s):  
Jing Zhao ◽  
Chong Zhao Han ◽  
Bin Wei ◽  
De Qiang Han

Discretization of continuous attributes has played an important role in machine learning and data mining. It can not only improve the performance of the classifier but also reduce the storage space. The Univariate Marginal Distribution Algorithm is a modified evolutionary algorithm that has some advantages over classical evolutionary algorithms, such as fast convergence and few parameters to tune. In this paper, we proposed a bottom-up, global, dynamic, and supervised discretization method on the basis of the Univariate Marginal Distribution Algorithm. The experimental results showed that the proposed method could effectively improve the accuracy of the classifier.
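For readers unfamiliar with the Univariate Marginal Distribution Algorithm, here is a minimal sketch of its generic binary form. It does not reproduce the paper's discretization method: the abstract does not say how cut points are encoded or scored, so the bit-string encoding, the clamping bounds, the default parameters, and the placeholder fitness (counting ones) are all assumptions.

```python
import random


def umda(fitness, n_bits, pop_size=50, n_select=25, generations=30, seed=0):
    """Generic binary UMDA: sample, select, re-estimate independent marginals."""
    rng = random.Random(seed)
    p = [0.5] * n_bits            # one independent probability per bit
    best = None
    for _ in range(generations):
        # Sample a population from the current marginal distribution.
        population = [
            [1 if rng.random() < p[j] else 0 for j in range(n_bits)]
            for _ in range(pop_size)
        ]
        population.sort(key=fitness, reverse=True)
        if best is None or fitness(population[0]) > fitness(best):
            best = population[0]
        # Re-estimate each marginal from the selected individuals only.
        selected = population[:n_select]
        p = [
            # Clamp so that no bit becomes fixed too early (an assumption).
            min(0.95, max(0.05, sum(ind[j] for ind in selected) / n_select))
            for j in range(n_bits)
        ]
    return best, p


# Placeholder fitness: number of ones in the bit string.
best, marginals = umda(sum, n_bits=20)
print(sum(best), [round(m, 2) for m in marginals])
```

The only quantities to tune are the population size, the selection size, and the number of generations, which is consistent with the abstract's remark that few parameters are involved. In a discretization setting, one could imagine each bit marking whether a candidate cut point is kept, with classifier accuracy as the fitness, but that mapping is a guess rather than the authors' stated design.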


1996 ◽  
Vol 11 (2) ◽  
pp. 191-192 ◽  
Author(s):  
Stefan Conrad

For the first time, post-conference workshops were organised for the International Conference on Deductive and Object-Oriented Databases (DOOD). There were two workshops focusing on knowledge discovery and temporal reasoning. This report is dedicated to one dealing with temporal reasoning.

