concept classes
Recently Published Documents


Total documents: 35 (last five years: 5)
H-index: 10 (last five years: 0)

2021
Author(s): Hyeoneui Kim, Jinsun Jung, Jisung Choi

BACKGROUND: Dietary habits offer crucial information on one's health and form a considerable part of Patient-Generated Health Data (PGHD). Dietary data are collected through various channels and in various formats; thus, interoperability is a significant challenge to reusing the data. The vast scope of dietary concepts and the colloquial style of expression add difficulty to the standardization task. Common Data Elements (CDE) with metadata annotation and ontological structuring of dietary concepts address the interoperability issues of dietary data to some extent. However, making culture-specific dietary habits and questionnaire-based dietary assessment data interoperable still requires additional effort.
OBJECTIVE: The main goal of this study was to address the interoperability challenge in dietary concepts by combining ontological curation of dietary concepts and metadata annotation of questionnaire-based dietary data. Specifically, this study aimed to develop a Dietary Lifestyle Ontology (DILON) and to demonstrate the improved interoperability of questionnaire-based dietary data by annotating its main semantics with DILON.
METHODS: By analyzing 1158 dietary assessment data elements (367 in Korean and 791 in English), 515 dietary concepts were extracted and used to construct DILON. To demonstrate the utility of DILON in improving the interoperability of multi-cultural questionnaire-based dietary data, ten Competency Questions (CQs) were developed that identified data elements sharing the same dietary topics and measurement qualities. As test cases, 68 dietary habit data elements from Korean and English questionnaires were instantiated and annotated with the dietary concepts in DILON. The competency questions were translated into Semantic Query-enhanced Web Rule Language (SQWRL), and the query results were reviewed for accuracy.
RESULTS: DILON was built with 260 concept classes and 486 instances and was successfully validated with ontology validation tools. A small overlap (72 concepts) in the concepts extracted from the questionnaires in the two languages indicates the need to pay closer attention to representing culture-specific dietary concepts. The SQWRL queries reflecting the 10 CQs yielded the correct results.
CONCLUSIONS: Ensuring the interoperability of dietary lifestyle data is a demanding task due to its vast scope and variations in expression. This study demonstrated that, when combined with common data elements and semantic metadata annotation, ontology can effectively mediate the interoperability of dietary data generated in different cultural contexts and expressed in various styles.


2020, Vol 2 (2)
Author(s): Matthias C. Caro, Ishaun Datta

We characterize the expressive power of quantum circuits with the pseudo-dimension, a measure of complexity for probabilistic concept classes. We prove pseudo-dimension bounds on the output probability distributions of quantum circuits; the upper bounds are polynomial in circuit depth and number of gates. Using these bounds, we exhibit a class of circuit output states out of which at least one has exponential gate complexity of state preparation, and moreover demonstrate that quantum circuits of known polynomial size and depth are PAC-learnable.
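
For readers unfamiliar with the term, the standard textbook definition of the pseudo-dimension can be written as follows. This is the general definition for a class of real-valued functions, not a formula taken from the abstract; the symbols F (function class), X (domain), r_i (thresholds) and b (labelling) are notation introduced here for illustration.

```latex
% A set {x_1, ..., x_m} is pseudo-shattered by F if thresholds r_1, ..., r_m exist
% such that every above/below pattern b is realised by some f in F.
\[
  \operatorname{Pdim}(\mathcal{F})
  = \sup\Bigl\{\, m \in \mathbb{N} \;:\;
      \exists\, x_1,\dots,x_m \in X,\;
      \exists\, r_1,\dots,r_m \in \mathbb{R},\;
      \forall\, b \in \{0,1\}^m,\;
      \exists\, f \in \mathcal{F} \text{ such that }
      \bigl( f(x_i) > r_i \iff b_i = 1 \bigr) \text{ for all } i
  \,\Bigr\}
\]
```

For {0,1}-valued function classes this quantity coincides with the Vapnik-Chervonenkis dimension.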


Computers, 2020, Vol 9 (4), pp. 79
Author(s): Graham Spinks, Marie-Francine Moens

This paper proposes a novel technique for representing templates and instances of concept classes. A template representation refers to the generic representation that captures the characteristics of an entire class. The proposed technique uses end-to-end deep learning to learn structured and composable representations from input images and discrete labels. The obtained representations are based on distance estimates between the distributions given by the class label and those given by contextual information, which are modeled as environments. We prove that the representations have a clear structure that allows them to be decomposed into factors representing classes and environments. We evaluate our novel technique on classification and retrieval tasks involving different modalities (visual and language data). In various experiments, we show how the representations can be compressed and how different hyperparameters impact performance.


2020, Vol 17 (163), pp. 20190612
Author(s): Ludwig Lausser, Robin Szekely, Attila Klimmek, Florian Schmid, Hans A. Kestler

Analysing molecular profiles requires the selection of classification models that can cope with the high dimensionality and variability of these data. Also, improper reference point choice and scaling pose additional challenges. Often model selection is somewhat guided by ad hoc simulations rather than by sophisticated considerations on the properties of a categorization model. Here, we derive and report four linked linear concept classes/models with distinct invariance properties for high-dimensional molecular classification. We can further show that these concept classes also form a half-order of complexity classes in terms of Vapnik–Chervonenkis dimensions, which also implies increased generalization abilities. We implemented support vector machines with these properties. Surprisingly, we were able to attain comparable or even superior generalization abilities to the standard linear one on the 27 investigated RNA-Seq and microarray datasets. Our results indicate that a priori chosen invariant models can replace ad hoc robustness analysis by interpretable and theoretically guaranteed properties in molecular categorization.


Author(s): Yu. I. Shokin, A. V. Yurchenko

Introduction: Storage and usage of research data become more sophisticated as their quantity and diversity grow. Research data have a number of features which do not allow you to copy the approaches and tools used in commercial or governmental data-processing facilities. Providing researchers with specialized tools for working with data is an urgent task in research management.
Purpose: Identifying and describing the basic principles for working with research data, the processes and stages of this work, the mechanisms for implementing the principles and solving the problems of organizing the storage and usage of research data.
Results: We review and discuss the principles on which the storage and usage of research data can be based, including the FAIR Data Principles. The main goal of organizing the work with research data and the central focus of its principles is the effective use and reuse of this data. We present a hierarchy of mechanisms which can be applied when working with research data for solving scientific and organizational problems. The main processes and lifecycle stages of scientific data and research processes based on them are listed in the article. A number of well-known models of such lifecycles are considered. It is proposed, instead of trying to build a universal model, to use or create models based on the presented list of stages for specific cases or classes of data-driven research.
Practical relevance: The hierarchy of concept classes developed in the work for the field “Organizing the storage and usage of scientific data” will be used as an ontology core, and for the development of regulatory documents, recommendations and information systems supporting data-driven research.


2009, Vol 03 (04), pp. 421-444
Author(s): Lin Lin, Mei-Ling Shyu

Two important approaches in multimedia information retrieval are classification and the ranking of the retrieved results. The technique of performing classification using Association Rule Mining (ARM) has been utilized to detect the high-level features from the video, taking advantage of its high efficiency and accuracy. Motivated by the fact that users are only interested in the top-ranked relevant results, ranking strategies have been adopted to sort the retrieved results. In this paper, an effective and efficient video high-level semantic retrieval framework that utilizes associations and correlations to retrieve and rank the high-level features is developed. The n-feature-value pair rules are generated using a combined measure based on (1) the existence of the (n - 1)-feature-value pairs, where n is larger than 1, (2) the correlation between different n-feature-value pairs and the concept classes through Multiple Correspondence Analysis (MCA), and (3) the similarity representing the harmonic mean of the inter-similarity and intra-similarity. The final association classification rules are selected by using the calculated similarity values. Then our proposed ranking process uses the scores that integrate the correlation and similarity values to rank the retrieved results. To show the robustness of the proposed framework, experiments with 15 high-level features (concepts) and benchmark data sets from TRECVID and comparisons with 6 other well-known classifiers are presented. Our proposed framework achieves promising performance and outperforms all the other classifiers. Moreover, the final ranked retrieved results are evaluated by the mean average precision measure, which is commonly used for performance evaluation in the TRECVID community.
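
Two of the quantities named in this abstract, the harmonic mean of two similarity values and mean average precision over ranked result lists, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, relevance is assumed to be binary, and average precision is normalised here by the number of relevant items that appear in the list (evaluations such as TRECVID normalise by the total number of relevant items in the collection).

```python
from typing import Sequence


def harmonic_mean(inter_sim: float, intra_sim: float) -> float:
    """Harmonic mean of two non-negative similarity values (hypothetical helper)."""
    if inter_sim + intra_sim == 0:
        return 0.0
    return 2 * inter_sim * intra_sim / (inter_sim + intra_sim)


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision of one ranked list; relevance[i] is 1 if the item at rank i+1 is relevant.

    Assumption: normalised by the number of relevant items present in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[int]]) -> float:
    """Mean of the per-list average precision values (one list per concept or query)."""
    if not runs:
        return 0.0
    return sum(average_precision(r) for r in runs) / len(runs)


if __name__ == "__main__":
    print(harmonic_mean(0.5, 1.0))
    print(mean_average_precision([[1, 0, 1, 0], [0, 1]]))
```

Because the harmonic mean is dominated by the smaller of its two inputs, a rule scores well on this measure only when both similarity values are high.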

