data harmonisation
Recently Published Documents


TOTAL DOCUMENTS: 30 (FIVE YEARS 15)

H-INDEX: 4 (FIVE YEARS 1)

2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Deborah Bamber ◽  
Helen E. Collins ◽  
Charlotte Powell ◽  
Gonçalo Campos Gonçalves ◽  
Samantha Johnson ◽  
...  

Abstract Background The small sample sizes available within many very preterm (VPT) longitudinal birth cohort studies mean that it is often necessary to combine and harmonise data from individual studies to increase statistical power, especially for studying rare outcomes. Curating and mapping data is a vital first step in the process of data harmonisation. To facilitate data mapping and harmonisation across VPT birth cohort studies, we developed a custom classification system as part of the Research on European Children and Adults born Preterm (RECAP Preterm) project in order to increase the scope and generalisability of research and the evaluation of outcomes across the lifespan for individuals born VPT. Methods The multidisciplinary consortium of expert clinicians and researchers who made up the RECAP Preterm project participated in a four-phase consultation process via email questionnaire to develop a topic-specific classification system. Descriptive analyses were calculated after each questionnaire round to provide pre- and post- ratings to assess levels of agreement with the classification system as it developed. Amendments and refinements were made to the classification system after each round. Results Expert input from 23 clinicians and researchers from the RECAP Preterm project aided development of the classification system’s topic content, refining it from 10 modules, 48 themes and 197 domains to 14 modules, 93 themes and 345 domains. Supplementary classifications for target, source, mode and instrument were also developed to capture additional variable-level information. Over 22,000 individual data variables relating to VPT birth outcomes have been mapped to the classification system to date to facilitate data harmonisation. This will continue to increase as retrospective data items are mapped and harmonised variables are created. 
Conclusions This bespoke preterm birth classification system is a fundamental component of the RECAP Preterm project’s web-based interactive platform. It is freely available for use worldwide by those interested in research into the long-term impact of VPT birth. It can also be used to inform the development of future cohort studies.


2021 ◽  
Author(s):  
Bhavani Shankara Bagepally ◽  
Usa Chaikledkaew ◽  
Nathorn Chaiyakunapruk ◽  
John Attia ◽  
Ammarin Thakkinstian

Abstract Background In the context of ever-growing health expenditure and limited resources, economic evaluations aid in making evidence-informed policy decisions. Cost utility analyses (CUA) are often used in this context, but limitations include pairwise contrasts, missing contrasts, and different sources or quality of data. Results Synthesis of CUA data from multiple studies is therefore desirable to assist policy makers, but there are many challenging methodological issues, including inconsistent reporting of results using different economic parameters and multiple sources of heterogeneity: setting, time horizon, perspective, modelling approaches and assumptions, currency, willingness to pay (WTP) threshold, level of country income, and input parameters. In this paper, we provide a step-by-step description of the methods for data harmonisation and synthesis of aggregated data from CUA studies, as well as a framework for handling heterogeneity; we demonstrate these methods using the example of agents for type 2 diabetes. Conclusion These meta-analytic methods for the synthesis of economic evidence should be useful for policy makers.


2021 ◽  
Vol 50 (Supplement_1) ◽  
Author(s):  
Kathryn Eastwood ◽  
Alison Johnson ◽  
Angela Jones ◽  
Peter Cameron ◽  
Helena Teede

Abstract Background Australia has eight state-based ambulance services and New Zealand (NZ) has two. Significant variation between their datasets compromises cross-border research opportunities and translation of research to improve patient care. Ambulance data harmonisation has occurred in the United States and United Kingdom; however, to date no data harmonisation has occurred in Australia. This study aims to compare ambulance service variables in Australia and NZ to identify opportunities and barriers for harmonisation. Method Available 2019 variables were mapped to each other and to several international standardised terminology systems to identify variations and similarities in variable names and definitions, and harmonisation opportunities. Results Four Australian ambulance services used one electronic patient care record (ePCR) system, three used other ePCR systems, one used paper-based records and both NZ services used one ePCR system. Only the NZ services had mapped their variables to two international standardised terminology systems. Barriers to harmonisation included the variables collected, the variable definitions and the variable naming convention. The number of core variables available for mapping varied from 27 to 69. Differences included similar variable names having different definitions, variables that should have different definitions having the same definition, and naming conventions for similar or identical variables differing between services. Conclusions Ambulance service data harmonisation in Australia and NZ is possible and presents significant opportunities for improvement in patient outcomes and performance audit. It would facilitate quality, large-scale, high-impact collaborative national and international research. Key Message There is an opportunity for Australian and NZ ambulance services to harmonise their data to conduct large-scale international research.


2021 ◽  
Author(s):  
Tiffany K Bell ◽  
Kate J Godfrey ◽  
Ashley L Ware ◽  
Keith Owen Yeates ◽  
Ashley DK Harris

Magnetic resonance spectroscopy (MRS) is a non-invasive neuroimaging technique used to measure brain chemistry in vivo and has been used to study the healthy brain as well as neuropathology in numerous neurological disorders. The number of multi-site studies using MRS is increasing; however, non-biological variability introduced during data collection across multiple sites, such as differences in scanner vendors and site-specific acquisition implementations for MRS, can obscure detection of biological effects of interest. ComBat is a data harmonisation technique that can remove non-biological sources of variance in multi-site studies. It has been validated for use with structural and functional MRI metrics but not for MRS metabolites. This study investigated the validity of using ComBat to harmonise MRS metabolites for vendor and site differences. Analyses were performed using data acquired across 20 sites and included edited MRS for GABA+ (N=218) and macromolecule-suppressed GABA data (N=209), as well as standard PRESS data to quantify NAA, creatine, choline, and glutamate (N=195). ComBat harmonisation successfully mitigated vendor and site differences for all metabolites of interest. Moreover, significant associations were detected between sex and choline levels and between age and glutamate and GABA+ levels that were not detectable prior to harmonisation, confirming the importance of removing site and vendor effects in multi-site data. In conclusion, ComBat harmonisation can be successfully applied to MRS data in multi-site MRS studies.
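
The idea of removing site effects can be illustrated with a much simpler location-and-scale adjustment. The sketch below is not ComBat itself: it only aligns each site's mean and spread with the pooled values, whereas ComBat additionally preserves covariate effects (such as age and sex) and stabilises per-site estimates with empirical Bayes. The function name and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def harmonise_location_scale(values, sites):
    """Align every site's mean and spread with the pooled mean and spread.

    Simplified stand-in for ComBat: no covariate preservation and no
    empirical Bayes shrinkage, so real biological differences that are
    confounded with site would be removed as well.
    """
    pooled_mean = mean(values)
    pooled_sd = pstdev(values)

    # Collect the measurements belonging to each site.
    by_site = {}
    for value, site in zip(values, sites):
        by_site.setdefault(site, []).append(value)
    site_stats = {s: (mean(v), pstdev(v)) for s, v in by_site.items()}

    harmonised = []
    for value, site in zip(values, sites):
        site_mean, site_sd = site_stats[site]
        # Standardise within the site, then map back to the pooled scale.
        z = (value - site_mean) / site_sd if site_sd > 0 else 0.0
        harmonised.append(pooled_mean + z * pooled_sd)
    return harmonised


# Toy metabolite values from two hypothetical sites with an offset.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
sites = ["A", "A", "A", "B", "B", "B"]
adjusted = harmonise_location_scale(values, sites)
```

After adjustment both hypothetical sites share the same mean, so the offset between them no longer masks within-site variation.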


2021 ◽  
Vol 127 ◽  
pp. 360-370
Author(s):  
Akhtar Zeb ◽  
Juha-Pekka Soininen ◽  
Nesli Sozer

Trials ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Sophie Relph ◽  
Maria Elstad ◽  
Bolaji Coker ◽  
Matias C. Vieira ◽  
...  

Abstract Background The use of electronic patient records for assessing outcomes in clinical trials is a methodological strategy intended to drive faster and more cost-efficient acquisition of results. The aim of this manuscript was to outline the data collection and management considerations of a maternity and perinatal clinical trial using data from electronic patient records, exemplifying the DESiGN Trial as a case study. Methods The DESiGN Trial is a cluster randomised controlled trial assessing the effect of a complex intervention versus standard care for identifying small for gestational age foetuses. Data on maternal/perinatal characteristics and outcomes including infants admitted to neonatal care, parameters from foetal ultrasound and details of hospital activity for health-economic evaluation were collected at two time points from four types of electronic patient records held in 22 different electronic record systems at the 13 research clusters. Data were pseudonymised on site using a bespoke Microsoft Excel macro and securely transferred to the central data store. Data quality checks were undertaken. Rules for data harmonisation of the raw data were developed and a data dictionary produced, along with rules and assumptions for data linkage of the datasets. The dictionary included descriptions of the rationale and assumptions for data harmonisation and quality checks. Results Data were collected on 182,052 babies from 178,350 pregnancies in 165,397 unique women. Data availability and completeness varied across research sites; each of eight variables which were key to calculation of the primary outcome was completely missing in median 3 (range 1–4) clusters at the time of the first data download. This improved by the second data download following clarification of instructions to the research sites (each of the eight key variables was completely missing in median 1 (range 0–1) cluster at the second time point). 
Common data management challenges were harmonising a single variable from multiple sources and categorising free-text data; solutions were developed for this trial. Conclusions Conduct of clinical trials which use electronic patient records for the assessment of outcomes can be time and cost-effective but still requires appropriate time and resources to maximise data quality. A difficulty for pregnancy and perinatal research in the UK is the wide variety of different systems used to collect patient data across maternity units. In this manuscript, we describe how we managed this and provide a detailed data dictionary covering the harmonisation of variable names and values that will be helpful for other researchers working with these data. Trial registration Primary registry and trial identifying number: ISRCTN 67698474. Registered on 02/11/16.
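
A data dictionary that harmonises variable names and values from several record systems can be sketched roughly as follows. This is a minimal illustration, not the DESiGN Trial's dictionary: the system names, variable names and codes are all hypothetical.

```python
# Hypothetical data dictionary: for each source system, map the local
# variable name to a harmonised name plus a recoding of its local values.
DATA_DICTIONARY = {
    "system_a": {
        "BabySex": ("infant_sex", {"M": "male", "F": "female"}),
    },
    "system_b": {
        "sex_of_baby": ("infant_sex", {"1": "male", "2": "female"}),
    },
}


def harmonise_record(source, record):
    """Rename and recode one raw record using the rules for its source."""
    rules = DATA_DICTIONARY[source]
    harmonised = {}
    for local_name, raw_value in record.items():
        if local_name not in rules:
            continue  # variable is not covered by the dictionary
        name, value_map = rules[local_name]
        key = str(raw_value).strip()
        # Unmapped values become None so they can be flagged for review.
        harmonised[name] = value_map.get(key)
    return harmonised


row_a = harmonise_record("system_a", {"BabySex": "M"})
row_b = harmonise_record("system_b", {"sex_of_baby": 2, "ward": "x"})
```

Keeping the rules in one table, rather than scattered through processing scripts, is what lets the rationale and assumptions for each rule be documented alongside it.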


2021 ◽  
Author(s):  
Sophie Relph ◽  
Maria Elstad ◽  
Bolaji Coker ◽  
Matias C Vieira ◽  
Natalie Moitt ◽  
...  

Abstract Background The use of electronic patient records for assessing outcomes in clinical trials is a methodological strategy intended to drive faster and more cost-efficient acquisition of results. The aim of this manuscript was to outline the data collection and management considerations of a maternity and perinatal clinical trial using data from electronic patient records, exemplifying the DESiGN Trial as a case study. Methods The DESiGN Trial is a cluster randomised controlled trial assessing the effect of a complex intervention versus standard care for identifying small for gestational age fetuses. Data on maternal/perinatal characteristics and outcomes including infants admitted to neonatal care, parameters from fetal ultrasound and details of hospital activity for health-economic evaluation were collected at two time points from four types of electronic patient records held in 22 different electronic record systems at the 13 research clusters. Data were pseudonymised on site using a bespoke Microsoft Excel macro and securely transferred to the central data store. Data quality checks were undertaken. Rules for data harmonisation of the raw data were developed and a data dictionary produced, along with rules and assumptions for data linkage of the datasets. The dictionary included descriptions of the rationale and assumptions for data harmonisation and quality checks. Results Data were collected on 182,052 babies from 178,350 pregnancies in 165,397 unique women. Data availability and completeness varied across research sites; each of eight variables which were key to calculation of the primary outcome was completely missing in median 3 (range 1-4) clusters at the time of the first data download. This improved by the second data download following clarification of instructions to the research sites (each of the eight key variables was completely missing in median 1 (range 0-1) cluster at the second time point). 
Common data management challenges were harmonising a single variable from multiple sources and categorising free-text data; solutions were developed for this trial. Conclusions Conduct of clinical trials which use electronic patient records for the assessment of outcomes can be time and cost-effective but still requires appropriate time and resources to maximise data quality. A difficulty for pregnancy and perinatal research in the UK is the wide variety of different systems used to collect patient data across maternity units. In this manuscript we describe how we managed this and provide a detailed data dictionary which covers the harmonisation of variable names and variables that will be helpful for other researchers working with these data. Trial Registration: Primary registry and trial identifying number: ISRCTN 67698474. Registered 02/11/16. https://doi.org/10.1186/ISRCTN67698474


2020 ◽  
pp. jech-2020-214259
Author(s):  
Tina W Wey ◽  
Dany Doiron ◽  
Rita Wissa ◽  
Guillaume Fabre ◽  
Irina Motoc ◽  
...  

Background The MINDMAP project implemented a multinational data infrastructure to investigate the direct and interactive effects of urban environments and individual determinants of mental well-being and cognitive function in ageing populations. Using a rigorous process involving multiple teams of experts, longitudinal data from six cohort studies were harmonised to serve MINDMAP objectives. This article documents the retrospective data harmonisation process achieved based on the Maelstrom Research approach and provides a descriptive analysis of the harmonised data generated. Methods A list of core variables (the DataSchema) to be generated across cohorts was first defined, and the potential for cohort-specific data sets to generate the DataSchema variables was assessed. Where relevant, algorithms were developed to process cohort-specific data into DataSchema format, and information to be provided to data users was documented. Procedures and harmonisation decisions were thoroughly documented. Results The MINDMAP DataSchema (v2.0, April 2020) comprised a total of 2841 variables (993 on individual determinants and outcomes, 1848 on environmental exposures) distributed across up to seven data collection events. The harmonised data set included 220 621 participants from six cohorts (10 subpopulations). Harmonisation potential, participant distributions and missing values varied across data sets and variable domains. Conclusion The MINDMAP project implemented a collaborative and transparent process to generate a rich integrated data set for research in ageing, mental well-being and the urban environment. The harmonised data set supports a range of research activities and will continue to be updated to serve ongoing and future MINDMAP research needs.
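
The two steps described in the Methods (assessing whether each cohort can generate a DataSchema variable, then applying a cohort-specific algorithm) might look something like the sketch below. It is an illustration under assumed names only: the `age_years` variable, the cohort labels and the field names are hypothetical and are not taken from the MINDMAP DataSchema.

```python
def age_as_recorded(row):
    # Cohort that stores age directly.
    return row["age"]


def age_from_years(row):
    # Cohort that stores birth year and interview year instead.
    return row["interview_year"] - row["birth_year"]


# DataSchema variable -> cohort -> processing algorithm.
# None records the decision that the cohort cannot generate the variable.
ALGORITHMS = {
    "age_years": {
        "cohort_1": age_as_recorded,
        "cohort_2": age_from_years,
        "cohort_3": None,
    },
}


def harmonisation_potential(algorithms):
    """Summarise, per variable and cohort, whether harmonisation is possible."""
    return {
        variable: {cohort: fn is not None for cohort, fn in per_cohort.items()}
        for variable, per_cohort in algorithms.items()
    }


def generate(variable, cohort, row):
    """Apply the cohort-specific algorithm, or return None if impossible."""
    fn = ALGORITHMS[variable][cohort]
    return None if fn is None else fn(row)


potential = harmonisation_potential(ALGORITHMS)
age = generate("age_years", "cohort_2", {"interview_year": 2015, "birth_year": 1950})
```

Recording the "cannot be generated" decisions explicitly, instead of simply omitting them, keeps the harmonisation potential visible to data users.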


2020 ◽  
Author(s):  
Jill Evans ◽  
Paul Biggs ◽  
Mark Elliott

Abstract Background: Osteoarthritis is a heterogeneous condition characterised by a wide variety of factors and represents a worldwide healthcare challenge. There are multiple clinical and research specialisms involved in the diagnosis, prognosis and treatment of osteoarthritis, and there may be opportunities to share or pool data which are currently not being utilised. However, there are challenges to doing so which require carefully structured solutions and partnership working. Methods: Interviews were conducted with nine experts from various fields within osteoarthritis research. A semi-structured approach was used, and thematic analysis applied to the results. Results: Generally, osteoarthritis researchers were supportive of data sharing, provided it is done responsibly and without impacting data integrity. Benefits identified included increasing the power of typically low-powered data, the potential for machine learning opportunities, and the potential for improved patient outcomes. However, a number of challenges were identified, in particular related to data security, data harmonisation, storage costs, ethical considerations and governance. Conclusions: There is clear support for increased data sharing and partnership working in osteoarthritis research. Further investigation will be required to navigate the complex issues identified; however, it is clear that collaborative opportunities should be better facilitated and there may be innovative ways to do this. It is also clear that nomenclature within different disciplines could be better streamlined, to improve existing opportunities to harmonise data.

