fuzzy matching
Recently Published Documents


Total documents: 119 (five years: 24)

H-index: 12 (five years: 2)

Author(s):  
Ryan J. Urbanowicz ◽  
John H. Holmes ◽  
Dina Appleby ◽  
Vanamala Narasimhan ◽  
Stephen Durborow ◽  
...  

Abstract

Objective: Data harmonization is essential to integrate individual participant data from multiple sites, time periods, and trials for meta-analysis. The process of mapping terms and phrases to an ontology is complicated by typographic errors, abbreviations, truncation, and plurality. We sought to harmonize medical history (MH) and adverse events (AE) term records across 21 randomized clinical trials in pulmonary arterial hypertension and chronic thromboembolic pulmonary hypertension.

Methods: We developed and applied a semi-automated harmonization pipeline for use with domain-expert annotators to resolve ambiguous term mappings using exact and fuzzy matching. We summarized MH and AE term mapping success, including map quality measures, and imputation of a generalizing term hierarchy as defined by the applied Medical Dictionary for Regulatory Activities (MedDRA) ontology standard.

Results: Over 99.6% of both MH (N = 37,105) and AE (N = 58,170) records were successfully mapped to MedDRA low-level terms. Automated exact matching accounted for 74.9% of MH and 85.5% of AE mappings. Term recommendations from fuzzy matching in the pipeline facilitated annotator mapping of the remaining 24.9% of MH and 13.8% of AE records. Imputation of the generalized MedDRA term hierarchy was unambiguous in 85.2% of high-level terms, 99.4% of high-level group terms, and 99.5% of system organ class in MH, and 75% of high-level terms, 98.3% of high-level group terms, and 98.4% of system organ class in AE.

Conclusion: This pipeline dramatically reduced the burden of manual annotation for MH and AE term harmonization and could be adapted to other data integration efforts.
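The exact-then-fuzzy mapping idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the term list, the `normalize` and `map_term` helpers, and the 0.6 similarity cutoff are all assumptions made for illustration, and the similarity measure is simply the one built into Python's standard `difflib` module.

```python
from difflib import get_close_matches

# Hypothetical stand-in for an ontology's term list (not actual MedDRA content).
ONTOLOGY_TERMS = ["headache", "nausea", "dizziness", "peripheral oedema", "dyspnoea"]


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivial differences still match exactly."""
    return " ".join(term.lower().split())


def map_term(raw: str, terms=ONTOLOGY_TERMS, n=3, cutoff=0.6):
    """Try an exact match first; otherwise return fuzzy candidates.

    Returns ("exact", [term]) on an exact hit, or ("fuzzy", candidates),
    where candidates (possibly empty) are suggestions for a human annotator.
    The cutoff value is an assumed default, not one taken from the paper.
    """
    cleaned = normalize(raw)
    if cleaned in terms:
        return "exact", [cleaned]
    return "fuzzy", get_close_matches(cleaned, terms, n=n, cutoff=cutoff)


if __name__ == "__main__":
    for record in ["  Headache ", "nausia", "xyzzy"]:
        print(record, "->", map_term(record))
```

In a semi-automated setting of this kind, exact hits would be accepted automatically, while fuzzy candidates would only be shown to an annotator as recommendations rather than applied directly.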


2021 ◽  
Author(s):  
Yullia Franko ◽  
Natalia Porplytsya ◽  
Mykhailo Ozhha ◽  
Olha Potapchuk ◽  
Yuriy Franko

Author(s):  
Carole Faviez ◽  
Pierre Foulquié ◽  
Xiaoyi Chen ◽  
Adel Mebarki ◽  
Sophie Quennelle ◽  
...  

The exhaustive automatic detection of symptoms in social media posts is made difficult by colloquial expressions, misspellings, and inflected word forms. Detecting self-reported symptoms is of major importance for emerging diseases such as Covid-19. In this study, we aimed to (1) develop an algorithm based on fuzzy matching to detect symptoms in tweets, (2) establish a comprehensive list of Covid-19-related symptoms, and (3) evaluate fuzzy matching for Covid-19-related symptom detection in French tweets. The Covid-19-related symptom list was built by aggregating different data sources. French Covid-19-related tweets were automatically extracted using a dedicated data broker during the first wave of the pandemic in France. The fuzzy matching parameters were fine-tuned using all symptoms from MedDRA and then evaluated on a subset of 5000 Covid-19-related tweets in French for the detection of symptoms from our Covid-19-related list. Fuzzy matching improved detection, adding 42% more correct matches at 81% precision.
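As a rough illustration of fuzzy symptom detection in short texts, the sketch below slides a word window over a tokenized post and compares each window with every entry of a symptom list. It is not the algorithm evaluated in the study: the four-item symptom list, the `detect_symptoms` helper, and the 0.8 threshold are hypothetical, and the similarity score is the ratio from Python's standard `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, tiny symptom list for illustration only.
SYMPTOMS = ["fièvre", "toux", "fatigue", "perte de goût"]


def detect_symptoms(text, symptoms=SYMPTOMS, threshold=0.8):
    """Return the set of symptoms that approximately occur in the text.

    For each symptom, every window with the same number of words is compared
    to the symptom string; a similarity ratio at or above the (assumed)
    threshold counts as a detection.
    """
    tokens = re.findall(r"\w+", text.lower())
    found = set()
    for symptom in symptoms:
        size = len(symptom.split())
        for i in range(len(tokens) - size + 1):
            window = " ".join(tokens[i:i + size])
            if SequenceMatcher(None, window, symptom).ratio() >= threshold:
                found.add(symptom)
                break  # one hit per symptom is enough
    return found


if __name__ == "__main__":
    # "fievre" is missing its accent but is still close enough to "fièvre".
    print(detect_symptoms("J'ai de la fievre et une grosse fatigue"))
```

Lowering the threshold would catch more misspellings and inflected forms at the cost of more false positives, which is the precision trade-off that parameter tuning has to balance.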


Optik ◽  
2021 ◽  
pp. 166991
Author(s):  
Chengyu Zhu ◽  
Yuxin Li ◽  
Hang Yuan ◽  
Yulei Wang ◽  
Lingxi Liang ◽  
...  

2021 ◽  
pp. 39-62
Author(s):  
Yuanhao Wang ◽  
Qiong Huang ◽  
Hongbo Li ◽  
Meiyan Xiao ◽  
Jianye Huang ◽  
...  

2021 ◽  
pp. 909-920
Author(s):  
Qinwen Zuo ◽  
Fred Wu ◽  
Fei Yan ◽  
Shaofei Lu ◽  
Colmenares-diaz Eduardo ◽  
...  

Author(s):  
Chaoqiong Fan ◽  
Bin Li ◽  
Yi Wu ◽  
Jun Zhang ◽  
Zheng Yang ◽  
...  

2021 ◽  
pp. 572-583
Author(s):  
Jitao Zhang ◽  
Shihong Chen ◽  
Haoling Zhang ◽  
Yue Shen ◽  
Zhi Ping
