GENETEX—a GENomics Report TEXt mining R package and Shiny application designed to capture real-world clinico-genomic data

JAMIA Open ◽  
2021 ◽  
Vol 4 (3) ◽  
Author(s):  
David M Miller ◽  
Sophia Z Shalhout

Abstract. Objectives: Clinico-genomic data (CGD) acquired through routine clinical practice have the potential to improve our understanding of clinical oncology. However, these data often reside in heterogeneous and semistructured sources, resulting in prolonged time-to-analysis. Materials and Methods: We created GENETEX, an R package and Shiny application for text mining genomic reports from electronic health records (EHRs) and importing the results directly into Research Electronic Data Capture (REDCap). Results: GENETEX facilitates the abstraction of CGD from EHRs and streamlines the capture of structured data into REDCap. Its functions include natural language processing of key genomic information, transformation of semistructured data into structured data, and importation into REDCap. When evaluated against manual abstraction, GENETEX had >99% agreement and captured CGD in approximately one-fifth the time. Conclusions: GENETEX is freely available under the Massachusetts Institute of Technology license and can be obtained from GitHub (https://github.com/TheMillerLab/genetex). GENETEX is executed in R and deployed as a Shiny application for non-R users. It produces high-fidelity abstraction of CGD in a fraction of the time required for manual abstraction.
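The step described above as "transformation of semistructured data into structured data" can be illustrated with a minimal sketch. This is not GENETEX's code or API (the package is written in R, and its functions are not shown in the abstract). The report layout, the regular expressions, and the names `extract_genomic_fields` and `SAMPLE_REPORT` are all assumptions made for illustration, and the REDCap import step is omitted.

```python
import re

# Hypothetical layout: one "GENE VARIANT" pair per line, e.g. "BRAF V600E".
# Real genomic reports vary widely; this pattern covers only simple
# single-residue protein changes and is an assumption for illustration.
VARIANT_RE = re.compile(
    r"^[ \t]*([A-Z][A-Z0-9]{1,9})[ \t]+(?:p\.)?([A-Z]\d+[A-Z])[ \t]*$",
    re.MULTILINE,
)

# Hypothetical wording of a tumor mutational burden line.
TMB_RE = re.compile(
    r"tumor mutational burden:\s*(\d+(?:\.\d+)?)\s*muts/mb",
    re.IGNORECASE,
)


def extract_genomic_fields(report_text):
    """Turn semistructured report text into a structured record."""
    variants = [
        {"gene": gene, "protein_change": change}
        for gene, change in VARIANT_RE.findall(report_text)
    ]
    tmb_match = TMB_RE.search(report_text)
    tmb = float(tmb_match.group(1)) if tmb_match else None
    return {"variants": variants, "tmb_muts_per_mb": tmb}


# Invented example text, not taken from any real report.
SAMPLE_REPORT = """\
GENOMIC FINDINGS
BRAF V600E
TP53 p.R175H
Tumor Mutational Burden: 12.5 Muts/Mb
"""

record = extract_genomic_fields(SAMPLE_REPORT)
```

A record shaped like this could then be mapped to data-capture fields. Lines that do not match either pattern, such as the heading, are simply ignored.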

10.2196/20773 ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. e20773 ◽  
Author(s):  
Antoine Neuraz ◽  
Ivan Lerner ◽  
William Digan ◽  
Nicolas Paris ◽  
Rosy Tsopra ◽  
...  

Background: A novel disease poses special challenges for informatics solutions. Biomedical informatics relies for the most part on structured data, which require a preexisting data or knowledge model; however, novel diseases do not have preexisting knowledge models. In an emergent epidemic, language processing can enable rapid conversion of unstructured text to a novel knowledge model. However, although this idea has often been suggested, no opportunity has arisen to actually test it in real time. The current coronavirus disease (COVID-19) pandemic presents such an opportunity. Objective: The aim of this study was to evaluate the added value of information from clinical text in response to emergent diseases using natural language processing (NLP). Methods: We explored the effects of long-term treatment with calcium channel blockers on the outcomes of COVID-19 infection in patients with high blood pressure during in-patient hospital stays using two sources of information: data available strictly from structured electronic health records (EHRs) and data available through structured EHRs and text mining. Results: In this multicenter study involving 39 hospitals, text mining increased the statistical power sufficiently to change a negative result for an adjusted hazard ratio to a positive one. Compared to the baseline structured data, the number of patients available for inclusion in the study increased by 2.95 times, the amount of available information on medications increased by 7.2 times, and the amount of additional phenotypic information increased by 11.9 times. Conclusions: In our study, use of calcium channel blockers was associated with decreased in-hospital mortality in patients with COVID-19 infection. This finding was obtained by quickly adapting an NLP pipeline to the domain of the novel disease; the adapted pipeline still performed well enough to extract useful information. When that information was used to supplement existing structured data, the sample size could be increased sufficiently to see treatment effects that were not previously statistically detectable.


Crisis ◽  
2013 ◽  
Vol 34 (6) ◽  
pp. 434-437 ◽  
Author(s):  
Donald W. MacKenzie

Background: Suicide clusters at Cornell University and the Massachusetts Institute of Technology (MIT) prompted popular and expert speculation of suicide contagion. However, some clustering is to be expected in any random process. Aim: This work tested whether suicide clusters at these two universities differed significantly from those expected under a homogeneous Poisson process, in which suicides occur randomly and independently of one another. Method: Suicide dates were collected for MIT and Cornell for 1990–2012. The Anderson-Darling statistic was used to test the goodness-of-fit of the intervals between suicides to the distribution expected under the Poisson process. Results: Suicides at MIT were consistent with the homogeneous Poisson process, while those at Cornell showed clustering inconsistent with such a process (p = .05). Conclusions: The Anderson-Darling test provides a statistically powerful means to identify suicide clustering in small samples. Practitioners can use this method to test for clustering in relevant communities. The difference in clustering behavior between the two institutions suggests that more institutions should be studied to determine the prevalence of suicide clustering in universities and its causes.
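Under a homogeneous Poisson process, the intervals between events follow an exponential distribution, so the test described above amounts to an Anderson-Darling goodness-of-fit test of the intervals against a fitted exponential. The sketch below is one way to set that up, not the author's procedure. The abstract does not say how p-values were obtained, so the Monte Carlo calibration is a substitution made here, and the function names and example intervals are hypothetical.

```python
import math
import random


def anderson_darling_exponential(intervals):
    """A^2 statistic against an exponential whose mean is estimated from the data."""
    xs = sorted(intervals)
    n = len(xs)
    mean = sum(xs) / n
    eps = 1e-12
    # Fitted exponential CDF at each ordered interval, clamped away from 0 and 1
    # so that the logarithms below are always defined.
    cdf = [min(max(1.0 - math.exp(-x / mean), eps), 1.0 - eps) for x in xs]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


def monte_carlo_p_value(intervals, n_sim=2000, seed=0):
    """Estimate the p-value by simulating exponential samples of the same size.

    The statistic is scale-invariant once the mean is estimated, so simulating
    with rate 1 is sufficient. This calibration is an assumption of the sketch.
    """
    rng = random.Random(seed)
    n = len(intervals)
    observed = anderson_darling_exponential(intervals)
    exceed = 0
    for _ in range(n_sim):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if anderson_darling_exponential(sample) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)


# Hypothetical intervals in days: short gaps punctuated by long quiet periods,
# the pattern that clustering would produce. Not data from the study.
clustered_intervals = [1] * 10 + [300] * 2
p_clustered = monte_carlo_p_value(clustered_intervals)
```

A small p-value indicates that the observed intervals are unlikely under the Poisson model. Simulation avoids relying on tabulated critical values, which matters for the small samples the abstract mentions.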


Author(s):  
Ashraf M. Salama

With an acceptance rate that does not exceed 25% of the papers and articles submitted, IJAR – International Journal of Architectural Research is moving forward to position itself among the leading journals in architecture and urban studies worldwide. This has been the case since the beginning of volume 5, issue 1, March 2011. The journal has also been covered by several databases and indexes since its inception, including Avery Index to Architectural Periodicals, EBSCO-Current Abstracts-Art and Architecture, INTUTE, Directory of Open Access Journals, Pro-Quest, Scopus-Elsevier, and many university library databases across the globe. This is coupled with IJAR being an integral part of the archives and a featured collection of ArchNet and the Aga Khan Documentation Centre at MIT: Massachusetts Institute of Technology, Cambridge, MA. In 2014, IJAR was included in the Quartile 2 (Q2) list of journals in both ‘Architecture’ and ‘Urban Studies.’ As of May 2015, IJAR is ranked 23 out of 83 journals in ‘Architecture’ and 59 out of 119 in ‘Urban Studies.’ Rankings are based on the SJR (SCImago Journal Ranking), an Elsevier-SCOPUS indicator that measures the scientific influence of the average article in a journal. SJR accounts for both the number of citations received by a journal and the importance or prestige of the journals those citations come from. See (http://www.scimagojr.com/index.php) and (http://www.journalmetrics.com/sjr.php) for more information. While the journal now ranks above many distinguished journals in the Elsevier-SCOPUS database, we will keep aspiring to sustain our position and move forward to the Q1 list and eventually to the top 10 journals in the field. However, this requires sustained efforts and conscious endeavours that give attention to quality submissions through a rigorous review process.
This edition of IJAR, volume 9, issue 2, July 2015, includes debates on a wide spectrum of issues, explorations, and investigations in various settings. The issue encompasses sixteen papers addressing cities, settlements, and projects in Europe, South East Asia, and the Middle East. Papers involve international collaborations evidenced by joint contributions and come from scholars in universities, academic institutions, and practices in Belgium; Egypt; Greece; Italy; Jordan; Malaysia; Palestine; Qatar; Saudi Arabia; Serbia; Spain; Turkey; and the United Kingdom. In this editorial I briefly outline the key issues presented in these papers, which include topics relevant to social housing, multigenerational dwelling, practice-based research, sustainable design and biomimetic models, learning environments and learning styles, realism and the postmodern condition, development and planning, urban identity, contemporary landscapes, and cultural values and traditions.


Author(s):  
GERARDO REYES GUZMÁN

Rudiger Dornbusch, a prominent economist at the Massachusetts Institute of Technology (MIT), analyzes in this landmark work topics such as inflation, debt, exchange rates, foreign policy, and emerging markets. The conceptual framework rests on the Chicago school tradition, which starts from the principle that the market is the mechanism that guarantees progress, in contrast to the State, which, in its eagerness to find perfect solutions, regularly fails in its aims.


2021 ◽  
pp. 1-13
Author(s):  
Lamiae Benhayoun ◽  
Daniel Lang

BACKGROUND: The renewed advent of Artificial Intelligence (AI) is inducing profound changes in the classic categories of technology professions and is creating the need for new specific skills. OBJECTIVE: Identify the skills gaps between academic training on AI in French engineering and business schools and the requirements of the labour market. METHOD: Extraction of AI training contents from the schools’ websites and scraping of a job advertisement website, followed by analysis based on a text mining approach using Python code for Natural Language Processing. RESULTS: Categorization of occupations related to AI. Characterization of three classes of skills for the AI market: Technical, Soft and Interdisciplinary. Skills gaps concern some professional certifications and the mastery of specific tools, research abilities, and awareness of the ethical and regulatory dimensions of AI. CONCLUSIONS: A deep analysis using algorithms for Natural Language Processing. Results that provide a better understanding of the AI capability components at the individual and the organizational levels. A study that can help shape educational programs to respond to the AI market requirements.
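The comparison of training contents with job advertisements can be sketched as a simple term-matching exercise. The authors' Python code is not shown in the abstract, so the version below is only an illustration under stated assumptions: the skill vocabulary, the toy documents, and the names `skill_counts` and `skill_gaps` are all hypothetical, and exact phrase matching stands in for whatever NLP pipeline the study used.

```python
import re
from collections import Counter

# Hypothetical skill vocabulary; a real study would derive it from the corpus.
SKILL_VOCAB = ["python", "machine learning", "ethics", "tensorflow", "communication"]


def skill_counts(documents, vocab=SKILL_VOCAB):
    """Count how many documents mention each skill (whole-phrase, case-insensitive)."""
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        for skill in vocab:
            if re.search(r"\b" + re.escape(skill) + r"\b", text):
                counts[skill] += 1
    return counts


def skill_gaps(job_ads, curricula, vocab=SKILL_VOCAB):
    """Skills requested in at least one job ad but absent from every curriculum."""
    demanded = skill_counts(job_ads, vocab)
    taught = skill_counts(curricula, vocab)
    return sorted(skill for skill in demanded if taught[skill] == 0)


# Invented example texts, not data from the study.
job_ads = [
    "Seeking data scientist with Python and TensorFlow experience.",
    "Machine learning engineer; awareness of AI ethics required.",
]
curricula = [
    "Course covers Python programming and machine learning basics.",
]

gaps = skill_gaps(job_ads, curricula)
```

Counting documents rather than raw occurrences keeps one long advertisement from dominating the tally. A fuller pipeline would also normalize synonyms and inflected forms.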

