A Bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data

Biostatistics ◽  
2019 ◽  
Author(s):  
Shuang Jiang ◽  
Guanghua Xiao ◽  
Andrew Y Koh ◽  
Jiwoong Kim ◽  
Qiwei Li ◽  
...  

Summary Microbiome omics approaches can reveal intriguing relationships between the human microbiome and certain disease states. Along with identification of specific bacterial taxa associated with diseases, recent scientific advancements provide mounting evidence that metabolism, genetics, and environmental factors can all modulate these microbial effects. However, the current methods for integrating microbiome data and other covariates are severely lacking. Hence, we present an integrative Bayesian zero-inflated negative binomial regression model that can both distinguish differentially abundant taxa with distinct phenotypes and quantify covariate-taxa effects. Our model demonstrates good performance using simulated data. Furthermore, we successfully integrated microbiome taxonomies and metabolomics in two real microbiome datasets to provide biologically interpretable findings. In all, we proposed a novel integrative Bayesian regression model that features bacterial differential abundance analysis and quantification of microbiome-covariate effects, which makes it suitable for general microbiome studies.
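As a rough illustration of the zero-inflated negative binomial likelihood that such a model builds on, the sketch below evaluates the probability mass function in a common mean/dispersion parameterization. The function names and the parameter values (`pi`, `mu`, `r`) are assumptions made for this example; the summary does not state the authors' parameterization, priors, or regression structure, and none of that is reproduced here.

```python
import math


def nb_log_pmf(y: int, mu: float, r: float) -> float:
    """Log-probability of count y under a negative binomial with mean mu and size r."""
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def zinb_pmf(y: int, pi: float, mu: float, r: float) -> float:
    """Zero-inflated NB: a structural zero with probability pi, otherwise NB(mu, r)."""
    nb = math.exp(nb_log_pmf(y, mu, r))
    if y == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


# Hypothetical parameter values, chosen only for illustration.
pi, mu, r = 0.3, 5.0, 2.0
p_zero_nb = math.exp(nb_log_pmf(0, mu, r))    # zero probability without inflation
p_zero_zinb = zinb_pmf(0, pi, mu, r)          # zero probability with inflation
total = sum(zinb_pmf(y, pi, mu, r) for y in range(2000))  # should be close to 1
```

The extra mass at zero (`p_zero_zinb` compared with `p_zero_nb`) is what lets the model absorb the many zero counts typical of taxon abundance tables.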

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Ahmed Nabil Shaaban ◽  
Bárbara Peleteiro ◽  
Maria Rosario O. Martins

Abstract Background This study offers a comprehensive approach to analyzing the complexly distributed length of stay among HIV admissions in Portugal. Objective To illustrate statistical techniques for analysing count data using longitudinal predictors of length of stay among HIV hospitalizations in Portugal. Method A total of 26,505 discharges registered in Portuguese National Health Service (NHS) facilities between January 2009 and December 2017, classified under the Major Diagnostic Category (MDC) created for patients with HIV infection, with HIV/AIDS as a main or secondary cause of admission, were used to predict length of stay among HIV hospitalizations in Portugal. Several strategies were applied to select the best-fitting count model among the Poisson regression model, the zero-inflated Poisson model, the negative binomial regression model, and the zero-inflated negative binomial regression model. A random hospital-effects term was incorporated into the negative binomial model to examine the dependence between observations within the same hospital. A multivariable analysis was performed to assess the effect of covariates on length of stay. Results The median length of stay in our study was 11 days (interquartile range: 6–22). Statistical comparisons among the count models revealed that the random-effects negative binomial models provided the best fit to the observed data. Admissions among males and admissions associated with TB infection, pneumocystis, cytomegalovirus, candidiasis, toxoplasmosis, or mycobacterium disease exhibited a highly significant increase in length of stay. Clear trends were observed in which a higher number of diagnoses or procedures led to a significantly longer length of stay. The random-effects term included in our model, which refers to unexplained factors specific to each hospital, revealed obvious differences in quality among the hospitals included in our study.
Conclusions This study provides a comprehensive approach to address unique problems associated with the prediction of length of stay among HIV patients in Portugal.
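The model-selection step described in this abstract can be sketched, in a much reduced form, as an AIC comparison between an intercept-only Poisson model and an intercept-only negative binomial model. Everything below is an assumption for illustration: the counts in `stays` are invented, covariates and the hospital random effect are omitted, and the abstract does not say which selection criterion the authors used.

```python
import math


def poisson_loglik(ys, mu):
    """Log-likelihood of counts ys under Poisson(mu)."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in ys)


def nb_loglik(ys, mu, r):
    """Log-likelihood of counts ys under a negative binomial with mean mu and size r."""
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu))
        for y in ys
    )


def fit_nb_size(ys, mu, lo=1e-3, hi=1e3, iters=100):
    """Golden-section search over log(r) for the size that maximizes the likelihood."""
    a, b = math.log(lo), math.log(hi)
    g = (math.sqrt(5) - 1) / 2
    f = lambda t: nb_loglik(ys, mu, math.exp(t))
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc > fd:            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:                  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return math.exp((a + b) / 2)


# Hypothetical length-of-stay counts in days (not data from the study).
stays = [1, 2, 3, 4, 6, 6, 8, 11, 11, 14, 17, 22, 25, 31, 47]
mu_hat = sum(stays) / len(stays)   # the sample mean is the MLE of mu in both models
r_hat = fit_nb_size(stays, mu_hat)

aic_poisson = 2 * 1 - 2 * poisson_loglik(stays, mu_hat)
aic_nb = 2 * 2 - 2 * nb_loglik(stays, mu_hat, r_hat)
```

With counts this spread out, the negative binomial model gets the lower AIC despite its extra parameter. That is the general pattern the abstract reports for strongly dispersed length-of-stay data.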


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0254479
Author(s):  
Ta-Chien Chan ◽  
Jia-Hong Tang ◽  
Cheng-Yu Hsieh ◽  
Kevin J. Chen ◽  
Tsan-Hua Yu ◽  
...  

Background Sentinel physician surveillance in communities has played an important role in detecting early signs of epidemics. The traditional approach is to let the primary care physician voluntarily and actively report diseases to the health department on a weekly basis. However, this is labor-intensive work, and the spatio-temporal resolution of the surveillance data is not precise at all. In this study, we built a clinic-based enhanced sentinel surveillance system named “Sentinel plus”, which was designed for sentinel clinics and community hospitals to monitor 23 kinds of syndromic groups in Taipei City, Taiwan. The definitions of those syndromic groups were based on ICD-10 diagnoses from physicians. Methods Daily ICD-10 counts of two syndromic groups, ILI and EV-like syndromes, in Taipei City were extracted from Sentinel plus. A negative binomial regression model was coupled with lag structure functions to examine the short-term association between ICD counts and meteorological variables. After fitting the negative binomial regression model, residuals were further rescaled to Pearson residuals. We then monitored these daily standardized Pearson residuals for any aberrations from July 2018 to October 2019. Results The results showed that daily average temperature was significantly negatively associated with numbers of ILI syndromes. The ozone and PM2.5 concentrations were significantly positively associated with ILI syndromes. In addition, daily minimum temperature and the ozone and PM2.5 concentrations were significantly negatively associated with the EV-like syndromes. The aberrational signals detected from clinics for ILI and EV-like syndromes were earlier than the epidemic period based on outpatient surveillance defined by the Taiwan CDC. Conclusions This system not only provides warning signals to the local health department for managing the risks but also reminds medical practitioners to be vigilant toward susceptible patients. The near real-time surveillance can help decision makers evaluate their policy on a timely basis.
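The residual-monitoring step in the Methods can be sketched as follows, assuming the fitted daily means and the negative binomial size parameter are already available from the regression. The counts, the fitted means, the size `r`, and the alert threshold of 3 are all hypothetical; the abstract does not report the threshold or the exact standardization the authors used.

```python
import math


def nb_pearson_residual(y: float, mu: float, r: float) -> float:
    """(observed - expected) / sqrt(variance), with NB variance mu + mu^2 / r."""
    return (y - mu) / math.sqrt(mu + mu * mu / r)


def flag_aberrations(observed, fitted, r, threshold=3.0):
    """Indices of days whose Pearson residual exceeds the (assumed) threshold."""
    return [
        i for i, (y, mu) in enumerate(zip(observed, fitted))
        if nb_pearson_residual(y, mu, r) > threshold
    ]


# Hypothetical daily syndrome counts and model-fitted means for one week.
observed = [12, 15, 9, 14, 40, 11, 13]
fitted = [12.0, 13.0, 11.0, 12.5, 12.0, 12.0, 12.5]
r = 20.0  # assumed size (inverse dispersion) parameter from the fitted model

flags = flag_aberrations(observed, fitted, r)  # -> [4]
```

Only the fifth day (index 4) is flagged. Its count of 40 sits more than six standard deviations above the fitted mean, while the other days stay well inside the threshold.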


2021 ◽  
Author(s):  
Yesuf Abdela Mustefa ◽  
Addis Belayhun

Abstract Background: Road traffic accidents are a major public health and economic challenge, rated the eighth leading cause of death. The severity is higher in developing countries, and Ethiopia is among the most affected countries in the world. We utilized the Ethiopian Toll Roads Enterprise data to provide insights and model significant determinants of accidents involving injuries and fatalities. Besides utilizing a recent dataset, we applied the most appropriate but previously neglected statistical model. Moreover, we examined the significance of the effects of drivers’ age and gender, which have not been examined in the literature. Methods: We provided descriptive insights based on graphs from integrated traffic accident and flow datasets. For inferential analysis, we tested for the presence of over-dispersion in a total of 1824 observations of accident data recorded from September 2014 to December 2019. Finally, we modeled the effects of significant variables on the number of injuries using the negative binomial regression model. Results: We found that the number of injuries in accidents was significantly determined by type of vehicle, ownership status of the vehicle, weather condition at the time of the accident, driver-vehicle relationship, drivers’ level of education, and drivers’ age. Conclusions: Heavy trucks were more likely to cause a higher number of injuries than medium or small vehicles. Hot and windy weather conditions were associated with a higher number of injuries. The number of injuries was likely to be lower when the driver owned the vehicle, when the driver’s level of education was above secondary school, and when the driver was between 18 and 23 years old. Moreover, due attention needs to be given to road traffic rules.
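A simple way to screen count data for over-dispersion before choosing a negative binomial model is the variance-to-mean ratio, which is close to 1 when the counts are Poisson. The sketch below uses invented injury counts. The abstract does not name the test the authors applied, so this index is an assumed stand-in, not their procedure.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; roughly 1 under a Poisson model."""
    return variance(counts) / mean(counts)


# Hypothetical injuries per accident (not data from the study).
injuries = [0, 0, 0, 1, 0, 2, 0, 0, 5, 1, 0, 0, 3, 0, 9, 0, 1, 0, 0, 2]

index = dispersion_index(injuries)
# Under a Poisson model, index * (n - 1) is approximately chi-square
# distributed with n - 1 degrees of freedom, which gives a formal test.
statistic = index * (len(injuries) - 1)
overdispersed = index > 1.0
```

Here the variance is more than four times the mean. A Poisson model would understate the variability, which is the situation in which the negative binomial model is usually preferred.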


2021 ◽  
Vol 5 (1) ◽  
pp. 1-13
Author(s):  
Yopi Ariesia Ulfa ◽  
Agus M Soleh ◽  
Bagus Sartono

Based on data from the Directorate General of Disease Prevention and Control of the Ministry of Health of the Republic of Indonesia, in 2017, the number of new leprosy cases on Java Island was the highest in Indonesia compared to the number of cases on other islands. The purpose of this study is to compare the Poisson regression model with the negative binomial regression model as applied to the data on the number of new cases of leprosy and to find out which explanatory variables have a significant effect on the number of new cases of leprosy in Java. This study's results indicate that a negative binomial regression model can overcome the overdispersion encountered with the Poisson regression model. Variables that significantly affect the number of new cases of leprosy based on the results of negative binomial regression modeling are total population, percentage of children under five years who had been immunized with BCG, and percentage of the population with sustainable access to clean water.

