Interpreting Poisson Regression Models in Dental Caries Studies

2018 ◽  
Vol 52 (4) ◽  
pp. 339-345 ◽  
Author(s):  
Alex Man Him Chau ◽  
Edward Chin Man Lo ◽  
May Chun Mei Wong ◽  
Chun Hung Chu

Oral epidemiology involves studying and investigating the distribution and determinants of dental-related diseases in a specified population group to inform decisions in the management of health problems. In oral epidemiology studies, the hypothesis is typically followed by a cogent study design and data collection. Appropriate statistical analysis is essential to demonstrate the scientific association between the independent factors and the target variable. Analysis also helps to develop and build a statistical model. Poisson regression and its extensions have gained more attention in caries epidemiology than other working models such as logistic regression. This review discusses the fundamental principles and basic knowledge of Poisson regression models. It also introduces the use of a robust variance estimator with a focus on the “robust” interpretation of the model. In addition, extensions of regression models, including the zero-inflated model, hurdle model, and negative binomial model, and their interpretation in caries studies are reviewed. Principles of model fitting, including goodness-of-fit measures, are also discussed. Clinicians and researchers should pay attention to the statistical context of the models used and interpret the models to improve the oral and general health of the communities in which they live.
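As a rough illustration of the ideas in this abstract, the sketch below fits a log-linear Poisson model with one covariate by Newton-Raphson and reports both model-based and robust (sandwich) standard errors. It is not the review's own code: the function names, the covariate (daily sugary snacks) and the tooth counts are hypothetical, chosen only to make the example runnable.

```python
import math

def _weighted_moments(x, w):
    """Return sum(w), sum(w*x), sum(w*x^2): entries of sum w_i (1, x_i)(1, x_i)^T."""
    return (sum(w),
            sum(wi * xi for xi, wi in zip(x, w)),
            sum(wi * xi * xi for xi, wi in zip(x, w)))

def fit_poisson(x, y, iterations=25):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson.

    Returns (b0, b1, model_se, robust_se); each *_se is a pair of
    standard errors for (b0, b1).
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum of (y_i - mu_i) * (1, x_i)
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix A: sum of mu_i * (1, x_i)(1, x_i)^T
        a00, a01, a11 = _weighted_moments(x, mu)
        det = a00 * a11 - a01 * a01
        b0 += (a11 * u0 - a01 * u1) / det
        b1 += (a00 * u1 - a01 * u0) / det

    mu = [math.exp(b0 + b1 * xi) for xi in x]
    a00, a01, a11 = _weighted_moments(x, mu)
    det = a00 * a11 - a01 * a01
    inv = [[a11 / det, -a01 / det], [-a01 / det, a00 / det]]  # A^{-1}
    model_se = (math.sqrt(inv[0][0]), math.sqrt(inv[1][1]))

    # "Meat" of the sandwich: sum of squared residuals * (1, x_i)(1, x_i)^T
    squared_residuals = [(yi - mi) ** 2 for yi, mi in zip(y, mu)]
    m00, m01, m11 = _weighted_moments(x, squared_residuals)
    meat = [[m00, m01], [m01, m11]]
    # Robust covariance A^{-1} M A^{-1}; only its diagonal is computed here
    robust_var = [
        sum(inv[k][i] * meat[i][j] * inv[j][k] for i in range(2) for j in range(2))
        for k in range(2)
    ]
    robust_se = (math.sqrt(robust_var[0]), math.sqrt(robust_var[1]))
    return b0, b1, model_se, robust_se

# Hypothetical data: sugary snacks per day and number of decayed teeth per child
snacks = [0, 0, 1, 1, 2, 2, 3, 3]
decayed_teeth = [0, 1, 1, 3, 2, 6, 4, 9]

b0, b1, model_se, robust_se = fit_poisson(snacks, decayed_teeth)
rate_ratio = math.exp(b1)  # multiplicative change in the mean count per extra snack
print("rate ratio:", rate_ratio)
print("model-based SE of b1:", model_se[1], "robust SE of b1:", robust_se[1])
```

The exponentiated slope is read as a ratio of mean counts, and comparing the two standard errors gives a quick sense of how much the Poisson variance assumption matters for the data at hand.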

Author(s):  
Robin Flowerdew

Most statistical analysis is based on the assumption that error is normally distributed, but many data sets are based on discrete data (the number of migrants from one place to another must be a whole number). Recent developments in statistics have often involved generalising methods so that they can be properly applied to non-normal data. For example, Nelder and Wedderburn (1972) developed the theory of generalised linear modelling, where the dependent or response variable can take a variety of different probability distributions linked in one of several possible ways to a linear predictor, based on a combination of independent or explanatory variables. Several common statistical techniques are special cases of the generalised linear models, including the usual form of regression analysis, Ordinary Least Squares regression, and binomial logit modelling. Another important special case is Poisson regression, which has a Poisson-distributed dependent variable, linked logarithmically to a linear combination of independent variables. Poisson regression may be an appropriate method when the dependent variable is constrained to be a non-negative integer, usually a count of the number of events in certain categories. It assumes that each event is independent of the others, though the probability of an event may be linked to available explanatory variables. This chapter illustrates how Poisson regression can be carried out using the Stata package, proceeding to discuss various problems and issues which may arise in the use of the method. The number of migrants from area i to area j must be a non-negative integer and is likely to vary according to zone population, distance and economic variables. The availability of high-quality migration data through the WICID facility permits detailed analysis at levels from the region to the output areas. A vast range of possible explanatory variables can also be derived from the 2001 Census data. 
Model results are discussed in terms of the significant explanatory variables, the overall goodness of fit and the largest residuals. Comparisons are drawn with other analytic techniques such as OLS regression. The relationship to Wilson’s entropy-maximising methods is described, and variants on the method are explained. These include negative binomial regression and zero-censored and zero-truncated models.
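One way to write down the kind of model this abstract describes is sketched below. The symbols are assumptions made for illustration, not the chapter's notation: $Y_{ij}$ is the number of migrants from area $i$ to area $j$, $P_i$ and $P_j$ are the zone populations, and $d_{ij}$ is the distance between them. The choice of log-transformed covariates is also an assumption, and economic variables would enter as further terms.

```latex
\begin{align}
\Pr(Y_{ij} = y) &= \frac{e^{-\mu_{ij}} \, \mu_{ij}^{y}}{y!}, \qquad y = 0, 1, 2, \dots \\
\log \mu_{ij} &= \beta_0 + \beta_1 \log P_i + \beta_2 \log P_j + \beta_3 \log d_{ij}
\end{align}
```

The first line is the Poisson distribution for the count, and the second is the logarithmic link to the linear predictor.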


Author(s):  
Samuel Olorunfemi Adams ◽  
Muhammad Ardo Bamanga ◽  
Samuel Olayemi Olanrewaju ◽  
Haruna Umar Yahaya ◽  
Rafiu Olayinka Akano

COVID-19 is currently threatening countries around the world. In Nigeria, there were about 29,286 confirmed cases, 11,828 discharged patients and 654 deaths as of 6 July 2020. Against this background, this study modelled daily COVID-19 deaths in Nigeria using count regression models: Poisson Regression (PR), Negative Binomial Regression (NBR) and Generalized Poisson Regression (GPR). The study aims to fit an appropriate count regression model to the confirmed, active and critical cases of COVID-19 in Nigeria after 118 days. The data were extracted from the daily COVID-19 case updates released in the Nigeria Centre for Disease Control (NCDC) online database from 28 February 2020 to 6 July 2020. The extracted data were used in the simulation of the Poisson, Negative Binomial and Generalized Poisson Regression models, with a program written in STATA version 14, and the models were fitted to the data at a 5% significance level. The best model was selected based on the values of the -2logL, AIC and BIC selection criteria. The results revealed that Poisson regression could not capture the over-dispersion, so other count regression models, namely Negative Binomial Regression and Generalized Poisson Regression, were used in the estimation. Of the three count regression models, Generalized Poisson Regression was the best model for fitting daily cumulative confirmed, active and critical COVID-19 cases in Nigeria when overdispersion is present, because it had the lowest -2logL, AIC and BIC. It was also found that active and critical cases have a positive and significant effect on the number of COVID-19-related deaths in Nigeria.
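The selection criteria named in this abstract can be computed from a fitted model's log-likelihood, as in the minimal sketch below. The log-likelihoods, parameter counts and sample size are invented for illustration and are not the study's results; the function name is hypothetical.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Return -2logL, AIC and BIC for one fitted model (smaller is better)."""
    minus_2_log_l = -2.0 * log_likelihood
    return {
        "-2logL": minus_2_log_l,
        "AIC": minus_2_log_l + 2 * n_params,
        "BIC": minus_2_log_l + n_params * math.log(n_obs),
    }

# Made-up (log-likelihood, number of parameters) pairs, NOT values from the study
candidates = {
    "poisson": (-900.0, 3),
    "negative_binomial": (-520.0, 4),
    "generalized_poisson": (-510.0, 4),
}
n_obs = 100  # hypothetical number of daily observations

results = {name: information_criteria(ll, k, n_obs)
           for name, (ll, k) in candidates.items()}
best_by_aic = min(results, key=lambda name: results[name]["AIC"])
for name, crit in results.items():
    print(name, crit)
print("lowest AIC:", best_by_aic)
```

Both AIC and BIC penalise extra parameters, so a model with a dispersion parameter is preferred only when the gain in log-likelihood outweighs that penalty.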


2021 ◽  
Author(s):  
AGMAS SISAY ABERA ◽  
HUNACHEW KIBRET YOHANNIS

Abstract Background: The under-five mortality rate, often known by its acronym U5MR, indicates the probability of dying between birth and five years of age, expressed per 1,000 live births. Globally, 16,000 children under five still die every day. In Sub-Saharan Africa in particular, 1 child in 12 dies before his or her fifth birthday. This study aims to identify the determinants of under-five mortality among women of childbearing age in Tach-Armachiho district using count regression models. Methods: A two-stage random sampling technique (simple random sampling in the first stage and systematic random sampling in the second) was used to select women respondents. The sample survey conducted in Tach-Armachiho district considered a total of 3815 households of women aged 15 to 49 years, from which information was collected from 446 selected women through an interviewer-administered questionnaire. Results: The descriptive statistics showed that 16.6% of mothers in the district had experienced at least one under-five death. Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial regression models were applied for data analysis. These count models were compared using different statistical tests, and the zero-inflated Poisson regression model was found to be the best fit for the collected data. Results of the zero-inflated Poisson regression model showed that education of husband, source of water, mother occupation, kebele of mother, prenatal care, place of delivery, place of residence, wealth of household, average birth interval and average breast feeding were statistically significant determinants of under-five mortality.
Conclusions: Average birth interval and average breast feeding were statistically significant factors for under-five child death in both groups (the not-always-zero category and the always-zero category), whereas education of husband, source of water, place of delivery, mother occupation and wealth index of the household had a significant effect on under-five mortality in the not-always-zero group. Place of residence, kebele of mother and prenatal care had a significant effect on under-five mortality in Tach-Armachiho district in the inflated (always-zero) group.
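The two groups mentioned in this abstract correspond to the two parts of a zero-inflated Poisson (ZIP) distribution. The sketch below writes out its probability mass function, mean and variance. The parameter names `lam` (Poisson mean in the not-always-zero group) and `pi` (probability of belonging to the always-zero group) and the numeric values are assumptions for illustration, not estimates from the study.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam > 0."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def zip_pmf(y, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a structural zero,
    otherwise it is drawn from Poisson(lam)."""
    from_poisson = (1.0 - pi) * poisson_pmf(y, lam)
    return pi + from_poisson if y == 0 else from_poisson

def zip_mean(lam, pi):
    return (1.0 - pi) * lam

def zip_variance(lam, pi):
    # Exceeds the mean whenever pi > 0, i.e. zero inflation implies overdispersion
    return (1.0 - pi) * lam * (1.0 + pi * lam)

# Hypothetical parameter values
lam, pi = 2.0, 0.3
print("P(Y = 0):", zip_pmf(0, lam, pi))
print("mean:", zip_mean(lam, pi), "variance:", zip_variance(lam, pi))
```

In a ZIP regression, `lam` and `pi` each get their own linear predictor, which is why a covariate can be significant in one group and not in the other.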


1996 ◽  
Vol 24 (4) ◽  
pp. 387-442 ◽  
Author(s):  
KENNETH C. LAND ◽  
PATRICIA L. McCALL ◽  
DANIEL S. NAGIN

2017 ◽  
Vol 17 (6) ◽  
pp. 359-380 ◽  
Author(s):  
Alan Huang

Conway–Maxwell–Poisson (CMP) distributions are flexible generalizations of the Poisson distribution for modelling overdispersed or underdispersed counts. The main hindrance to their wider use in practice seems to be the inability to directly model the mean of counts, making them not compatible with nor comparable to competing count regression models, such as the log-linear Poisson, negative-binomial or generalized Poisson regression models. This note illustrates how CMP distributions can be parametrized via the mean, so that simpler and more easily interpretable mean-models can be used, such as a log-linear model. Other link functions are also available, of course. In addition to establishing attractive theoretical and asymptotic properties of the proposed model, its good finite-sample performance is exhibited through various examples and a simulation study based on real datasets. Moreover, the MATLAB routine to fit the model to data is demonstrated to be up to an order of magnitude faster than the current software to fit standard CMP models, and over two orders of magnitude faster than the recently proposed hyper-Poisson model.
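The mean parametrization described in this abstract amounts to finding, for a given mean and dispersion, the rate parameter of the standard CMP distribution that produces that mean. The sketch below does this numerically by bisection on a truncated series. It is an illustrative Python reconstruction, not the MATLAB routine the note refers to; the function names, the 500-term truncation and the restriction on `mu` are assumptions.

```python
import math

TERMS = 500  # series truncation; assumed adequate for moderate mu and nu not near 0

def cmp_mean(lam, nu):
    """Mean of the CMP distribution with P(Y = j) proportional to lam^j / (j!)^nu."""
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(TERMS)]
    top = max(log_w)                       # subtract the maximum to avoid overflow
    w = [math.exp(v - top) for v in log_w]
    return sum(j * wj for j, wj in enumerate(w)) / sum(w)

def lambda_for_mean(mu, nu, tol=1e-10):
    """Solve cmp_mean(lam, nu) = mu for lam by bisection.

    The mean is increasing in lam, so a bracketing interval always exists.
    """
    if not 0.0 < mu < 100.0 or nu <= 0.0:
        raise ValueError("this sketch assumes 0 < mu < 100 and nu > 0")
    lo, hi = 1e-12, 1.0
    while cmp_mean(hi, nu) < mu:           # expand until the target is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if cmp_mean(mid, nu) < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# nu = 1 is the Poisson case, where the rate equals the mean
print(lambda_for_mean(3.0, 1.0))
# nu < 1 gives overdispersion, nu > 1 underdispersion, at the same mean
print(lambda_for_mean(3.0, 0.5), lambda_for_mean(3.0, 2.0))
```

With the rate recovered this way, a regression can place a log-linear model directly on `mu`, which is what makes the coefficients comparable to those of Poisson or negative binomial models.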


2020 ◽  
Vol 62 (3) ◽  
pp. 340-366
Author(s):  
Takeshi Kurosawa ◽  
Francis K.C. Hui ◽  
A.H. Welsh ◽  
Kousuke Shinmura ◽  
Nobuoki Eshima

Author(s):  
Mohammad Mirjani Arjenan ◽  
Mohsen Askarshahi ◽  
Mahmud Vakili

Introduction: Despite advances in the field of cardiovascular diseases, these diseases are still considered the leading cause of mortality. In this study, some of the factors affecting deaths caused by cardiovascular diseases were investigated. Methods: This cross-sectional analytical study investigated the efficacy of Poisson and negative binomial regression models in identifying factors affecting mortality from cardiovascular diseases. The death data were extracted from the death registration system for Yazd province in 2017. Gender, age, education, occupation, location, and city of death were also extracted for each deceased person. The two regression models were then fitted to the data. Results: A total of 5,015 deaths were recorded, of which 1,642 were due to cardiovascular diseases. Cardiovascular disease mortality rates were significant using negative binomial regression in terms of the educational variables, place of residence, type of residence, and age. Death rates caused by cardiovascular diseases were not significant for age and occupational, educational, and residential variables. Conclusion: If the time of death is considered as the offset variable, the negative binomial regression model is more effective in showing the factors affecting death due to cardiovascular diseases according to the AIC and BIC criteria. If the total number of deaths is considered as the offset variable, the Poisson regression model is more efficient.
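The role of an offset variable in the two models compared here can be written compactly as below. The notation is assumed for illustration and is not taken from the article: $t_i$ is the offset quantity (for example, the total number of deaths in group $i$), $\mathbf{x}_i$ the covariates, and $\alpha \ge 0$ the negative binomial dispersion parameter.

```latex
\begin{align}
\log \mu_i &= \log t_i + \mathbf{x}_i^{\top} \boldsymbol{\beta}
  \quad\Longleftrightarrow\quad
  \frac{\mu_i}{t_i} = \exp\!\left(\mathbf{x}_i^{\top} \boldsymbol{\beta}\right) \\
\text{Poisson:} \quad \operatorname{Var}(Y_i) &= \mu_i \\
\text{Negative binomial:} \quad \operatorname{Var}(Y_i) &= \mu_i + \alpha \mu_i^{2}
\end{align}
```

Because the coefficient on $\log t_i$ is fixed at 1, the regression coefficients describe rates per unit of the offset, so changing the offset changes what is being modelled and can change which model fits better.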


2018 ◽  
Vol 22 (8) ◽  
pp. 1390-1398 ◽  
Author(s):  
Brian Pittman ◽  
Eugenia Buta ◽  
Suchitra Krishnan-Sarin ◽  
Stephanie S O’Malley ◽  
Thomas Liss ◽  
...  

Abstract Introduction This article describes different methods for analyzing counts and illustrates their use on cigarette and marijuana smoking data. Methods The Poisson, zero-inflated Poisson (ZIP), hurdle Poisson (HUP), negative binomial (NB), zero-inflated negative binomial (ZINB), and hurdle negative binomial (HUNB) regression models are considered. The different approaches are evaluated in terms of the ability to take into account zero-inflation (extra zeroes) and overdispersion (variance larger than expected) in count outcomes, with emphasis placed on model fit, interpretation, and choosing an appropriate model given the nature of the data. The illustrative data example focuses on cigarette and marijuana smoking reports from a study on smoking habits among youth e-cigarette users with gender, age, and e-cigarette use included as predictors. Results Of the 69 subjects available for analysis, 36% and 64% reported smoking no cigarettes and no marijuana, respectively, suggesting both outcomes might be zero-inflated. Both outcomes were also overdispersed with large positive skew. The ZINB and HUNB models fit the cigarette counts best. According to goodness-of-fit statistics, the NB, HUNB, and ZINB models fit the marijuana data well, but the ZINB provided better interpretation. Conclusion In the absence of zero-inflation, the NB model fits smoking data well, which is typically overdispersed. In the presence of zero-inflation, the ZINB or HUNB model is recommended to account for additional heterogeneity. In addition to model fit and interpretability, choosing between a zero-inflated or hurdle model should ultimately depend on the assumptions regarding the zeros, study design, and the research question being asked. Implications Count outcomes are frequent in tobacco research and often have many zeros and exhibit large variance and skew. 
Analyzing such data with methods that require a normally distributed outcome is inappropriate and will likely produce spurious results. This study compares and contrasts appropriate methods for analyzing count data, specifically those with an over-abundance of zeros, and illustrates their use on cigarette and marijuana smoking data. Recommendations are provided.
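To make the distinction between hurdle and zero-inflated models concrete, the sketch below writes out a hurdle Poisson distribution, in which every zero comes from the binary "hurdle" part and positive counts follow a zero-truncated Poisson. The function names and the value of `lam` are assumptions for illustration; the zero probability of 0.36 echoes the share of non-smokers of cigarettes reported in the abstract but is not a fitted estimate.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam > 0."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def hurdle_poisson_pmf(y, lam, p_zero):
    """Hurdle Poisson: P(Y = 0) = p_zero; positive counts follow a
    zero-truncated Poisson(lam) scaled by (1 - p_zero)."""
    if y == 0:
        return p_zero
    truncated = poisson_pmf(y, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p_zero) * truncated

def hurdle_poisson_mean(lam, p_zero):
    return (1.0 - p_zero) * lam / (1.0 - math.exp(-lam))

# Hypothetical values: 36% zeros, positive counts governed by lam = 5
p_zero, lam = 0.36, 5.0
print("P(Y = 0):", hurdle_poisson_pmf(0, lam, p_zero))
print("P(Y = 3):", hurdle_poisson_pmf(3, lam, p_zero))
print("mean:", hurdle_poisson_mean(lam, p_zero))
```

In a zero-inflated model, by contrast, zeros arise both from a structural-zero group and from the count distribution itself, which is why the choice between the two depends on what the zeros are assumed to mean.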

