A Bayesian Hurdle Quantile Regression Model for Citation Analysis with Mass Points at Lower Values

2021 ◽  
pp. 1-29
Author(s):  
Marzieh Shahmandi ◽  
Paul Wilson ◽  
Mike Thelwall

Abstract Quantile regression presents a complete picture of the effects on the location, scale, and shape of the dependent variable at all points, not just the mean. We focus on two challenges for citation count analysis by quantile regression: discontinuity and substantial mass points at lower counts. A Bayesian hurdle quantile regression model for count data with a substantial mass point at zero was proposed by King and Song (2019). It uses quantile regression for modeling the nonzero data and logistic regression for modeling the probability of zeros versus nonzeros. We show that substantial mass points for low citation counts will almost certainly also affect parameter estimation in the quantile regression part of the model, similar to a mass point at zero. We update the King and Song model by shifting the hurdle point past the main mass points. This model delivers more accurate quantile regression for moderately to highly cited articles, especially at quantiles corresponding to values just beyond the mass points, and enables estimates of the extent to which factors influence the chances that an article will be low cited. To illustrate the potential of this method, it is applied to simulated citation counts and data from Scopus. Peer Review https://publons.com/publon/10.1162/qss_a_00147
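The two-part structure described in this abstract can be sketched in a few lines of Python. This is only an illustration of the split at a shifted hurdle, not the Bayesian estimation used in the paper: the function names, the hurdle value of 1, and the citation counts are all made up for the example. In a full model the binary indicator would feed a logistic regression and the counts above the hurdle would feed a quantile regression.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss of one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def split_at_hurdle(counts, hurdle):
    """Return (indicator of counts at or below the hurdle, counts above it)."""
    is_low = [1 if c <= hurdle else 0 for c in counts]
    above = [c for c in counts if c > hurdle]
    return is_low, above


def empirical_quantile(values, tau):
    """Observed value that minimises the total check loss at level tau."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda q: sum(check_loss(v - q, tau) for v in values))


# Hypothetical citation counts with mass points at 0 and 1.
counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 13]

# Assumed hurdle placed just past the mass points.
is_low, above = split_at_hurdle(counts, hurdle=1)

# Binary part: share of articles at or below the hurdle.
p_low = sum(is_low) / len(is_low)

# Quantile part: median of the counts above the hurdle.
median_above = empirical_quantile(above, 0.5)
```

With covariates, `p_low` and `median_above` would each become a fitted function of the article's characteristics rather than a single number.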

Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2768
Author(s):  
Luis Sánchez ◽  
Víctor Leiva ◽  
Helton Saulo ◽  
Carolina Marchant ◽  
José M. Sarabia

Standard regression models focus on the mean response based on covariates. Quantile regression describes a quantile of the response conditional on the values of the covariates. Quantile regression is even more relevant when the response follows an asymmetrical distribution, because the mean is not a good measure of centrality for summarizing asymmetrically distributed data. In such a scenario, the median is a better measure of central tendency. Quantile regression, which includes median modeling, is a better alternative for describing asymmetrically distributed data. The Weibull distribution is asymmetrical, has positive support, and has been extensively studied. In this work, we propose a new approach to quantile regression based on the Weibull distribution parameterized by its quantiles. We estimate the model parameters using the maximum likelihood method, discuss their asymptotic properties, and develop hypothesis tests. Two types of residuals are presented to evaluate the model's fit to the data. We conduct Monte Carlo simulations to assess the performance of the maximum likelihood estimators and residuals. Local influence techniques are also derived to analyze the impact of perturbations on the estimated parameters, allowing us to detect potentially influential observations. We apply the obtained results to a real-world data set to show how helpful this type of quantile regression model is.
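As a sketch of what parameterizing the Weibull distribution by a quantile can look like, the derivation below rewrites the standard two-parameter Weibull distribution in terms of its q-th quantile. The symbols (shape \alpha, scale \sigma, quantile Q, coefficients \beta) and the log link in the last line are notation assumed here for illustration; the paper's own parameterization and link function may differ.

```latex
\begin{align}
F(y;\alpha,\sigma) &= 1 - \exp\{-(y/\sigma)^{\alpha}\}, \qquad y > 0,\ \alpha > 0,\ \sigma > 0 \\
Q &= \sigma\,[-\log(1-q)]^{1/\alpha}
  \;\Longrightarrow\;
  \sigma = Q\,[-\log(1-q)]^{-1/\alpha}, \qquad 0 < q < 1 \\
F(y;\alpha,Q) &= 1 - \exp\{\log(1-q)\,(y/Q)^{\alpha}\} = 1 - (1-q)^{(y/Q)^{\alpha}} \\
\log Q_i &= x_i^{\top}\beta \qquad \text{(assumed log link for observation } i\text{)}
\end{align}
```

Setting y = Q in the third line gives F = q, which confirms that Q is the q-th quantile under this parameterization.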


2017 ◽  
Vol 2 (1) ◽  
pp. 89-104 ◽  
Author(s):  
Guoqiang Liang ◽  
Haiyan Hou ◽  
Zhigang Hu ◽  
Fu Huang ◽  
Yajie Wang ◽  
...  

Abstract Purpose Research fronts build on recent work, but using times cited as a traditional indicator to detect research fronts will inevitably result in a certain time lag. This study attempts to explore the effects of usage count as a new indicator to detect research fronts in shortening the time lag of classic indicators in research fronts detection. Design/methodology/approach An exploratory study was conducted where the new indicator “usage count” was compared to the traditional citation count, “times cited,” in detecting research fronts of the regenerative medicine domain. An initial topic search of the term “regenerative medicine” returned 10,553 records published between 2000 and 2015 in the Web of Science (WoS). We first ranked these records with usage count and times cited, respectively, and selected the top 2,000 records for each. We then performed a co-citation analysis in order to obtain the citing papers of the co-citation clusters as the research fronts. Finally, we compared the average publication year of the citing papers as well as the mean cited year of the co-citation clusters. Findings The citing articles detected by usage count tend to be published more recently compared with times cited within the same research front. Moreover, research fronts detected by usage count tend to be within the last two years, which presents a higher immediacy and real-time feature compared to times cited. There is approximately a three-year time span among the mean cited years (known as “intellectual base”) of all clusters generated by usage count and this figure is about four years in the network of times cited. In comparison to times cited, usage count is a dynamic and instant indicator. Research limitations We are trying to find the cutting-edge research fronts, but those generated based on co-citations may refer to the hot research fronts. The usage count of older highly cited papers was not taken into consideration, because the usage count indicator released by WoS only reflects usage logs after February 2013. Practical implications The article provides a new perspective on using usage count as a new indicator to detect research fronts. Originality/value Usage count can greatly shorten the time lag in research fronts detection, which would be a promising complementary indicator in detection of the latest research fronts.


2018 ◽  
Vol 22 (Suppl. 1) ◽  
pp. 97-107 ◽  
Author(s):  
Bahadır Yuzbasi ◽  
Yasin Asar ◽  
Samil Sik ◽  
Ahmet Demiralp

Respiratory mortality may result from air pollution, which can be measured by the following variables: temperature, relative humidity, carbon monoxide, sulfur dioxide, nitrogen dioxide, hydrocarbons, ozone, and particulates. The usual approach is to fit a model by ordinary least squares regression, which relies on the Gauss-Markov assumptions, under which the error term of the regression model is a white noise process. However, in many applications, and in this example in particular, these assumptions are not satisfied. Therefore, in this study, a quantile regression approach is used to model respiratory mortality using the explanatory variables listed above. Moreover, improved estimation techniques such as preliminary testing and shrinkage strategies are also obtained when the errors are autoregressive. A Monte Carlo simulation experiment, including quantile penalty estimators such as lasso, ridge, and elastic net, is designed to evaluate the performance of the proposed techniques. Finally, the theoretical risks of the listed estimators are given.
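For readers unfamiliar with penalized quantile estimators, the sketch below writes out one common form of the objective they minimize: the mean check loss plus an elastic-net penalty. The function names, the mixing convention (alpha = 1 for lasso, alpha = 0 for ridge), and the toy data are assumptions made for illustration; the paper's exact penalty scaling and its treatment of autoregressive errors are not reproduced here.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss of one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def penalized_objective(X, y, beta, tau, lam, alpha):
    """Mean check loss plus lam * (alpha * L1 + (1 - alpha) * squared L2).

    Assumed convention: alpha=1 gives the lasso penalty, alpha=0 the ridge
    penalty, and values in between an elastic net. No intercept is handled
    separately in this sketch.
    """
    n = len(y)
    fit = sum(
        check_loss(yi - sum(b * x for b, x in zip(beta, xi)), tau)
        for xi, yi in zip(X, y)
    ) / n
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return fit + lam * (alpha * l1 + (1.0 - alpha) * l2)


# Toy data (made up): one covariate, two observations.
X = [[1.0], [2.0]]
y = [1.0, 3.0]
beta = [2.0]

lasso_value = penalized_objective(X, y, beta, tau=0.5, lam=0.1, alpha=1.0)
ridge_value = penalized_objective(X, y, beta, tau=0.5, lam=0.1, alpha=0.0)
```

An estimator would minimize this objective over `beta`; the sketch only evaluates it at a fixed coefficient vector.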


2015 ◽  
Vol 32 (3) ◽  
pp. 686-713 ◽  
Author(s):  
Walter Oberhofer ◽  
Harry Haupt

This paper studies the asymptotic properties of the nonlinear quantile regression model under general assumptions on the error process, which is allowed to be heterogeneous and mixing. We derive the consistency and asymptotic normality of regression quantiles under mild assumptions. First-order asymptotic theory is completed by a discussion of consistent covariance estimation.

