Double-smoothing in Kernel Hazard Rate Estimation

2008 ◽  
Vol 47 (02) ◽  
pp. 167-173 ◽  
Author(s):  
A. Pfahlberg ◽  
O. Gefeller ◽  
R. Weißbach

Summary Objectives: In oncological studies, the hazard rate can be used to differentiate subgroups of the study population according to their patterns of survival risk over time. Nonparametric curve estimation has been suggested as an exploratory means of revealing such patterns. The decision about the type of smoothing parameter is critical for performance in practice. In this paper, we study data-adaptive smoothing. Methods: A decade ago, the nearest-neighbor bandwidth was introduced for censored data in survival analysis. It is specified by one parameter, namely the number of nearest neighbors. Bandwidth selection in this setting has rarely been investigated, although the heuristical advantages over the frequently-studied fixed bandwidth are quite obvious. The asymptotical relationship between the fixed and the nearest-neighbor bandwidth can be used to generate novel approaches. Results: We develop a new selection algorithm termed double-smoothing for the nearest-neighbor bandwidth in hazard rate estimation. Our approach uses a finite sample approximation of the asymptotical relationship between the fixed and nearest-neighbor bandwidth. By so doing, we identify the nearest-neighbor bandwidth as an additional smoothing step and achieve further data-adaption after fixed bandwidth smoothing. We illustrate the application of the new algorithm in a clinical study and compare the outcome to the traditional fixed bandwidth result, thus demonstrating the practical performance of the technique. Conclusion: The double-smoothing approach enlarges the methodological repertoire for selecting smoothing parameters in nonparametric hazard rate estimation. The slight increase in computational effort is rewarded with a substantial amount of estimation stability, thus demonstrating the benefit of the technique for biostatistical applications.

Proceedings ◽  
2018 ◽  
Vol 2 (18) ◽  
pp. 1164
Author(s):  
Inés Barbeito ◽  
Ricardo Cao

Bootstrap methods are used for bandwidth selection in: (1) nonparametric kernel density estimation with dependent data (smoothed stationary bootstrap and smoothed moving blocks bootstrap), and (2) nonparametric kernel hazard rate estimation (smoothed bootstrap). In these contexts, four new bandwidth parameter selectors are proposed based on closed bootstrap expressions of the MISE of the kernel density estimator (case 1) and two approximations of the kernel hazard rate estimation (case 2). These expressions turn out to be very useful since Monte Carlo approximation is no longer needed. Finally, these smoothing parameter selectors are empirically compared with the already existing ones via a simulation study.


2021 ◽  
Vol 15 (6) ◽  
pp. 1812-1819
Author(s):  
Azita Yazdani ◽  
Ramin Ravangard ◽  
Roxana Sharifian

The new coronavirus has been spreading since the beginning of 2020 and many efforts have been made to develop vaccines to help patients recover. It is now clear that the world needs a rapid solution to curb the spread of COVID-19 worldwide with non-clinical approaches such as data mining, enhanced intelligence, and other artificial intelligence techniques. These approaches can be effective in reducing the burden on the health care system to provide the best possible way to diagnose and predict the COVID-19 epidemic. In this study, data mining models for early detection of Covid-19 in patients were developed using the epidemiological dataset of patients and individuals suspected of having Covid-19 in Iran. C4.5, support vector machine, Naive Bayes, logistic regression, Random Forest, and k-nearest neighbor algorithm were used directly on the dataset using Rapid miner to develop the models. By receiving clinical signs, this model diagnosis the risk of contracting the COVID-19 virus. Examination of the models in this study has shown that the support vector machine with 93.41% accuracy is more efficient in the diagnosis of patients with COVID-19 pandemic, which is the best model among other developed models. Keywords: COVID-19, Data mining, Machine Learning, Artificial Intelligence, Classification


2020 ◽  
Vol 52 (8) ◽  
pp. 890-903
Author(s):  
Han Ye ◽  
Lawrence D. Brown ◽  
Haipeng Shen

Sign in / Sign up

Export Citation Format

Share Document