AEC Classifier: A Tree-Based Classifier with Error Control for Medical Disease Diagnosis and Other Applications

2021 ◽  
Vol 15 (02) ◽  
pp. 241-262
Author(s):  
Wasif Bokhari ◽  
Ajay Bansal

In medical disease diagnosis, the cost of a false negative can greatly outweigh the cost of a false positive: the former could cost a life, whereas the latter may only cause medical costs and stress to the patient. This asymmetry highlights the need for asymmetric error control in binary classification applications. In this domain, traditional machine learning classifiers may not be ideal, as they do not provide a way to keep the number of false negatives below a certain threshold. This paper proposes a novel tree-based binary classification algorithm that can control the number of false negatives with a mathematical guarantee, based on the Neyman–Pearson (NP) lemma. The classifier is evaluated on data obtained from different heart studies, and it predicts the risk of cardiac disease not only with comparable accuracy and AUC-ROC score but also with full control over the number of false negatives. The methodology used to construct this classifier can be extended to many more use cases, in medical disease diagnosis and beyond, as shown by analysis on diverse datasets.
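The idea of bounding false negatives can be illustrated with a much simpler device than the tree-based algorithm the abstract describes. The sketch below is not the authors' method: it only picks a score threshold on held-out positive cases so that the empirical false negative rate stays at or below a chosen level. The function names and the parameter `alpha` are hypothetical, and the bound holds only on the sample used, without the NP-style guarantee the paper claims.

```python
import math


def threshold_for_fnr(positive_scores, alpha):
    """Pick a threshold so that at most a fraction `alpha` of the
    held-out positive cases score strictly below it.

    Illustrative assumption: higher scores mean "more likely diseased",
    and a case is predicted positive when score >= threshold.
    """
    if not positive_scores:
        raise ValueError("need at least one positive score")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(positive_scores)
    # Number of positives we may afford to miss on this sample.
    allowed_misses = math.floor(alpha * len(ordered))
    # Only the `allowed_misses` lowest scores can fall below this value.
    return ordered[allowed_misses]


def predict_positive(score, threshold):
    """Return True when the case is flagged as positive."""
    return score >= threshold
```

Lowering `alpha` pushes the threshold down, which trades more false positives for fewer missed cases, the asymmetry the abstract argues for.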

Author(s):  
Martin Pokorný

In economic classification tasks, accuracy maximization is often used to evaluate classifier performance. Accuracy maximization (or error rate minimization) rests on the assumption of equal false positive and false negative error costs. Furthermore, accuracy cannot express true classifier performance under a skewed class distribution. Because of these limitations, the use of accuracy on real tasks is questionable. In a real binary classification task, the difference between the costs of false positive and false negative errors is usually critical. To overcome this issue, the Receiver Operating Characteristic (ROC) method can be used in combination with decision-analytic principles. One essential advantage of this method is that classifier performance can be visualized by means of a ROC graph. This paper presents concrete examples of binary classification in which the inadequacy of accuracy as an evaluation metric is shown, and the ROC method is applied to the same examples. From the set of possible classification models, the probabilistic classifier with continuous output is considered. Two main questions are addressed. The first is the selection of the best classifier from a set of possible classifiers. For example, the accuracy metric rates two classifiers almost equivalently (87.7 % and 89.3 %), whereas decision analysis (via cost minimization) or ROC analysis reveals different performance under target conditions of unequal false positive and false negative error costs. The second is the setting of an optimal decision threshold at the classifier’s output. For example, accuracy maximization finds the optimal threshold at 0.597, but cost minimization or ROC analysis, which respect the higher cost of false negatives, find a substantially lower optimal threshold (0.477).
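The threshold-setting question can be made concrete with a small cost-minimization sketch. It is an illustration under stated assumptions, not the paper's procedure: the function names, the cost parameters `cost_fp` and `cost_fn`, and the rule "predict positive when score >= threshold" are all hypothetical choices.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total misclassification cost at a given decision threshold.

    labels use 1 for the positive class and 0 for the negative class.
    """
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return cost_fp * false_pos + cost_fn * false_neg


def best_threshold(scores, labels, cost_fp=1.0, cost_fn=1.0):
    """Return the candidate threshold with the lowest total cost.

    Candidates are the observed scores plus +inf ("predict nothing
    positive"); ties are resolved in favour of the lowest threshold.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )
```

With equal costs this reduces to minimizing the error count, which is accuracy maximization; raising `cost_fn` relative to `cost_fp` can only keep the selected threshold where it is or move it lower, the same direction as the shift from 0.597 to 0.477 reported above.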


Methodology ◽  
2019 ◽  
Vol 15 (3) ◽  
pp. 97-105
Author(s):  
Rodrigo Ferrer ◽  
Antonio Pardo

Abstract. In a recent paper, Ferrer and Pardo (2014) tested several distribution-based methods designed to assess when test scores obtained before and after an intervention reflect a statistically reliable change. However, we still do not know how these methods perform from the point of view of false negatives. For this purpose, we have simulated change scenarios (different effect sizes in a pre-post-test design) with distributions of different shapes and with different sample sizes. For each simulated scenario, we generated 1,000 samples. In each sample, we recorded the false-negative rate of the five distribution-based methods with the best performance from the point of view of the false positives. Our results have revealed unacceptable rates of false negatives even with effects of very large size, starting from 31.8% in an optimistic scenario (effect size of 2.0 and a normal distribution) to 99.9% in the worst scenario (effect size of 0.2 and a highly skewed distribution). Therefore, our results suggest that the widely used distribution-based methods must be applied with caution in a clinical context, because they need huge effect sizes to detect a true change. However, we made some considerations regarding the effect size and the cut-off points commonly used which allow us to be more precise in our estimates.


2020 ◽  
Vol 14 ◽  
Author(s):  
Lahari Tipirneni ◽  
Rizwan Patan

Abstract: Millions of deaths all over the world are caused by breast cancer every year. It has become the most common type of cancer in women. Early detection helps achieve a better prognosis and increases the chance of survival. Automating the classification using Computer-Aided Diagnosis (CAD) systems can make the diagnosis less prone to errors. Multi-class classification and binary classification of breast cancer is a challenging problem. Convolutional neural network architectures extract specific feature descriptors from images, which cannot represent different types of breast cancer. This leads to false positives in classification, which is undesirable in disease diagnosis. The current paper presents an ensemble convolutional neural network for multi-class classification and binary classification of breast cancer. The feature descriptors from each network are combined to produce the final classification. In this paper, histopathological images are taken from the publicly available BreakHis dataset and classified into 8 classes. The proposed ensemble model can perform better than the methods proposed in the literature. The results showed that the proposed model could be a viable approach for breast cancer classification.


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Mandella King ◽  
Alexander E. George ◽  
Pau Cisteró ◽  
Christine K. Tarr-Attia ◽  
Beatriz Arregui ◽  
...  

Abstract Background Malaria diagnosis in many malaria-endemic countries relies mainly on the use of rapid diagnostic tests (RDTs). The majority of commercial RDTs used in Africa detect the Plasmodium falciparum histidine-rich protein 2 (PfHRP2). pfhrp2/3 gene deletions can therefore lead to false-negative RDT results. This study aimed to evaluate the frequency of PCR-confirmed, false-negative P. falciparum RDT results in Monrovia, Liberia. Methods PfHRP2-based RDT (Paracheck Pf®) and microscopy results from 1038 individuals with fever or history of fever (n = 951) and pregnant women at first antenatal care (ANC) visit (n = 87) enrolled in the Saint Joseph’s Catholic Hospital (Monrovia) from March to July 2019 were used to assess the frequency of false-negative RDT results. True–false negatives were confirmed by detecting the presence of P. falciparum DNA by quantitative PCR in samples from individuals with discrepant RDT and microscopy results. Samples that were positive by 18S rRNA qPCR but negative by PfHRP2-RDT were subjected to multiplex qPCR assay for detection of pfhrp2 and pfhrp3. Results One-hundred and eighty-six (19.6%) and 200 (21.0%) of the 951 febrile participants had a P. falciparum-positive result by RDT and microscopy, respectively. Positivity rate increased with age and the reporting of joint pain, chills and shivers, vomiting and weakness, and decreased with the presence of coughs and nausea. The positivity rate at first ANC visit was 5.7% (n = 5) and 8% (n = 7) by RDT and microscopy, respectively. Out of 207 Plasmodium infections detected by microscopy, 22 (11%) were negative by RDT. qPCR confirmed absence of P. falciparum DNA in the 16 RDT-negative but microscopy-positive samples which were available for molecular testing. Among the 14 samples that were positive by qPCR but negative by RDT and microscopy, 3 only amplified pfldh, and among these 3 all were positive for pfhrp2 and pfhrp3. 
Conclusion There is no qPCR-confirmed evidence of false-negative RDT results due to pfhrp2/pfhrp3 deletions in this study conducted in Monrovia (Liberia). This indicates that these deletions are not expected to affect the performance of PfHRP2-based RDTs for the diagnosis of malaria in Liberia. Nevertheless, active surveillance for the emergence of PfHRP2 deletions is required.


2021 ◽  
Vol 7 (2) ◽  
pp. 16
Author(s):  
Pedro Furtado

Image structures are segmented automatically using deep learning (DL) for analysis and processing. The three most popular base loss functions are cross entropy (crossE), intersect-over-the-union (IoU), and dice. Which should be used? Is it useful to consider simple variations, such as modifying formula coefficients? How do characteristics of different image structures influence scores? Taking three different medical image segmentation problems (segmentation of organs in magnetic resonance images (MRI), liver in computed tomography images (CT) and diabetic retinopathy lesions in eye fundus images (EFI)), we quantify loss functions and variations, as well as segmentation scores of different targets. We first describe the limitations of metrics, since loss is a metric, then we describe and test alternatives. Experimentally, we observed that DeeplabV3 outperforms UNet and fully convolutional network (FCN) in all datasets. Dice scored 1 to 6 percentage points (pp) higher than cross entropy over all datasets, and IoU improved scores by 0 to 3 pp. Varying formula coefficients improved scores, but the best choices depend on the dataset: compared to crossE, different false positive vs. false negative weights improved MRI by 12 pp, and assigning zero weight to background improved EFI by 6 pp. Multiclass segmentation scored higher than n-uniclass segmentation in MRI by 8 pp. EFI lesions score low compared to more constant structures (e.g., optic disk or even organs), but loss modifications improve those scores significantly, by 6 to 9 pp. Our conclusions are that dice is best, and that it is worth assigning zero weight to the background class and testing different weights on false positives and false negatives.
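The abstract does not give its formulas, so the following is only a sketch of the standard overlap-based losses it names, written for a single binary mask flattened to a list. The weighted variant uses the Tversky index as a stand-in for "different false positive vs. false negative weights"; that choice, the function names, and the parameters `w_fp`, `w_fn` and `eps` are assumptions made here, not details from the paper.

```python
def soft_counts(pred, target):
    """Soft true positive, false positive and false negative counts.

    pred holds probabilities in [0, 1]; target holds 0/1 ground truth.
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return tp, fp, fn


def dice_loss(pred, target, eps=1e-7):
    """1 - Dice coefficient, where Dice = 2TP / (2TP + FP + FN)."""
    tp, fp, fn = soft_counts(pred, target)
    return 1.0 - (2 * tp + eps) / (2 * tp + fp + fn + eps)


def iou_loss(pred, target, eps=1e-7):
    """1 - IoU, where IoU = TP / (TP + FP + FN)."""
    tp, fp, fn = soft_counts(pred, target)
    return 1.0 - (tp + eps) / (tp + fp + fn + eps)


def tversky_loss(pred, target, w_fp=0.5, w_fn=0.5, eps=1e-7):
    """1 - Tversky index, TP / (TP + w_fp * FP + w_fn * FN).

    w_fp = w_fn = 0.5 matches Dice (up to eps); w_fp = w_fn = 1 matches
    IoU. Setting w_fn > w_fp penalizes missed pixels more heavily.
    """
    tp, fp, fn = soft_counts(pred, target)
    return 1.0 - (tp + eps) / (tp + w_fp * fp + w_fn * fn + eps)
```

Assigning zero weight to the background class, as the abstract recommends, would correspond to computing such a loss per foreground class only and leaving the background channel out of the average.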


2021 ◽  
pp. emermed-2020-209607
Author(s):  
Stephanie P Jones ◽  
Janet E Bray ◽  
Josephine ME Gibson ◽  
Graham McClelland ◽  
Colette Miller ◽  
...  

Background Around 25% of patients who had a stroke do not present with typical ‘face, arm, speech’ symptoms at onset, and are challenging for emergency medical services (EMS) to identify. The aim of this systematic review was to identify the characteristics of acute stroke presentations associated with inaccurate EMS identification (false negatives). Method We performed a systematic search of MEDLINE, EMBASE, CINAHL and PubMed from 1995 to August 2020 using key terms: stroke, EMS, paramedics, identification and assessment. Studies included: patients who had a stroke or patient records; ≥18 years; any stroke type; prehospital assessment undertaken by health professionals including paramedics or technicians; data reported on prehospital diagnostic accuracy and/or presenting symptoms. Data were extracted and study quality assessed by two researchers using the Quality Assessment of Diagnostic Accuracy Studies V.2 tool. Results Of 845 studies initially identified, 21 observational studies met the inclusion criteria. Of the 6934 stroke and Transient Ischaemic Attack patients included, there were 1774 (26%) false negative patients (range from 4 (2%) to 247 (52%)). Commonly documented symptoms in false negative cases were speech problems (n=107; 13%–28%), nausea/vomiting (n=94; 8%–38%), dizziness (n=86; 23%–27%), changes in mental status (n=51; 8%–25%) and visual disturbance/impairment (n=43; 13%–28%). Conclusion Speech problems and posterior circulation symptoms were the most commonly documented symptoms among stroke presentations that were not correctly identified by EMS (false negatives). However, the addition of further symptoms to stroke screening tools requires evaluation of subsequent sensitivity and specificity, training needs and possible overuse of high priority resources.


2021 ◽  
Vol 10 (7) ◽  
pp. 205846012110306
Author(s):  
Mine B Lange ◽  
Lars J Petersen ◽  
Michael B Nielsen ◽  
Helle D Zacho

Background The presence of malignant cells in bone biopsies is considered the gold standard for verifying the occurrence of cancer, whereas a negative bone biopsy can represent a false negative, with a risk of increasing patient morbidity and mortality and creating misleading conclusions in cancer research. However, little literature documents the validity of negative bone biopsy as an exclusion criterion for the presence of skeletal malignancies. Purpose To investigate the validity of a negative bone biopsy in bone lesions suspicious for malignancy. Material and Method A retrospective cohort of 215 consecutive targeted non-malignant skeletal biopsies from 207 patients (43% women, 57% men, median age 64, and range 94) representing suspicious focal bone lesions, collected from January 1, 2011, to July 31, 2013, was followed over a 2-year period to examine any additional biopsy, imaging, and clinical follow-up information to categorize the original biopsy as truly benign, malignant, or equivocal. Standard deviations and 95% confidence intervals were calculated. Results 210 of 215 biopsies (98%; 95% CI 0.94–0.99) proved to be truly benign 2 years after initial biopsy. Two biopsies were false negatives (1%; 95% CI 0.001–0.03), and three were equivocal (lack of imaging description). Conclusion Our study documents negative bone biopsy as a valid criterion for the absence of bone metastasis. Since only 28% had a confirmed diagnosis of prior cancer and not all patients received adequately sensitive imaging, our results might not be applicable to all cancer patients with suspicious bone lesions.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Rupam Bhattacharyya ◽  
Ritoban Kundu ◽  
Ritwik Bhaduri ◽  
Debashree Ray ◽  
Lauren J. Beesley ◽  
...  

Abstract Susceptible-Exposed-Infected-Removed (SEIR)-type epidemiologic models, modeling unascertained infections latently, can predict unreported cases and deaths assuming perfect testing. We apply a method we developed to account for the high false negative rates of diagnostic RT-PCR tests for detecting an active SARS-CoV-2 infection in a classic SEIR model. The number of unascertained cases and false negatives being unobservable in a real study, population-based serosurveys can help validate model projections. Applying our method to training data from Delhi, India, during March 15–June 30, 2020, we estimate the underreporting factor for cases at 34–53 (deaths: 8–13) on July 10, 2020, largely consistent with the findings of the first round of serosurveys for Delhi (done during June 27–July 10, 2020) with an estimated 22.86% IgG antibody prevalence, yielding estimated underreporting factors of 30–42 for cases. Together, these imply approximately 96–98% cases in Delhi remained unreported (July 10, 2020). Updated calculations using training data during March 15–December 31, 2020 yield estimated underreporting factor for cases at 13–22 (deaths: 3–7) on January 23, 2021, which are again consistent with the latest (fifth) round of serosurveys for Delhi (done during January 15–23, 2021) with an estimated 56.13% IgG antibody prevalence, yielding an estimated range for the underreporting factor for cases at 17–21. Together, these updated estimates imply approximately 92–96% cases in Delhi remained unreported (January 23, 2021). Such model-based estimates, updated with latest data, provide a viable alternative to repeated resource-intensive serosurveys for tracking unreported cases and deaths and gauging the true extent of the pandemic.
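The step from an underreporting factor to a percentage of unreported cases can be checked with one line of arithmetic. The sketch below assumes the usual definition, which the abstract does not spell out: the underreporting factor is total infections divided by reported cases, so the unreported share is 1 - 1/factor. The function name is hypothetical.

```python
def unreported_fraction(underreporting_factor):
    """Share of all infections that went unreported.

    Assumes factor = total infections / reported cases, so the
    reported share is 1 / factor and the rest is unreported.
    """
    if underreporting_factor < 1:
        raise ValueError("an underreporting factor below 1 is not meaningful here")
    return 1.0 - 1.0 / underreporting_factor
```

Under that assumption, factors of 30 and 53 give roughly 96.7% and 98.1% unreported, in line with the quoted 96–98%, and factors of 13 and 22 give roughly 92.3% and 95.5%, in line with the quoted 92–96%.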


2019 ◽  
Vol 152 (Supplement_1) ◽  
pp. S35-S36
Author(s):  
Hadrian Mendoza ◽  
Christopher Tormey ◽  
Alexa Siddon

Abstract In the evaluation of bone marrow (BM) and peripheral blood (PB) for hematologic malignancy, positive immunoglobulin heavy chain (IG) or T-cell receptor (TCR) gene rearrangement results may be detected despite unrevealing results from morphologic, flow cytometric, immunohistochemical (IHC), and/or cytogenetic studies. The significance of positive rearrangement studies in the context of otherwise normal ancillary findings is unknown, and as such, we hypothesized that gene rearrangement studies may be predictive of an emerging B- or T-cell clone in the absence of other abnormal laboratory tests. Data from all patients who underwent IG or TCR gene rearrangement testing at the authors’ affiliated VA hospital between January 1, 2013, and July 6, 2018, were extracted from the electronic medical record. Date of testing; specimen source; and morphologic, flow cytometric, IHC, and cytogenetic characterization of the tissue source were recorded from pathology reports. Gene rearrangement results were categorized as true positive, false positive, false negative, or true negative. Lastly, patient records were reviewed for subsequent diagnosis of hematologic malignancy in patients with positive gene rearrangement results with negative ancillary testing. A total of 136 patients, who had 203 gene rearrangement studies (50 PB and 153 BM), were analyzed. In TCR studies, there were 2 false positives and 1 false negative in 47 PB assays, as well as 7 false positives and 1 false negative in 54 BM assays. Regarding IG studies, 3 false positives and 12 false negatives in 99 BM studies were identified. Sensitivity and specificity, respectively, were calculated for PB TCR studies (94% and 93%), BM IG studies (71% and 95%), and BM TCR studies (92% and 83%). Analysis of PB IG gene rearrangement studies was not performed due to the small number of tests (3; all true negative). 
None of the 12 patients with false-positive IG/TCR gene rearrangement studies later developed a lymphoproliferative disorder, although 2 patients were later diagnosed with acute myeloid leukemia. Of the 14 false negatives, 10 (71%) were related to a diagnosis of plasma cell neoplasms. Results from the present study suggest that positive IG/TCR gene rearrangement studies are not predictive of lymphoproliferative disorders in the context of otherwise negative BM or PB findings. As such, when faced with equivocal pathology reports, clinicians can be practically advised that isolated positive IG/TCR gene rearrangement results may not indicate the need for closer surveillance.
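The sensitivity and specificity figures quoted above follow from the four-way categorization of results. The abstract reports only the false positive and false negative counts per assay group, so the helper below is a generic sketch with hypothetical function names; the true positive and true negative counts would have to come from the full study data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly positive cases that the test detects."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return true_pos / total


def specificity(true_neg, false_pos):
    """Fraction of truly negative cases that the test clears."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative cases to evaluate")
    return true_neg / total
```

Because false negatives appear only in the sensitivity denominator, an assay group with many false negatives relative to its true positives shows low sensitivity even when its specificity is high, which is the pattern reported for the BM IG studies.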

