Rationale Discovery and Explainable AI

2021 ◽  
Author(s):  
Cor Steging ◽  
Silja Renooij ◽  
Bart Verheij

The justification of an algorithm’s outcomes is important in many domains, and in particular in the law. However, previous research has shown that machine learning systems can make the right decisions for the wrong reasons: despite high accuracies, not all of the conditions that define the domain of the training data are learned. In this study, we investigate what the system does learn, using state-of-the-art explainable AI techniques. With the use of SHAP and LIME, we are able to show which features impact the decision making process and how the impact changes with different distributions of the training data. However, our results also show that even high accuracy and good relevant feature detection are no guarantee for a sound rationale. Hence these state-of-the-art explainable AI techniques cannot be used to fully expose unsound rationales, further advocating the need for a separate method for rationale evaluation.
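SHAP attributes a prediction to individual features using Shapley values. The following is a minimal sketch of exact Shapley attribution for a toy model, included only to illustrate the kind of per-feature impact the abstract refers to. It is not the authors' experimental setup: the function `shapley_values`, the `toy_model`, and the all-zero baseline are hypothetical, and real SHAP implementations approximate this computation rather than enumerating every feature ordering.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions
    over every ordering of the features (feasible only for few features)."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)      # start from the baseline input
        previous = model(current)
        for i in order:
            current[i] = x[i]         # switch feature i to its actual value
            updated = model(current)
            phi[i] += updated - previous
            previous = updated
    return [total / len(orders) for total in phi]


def toy_model(features):
    # Hypothetical model: one additive term and one interaction term.
    return 2.0 * features[0] + features[1] * features[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, which is the property that lets them be read as each feature's share of the decision. As the abstract notes, a plausible attribution of this kind does not by itself establish that the learned rationale is sound.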

Social media has paved a new way of communicating and interacting with others. The use of social media differs according to the socio-cultural, demographic and psychological characteristics of individuals. People chat, share ideas and visual material, and feel that the groups they have joined satisfy their need to belong. Social networks are not only a space of freedom where people express themselves openly or furtively, but also a space where several forms of violence emerge, or even a means used to commit some forms of violence. The present research sheds light on a few of the common and trending methods of abuse and the risks faced by users of social media. The aim is to develop a system that identifies abusive audio files directed by an individual at a person or group on the basis of common language, race, sexual preference, religion, or nationality. We examine a deep learning model by probing design configurations of deep Convolutional Neural Networks (CNNs) and the impact of different hyper-parameter settings on identifying the negative aspects of social media. Deep CNNs automatically generate powerful features through hierarchical learning strategies from massive amounts of training data with a minimum of human interaction or expert process knowledge. An application of the proposed method demonstrates excellent results with low false alarm rates for Twitter data.


Author(s):  
Alex J. DeGrave ◽  
Joseph D. Janizek ◽  
Su-In Lee

Artificial intelligence (AI) researchers and radiologists have recently reported AI systems that accurately detect COVID-19 in chest radiographs. However, the robustness of these systems remains unclear. Using state-of-the-art techniques in explainable AI, we demonstrate that recent deep learning systems to detect COVID-19 from chest radiographs rely on confounding factors rather than medical pathology, creating an alarming situation in which the systems appear accurate, but fail when tested in new hospitals. We observe that the approach to obtain training data for these AI systems introduces a nearly ideal scenario for AI to learn these spurious “shortcuts.” Because this approach to data collection has also been used to obtain training data for detection of COVID-19 in computed tomography scans and for medical imaging tasks related to other diseases, our study reveals a far-reaching problem in medical imaging AI. In addition, we show that evaluation of a model on external data is insufficient to ensure AI systems rely on medically relevant pathology, since the undesired “shortcuts” learned by AI systems may not impair performance in new hospitals. These findings demonstrate that explainable AI should be seen as a prerequisite to clinical deployment of ML healthcare models.


2021 ◽  
Vol 2 (3) ◽  
pp. 1-26
Author(s):  
Xiaowei Jia ◽  
Jared Willard ◽  
Anuj Karpatne ◽  
Jordan S. Read ◽  
Jacob A. Zwart ◽  
...  

Physics-based models are often used to study engineering and environmental systems. The ability to model these systems is the key to achieving our future environmental sustainability and improving the quality of human life. This article focuses on simulating lake water temperature, which is critical for understanding the impact of changing climate on aquatic ecosystems and assisting in aquatic resource management decisions. General Lake Model (GLM) is a state-of-the-art physics-based model used for addressing such problems. However, like other physics-based models used for studying scientific and engineering systems, it has several well-known limitations due to simplified representations of the physical processes being modeled or challenges in selecting appropriate parameters. While state-of-the-art machine learning models can sometimes outperform physics-based models given ample amount of training data, they can produce results that are physically inconsistent. This article proposes a physics-guided recurrent neural network model (PGRNN) that combines RNNs and physics-based models to leverage their complementary strengths and improves the modeling of physical processes. Specifically, we show that a PGRNN can improve prediction accuracy over that of physics-based models (by over 20% even with very little training data), while generating outputs consistent with physical laws. An important aspect of our PGRNN approach lies in its ability to incorporate the knowledge encoded in physics-based models. This allows training the PGRNN model using very few true observed data while also ensuring high prediction accuracy. Although we present and evaluate this methodology in the context of modeling the dynamics of temperature in lakes, it is applicable more widely to a range of scientific and engineering disciplines where physics-based (also known as mechanistic) models are used.


Author(s):  
Jonas Austerjost ◽  
Robert Söldner ◽  
Christoffer Edlund ◽  
Johan Trygg ◽  
David Pollard ◽  
...  

Machine vision is a powerful technology that has become increasingly popular and accurate during the last decade due to rapid advances in the field of machine learning. The majority of machine vision applications are currently found in consumer electronics, automotive applications, and quality control, yet the potential for bioprocessing applications is tremendous. For instance, detecting and controlling foam emergence is important for all upstream bioprocesses, but the lack of robust foam sensing often leads to batch failures from foam-outs or overaddition of antifoam agents. Here, we report a new low-cost, flexible, and reliable foam sensor concept for bioreactor applications. The concept applies convolutional neural networks (CNNs), a state-of-the-art machine learning system for image processing. The implemented method shows high accuracy for both binary foam detection (foam/no foam) and fine-grained classification of foam levels.


2017 ◽  
Vol 3 ◽  
pp. e137 ◽  
Author(s):  
Mona Alshahrani ◽  
Othman Soufan ◽  
Arturo Magana-Mora ◽  
Vladimir B. Bajic

Background: Artificial neural networks (ANNs) are a robust class of machine learning models and are a frequent choice for solving classification problems. However, determining the structure of the ANNs is not trivial as a large number of weights (connection links) may lead to overfitting the training data. Although several ANN pruning algorithms have been proposed for the simplification of ANNs, these algorithms are not able to efficiently cope with intricate ANN structures required for complex classification problems. Methods: We developed DANNP, a web-based tool, that implements parallelized versions of several ANN pruning algorithms. The DANNP tool uses a modified version of the Fast Compressed Neural Network software implemented in C++ to considerably enhance the running time of the ANN pruning algorithms we implemented. In addition to the performance evaluation of the pruned ANNs, we systematically compared the set of features that remained in the pruned ANN with those obtained by different state-of-the-art feature selection (FS) methods. Results: Although the ANN pruning algorithms are not entirely parallelizable, DANNP was able to speed up the ANN pruning up to eight times on a 32-core machine, compared to the serial implementations. To assess the impact of the ANN pruning by DANNP tool, we used 16 datasets from different domains. In eight out of the 16 datasets, DANNP significantly reduced the number of weights by 70%–99%, while maintaining a competitive or better model performance compared to the unpruned ANN. Finally, we used a naïve Bayes classifier derived with the features selected as a byproduct of the ANN pruning and demonstrated that its accuracy is comparable to those obtained by the classifiers trained with the features selected by several state-of-the-art FS methods. The FS ranking methodology proposed in this study allows the users to identify the most discriminant features of the problem at hand.
To the best of our knowledge, DANNP (publicly available at www.cbrc.kaust.edu.sa/dannp) is the only available and on-line accessible tool that provides multiple parallelized ANN pruning options. Datasets and DANNP code can be obtained at www.cbrc.kaust.edu.sa/dannp/data.php and https://doi.org/10.5281/zenodo.1001086.
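The speedup reported in this abstract (up to eight times on a 32-core machine) can be put in context with a short calculation. The sketch below computes the parallel efficiency and, under the assumption that Amdahl's law describes the workload (the abstract does not state this), the parallelizable fraction such a speedup would imply. The function names are hypothetical and are not part of DANNP.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / cores


def amdahl_parallel_fraction(speedup, cores):
    """Parallelizable fraction p implied by Amdahl's law,
    speedup = 1 / ((1 - p) + p / cores), solved for p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# Figures taken from the abstract: up to 8x speedup on 32 cores.
efficiency = parallel_efficiency(8, 32)
fraction = amdahl_parallel_fraction(8, 32)
print(f"efficiency = {efficiency:.2f}, implied parallel fraction = {fraction:.3f}")
```

Under that assumption, an eightfold speedup on 32 cores corresponds to roughly 90% of the work being parallelizable, which is consistent with the statement that the pruning algorithms are not entirely parallelizable.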


2021 ◽  
Vol 17 (2) ◽  
pp. 1-20
Author(s):  
Zheng Wang ◽  
Qiao Wang ◽  
Tingzhang Zhao ◽  
Chaokun Wang ◽  
Xiaojun Ye

Feature selection, an effective technique for dimensionality reduction, plays an important role in many machine learning systems. Supervised knowledge can significantly improve the performance. However, faced with the rapid growth of newly emerging concepts, existing supervised methods might easily suffer from the scarcity and validity of labeled data for training. In this paper, the authors study the problem of zero-shot feature selection (i.e., building a feature selection model that generalizes well to “unseen” concepts with limited training data of “seen” concepts). Specifically, they adopt class-semantic descriptions (i.e., attributes) as supervision for feature selection, so as to utilize the supervised knowledge transferred from the seen concepts. For more reliable discriminative features, they further propose the center-characteristic loss which encourages the selected features to capture the central characteristics of seen concepts. Extensive experiments conducted on various real-world datasets demonstrate the effectiveness of the method.


Author(s):  
Sri Hartini ◽  
Zuherman Rustam ◽  
Glori Stephani Saragih ◽  
María Jesús Segovia Vargas

Banks have a crucial role in the financial system. When many banks suffer from a crisis, it can lead to financial instability. According to their impact, banking crises can be divided into two categories, namely systemic and non-systemic crises. When a systemic crisis happens, it may cause even stable banks to go bankrupt. Hence, this paper proposes a random forest for estimating the probability of banking crises as a preventive measure. Random forest is well known as a robust technique for both classification and regression, and is relatively insensitive to outliers and overfitting. The experiments were constructed using the financial crisis database, containing a sample of 79 countries in the period 1981-1999 (annual data). This dataset has 521 samples consisting of 164 crisis samples and 357 non-crisis samples. From the experiments, it was concluded that using 90 percent of the data for training delivers 0.98 accuracy, 0.92 sensitivity, 1.00 precision, and 0.96 F1-score, the highest scores among the training-data percentages tested. These results are also better than those of state-of-the-art methods applied to the same dataset. Therefore, the proposed method shows promising results for predicting the probability of banking crises.
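The metrics reported in this abstract are related: the F1-score is the harmonic mean of precision and sensitivity (recall). A minimal sketch, using only the figures stated in the abstract, checks that they are mutually consistent; the helper name `f1_score` is a hypothetical illustration, not the authors' code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
precision = 1.00
sensitivity = 0.92

f1 = f1_score(precision, sensitivity)
print(round(f1, 2))

# The class counts in the abstract also add up to the stated sample size.
crisis, non_crisis = 164, 357
print(crisis + non_crisis)
```

The computed value rounds to the 0.96 given in the abstract, and the two class counts sum to the stated 521 samples.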


2021 ◽  
Vol 25 (5) ◽  
pp. 1073-1098
Author(s):  
Nor Hamizah Miswan ◽  
Chee Seng Chan ◽  
Chong Guan Ng

Hospital readmission is a major cost for healthcare systems worldwide. If patients with a higher potential of readmission could be identified at the start, existing resources could be used more efficiently, and appropriate plans could be implemented to reduce the risk of readmission. Therefore, it is important to predict the right target patients. Medical data is usually noisy, incomplete, and inconsistent. Hence, before developing a prediction model, it is crucial to preprocess the data appropriately so that improved predictive performance is achieved. The current study aims to analyse the impact of different preprocessing methods on the performance of different machine learning classifiers. The preprocessing methods applied in previous hospital readmission studies were compared, and the most common approaches, such as missing value imputation, feature selection, data balancing, and feature scaling, were highlighted. The hyperparameters were selected using Bayesian optimisation. The different preprocessing pipelines were assessed using various performance metrics and computational costs. The results indicated that the preprocessing approaches helped improve the model’s prediction of hospital readmission.


Author(s):  
Jessica Taylor ◽  
Eliezer Yudkowsky ◽  
Patrick LaVictoire ◽  
Andrew Critch

This chapter surveys eight research areas organized around one question: As learning systems become increasingly intelligent and autonomous, what design principles can best ensure that their behavior is aligned with the interests of the operators? The chapter focuses on two major technical obstacles to AI alignment: the challenge of specifying the right kind of objective functions and the challenge of designing AI systems that avoid unintended consequences and undesirable behavior even in cases where the objective function does not line up perfectly with the intentions of the designers. The questions surveyed include the following: How can we train reinforcement learners to take actions that are more amenable to meaningful assessment by intelligent overseers? What kinds of objective functions incentivize a system to “not have an overly large impact” or “not have many side effects”? The chapter discusses these questions, related work, and potential directions for future research, with the goal of highlighting relevant research topics in machine learning that appear tractable today.


2020 ◽  
Vol 10 (18) ◽  
pp. 6619
Author(s):  
Po-Jiun Wen ◽  
Chihpin Huang

Noise prediction using machine learning has recently received increased attention. This is particularly true for workplaces with noise pollution, which increases noise exposure for general laborers. This study analyzes the noise equivalent level (Leq) at the National Synchrotron Radiation Research Center (NSRRC) facility and establishes a machine learning model for noise prediction. The study uses a gradient boosting model (GBM) as the learning model, in which past noise measurement records and many other features are integrated to make a prediction. It analyzes the time duration and frequency of the collected Leq and also investigates the impact of training data selection. The results indicate that the proposed prediction model works well for almost all noise sensors and frequencies. Moreover, the model performed especially well for sensor 8 (125 Hz), which was determined to be a serious noise zone in past noise measurements. The results also show that the root-mean-square error (RMSE) of the predicted harmful noise was less than 1 dBA and the coefficient of determination (R2) was greater than 0.7. That is, the proposed method showed favorable noise prediction performance in the working field. This positive result shows the ability of the proposed approach to predict noise, and thus to notify laborers so that they can avoid long-term exposure. In addition, the proposed model accurately predicts future noise pollution, which is essential for laborers in high-noise environments. This would help keep employees healthy by steering them away from positions with harmful noise levels.
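The two evaluation measures named in this abstract, RMSE and the coefficient of determination (R2), can be computed as in the following minimal sketch. The measurement values are made-up illustrative numbers, not data from the study, and the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative noise levels in dBA (invented for this example only).
observed = [60.0, 62.0, 64.0]
predicted = [60.0, 62.0, 65.0]

error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
print(f"RMSE = {error:.3f} dBA, R2 = {fit:.3f}")
```

RMSE is expressed in the same unit as the target (here dBA), so the threshold of 1 dBA quoted in the abstract is directly comparable to the measurements, whereas R2 is unitless and measures the share of variance explained.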

