Model-free RL or action sequences?

Author(s):  
Adam Morris ◽  
Fiery Andrews Cushman

The alignment of habits with model-free reinforcement learning (MF RL) is a success story for computational models of decision making, and MF RL has been applied to explain phasic dopamine responses, working memory gating, drug addiction, moral intuitions, and more. Yet, the role of MF RL has recently been challenged by an alternate model---model-based selection of chained action sequences---that produces similar behavioral and neural patterns. Here, we present two experiments that dissociate MF RL from this prominent alternative, and present unconfounded empirical support for the role of MF RL in human decision making. Our results also demonstrate that people are simultaneously using model-based selection of action sequences, thus demonstrating two distinct mechanisms of habitual control in a common experimental paradigm. These findings clarify the nature of habits and help solidify MF RL's central position in models of human behavior.

Author(s):  
Fajar Syahputra ◽  
Mesran Mesran ◽  
Ikhwan Lubis ◽  
Agus Perdana Windarto

The teacher is a cornerstone of the world of education: the ability and achievement of students cannot be separated from the role of a teacher in teaching and guiding them. Law of the Republic of Indonesia No. 14 of 2005 concerning Teachers and Lecturers explains in Article 1 that teachers are professional educators whose main task is educating, teaching, guiding, directing, training, assessing, and evaluating students in formal early childhood education, basic education, and secondary education. Article 4 of the Act explains that the position of teachers as professionals serves to enhance the dignity and role of teachers as learning agents in order to improve the quality of national education. Decision making is a selection process among various alternatives that aims to meet one or several targets. The decision-making system has four phases, namely intelligence, design, choice, and implementation. These phases are the basis for decision making, which ends with a recommendation. The Preference Selection Index (PSI) method is a rarely used decision support method. It was developed by Stevanie and Bhatt (2010) to solve Multi-Criteria Decision Making (MCDM) problems. With the right considerations, this method can be one of the tools for determining policies in decision-making systems, especially the selection of outstanding teachers. Policies taken as a basis for decision making must be determined using criteria that can be defined clearly and objectively.
Keywords: Decision Support System, PSI, Selection of Achieving Teachers
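The abstract above names the PSI method without showing its steps. As a rough illustration only, the commonly described PSI procedure (normalise the decision matrix, measure how much each criterion varies across alternatives, derive criterion weights from that variation, then score each alternative) can be sketched as follows. The function name `psi_rank`, the criteria, and the teacher data are hypothetical and are not taken from the paper; all input values are assumed to be strictly positive.

```python
def psi_rank(matrix, benefit):
    """Sketch of the Preference Selection Index (PSI) procedure.

    matrix  : rows are alternatives, columns are criteria (all values > 0).
    benefit : benefit[j] is True when larger values of criterion j are better,
              False when smaller values are better (a cost criterion).
    Returns (weights, scores).
    """
    n_alt = len(matrix)
    n_crit = len(matrix[0])

    # Step 1: normalise each column to (0, 1].
    norm = [[0.0] * n_crit for _ in range(n_alt)]
    for j in range(n_crit):
        column = [row[j] for row in matrix]
        hi, lo = max(column), min(column)
        for i, x in enumerate(column):
            norm[i][j] = x / hi if benefit[j] else lo / x

    # Step 2: mean normalised value of each criterion.
    means = [sum(norm[i][j] for i in range(n_alt)) / n_alt for j in range(n_crit)]

    # Step 3: preference variation of each criterion around its mean.
    variation = [
        sum((norm[i][j] - means[j]) ** 2 for i in range(n_alt))
        for j in range(n_crit)
    ]

    # Step 4: deviation in preference, rescaled so the weights sum to 1.
    deviation = [1.0 - v for v in variation]
    total = sum(deviation)
    weights = [d / total for d in deviation]

    # Step 5: preference selection index of each alternative.
    scores = [
        sum(norm[i][j] * weights[j] for j in range(n_crit)) for i in range(n_alt)
    ]
    return weights, scores


# Hypothetical data: teaching score, attendance (%), number of complaints (cost).
teachers = ["A", "B", "C"]
data = [[90, 95, 1], [80, 85, 3], [70, 90, 2]]
weights, scores = psi_rank(data, benefit=[True, True, False])
ranking = sorted(zip(teachers, scores), key=lambda pair: pair[1], reverse=True)
```

Unlike many MCDM methods, this procedure needs no externally supplied criterion weights; they are derived from the data, which is why the choice of clearly defined, objective criteria matters.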


2021 ◽  
Vol 22 (1) ◽  
pp. 14-44
Author(s):  
Marija Zlatnar Moe ◽  
Tamara Mikolič Južnič ◽  
Tanja Žigon

The article explores the interaction among three key figures in the process of publication of a literary translation into a language of low diffusion: the translator, the editor and the language reviser (the latter specific to the Slovene situation). The aim of the research is to identify who has the strongest position of power in the decision-making process of the production of a literary translation, especially when conflict arises. Information was gathered from the three groups with questionnaires, interviews and an analysis of public statements. The questions focused on the selection of the translator and language reviser, the translation process, the revision process and conflict resolution. A cross-comparison of the results indicates that despite the automatic central position of the editors, they tend to yield their decision-making power to translators, while language revisers have a more subservient, consulting role.


2019 ◽  
Author(s):  
Allison Letkiewicz ◽  
Amy L. Cochran ◽  
Josh M. Cisler

Trauma and trauma-related disorders are characterized by altered learning styles. Two learning processes that have been delineated using computational modeling are model-free and model-based reinforcement learning (RL), characterized by trial and error and goal-driven, rule-based learning, respectively. Prior research suggests that model-free RL is disrupted among individuals with a history of assaultive trauma and may contribute to altered fear responding. Currently, it is unclear whether model-based RL, which involves building abstract and nuanced representations of stimulus-outcome relationships to prospectively predict action-related outcomes, is also impaired among individuals who have experienced trauma. The present study sought to test the hypothesis of impaired model-based RL among adolescent females exposed to assaultive trauma. Participants (n=60) completed a three-arm bandit RL task during fMRI acquisition. Two computational models compared the degree to which each participant’s task behavior fit the use of a model-free versus model-based RL strategy. Overall, a greater portion of participants’ behavior was better captured by the model-based than model-free RL model. Although assaultive trauma did not predict learning strategy use, greater sexual abuse severity predicted less use of model-based compared to model-free RL. Additionally, severe sexual abuse predicted less left frontoparietal network encoding of model-based RL updates, which was not accounted for by PTSD. Given the significant impact that sexual trauma has on mental health and other aspects of functioning, it is plausible that altered model-based RL is an important route through which clinical impairment emerges.
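For readers unfamiliar with the distinction drawn above, a model-free learner on a three-arm bandit can be sketched as a simple trial-and-error value update with softmax choice. This is a generic textbook-style illustration, not the computational model fitted in the study; the function names, the learning rate `alpha`, the inverse temperature `beta`, and the reward probabilities are all assumptions made for the example.

```python
import math
import random


def td_update(q_values, arm, reward, alpha):
    """Model-free update: move the chosen arm's value toward the observed reward."""
    updated = list(q_values)
    updated[arm] += alpha * (reward - updated[arm])
    return updated


def softmax_choice(q_values, beta, rng):
    """Sample an arm with probability proportional to exp(beta * value)."""
    prefs = [math.exp(beta * q) for q in q_values]
    threshold = rng.random() * sum(prefs)
    cumulative = 0.0
    for arm, pref in enumerate(prefs):
        cumulative += pref
        if threshold < cumulative:
            return arm
    return len(q_values) - 1  # guard against floating-point rounding


def run_model_free(reward_probs, n_trials=500, alpha=0.1, beta=5.0, seed=0):
    """Simulate a model-free agent on a bandit with Bernoulli rewards."""
    rng = random.Random(seed)
    q_values = [0.0] * len(reward_probs)
    for _ in range(n_trials):
        arm = softmax_choice(q_values, beta, rng)
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        q_values = td_update(q_values, arm, reward, alpha)
    return q_values


# Hypothetical three-arm task: each arm pays out with a fixed probability.
learned_values = run_model_free([0.2, 0.5, 0.8])
```

A model-based learner would instead maintain an explicit representation of the stimulus-outcome structure and use it to predict outcomes prospectively, rather than relying only on cached values like `q_values`.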


Author(s):  
Ashish Khaira ◽  
Ravi K. Dwivedi

Nondestructive testing (NDT) is a vital tool in maintenance. Each NDT technique has its own benefits and limitations; therefore, selecting the right technique is crucial. Generally, the selection relies on the experience of operating personnel, and very few research papers report the use of a decision-making (DM) approach. Various researchers have highlighted that a proper DM approach saves time and increases fault detection reliability. With this in mind, this chapter provides a detailed review of research from 2000 to 2018 covering the role of DM techniques in combining NDT methods for effective condition monitoring. The literature shows that very few researchers have effectively utilized the power of DM tools. Researchers can use the outcome of this work as a starting point and improve on it further.


2014 ◽  
Vol 369 (1655) ◽  
pp. 20130474 ◽  
Author(s):  
Etienne Koechlin

The prefrontal cortex subserves executive control and decision-making, that is, the coordination and selection of thoughts and actions in the service of adaptive behaviour. We present here a computational theory describing the evolution of the prefrontal cortex from rodents to humans as gradually adding new inferential Bayesian capabilities for dealing with a computationally intractable decision problem: exploring and learning new behavioural strategies versus exploiting and adjusting previously learned ones through reinforcement learning (RL). We provide a principled account identifying three inferential steps optimizing this arbitration through the emergence of (i) factual reactive inferences in paralimbic prefrontal regions in rodents; (ii) factual proactive inferences in lateral prefrontal regions in primates and (iii) counterfactual reactive and proactive inferences in human frontopolar regions. The theory clarifies the integration of model-free and model-based RL through the notion of strategy creation. The theory also shows that counterfactual inferences in humans yield to the notion of hypothesis testing, a critical reasoning ability for approximating optimal adaptive processes and presumably endowing humans with a qualitative evolutionary advantage in adaptive behaviour.


BMC Cancer ◽  
2018 ◽  
Vol 18 (1) ◽  
Author(s):  
S. Mokhles ◽  
J. J. M. E. Nuyttens ◽  
M. de Mol ◽  
J. G. J. V. Aerts ◽  
A. P. W. M. Maat ◽  
...  

2003 ◽  
Vol 89 (S1) ◽  
pp. S87-S99 ◽  
Author(s):  
Silvia Valtueña ◽  
Kevin Cashman ◽  
Simon P. Robins ◽  
Aedin Cassidy ◽  
Alwine Kardinaal ◽  
...  

Research on the bone effects of natural phyto-oestrogens after menopause is at a relatively early stage. Published studies are few, difficult to compare and often inconclusive, due in part to design weaknesses. Currently, many questions remain to be answered including to what extent a safe daily intake may prevent postmenopausal bone loss. These questions can only be addressed by conducting well-planned, randomised clinical trials that take into consideration present knowledge in the oestrogen, phyto-oestrogen and bone fields. This review is intended to provide hints for critical decision-making about the selection of subjects, type of intervention, suitable outcome measures and variables that need to be controlled.


2013 ◽  
Vol 2013 ◽  
pp. 1-12 ◽  
Author(s):  
Ji-Feng Ding ◽  
Chien-Chang Chou

The role of container logistics centres as home bases for merchandise transportation has become increasingly important. Container carriers need to select a suitable transshipment port as the centre location to meet the requirements of container shipping logistics. In light of this, the main purpose of this paper is to develop a fuzzy multi-criteria decision-making (MCDM) model to evaluate the best selection of transshipment ports for container carriers. First, some concepts and methods used to develop the proposed model are briefly introduced. The performance values of quantitative and qualitative subcriteria are discussed to evaluate the fuzzy ratings. Then, the ideal and anti-ideal concepts and the modified distance measure method are used in the proposed model. Finally, a step-by-step example illustrates the computational process of the quantitative and qualitative fuzzy MCDM model. The proposed approach successfully accomplishes this goal. In addition, the proposed fuzzy MCDM model can be employed empirically in future studies to select the best transshipment port location for container carriers.
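The ideal and anti-ideal idea mentioned above can be illustrated with a small sketch. The abstract does not specify the paper's modified distance measure, so the example below swaps in the widely used vertex distance between triangular fuzzy numbers and a closeness coefficient in the style of fuzzy TOPSIS. The port names, ratings, crisp criterion weights, and function names are hypothetical, and ratings are assumed to be already normalised to the interval [0, 1].

```python
import math

# Triangular fuzzy numbers are written as (lower, modal, upper) tuples.
IDEAL = (1.0, 1.0, 1.0)       # best possible normalised rating
ANTI_IDEAL = (0.0, 0.0, 0.0)  # worst possible normalised rating


def vertex_distance(a, b):
    """Vertex distance between two triangular fuzzy numbers (an assumed stand-in
    for the paper's unspecified modified distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3.0)


def closeness(ratings, weights):
    """Closeness coefficient: 1.0 at the ideal solution, 0.0 at the anti-ideal."""
    d_plus = sum(w * vertex_distance(r, IDEAL) for r, w in zip(ratings, weights))
    d_minus = sum(w * vertex_distance(r, ANTI_IDEAL) for r, w in zip(ratings, weights))
    return d_minus / (d_plus + d_minus)


# Hypothetical fuzzy ratings of two candidate ports on two criteria.
ports = {
    "Port X": [(0.7, 0.9, 1.0), (0.5, 0.7, 0.9)],
    "Port Y": [(0.3, 0.5, 0.7), (0.1, 0.3, 0.5)],
}
criterion_weights = [0.6, 0.4]

ranking = sorted(
    ports,
    key=lambda name: closeness(ports[name], criterion_weights),
    reverse=True,
)
```

The candidate whose ratings lie closest to the ideal and farthest from the anti-ideal receives the highest coefficient and is ranked first.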


2015 ◽  
Vol 22 (2) ◽  
pp. 188-198 ◽  
Author(s):  
Patricia Gruner ◽  
Alan Anticevic ◽  
Daeyeol Lee ◽  
Christopher Pittenger

Decision making in a complex world, characterized both by predictable regularities and by frequent departures from the norm, requires dynamic switching between rapid habit-like, automatic processes and slower, more flexible evaluative processes. These strategies, formalized as “model-free” and “model-based” reinforcement learning algorithms, respectively, can lead to divergent behavioral outcomes, requiring a mechanism to arbitrate between them in a context-appropriate manner. Recent data suggest that individuals with obsessive-compulsive disorder (OCD) rely excessively on inflexible habit-like decision making during reinforcement-driven learning. We propose that inflexible reliance on habit in OCD may reflect a functional weakness in the mechanism for context-appropriate dynamic arbitration between model-free and model-based decision making. Support for this hypothesis derives from emerging functional imaging findings. A deficit in arbitration in OCD may help reconcile evidence for excessive reliance on habit in rewarded learning tasks with an older literature suggesting inappropriate recruitment of circuitry associated with model-based decision making in unreinforced procedural learning. The hypothesized deficit and corresponding circuitry may be a particularly fruitful target for interventions, including cognitive remediation.

