environmental reward
Recently Published Documents


TOTAL DOCUMENTS: 24 (FIVE YEARS: 10)

H-INDEX: 4 (FIVE YEARS: 1)

2021 ◽  
Vol 36 (6) ◽  
pp. 1160-1160
Author(s):  
Julianne Wilson ◽  
Amanda R Rabinowitz ◽  
Tessa Hart

Abstract. Objective: In persons with moderate–severe traumatic brain injury (msTBI), we compared traditional measures of mood with dynamic measures of affect derived from ecological momentary assessment (EMA), for the purpose of validating the EMA indices and exploring their unique contributions to emotional assessment. Method: Twenty-three community-dwelling participants with chronic msTBI were enrolled in a treatment trial for anxiety and/or depression. At baseline, participants completed the Brief Symptom Inventory-18 Depression and Anxiety subscales (BSI-D, BSI-A) and the Environmental Reward Observation Scale (EROS), a measure of everyday pleasure and reward. EMA data, including the Positive and Negative Affect Scale (PANAS), were collected via smartphone 5 times daily for 7–14 days prior to treatment (M = 8.65; SD = 1.87). Spearman correlations tested associations of baseline BSI-D, BSI-A, and EROS scores with both overall means and temporal variability measures for positive and negative affect (PA, NA). Results: Mean PA was significantly correlated with BSI-D (rho −0.60, p < 0.05) and EROS (rho 0.72, p < 0.01). Mean NA and affect variability measures were uncorrelated with baseline scores. NA mean and variability were intercorrelated (rho 0.87, p < 0.001), but this was not the case for PA. Conclusion: EMA measures of averaged positive affect showed robust relationships with retrospective measures of depression and environmental reward, providing support for the validity of EMA measures of PA, and for use of the EROS in msTBI. While negative findings must be interpreted with caution, the lack of association of affective variability with retrospective measures suggests a unique role for EMA in examining temporal dynamics of affect.
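The analysis described in the Method can be sketched roughly as follows: summarize each participant's EMA ratings as a mean and a variability index, then rank-correlate those summaries with a baseline score. This is a minimal illustration only. The participant IDs, ratings, and EROS scores are made-up values, and the within-person standard deviation is an assumed stand-in for the "temporal variability measures", which the abstract does not specify.

```python
from statistics import mean, pstdev


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical EMA positive-affect ratings per participant (not study data).
ema_pa = {
    "p1": [10, 12, 11, 13],
    "p2": [20, 22, 19, 21],
    "p3": [15, 15, 16, 14],
    "p4": [30, 28, 29, 31],
}
# Hypothetical baseline EROS scores (not study data).
eros = {"p1": 18, "p2": 25, "p3": 22, "p4": 33}

ids = sorted(ema_pa)
pa_mean = [mean(ema_pa[p]) for p in ids]   # overall mean affect
pa_sd = [pstdev(ema_pa[p]) for p in ids]   # assumed variability index
rho_mean = spearman(pa_mean, [eros[p] for p in ids])
```

With real data, the same `spearman` call would be repeated for each pairing of a baseline score with a mean or variability summary.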


Author(s):  
Lilla Horvath ◽  
Stanley Colcombe ◽  
Michael Milham ◽  
Shruti Ray ◽  
Philipp Schwartenbeck ◽  
...  

Abstract: Humans often face sequential decision-making problems, in which information about the environmental reward structure is detached from rewards for a subset of actions. In the current exploratory study, we introduce an information-selective symmetric reversal bandit task to model such situations and obtained choice data on this task from 24 participants. To arbitrate between different decision-making strategies that participants may use on this task, we developed a set of probabilistic agent-based behavioral models, including exploitative and explorative Bayesian agents, as well as heuristic control agents. Upon validating the model and parameter recovery properties of our model set and summarizing the participants’ choice data in a descriptive way, we used a maximum likelihood approach to evaluate the participants’ choice data from the perspective of our model set. In brief, we provide quantitative evidence that participants employ a belief state-based hybrid explorative-exploitative strategy on the information-selective symmetric reversal bandit task, lending further support to the finding that humans are guided by their subjective uncertainty when solving exploration-exploitation dilemmas.


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Ana C Sias ◽  
Ashleigh K Morse ◽  
Sherry Wang ◽  
Venuz Y Greenfield ◽  
Caitlin M Goodpaster ◽  
...  

Adaptive reward-related decision making often requires accurate and detailed representation of potential available rewards. Environmental reward-predictive stimuli can facilitate these representations, allowing one to infer which specific rewards might be available and choose accordingly. This process relies on encoded relationships between the cues and the sensory-specific details of the reward they predict. Here we interrogated the function of the basolateral amygdala (BLA) and its interaction with the lateral orbitofrontal cortex (lOFC) in the ability to learn such stimulus-outcome associations and use these memories to guide decision making. Using optical recording and inhibition approaches, Pavlovian cue-reward conditioning, and the outcome-selective Pavlovian-to-instrumental transfer (PIT) test in male rats, we found that the BLA is robustly activated at the time of stimulus-outcome learning and that this activity is necessary for sensory-specific stimulus-outcome memories to be encoded, so they can subsequently influence reward choices. Direct input from the lOFC was found to support the BLA in this function. Based on prior work, activity in BLA projections back to the lOFC was known to support the use of stimulus-outcome memories to influence decision making. By multiplexing optogenetic and chemogenetic inhibition we performed a serial circuit disconnection and found that the lOFC→BLA and BLA→lOFC pathways form a functional circuit regulating the encoding (lOFC→BLA) and subsequent use (BLA→lOFC) of the stimulus-dependent, sensory-specific reward memories that are critical for adaptive, appetitive decision making.


Author(s):  
Tasmia Tasrin ◽  
Md Sultan Al Nahian ◽  
Habarakadage Perera ◽  
Brent Harrison

Interactive reinforcement learning (IRL) agents use human feedback or instruction to help them learn in complex environments. Often, this feedback comes in the form of a discrete signal that is either positive or negative. While informative, this signal can be difficult to generalize on its own. In this work, we explore how natural language advice can be used to provide a richer feedback signal to a reinforcement learning agent by extending policy shaping, a well-known IRL technique. Policy shaping usually employs a human feedback policy to help an agent learn how to achieve its goal. In our case, we replace this human feedback policy with a policy generated from natural language advice. We aim to examine whether the generated natural language reasoning helps a deep RL agent choose its actions successfully in a given environment. To this end, we design our model with three networks: an experience-driven network, an advice generator, and an advice-driven network. The experience-driven RL agent chooses its actions under the influence of the environmental reward, while the advice-driven network uses the feedback produced by the advice generator for each new state to select actions that assist the RL agent in policy shaping.
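One common way to combine two action distributions in policy shaping is to multiply them elementwise and renormalize. The sketch below illustrates that idea only. Whether the authors combine their experience-driven and advice-driven networks in exactly this way is not stated in the abstract, and the function name and the example probabilities are hypothetical.

```python
def combine_policies(experience_probs, advice_probs):
    """Multiply two action distributions elementwise and renormalize.

    Falls back to the experience-driven distribution if the two
    distributions share no action with nonzero probability.
    """
    product = [e * a for e, a in zip(experience_probs, advice_probs)]
    total = sum(product)
    if total == 0:
        return list(experience_probs)
    return [p / total for p in product]


# Hypothetical two-action example: the experience-driven policy is
# undecided, so the advice-driven policy determines the result.
combined = combine_policies([0.5, 0.5], [0.8, 0.2])
```

Under this rule, an action needs support from both sources to keep a high probability, and a confident source outweighs an uncertain one.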


2021 ◽  
Author(s):  
Ana C Sias ◽  
Ashleigh K Morse ◽  
Sherry Wang ◽  
Venuz Y Greenfield ◽  
Caitlin M Goodpaster ◽  
...  

Adaptive reward-related decision making often requires accurate and detailed representation of potential available rewards. Environmental reward-predictive stimuli can facilitate these representations, allowing one to infer which specific rewards might be available and choose accordingly. This process relies on encoded relationships between the cues and the sensory-specific details of the reward they predict. Here we interrogated the function of the basolateral amygdala (BLA) and its interaction with the lateral orbitofrontal cortex (lOFC) in the ability to learn such stimulus-outcome associations and use these memories to guide decision making. Using optical recording and inhibition approaches, Pavlovian cue-reward conditioning, and an outcome-selective Pavlovian-to-instrumental transfer (PIT) test in male rats, we found that the BLA is robustly activated at the time of stimulus-outcome learning and that this activity is necessary for sensory-specific stimulus-outcome memories to be encoded, so that they can subsequently influence reward choices. Direct input from the lOFC was found to support the BLA in this function. Based on prior work, activity in BLA projections back to the lOFC was known to support the use of stimulus-outcome memories to influence decision making. By multiplexing optogenetic and chemogenetic inhibition to perform a serial circuit disconnection, we found that activity in lOFC→BLA projections regulates the encoding of the same components of the stimulus-outcome memory that are later used to allow cues to guide choice via activity in BLA→lOFC projections. Thus, the lOFC→BLA→lOFC circuit regulates the encoding (lOFC→BLA) and subsequent use (BLA→lOFC) of the stimulus-dependent, sensory-specific reward memories that are critical for adaptive, appetitive decision making.


2020 ◽  
pp. 003329412098193
Author(s):  
Lindsey W. Vilca ◽  
Robert I. Echebaudes-Ilizarbe ◽  
Jannia M. Aquino-Hidalgo ◽  
José Ventura-León ◽  
Renzo Martinez-Munive ◽  
...  

The aim of this study was to assess the factorial structure of the Environmental Reward Observation Scale (EROS), the method effect associated with its negative items, its temporal invariance, and its factorial invariance according to sex. For this purpose, three samples were collected: an initial sample of 200 participants, a second sample of 461 participants, and a third sample of 107 participants, for a total of 768 Peruvian university students. Other instruments were applied together with the EROS to measure satisfaction with life, anxiety, stress, and depression. In the initial sample, the original scale containing positive and negative items did not adequately fit the data (RMSEA = .19; CFI = .77; TLI = .71), and evidence was found supporting the existence of a method effect associated with the negative items. It was also found that version B of the scale, which has only positive items, fits the data (RMSEA = .13; CFI = .96; TLI = .95). In the second sample, version B still had a good fit to the data in a larger sample (RMSEA = .07; CFI = .98; TLI = .98). In addition, the scale can be considered invariant according to sex and presents validity based on its relations with other constructs. In the third sample, the test-retest reliability of the scale was adequate (.70 [95% CI .593–.788]), and evidence was found in favor of the temporal invariance of the scale. It is concluded that the scale formed only by positive items presents more robust psychometric properties and constitutes a better alternative for measuring the level of reward provided by the environment.


2020 ◽  
Vol 69 ◽  
pp. 1287-1332
Author(s):  
Cam Linke ◽  
Nadia M. Ady ◽  
Martha White ◽  
Thomas Degris ◽  
Adam White

Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this paper is how to sculpt the stream of experience—how to adapt the learning system’s behavior—to optimize the learning of a collection of value functions. A simple answer is to compute an intrinsic reward based on the statistics of each auxiliary learner, and use reinforcement learning to maximize that intrinsic reward. Unfortunately, implementing this simple idea has proven difficult, and thus has been the focus of decades of study. It remains unclear which of the many possible measures of learning would work well in a parallel learning setting where environmental reward is extremely sparse or absent. In this paper, we investigate and compare different intrinsic reward mechanisms in a new bandit-like parallel-learning testbed. We discuss the interaction between reward and prediction learners and highlight the importance of introspective prediction learners: those that increase their rate of learning when progress is possible, and decrease when it is not. We provide a comprehensive empirical comparison of 14 different rewards, including well-known ideas from reinforcement learning and active learning. Our results highlight a simple but seemingly powerful principle: intrinsic rewards based on the amount of learning can generate useful behavior, if each individual learner is introspective.
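The idea of an intrinsic reward based on the amount of learning can be illustrated with a toy learner. In this sketch the intrinsic reward is simply the absolute change in a learner's estimate after one update. It is an assumed simplification: the constant target, the fixed step size, and the function name are hypothetical, and the paper's introspective learners adapt their learning rate, which this example does not do.

```python
def lms_update(estimate, target, step_size):
    """Move the estimate toward the target; return (new estimate, |change|)."""
    new_estimate = estimate + step_size * (target - estimate)
    return new_estimate, abs(new_estimate - estimate)


# Hypothetical auxiliary learner tracking a constant target of 1.0.
estimate = 0.0
rewards = []
for _ in range(5):
    estimate, intrinsic = lms_update(estimate, target=1.0, step_size=0.5)
    rewards.append(intrinsic)  # reward shrinks as there is less left to learn
```

Because the target here is learnable, the reward decays toward zero, so a behavior policy maximizing it would move on to learners that are still making progress.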


2020 ◽  
Vol 11 ◽  
Author(s):  
Matthew D. McPhee ◽  
Matthew T. Keough ◽  
Samantha Rundle ◽  
Laura M. Heath ◽  
Jeffrey D. Wardell ◽  
...  

2020 ◽  
Author(s):  
Lilla Horvath ◽  
Stanley Colcombe ◽  
Michael Milham ◽  
Shruti Ray ◽  
Philipp Schwartenbeck ◽  
...  

Abstract: Humans often face sequential decision-making problems, in which information about the environmental reward structure is detached from rewards for a subset of actions. For example, a medicated patient may consider partaking in a clinical trial on the effectiveness of a new drug. Taking part in the trial can provide the patient with information about the personal effectiveness of the new drug and the potential reward of a better treatment. Not taking part in the trial does not provide the patient with this information, but is associated with the reward of a (potentially less) effective treatment. In the current study, we introduce a novel information-selective reversal bandit task to model such situations and obtained choice data on this task from 24 participants. To arbitrate between different decision-making strategies that participants may use on this task, we developed a set of probabilistic agent-based behavioural models, including exploitative and explorative Bayesian agents, as well as heuristic control agents. Upon validating the model and parameter recovery properties of our model set and summarizing the participants’ choice data in a descriptive way, we used a maximum likelihood approach to evaluate the participants’ choice data from the perspective of our model set. In brief, we provide evidence that participants employ a belief state-based hybrid explorative-exploitative strategy on the information-selective reversal bandit task, lending further support to the finding that humans are guided by their subjective uncertainty when solving exploration-exploitation dilemmas.
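A belief-state agent for a task of this kind can be sketched as a two-state Bayesian filter. The sketch rests on assumptions that the abstract does not confirm: there are two arms, exactly one of which is currently "good"; the good arm pays a binary reward with probability `p_good` and the other with probability `1 - p_good`; the good arm reverses with probability `hazard` on each trial; and the outcome is seen only when the informative action is chosen. All names and default values are hypothetical.

```python
def predict(belief, hazard):
    """Propagate P(arm 0 is the good arm) through a possible reversal."""
    return belief * (1 - hazard) + (1 - belief) * hazard


def update(belief, arm, reward, p_good):
    """Bayes update of P(arm 0 is good) after a binary reward from `arm`."""
    # Probability of a reward from the chosen arm under each hypothesis.
    p_if_0_good = p_good if arm == 0 else 1 - p_good
    p_if_1_good = 1 - p_good if arm == 0 else p_good
    like0 = p_if_0_good if reward else 1 - p_if_0_good
    like1 = p_if_1_good if reward else 1 - p_if_1_good
    evidence = like0 * belief + like1 * (1 - belief)
    return like0 * belief / evidence


def step(belief, arm, reward, observed, p_good=0.8, hazard=0.1):
    """One trial: always predict; update only if the outcome was observed."""
    prior = predict(belief, hazard)
    return update(prior, arm, reward, p_good) if observed else prior


# Informative choice: a reward from arm 0 raises the belief in arm 0.
after_informative = step(0.5, arm=0, reward=True, observed=True)
# Non-informative choice: the belief only drifts toward 0.5.
after_uninformative = step(0.9, arm=0, reward=True, observed=False)
```

An exploitative agent would pick the arm favored by the current belief, while an explorative agent would weigh how much the informative action is expected to reduce its uncertainty.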


2020 ◽  
Author(s):  
Matthew McPhee ◽  
Matthew T. Keough ◽  
Samantha Rundle ◽  
Laura M. Heath ◽  
Jeffrey Wardell ◽  
...  

Increases in the incidence of psychological distress and alcohol use during the COVID-19 pandemic have been predicted. Environmental reward and self-medication theories suggest that increased distress and greater social/environmental constraints during COVID-19 could result in increases in depression and drinking to cope with negative affect. The current study had two goals: (1) to clarify the presence and direction of changes in alcohol use and related outcomes after the introduction of COVID-19 social distancing requirements; and (2) to test hypothesized mediation models to explain individual differences in alcohol use during the early weeks of the COVID-19 pandemic. Participants (n = 1127) were U.S. residents recruited for participation in an online survey. The survey included questions assessing environmental reward, depression, COVID-19-related distress, drinking motives, and alcohol use outcomes (alcohol use, drinking motives, alcohol demand, and solitary drinking). Outcomes were assessed for two timeframes: the 30 days prior to state-mandated social distancing (‘pre-social-distancing’), and the 30 days after the start of state-mandated social distancing (‘post-social-distancing’). Depression severity, coping motives, and frequency of solitary drinking were significantly greater post-social-distancing relative to pre-social-distancing. Conversely, environmental reward and other drinking motives (social, enhancement, and conformity) were significantly lower post-social-distancing compared to pre-social-distancing. Time spent drinking and frequency of binge drinking were greater post-social-distancing compared to pre-social-distancing, whereas typical alcohol quantity/frequency was not significantly different between timeframes. Indices of alcohol demand were variable with regard to change. Mediation analyses suggested significant indirect effects of reduced environmental reward on drinking quantity/frequency via increased depressive symptoms and coping motives, and a significant indirect effect of COVID-related distress on alcohol quantity/frequency via coping motives for drinking. Results provide early evidence regarding the relation of psychological distress with alcohol consumption and coping motives during the early weeks of the COVID-19 pandemic. Moreover, results largely converged with predictions from self-medication and environmental reinforcement theories. Future research will be needed to study prospective associations among these outcomes.

