Suprathreshold perceptual decisions constrain models of confidence
Perceptual confidence is an important internal signal about the certainty of our decisions and there is a substantial debate on how it is computed. We highlight three confidence metric types from the literature: observers either use 1) the full probability distribution to compute probability correct (Probability metrics), 2) point estimates from the perceptual decision process to estimate uncertainty (Evidence-Strength metrics), or 3) heuristic confidence from stimulus-based cues to uncertainty (Heuristic metrics). These metrics are rarely tested against one another, so we examined models of all three types on a suprathreshold spatial discrimination task. Observers were shown a cloud of dots sampled from a dot generating distribution and judged if the mean of the distribution was left or right of centre. In addition to varying the horizontal position of the mean, there were two sensory uncertainty manipulations: the number of dots sampled and the spread of the generating distribution. After every two perceptual decisions, observers made a confidence forced-choice judgement whether they were more confident in the first or second decision. Model results showed that observers were on average best-fit by a Heuristic model that used dot cloud position, spread, and number of dots as cues. However, almost half of the observers were best-fit by an Evidence-Strength model that uses the distance between the discrimination criterion and a point estimate, scaled according to sensory uncertainty, to compute confidence. This signal-to-noise ratio model outperformed the standard unscaled distance from criterion model favoured by many researchers and suggests that this latter simple model may not be suitable for mixed-difficulty designs. An accidental repetition of some sessions also allowed for the measurement of confidence agreement for identical pairs of stimuli. This N-pass analysis revealed that human observers were more consistent than their best-fitting model would predict, indicating there are still aspects of confidence that are not captured by our model. As such, we propose confidence agreement as a useful technique for computational studies of confidence. Taken together, these findings highlight the idiosyncratic nature of confidence computations for complex decision contexts and the need to consider different potential metrics and transformations in the confidence computation.