Addressing a Crisis of Generalizability with Large-Scale Construct Validation

2021 ◽  
Author(s):  
Jessica Kay Flake ◽  
Mairead Shaw ◽  
Raymond Luong

Yarkoni describes a grim state of psychological science in which the gross misspecification of our models and the specificity of our operationalizations produce claims whose generality is so narrow that no one would be interested in them. We consider this an issue of the generalizability of construct validity and discuss how construct validation research should precede large-scale replication research. We provide ideas for a path forward by suggesting psychologists take a few steps back. By retooling large-scale replication studies, psychologists can execute the descriptive research needed to assess the generalizability of constructs. We provide examples of reusing large-scale replication data to conduct construct validation research post hoc. We also discuss ongoing proof-of-concept research at the Psychological Science Accelerator. Big team psychology makes large-scale construct validity and generalizability research feasible and worthwhile. We assert that no one needs to quit the field; in fact, there is plenty of work to do. The optimistic interpretation is that if psychologists focus less on generating new ideas and more on organizing, synthesizing, measuring, and assessing constructs from existing ideas, we can keep busy for at least 100 years.

2021 ◽  
Author(s):  
Jessica Kay Flake

An increased focus on transparency and replication in science has stimulated reform in research practices and dissemination. As a result, the research culture is changing: the use of preregistration is on the rise, access to data and materials is increasing, and large-scale replication studies are more common. In this paper, I discuss two problems that the methodological reform movement, given the progress thus far, is now ready to tackle, and how educational psychology is particularly well suited to contribute. The first problem is that there is a lack of transparency and rigor in measurement development and use. The second problem is caused by the first: replication research is difficult and potentially futile as long as the first problem persists. I describe how to expand transparent practices into measure use and how construct validation can be implemented to bolster the validity of replication studies.


2020 ◽  
Vol 3 (4) ◽  
pp. 456-465
Author(s):  
Jessica Kay Flake ◽  
Eiko I. Fried

In this article, we define questionable measurement practices (QMPs) as decisions researchers make that raise doubts about the validity of the measures, and ultimately the validity of study conclusions. Doubts arise for a host of reasons, including a lack of transparency, ignorance, negligence, or misrepresentation of the evidence. We describe the scope of the problem and focus on how transparency is a part of the solution. A lack of measurement transparency makes it impossible to evaluate potential threats to internal, external, statistical-conclusion, and construct validity. We demonstrate that psychology is plagued by a measurement schmeasurement attitude: QMPs are common, hide a stunning source of researcher degrees of freedom, and pose a serious threat to cumulative psychological science, but are largely ignored. We address these challenges by providing a set of questions that researchers and consumers of scientific research can consider to identify and avoid QMPs. Transparent answers to these measurement questions promote rigorous research, allow for thorough evaluations of a study’s inferences, and are necessary for meaningful replication studies.


2019 ◽  
Author(s):  
Ulrich Schimmack

In this commentary on the state of validation research in psychology, I review Cronbach and Meehl's (1955) seminal article and point out that although the term construct validity is widely used, researchers rarely follow their recommendations. Most important, construct validation requires specification of a nomological net, which could be done with a structural equation model (SEM), and construct validity should be quantified, which could be done by means of factor loadings in an SEM measurement model.
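
As a concrete illustration of the quantification Schimmack calls for, the sketch below fits a one-factor SEM measurement model and reports its estimated loadings. It relies on the third-party semopy package (assumed installed) and on simulated item data; the model syntax, variable names, and numbers are illustrative only, not drawn from the commentary.

```python
# Minimal sketch: quantify construct validity via factor loadings in a
# one-factor SEM measurement model. semopy is assumed installed
# (pip install semopy); the item data are simulated for illustration only.
import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(2021)
n = 500
construct = rng.normal(size=n)                    # true construct scores
pop_loadings = [0.8, 0.7, 0.6, 0.5]               # population standardized loadings
data = pd.DataFrame({
    f"x{i + 1}": lam * construct + rng.normal(scale=np.sqrt(1 - lam ** 2), size=n)
    for i, lam in enumerate(pop_loadings)
})

# Measurement model in lavaan-style syntax: one latent variable, four indicators.
model = semopy.Model("construct =~ x1 + x2 + x3 + x4")
model.fit(data)
print(model.inspect())   # parameter estimates, including the factor loadings
```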


2020 ◽  
Author(s):  
Mairead Shaw ◽  
Leonie Johanna Rosina Cloos ◽  
Raymond Luong ◽  
Sasha Elbaz ◽  
Jessica Kay Flake

Validity of measurement is integral to the interpretability of research endeavours and any subsequent replication attempts. To assess current measurement practices and the construct validity of measures in large-scale replication studies, we conducted a systematic review of measures used in Many Labs 2: Investigating Variation in Replicability Across Samples and Settings (Klein et al., 2018). To evaluate the psychometric properties of the scales used in Many Labs 2, we conducted factor and reliability analyses on the publicly available data. We report that measures in Many Labs 2 were often short with little validity evidence reported in the original study, that measures with more validity evidence in the original study had stronger psychometric properties in the replication sample, and that translated versions of scales had lower reliability. We discuss the implications of these findings for interpreting replication results and make recommendations to improve measurement practices in future replications.
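
A minimal sketch of the kind of post hoc psychometric check described above: Cronbach's alpha plus a rough unidimensionality screen for one multi-item scale. The item data below are simulated stand-ins for the publicly available replication responses, so the numbers carry no substantive meaning.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a DataFrame whose columns are scale items."""
    items = items.dropna()
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances / total_variance)

# Simulated stand-in for one multi-item scale from the public replication data.
rng = np.random.default_rng(0)
true_score = rng.normal(size=400)
scale_items = pd.DataFrame(
    {f"item_{i}": true_score + rng.normal(scale=1.0, size=400) for i in range(1, 5)}
)

print(f"Cronbach's alpha = {cronbach_alpha(scale_items):.2f}")

# Rough unidimensionality check: first-to-second eigenvalue ratio of the
# inter-item correlation matrix (larger values suggest a dominant factor).
eigvals = np.sort(np.linalg.eigvalsh(scale_items.corr().to_numpy()))[::-1]
print(f"Eigenvalue ratio = {eigvals[0] / eigvals[1]:.2f}")
```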


2021 ◽  
Author(s):  
Alexander McDiarmid ◽  
Alexa Mary Tullett ◽  
Cassie Marie Whitt ◽  
Simine Vazire ◽  
Paul E. Smaldino ◽  
...  

Self-correction—a key feature distinguishing science from pseudoscience—requires that scientists update their beliefs in light of new evidence. However, people are often reluctant to change their beliefs. We examined self-correction in action, tracking research psychologists’ beliefs in psychological effects before and after the completion of four large-scale replication projects. We found that psychologists did update their beliefs; they updated as much as they predicted they would, but not as much as our Bayesian model suggests they should if they trust the results. We found no evidence that psychologists became more critical of replications when it would have preserved their pre-existing beliefs. We also found no evidence that personal investment or lack of expertise discouraged belief updating, but people higher on intellectual humility updated their beliefs slightly more. Overall, our results suggest that replication studies can contribute to self-correction within psychology, but psychologists may underweight their evidentiary value.
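
For intuition only, and not the authors' actual model, the sketch below performs a toy normal-normal Bayesian update of a belief about an effect size after a replication reports an estimate and standard error; all numbers are invented.

```python
# Toy normal-normal Bayesian update of a belief about an effect size in light
# of a large-scale replication result. Illustrative numbers only.
import math

def update_belief(prior_mean, prior_sd, rep_estimate, rep_se):
    """Posterior mean/sd for a normal prior combined with a normal likelihood."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / rep_se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * rep_estimate)
    return post_mean, math.sqrt(post_var)

# Prior belief: effect around d = 0.40; replication reports d = 0.05, SE = 0.04.
post_mean, post_sd = update_belief(0.40, 0.15, 0.05, 0.04)
print(f"posterior belief: d = {post_mean:.2f} (sd = {post_sd:.2f})")
```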


2018 ◽  
Author(s):  
Nicholas Alvaro Coles ◽  
Leonid Tiokhin ◽  
Anne M. Scheel ◽  
Peder Mortvedt Isager ◽  
Daniel Lakens

In a summary of recent discussions about the role of direct replications in psychological science, Zwaan, Etz, Lucas, and Donnellan (2017; henceforth ZELD) argue that replications should be more mainstream, and discuss six common objections to direct replication studies. We believe that the debate about the importance of replication research is essentially driven by disagreements about the value of replication studies and the best way to allocate limited resources. We suggest that a decision theory framework (Wald, 1950) can provide a tool for researchers to (a) evaluate costs and benefits in order to determine when replication studies are worthwhile, and (b) specify their assumptions in quantifiable terms, facilitating more productive discussions in which the sources of disagreement about the value of replications can be identified.
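
A toy sketch of the kind of calculation such a decision-theory framing invites: compare the expected value of running a direct replication against its cost. The utilities, probabilities, and cost below are made-up placeholders, not quantities from the paper or from Wald (1950).

```python
# Toy expected-value calculation for deciding whether a direct replication
# is worth its cost. All inputs are illustrative placeholders.
def expected_net_benefit(p_effect_real, value_confirm, value_correct_record, cost):
    """Expected benefit of running the replication minus its cost.

    p_effect_real        : researcher's probability that the original effect is real
    value_confirm        : value of confirming a real effect
    value_correct_record : value of correcting the record if the effect is not real
    cost                 : resource cost of running the replication
    """
    expected_value = (p_effect_real * value_confirm
                      + (1 - p_effect_real) * value_correct_record)
    return expected_value - cost

# Example: uncertain effect, correcting the record valued highly, moderate cost.
print(expected_net_benefit(p_effect_real=0.5, value_confirm=30,
                           value_correct_record=80, cost=40))  # > 0 -> worth running
```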


2018 ◽  
Vol 41 ◽  
Author(s):  
Duane T. Wegener ◽  
Leandre R. Fabrigar

Replications can make theoretical contributions, but are unlikely to do so if their findings are open to multiple interpretations (especially violations of psychometric invariance). Thus, just as studies demonstrating novel effects are often expected to empirically evaluate competing explanations, replications should be held to similar standards. Unfortunately, this is rarely done, thereby undermining the value of replication research.
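
As a rough illustration of why violations of psychometric invariance leave replication results open to multiple interpretations, the sketch below compares corrected item-total correlations for the same scale in a simulated "original" and "replication" sample. A full invariance test would use multi-group factor models; all data here are simulated and the degradation pattern is invented.

```python
# Crude invariance screen: do the same items relate to the scale the same way
# in the original and replication samples? If not, an apparent effect
# difference may reflect the measure rather than the phenomenon.
import numpy as np
import pandas as pd

def corrected_item_total(items: pd.DataFrame) -> pd.Series:
    """Correlation of each item with the sum of the remaining items."""
    return pd.Series(
        {c: items[c].corr(items.drop(columns=c).sum(axis=1)) for c in items.columns}
    )

rng = np.random.default_rng(1)

def simulate(n, loadings):
    theta = rng.normal(size=n)
    return pd.DataFrame(
        {f"item_{i}": lam * theta + rng.normal(size=n)
         for i, lam in enumerate(loadings, 1)}
    )

original = simulate(300, [0.8, 0.8, 0.7, 0.7])      # items work as intended
replication = simulate(300, [0.8, 0.8, 0.2, 0.2])   # two items degrade (e.g., translation)

print(pd.DataFrame({"original": corrected_item_total(original),
                    "replication": corrected_item_total(replication)}).round(2))
```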


2019 ◽  
Author(s):  
Amanda Kvarven ◽  
Eirik Strømland ◽  
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the result of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.
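
Purely to illustrate the comparison being made (with invented numbers, not the Kvarven et al. data), the sketch below contrasts bias-adjusted meta-analytic effect sizes with the corresponding replication estimates, summarizing the overestimation factor and a crude false-positive rate.

```python
# Illustrative comparison of adjusted meta-analytic estimates against
# pre-registered replication estimates. All values are made up.
import numpy as np

adjusted_meta = np.array([0.42, 0.30, 0.25, 0.18, 0.50])   # adjusted meta-analytic estimates
replication = np.array([0.20, 0.12, 0.02, 0.10, 0.28])     # replication estimates
meta_significant = np.array([True, True, True, False, True])

# Median overestimation factor among effects the replications did detect.
detected = replication > 0.05
print("overestimation factor:",
      np.median(adjusted_meta[detected] / replication[detected]))

# Crude false-positive rate: adjusted estimate significant, replication near null.
null_in_replication = ~detected
print("false-positive rate:",
      meta_significant[null_in_replication].mean())
```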

