Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them

2020 ◽  
Vol 3 (4) ◽  
pp. 456-465
Author(s):  
Jessica Kay Flake ◽  
Eiko I. Fried

In this article, we define questionable measurement practices (QMPs) as decisions researchers make that raise doubts about the validity of the measures, and ultimately the validity of study conclusions. Doubts arise for a host of reasons, including a lack of transparency, ignorance, negligence, or misrepresentation of the evidence. We describe the scope of the problem and focus on how transparency is a part of the solution. A lack of measurement transparency makes it impossible to evaluate potential threats to internal, external, statistical-conclusion, and construct validity. We demonstrate that psychology is plagued by a measurement schmeasurement attitude: QMPs are common, hide a stunning source of researcher degrees of freedom, and pose a serious threat to cumulative psychological science, but are largely ignored. We address these challenges by providing a set of questions that researchers and consumers of scientific research can consider to identify and avoid QMPs. Transparent answers to these measurement questions promote rigorous research, allow for thorough evaluations of a study’s inferences, and are necessary for meaningful replication studies.


2020 ◽  
Author(s):  
Mairead Shaw ◽  
Leonie Johanna Rosina Cloos ◽  
Raymond Luong ◽  
Sasha Elbaz ◽  
Jessica Kay Flake

Validity of measurement is integral to the interpretability of research endeavours and any subsequent replication attempts. To assess current measurement practices and the construct validity of measures in large-scale replication studies, we conducted a systematic review of measures used in Many Labs 2: Investigating Variation in Replicability Across Samples and Settings (Klein et al., 2018). To evaluate the psychometric properties of the scales used in Many Labs 2, we conducted factor and reliability analyses on the publicly available data. We report that measures in Many Labs 2 were often short, with little validity evidence reported in the original study; that measures with more validity evidence in the original study had stronger psychometric properties in the replication sample; and that translated versions of scales had lower reliability. We discuss the implications of these findings for interpreting replication results and make recommendations to improve measurement practices in future replications.
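For readers who want to run comparable checks on other replication datasets, the sketch below is a minimal, self-contained illustration (not the authors' analysis code) of the reliability side of such a psychometric evaluation: it computes Cronbach's alpha and corrected item-total correlations from item-level data. The Python/pandas setup, the simulated five-item scale, and all variable names are assumptions for demonstration only.

```python
# Minimal sketch (not the authors' pipeline): estimate internal consistency
# for a multi-item scale from item-level replication data.
# Assumptions: Python with numpy/pandas; `items` holds one column per item.
import numpy as np
import pandas as pd


def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total score)."""
    data = items.dropna()                      # listwise deletion, for simplicity
    k = data.shape[1]                          # number of items
    item_vars = data.var(axis=0, ddof=1)       # per-item variances
    total_var = data.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)


def item_total_correlations(items: pd.DataFrame) -> pd.Series:
    """Corrected item-total correlation: each item vs. the sum of the remaining items."""
    data = items.dropna()
    total = data.sum(axis=1)
    return pd.Series({col: data[col].corr(total - data[col]) for col in data.columns})


if __name__ == "__main__":
    # Hypothetical five-item scale scored 1-7; replace with real item columns.
    rng = np.random.default_rng(0)
    base = rng.normal(size=200)
    items = pd.DataFrame(
        {f"item_{i}": np.clip(np.round(4 + base + rng.normal(scale=1.0, size=200)), 1, 7)
         for i in range(1, 6)}
    )
    print(f"alpha = {cronbach_alpha(items):.2f}")
    print(item_total_correlations(items).round(2))
```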


2021 ◽  
Author(s):  
Jessica Kay Flake ◽  
Mairead Shaw ◽  
Raymond Luong

Yarkoni describes a grim state of psychological science in which the gross misspecification of our models and the specificity of our operationalizations produce claims so narrow in generality that no one would be interested in them. We consider this an issue of the generalizability of construct validity and discuss how construct validation research should precede large-scale replication research. We provide ideas for a path forward by suggesting psychologists take a few steps back. By retooling large-scale replication studies, psychologists can execute the descriptive research needed to assess the generalizability of constructs. We provide examples of reusing large-scale replication data to conduct construct validation research post hoc. We also discuss proof-of-concept research that is ongoing at the Psychological Science Accelerator. Big team psychology makes large-scale construct validity and generalizability research feasible and worthwhile. We assert that no one needs to quit the field; in fact, there is plenty of work to do. The optimistic interpretation is that if psychologists focus less on generating new ideas and more on organizing, synthesizing, measuring, and assessing constructs from existing ideas, we can keep busy for at least 100 years.


2021 ◽  
Author(s):  
Dag Sjøberg ◽  
Gunnar Bergersen

Empirical research aims to establish generalizable claims from data. Such claims involve concepts that often must be measured indirectly by using indicators. Construct validity is concerned with whether one can justifiably make claims at the conceptual level that are supported by results at the operational level. We report a quantitative analysis of the awareness of construct validity in the software engineering literature between 2000 and 2019 and a qualitative review of 83 articles about human-centric experiments published in five high-quality journals between 2015 and 2019. Over the two decades, the appearance in the literature of the term construct validity increased sevenfold. Some of the articles we reviewed employed various ways to ensure that the indicators span the concept in an unbiased manner. We also found articles that reuse formerly validated constructs. However, the articles disagree about how to define construct validity. Several interpret construct validity excessively broadly by including threats to internal, external, or statistical conclusion validity. A few articles also include fundamental challenges of a study, such as cheating and misunderstandings of experiment material. The diversity of topics discussed leads us to recommend a minimalist approach to construct validity. We propose seven guidelines to establish a common ground for addressing construct validity in software engineering.


2020 ◽  
Author(s):  
Cooper Hodges ◽  
Hannah Michelle Lindsey ◽  
Paula Johnson ◽  
Bryant M Stone ◽  
James Carter

The replication crisis within the social and behavioral sciences has called into question the consistency of research methodology. A lack of attention to minor details in replication studies may limit researchers’ abilities to reproduce the results. One such overlooked detail is the statistical programs used to analyze the data. In the current investigation, we compared the results of several nonparametric analyses and measures of normality conducted on a large sample of data in SPSS, SAS, Stata, and R with results obtained through hand calculation using the raw computational formulas. Multiple inconsistencies were found in the results produced between statistical packages due to algorithmic variation, computational error, and a lack of clarity and/or specificity in the statistical output generated. We also highlight similar inconsistencies in supplementary analyses conducted on subsets of the data, which reflect realistic sample sizes. These inconsistencies were largely due to algorithmic variations used within packages when the analyses are performed on data from small- or medium-sized samples. We discuss how such inconsistencies may influence the conclusions drawn from statistical analyses depending on the software used, and we urge researchers to analyze their data across multiple packages, report details regarding the statistical procedures used for data analysis, and consider these details when conducting direct replication studies.
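As a hedged illustration of the kind of discrepancy described above (not code or data from the study), the sketch below runs a single nonparametric test, the Mann-Whitney U, under different algorithmic options in SciPy and compares the statistic against a hand calculation from the raw rank-sum formula. The toy samples, the choice of test, and the use of Python/SciPy are assumptions made purely for demonstration; the point is that exact versus asymptotic p-value methods and continuity corrections can yield different results on small samples.

```python
# Sketch only: illustrate how algorithmic options change nonparametric results.
# Assumptions: Python with scipy/numpy; the two small samples are made up.
import numpy as np
from scipy import stats

x = np.array([12.1, 14.3, 11.8, 15.0, 13.2, 12.7])
y = np.array([10.4, 11.1, 12.9, 9.8, 10.7])

# Hand computation from the raw formula: U1 = R1 - n1*(n1 + 1)/2,
# where R1 is the sum of the ranks of sample x in the combined data.
ranks = stats.rankdata(np.concatenate([x, y]))
r1 = ranks[: len(x)].sum()
u1_by_hand = r1 - len(x) * (len(x) + 1) / 2
print(f"U (hand-calculated): {u1_by_hand:.1f}")

# Same test, different algorithmic choices within one package:
for method, use_continuity in [("exact", True), ("asymptotic", True), ("asymptotic", False)]:
    u, p = stats.mannwhitneyu(
        x, y, alternative="two-sided", method=method, use_continuity=use_continuity
    )
    print(f"method={method:>10}, continuity={use_continuity}: U={u:.1f}, p={p:.4f}")
```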


Author(s):  
Rolf A. Zwaan ◽  
Alexander Etz ◽  
Richard E. Lucas ◽  
M. Brent Donnellan

Many philosophers of science and methodologists have argued that the ability to repeat studies and obtain similar results is an essential component of science. A finding is elevated from single observation to scientific evidence when the procedures that were used to obtain it can be reproduced and the finding itself can be replicated. Recent replication attempts show that some high-profile results – most notably in psychology, but in many other disciplines as well – cannot be replicated consistently. These replication attempts have generated a considerable amount of controversy, and the issue of whether direct replications have value has, in particular, proven to be contentious. However, much of this discussion has occurred in published commentaries and social media outlets, resulting in a fragmented discourse. To address the need for an integrative summary, we review various types of replication studies and then discuss the most commonly voiced concerns about direct replication. We provide detailed responses to these concerns and consider different statistical ways to evaluate replications. We conclude there are no theoretical or statistical obstacles to making direct replication a routine aspect of psychological science.


2020 ◽  
Author(s):  
Kristoffer Klevjer ◽  
Per Pippin Aspaas

In this episode, we are exploring a student’s perspective on open science – and specifically replication studies. Kristoffer Klevjer recently finished his Master’s degree in psychology at UiT The Arctic University of Norway and has now taken on a PhD. Already as a Master’s student, Klevjer was involved in replication studies. In his experience, replication studies can be beneficial to the student, the supervisor, and the scientific community at large. Furthermore, Klevjer argues that replications can be well suited for students at the Bachelor’s level as well. In the interview, Klevjer refers to several publications and projects, including:
- The Collaborative Replications and Education Project
- Kool, W., McGuire, J. T., Rosen, Z. B., & Botvinick, M. M. (2010). Decision making and the avoidance of cognitive demand. Journal of Experimental Psychology: General, 139(4), 665–682. https://doi.org/10.1037/a0020198
- The Psychological Science Accelerator
The replication Klevjer did for his Master’s degree can be found here. First published online March 9, 2020.


2018 ◽  
Vol 41 ◽  
Author(s):  
Scott O. Lilienfeld

Zwaan et al. make a compelling case for the necessity of direct replication in psychological science. I build on their arguments by underscoring the necessity of direct replication for two domains of clinical psychological science: the evaluation of psychotherapy outcome and the construct validity of psychological measures.

