expert ratings
Recently Published Documents

TOTAL DOCUMENTS: 164 (FIVE YEARS: 49)
H-INDEX: 20 (FIVE YEARS: 3)

2021 ◽  
pp. 001872672110709
Author(s):  
Yuri S. Scharp ◽  
Arnold B. Bakker ◽  
Kimberley Breevaart ◽  
Kaspar Kruup ◽  
Andero Uusberg

Drawing on the play and work design literatures, we conceptualize and validate an instrument to measure playful work design (PWD) – the proactive cognitive-behavioral orientation employees adopt to incorporate play into their work activities to promote fun and challenge. Across three studies (N = 1006), we developed a reliable scale with a two-dimensional factor structure. In Study 1, we used expert ratings and iterative exploratory factor analyses to develop an instrument that measures (1) designing fun and (2) designing competition. Study 1 also provides evidence for the convergent and divergent validity of the subscales as well as their distinctiveness. Specifically, PWD was indicative of proactivity as well as play: designing fun correlated especially with ludic traits (i.e., traits focused on deriving fun, e.g., humor), whereas designing competition correlated particularly with agonistic traits (i.e., traits focused on deriving challenge, e.g., competitiveness). Study 2 cross-validated the two-factor structure, further investigated the nomological net of PWD, and showed that PWD is distinct from job crafting. Finally, Study 3 examined the predictive and incremental validity of the PWD instrument using self- and colleague ratings collected two weeks apart. Taken together, the results suggest that the instrument can advance our understanding of play initiated by employees during work.


Author(s):  
Vladimir Ivanov ◽  
Valery Solovyev

Concrete and abstract words are used in a growing number of psychological and neurophysiological studies. For a few languages, large concreteness dictionaries have been created manually, a very time-consuming and costly process. To generate large, high-quality dictionaries of concrete/abstract words automatically, one needs to extrapolate the expert assessments obtained on smaller samples. The research question that arises is how small such samples can be while still supporting good extrapolation. In this paper, we present a method for automatically ranking the concreteness of words and propose an approach that significantly decreases the amount of expert assessment required. The method has been evaluated on a large test set for English. The quality of the constructed dictionaries is comparable to that of expert-built ones, and the correlation between predicted and expert ratings is higher than that achieved by state-of-the-art methods.
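As a toy illustration of the extrapolation idea (a minimal sketch under assumptions of my own, not the authors' method): ratings from a small expert-rated seed set can be propagated to unrated words via k-nearest-neighbour regression over word vectors. The 2-d vectors and the word list below are invented stand-ins for real embeddings.

```python
# Sketch: extrapolate expert concreteness ratings (1-5 scale) from a
# small rated seed set to unrated words via k-nearest neighbours.

def knn_concreteness(seed, vectors, word, k=2):
    """Predict a rating for `word` as the mean rating of its k nearest
    rated neighbours under Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    ranked = sorted(seed, key=lambda w: dist(vectors[w], vectors[word]))
    return sum(seed[w] for w in ranked[:k]) / k

# Toy data: expert ratings for a tiny seed set, toy 2-d "embeddings".
seed = {"table": 4.9, "stone": 4.8, "truth": 1.4, "idea": 1.5}
vectors = {
    "table": (0.9, 0.1), "stone": (0.8, 0.2),
    "truth": (0.1, 0.9), "idea": (0.2, 0.8),
    "chair": (0.85, 0.15),    # unrated, concrete-like vector
    "justice": (0.15, 0.85),  # unrated, abstract-like vector
}

print(knn_concreteness(seed, vectors, "chair"))    # near the concrete seeds
print(knn_concreteness(seed, vectors, "justice"))  # near the abstract seeds
```

In practice the sample-size question the abstract raises amounts to how large `seed` must be before such predictions correlate well with held-out expert ratings.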


2021 ◽  
pp. 136216882110585
Author(s):  
Timothy Doe

Language learning activities involving time-pressured repetition of similar content have been shown to facilitate improvements in fluency. However, concerns have been voiced about whether these gains might be offset by reduced grammatical accuracy. This descriptive study tracked the oral proficiency of 32 Japanese university students enrolled in English as a foreign language (EFL) classes over one academic semester, during which they regularly completed 3/2/1 fluency development activities. Measures of complexity, accuracy, and fluency (CAF) were analysed to investigate whether any developmental patterns could be identified. The results indicated that over the semester the students made small but significant gains in two fluency measures: mean length of pause and phonation-time ratio. Despite the relatively small size of these gains, expert ratings of perceived fluency suggested that the improvements were detectable to the human ear. Furthermore, a significant relationship emerged between three of the four CAF measures over the semester. These results suggest that the activities moderately improved students’ speaking fluency without negatively affecting accuracy or complexity; however, further longitudinal research is needed to determine which factors might influence this development, as class performance measures did not account for any of the variation detected.
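The two utterance-fluency measures named above can be computed from a pause-annotated recording. The sketch below is illustrative only (the segmentation scheme and toy durations are assumptions, not the study's procedure):

```python
# Compute phonation-time ratio and mean length of pause from a list of
# ("speech" | "pause", duration_in_seconds) segments.

def fluency_measures(segments):
    speech = sum(s for kind, s in segments if kind == "speech")
    pauses = [s for kind, s in segments if kind == "pause"]
    total = speech + sum(pauses)
    return {
        "phonation_time_ratio": speech / total,           # speaking time / total time
        "mean_length_of_pause": sum(pauses) / len(pauses),
    }

segments = [("speech", 3.0), ("pause", 0.5), ("speech", 2.5),
            ("pause", 1.0), ("speech", 3.0)]
m = fluency_measures(segments)
print(m)
```

A gain in fluency then shows up as the phonation-time ratio rising and the mean pause length falling across the semester's recordings.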


2021 ◽  
pp. 1-14
Author(s):  
Kristen Edwards ◽  
Aoran Peng ◽  
Scarlett Miller ◽  
Faez Ahmed

Abstract A picture is worth a thousand words, and in design metric estimation, a word may be worth a thousand features. Pictures are awarded this worth because they encode a plethora of information. When evaluating designs, we aim to capture a range of information, including usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes their evaluation difficult. Still, many attempts have been made and metrics developed to do so, because design evaluation is integral to the creation of novel solutions. The most common metrics used are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the “gold standard,” it relies on using expert ratings, making CAT expensive and time-consuming. Comparatively, SVS is less resource-demanding, but often criticized as lacking sensitivity and accuracy. We utilize the complementary strengths of both methods through machine learning. This study investigates the potential of machine learning to predict expert creativity assessments from non-expert survey results. The SVS method results in a text-rich dataset about a design. We utilize these textual design representations and the deep semantic relationships that natural language encodes to predict more desirable design metrics, including CAT metrics. We demonstrate the ability of machine learning models to predict design metrics from the design itself and SVS survey information. We show that incorporating natural language processing improves prediction results across design metrics, and that clear distinctions in the predictability of certain metrics exist.
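As a concrete illustration of the text-to-metric idea (a toy sketch under my own assumptions, not the paper's models or data): score each word by the mean expert rating of the training descriptions it appears in, then score a new description by averaging its known words. A real pipeline would use learned embeddings and regressors rather than this bag-of-words baseline.

```python
# Toy baseline: predict an expert-style design rating from free-text
# design descriptions via per-word mean ratings.
from collections import defaultdict

def train(descriptions):
    """descriptions: list of (text, expert_rating). Returns word scores."""
    totals, counts = defaultdict(float), defaultdict(int)
    for text, rating in descriptions:
        for word in set(text.lower().split()):
            totals[word] += rating
            counts[word] += 1
    return {w: totals[w] / counts[w] for w in totals}

def predict(scores, text, default=3.0):
    """Average the scores of known words; fall back to a neutral default."""
    known = [scores[w] for w in text.lower().split() if w in scores]
    return sum(known) / len(known) if known else default

train_data = [  # hypothetical (description, expert rating) pairs
    ("novel folding chair design", 4.5),
    ("standard chair with legs", 2.0),
    ("novel collapsible table", 4.0),
]
scores = train(train_data)
print(predict(scores, "novel chair"))
```

Even this crude model makes the paper's point visible: the words non-experts use carry signal about the ratings experts assign.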


Author(s):  
Alina Alwast ◽  
Katrin Vorhölter

Abstract Teaching mathematical modeling is a demanding task. Thus, fostering teachers’ competencies in this regard is an essential component of teacher education. Recent conceptualizations of teachers’ competencies include situation-specific skills based on the concept of noticing, which is of particular interest for the spontaneous reactions needed when teaching mathematical modeling. The study described in this paper aims to analyze the development of a video-based instrument for measuring teachers’ noticing competencies within a mathematical modeling context and obtain evidence for the validity of the instrument. Three kinds of validity are examined in three different studies: content validity, elemental validity and construct validity. Indicators for content validity could be found through different expert ratings and implementation with the target group, where participants were able to perceive all relevant aspects. The qualitative analysis of participants’ reasoning, which is consistent with the coded level, indicates elemental validity. Moreover, the results of the confirmatory factor analysis suggest construct validity with one overall factor of noticing competence within a mathematical modeling context. Taken together, these studies imply a satisfactory validity of the video-based instrument.


2021 ◽  
Vol 12 ◽  
Author(s):  
Namkje Koudenburg ◽  
Henk A. L. Kiers ◽  
Yoshihisa Kashima

Opinion polarization is increasingly becoming an issue in today’s society, producing both unrest at the societal level and conflict within small-scale communication between people of opposite opinions. Opinion polarization is often conceptualized as the direct opposite of agreement and consequently operationalized as an index of dispersion. In doing so, however, researchers fail to account for the bimodality that is characteristic of a polarized opinion distribution. A valid measurement of opinion polarization would enable us to predict when, and on what issues, conflict may arise. The current study is aimed at developing and validating a new index of opinion polarization. The weights of this index were derived from the knowledge of 58 international experts on polarization, collected through an expert survey. The resulting Opinion Polarization Index predicted expert polarization scores in opinion distributions better than common measures of polarization, such as the standard deviation, Van der Eijk’s polarization measure, and Esteban and Ray’s polarization index. We reflect on the use of expert ratings for the development of measurements, in this case and more generally.
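The problem with dispersion indices can be shown with toy numbers (my own example, not the paper's index): two opinion distributions over a 5-point scale can have identical standard deviations even though one is smoothly single-peaked while the other pushes mass out to the two opposing extremes.

```python
# Two differently shaped opinion distributions with equal dispersion.

def mean_sd(freqs):
    """freqs[i] = number of respondents choosing option i+1."""
    total = sum(freqs)
    probs = [f / total for f in freqs]
    mean = sum((i + 1) * p for i, p in enumerate(probs))
    var = sum(p * ((i + 1) - mean) ** 2 for i, p in enumerate(probs))
    return mean, var ** 0.5

moderate = [10, 20, 40, 20, 10]  # broad, single-peaked disagreement
polarized = [15, 0, 70, 0, 15]   # mass pushed to the opposing extremes

m1, sd1 = mean_sd(moderate)
m2, sd2 = mean_sd(polarized)
print(sd1, sd2)  # identical standard deviations, different shapes
```

A bimodality-aware index of the kind the study develops is designed to separate exactly such cases, which the standard deviation treats as equivalent.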


2021 ◽  
pp. 002242942110446
Author(s):  
Erkki Huovinen ◽  
Aaro Keipi

Studies in musical improvisation show that musicians and even children are able to communicate intended emotions to listeners at will. To understand emotional expressivity in music as an art form, communicative success needs to be related to improvisers’ thought processes and listeners’ aesthetic judgments. In the present study, we used retrospective verbal protocols to address college music students’ strategies in improvisations based on emotion terms. We also subjected their improvisations to expert ratings in terms of heard emotional content and aesthetic value. A qualitative analysis showed that improvisers used both generative strategies (expressible in intramusical terms) and imaginative, extramusical strategies when approaching the improvisation tasks. The clarity of emotional communication was found to be high overall, and linear mixed-effects models showed that it was supported by generative approaches. However, perceived aesthetic value was unrelated to such emotional clarity. Instead, aesthetic value was associated with emotional complexity, here defined as the heard presence of “nonintended” emotions. The results point toward a view according to which the expressive content of improvisation gets specified and personalized during the very act of improvisation itself. Arguably, musical expressivity in improvisation should not be equated with the error-free communication of previously intended emotional categories.


Author(s):  
Anne-Marie Steingräber ◽  
Nick Tübben ◽  
Niels Brinkmann ◽  
Felix Finkeldey ◽  
Slava Migutin ◽  
...  

Abstract The service of specialized and special forces of the Federal Armed Forces and police is characterized by complex situations. Such personnel often face numerous difficulties and extreme danger and experience periods of high stress when fulfilling their tasks. In the context of social and technological changes, it is necessary to explore the individual components of stress management in further detail, i.e., stress prevention, stress control, and stress coping mechanisms, and furthermore to consider these elements in the fields of training and service. For this purpose, a stress management model was created based on participant observations, expert ratings, and problem-centered interviews with specialized members of military police and special police forces. The results of the validation can be interpreted as suggesting that effective stress management requires a diverse range of techniques and methods, including the use of digital means such as e-learning, digital reality, and eye tracking, in order to be able to address new demands appropriately.


2021 ◽  
Author(s):  
Kristen M. Edwards ◽  
Aoran Peng ◽  
Scarlett R. Miller ◽  
Faez Ahmed

Abstract A picture is worth a thousand words, and in design metric estimation, a word may be worth a thousand features. Pictures are awarded this worth because of their ability to encode a plethora of information. When evaluating designs, we aim to capture a range of information as well, including usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes their evaluation difficult. Despite this, many attempts have been made and metrics developed to do so, because design evaluation is integral to innovation and the creation of novel solutions. The most common metrics used are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the “gold standard,” it relies heavily on expert ratings as a basis for judgment, making CAT expensive and time-consuming. Comparatively, SVS is less resource-demanding, but it is often criticized as lacking sensitivity and accuracy. We aim to take advantage of the distinct strengths of both methods through machine learning. More specifically, this study investigates the possibility of using machine learning to facilitate automated creativity assessment. The SVS method results in a text-rich dataset about a design. In this paper we utilize these textual design representations, and the deep semantic relationships that words and sentences encode, to predict more desirable design metrics, including CAT metrics. We demonstrate the ability of machine learning models to predict design metrics from the design itself and SVS survey information. We demonstrate that incorporating natural language processing (NLP) improves prediction results across all of our design metrics, and that clear distinctions in the predictability of certain metrics exist. Our code and additional information about our work are available at http://decode.mit.edu/projects/nlp-design-eval/.


2021 ◽  
Author(s):  
Benjamin T Kaveladze ◽  
Akash R Wasil ◽  
John B Bunyi ◽  
Veronica Ramirez ◽  
Stephen M Schueller

BACKGROUND: User experience and engagement are critical to mental health apps’ ability to support users. However, limited work has examined the relationship between user experience, engagement, and app popularity. Given that apps vary immensely in their popularity, understanding why some mental health apps are more appealing or engaging to users can inform efforts to develop better apps.

OBJECTIVE: We aimed to examine relationships between user experience, engagement, and popularity by examining links between subjective measures of user experience and objective measures of app popularity and engagement.

METHODS: We conducted a pre-registered secondary data analysis of 56 mental health apps. To measure user experience, we used expert ratings on the Mobile App Rating Scale (MARS) and consumer ratings from the Apple App Store and Google Play Store. To measure engagement, we acquired estimates of monthly active users (MAU) and user retention. To measure app popularity, we used download count, total app revenue, and MAU.

RESULTS: The MARS total score was significantly and positively correlated with app-level revenue (T = 0.30, P = .002), MAU (T = 0.39, P < .001), and downloads (T = 0.41, P < .001). However, neither the MARS total score nor any of its subscales (Engagement, Functionality, Aesthetics, or Information) was significantly correlated with user retention at 1, 7, or 30 days after download, and the MARS total score was not significantly correlated with app store rating.

CONCLUSIONS: Popular mental health apps receive better user experience ratings than less popular ones. However, user experience (as operationalized by the MARS) does not predict sustained engagement with mental health apps. Collaboration between industry and academic teams may better advance a science of engagement and help make mental health apps more effective and appealing.
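Assuming the reported T values are Kendall rank correlations (a common choice for skewed app-level data; the abstract does not say explicitly), the statistic can be sketched in a few lines. The app numbers below are invented, not the study's data.

```python
# Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs.
from itertools import combinations

def kendall_tau(x, y):
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

mars_scores = [3.1, 4.2, 2.8, 4.8, 3.9]   # hypothetical MARS totals
downloads = [1e4, 5e4, 8e3, 2e5, 6e4]     # hypothetical download counts

print(kendall_tau(mars_scores, downloads))
```

Rank correlations like this only ask whether better-rated apps tend to be more downloaded, which suits popularity data whose raw values span orders of magnitude.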

