Criterion validity, test-retest reliability and sensitivity to change of the St George urinary incontinence score

2004 ◽  
Vol 93 (3) ◽  
pp. 331-335 ◽  
Author(s):  
A.L. Blackwell ◽  
W. Yoong ◽  
K.H. Moore

BMJ Open ◽
2018 ◽  
Vol 8 (10) ◽  
pp. e021734 ◽  
Author(s):  
Alison Griffiths ◽  
Rachel Toovey ◽  
Prue E Morgan ◽  
Alicia J Spittle

Objective: Gross motor assessment tools have a critical role in identifying, diagnosing and evaluating motor difficulties in childhood. The objective of this review was to systematically evaluate the psychometric properties and clinical utility of gross motor assessment tools for children aged 2–12 years.

Method: A systematic search of MEDLINE, Embase, CINAHL and AMED was performed between May and July 2017. Methodological quality was assessed with the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) checklist, and an outcome measures rating form was used to evaluate the reliability, validity and clinical utility of the assessment tools.

Results: Seven assessment tools from 37 studies/manuals met the inclusion criteria: Bayley Scale of Infant and Toddler Development-III (Bayley-III), Bruininks-Oseretsky Test of Motor Proficiency-2 (BOT-2), Movement Assessment Battery for Children-2 (MABC-2), McCarron Assessment of Neuromuscular Development (MAND), Neurological Sensory Motor Developmental Assessment (NSMDA), Peabody Developmental Motor Scales-2 (PDMS-2) and Test of Gross Motor Development-2 (TGMD-2). Methodological quality varied from poor to excellent. Validity and internal consistency varied from fair to excellent (α=0.5–0.99). The Bayley-III, NSMDA and MABC-2 have evidence of predictive validity. Test–retest reliability is excellent in the BOT-2 (intraclass correlation coefficient (ICC)=0.80–0.99), PDMS-2 (ICC=0.97), MABC-2 (ICC=0.83–0.96) and TGMD-2 (ICC=0.81–0.92). The TGMD-2 has the highest inter-rater (ICC=0.88–0.93) and intrarater reliability (ICC=0.92–0.99).

Conclusions: The majority of gross motor assessments for children have good to excellent validity. Test–retest reliability is highest in the BOT-2, MABC-2, PDMS-2 and TGMD-2. The Bayley-III has the best predictive validity at 2 years of age for later motor outcome. None of the assessment tools demonstrates good evaluative validity. Further research on evaluative gross motor assessment tools is urgently needed.
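The review above reports test–retest, inter-rater and intrarater reliability as intraclass correlation coefficients (ICCs). As a rough illustration of how such a coefficient is obtained, here is a minimal sketch of a two-way random-effects, absolute-agreement ICC(2,1) computed from ANOVA mean squares; the sample data are hypothetical and not drawn from any of the cited studies.

```python
import numpy as np

def icc_2_1(scores):
    """Two-way random-effects, absolute-agreement ICC(2,1).

    scores: (n_subjects, k_trials) array of repeated measurements.
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)   # per-subject means
    col_means = scores.mean(axis=0)   # per-trial (or per-rater) means
    # Mean squares from the two-way ANOVA decomposition
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # between subjects
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # between trials
    sse = np.sum((scores - row_means[:, None] - col_means[None, :] + grand) ** 2)
    mse = sse / ((n - 1) * (k - 1))                        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical example: 5 children assessed twice on the same motor test
data = [[10, 11], [14, 13], [8, 9], [12, 12], [15, 14]]
print(round(icc_2_1(data), 2))  # → 0.93
```

Published studies report one of several ICC forms (one-way vs two-way, consistency vs absolute agreement), so the model underlying any given coefficient in the review may differ from this sketch.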


2019 ◽  
Author(s):  
Stephanie A Maganja ◽  
David C Clarke ◽  
Scott A Lear ◽  
Dawn C Mackey

Background: To assess whether commercial-grade activity monitors are appropriate for measuring step counts in older adults, it is essential to evaluate their measurement properties in this population.

Objective: This study aimed to evaluate the test-retest reliability and criterion validity of step counting in older adults with self-reported intact and limited mobility from 6 commercial-grade activity monitors: Fitbit Charge, Fitbit One, Garmin vívofit 2, Jawbone UP2, Misfit Shine, and New-Lifestyles NL-1000.

Methods: For test-retest reliability, participants completed two 100-step overground walks at a usual pace while wearing all monitors. We tested the effects of the activity monitor and mobility status on the absolute difference in step count error (%) and computed the standard error of measurement (SEM) between repeat trials. To assess criterion validity, participants completed two 400-meter overground walks at a usual pace while wearing all monitors. The first walk was continuous; the second walk incorporated interruptions to mimic the conditions of daily walking. Criterion step counts came from a researcher tally count. We estimated the effects of the activity monitor, mobility status, and walk interruptions on step count error (%). We also generated Bland-Altman plots and conducted equivalence tests.

Results: A total of 36 individuals participated (n=20 with intact mobility and n=16 with limited mobility; 19/36, 53% female), with a mean age of 71.4 (SD 4.7) years and a mean BMI of 29.4 (SD 5.9) kg/m². Considering test-retest reliability, there was an effect of the activity monitor (P<.001). The Fitbit One (1.0%, 95% CI 0.6% to 1.3%), the New-Lifestyles NL-1000 (2.6%, 95% CI 1.3% to 3.9%), and the Garmin vívofit 2 (6.0%, 95% CI 3.2% to 8.8%) had the smallest mean absolute differences in step count errors. The SEM values ranged from 1.0% (Fitbit One) to 23.5% (Jawbone UP2). Regarding criterion validity, all monitors undercounted steps. Step count error was affected by the activity monitor (P<.001) and walk interruptions (P=.02). Three monitors had small mean step count errors: Misfit Shine (−1.3%, 95% CI −19.5% to 16.8%), Fitbit One (−2.1%, 95% CI −6.1% to 2.0%), and New-Lifestyles NL-1000 (−4.3%, 95% CI −18.9% to 10.3%). Mean step count error was larger during interrupted walking than continuous walking (−5.5% vs −3.6%; P=.02). Bland-Altman plots showed nonsystematic bias and small limits of agreement for the Fitbit One and Jawbone UP2. Mean step count error lay within an equivalence bound of ±5% for the Fitbit One (P<.001) and Misfit Shine (P=.001).

Conclusions: Test-retest reliability and criterion validity of step counting varied across the 6 consumer-grade activity monitors worn by older adults with self-reported intact and limited mobility. Walk interruptions increased step count error for all monitors, whereas mobility status did not affect it. The hip-worn Fitbit One was the only monitor with both high test-retest reliability and criterion validity.
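Several of the quantities in this abstract (percent step-count error against a tally count, SEM between repeat trials, and Bland-Altman bias with 95% limits of agreement) are straightforward to compute. A minimal sketch, assuming NumPy and an SEM formulation based on the SD of within-subject differences (the paper's exact formula may differ); all data shown are hypothetical:

```python
import numpy as np

def step_error_pct(device, criterion):
    """Per-walk step-count error (%) relative to the criterion tally count."""
    device = np.asarray(device, dtype=float)
    criterion = np.asarray(criterion, dtype=float)
    return 100.0 * (device - criterion) / criterion

def bland_altman(errors):
    """Mean bias and 95% limits of agreement for a set of error scores."""
    errors = np.asarray(errors, dtype=float)
    bias = errors.mean()
    sd = errors.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def sem_from_repeats(trial1, trial2):
    """Standard error of measurement from two repeated trials:
    SD of the within-subject differences divided by sqrt(2)."""
    diffs = np.asarray(trial1, dtype=float) - np.asarray(trial2, dtype=float)
    return diffs.std(ddof=1) / np.sqrt(2)

# Hypothetical monitor readings for one 100-step walk
print(step_error_pct([95], [100])[0])        # → -5.0 (monitor undercounts by 5%)

# Hypothetical per-participant error scores (%) for one monitor
bias, lo, hi = bland_altman([-1, -3, -2, -4, -5])
```

The equivalence tests reported above would additionally check whether the confidence interval for the mean error falls entirely inside the ±5% bound (a TOST-style procedure), which this sketch omits.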


Author(s):  
Amel Tayech ◽  
Mohamed A. Mejri ◽  
Helmi Chaabene ◽  
Mehdi Chaouachi ◽  
David G. Behm ◽  
...  

2019 ◽  
Vol 46 (1) ◽  
pp. 67-75 ◽  
Author(s):  
Samuel F. Whitley ◽  
Yojanna Cuenca-Carlino

Many schools attempt to identify and serve students at risk for poor mental health outcomes within a multi-tiered system of support (MTSS). Universal screening within an MTSS requires technically adequate tools. The Social, Academic, and Emotional Behavior Risk Screener (SAEBRS) has been put forth as a technically adequate screener. Researchers have examined the factor structure, diagnostic accuracy, criterion validity, and internal consistency of SAEBRS data. However, previous research has not examined its temporal stability or replicated the criterion validity results with a racially/ethnically diverse urban elementary school sample. This study examined the test–retest reliability, convergent validity, and predictive validity of teacher-completed SAEBRS ratings for a racially/ethnically diverse group of students enrolled in first through fifth grade at an urban elementary school. Reliability analyses yielded significant test–retest reliability coefficients across four weeks for all SAEBRS scales. Furthermore, nonsignificant paired-samples t tests were observed, with the exception of the third-grade Emotional subscale. Validity analyses yielded significant concurrent and predictive Pearson correlation coefficients between SAEBRS ratings, oral reading fluency, and office discipline referrals. Limitations and implications of the results are discussed.
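The reliability statistics in this abstract reduce to two familiar computations: a Pearson correlation between time-1 and time-2 ratings, and a paired-samples t statistic on their differences (a nonsignificant t suggests no systematic shift between administrations). A minimal sketch, assuming only NumPy; the ratings below are hypothetical, not SAEBRS data:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two sets of ratings."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

def paired_t(x, y):
    """Paired-samples t statistic for time-1 vs time-2 ratings."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d.mean() / (d.std(ddof=1) / np.sqrt(d.size)))

# Hypothetical screener scores for the same students at two time points
time1 = [12, 15, 9, 14, 11, 16]
time2 = [13, 14, 10, 14, 12, 15]
r = pearson_r(time1, time2)   # test-retest reliability coefficient
t = paired_t(time1, time2)    # checks for a systematic change in scores
```

In practice the significance of r and t would be assessed against the t distribution with the appropriate degrees of freedom, which this sketch leaves out.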


2018 ◽  
Vol 21 (12) ◽  
pp. 1268-1273 ◽  
Author(s):  
Sean F. Mungovan ◽  
Paula J. Peralta ◽  
Gregory C. Gass ◽  
Aaron T. Scanlan
