Class 3 Classical Methods of Scale Construction October 13, 2005

Class 3 Classical Methods of Scale Construction October 13, 2005 Anita L. Stewart Institute for Health & Aging University of California, San Francisco

Readings and Homework • Homework as stated in syllabus is for the following week • Readings are relevant to the current week

Overview of class • Types of measurement scales • Rationale for multi-item measures • Scale construction methods • Error concepts

Types of Measurement Scales • Categorical (nominal) • Classification • Numbers are labels for categories • Continuous (along a continuum) • Ordinal • Interval • Ratio

Classification vs. Continuous Scores • CES-D continuous score • 20 items summed using Likert scaling methods • Range of sum is 0-60, used as continuous score in correlational studies • CES-D classification score: • Those scoring 16 or higher are “classified” as having likely depression • Referred for further screening
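The two uses of the CES-D score described above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring program: it assumes each of the 20 items has already been coded 0-3 in the direction of more depressive symptoms (an assumption consistent with the 0-60 range on the slide), and the function names and example respondent are hypothetical.

```python
from typing import Sequence

CESD_CUTOFF = 16  # from the slide: 16 or higher = likely depression


def cesd_continuous(items: Sequence[int]) -> int:
    """Sum 20 item scores, each assumed already coded 0-3 (higher = more symptoms)."""
    if len(items) != 20:
        raise ValueError("expected 20 item scores")
    if any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(items)


def cesd_classification(total: int) -> str:
    """Turn the continuous score into the two-category classification score."""
    return "likely depression" if total >= CESD_CUTOFF else "below cutoff"


# Hypothetical respondent: 12 items scored 1, 8 items scored 0
example_responses = [1] * 12 + [0] * 8
example_total = cesd_continuous(example_responses)
example_class = cesd_classification(example_total)
```

The same sum serves both purposes: kept as a number it is a continuous score, and compared with the cutoff it becomes a classification.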

Categorical (Nominal) Scales/Measures • Primary language: 1 Spanish 2 English 3 Other • Can you walk without help? 1 Yes 2 No • Numbers have no inherent meaning

Ordinal Scales: Numbers Reflect Increasing Level • Change in health: 1 Better 2 No change 3 Worse • Income: 1 < $10,000 2 $10,000 - <$20,000 3 $20,000 - <$30,000 4 >$30,000 • Numbers have no inherent meaning other than “more” or “less.”

Another Example of Ordinal Scale • How much pain did you have this past week? 1 None 2 Very mild 3 Mild 4 Moderate 5 Severe 6 Very severe

Features of Ordinal Scales • Distances between numbers are unknown and probably vary • Some levels are closer together in meaning than others • When ordinal responses indicate extent of agreement (agree, disagree), the scale is referred to as a Likert scale • “Likert scale” has since come to have other meanings in health measurement

Interval Scales • Numbers have equal intervals • A unit change is constant across the scale • Example - temperature • can add and subtract scores • a 2 unit change is the same at lower temperatures as higher temperatures

Ratio Scale • Has a meaningful zero point • Change scores have specific meaning • Scores can be multiplied as well as added and subtracted • e.g., one score can be 2 or 3 times another • Examples • Weight in pounds • Income in dollars • Number of visits

Types of Measurement Scales and Their Properties

Single- and Multi-Item Measures • Advantages of single items • Response choices are interpretable • Disadvantages • Numbers are not easily interpretable • Limited variability • Easy to get skewed distributions • Reliability is usually low • Difficult to assess a complex concept with one item

Interpretability of “Numbers” in Single Item Ordinal Scale • Is “very severe” twice as painful as “mild”?

Estimated Distance Between Levels in an Ordinal Scale (N=2,928) (0-100 scale) • Distances between adjacent levels: 9, 10, 17, 20, 16

Distance Between Levels: “In general, how would you rate your health?” • Distances between adjacent levels: 20, 26, 18, 11

Multi-Item Measures or Scales • Multi-item measures are created by combining two or more items into an overall measure or scale score

Advantages of Multi-item measures • More scale values (enhances sensitivity) • Improves score distribution (more normal) • Reduces number of variables needed to measure one concept • Improves reliability (reduces random error) • Can estimate a score if some items are missing • Enriches the concept being measured (more valid)
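One of the advantages listed above, estimating a score when some items are missing, can be sketched as follows. This is only an illustration: the rule of requiring at least half of the items to be answered is an assumption made for the example (the slides do not state a threshold), and the function name is hypothetical.

```python
from typing import Optional, Sequence


def scale_score(items: Sequence[Optional[float]],
                min_fraction: float = 0.5) -> Optional[float]:
    """Average the answered items; None marks a missing response.

    Returns None when too few items were answered. The default
    threshold of one half is an assumed rule, not one from the slides.
    """
    answered = [x for x in items if x is not None]
    if not items or len(answered) / len(items) < min_fraction:
        return None
    return sum(answered) / len(answered)


# Hypothetical respondents on a 4-item scale
mostly_complete = scale_score([4, None, 2, 3])     # 3 of 4 items answered
too_sparse = scale_score([None, None, None, 5])    # only 1 of 4 answered
```

Averaging rather than summing keeps the estimated score on the same metric as a complete score, which is why an average is used here.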

Types of Scale Construction • Summated ratings scales • Likert scaling • Utility weighting or preference-based measures (econometric scales) • Guttman scaling • Thurstone scales • Many others

Example of a 2-item Summated Ratings Scale • How much of the time .... tired? 1 - All of the time 2 - Most of the time 3 - Some of the time 4 - A little of the time 5 - None of the time • How much of the time …. full of energy? 1 - All of the time 2 - Most of the time 3 - Some of the time 4 - A little of the time 5 - None of the time

Step 1: Reverse One Item So They Are All in the Same Direction • How much of the time .... tired? 1 - All of the time 2 - Most of the time 3 - Some of the time 4 - A little of the time 5 - None of the time • How much of the time …. full of energy? 1=5 All of the time 2=4 Most of the time 3=3 Some of the time 4=2 A little of the time 5=1 None of the time • Reverse “energy” item so high score = more energy

Step 2: Sum the Two Items • How much of the time .... tired? 1 - All of the time 2 - Most of the time 3 - Some of the time 4 - A little of the time 5 - None of the time • How much of the time …. full of energy? 5 All of the time 4 Most of the time 3 Some of the time 2 A little of the time 1 None of the time • Highest = 10 (tired none of the time, full of energy all of the time) • Lowest = 2 (tired all of the time, full of energy none of the time)

Step 2: Average the Two Items • How much of the time .... tired? 1 - All of the time 2 - Most of the time 3 - Some of the time 4 - A little of the time 5 - None of the time • How much of the time …. full of energy? 5 All of the time 4 Most of the time 3 Some of the time 2 A little of the time 1 None of the time • Highest = 5.0 (tired none of the time, full of energy all of the time) • Lowest = 1.0 (tired all of the time, full of energy none of the time)

Summed or Averaged: Increase Number of Levels from 5 to 9
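The reverse-then-combine steps for the tired/energy example can be sketched in Python. The function names are hypothetical; the coding (1 = all of the time through 5 = none of the time) follows the slides.

```python
from itertools import product

N_LEVELS = 5  # response choices are coded 1..5 on both items


def reverse(score: int, n_levels: int = N_LEVELS) -> int:
    """Reverse-code a response: 1 becomes 5, 2 becomes 4, and so on."""
    return n_levels + 1 - score


def summed_score(tired: int, energy: int) -> int:
    """Sum the raw 'tired' item and the reversed 'energy' item."""
    return tired + reverse(energy)


def averaged_score(tired: int, energy: int) -> float:
    """Same information as the sum, expressed on the original 1-5 metric."""
    return summed_score(tired, energy) / 2


# Enumerate every possible pair of raw responses to see which scores can occur
all_sums = sorted({summed_score(t, e)
                   for t, e in product(range(1, N_LEVELS + 1), repeat=2)})
```

Enumerating all response pairs gives sums from 2 through 10, which is where the increase from 5 to 9 levels comes from; averaging rescales those same 9 values to the range 1.0 to 5.0.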

Summated Scales: Scaling Analyses • To create a summated scale, one needs to first test whether a set of items that appear to measure the same concept can be combined • Need to test hypothesis that the items do indeed belong together to form a single concept • Five criteria need to be met to combine items into a summated scale

Five Criteria to Meet to Qualify as a Summated Scale • Item convergence • Item discrimination • No unhypothesized dimensions • Items contribute similar proportion of information to score • Items have equal variances

First Criterion: Item Convergence • Each item correlates substantially with the total score of all items • with the item taken out or “corrected for overlap” • Typical criterion is >= .30 • for well-developed scales, often set at >= .40

Example: Analyzing Convergent Validity for Adaptive Coping Scale • Item-scale correlations • Adaptive coping (alpha = .70) • 5 Get emotional support from others .49 • 11 See it in a different light .62 • 18 Accept the reality of it .25 • 20 Find comfort in religion .58 • 13 Get comfort from someone .45 • 21 Learn to live with it .21 • 23 Pray or meditate .39 • Moody-Ayers SY et al. Prevalence and correlates of perceived societal racism in older African American adults with type 2 diabetes mellitus. J Amer Geriatr Soc, 2005, in press.

Example: Analyzing Convergent Validity for Adaptive Coping Scale • Item-scale correlations • Adaptive coping (alpha = .70) • 5 Get emotional support from others .49 • 11 See it in a different light .62 • 18 Accept the reality of it .25 (< .30) • 20 Find comfort in religion .58 • 13 Get comfort from someone .45 • 21 Learn to live with it .21 (< .30) • 23 Pray or meditate .39

Example: Analyzing Convergent Validity for Adaptive Coping Scale • Item-scale correlations • Adaptive coping (alpha = .76) • 5 Get emotional support from others .45 • 11 See it in a different light .59 • 20 Find comfort in religion .73 • 13 Get comfort from someone .45 • 23 Pray or meditate .51 • Acceptance (alpha = .67) • 21 Learn to live with it .50 • 18 Accept the reality of it .50

SAS/SPSS Make Item Convergence Analysis Easy • Reliability programs provide this • Item-scale correlations corrected for overlap • Internal consistency reliability (coefficient alpha) • Reliability with each item removed • To see effect of removing a bad item
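For readers without SAS or SPSS, the two statistics named above can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the output of any reliability program: the function names and the six-respondent data set are hypothetical, and alpha is computed with the usual formula k/(k-1) × (1 − sum of item variances / variance of the total).

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(data):
    """Correlate each item with the sum of the OTHER items (corrected for overlap)."""
    n_items = len(data[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in data]
        rest = [sum(row) - row[j] for row in data]  # total with item j taken out
        correlations.append(pearson(item, rest))
    return correlations


def cronbach_alpha(data):
    """Coefficient alpha from item variances and the variance of the summed score."""
    k = len(data[0])
    item_vars = [variance([row[j] for row in data]) for j in range(k)]
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 6 respondents (rows) by 4 items (columns);
# the fourth item was made up to be unrelated to the other three.
responses = [
    [1, 1, 2, 3],
    [2, 2, 2, 1],
    [3, 3, 3, 3],
    [4, 4, 3, 1],
    [5, 4, 5, 2],
    [5, 5, 4, 3],
]

item_scale_r = corrected_item_total(responses)
flagged = [j + 1 for j, r in enumerate(item_scale_r) if r < 0.30]  # 1-based item numbers
alpha_all = cronbach_alpha(responses)
alpha_without_item4 = cronbach_alpha([row[:3] for row in responses])
```

Comparing `alpha_all` with `alpha_without_item4` mirrors the "reliability with each item removed" output: dropping an item that fails the convergence criterion should raise alpha.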

Second Criterion: Item Discrimination • Each item correlates significantly higher with the construct it is hypothesized to measure than with other constructs • Statistical significance is determined by the standard error of the correlation • The standard error is determined by sample size

Multitrait Scaling - An Approach to Constructing Multi-item Scales • Confirms whether hypothesized item groupings can be summed into a scale score • Examines extent to which all five criteria are met • Examines resulting scales

Example: Two Subscales Being Developed • Depression and Anxiety subscales of MOS Psychological Distress measure

Example of Multitrait Scaling Matrix: Hypothesized Scales
                        ANXIETY   DEPRESSION
ANXIETY
  Nervous person          .80        .65
  Tense, high strung      .83        .70
  Anxious, worried        .78        .78
  Restless, fidgety       .76        .68
DEPRESSION
  Low spirits             .75        .89
  Downhearted             .74        .88
  Depressed               .76        .90
  Moody                   .77        .82

Example of Multitrait Scaling Matrix: Item Convergence
                        ANXIETY   DEPRESSION
ANXIETY
  Nervous person          .80*       .65
  Tense, high strung      .83*       .70
  Anxious, worried        .78*       .78
  Restless, fidgety       .76*       .68
DEPRESSION
  Low spirits             .75        .89*
  Downhearted             .74        .88*
  Depressed               .76        .90*
  Moody                   .77        .82*

Example of Multitrait Scaling Matrix: Item Discrimination
                        ANXIETY   DEPRESSION
ANXIETY
  Nervous person          .80*       .65
  Tense, high strung      .83*       .70
  Anxious, worried        .78*       .78
  Restless, fidgety       .76*       .68
DEPRESSION
  Low spirits             .75        .89*
  Downhearted             .74        .88*
  Depressed               .76        .90*
  Moody                   .77        .82*
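The item discrimination check on this matrix can be sketched in Python. The correlations are the ones shown on the slide; everything else is an assumption made for illustration: the sample size is hypothetical, the standard error of a correlation is approximated as 1/sqrt(n), and "significantly higher" is taken to mean a difference of at least two standard errors. The slides state only that significance depends on the standard error, which depends on sample size.

```python
import math

# (item, hypothesized scale, r with ANXIETY, r with DEPRESSION), from the slide
MATRIX = [
    ("Nervous person", "ANXIETY", 0.80, 0.65),
    ("Tense, high strung", "ANXIETY", 0.83, 0.70),
    ("Anxious, worried", "ANXIETY", 0.78, 0.78),
    ("Restless, fidgety", "ANXIETY", 0.76, 0.68),
    ("Low spirits", "DEPRESSION", 0.75, 0.89),
    ("Downhearted", "DEPRESSION", 0.74, 0.88),
    ("Depressed", "DEPRESSION", 0.76, 0.90),
    ("Moody", "DEPRESSION", 0.77, 0.82),
]

N_RESPONDENTS = 1000  # hypothetical sample size, not given on the slide


def discrimination_results(matrix, n):
    """Flag whether each item correlates at least 2 SE higher with its own scale."""
    se = 1 / math.sqrt(n)  # assumed approximation to the SE of a correlation
    results = {}
    for item, own_scale, r_anxiety, r_depression in matrix:
        if own_scale == "ANXIETY":
            r_own, r_other = r_anxiety, r_depression
        else:
            r_own, r_other = r_depression, r_anxiety
        results[item] = (r_own - r_other) >= 2 * se
    return results


results = discrimination_results(MATRIX, N_RESPONDENTS)
n_passing = sum(results.values())
failing_items = [item for item, passed in results.items() if not passed]
```

Under these assumptions, "Anxious, worried" (equal correlations with both scales) and "Moody" (a difference of .05) do not discriminate, while the other six items do. A smaller sample would widen the two-standard-error margin and could change which items pass.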

Preference Based or Utility Measures • Utilities are numeric measurements that reflect the desirability people associate with a health state or condition • Value of that health state • Preference for that health state (rather than another)

Methods for Assigning Values? • Four steps: • Identify the population of judges who will assign “preferences” • Sample and describe health states to be assigned utilities • Select a preference measurement method • Collect preference judgments, analyze the data, and assign weights to the health states

Preference Based or Utility Measures (cont.) • Advantages • Combine complex health states into a single number • Score reflects the value or preference for the overall health state • Need two absolute reference points • 0 represents death • 1 represents perfect health • Methods for obtaining value weights • Time tradeoff, standard gamble, rating scales

Readings on Utility Measurement • A huge literature • Some readings available on request

Overview • Types of measurement scales • Rationale for multi-item measures • Scale construction methods • Error concepts

Concepts of Error • How to depict error • Distinction between random error and systematic error

Components of an Individual’s Observed Item Score (NOTE: Simplistic view) • Observed item score = true score + error
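Under the simple model in which an observed item score is the true score plus error, a small simulation can illustrate the distinction between random and systematic error. This is a sketch with made-up parameters (normally distributed true scores and errors, a fixed seed, a hypothetical constant bias), not an analysis of any real data.

```python
import random
from statistics import mean, pvariance


def simulate_scale_error(n_items, bias=0.0, n_people=5000, error_sd=1.0, seed=0):
    """Return (mean, variance) of scale-score error across simulated people.

    Each observed item = true score + bias (systematic) + random error.
    The scale score is the average of the observed items.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_people):
        true_score = rng.gauss(50, 10)  # hypothetical true-score distribution
        observed = [true_score + bias + rng.gauss(0, error_sd)
                    for _ in range(n_items)]
        errors.append(mean(observed) - true_score)
    return mean(errors), pvariance(errors)


# Random error only: compare a single item with a 10-item scale
single_mean, single_var = simulate_scale_error(n_items=1)
ten_mean, ten_var = simulate_scale_error(n_items=10)

# Add a constant systematic error of 2 points to every item
biased_mean, biased_var = simulate_scale_error(n_items=10, bias=2.0)
```

In this setup the variance of the error shrinks as items are added, because random errors tend to cancel when averaged. The mean error of the biased scale stays near the bias no matter how many items are combined, because systematic error pushes every item in the same direction.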
