Quasi-Experimental Designs and Program Evaluation
Applied Research
• Goal: To improve the conditions in which people live and work.
• Natural settings: Messy and “real world,” so experimental control is hard to establish.
• Quasi-experiments: Experimental procedures that approximate the conditions of highly controlled laboratory experiments.
• Program Evaluation: Applied research used to learn whether real-world treatments work.
Characteristics of True Experiments
• In true experiments, researchers manipulate an independent variable with treatment and comparison condition(s), and exercise a high degree of control (especially through random assignment to conditions).
• A true experiment is one that leads to an unambiguous outcome regarding what caused a result on the dependent variable.
Obstacles to Conducting True Experiments in Natural Settings
• Researchers may have difficulty obtaining permission to conduct true experiments in natural settings and gaining access to participants.
• People frequently view random assignment to treatment as unfair, because some people who may need treatment don’t receive it.
• However, random assignment is the best and fairest way to determine whether a new treatment really is effective.
• A waiting-list control group may be used so that people randomly assigned to the control group receive treatment after the study is completed.
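The waiting-list idea can be sketched in a few lines of Python. This is only an illustration: the function name `assign_with_waiting_list`, the participant labels, and the even split between groups are assumptions made for the example, not part of the original material.

```python
import random

def assign_with_waiting_list(participants, seed=None):
    """Randomly split participants into a treated group and a waiting-list control group."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)          # chance alone decides the order
    half = len(shuffled) // 2
    return {
        "treatment_now": shuffled[:half],  # receive treatment during the study
        "waiting_list": shuffled[half:],   # receive treatment after the study ends
    }

# Hypothetical participant labels, for illustration only.
participants = [f"P{i}" for i in range(1, 9)]
groups = assign_with_waiting_list(participants, seed=42)
print(groups["treatment_now"])
print(groups["waiting_list"])
```

Because chance alone decides who waits, the two groups should be equivalent on average at the start of the study, and everyone on the waiting list still receives the treatment eventually.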
Advantage of True Experiments: Threats to Internal Validity are Controlled
• Threats to internal validity are confounds that serve as plausible alternative explanations for a research finding.
• In order to make a causal inference, researchers must rule out these alternative explanations.
• There are eight general classes of confounds referred to as “threats to internal validity”: history, maturation, testing, instrumentation, regression, selection, subject attrition, and additive effects with selection.
Threats to Internal Validity
• History
When an event occurs at the same time as treatment and changes participants’ behavior, this event becomes an alternative explanation for the changes in participants’ behavior (rather than treatment); thus, participants’ “history” includes events other than treatment.
Threats to Internal Validity (continued)
• Maturation
Participants naturally change over time; these maturational changes, not treatment, may explain any changes in participants during the experiment.
Threats to Internal Validity (continued)
• Testing
Taking a test generally affects subsequent testing; thus, participants’ performance on a measure at the end of the study may differ from their performance at an initial testing, not because of treatment but because they are familiar with the measure.
Threats to Internal Validity (continued)
• Instrumentation
Instruments used to measure participants’ performance may change over time (e.g., observers may become bored or tired); thus, changes in participants’ performance may not be due to treatment but to changes in the instruments used to measure performance.
Threats to Internal Validity (continued)
• Regression
Participants sometimes perform very well or very poorly on a measure because of chance factors (e.g., luck). These chance factors are not likely to be present in a second testing, so their scores will not be so extreme: the scores “regress to the mean.” These regression effects, not the effect of treatment, may account for changes in participants’ performance over time.
Regression (continued)
Test score = true score + error (chance factors, etc.)
One definition of an unreliable test or measure is that it measures with a lot of error.
If people score very high or very low on the test, it’s possible that chance factors produced the extreme score.
On a second testing, those chance factors are less likely to be present (otherwise they wouldn’t be chance).
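The relation "test score = true score + error" can be illustrated with a small simulation. The sketch below assumes normally distributed true scores and chance errors; the means, standard deviations, sample size, and the function name `simulate_regression_to_mean` are all made up for illustration. No treatment is applied between the two testings.

```python
import random
import statistics

def simulate_regression_to_mean(n=10_000, top_fraction=0.10, seed=0):
    """Simulate two testings where observed score = true score + chance error."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    # Fresh, independent chance error on each testing (an unreliable measure).
    test1 = [t + rng.gauss(0, 10) for t in true_scores]
    test2 = [t + rng.gauss(0, 10) for t in true_scores]
    # Pick the people with the most extreme (highest) scores on the first testing.
    cutoff = sorted(test1)[int(n * (1 - top_fraction))]
    extreme = [i for i in range(n) if test1[i] >= cutoff]
    mean1 = statistics.mean(test1[i] for i in extreme)
    mean2 = statistics.mean(test2[i] for i in extreme)
    return mean1, mean2

first, second = simulate_regression_to_mean()
print(f"Top scorers, first testing:  {first:.1f}")
print(f"Same people, second testing: {second:.1f}")
```

The people selected for extreme scores on the first testing score closer to the overall mean on the second, even though nothing was done to them. In a real study, this shift could be mistaken for a treatment effect.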
Threats to Internal Validity (continued)
• Subject Attrition
When participants are lost from the study (attrition), the group equivalence formed at the start of the study may be destroyed; thus, differences between treatment and control groups at the end of the study may be due to differences in those who remained in each group rather than to the effects of treatment.
Threats to Internal Validity (continued)
• Selection
When differences exist between individuals in treatment and control groups at the start of the study, these differences become alternative explanations for any differences observed at the end of the study (rather than treatment).
Threats to Internal Validity (continued)
• Additive Effects with Selection
When one group of participants responds differently to an external event (history), matures differently, or is measured more sensitively by a test (instrumentation), these threats (rather than treatment) may account for any group differences at the end of a study.
Threats to Internal Validity (continued)
• Important Points to Remember:
– When there is no comparison group in the study, the following threats to internal validity must be considered: history, maturation, testing, instrumentation, regression, subject attrition, selection.
– When a comparison group is added, the following threats to internal validity must be considered: selection, additive effects with selection.
Threats to Internal Validity (continued)
• Threats to internal validity that true experiments may not eliminate:
– Contamination,
– Experimenter expectancy effects, and
– Novelty effects (including the Hawthorne effect).
• Threats to external validity occur when treatment effects may not generalize beyond the particular people, setting, treatment, and outcome of the experiment.
– The best way to assess the external validity of findings is to replicate the experiment.
Threats to Internal Validity (continued)
• Contamination: This occurs when there is communication about the experiment between groups of participants.
– Three possible outcomes of contamination:
• resentment: some participants’ performance may worsen because they resent being in a less desirable condition;
• rivalry: participants in a less desirable condition may boost their performance so they don’t look bad; and
• diffusion of treatments: control participants learn about a treatment and apply it to themselves.
Threats to Internal Validity (continued)
• Expectancy Effects: This occurs when an experimenter unintentionally influences the results of an experiment.
– Experimenters can make systematic errors in their interpretation of participants’ performance based on their expectations.
– Experimenters can make errors in recording data based on their expectations for participants’ performance.
Threats to Internal Validity (continued)
• Novelty Effects: This refers to changes in people’s behavior simply because an innovation (e.g., a treatment) produces excitement, energy, and enthusiasm.
– A special case of novelty effects is the Hawthorne effect: performance changes when people know “significant others” (e.g., researchers, company bosses) are interested in them or care about their living or work conditions.
• Because of contamination, expectancy effects, and novelty effects, researchers may have difficulty concluding whether a treatment was effective.
Quasi-Experiments
• Quasi- (“resembling”) experiments provide an important alternative when true experiments are not possible.
• Quasi-experiments lack the degree of control found in true experiments.
• Researchers must seek additional evidence to eliminate threats to internal validity in a quasi-experiment.
The One-Group Pretest-Posttest Design
• This is a “bad experiment” and is sometimes referred to as a “pre-experimental design.”
– An intact group is selected for a treatment (e.g., a classroom of children, a group of employees).
– A pretest measure is used to record participants’ performance before treatment (O1, or “Observation 1”).
– The treatment (X) is implemented.
– A posttest measure is used to record performance following the treatment (O2).
O1 X O2
The One-Group Pretest-Posttest Design (continued)
• The one-group pretest-posttest design is a bad experiment because none of the threats to internal validity are controlled.
• Any change between pretest (O1) and posttest (O2) scores may be due to the treatment (X) or due to:
– History (some other event that coincided with treatment),
– Testing (the effects of repeated testing),
– Maturation (natural changes in participants over time), or
– any of the other threats to internal validity.
Three Quasi-Experimental Designs
• Nonequivalent Control Group Design:
– A group similar to the treatment group serves as a comparison group.
– Researchers obtain pretest and posttest measures for individuals in both groups.
– Random assignment to groups is not used.
– Pretest scores are used to determine whether the groups are equivalent.
Example: Research Methods and Reasoning Ability
• Compare students in research methods courses and students in developmental psychology.
• DV: 7-item test of methodological and statistical reasoning ability
Nonequivalent Control Group (continued)
• Suppose group differences are observed at a posttest.
• Rule out threats to internal validity:
– By adding a comparison group, researchers can rule out threats due to history, maturation, testing, instrumentation, and regression.
– We assume that these threats affect both groups in the same way; therefore, they can’t be used to explain posttest differences.
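The logic of this comparison can be sketched by computing each group's pretest-to-posttest gain. If history, maturation, testing, instrumentation, and regression act on both groups in the same way, they should cancel out when one gain is subtracted from the other. The scores below are made-up numbers on a hypothetical 7-item test, and the helper `mean_gain` is an illustrative name, not part of the original material.

```python
from statistics import mean

# Hypothetical scores on a 7-item test (made-up numbers for illustration only).
treatment = {"pre": [3, 4, 2, 3, 4], "post": [5, 6, 4, 5, 6]}
comparison = {"pre": [3, 3, 2, 4, 3], "post": [3, 4, 3, 4, 3]}

def mean_gain(group):
    """Average change from pretest to posttest within one group."""
    return mean(group["post"]) - mean(group["pre"])

# Pretest gap: how similar were the groups before the treatment?
pretest_gap = mean(treatment["pre"]) - mean(comparison["pre"])

# Difference in gains: change in the treatment group beyond the change
# that the comparison group also showed.
gain_difference = mean_gain(treatment) - mean_gain(comparison)

print(f"Pretest gap (treatment - comparison): {pretest_gap:.2f}")
print(f"Gain in treatment group:              {mean_gain(treatment):.2f}")
print(f"Gain in comparison group:             {mean_gain(comparison):.2f}")
print(f"Difference in gains:                  {gain_difference:.2f}")
```

A small pretest gap supports treating the groups as comparable, but it does not establish that they are equivalent. Because the groups were not randomly assigned, preexisting differences remain a possible explanation.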
Nonequivalent Control Group (continued)
• What threats are not ruled out?
– Selection:
• Because individuals are not randomly assigned to conditions, the two groups are not likely to be equivalent before the intervention (hence, “nonequivalent control”).
• These preexisting differences may account for group differences in the outcome at the end of the experiment.
Nonequivalent Control Group (continued)
– Additive Effects with Selection: The two groups
• may have different experiences (selection-history effect),
• may mature at different rates (selection-maturation effect),
• may be measured more or less sensitively by the instruments (selection-instrumentation effect),
• may drop out of the study at different rates (differential subject attrition), or
• may differ in terms of regression to the mean (differential regression).
Simple Interrupted Time-Series Design
• Observe a dependent variable for some time before and after a treatment is introduced (often, archival data are used).
O1 O2 O3 O4 X O5 O6 O7 O8
• Look for a clear discontinuity in the time-series graph for evidence of treatment effectiveness.
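One simple way to quantify a discontinuity is to fit a trend line to the observations before X, project it forward, and compare the projection with what was actually observed after X. The sketch below is one of several reasonable approaches. The function names `fit_line` and `discontinuity` and the numbers are assumptions made for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a simple trend line."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

def discontinuity(pre, post):
    """Mean gap between post-treatment observations and the projected pre-treatment trend."""
    pre_x = list(range(1, len(pre) + 1))
    slope, intercept = fit_line(pre_x, pre)
    post_x = range(len(pre) + 1, len(pre) + len(post) + 1)
    projected = [intercept + slope * x for x in post_x]
    return mean(observed - p for observed, p in zip(post, projected))

# Hypothetical observations: O1-O4 before the treatment X, O5-O8 after.
pre = [2.6, 2.7, 2.6, 2.7]
post = [3.1, 3.2, 3.1, 3.2]
jump = discontinuity(pre, post)
print(f"Average jump above the projected trend: {jump:.2f}")
```

A series that simply continues its earlier gradual trend, such as `discontinuity([1, 2, 3, 4], [5, 6, 7, 8])`, gives a gap of zero. An abrupt shift that coincides with X produces a clearly nonzero gap.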
Example: Study Habits
• Intervention: An instructional course to change students’ study habits, implemented during the summer following the sophomore year (after semester 4).
• DV: semester GPA
Simple Interrupted Time-Series Design (continued)
• Suppose a discontinuity is observed when treatment (X) is introduced.
• Rule out threats to internal validity:
– History threats are the most troublesome in this design.
– Instrumentation threats also are likely in some studies.
Simple Interrupted Time-Series Design (continued)
• What threats are more easily ruled out?
– Maturation: We assume maturational changes are gradual, not abrupt discontinuities.
– Testing: If testing influences responses, these effects are likely to show up in the initial observations (i.e., before the intervention). Also, testing effects are less likely with archival data.
– Regression: If scores regress to the mean, they will do so in the initial observations.
Time Series with Nonequivalent Control Group Design
• Add a comparison group to the simple interrupted time series design:
O1 O2 O3 O4 X O5 O6 O7 O8
------------------------------------------------------------------
O1 O2 O3 O4   O5 O6 O7 O8
Example: Study Habits
• Suppose a nonequivalent control group is added: these students don’t participate in the study habits course.
• Who should be in the comparison group?
• What threats are you able to rule out?
Program Evaluation
• Goal: To provide feedback to administrators of human service organizations in order to help them decide:
– what services to provide,
– to whom, and
– how to provide them most effectively and efficiently.
• This is a major growth area, particularly in the field of mental health (managed health care).
• Program evaluators in social services assess:
– Needs
– Process
– Outcomes
– Efficiency
Four Questions of Program Evaluation
• Needs: Is an agency or organization meeting the needs of the people it serves? (survey designs)
• Process: How is a program being implemented (is it going as planned)? (observational designs)
• Outcome: Has a program been effective in meeting its stated goals? (experimental, quasi-experimental designs; archival data)
• Efficiency: Is a program cost-efficient relative to alternative programs? (experimental, quasi-experimental designs; archival data)
Basic Research and Applied Research
• Program evaluation is the most extreme case of applied research: its goal is practical, not theoretical.
• The relationship between basic research and applied research is reciprocal:
– Basic research provides scientifically based principles about behavior and mental processes.
– These principles are applied in the complex, real world.
– New complexities are recognized (e.g., the scientific principles may not always apply in real-world settings), and new hypotheses must be tested in the lab using basic research.