
SUPPORTING VALID DECISION MAKING: USES AND MISUSES OF ASSESSMENT DATA WITHIN THE CONTEXT OF RTI

Ball, Carrie R.; Christ, Theodore J.; et al.
In: Addressing Response to Intervention Implementation: Questions from the Field, Vol. 49 (2012), No. 3, pp. 231-244


Within an RtI problem‐solving context, assessment and decision making generally center around the tasks of problem identification, problem analysis, progress monitoring, and program evaluation. We use this framework to discuss the current state of the literature regarding curriculum based measurement, its technical properties, and its utility for making instructional decisions. Cursory examination of emerging alternatives (e.g., computer adaptive tests) is included, where appropriate. We then offer recommendations for local decision making, with an emphasis on high‐quality assessment, responsible decision making, bridging the research–practice gap, and capitalizing on the expertise of school psychologists. © 2012 Wiley Periodicals, Inc.

Response to intervention (RtI) may be generally described as a model of service delivery comprising multiple levels of increasingly intensive intervention to prevent and remediate learning difficulties, with movement between intervention levels guided by increasingly frequent assessment of students' level of performance and progress over time (Gresham, [32]). Within the RtI framework, one of the most critical and complex elements is that of data‐based decision making, which relies on measurement of the level (i.e., performance at a static point in time) and slope (i.e., amount of progress across time) of student performance (Fuchs, [30]). To support the RtI model, measurement generally occurs for four distinct purposes: (a) problem identification (e.g., universal screening or benchmark assessment of all students to identify students in need of more intensive intervention); (b) problem analysis or identification of specific skill areas in need of more intensive intervention; (c) biweekly, weekly, or bimonthly progress monitoring to document the trajectory of progress for students who are receiving more intensive tiers of intervention; and (d) program evaluation or summative assessment to evaluate instructional outcomes at a systems level (National Association of School Psychologists [NASP], 2009).

Several alternative methods have been suggested as effective for universal screening purposes, including curriculum‐based measurement (CBM; e.g., Vander Meer, Lentz, & Stollar, [54]), computer adaptive testing (CAT; e.g., Ball, O'Connor, & Holden, [6]), and single‐skill assessment for younger populations (e.g., Coyne & Harn, [17]). CBM is currently the most popular method of universal screening, in large part due to cost‐effectiveness and ease of administration. CBM consists of a brief (less than 5‐minute) assessment of either a global indicator of functioning in a particular domain (e.g., oral reading fluency) or a sampling of skills targeted for mastery at a particular grade level (e.g., math calculation fluency). CBM is often referred to as a general outcome measure (GOM) because it measures a broad range of general skills associated with overall competence in a specific skill area (Fuchs & Deno, [31]).

CBM is also the most commonly used measure for progress monitoring; however, some researchers have raised concerns about the use of GOM as the primary—and often singular—source of data to monitor and evaluate individual students' response to intervention. Some of these researchers (e.g., Olinghouse, Lambert, & Compton, [45]; Shapiro, [47]) have suggested that the assumptions, purposes, and technical characteristics of CBM may be inappropriate for certain types of progress monitoring, such as for interventions focused on isolated skill sets (e.g., decoding, sight word reading). They have proposed and helped to conceptualize the development and use of specific subskill mastery measurement (SSMM), which emphasizes targeted assessment of particular skills to evaluate intervention effects for individual students.

Over the past 10 years, systems of student data collection and progress monitoring (e.g., Dynamic Indicators of Basic Early Literacy Skills (DIBELS), AIMSWeb, System to Enhance Educational Performance (STEEP)) have become increasingly available to schools and districts to support RtI implementation efforts (dibels.uoregon.edu; aimsweb.com; www.isteep.com). Many of those systems rely on CBM or CBM‐like tools to estimate the student's level of performance and slope of progress. Because of their relatively low cost, ease of administration, and ability to address schoolwide needs for both screening and progress monitoring assessment, these assessment systems are currently popular choices to support RtI implementation.

Many schools and districts, therefore, implement universal screening to identify students who are "at risk" based on the data from these assessment systems. Many schools have also successfully identified and adopted a series of increasingly intensive interventions, based on information available from state and national associations and publicly available resources, such as the What Works Clearinghouse (www.whatworks.ed.gov) or the National Center on Response to Intervention (www.rti4success.org). The point at which schools typically struggle with data‐based decision making relates to decisions at the individual student level. For example, common issues include (a) the number of data points needed to make a decision regarding response, (b) the amount of time necessary to evaluate whether an intervention is successful, (c) whether progress‐monitoring data are sufficient in lieu of more traditional standardized assessments for making special education placement decisions, and (d) the most appropriate action once an intervention is deemed successful or unsuccessful (e.g., continue, discontinue, intensify, or change interventions).

PURPOSE

The purpose of this article is to provide practitioners with a useful framework to understand the current state of the literature regarding CBM, its technical properties, and its utility for making a variety of decisions within a problem‐solving context; cursory examination of emerging alternatives (e.g., computer adaptive tests) is included where appropriate. First, the problem‐solving model is briefly summarized to provide context. We then evaluate the technical adequacy of currently available measures to inform relevant questions at each stage, with an emphasis on the current shortcomings of available measures for problem analysis and progress monitoring for individual students. The results of the review provide a foundation for recommendations to school psychologists, with an emphasis on high‐quality assessment, responsible decision making, and bridging the research–practice gap. Consistent with the available evidence and emphasis on the primary grades, the focus is on reading.

THE PROBLEM‐SOLVING MODEL

RtI bears a strong resemblance to other problem‐solving models that have been proposed over the years (Tilly, [52]). For example, Deno ([24]) proposed a five‐step "IDEAL" problem‐solving model: (a) Identify the problem, (b) Define the problem, (c) Examine alternatives, (d) Apply the chosen alternative, and (e) Look at the effects. Within this type of model, problem solving is driven by data collection and evaluation at each step. A similar schoolwide problem‐solving model (Tilly, [52]) uses only four steps: (a) define the problem, (b) develop a plan, (c) implement the plan, and (d) evaluate. Other models for problem solving (e.g., behavioral consultation, conjoint behavioral consultation) were precursors or derivatives of those models. Throughout the remainder of the article, rather than espousing a specific problem‐solving model, we focus on the four major purposes of assessment within an RtI process: (a) problem identification, (b) problem analysis, (c) progress monitoring, and (d) program evaluation (NASP, 2009). We find this framework useful for evaluating the technical properties and utility of available measures because these four purposes of assessment are generally consistent across problem‐solving models.

Additionally, the adoption of the above model allows us to focus on the issue of decision validity—the validity of the decisions that are made based on assessment results (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Kane, [39], [40]; Messick, [42]). Kane ([39], [40]) provides a thorough discussion of the types of validity to be documented within psychological and educational assessment, addressing not only traditional conceptualizations of validation (e.g., construct, content, predictive, and concurrent validity), but also advocating a more critical and formative evaluation process that evaluates the validity of educational and psychological decision making. For example, results of a standardized achievement assessment may be used to make an inference that a particular student is performing significantly below a grade level standard; this can be evaluated in terms of whether the inference is valid. If the same assessment is then used to make a determination regarding retention or promotion to the next grade level, the inference results in a particular decision, which is distinct from the inference itself; the validity of the decision, then, also warrants evaluation (Kane, [40]). Contemporary conceptions of validity provide a foundation and framework to conclude that decision validity is the most critical type of validity within the context of RtI because educational decisions are the ultimate application of the data obtained from our screening and progress‐monitoring endeavors. Our goal, then, is to frame a discussion of the types of decisions to be made within an RtI model and to focus our attention on the issue of decision validity in particular. Table 1 is provided to summarize key information related to each decision point.

Table 1. Summary of Evidence in Support of CBM‐R Utility in Educational Decision Making

Problem Identification
  Assessment examples: Universal screening
  Decision examples: Tier placement
  Summary of the evidence:
  • Technically adequate for identifying and quantifying discrepancies

Problem Analysis
  Assessment examples: Isolate skill deficits; hypothesis testing
  Decision examples: Determine placement in specific interventions
  Summary of the evidence:
  • CBM‐R not useful for identifying specific skill deficits or designing individualized interventions
  • More research is needed to develop technically adequate measures

Progress Monitoring
  Assessment examples: Determine "response" to instruction (e.g., frequent repeated measurement)
  Decision examples: Movement between tiers (special education eligibility in some districts)
  Summary of the evidence:
  • Not adequate for reliably measuring change over brief intervals
  • Not sensitive to specific skill development

Program Evaluation
  Assessment examples: Evaluate effectiveness of curriculum (e.g., pre–post assessment)
  Decision examples: Maintain, revise, or replace interventions or curricula
  Summary of the evidence:
  • Technically adequate for demonstrating global instruction effectiveness over long time intervals

Problem Identification

Within an RtI model, universal screening is the most common method of problem identification, as it is used to make a determination about a student's level of performance at a particular point in time. There has been a fairly large amount of research focused on the technical adequacy of CBM‐reading (CBM‐R) for assessing level of performance over extended periods, as is characteristic of screening (Wayman, Wallace, Wiley, Ticha, & Espin, 2007; also see Deno, [23]; Fuchs, [30]). High test–retest reliability and alternate‐forms reliability have been published for commonly used measures (e.g., Wayman et al., [55]; also see Ardoin & Christ, [9]; Kaminski & Good, [38]), and CBM‐R has been shown to have moderate to strong concurrent and predictive validity for other measures of broad reading, reading fluency, and reading comprehension (Ardoin et al., [5]; McGlinchey & Hixon, [41]; Shinn, Good, Knutson, Tilly, & Collins, [48]; Silberglitt & Hintze, [49]). Research has also evaluated the need to administer three oral reading passages or whether one passage would be sufficient for adequately assessing the current level. In general, strong reliability has been shown with regard to estimating the present level of performance at a single point in time (e.g., Ardoin & Christ, [9]; Ardoin et al., [5]), with Ardoin & Christ ([9]) reporting relatively small improvements in reliability (i.e., from a median reliability coefficient of .92 to .97) by using the median score from three passages. Ardoin et al. ([5]) also reported greater concurrent validity for measures of oral reading fluency compared with group‐administered maze passages.
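The aggregation step described above is straightforward to operationalize. The following minimal sketch, written in Python with hypothetical passage scores, computes a student's screening score as the median words read correctly per minute (WRC/min) across three alternate passages; the specific values and the three-passage requirement are illustrative assumptions, not a prescribed protocol.

```python
from statistics import median

def cbmr_screening_score(wrc_per_min: list[float]) -> float:
    """Return the median WRC/min across alternate CBM-R passages.

    Using the median of three passages, rather than a single passage,
    dampens the influence of an unusually easy or hard form, which is
    the mechanism behind the modest reliability gain noted above.
    """
    if len(wrc_per_min) != 3:
        raise ValueError("This sketch assumes three passages per screening occasion.")
    return median(wrc_per_min)

# Hypothetical fall screening scores for one student (three passages).
print(cbmr_screening_score([48, 62, 55]))  # -> 55
```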

More attention has been given recently to the use of CAT (e.g., Northwest Evaluation Association [NWEA] Measures of Academic Progress) as a universal screening tool, with some research indicating that CAT explains as much or more variability than CBM measures on statewide achievement tests (e.g., Ball et al., [6]). CAT generally takes longer to administer and is more costly than CBM, although it can be group administered and scored to provide some information about student performance in particular subareas of functioning (e.g., word meanings, understanding text), rather than a single general indicator (e.g., words correct per minute). Available technical information suggests moderate to high concurrent and predictive validity with other achievement measures, as well as strong test–retest and internal consistency reliability (NWEA, 2009). Moreover, recent work by Christ and colleagues established a 6‐ to 15‐min CAT‐based reading screener with excellent decision accuracy as compared with other assessments that take 1 hour or more (see National Center for Response to Intervention Tools Chart at http://www.rti4success.org/progressMonitoringTools).

Typically, universal screening data are used to draw inferences regarding students' current performance, falling either above or below a predetermined cut score. If scores fall above the cut point, students are deemed to be making acceptable progress within the general curriculum; therefore, no problem is identified and no instructional changes are made. For students whose scores fall below the cut point, potential problems are identified; that is, they are deemed to be performing below expectations in response to general classroom instruction. At this point, a decision may be made to place the student in a more intensive intervention or to engage in more detailed assessment for the purpose of intervention planning. In any case, the decisions made at the problem‐identification stage of RtI are considered relatively "low" or "moderate" stakes (NASP, 2009) because the long‐term consequences of an error will likely be minimal. Students who are deemed to perform adequately will likely be identified and moved into intervention at the next screening cycle, and students who are incorrectly identified as needing intervention will soon demonstrate no further need of intensive services. Under these conditions and based on the available research, both CBM and CAT are likely technically adequate to support problem identification (Thorndike & Thorndike‐Christ, [51]).
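The decision logic at this stage reduces to a comparison against a predetermined cut point. The sketch below illustrates that classification step; the cut score of 68 WRC/min and the student scores are hypothetical examples, not published benchmarks, and local criteria would differ.

```python
def identify_problems(screening_scores: dict[str, float], cut_score: float) -> dict[str, str]:
    """Flag students whose universal screening score falls below the cut point.

    Students at or above the cut point continue in core (Tier 1) instruction;
    students below it are flagged for more intensive intervention or for
    further problem analysis.
    """
    return {
        student: "below cut point: consider Tier 2 / problem analysis"
        if score < cut_score
        else "at or above cut point: continue core instruction"
        for student, score in screening_scores.items()
    }

# Hypothetical winter benchmark data (WRC/min) and cut score.
print(identify_problems({"Student A": 55, "Student B": 82, "Student C": 67}, cut_score=68))
```

Because these decisions are relatively low stakes, an occasional misclassification is corrected at the next screening cycle, as noted above.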

Problem Analysis

There is clear evidence that neither screeners nor high‐stakes assessments provide sufficient information to guide intervention development for individual students. Valencia & Buly ([53]) identified multiple profiles of strengths and weaknesses among students who failed to achieve reading proficiency. They concluded that there are distinct needs and intervention targets that vary among those students who fail to achieve in reading. Unfortunately, at the time this article was written, there were sparse resources to conceptualize and carry out problem analysis. Christ ([9]) stated that, "Problem analysis is a dynamic process of assessment and evaluation. It is the collection, summary, and use of information to systematically test, reject, or verify relevant hypotheses to establish problem solutions" (p. 159). Further, it is noted that solutions are more than categorical decisions; rather, problem solutions are "one or more changes to the instruction, curriculum, or environment that function(s) to reduce or eliminate a problem" (Christ, [9], p. 159). Although multiple theoretical and methodological foundations might guide hypothesis testing, the most useful are those that promote low‐inference, skills‐based approaches that directly connect skill deficits and instructional procedures (Christ, [9]).

Specifically, the process should systematically identify salient skills that might emerge as targets for intervention while ruling out skills that are already established. For example, in reading, the process might rule out deficits in vision and hearing along with deficits in exposure to oral or printed language. It might then progress to rule out deficits in skills such as letter identification, letter‐sound knowledge, decoding, and sight word reading. This process is largely consistent with SSMM (described earlier) in terms of its emphasis on identifying specific skill deficits. Although potentially useful for analyzing the subskills underlying academic difficulties and for developing individualized interventions, SSMM has largely undocumented technical characteristics to inform its validity for this purpose.

Several other alternatives have also been proposed for problem analysis and have promise as tools for planning or modifying various aspects of intervention (e.g., instructional level, areas of skill strength and weakness, likelihood of response). A thorough discussion of these is beyond the scope of this article, but we endeavor to provide a brief overview of the types of assessments proposed for the purpose of problem analysis. It is clear, for example, that problem analysis might be embedded within an intervention program (see Fountas & Pinnell, [28], for a popular example) for the purpose of aiding educators' planning for individual or group instruction. Formal stand‐alone assessments, such as the Developmental Reading Assessment, Second Edition (Beaver & Carter, [7]), are also available to place students within instructional levels and identify broadly defined instructional targets. Other instrumentation and procedures for problem analysis, such as informal reading inventories and miscue analysis, are independent of intervention programs and can provide information useful for assessing instructional level or identifying broad areas of instructional need. Prescriptive procedures, such as brief experimental analysis (Daly, Martens, Hamler, Dool, & Eckert, [20]; Eckert, Ardoin, Daly, & Martens, [27]), have also been proposed to briefly expose students to a range of interventions and determine which is likely to be most effective. Across procedures, however, the technical adequacy of the assessments themselves and the validity of decisions based on the results are poorly documented; it is therefore difficult to determine the quality and degree of validity, reliability, or decision accuracy based on these assessments. A substantial amount of additional research is needed in this area to develop and document technically adequate problem analysis tools.

Progress Monitoring

Intervention evaluation occurs during intervention implementation so the instructional program can be modified and titrated to ensure sufficient effects. There are, however, few assessments that are designed for ongoing and frequent (i.e., weekly) assessment. Substantial research over the past 3 decades has contributed to establishing CBM—and especially CBM‐R—as a method uniquely suited to improving student achievement (Stecker, Fuchs, & Fuchs, [50]), especially as it is used to monitor progress and evaluate instructional effects over brief periods (Deno, [21], [22], 2003).

Recent work, however, provides substantial impetus to reevaluate the underlying assumptions and evidence base of CBM. For example, work by Christ and colleagues provides compelling evidence that progress monitoring outcomes are highly unstable when interventions and progress monitoring occur over brief periods of less than 2 months. Moreover, a recent review of the research and professional literature failed to identify an evidence base to support many of the recommended practices for interpretation and use of CBM progress monitoring data (Ardoin, Christ, Morena, Cormier, & Klingbeil, 2010).

An implicit assumption of frequent assessment is that CBM is sufficiently sensitive to evaluate the instructional effects necessary for progress monitoring and inductive hypothesis testing. It is important to emphasize, however, that CBM is sensitive to a range of other influences in addition to instructional effects. Indeed, although other standardized assessment procedures are also subject to extraneous influences, CBM may be particularly susceptible due to its unique characteristics. For example, rate‐based measures are often more sensitive to variations in performance compared with frequency or accuracy measures. Documented sources of variability include the potential influence of examiner characteristics (Derr‐Minneci & Shapiro, [26]; Derr & Shapiro, [25]), setting (Derr‐Minneci & Shapiro, [26]), and delivery of directions (Colon & Kranzler, [16]). There is also extensive evidence that CBM is especially sensitive to variations in instrumentation, particularly variations in difficulty or content across passages (Ardoin & Christ, [3]; Christ & Ardoin, [10]; Francis et al., [29]; Hintze & Christ, [36]; Jenkins, Zumeta, Dupree, & Johnson, [37]; Poncy, Skinner, & Axtell, [46]).

In the case of CBM‐R, the results of research demonstrate that average student performances might fluctuate by as much as 40 words read correctly per minute (WRC/min) when administered alternate grade‐level passages; the expected deviations from a student's mean (or median) level of performance might approximate ±10 to 15 WRC/min (Christ & Ardoin, [10]; Francis et al., [29]; Jenkins et al., [37]; Poncy et al., 2005). A substantial proportion of this variance in student performance across alternate CBM‐R administrations is related to the inconsistencies across alternate forms (Christ & Ardoin, [10]; Hintze & Christ, [36]; Poncy et al., 2005), along with other variations of the assessment occasion.

The impacts of those inconsistencies were evaluated in a number of studies (Ardoin & Christ, [3]; Christ, [8]; Christ, Monaghen, Zopluoglu, & Van Norman, [11]; Christ, Zopluoglu, & Long, in press; Christ, Zopluoglu, Monaghen, Pike‐Balow, & Van Norman, [13]; Christ, Zopluoglu, Pike‐Balow, & Monaghen, [14]), with consistent conclusions: CBM‐R data do not provide useful estimates of growth over brief periods of intervention and progress monitoring. It seems that months—rather than weeks—of intervention and progress monitoring are necessary to achieve sufficient decision accuracy. The results of Christ, Zopluoglu, Monaghen, et al. (2011) indicate that the limitations are not likely to be specific to CBM or CBM‐R. That is, improvements in GOM are not likely to exceed the error associated with repeated measurement until longer durations of interventions allow for the effects to manifest. The nature and characteristics of GOM require time for the construct to develop and be detected with measurement.
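The instability described above can be made concrete with a short simulation: when week-to-week scores vary by roughly ±10 to 15 WRC/min around the true level, an ordinary least-squares trend line fit to only a few weeks of data yields a slope estimate whose standard error can rival or exceed plausible true growth rates. The sketch below uses hypothetical error magnitudes and growth rates chosen for illustration; it is not a reproduction of the cited analyses.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Ordinary least-squares slope (WRC/min gained per week)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

def simulate_slopes(weeks, true_start=60.0, true_growth=1.5, error_sd=10.0, reps=2000, seed=1):
    """Distribution of estimated weekly growth when scores carry large form/occasion error."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = list(range(weeks))
        ys = [true_start + true_growth * w + rng.gauss(0, error_sd) for w in xs]
        slopes.append(ols_slope(xs, ys))
    return statistics.mean(slopes), statistics.stdev(slopes)

for weeks in (6, 10, 20):
    mean_slope, sd_slope = simulate_slopes(weeks)
    print(f"{weeks:2d} weeks: mean slope {mean_slope:4.2f}, SD of slope estimates {sd_slope:4.2f} WRC/min/week")
```

Under these assumed values, the spread of slope estimates after 6 weeks is larger than the assumed true growth itself, whereas after roughly 20 weeks the estimates tighten considerably, which is consistent with the months-rather-than-weeks conclusion above.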

Progress monitoring is an essential element of problem solving and RtI; however, the current conceptions of progress monitoring are insufficient. That is, the notion that GOM is sufficient as the primary or exclusive method for evaluating intervention effects appears flawed in light of the outcomes of recent studies. Instead, it is necessary to use a combination of SSMM and GOM to evaluate specific short‐term intervention effects with SSMM and generalized long‐term intervention effects with GOM. This is consistent with measurement of both specific objectives and broad goals. Although this recommendation seems to make sense, there is very little research to document the potential benefits and, as noted earlier, little work has been done with regard to documenting the technical characteristics of measures that could be used for SSMM; ongoing research and development are necessary to advance practice in this area.

Program Evaluation

Prior to intervention, a problem is identified and defined as an unacceptable discrepancy between the expected level of achievement and the observed level of achievement. It is typically defined objectively and precisely with a quantitative value from an assessment, which is often achieved with benchmarking or norming. Defining a problem with a test score is especially useful if the assessment is sensitive to change for the purpose of program evaluation. As discussed, program evaluation occurs formatively with progress monitoring data, which may be insufficient to evaluate RtI over the short term. Program evaluation also occurs summatively, using pre–post or interim assessment data. These types of decisions are typically focused on the effectiveness of a program, intervention, or curriculum overall.

There are well‐documented and frequently cited challenges associated with the assessment of change (Cronbach & Furby, [18]). That is, the unreliability—or invalidity—of an assessment is magnified when pre–post data are used to estimate growth. Moreover, there are multiple other facets of measurement (e.g., setting, form equivalence, administration) that often impede repeated assessment (Cronbach et al., [19]). Those challenges are magnified when data are collected and interpreted to guide decisions about individual student cases due to idiosyncratic variations. Nevertheless, program evaluation, with both formative and summative procedures, is fundamental to problem solving and RtI.
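One classical way to see the Cronbach and Furby point is the reliability of difference scores: when pretest and posttest are highly correlated, as repeated administrations of the same measure tend to be, the reliability of the gain score falls well below the reliability of either test. The sketch below applies the standard equal-variance formula with hypothetical coefficients; it is an illustration of the general principle, not an analysis drawn from the cited studies.

```python
def difference_score_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Reliability of a pre-post difference score, assuming equal score variances.

    r_xx, r_yy: reliabilities of the pretest and posttest.
    r_xy:       correlation between pretest and posttest scores.
    """
    return (r_xx + r_yy - 2 * r_xy) / (2 * (1 - r_xy))

# Hypothetical values: two reliable administrations that correlate .80.
print(round(difference_score_reliability(0.90, 0.90, 0.80), 2))  # -> 0.5
```

With these assumed values, two tests that are each quite reliable on their own yield a gain score whose reliability is only .50, which is why pre-post change estimates for individual students must be interpreted cautiously.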

In addition to issues associated with the measurement of change, intervention integrity represents a critical challenge to the decision validity of program evaluation assessment. Existing research suggests that interventions are often not carried out as intended with regard to content, quantity, and process (e.g., Hagermoser Sanetti & Kratochwill, [34], [35]), and these issues are likely to be pervasive across intervention tiers. Examples of deviations from the intervention plan may include alterations such as longer or shorter intervention sessions than originally planned, implementation that excludes key components or adds additional unplanned components, or changes in delivery format (e.g., small group or individual intervention). Departure from the intervention plan does not necessarily result in a weakening of the intervention effects; nevertheless, decisions about the effectiveness of a modified intervention will be flawed without detailed information about the types of modifications made. Emerging research further suggests that, in the vast majority of districts, there is no mechanism in place to routinely assess the integrity with which an intervention has been implemented (Cochrane & Laux, [15]). Therefore, even assuming adequate tools for monitoring change across time, the RtI decision‐making process is likely to be flawed with respect to intervention effectiveness if we attempt to draw conclusions without essential information about the extent to which the intervention plan was actually followed.

The available evidence for CBM‐R does suggest that interim assessment (Ardoin & Christ, [9]) and pre–post assessment (Christ, Monaghen, et al., 2011) are useful to evaluate program effects after extended periods (e.g., 3 months) of intervention; therefore, the method remains a potentially useful approach, provided sufficient time is permitted for the effects of intervention or instruction to be reflected in CBM‐R. Other methods, such as traditional broadband paper‐and‐pencil assessments or CAT, are likely to be less sensitive to intervention effects over similar periods; however, additional research continues to emerge related to CAT assessment, which may hold promise for program evaluation over long periods of Tier‐1 instruction (e.g., evaluating grade‐level curriculum). Additional work is also needed in the field to establish and implement systems for monitoring intervention integrity at all tiers.

Review of the present literature establishes a challenge for those entrenched in the development and implementation of RtI. That is, because the skills captured by GOM are generally and broadly defined, it appears adequate for identifying problems, quantifying discrepancy, and evaluating the overall effects of instruction across extended periods. GOM is unable, however, to provide information that would be useful for designing interventions, and there appear to be few GOMs with the technical adequacy to detect valid and reliable change over the short term.

LOCAL DECISIONS

Notwithstanding the substantial promise and influence of problem solving and RtI, there is much left to learn. Although there are common features associated with many problem‐solving and RtI prototypes, the specifics are far from standardized with regard to instruction, intervention programming, and assessment tools and interpretive procedures. Indeed, RtI implementation requires both careful consideration of the research literature and careful attention to data that emerge from classrooms, teams, schools, districts, and states. RtI is not a one‐size‐fits‐all approach. Instead, it is characterized by the use of extant and emerging data to select and optimize practices that achieve desired outcomes for students, especially those populations that have been historically underserved and at risk. In most districts, the effects of RtI on student outcomes will emerge over time, and the evidence to guide and document effectiveness derives from both the research literature and discoveries at the local level. Indeed, RtI holds substantial promise because it emphasizes evidence‐based practices along with the collection and use of the right kind of assessment data. The preceding sections provided a critical review of the extant and emerging evidence base related to assessment systems. In the next section, we attempt to provide guidance to local districts for implementing assessment systems that are grounded in research, with consideration of what might be learned locally.

ASSESSMENT AUDIT

Poor alignment between assessments and the intended decisions is common. As a general rule, define the decision first and then select the most relevant assessment tools, schedule, and interpretive procedures. Moreover, do not collect data unless it is clear how those data will be used in response to a specific prioritized decision. It is necessary to understand which decisions are most important at the class, grade, school, and district level. Because reading is widely accepted as the most critical skill to long‐term academic success, RtI implementation often begins in that area.

It is also common to have multiple assessments in use to address one type of decision (e.g., problem identification) without a clear plan on how those data are used. Moreover, many districts have no assessments in use to address other types of decisions (e.g., problem analysis). An assessment audit should describe (a) the assessments used, (b) the decision(s) for which each assessment is most appropriate, (c) the technical adequacy of each assessment to inform the intended decision, (d) redundancies, and (e) gaps in the assessment system. Once the existing assessment system is described, attention may be devoted to eliminating unnecessary redundancies and filling gaps as needed to align with instructional and decision‐making priorities. The following paragraphs introduce other relevant factors to consider when designing and implementing local assessment systems to support problem solving and RtI.
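One simple way to operationalize such an audit is as a structured record per assessment, with a summary of which decisions are covered by several tools and which by none. The sketch below is a hypothetical illustration of that bookkeeping, not a prescribed audit form; the inventory entries and adequacy notes are invented examples.

```python
from dataclasses import dataclass

DECISIONS = ("problem identification", "problem analysis",
             "progress monitoring", "program evaluation")

@dataclass
class AuditEntry:
    assessment: str
    decisions_served: tuple   # decision(s) the assessment is most appropriate for
    technical_adequacy: str   # brief note on the evidence for the intended decision(s)

def audit_gaps_and_redundancies(entries: list) -> dict:
    """Summarize which decisions are served by multiple assessments and which by none."""
    coverage = {d: [e.assessment for e in entries if d in e.decisions_served] for d in DECISIONS}
    return {
        "redundancies": {d: a for d, a in coverage.items() if len(a) > 1},
        "gaps": [d for d, a in coverage.items() if not a],
    }

# Hypothetical district assessment inventory.
inventory = [
    AuditEntry("CBM-R benchmarking", ("problem identification", "program evaluation"), "adequate"),
    AuditEntry("CAT screener", ("problem identification",), "adequate"),
]
print(audit_gaps_and_redundancies(inventory))
```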

Unit of Analysis

As districts evaluate redundancies and gaps in their existing assessment frameworks, a critical consideration is whether sufficient data are available to preserve multiple units of analysis. A major improvement inherent in the contemporary conceptions of problem solving and RtI is that problems are identified, analyzed, targeted for intervention, and resolved at the individual, group, and systems levels (Christ, [9]). This multilevel conceptualization allows resource allocation to maximize service delivery at the systems and group level, while minimizing the need for intensive individualized services. For example, systemwide problems, which impact a substantial portion of the population (e.g., 40% of students below reading benchmark), should be identified and remediated at the grade, school, or district level. Other problems may be identified for particular groups within the population (e.g., English language learners may demonstrate less benefit from traditional instruction); standard protocol interventions for these groups might remediate deficits and prevent long‐term failure. Finally, some individual students exhibit problems that require individualized programming.

To support this multitiered approach, districts must ensure collection of sufficient data to allow examination of patterns and trends across all students and key subgroups within the population. Data commonly collected for screening and program evaluation are very useful for this purpose and may be accompanied by more frequent or targeted data collection of individual students or groups of students. The intended unit of analysis must be considered, however, to ensure that data can be interpreted and used to guide efficient resource allocation at the systems, group, or individual level. If all problems are viewed as individual student problems, then the supplemental and intensive supports at Tiers 2 and 3 will bloat and strain resources until the system fails. Likewise, conceptualizing all problems as occurring at the system level will result in unwieldy and excessive data collection systems that reduce instructional time without enhancing outcomes.
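The unit-of-analysis question can be framed as a first pass over the screening data themselves: if a large share of a grade or school falls below benchmark, the problem is treated as systemic rather than as many individual problems. The sketch below uses a hypothetical 20% threshold as an example decision rule; districts would set their own cut score and criteria.

```python
def unit_of_analysis(scores: list, cut_score: float, systemic_threshold: float = 0.20) -> str:
    """Suggest whether screening results point to a systemic or an individual-level problem.

    If the proportion of students below the cut score exceeds the threshold,
    remediation is targeted first at core instruction (grade/school/district level);
    otherwise, the students below the cut score are considered individually or in groups.
    """
    below = sum(1 for s in scores if s < cut_score)
    proportion = below / len(scores)
    if proportion > systemic_threshold:
        return f"{proportion:.0%} below benchmark: address at the systems level (core instruction)."
    return f"{proportion:.0%} below benchmark: proceed to group/individual problem analysis."

# Hypothetical grade-level screening results (WRC/min) against a hypothetical cut score.
print(unit_of_analysis([72, 35, 88, 41, 66, 39, 90, 58, 44, 77], cut_score=60))
```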

Purposeful Selection of Assessments

A key goal of completing an assessment audit is to establish a system of assessments that are purposefully selected to support problem solving and RtI decisions. For some decisions (e.g., problem identification, program evaluation), districts may choose one or more of several well‐established or promising assessment tools. For example, there appears at this time to be an emerging trend toward the combined use of CBM‐R and CAT for universal screening. CBM‐R can be used to assess reading rate and fluency, whereas CAT is a highly efficient method for broadband and standards‐based assessment. In combination, the measures appear to provide educators with some information about particular reading skills (e.g., vocabulary), as well as providing a global indicator of overall functioning (i.e., fluency).

For other decisions (e.g., problem analysis, progress monitoring), more flexibility, creativity, and experimentation will likely be required to establish assessment systems that adequately meet the needs of local stakeholders. Based on existing research, the potential of CBM‐R for short‐term progress monitoring appears more limited than often promised; however, it may have utility when used in combination with other measures that are suited for assessing intervention effects. In these cases, with few alternatives for technically sound assessment, districts may choose to maintain the use of CBM‐R. It is also advisable, however, for districts to select or develop additional measures that are better suited for monitoring short‐term progress. For example, word lists, pseudoword lists, informal reading inventories, or reading passages aligned with intervention content may also be introduced. Although such measures lack research to document their technical adequacy, they have not been shown technically inadequate and may hold practical utility for local decision making. The balance and combination of measures to be used in the areas of problem analysis and progress monitoring will likely become a point of dynamic and ongoing local discussion.

Assessment and Eligibility Determination

The adoption and initial implementation of problem solving and RtI should focus on prevention and early intervention. Eligibility decisions should be secondary during the early phases and years of adoption. There are many legal, psychometric, and ethical issues yet to be addressed. At the time this article was written, there was vigorous debate within the school psychology community about the need for additional standardized assessment, particularly intelligence testing, to make special education determinations within an RtI framework (e.g., Gresham, Restori, & Cook, [33]). The debate is far from resolved. Although it is clear that historical models (e.g., ipsative analysis, aptitude‐achievement discrepancies) lack theoretical and psychometric support, it is not clear how eligibility determination might emerge within problem solving and RtI.

In any event, CBM‐R progress monitoring data have not been well supported as a reliable indicator of short‐term progress or response to intervention and should not be used in isolation to make high‐stakes eligibility decisions. Moreover, the lack of information regarding intervention integrity in most districts presents an additional challenge to valid eligibility decision making, as a lack of progress may be attributable to poorly implemented interventions about which no data are available (Hagermoser Sanetti & Kratochwill, [34]). The best recommendation is to use multisetting, multisource, multimethod, and multimeasure approaches to evaluation. Data from parents, teachers, health records, classroom assessments, statewide achievement testing, and other types of information—especially data that are collected as part of early intervention and prevention—will provide the most complete database to guide high‐quality decisions. At the time of this review, it was not clear how data from aptitude assessments might inform eligibility determination or program planning for students with specific learning disabilities.

ROLE OF THE SCHOOL PSYCHOLOGIST

Given the need for continued best practice and research indicating a large gap between extant research and current common practice, we conclude with a discussion of the potential role of school psychologists in promoting empirically supported practice in the arena of data‐based decision making.

Remain Informed

The most critical responsibility for school psychologists to support best practice is to remain informed and connected to emerging literature related to RtI assessment and intervention. The literature base continues to develop rapidly, and much of the research that questions the technical adequacy of CBM has been published in the past 2 to 3 years. Even relatively new practitioners, then, may easily find themselves out of date with regard to what is "known" about CBM and data‐based decision‐making. For practitioners in schools or districts in the process of implementing or refining RtI, it would be advisable to participate in listservs, professional organizations, conferences, or other activities that allow access to the professional literature at some level. Maintaining up‐to‐date knowledge is a first step toward promoting good practice. Many concepts and issues presented within this article are necessarily painted with a broad brush; Table 2 provides a list of suggested resources for more detailed information.

Table 2. Suggested References for Additional Information About Key Concepts

Problem Identification
Ardoin, S. P., & Christ, T. J. (2008). Evaluating curriculum‐based measurement slope estimates using data from triannual universal screenings. School Psychology Review, 37, 109–125.
Ikeda, M. J., Neesen, E., & Witt, J. C. (2008). Best practices in universal screening. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology V (pp. 103–114). Bethesda, MD: National Association of School Psychologists.
Problem Analysis
Christ, T. J. (2008). Best practices in problem analysis. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology V (pp. 159–176). Bethesda, MD: National Association of School Psychologists.
Eckert, T. L., Ardoin, S. P., Daly, E. J., III, & Martens, B. K. (2002). Improving oral reading fluency: A brief experimental analysis of combining an antecedent intervention with consequences. Journal of Applied Behavior Analysis, 35(3), 271–281.
Progress Monitoring
Ardoin, S. P., & Christ, T. J. (2009). Curriculum based measurement of oral reading: Standard errors associated with progress monitoring outcomes from DIBELS, AIMSweb, and experimental passage set. School Psychology Review, 38(2), 266–283.
Christ, T. J. (2006). Short term estimates of growth using curriculum‐based measurement of oral reading fluency: Estimates of standard error of the slope to construct confidence intervals. School Psychology Review, 35(1), 128–133.
Olinghouse, N. G., Lambert, W., & Compton, D. L. (2006). Monitoring children with reading disabilities' response to phonics intervention: Are there differences between intervention aligned and general skill progress monitoring assessments? Exceptional Children, 73, 90–106.
Program Evaluation and Treatment Integrity
Hagermoser Sanetti, L. M., & Kratochwill, T. R. (Eds.). (2009). Developing a science of treatment integrity [Special series]. School Psychology Review, 38(4).
Special Education Eligibility
Gresham, F. M., Restori, A. F., & Cook, C. R. (2008). To test or not to test: Issues pertaining to response to intervention and cognitive testing. Communiqué, 37, 5–7.
Lichtenstein, R. (2008). Best practices in identification of learning disabilities. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology V (pp. 295–318). Bethesda, MD: National Association of School Psychologists.

Remain Active in RtI Discussions

In addition to maintaining an understanding of current issues pertaining to data‐based decision‐making, school psychologists can make an effort to assume an active role in building‐ or district‐level discussions related to RtI decision making. In many districts, RtI implementation is in its infancy, with barriers, challenges, and problems still emerging. Additionally, school psychologists, with their training in assessment, measurement, validity, and reliability, are uniquely positioned to provide system‐level consultation in the area of data‐based decision making. In many schools and districts, the school psychologist may be the only practitioner who understands these concepts sufficiently to guide and support a system of assessment that will adequately support RtI implementation. Therefore, school psychologists are encouraged to advocate for a role in the implementation process.

CONCLUSION

This article summarized some of the current knowledge regarding the technical adequacies and inadequacies of available measures commonly used to support RtI implementation. Clearly, more work is needed to develop and validate measures that will be useful for purposes of problem analysis and progress monitoring. In the meantime, there is much school psychologists can do to promote good practice, including advocating for purposeful selection and application of various types of assessment and utilization of multiple measures where no single measure has been shown adequate for the type of decision being considered. We urge school psychologists to remain abreast of the developing professional literature and to remain active in discussions that will promote empirically supported practice.

REFERENCES

1. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
2. Ardoin, S. P., & Christ, T. J. (2008). Evaluating curriculum‐based measurement slope estimates using data from triannual universal screenings. School Psychology Review, 37, 109–125.
3. Ardoin, S. P., & Christ, T. J. (2009). Curriculum based measurement of oral reading: Standard errors associated with progress monitoring outcomes from DIBELS, AIMSweb, and an experimental passage set. School Psychology Review, 38(2), 266–283.
4. Ardoin, S. P., Christ, T. J., Morena, L. S., Cormier, D. C., & Klingbeil, D. A. (2010). Exploring the evidence behind curriculum based measurement of oral reading (CBM‐R) decision rules. Manuscript submitted for publication.
5. Ardoin, S. P., Witt, J. C., Suldo, S. M., Connell, J. E., Koenig, J. L., Resetar, J. L., et al. (2004). Examining the incremental benefits of administering a maze and three versus one curriculum‐based measurement reading probes when conducting universal screening. School Psychology Review, 33, 218–233.
6. Ball, C., O'Connor, E., & Holden, J. (2011, February). Toward efficiency in universal screening: A predictive validity study. Paper presented at the annual conference of the National Association of School Psychologists, San Francisco, CA.
7. Beaver, J. M., & Carter, M. A. (2006). The developmental reading assessment–2nd edition (DRA2). Upper Saddle River, NJ: Pearson.
8. Christ, T. J. (2006). Short term estimates of growth using curriculum‐based measurement of oral reading fluency: Estimates of standard error of the slope to construct confidence intervals. School Psychology Review, 35(1), 128–133.
9. Christ, T. J. (2008). Best practices in problem analysis. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology V (pp. 159–176). Bethesda, MD: National Association of School Psychologists.
10. Christ, T. J., & Ardoin, S. P. (2009). Curriculum‐based measurement of oral reading: Passage equivalence and probe‐set development. Journal of School Psychology, 47, 55–75.
11. Christ, T. J., Monaghen, B., Zopluoglu, C., & Van Norman, E. R. (2011). Curriculum‐based measurement of oral reading: Evaluation of pre‐post estimates of weekly growth. Manuscript submitted for publication.
12. Christ, T. J., Zopluoglu, C., & Long, J. (in press). Curriculum based measurement of oral reading (CBM‐R): Evaluation of progress monitoring outcomes and trend line estimates. Exceptional Children.
13. Christ, T. J., Zopluoglu, C., Monaghen, B., Pike‐Balow, A., & Van Norman, E. R. (2011). Curriculum‐based measurement of oral reading (CBM‐R): Evaluation and evidence‐based guidelines for progress monitoring. Manuscript submitted for publication.
14. Christ, T. J., Zopluoglu, C., Pike‐Balow, P., & Monaghen, B. (2011). Curriculum‐based measurement of oral reading (CBM‐R): Diagnostic accuracy of slope estimates derived from weekly progress monitoring data. Manuscript submitted for publication.
15. Cochrane, W. S., & Laux, J. M. (2008). A survey investigating school psychologists' measurement of treatment integrity in school‐based interventions and their beliefs about its importance. Psychology in the Schools, 45, 499–507.
16. Colon, E. P., & Kranzler, J. H. (2006). Effect of instructions on curriculum‐based measurement of reading. Journal of Psychoeducational Assessment, 24(4), 318–328.
17. Coyne, M. D., & Harn, B. A. (2006). Promoting beginning reading success through meaningful assessment of early literacy skills. Psychology in the Schools, 43(1), 33–43.
18. Cronbach, L. J., & Furby, L. (1970). How we should measure "change": Or should we? Psychological Bulletin, 74(1), 68.
19. Cronbach, L. J., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measures. New York: John Wiley.
20. Daly, E. J., Martens, B. K., Hamler, K. R., Dool, E. J., & Eckert, T. L. (1999). A brief experimental analysis for identifying instructional components needed to improve oral reading fluency. Journal of Applied Behavior Analysis, 32(1), 83–94.
21. Deno, S. L. (1986a). Curriculum‐based measurement: The emerging alternative. Exceptional Children, 52(3), 219–232.
22. Deno, S. L. (1986b). Formative evaluation of individual student programs: A new role for school psychologists. School Psychology Review, 15(3), 358–374.
23. Deno, S. L. (2003). Developments in curriculum‐based measurement. Journal of Special Education, 37(3), 184–192.
24. Deno, S. L. (2005). Problem‐solving assessment. In R. Brown‐Chidsey (Ed.), Assessment for intervention: A problem‐solving approach (pp. 10–40). New York: Guilford.
25. Derr, T. F., & Shapiro, E. S. (1989). A behavioral evaluation of curriculum‐based assessment of reading. Journal of Psychoeducational Assessment, 7(2), 148.
26. Derr‐Minneci, T. F., & Shapiro, E. S. (1992). Validating curriculum‐based measurement in reading from a behavioral perspective. School Psychology Quarterly, 7, 2–16.
27. Eckert, T. L., Ardoin, S. P., Daly, E. J., III, & Martens, B. K. (2002). Improving oral reading fluency: A brief experimental analysis of combining an antecedent intervention with consequences. Journal of Applied Behavior Analysis, 35(3), 271–281.
28. Fountas, I. C., & Pinnell, G. S. (2007). Fountas and Pinnell benchmark assessment system 1: Grades K‐2, levels A‐N. Portsmouth, NH: Heinemann.
29. Francis, D. J., Santi, K. L., Barr, C., Fletcher, J. M., Varisco, A., & Foorman, B. R. (2008). Form effects on the estimation of students' oral reading fluency using DIBELS. Journal of School Psychology, 46(3), 315–342.
30. Fuchs, L. S. (2004). The past, present, and future of curriculum‐based measurement research. School Psychology Review, 33, 188–192.
31. Fuchs, L. S., & Deno, S. L. (1991). Paradigmatic distinctions between instructionally relevant measurement models. Exceptional Children, 57, 488–500.
32. Gresham, F. M. (2007). Evolution of the response‐to‐intervention concept: Empirical foundations. In S. R. Jimerson, M. K. Burns, & A. M. VanDerHeyden (Eds.), Handbook of response to intervention: The science and practice of assessment and intervention (pp. 10–24). New York: Springer.
33. Gresham, F. M., Restori, A. F., & Cook, C. R. (2008). To test or not to test: Issues pertaining to response to intervention and cognitive testing. Communiqué, 37, 5–7.
34. Hagermoser Sanetti, L. M., & Kratochwill, T. R. (2009a). Toward developing a science of treatment integrity: Introduction to the special series. School Psychology Review, 38, 445–459.
35. Hagermoser Sanetti, L. M., & Kratochwill, T. R. (2009b). Treatment integrity assessment in the schools: An evaluation of the treatment integrity planning protocol. School Psychology Quarterly, 24, 24–35.
36. Hintze, J. M., & Christ, T. J. (2004). An examination of variability as a function of passage variance in CBM progress monitoring. School Psychology Review, 33(2), 204–217.
37. Jenkins, J. R., Zumeta, R., Dupree, O., & Johnson, K. (2005). Measuring gains in reading ability with passage reading fluency. Learning Disabilities Research & Practice, 20(4), 245–253.
38. Kaminski, R. A., & Good, R. H. (1996). Toward a technology for assessing basic early literacy skills. School Psychology Review, 25, 215–227.
39. Kane, M. T. (1992). An argument‐based approach to validity. Psychological Bulletin, 112, 527–535.
40. Kane, M. T. (2005). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Lanham, MD: Rowman & Littlefield.
41. McGlinchey, M. T., & Hixon, M. D. (2004). Using curriculum‐based measurement to predict performance on state assessments in reading. School Psychology Review, 33, 193–203.
42. Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749.
43. National Association of School Psychologists. (2009). School psychologists' involvement in assessment (Position statement). Bethesda, MD: Author.
44. Northwest Evaluation Association. (2009). Technical manual for Measures of Academic Progress and Measures of Academic Progress for Primary Grades. Lake Oswego, OR: Author.
45. Olinghouse, N. G., Lambert, W., & Compton, D. L. (2006). Monitoring children with reading disabilities' response to phonics intervention: Are there differences between intervention aligned and general skill progress monitoring assessments? Exceptional Children, 73, 90–106.
46. Poncy, B. C., Skinner, C. H., & Axtell, P. K. (2005). An investigation of the reliability and standard error of measurement of words read correctly per minute using curriculum‐based measurement. Journal of Psychoeducational Assessment, 23(4), 326–338.
47. Shapiro, E. S. (2010). Academic skills problems (4th ed.). New York: Guilford.
48. Shinn, M. R., Good, R. H., Knutson, N., Tilly, D. W., & Collins, V. L. (1992). Curriculum‐based measurement of oral reading fluency: A confirmatory analysis of its relation to reading. School Psychology Review, 21(3), 459–479.
49. Silberglitt, B., & Hintze, J. M. (2005). Formative assessment using CBM‐R cut scores to track progress toward success on state mandated achievement tests: A comparison of methods. Journal of Psychoeducational Assessment, 23, 304–325.
50. Stecker, P. M., Fuchs, L. S., & Fuchs, D. (2005). Using curriculum‐based measurement to improve student achievement: Review of research. Psychology in the Schools, 42(8), 795–819.
51. Thorndike, R. M., & Thorndike‐Christ, T. (2010). Measurement and evaluation in psychology and education (8th ed.). Boston: Pearson.
52. Tilly, D. (2003, July). Foundations for a problem solving, school‐wide model. Paper presented at the Rhode Island Technical Assistance Project Summer Institute, Providence, RI. Retrieved May 26, 2011, from http://www.ritap.org/rti/resources/presentations.php
53. Valencia, S. W., & Buly, M. R. (2004). Behind test scores: What struggling readers really need. The Reading Teacher, 57(6), 520–531.
54. Vander Meer, C. D., Lentz, F. E., & Stollar, S. (2005). The relationship between oral reading fluency and Ohio Proficiency Testing in reading (Technical report). Eugene, OR: University of Oregon.
55. Wayman, M. M., Wallace, T., Wiley, H. I., Ticha, R., & Espin, C. A. (2007). Literature synthesis on curriculum‐based measurement in reading. The Journal of Special Education, 41(2), 85–120.

By Carrie R. Ball and Theodore J. Christ
