Implementing a Randomized Classroom Trials Study

Is there a need for either smaller or larger scale randomized intervention studies? Have any instructional interventions advanced to the point where they are ready to be evaluated in well-controlled classroom trials? Or, as was alluded to earlier, are such implementation-and-evaluation efforts the sole property of medical research's clinical trials? Yes, yes, and no, respectively, and the time is ripe to demonstrate it.
A similar research sequence could be followed in moving beyond classroom description, laboratory research, and one-unit-per-intervention studies: to help settle the whole-language versus phonemic-awareness training wars in reading instruction (e.g., Pressley & Allington, 1999), to prescribe the most effective classroom-based reading-comprehension techniques (e.g., Pressley et al., 1992), to investigate issues related to optimal instructional media and technologies (e.g., Salomon & Almog, 1998), and the like; the list goes on and on. That is, there is no shortage of randomized classroom-intervention research leads to be explored, in virtually all content domains that promote cognitive or behavioral interventions. (Beyond the classroom, school and institutional trials experiments can help to bolster claims about intervention efforts at those levels.) In addition to a perusal of the usual scholarly syntheses of research, all one need do is take a look at something such as What Works (U.S. Department of Education, 1986) for research-based candidates with the potential to have a dramatic positive impact on instructional outcomes, classroom behavior, and general cognitive development. Randomized classroom trials research can provide the necessary credible and creditable evidence for that potential.

Commitment of Federal Funds to Randomized Classroom Trials Research

The notions we have been advancing are quite compatible with Stanovich's (1998, pp. 54-55, 133-135) discussions of the importance of research progressing from early to later stages, producing, respectively, weaker and stronger forms of causal evidence (see also Table 22.3). The notions are also in synchrony with the final evaluative phase of Slavin's (1997) recommended design competitions, in which an agency identifies educational problems and research bidders submit their plans to solve them. With respect to that evaluative phase (which roughly corresponds to our randomized classroom trials stage), Slavin (1997) wrote:

Ideally, schools for the third-party evaluations would be chosen at random from among schools that volunteered to use the program being evaluated. For example, schools in a given district might be asked to volunteer to implement a new middle school model. This offer might be made in 5 to 10 districts around the country: some urban, some suburban, some rural, some with language-minority students, some large schools, some small ones, and so on. Fifty schools might be identified. Twenty-five might be randomly assigned to use the program and 25 to serve as controls (and to implement their current programs for a few more years). Control schools would receive extra resources, partly to balance those given to the experimental schools and partly to maintain a level of motivation to serve as control groups. (p. 26)

The random assignment of volunteering schools to the program and control conditions, along with the allocation of additional resources to the control schools, exhibits a concern for the research's internal validity (see, e.g., Levin & Levin, 1993). Additionally, the random sampling of schools exhibits a concern for the research's external validity and also permits an investigation of program effectiveness as a function of specific school characteristics.
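As a concrete illustration of the kind of stratified random assignment Slavin describes, consider the following minimal Python sketch. The district strata, school names, and random seed are hypothetical illustrations on our part; only the overall shape of the scenario (50 volunteer schools, 25 assigned to each condition, balanced within district types) follows the quoted passage.

```python
# A minimal sketch (hypothetical names and strata, not Slavin's actual
# procedure) of assigning volunteer schools to program and control
# conditions, stratified by district type so each stratum contributes
# equally to both conditions.

import random

# Roster of schools that volunteered, grouped by (hypothetical) district type.
volunteers = {
    "urban":    [f"urban_school_{i}" for i in range(1, 17)],     # 16 schools
    "suburban": [f"suburban_school_{i}" for i in range(1, 17)],  # 16 schools
    "rural":    [f"rural_school_{i}" for i in range(1, 19)],     # 18 schools
}

rng = random.Random(1999)  # fixed seed so the assignment is auditable
assignment = {"program": [], "control": []}

for stratum, schools in volunteers.items():
    shuffled = schools[:]            # copy so the roster itself is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    assignment["program"].extend(shuffled[:half])   # first half uses the program
    assignment["control"].extend(shuffled[half:])   # rest keep current programs

print(len(assignment["program"]), len(assignment["control"]))  # 25 25
```

Fixing the seed is a design choice worth noting: it lets a third-party evaluator reproduce and audit the assignment, which matters precisely because the credibility argument rests on the randomization having actually been carried out as described.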
Multiple high-quality randomized school or classroom trials studies of this kind would do much to counter both public and professional perceptions of the low-quality standards that accompany educational research today (e.g., McGuire, 1999; Sabelli & Kelly, 1998; Sroufe, 1997). Incorporating and extending the knowledge base provided by smaller scale Stage 2 empirical studies (e.g., Hedges & Stock, 1983), the decade-long Tennessee Project STAR randomized classroom experiment investigating the effects of class size on student achievement (e.g., Nye et al., 1999) is a prominent example of scientifically credible research that has already begun to influence educational policy nationwide ("Research finds advantages," 1999). The same can be said of the Success for All randomized schools experiments investigating the effects of systemic reform on student academic outcomes in schools serving traditionally low-achieving student populations (e.g., Slavin, Madden, Dolan, & Wasik, 1996). Of less recent vintage, an illustration of a scientifically credible intervention with educational creditability is Harvard Project Physics, a randomized schools experiment based on a national random sample, in which an innovative high school physics curriculum was carefully implemented and evaluated (e.g., Walberg & Welch, 1972).

Are federal funding agencies willing to support randomized classroom trials ventures? Such ventures appear to be exactly what at least some agencies want, if not demand:

At one end of the continuum, research is defined by researcher questions that push the boundaries of knowledge. At the other end of the continuum, research is defined by large-scale and contextual experiments, defined by implementation questions that frame robust applications. . . . What is needed now, and what NSF is actively exploring, is to move ahead simultaneously at both extremes of the continuum. Basic learning about the process of learning itself—innovative R&D in tackling increasingly complex content and in the tools of science and mathematics education—informs and must be informed by applied, robust, large-scale testbed implementation research. (Sabelli & Kelly, 1998, p. 46)

Thus, in contrast to detractors' periodic assertions that the medical research model does not map well onto the educational research landscape, we assert that randomized classroom trials studies have much to recommend them.

Additional Comments

We conclude this section with five comments. First, we do not mean to imply that randomized classroom trials studies are appropriate for all areas of intervention research inquiry, for they most certainly are not (see, e.g., Eisner, 1999). Systematic observation, rich description, and relationship documentation, with no randomized classroom component, may well suffice for characterizing many classroom processes and behaviors of both practical and theoretical consequence.
For the prescription of instructional interventions (e.g., alternative teaching methods, learning strategies, curricular materials) and other school- or system-based innovations, however, randomized classroom trials studies could go a long way toward responding to former Assistant Secretary of Education McGuire's (1999) call for rigorous educational research that "readily inform[s] our understanding of a number of enduring problems of practice" (p. 1).

Second, randomized classroom trials studies can be carried out on both smaller and larger scales, depending on one's intended purposes and resources. The critical issues here are (a) realistic classroom-based interventions that are (b) administered to multiple randomized classrooms. Scientifically credible classroom-based intervention research does not invariably require an inordinate number of classrooms per intervention condition, such as the 50 schools alluded to by Slavin (1997) in the final stage of his aforementioned design competition scenario. Initially, an intervention's potential might be evaluated with, say, three or four classrooms randomly assigned to each intervention condition. Even with that number of multiple classrooms (and especially when combined with classroom stratification, statistical control, and the specification of relevant within-classroom characteristics), classroom-based statistical analyses can be sensibly and sensitively applied to detect intervention main effects and interactions of reasonable magnitudes (e.g., Barcikowski, 1981; Bryk & Raudenbush, 1992; Levin, 1992; Levin & O'Donnell, 1999a; Levin & Serlin, 1993). This statement may come as a surprise to those who are used to conducting research based on individuals as the units of treatment administration and analysis. With classrooms as the units, the ability to detect intervention effects is a function of several factors, including the number of classrooms per intervention condition, the number of students per classroom, and the degree of within-classroom homogeneity (both apart from and produced by the intervention; see, e.g., Barcikowski, 1981). Each of these factors serves to affect the statistical power of classroom-based analyses, as the sketch below illustrates. After an intervention's potential has been documented through small, controlled, classroom-based experiments (and replications) of the kind alluded to here, more ambitious, larger scale, randomized trials studies based on randomly selected classrooms or schools, consistent with Slavin's (1997) design competition notions, would then be in order.
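The power considerations just listed can be made concrete with a short calculation. The following minimal Python sketch is our illustration rather than an analysis drawn from the cited sources; it uses the standard design-effect approximation for cluster-randomized designs, in which within-classroom homogeneity is expressed as an intraclass correlation (ICC), and every numerical value (classroom size, ICC, effect size) is hypothetical.

```python
# A minimal sketch (our illustration, not from the chapter or its sources) of
# how power in a classroom-randomized trial depends on (a) classrooms per
# condition, (b) students per classroom, and (c) within-classroom homogeneity
# expressed as an intraclass correlation (ICC). All values are hypothetical.

from statistics import NormalDist

def cluster_power(classrooms_per_arm, students_per_classroom, icc,
                  effect_size, alpha=0.05):
    """Approximate two-sided power of a two-arm classroom-randomized design.

    Uses the standard design-effect correction -- the variance of a mean is
    inflated by 1 + (m - 1) * ICC when m students are nested in classrooms --
    and a normal approximation in place of the exact cluster-level t test.
    """
    m, k = students_per_classroom, classrooms_per_arm
    design_effect = 1 + (m - 1) * icc
    n_effective = (k * m) / design_effect        # effective sample size per arm
    se = (2 / n_effective) ** 0.5                # SE of the standardized difference
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_size / se - z_crit)

# Three or four classrooms per condition (the initial-evaluation scenario)
# versus a larger Slavin-style trial, for a medium-sized effect (d = 0.5):
for k in (3, 4, 25):
    print(f"{k:2d} classrooms/arm: power = {cluster_power(k, 25, 0.15, 0.5):.2f}")
```

Under these assumed values the three-, four-, and twenty-five-classroom designs yield power of roughly .30, .38, and .98, respectively, and with only a handful of classrooms the normal approximation is itself optimistic relative to the exact cluster-level t test. The pattern illustrates why, with a nontrivial ICC, adding classrooms raises power far more effectively than adding students within classrooms, and why small initial trials can be expected to detect only sizable effects.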
Third, if we are to understand the strengths, weaknesses, and potential roles of various modes of empirical inquiry (e.g., observational studies, surveys, controlled laboratory experiments, design experiments), we need an overall model to represent the relationships among them. For Figure 22.1 to be such a model, one must believe that it is possible to have a generalized instructional intervention that can work in a variety of contexts. Testing the comparative efficacy of such an intervention would be the subject of a Stage 3 randomized classroom trials investigation. A substantive example that readily comes to mind is tutoring, an instructional strategy that has been shown to be effective in a variety of student populations and situations and across time (see, e.g., Cohen, Kulik, & Kulik, 1982; O'Donnell, 1998). For those who believe that interventions can only be population and situation specific, a unifying view of the reciprocal contributions of various research methodologies is difficult to promote.

Fourth, along with acknowledging that the classroom is typically a nest of "blooming, buzzing confusion" (Brown, 1992, p. 141), it should also be acknowledged that in the absence of Figure 22.1's Stage 3 research, the confusion will lie in a researcher's interpreting which classroom procedures or features produced which instructional outcomes (if, indeed, any were produced at all). In that regard, we reiterate that randomized classroom trials research is equally applicable and appropriate for evaluating the effects of single-component, multiple-component, and systemic intervention efforts alike. With the randomized classroom trials stage, at least a researcher will be able to attribute outcomes to the intervention (however tightly or loosely defined) rather than to other unintended or unwanted characteristics (e.g., teacher, classroom, or student effects).

Finally, and also in reference to Brown's (1992, p. 141) "blooming, buzzing confusion" comments directed at classroom-based research, we note that not all research on teaching and learning is, or needs to be, concerned with issues of teaching and learning in classrooms. Consider, for example, the question of whether musical knowledge and spatial ability foster the development of students' mathematical skills. Answering that question does not require any classroom-based intervention or investigation. In fact, addressing the question in classroom contexts, and certainly in the manner in which the research has been conducted to date (e.g., Graziano et al., 1999; Rauscher et al., 1997), may serve to obfuscate the issue more than resolve it. Alternatively, one need not travel very far afield to investigate the potential of individually based interventions for ameliorating children's psychological and conduct disorders. Controlled large-scale assessments of the comparative effectiveness of various drug or behavioral therapies could be credibly managed within the randomized classroom (or community) trials stage of the Figure 22.1 model (see, e.g., COMMIT Research Group, 1995; Goode, 1999; Peterson, Mann, Kealey, & Marek, 2000; Wampold et al., 1997). Adapting Scriven's (1997, p. 21) aspirin question here, is the individual administration of therapeutic interventions applicable only for treating medical, and not educational, problems?

Closing Trials Arguments

So, educational intervention research, whither goest thou? By the year 2025, will educational researchers still regard such methodologies as the ESP investigation, the demonstration study, and the design experiment as credible evidence producers and regard the information derived from them as "satisficing" (Simon, 1955)? Or are there enough among us who will fight for credible evidence-producing methodologies, contesting incredible claims in venues in which recommendations based on intervention "research" are being served up for either public or professional consumption?

A similar kind of soul-searching related to research purposes, tools, and standards of evidence has been taking place in other social science disciplines as well (e.g., Azar, 1999; Thu, 1999; Weisz & Hawley, 2001).
Grinder (1989) described a literal fallout observed in the field of educational psychology as a result of researchers' perceived differences in purposes: In the 1970s and 1980s, many researchers chose to withdraw from educational psychology and head in other disciplinary directions. In the last decade or so we have seen that sort of retreat in at least two professional organizations kindred to AERA. Perceiving the American Psychological Association as becoming more and more concerned with clinical and applied issues, researchers aligned with the scientific side of psychology helped to form the American Psychological Society (APS). Similarly, International Reading Association researchers and others who wished to focus on the scientific study of reading rather than on reading practitioners' problems founded a professional organization to represent that focus, the Society for the Scientific Study of Reading. Will history repeat itself, once again, in educational research?

Our message is a simple one: When it comes to recommending or prescribing educational, clinical, and social interventions based on research, standards of evidence credibility must occupy a position of preeminence. The core of the investigative framework we propose here is not new. Many educational researchers and methodologists concerned with the credibility of research-derived evidence and prescriptions have offered similar suggestions for years, if not decades: Harken back to Bereiter's (1965) trenchant analysis of the situation. Why, then, do we believe it important, if not imperative, to restate the case for scientifically credible intervention research at this time? Perhaps it is best summarized in a personal communication received from the educational researcher Herbert Walberg (May 11, 1999): "Live long enough and, like wide ties, you come back in style—this in a day when anecdotalism is the AERA research method of choice." A frightening state of affairs currently exists within the general domain of educational research and within its individual subdomains. It is time to convince the public, the press, and policy makers alike of the importance of credible evidence derived from CAREfully conducted research, delineating the characteristics critical to both its production and recognition. In this chapter we have taken a step toward that end by first attempting to convince educational/psychological intervention researchers of the same.

References
Abelson, R. P. (1995). Statistics as principled argument. Mahwah, NJ: Erlbaum.
Angell, M., & Kassirer, J. P. (1998). Alternative medicine: The risks of untested and unregulated remedies. New England Journal of Medicine.
Azar, B. (1999). Consortium of editors pushes shift in child research method. APA Monitor, 30(2), 20-21.
Barcikowski, R. S. (1981). Statistical power with group mean as the unit of analysis. Journal of Educational Statistics, 6, 267-285.
Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. Cambridge, England: Cambridge University Press.
Bereiter, C. (1965). Issues and dilemmas in developing training programs for educational researchers. In E. Guba & S. Elam (Eds.), The training and nurture of educational researchers (pp. 95-110). Bloomington, IN: Phi Delta Kappa.
Bereiter, C. (1994). Implications of postmodernism for science, or, science as progressive discourse. Educational Psychologist, 29, 3-12.
Boruch, R. F. (1975). On common contentions about randomized field experiments. In R. F. Boruch & H. W. Riecken (Eds.), Experimental testing of public policy. Boulder, CO: Westview Press.
Boyer, E. L. (1990). Scholarship reconsidered: Priorities of the professoriate. Princeton, NJ: Carnegie Foundation for the Advancement of Teaching.
Braun, T. M., & Feng, Z. (2001). Optimal permutation tests for the analysis of group randomized trials. Journal of the American Statistical Association.
Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. Journal of the Learning Sciences, 2, 141-178.
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage.
Byrne, B., & Fielding-Barnsley, R. (1991). Evaluation of a program to teach phonemic awareness to young children. Journal of Educational Psychology.
Calfee, R. (1992). Refining educational psychology: The case of the missing links. Educational Psychologist, 27, 163-175.
Campbell, D. T., & Boruch, R. F. (1975). Making the case for randomized assignment to treatments by considering the alternatives: Six ways in which quasi-experimental evaluations in compensatory education tend to underestimate effects. In C. A. Bennett & A. A. Lumsdaine (Eds.), Evaluation and experiment: Some critical issues in assessing social programs (pp. 195-296). New York: Academic Press.
Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Chambless, D. L., & Ollendick, T. H. (2001). Empirically supported psychological interventions: Controversies and evidence. Annual Review of Psychology.
Cliff, N. (1983). Some cautions concerning the application of causal modeling methods. Multivariate Behavioral Research, 18, 115-126.
Cobb, P., & Bowers, J. (1999). Cognitive and situated learning perspectives in theory and practice. Educational Researcher, 28(2), 4-15.
Cocco, N., & Sharpe, L. (1993). An auditory variant of eye-movement desensitization in a case of childhood posttraumatic stress disorder. Journal of Behaviour Therapy and Experimental Psychiatry, 24, 373-377.
Cohen, P. A., Kulik, J. A., & Kulik, C. C. (1982). Educational outcomes from tutoring: A meta-analysis of findings. American Educational Research Journal.
Cole, N. S. (1997). "The vision thing": Educational research and AERA in the 21st century: Pt. 2. Competing visions for enhancing the impact of educational research. Educational Researcher, 26(4), 13, 16-17.
Collins, A. (1992). Toward a design science of education. In E. Scanlon & T. O'Shea (Eds.), New directions in educational technology. Berlin: Springer-Verlag.
COMMIT Research Group. (1995). Community intervention trial for smoking cessation (COMMIT). American Journal of Public Health.
Copeland, W. D. (1991). Microcomputers and teaching actions in the context of historical inquiry. Journal of Educational Computing Research.
Derry, S., Levin, J. R., Osana, H. P., Jones, M. S., & Peterson, M. (2000). Fostering students' statistical and scientific thinking: Lessons learned from an innovative college course. American Educational Research Journal, 37, 747-773.
Donner, A., & Klar, N. (2000). Design and analysis of cluster randomization trials in health research. New York: Oxford University Press.
Donmoyer, R. (1993). Yes, but is it research? Educational Researcher.
Donmoyer, R. (1996). Educational research in an era of paradigm proliferation: What's a journal editor to do? Educational Researcher.
Doyle, W., & Carter, K. (1996). Educational psychology and the education of teachers: A reaction. Educational Psychologist, 31, 23-28.
Dressman, M. (1999). On the use and misuse of research evidence: Decoding two states' reading initiatives. Reading Research Quarterly, 34, 258-285.
Duffy, G. R., Roehler, L. R., Sivan, E., Rackliffe, G., Book, C., Meloth, M. S., Vavrus, L. G., Wesselman, R., Putnam, J., & Bassiri, D. (1987). Effects of explaining the reasoning associated with using reading strategies. Reading Research Quarterly, 22, 347-368.
Eisner, E. (1999). Rejoinder: A response to Tom Knapp. Educational Researcher, 28(1), 19-20.
Elashoff, J. D. (1969). Analysis of covariance: A delicate instrument. American Educational Research Journal, 6, 381-401.
Goode, E. (1999, March 19). New and old depression drugs are found equal. New York Times, pp. A1, A16.
Graziano, A. B., Peterson, M., & Shaw, G. L. (1999). Enhanced learning of proportional math through music training and spatial-temporal training. Neurological Research, 21, 139-152.
Greenwald, R. (1999). Eye movement desensitization and reprocessing (EMDR) in child and adolescent psychotherapy. Northvale, NJ: Jason Aronson.
Grinder, R. E. (1989). Educational psychology: The master science. In M. C. Wittrock & F. Farley (Eds.), The future of educational psychology: The challenges and opportunities (pp. 3-18). Hillsdale, NJ: Erlbaum.
Gross, P. R., Levitt, N., & Lewis, M. W. (1997). The flight from science and reason. New York: New York Academy of Sciences.
Halpern, D. F. (1996). Thought & knowledge: An introduction to critical thinking (3rd ed.). Mahwah, NJ: Erlbaum.
Hedges, L. V., & Stock, W. (1983). The effects of class size: An examination of rival hypotheses. American Educational Research Journal.