The Development of Cognitive Skills to Support Inquiry Learning
Author(s): Deanna Kuhn, John Black, Alla Keselman, Danielle Kaplan
Source: Cognition and Instruction, Vol. 18, No. 4 (2000), pp. 495-523
Published by: Taylor & Francis, Ltd.
Stable URL: http://www.jstor.org/stable/3233891
COGNITION AND INSTRUCTION, 18(4), 495-523
Copyright © 2000, Lawrence Erlbaum Associates, Inc.

The Development of Cognitive Skills To Support Inquiry Learning

Deanna Kuhn, John Black, Alla Keselman, Danielle Kaplan
Teachers College
Columbia University

Establishing the value of inquiry learning as an educational method, it is argued, rests on thorough, detailed knowledge of the cognitive skills it is intended to promote. Mental models, as representations of the reality being investigated in inquiry learning, stand to influence strategies applied to the task. In the research described here, the hypothesis is investigated that students at the middle school level, and sometimes well beyond, may have an incorrect mental model of multivariable causality (one in which effects of individual features on an outcome are neither consistent nor additive) that impedes the causal analysis involved in most forms of inquiry learning. An extended intervention with 6th to 8th graders was targeted to promote (a) at the metalevel, a correct mental model based on additive effects of individual features (indicated by identification of effects of individual features as the task objective); (b) also at the metalevel, metastrategic understanding of the need to control the influences of other features; and (c) at the performance level, consistent use of the controlled comparison strategy. Both metalevel advancements were observed, in addition to transfer to a new task at the performance level, among many (though not all) students. Findings support the claim that a developmental hierarchy of skills and understanding underlies, and should be identified as an objective of, inquiry learning.

The argument for inquiry learning as an educational tool is being heard increasingly, especially as the technology and materials to support this kind of educational experience have expanded and become widely available. Amidst the widespread enthusiasm, the strongest criticism to be heard is that such methods are inefficient.
Too little substantive knowledge is gained to justify the sizable expenditure of classroom time that such activities typically consume. But outweighing this criticism in a majority of educators' eyes are the potential benefits of the opportunities afforded students to engage in genuine inquiry. Highly favored in a recent National Research Council report (Bransford, Brown, & Cocking, 1999) is a method in which

students design studies, collect information, analyze data and construct evidence.... They then debate the conclusion that they derive from their evidence. In effect the students build and argue about theories.... Question posing, theorizing, and argumentation form the structure of the students' scientific activity.... The process as a whole provide[s] a richer, more scientifically grounded experience than the conventional focus on textbooks or laboratory demonstrations. (pp. 171-172)

In formulating questions, accessing and interpreting evidence, and coordinating it with theories, students are believed to develop the intellectual skills that will enable them to construct new knowledge (Chan, Burtis, & Bereiter, 1997). In addition, they ideally are also acquiring a set of intellectual values, values that deem activities of this sort to be worthwhile in general and personally useful. In the words of Resnick and Nelson-LeGall (1997), students who value intellectual inquiry

believe they have the right (and the obligation) to understand things and make things work ... believe that problems can be analyzed, that solutions often come from such analysis and that they are capable of that analysis ... have a toolkit of problem-analysis tools and good intuitions about when to use them ... know how to ask questions, seek help and get enough information to solve problems ... have habits of mind that lead them to actively use the toolkit of analysis skills. (pp. 149-150)

In short, students come to understand that they are able to acquire knowledge they desire, in virtually any content domain, in ways that they can initiate, manage, and execute on their own, and that such knowledge is empowering. This outcome is believed to justify the time devoted to development of these skills and dispositions within the context of what is typically a circumscribed topic of investigation.

Requests for reprints should be sent to Deanna Kuhn, Box 119, Teachers College, Columbia University, New York, NY 10027. E-mail: [email protected]
Is inquiry-based education capable of delivering on these promises? We argue here that the arguments supporting its merits rest on a critical assumption. The assumption is that students possess the cognitive skills that enable them to engage in these activities in a way that is profitable with respect to the objectives identified previously. If students lack the necessary skills, inquiry learning could in fact be counterproductive, leading students to frustration and to the conclusion that the world, in fact, is not analyzable and worth trying to understand, a conclusion that runs exactly opposite to the intellectual values that Resnick and Nelson-LeGall (1997) argued inquiry learning should promote.

At this point, it is necessary to become specific as to what we are referring to as inquiry learning because a wide range of educational practices have been described under this heading. Here, we define inquiry learning as an educational activity in which students individually or collectively investigate a set of phenomena, virtual or real, and draw conclusions about it. Students direct their own investigatory activity, but they may be prompted to formulate questions, plan their activity, and draw and justify conclusions about what they have learned (de Jong & van Joolingen, 1998).

Inquiry activities targeted to young children may have simple goals that do not extend beyond description, classification, or measurement of familiar phenomena. More typically, however, inquiry activities are designed for older children or adolescents and have, as their goal, the identification of causes and effects. The context is typically a multivariable one, such that the goal becomes one of identifying which variable or variables are responsible for an outcome or how a change in the level of one variable causes a change in one or more other variables in the system. Equally important is the identification of noncausal variables, so that these can be eliminated as sources of influence in understanding how the system functions.

Are students of the elementary and middle school grades (in which inquiry activities are most commonly introduced) capable of inferring such relations based on investigations of a multivariable system? There exists little educational research on students engaged in inquiry learning that would answer this question directly. Evidence that is available, on the other hand, from the literature on scientific reasoning suggests significant strategic weaknesses that have implications for inquiry activity (Klahr, 2000; Klahr, Fay, & Dunbar, 1993; Kuhn, Amsel, & O'Loughlin, 1988; Kuhn, Garcia-Mila, Zohar, & Andersen, 1995; Kuhn, Schauble, & Garcia-Mila, 1992; Schauble, 1990, 1996). Strategies, moreover, even though they have been the focus of attention in scientific reasoning research, may not be all, or even the most critical element, that is missing.
In this article, we raise the possibility that students at the middle school level, and sometimes well beyond, have an incorrect mental model that underlies strategic weaknesses, and that impedes the multivariable analysis required in the most common forms of inquiry learning. Like many mental models, this model may be resistant to revision.

MENTAL MODELS UNDERLYING INQUIRY LEARNING

Numerous lines of cognitive and cognitively oriented educational research emphasize mental models as vehicles that students employ in coming to understand the workings of a system (Gentner & Stevens, 1983; Vosniadou & Brewer, 1992). Such models facilitate (or sometimes interfere with) understanding of how a system operates. We use the mental model terminology here, however, in a more generic sense. It is students' mental model of causality itself, we claim, that may be deficient, rather than a mental model of the workings of any particular causal system. This incorrect mental model can be contrasted to a normative analysis of variance (ANOVA) model of causality in a multivariable system, a model in which individual variables each manifest their individual effects on one or more dependent variables. Such effects are normally additive, although one effect may in some cases influence (interact with) the effect of another variable.

If we expect students to understand the operation of a multivariable system, they must at least understand the concept of additive effects: effects that operate individually on a dependent variable but that are cumulative (additive) in their outcomes. A student who possesses this mental model of additive effects can understand much about a system and in many cases even predict outcomes fairly accurately without the more sophisticated concept of interaction effects as part of this model. The deficient mental model we describe here, in contrast, is one in which neither additive nor interactive effects are understood in a normative way.

The specific situation we refer to here in considering these mental models is one in which an outcome variable that can assume multiple levels on at least an ordinal scale (i.e., ordered from less to more of some quantity) is potentially affected by a set of independent variables, each of which can assume two different levels. For example, in the work described here, the variables of soil type (sand vs. clay), elevation (high vs. low), and water pollution (high vs. low) are among five potential features affecting an outcome variable, the amount of flooding, which can assume five different levels, from low flooding (1 ft) to high (5 ft). To investigate the system, a student has the opportunity to choose desired levels for each of the features and, once this is done, to observe the resulting outcome. The task presented to the student is to find out which features make a difference and which do not make a difference in determining the level of the outcome variable.
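The contrast between additive and interactive effects described in this section can be written out as a short worked equation. The notation is an assumption introduced here for illustration; the article states the model only in words.

```latex
% Illustrative notation, not taken from the article:
% x_i in {0, 1} codes the two levels of feature i, beta_i is that feature's
% individual effect, mu is the outcome when every x_i = 0, and y is the outcome.
\begin{align}
  y &= \mu + \sum_{i=1}^{5} \beta_i x_i
      && \text{(additive effects only)} \\
  y &= \mu + \sum_{i=1}^{5} \beta_i x_i + \sum_{i<j} \gamma_{ij}\, x_i x_j
      && \text{(additive effects plus two-way interactions)}
\end{align}
```

In this notation, a noncausal feature is one whose coefficient \beta_i (and every \gamma term involving it) is zero. Under the purely additive form, two instances that differ only in feature k differ in outcome by exactly \beta_k, whatever the levels of the other features.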
Students beginning to investigate such a system often focus exclusively on outcomes, achieving those deemed desirable and avoiding undesirable outcomes (Kuhn et al., 1995; Kuhn et al., 1992; Schauble, 1990; Schauble, Klopfer, & Raghavan, 1991). To make progress beyond an outcome focus, it is necessary to shift one's attention to what we can call an analysis focus, specifically, analysis in terms of the effects of individual features. Without the understanding that individual features will contribute their respective effects to the outcomes, the system cannot be analyzed and understood.

Consider now the mental model that might characterize the thinking of sixth-grader Matt (an actual case from the database of the research described here, although the student's name is changed). We label the five variable features of the system by number and the respective levels of each feature by the letters a or b. Matt makes the following claims. Based on observation of the instance 1a2a3a4a5a in conjunction with a positive outcome (O1), Matt concludes that all of these contributed to the good outcome (the site is minimally flooded). In other words, the sandy soil, the lack of pollution, the high elevation, and so forth, "all make a difference, because it came out good." Next, Matt examines the instance 1b2b3b4a5a (i.e., the levels of three of the features are changed from what they were in the first instance and remain the same for the other two features) and observes a poor outcome (high flooding). This time, Matt says, "None of them made a difference; it came out bad." We can infer from these statements that Matt is not using the expression make a difference in the normative way dictated by the analysis model. Instead, making a difference appears to mean "helping to produce a good outcome."

Such a model of multivariable causality accommodates the seeming paradox of a variable making a difference on some occasions (when the outcome is good) and not making a difference on others (when the outcome is poor), a state of affairs that we in fact have found to be common among, and not at all paradoxical for, many children of this age. In earlier work, for example, children of Matt's age who observed that sports balls with a certain type of surface produce a good serve half of the time and a poor serve half of the time, whereas balls with a different surface type produce the same results, often failed to make the normative inference that type of surface was noncausal with respect to this outcome variable. Instead, they concluded that the surface type "sometimes makes a difference" in the quality of the serve (Kuhn et al., 1988).

Formalizing this mental model, it can be described as stipulating the co-occurrence of a particular variable level and an outcome as a sufficient condition for implicating that variable as having played a role in the outcome (or, in the case of a negative outcome, excluding the variable as having played a role). We refer to this mental model as a co-occurrence model.

[Figure 1 panels. Left: Soil: Sand; Water Temperature: Hot. Right: Soil: Clay; Water Temperature: Hot.]
FIGURE 1 The co-occurrence mental model. Both features are implicated as causal in the outcome on the left and not implicated in the outcome on the right.

It is important to note that the variable level, not the variable itself, is implicated as causal in the co-occurrence model.
In the depiction in Figure 1, for example, it is the feature levels sandy soil and hot water (rather than soil type or water temperature, as features) that are implicated as causal in interpreting the successful outcome on the left-hand side of the figure. In interpreting the unsuccessful outcome on the right, the same water temperature is, this time, judged not to make a difference. Reflecting another form of inconsistency, rather than soil type making a difference, sand does (but clay does not) make a difference.

Causal attributions, then, fluctuate as functions of the particular constellation of feature levels that are present in a particular instance. Each constellation is a unique event (even though its components may be incompletely identified). Rather than representing a genuine interactive model, however, the co-occurrence model reflects failure to conceptualize even the main effects (of features as variables) on which statistical interaction effects are founded.

It should be noted finally that in addition to being inconsistent, effects of individual features are not additive in the co-occurrence model. Because co-occurrence of a particular feature level and an outcome is a sufficient condition for attributing causality, any co-occurring feature level may be implicated in what is regarded as a successful outcome. Implication of more co-occurring feature levels might be expected to produce an even more successful outcome, yet even one co-occurring feature is sufficient to explain even the most successful outcome.
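The co-occurrence rule formalized above can be made concrete with a minimal sketch applied to Matt's two instances. The function name and the dictionary encoding of instances are hypothetical, chosen only for illustration; the rule itself is the one the text describes, in which every feature level present is implicated when the outcome is good and none is when it is poor.

```python
from typing import Dict, Set

Instance = Dict[int, str]  # feature number -> level ("a" or "b")


def co_occurrence_judgment(instance: Instance, outcome_good: bool) -> Set[int]:
    """Features judged to 'make a difference' under the co-occurrence model.

    Hypothetical helper: a good outcome implicates every feature level
    present in the instance; a poor outcome implicates none of them.
    """
    return set(instance) if outcome_good else set()


# Matt's two instances, encoded from the text: 1a2a3a4a5a and 1b2b3b4a5a.
first = {1: "a", 2: "a", 3: "a", 4: "a", 5: "a"}
second = {1: "b", 2: "b", 3: "b", 4: "a", 5: "a"}

judged_first = co_occurrence_judgment(first, outcome_good=True)
judged_second = co_occurrence_judgment(second, outcome_good=False)

# Features whose level did not change between the two instances, yet whose
# judged causal status flipped: the inconsistency the text points out.
unchanged = {f for f in first if first[f] == second[f]}
inconsistent = sorted(
    f for f in unchanged if (f in judged_first) != (f in judged_second)
)
print(inconsistent)  # [4, 5]
```

Features 4 and 5 keep the same level in both instances but are judged causal in one and noncausal in the other, which is the sense in which attributions under this model are inconsistent.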
MENTAL MODELS OF CAUSALITY AND INVESTIGATIVE STRATEGIES

Mental models, as noted previously, may be resistant to change, and it is not clear what the most effective way might be to effect a transition from a co-occurrence to a genuine analysis model of multivariable causality. In previous research (Kuhn et al., 1995; Kuhn et al., 1992), we focused on the investigatory strategies students use and the resulting validity of their inferences. To make a valid inference, it is necessary to make a controlled comparison between two instances that differ only with respect to a single feature that is the focus of analysis. In research on scientific reasoning, the lion's share of attention has gone to this controlled comparison, or "all other things equal" investigation strategy, as the hallmark of skilled scientific reasoning (DeLoache, Miller, & Pierroutsakos, 1998; Klahr, 2000; Kuhn et al., 1988; Zimmerman, 2000). The investigator needs to recognize that to conduct a sound test of the effect of one variable, all other variables must be held constant, so that the effects of these other variables do not influence the outcome.

In our research (Kuhn et al., 1995; Kuhn et al., 1992), we have found that use of a controlled comparison strategy and the valid inferences that result from it increase in frequency over a period of months among preadolescents when they are given the opportunity to engage in self-directed investigatory activity of a multivariable system. Some students, however, even after many weeks of investigation, remain stubbornly fixed at a level of confounded investigations and fallacious inferences. The mental model ideas proposed here suggest a possible reason for their lack of progress.
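The controlled comparison criterion described above (two instances that differ only in the single feature under analysis) can be expressed as a small check. This is a sketch under stated assumptions: the record encoding, feature names, and function names are hypothetical and do not come from the study's software.

```python
from typing import Dict, List

Record = Dict[str, str]  # feature name -> chosen level


def differing_features(first: Record, second: Record) -> List[str]:
    """Names of the features on which two records take different levels."""
    return sorted(name for name in first if first[name] != second[name])


def is_controlled_comparison(first: Record, second: Record, focal: str) -> bool:
    """True only if the records differ on the focal feature and nothing else."""
    return differing_features(first, second) == [focal]


# Hypothetical records for illustration.
record_a = {
    "soil_type": "sand",
    "soil_depth": "deep",
    "water_temperature": "hot",
    "water_pollution": "low",
    "elevation": "high",
}
record_b = dict(record_a, soil_type="clay")                        # one change
record_c = dict(record_a, soil_type="clay", soil_depth="shallow")  # two changes

controlled = is_controlled_comparison(record_a, record_b, "soil_type")
confounded = is_controlled_comparison(record_a, record_c, "soil_type")
print(controlled, confounded)  # True False
```

Only the first pair supports a valid inference about soil type; in the second pair any difference in outcome could equally be due to soil depth.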
The analysis model of additive effects of individual variables is a logical prerequisite to the controlled comparison investigative strategy. This is so because the purpose of the latter is identification of the effect of a single variable. If one's mental model is not one of individual additive effects, neither attribute of the controlled comparison strategy is compelling. The "comparison" attribute is not compelling, given that it entails comparing the outcomes associated with two (or more) levels of a variable for the purpose of assessing the effect of that variable on outcome. Furthermore, the "controlled" attribute is even less compelling because it is the individual effects of other variables that need to be controlled. As we suggested previously, then, an incorrect mental model may underlie the strategic weaknesses that have been observed and impede the multivariable analysis central to inquiry learning.

As a procedure, the controlled comparison strategy is straightforward to teach ("Keep everything else the same and just change one thing"). By comparison, it is not easy to change mental models, and this would seem particularly so of the sort of generic model (of multivariable causality) that we discuss here. A number of studies over the years have undertaken teaching the use of the controlled comparison procedure in brief training sessions (Case, 1974; Chen & Klahr, 1999) with some degree of success, but such interventions are unlikely to effect change in underlying mental models of causality.
In our research (Kuhn & Angelev, 1976; Kuhn & Ho, 1980; Kuhn, Ho, & Adams, 1979; Kuhn & Phelps, 1982; Kuhn et al., 1988; Kuhn et al., 1995; Kuhn et al., 1992), we have focused on longer term interventions (typically 8-10 weekly sessions), with an objective of promoting not just change in the strategies students use to acquire new knowledge about a causal system (referred to later as knowing strategies), but enhancement of their metastrategic understanding of why these are the strategies that must be used and why others will not suffice. Execution of the controlled comparison strategy, as just noted, is relatively easy to teach, but it is metastrategic understanding that determines whether the strategy will be selected when the student is engaged in self-directed activity (Kuhn, in press-c).

The argument we make here is that this metastrategic understanding requires a correct mental model of how a multivariable causal system (again, in the generic sense of any causal system) operates. A strategy that has the purpose of assessing the effect of an individual feature will not be understood and valued unless one's mental model of the operation of a multivariable system is based on the additive effects of individual features. Once this analysis mental model of individual additive effects is attained, the learner is in a position to proceed to a more complex analysis model in which these individual effects are interactive in their influence on outcomes. In the absence of this analysis mental model in which individual variables assert their respective effects on an outcome in an additive manner, the controlled comparison strategy for assessing these effects can be taught, but its logic will not be compelling; there will not be a deep level of understanding as to why it must be used.
METALEVEL FUNCTIONING

One way to formalize this deep level of understanding as a construct is to postulate a metalevel of operation that is distinct from the performance level (Figure 2). The knowing strategies depicted in Figure 2 are those we regard as central to inquiry activity. The metalevel is the level at which particular knowing strategies are selected for use and their application monitored and the results interpreted (left-hand side of Figure 2). Understanding why to use a strategy, then, occurs at the metalevel. Moreover, it is this metalevel understanding that should govern not only the use of a strategy but its generalization to a new context in which it is applicable (Crowley & [...]).

Metalevel understanding, we can hypothesize, develops in parallel with strategic competence in a mutually facilitative relation. Exercise of strategies at the performance level feeds back and enhances the metalevel understanding that will guide subsequent strategy selection and, hence, performance. In other words, metalevel understanding both informs and is informed by strategic performance (Figure 2; see also Sophian, 1997).

Strategies exist only in relation to goals or objectives. Therefore, metalevel understanding of task objectives (metatask understanding) is as critical as metastrategic understanding of the strategies that are available to apply to the task (Kuhn & Pearsall, 1998; Siegler & Crowley, 1994). Both must be present and coordinated to guide performance successfully. The mental model of additive effects of individual variables, we have claimed, is essential for the controlled comparison investigative strategy. We can now elaborate that specifically it is necessary to metatask understanding of the task objective of identifying effects of individual variables. Without this understanding, the appropriate controlled comparison strategy will not be consistently selected.

[Figure 2 diagram. Center: knowing strategies (inquiry, analysis, inference, argument). One side: meta-level procedural knowing (What do knowing strategies accomplish? When, where, why to use them?), linked to competence to apply. Other side: meta-level declarative knowing (What is knowing? Facts, opinions, claims, theory, evidence), linked to disposition to apply (Is there something to find out? Can analysis be worthwhile? Are unexamined beliefs worth having? Is there a point to arguing?).]
FIGURE 2 Phases of inquiry activity, with hypothesized bidirectional relations between the metalevel and the performance level. Note. From "How Do People Know?" by D. Kuhn, in press, Psychological Science. Copyright 1999 by Blackwell. Reprinted with permission.

In the research presented in this article, we examine the extent to which the mental model transition (from an incorrect to correct model of multivariable causality) that is discussed here is facilitated by metalevel exercise that occurs in addition to and in conjunction with performance-level exercise of strategies. In past work, we have undertaken to promote the development of metalevel understanding by externalizing it in collaborative discussion among peers, a method that works under certain conditions (Kuhn, in press-c). Another method is to engage students more directly in metalevel exercise by asking them to evaluate different potential strategies that could be applied to a problem. The contemplation of alternative strategies should promote not only attention to task objectives but also the essential task of coordinating task objectives with available strategies. This direct approach, we have found, also meets with some success (Pearsall, 1999).

It is this latter approach that is used in the work presented here, but we do so with a particular focus on the question of whether it will promote the transition to the more correct additive mental model of causality. As part of the metastrategic evaluation exercise, students are presented the situation of two individuals who disagree as to the effect of a particular feature with one individual, for example, claiming that soil type makes a difference and the other claiming that it does not. The students must then consider and evaluate the strategies that could be used to resolve the conflict.
Note that the conflict is explicitly identified as one about the effect of a particular, individual feature. To what extent, we asked, would extended experience with the evaluation of such conflicts promote (a) at the metalevel, a mental model based on the effects of individual features, reflected in metatask understanding that the object of the activity is identification of effects of individual features; (b) metastrategic understanding of the need to control the influences of other features (the controlled comparison strategy); (c) at the performance level, successful use of the controlled comparison strategy; (d) resulting valid inferences regarding the status of causal and noncausal features in the system; and (e) superior acquisition of knowledge about the system, reflected in correct conclusions about its causal structure? Our past research indicated that performance-level exercise of investigative activity (with no feedback beyond that provided by the student's own activity) over a period of weeks is sufficient to induce some change on at least some of these dimensions among a majority of students. We, therefore, compare two conditions: one in which students engage only in this performance-level exercise and another in which students also engage in the metalevel exercise, described more fully subsequently.
METHOD

Participants were 42 middle school (6th, 7th, and 8th grade) students attending an urban public school. They came from two comparable intact science classes of mixed-grade (6th-8th) level. Each class participated over the same several-month period as part of their science curriculum. One class was arbitrarily chosen to serve as an experimental group, and the other class served as a control group. The former group consisted of 10 boys and 11 girls, and the latter had 12 boys and 9 girls. Students were of diverse ethnicity, with the majority being African American or Hispanic.

Task Environment

The main task, which students engaged in repeatedly both individually and in dyads during the course of the study, is a multimedia research program, created with the Macromedia Director authoring tool. The program supports self-directed investigation of a multivariable environment consisting of a set of instances available for investigation, with instances defined by five variable features and an outcome, the degree of flooding of a building site.

Students are placed in the role of builders working for TC Construction Company, which builds cabins along the shore of a series of small lakes. The area is susceptible to flooding, and the cabins are, therefore, built on supports that raise them above the ground. It is the student's task to identify the optimum height of the supports for various buildings. It is explained in the introductory online presentation that the supports should be neither higher than necessary (to avoid unnecessary building expense) nor lower than necessary (to avoid flooding and resulting damage to the building). Students are given a bank account at the beginning of their work, with money subtracted for incorrect predictions (of how much flooding will occur at that site and, therefore, how high the supports need to be built) and a bonus received for correct predictions.
The only way for students to generate correct predictions is to investigate effects of the five variable features on amount of flooding and draw appropriate inferences. Following an introductory session in which the program is introduced and students' initial beliefs assessed regarding the five variable features that may influence flooding, the student embarks on a series of investigatory sessions. The program includes the following sequence of activities: statement of investigatory intent (students indicate which features they intend to find out about), selection of feature levels in instances to be examined, prediction of outcomes, the opportunity to make inferences and justify them, and the option of making notes in an online notebook. During the second and subsequent sequences, the feature levels and outcome of the immediately preceding instance remain visible to facilitate comparisons. The sequence is repeated five times during each session. At the end of a session, students are asked to draw conclusions about the causal and noncausal effects operating in the system. Students' activity within the program is tracked and recorded into word processing files by the program.

The causal structure of the task environment is shown in Table 1. Two of the five features are noncausal (i.e., have no effect on outcome). The other three features are causal, with an interactive effect between two features.

TABLE 1
Causal Structure of Flood Problem

Feature (levels)                    Effect on flood level
Water pollution (high or low)       No effect
Water temperature (hot or cold)     Cold raises the flood level 1 ft
Soil depth (deep or shallow)        Shallow raises the flood level 2 ft
Soil type (clay or sand)            Sand reduces the flood level 1 ft for deep soil only
Elevation (high or low)             No effect

A second task was employed as a transfer task, to assess the generality of changes in students' strategies and understanding as a function of their work on the main task. The transfer task was identical to the main task in structure and computer interface. The content involved the effects of various features on job applicants' potential effectiveness as a teacher's aide in a classroom.

Pretest assessment. Students from both classes participated in individually administered pretests. Following introduction of the program and assessment of initial beliefs, the initial investigatory session took place. During that session, the student repeated the investigatory cycle (selections of feature levels, prediction of outcome, inference, and justification) five times. Students worked one-on-one with a researcher during that session, so that any questions or misunderstandings could be addressed. An identical pretest assessment was administered for the transfer task.

Performance-level exercise. A 2-week school vacation intervened between completion of pretest assessments and commencement of the main phase of the study. During that phase, participants worked in changing dyads in a series of 9 to 10 sessions that took place over a period of roughly 6 weeks, with an average of two sessions per week (and a range of 1-3, due to absences and scheduling constraints). Assignment to dyads was random except for the constraint of avoiding, as far as possible, pairing of the same two students for more than one session. At the beginning of the pair sessions, students were instructed to work collaboratively rather than in turns, to discuss their views as to how to proceed or what to conclude, and not to proceed until some agreement was reached. At each session, the pair worked collaboratively on the flood task, with an adult available for consultation if problems arose, but the adult otherwise did not intervene.

Metalevel exercise. In addition, students in the experimental condition engaged in a series of paper-and-pencil exercises related to the flood task, which they worked on in pairs within the classroom, twice each week for the duration of the period that they were working on the flood program. Pairing varied across occasions, and students were instructed to work together and agree on an answer before writing it down. Students completed one exercise per session. A sample exercise is shown in Table 2. In that example, the comparison is confounded (the record shown differs from the previous record with respect to two features) and the outcome varies. In other cases, the comparison was controlled and the outcomes either varied or remained constant.

TABLE 2
Sample Metalevel Exercise

This is Terry and Jamie's work:
[Record display showing levels of soil type, soil depth, water pollution, water temperature, and elevation; values not legible.]
Terry says soil type does make a difference.
What can settle the argument between them?

Do the records they looked at say anything about whether soil type does or does not make a difference? (circle one)
    Yes    No
What do the records suggest?
    Soil type makes a difference
    Soil type does not make a difference

What was different about this record and the last record they looked at?
    Were they different on soil type?            Same    Different
    Were they different on water pollution?      Same    Different
    Were they different on water temperature?    Same    Different
    Were they different on soil depth?           Same    Different
    Were they different on elevation?            Same    Different

Did the two records have different amounts of flooding? (circle one or more)
    Because of soil type
    Because of water pollution
    Because of water temperature
    Because of soil depth
    Because of elevation

Do the records they looked at say anything about whether soil type does or does not make a difference? (circle one)
    Yes    No
What do the records suggest?
    Soil type makes a difference
    Soil type does not make a difference

What grade would you give Terry and Jamie on their work? (circle one)
    A    B    C    D    F
Why do they deserve this grade?

And they wanted to find out FOR SURE if soil type makes a difference.
What record should they look at next, to be sure? (Circle your choices.)
If the second record comes out different from the first, what will the reason be?
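The causal structure in Table 1 and the confounded comparison in the sample exercise can be sketched together in a few lines. This is only an illustration under stated assumptions: the baseline flood level of 2 ft is not given in the article and is chosen here so that outcomes span the 1 ft to 5 ft range mentioned earlier, and the feature names, record encoding, and function names are hypothetical.

```python
from itertools import product
from typing import Dict, List

Record = Dict[str, str]  # feature name -> chosen level

# Assumption: flood level for hot water and deep clay soil. Not stated in the
# article; 2 ft makes the possible outcomes run from 1 ft to 5 ft.
BASELINE_FT = 2

LEVELS = {
    "water_pollution": ("high", "low"),
    "water_temperature": ("hot", "cold"),
    "soil_depth": ("deep", "shallow"),
    "soil_type": ("clay", "sand"),
    "elevation": ("high", "low"),
}


def flood_level(record: Record) -> int:
    """Flood level in feet implied by the Table 1 effects."""
    level = BASELINE_FT
    if record["water_temperature"] == "cold":
        level += 1  # cold raises the flood level 1 ft
    if record["soil_depth"] == "shallow":
        level += 2  # shallow raises the flood level 2 ft
    if record["soil_type"] == "sand" and record["soil_depth"] == "deep":
        level -= 1  # sand reduces the level 1 ft, for deep soil only
    return level    # water pollution and elevation have no effect


def differing_features(first: Record, second: Record) -> List[str]:
    """Names of the features on which two records take different levels."""
    return sorted(name for name in LEVELS if first[name] != second[name])


previous = {
    "water_pollution": "low",
    "water_temperature": "hot",
    "soil_depth": "deep",
    "soil_type": "sand",
    "elevation": "high",
}
confounded = dict(previous, soil_type="clay", soil_depth="shallow")
controlled = dict(previous, soil_type="clay")

# Every outcome the assumed model can produce across all 32 feature combinations.
all_levels = sorted(
    {flood_level(dict(zip(LEVELS, combo))) for combo in product(*LEVELS.values())}
)

print(differing_features(previous, confounded),
      flood_level(previous), flood_level(confounded))
print(differing_features(previous, controlled),
      flood_level(previous), flood_level(controlled))
print(all_levels)
```

In the confounded pair the outcome rises from 1 ft to 4 ft, but because soil depth changed along with soil type the records cannot settle a dispute about soil type. The controlled pair isolates the 1 ft effect of soil type on deep soil.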
Posttest assessment. The posttest assessment was conducted individually and duplicated the pretest assessment. Posttest assessments took place during the 2 weeks following completion of the intervention period.

Delayed posttest assessment of metalevel understanding. Approximately 1 week following the completion of posttest assessments, a paper-and-pencil measure was administered during class time by the classroom teacher in each of the classes. The researchers were not present during this administration. One student in the experimental condition and 5 students in the control condition were absent on the administration day and did not receive this assessment.

This measure was designed to assess metatask understanding of the task goal (identifying effects of individual features) and metastrategic understanding of the critical strategy (controlled comparison) that allowed this goal to be met. To serve as the most rigorous test of understanding, this measure was based on the content of the transfer (teacher aide) task rather than on the content of the main task (used in the intervention activities). The student was asked which of two records would be the better one to look at next: Pat's choice (which represented a controlled comparison relative to the initial record available) or Lee's choice (which represented a confounded comparison with respect to two features). The student was asked to justify why this was "a better plan for finding out." In addition, the student was

RESULTS

Prediction error. A quantitative measure of performance is the degree of error in predicting outcomes. Average prediction error decreased from 1.23 errors at the pretest to 0.96 errors at the posttest (with one unit of error equaling a mismatch of 1 ft. between the predicted level of flooding and the actual level). This decline was significant, F(1, 40) = 4.54, p = .039, and did not differ by experimental condition.
Mean prediction error on the transfer task similarly decreased from 1.05 at the pretest to 0.74 at the posttest. This difference was also significant, F(1, 40) = 7.87, p = .008, and did not differ by experimental condition. Thus, students in both groups learned something about the causal system that was observable in their performance.

Valid inference. A more qualitative picture of performance is provided by strategy
of controlled comparison is not straightforward to assess because students did not always make the appropriate comparisons, even when they had selected for examination data that would allow them to make an informative comparison. Therefore, we were conservative in assessment of use of the controlled comparison strategy, judging it present only when students drew a justified inference, that is, drew a correct conclusion based on comparison of two instances that they had generated and that they referred to in justifying the conclusion.

The number of inferences justified by an appropriate controlled comparison of two instances (henceforth called valid inferences) was examined relative to number of possible inferences. This proportion of valid inferences was calculated for each student for the main and transfer tasks at pre- and posttest assessments. As seen in Table 3, patterns are similar for the two tasks. Students in both conditions show a low level of valid inference at the pretest, and both groups show improvement from pretest to posttest, with the experimental group showing somewhat greater improvement than the control group. The proportions summarized in Table 3 were subjected to arcsine transformation and analyzed by a repeated measures ANOVA with time of testing a within-subjects factor and experimental condition a between-subject factor. For the main task, time of testing was significant, F(1, 41) = 20.58, p < .001, but neither condition nor the interaction of time and condition reached significance. For the transfer task, time of testing was significant, F(1, 41) = 19.21, p < .001, and the Time x Condition interaction was marginally significant, F(1, 41) = 2.84, p = .10.

TABLE 3
Proportion of Valid Inferences

Group                        Pretest    Posttest
Main task
  Experimental group(a)
    M                          .06        .45
    SD                         .11        .42
  Control group(a)
    M                          .12        .33
    SD                         .19        .42
  Total group(b)
    M                          .09        .39
    SD                         .15        .42
Transfer task
  Experimental group(a)
    M                          .00        .43
    SD                         .00        .51
  Control group(a)
    M                          .10        .29
    SD                         .30        .46
  Total group(b)
    M                          .05        .36
    SD                         .26        .48

(a) N = 21; (b) N = 42.

A decline in the number of inferences made also reflects improved performance. A student who declines to make an inference (choosing the "haven't found out" option) recognizes that the evidence he or she has generated does not allow for a definitive conclusion. The average number of inferences made per session was overall slightly below four (of a possible five). As seen in Table 4, this number declined noticeably only among the experimental group and more so on the transfer task than the main task. A repeated measures ANOVA yielded no significant effects for the main task. On the transfer task, however, the interaction effects of both time, F(1, 41) = 7.01, p = .01, and Time x Condition, F(1, 41) = 4.37, p = .04, were significant.

TABLE 4
Mean Number of Inferences per Instance Examined

Group                        Pretest    Posttest
Main task
  Experimental group(a)
    M                          3.77       3.33
    SD                         1.32       1.66
  Control group(a)
    M                          3.98       4.02
    SD                         1.11       1.32
  Total group(b)
    M                          3.88       3.67
    SD                         1.21       1.51
Transfer task
  Experimental group(a)
    M                          3.86       2.64
    SD                         1.57       1.92
  Control group(a)
    M                          3.64       3.50
    SD                         1.57       1.86
  Total group(b)
    M                          3.75       3.07
    SD                         1.55       1.92

(a) N = 21; (b) N = 42.

Some additional insight is gained by qualitative examination of patterns of change from pretest to posttest. These are summarized in Table 5, which shows the distribution of students showing no valid inference, a mixture of valid and invalid inference, and all valid inference at the two times for the main task. As seen in Table 5, the majority of students show no valid inference at the pretest, and just less than half do not improve in this respect. Improvement, however, is more frequent in the experimental group. Mixture of valid and invalid inference is a common pattern at both times, consistent with previous research (Chen & Klahr, 1999; Crowley & Siegler, 1999; Kuhn et al., 1995). Results for the transfer task are similar, with slightly lower frequencies of valid inference usage at the posttest (9 students in the experimental group and 6 in the control group showing some or all valid inferences).

Understanding inferred from performance. An indirect measure of students' understanding of the task objective is provided by their responses to the query regarding which features they intended to find out about, posed at the beginning of each investigative sequence. Did students understand the need to focus their investigative efforts on a single feature at a time? If so, this understanding should be reflected in answers to this question. A decline in the number of features for which a student expressed an intent (to investigate) in examining a single instance of evidence should reflect increased understanding of the need to focus on single features. Therefore, we compared mean number of intents (to investigate a feature) per instance at pretest and posttest assessments.

These means are shown in Table 6 for the two conditions and times of testing. As seen there, despite differences attributable to chance at pretest, number of intents declines over time, with the most sizable decline in the experimental group on the main task. An ANOVA yielded significant effects for the main task for both time, F(1, 41) = 60.94, p < .001, and the Time x Condition interaction, F(1, 41) = 6.75, p = .013. For the transfer task, only the effect for time was significant, F(1, 41) = 5.92, p = .02.
Also relevant are the number of students for whom the mean number of intents declined to less than 2, indicating that at least some of the time this student had the intent of investigating a single feature. At the posttest, these frequencies were 15 (71% of participants) for the experimental group and 12 (57%) for the control group on the main task. This difference was not maintained in the transfer task, however. Frequencies were 11 (52%) and 14 (67%) for experimental and control groups, respectively.

TABLE 5
Pre- and Posttest Distributions of Participants by Patterns of Valid Inferences (Main Task)

Group                  No Valid Inferences   Some Valid Inferences   All Valid Inferences
Experimental group
  Pretest                      16                      5                      0
  Posttest                      8                      7                      6
Control group
  Pretest                      14                      7                      0
  Posttest                     12                      4                      5

TABLE 6
Mean Number of Intents per Instance Examined

Group                        Pretest    Posttest
Main task
  Experimental group(a)
    M                          3.74       2.00
    SD                         0.87       0.72
  Control group(a)
    M                          3.05       2.24
    SD                         0.89       1.08
  Total group(b)
    M                          3.40       2.12
    SD                         0.94       0.91
Transfer task
  Experimental group(a)
    M                          3.10       2.45
    SD                         1.5        1.41
  Control group(a)
    M                          2.57       2.02
    SD                         1.12       1.16
  Total group(b)
    M                          2.83       2.24
    SD                         1.33       1.29

(a) N = 21; (b) N = 42.

Relation of understanding to strategies. Qualitative analysis of patterns of performance indicated that focus on a single feature at a time as an investigatory intent at the posttest was associated with better strategies at the performance level. Of 10 participants (6 experimental and 4 control) who showed consistent single-feature investigatory intent at the posttest, all displayed valid inference at the posttest. Of the 6 experimental and 12 control participants who displayed no valid inference, conversely, none displayed single-feature investigatory intent. These students either intended to investigate multiple features at once, shifted their intent from one feature to another before the necessary evidence had been generated with respect to the first feature, or expressed no investigatory intent ("didn't know" what they were going to find out).
Direct assessment of understanding. The metalevel assessment measure was designed to provide a direct measure of what participants understood at the final assessment with respect to (a) the objective of the task and (b) why controlled comparison was the best strategy for achieving that objective. Both of these were assessed in a content domain other than the one in which students had had exercise.

Students who scored at the highest level (Level 3) chose Pat's plan (which allows unconfounded comparison) as the better one and were able to answer both questions about Pat's plan correctly: why it is better than Lee's plan (metastrategic understanding) and what Pat is intending to find out (metatask understanding). Typical of the correct answers to the first question were "because he only changed one thing" or "because everything is the same except age," although a few students showed very clear metastrategic understanding reflected in an answer such as "If you change only one and it makes a difference then you know what made the change." Typical of the correct answers to the second question were "if age makes a difference" or "if an older or younger teacher aide is better."

Students categorized as Level 2 chose Pat's plan as the better one but answered only one of the questions correctly, responding "I don't know" to the other or giving a vague answer (e.g., "She'll find out if her plan is better than Lee's").

Students categorized as Level 1 chose Pat's plan as the better one but offered no relevant justification (e.g., "Pat's plan is better because being a parent means she knows how to take care of her students").

Table 7 shows the number of students in each group categorized at each level. All students in the experimental group, it is seen, recognized Pat's plan as better, and all but 2 students in the control group did so.
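The scoring levels just described can be written out as a small rubric. The sketch below is an illustration, not the authors' coding procedure: the function and variable names are hypothetical, Level 0 is assumed to mean that Pat's plan was not chosen (the text does not define it explicitly), and the counts are those reported in Table 7.

```python
def metalevel_level(chose_pat, strategy_answer_correct, task_answer_correct):
    """Map one student's responses to a level from 0 to 3.

    Assumption: Level 0 means the student did not choose Pat's plan.
    Levels 1-3 count how many of the two questions about Pat's plan
    (why it is better; what Pat intends to find out) were answered correctly.
    """
    if not chose_pat:
        return 0
    return 1 + int(strategy_answer_correct) + int(task_answer_correct)


# Counts of students per level, as reported in Table 7.
TABLE_7 = {
    "experimental": {3: 11, 2: 1, 1: 8, 0: 0},
    "control": {3: 6, 2: 3, 1: 5, 0: 2},
}


def share_at_level(counts, level):
    """Proportion of a group's assessed students who reached the given level."""
    return counts[level] / sum(counts.values())


for group, counts in TABLE_7.items():
    print(group, sum(counts.values()), f"{share_at_level(counts, 3):.1%}")
```

Under these counts the Level 3 shares are 11 of 20 and 6 of 16 students, consistent with the 55% and 38% figures quoted for the two groups.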
The number of students who were able to justify the superiority of Pat's plan in meeting the task objectives, however, is significantly higher among students in the experimental group: 55% versus 38%, χ²(1, N = 36) = 7.60, p < .01. These results suggest that (a) overall, students' implicit understanding (reflected in the correct choice of Pat's plan) outstrips their explicit understanding (reflected in their justifications of the choice); and (b) the experimental condition facilitates the development of metalevel understanding.

TABLE 7
Number of Students at Each Level of Performance on the Metalevel Assessment Measure

Group                 Level 3   Level 2   Level 1   Level 0
Experimental group      11         1         8         0
Control group            6         3         5         2

Results also indicate that metastrategic understanding may remain incomplete even among students who show considerable understanding by correctly answering the two questions described. In response to the question "What will Lee find out with Lee's plan?" only 4 students in the experimental group and 5 students in the control group answered correctly, typically by identifying the limitation of the noncontrolled comparison strategy (e.g., "She won't find out anything because she won't know what caused the change"). The less common answer "He'll find out if anything matters" was also counted as correct. Others, when asked about Lee's plan, not only did not acknowledge its inferiority (e.g., "She will find out the same") but also indicated potentially productive outcomes of the plan (e.g., "She'll find out if the totally opposite person will make a difference"). The latter response, we would claim, invokes the faulty co-occurrence mental model of analysis via feature levels, rather than features.

Two measures of the posttest knowledge that students exhibited about the system following their investigations were examined. One was the total number of features they implicated as causal in interpreting outcomes. The other was the correctness of their conclusions as to which of the features were causal and which were noncausal.

At the pretest for the main task, students implicated a mean of 2.69 features as having causal status (compared to the correct number of 3). Following their investigations with the flood program, the mean number of features implicated declined to 2.22, a significant decrease, F(1, 39) = 4.68, p = .037. (This decrease did not differ significantly across conditions.) In this respect, then, students became less correct following investigation.

However, this conclusion must be tempered by the knowledge that students displayed as to which features were causal and which were noncausal. These findings are examined only for the main task. (Students' knowledge would not be expected to increase appreciably in the transfer task, given their limited exposure to it.)
With respect to both noncausal features (water pollution and elevation), there was an increase from pretest to posttest in the number of correct conclusions, indicating improved knowledge about the causal system. Many students, however, maintained their incorrect beliefs that these features had causal status. For water pollution, the number of students exhibiting correct conclusions increased from 10 to 26 (of a total group of 42). For elevation, the number increased from 12 to 18. With respect to the causal feature water temperature, correct conclusions regarding the direction and nature of its causal status increased from 8 at the pretest to 18 at the posttest. (The remainder most commonly judged the feature noncausal, although a few students judged it causal but in the incorrect direction, or chose an "it depends" option.) Similarly, correct conclusions regarding the soil type feature increased from 9 at the pretest to 19 at the posttest, with most of the remaining students judging the feature noncausal, but 1 student correctly stipulated an interaction effect with soil depth. Soil type was initially (and correctly) judged causal by the largest number
of students: 23. This number increased to 33 at the posttest, with a few students nonetheless retaining incorrect beliefs. Thus, students' interaction with the program over time enabled both groups to increase their knowledge of the causal system. The retention of incorrect beliefs, despite the substantial amount of evidence each participant generated, however, was common and did not differ significantly across conditions.

DISCUSSION

Increasingly, "authentic" scientific activity is being promoted as a model of good science education (Bransford et al., 1999; Cavalli-Sforza, Weiner, & Lesgold, 1994; Eisenhart, Finkel, & Marion, 1996; McGinn & Roth, 1999; Palincsar & Magnusson, in press). Such activity is contrasted to the allegedly more superficial observation, description, and laboratory exercises with well-known outcomes that long have been the staple of even the best science education. Students must engage in the genuine inquiry, it is argued (involving the formulation of questions, design of investigations, and coordination of theory and evidence with respect to multivariable systems), that is characteristic of real science.

The data presented here suggest that the skills required to engage effectively in typical forms of inquiry learning cannot be assumed to be in place by early adolescence. If students are to investigate, analyze, and accurately represent a multivariable system, they must be able to conceptualize multiple variables additively coacting on an outcome. Our results indicate that many young adolescents find a model of multivariable causality challenging. Correspondingly, the strategies they exhibit for accessing, examining, and interpreting evidence pertinent to such a model are far from optimal. We turn later to curriculum implications that we believe follow from these findings and consider first what the results suggest regarding the nature of these cognitive competencies and how they develop.

What Develops?
The performance skills (notably the controlled comparison strategy) that have been the focus of attention in research on scientific reasoning arguably are only one piece of a complex structure of related skills that undergoes development. This structure needs to be defined both horizontally (with respect to the components it includes) and vertically (with respect to first its emergent and ultimately its consolidated forms). An attempt to depict the horizontal structure appears in Figure 2, presented earlier. Key components of this model are (a) the full cycle of inquiry activity, beginning with the critical skill of identifying the questions to be asked and culminating in the advancement of claims in argumentive discourse; (b) the metalevel of understanding (of both strategies, depicted on the left side of Figure 2, and knowledge, depicted on the right side) that both directs and is influenced by performance, as discussed earlier; and (c) values associated with inquiry activity, highlighted by Resnick and Nelson-LeGall (1997) and discussed earlier. Related to values and also represented on the right side of Figure 2 is metalevel epistemological understanding of the nature of one's own and others' knowledge and knowing (Kuhn, Cheney, & Weinstock, in press). The broad implication to be drawn from Figure 2 is that there is more to effective knowing than the performance skills themselves (Kuhn, in press-b).

Vertical specification refers to the fact that a complex structure of this sort does not emerge fully formed but, more likely, undergoes a gradual evolution. Research with young elementary school children (Lehrer & Schauble, in press) has made it clear that even very basic forms of organizing and representing data (such as the frequencies of a set of possible outcomes) pose challenges to young children, and the relevant understandings and skills must be painstakingly constructed. In this sense, the finding highlighted in this work (that slightly older children have difficulty in representing relations between multiple antecedent variables and multiple outcomes) should not be surprising. At the other end of the vertical continuum, it is relevant to note that in earlier research (Kuhn et al., 1995), adult community college students who were readily able to use the controlled comparison strategy to identify effects of individual features nonetheless often had trouble explaining outcomes that were the additive product of two individual effects and fluctuated from one feature to the other in accounting for the outcome, seeing it as their task to explain which single feature had produced the outcome. Recognizing their simultaneous additive influence was a conceptual hurdle that rivaled in difficulty the conceptual hurdle posed by interaction effects.
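The idea of several features additively acting on one outcome can be made concrete with a toy model. Everything in this sketch is hypothetical: the baseline, the effect sizes in feet, and the choice of which features matter are invented for illustration and are not the parameters of the flood program used in the study.

```python
# Hypothetical additive system: each feature level adds a fixed amount (in ft)
# to a baseline flood level. These numbers are invented for illustration.
BASELINE_FT = 1
EFFECTS_FT = {
    "water_temperature": {"cold": 0, "hot": 2},
    "soil_depth": {"deep": 0, "shallow": 1},
    "elevation": {"low": 0, "high": 0},   # a noncausal feature: no effect
}


def flood_level(record):
    """Deterministic outcome: baseline plus the sum of individual effects."""
    return BASELINE_FT + sum(levels[record[feature]]
                             for feature, levels in EFFECTS_FT.items())


base = {"water_temperature": "cold", "soil_depth": "deep", "elevation": "low"}
hot = dict(base, water_temperature="hot")            # one feature changed
shallow = dict(base, soil_depth="shallow")           # a different one changed
both = dict(base, water_temperature="hot", soil_depth="shallow")

temp_effect = flood_level(hot) - flood_level(base)
depth_effect = flood_level(shallow) - flood_level(base)
joint_change = flood_level(both) - flood_level(base)

# Under additivity, the joint change is the sum of the two individual effects,
# so neither feature alone "explains" the outcome of the last record.
print(temp_effect, depth_effect, joint_change)
```

In such a system the two controlled comparisons each isolate one effect, while the outcome of the record that differs on both features can only be explained by both effects together, which is the step the text reports many students found difficult.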
Unrepresented in the inquiry activity in which students engaged in this work is the further conceptual challenge that is posed when outcomes are not deterministic (as they were in our activity) but rather are a probabilistic distribution around some central tendency. Students of any age will not be successful in understanding interactive influences on probabilistic outcomes until they have mastered the more elementary model on which we focus here, involving multiple effects additively acting on an outcome.

Mental Models

Mental models of any sort remain essentially unobservable theoretical constructs. Performance indicators of various types serve as evidence that a particular mental model is in operation, but no empirical data can indicate with certainty the operation of a particular mental model. In inquiry activities, mental models are the individual's representation of the (virtual or actual) reality that is being investigated. For this reason, they are likely to influence the strategies that are brought to bear on
the task. Nonetheless, we cannot say with certainty that it was revision in mental models that brought about the changes observed over time in this work. Such revision could be an effect rather than a cause. Nor would we want to claim that the kind of intervention undertaken in this work represents the only sound approach to facilitating development of the cognitive competencies we have identified as involved in inquiry learning. However, this intervention was targeted at the metalevel of cognition depicted in Figure 2, and we do want to claim that this level of understanding about inquiry, in contrast to the "understanding how" emphasized in performance-focused interventions, plays an essential role in effecting change. Metalevel understanding can come about as a product of the exercise of performance skills, as well as by direct targeting, but it cannot be bypassed.

The importance of this metalevel of understanding about inquiry is also underscored by the fact that in most of the knowledge seeking that students may engage in outside of a formal school setting, they are unlikely to have the opportunity to devise and execute controlled experiments. Much more often, they will be in a position of interpreting evidence derived from partially controlled or natural experiment data (Kuhn & Brannock, 1977). It is all the more important, then, that their interpretations not be compromised by an inadequate mental representation of the multivariable causality that such data are likely to reflect. Equally critical is metalevel understanding of the strengths and weaknesses of inference strategies that may be effective, effective but inefficient, or ineffective and fallacious. Again, what to do (when controlled experimentation is possible) is only one piece of a larger knowledge structure that includes what not to do and why, as well as what to conclude when controlled experiment is not feasible: to know when we do not know, when we have a way to find out, and when we will never know (Kuhn, in press-b).
Patterns and Mechanisms of Change

The results presented here confirm earlier research (Kuhn et al., 1995; Kuhn et al., 1992; Schauble, 1990, 1996) indicating that exercise can be a sufficient condition to induce strategic change, both in increasing the frequency of effective strategies and decreasing the frequency of ineffective ones. This work extends these findings to metalevel understanding of task and strategies and the mental models of causality associated with them. In addition to performance, metalevel understanding (measured both directly and indirectly, the latter via investigatory intent) increases with exercise. This change at dual levels supports the kind of continuous feedback model depicted in Figure 2.

An additional finding of this work is that exercise directly at the metalevel (in the experimental condition) further enhances change. These benefits (indicated by significant effects of condition) were seen either specifically at the metalevel (in
both direct and indirect measures) or in the transfer to a new task at the performance level. (Condition differences, recall, did not reach significance at the performance level for the main task, though they were in the expected direction.) The social component of the exercise at both performance level and metalevel (in both cases, students worked in pairs), it should be noted, in itself provides a weak form of metalevel exercise for students in both conditions. If partners are suitably matched, students show higher levels of performance when working with a partner than they do when working alone on the same task (Kuhn, in press-c). The externalization of metalevel decisions in social dialogue presumably supports this normally covert level of processing. We did not make this comparison (between social and solitary conditions) in this study, however, because we wished to identify the effect of direct metalevel exercise.

More specific than this general model of dual-level change are the particular metalevel understandings and performance-level strategies that were the object of the present research. Although understanding of task objectives is critical to performance of most cognitive tasks (Kuhn & Pearsall, 1998; Schauble et al., 1991; Siegler & Crowley, 1994), in this case we have argued specifically that metalevel understanding of the task objective of identifying the effects of individual features (a) requires a correct mental model of multivariable causality and (b) is a prerequisite for consistent choice of the controlled comparison strategy. Logically, the value and power of the controlled comparison strategy cannot be appreciated in the absence of this mental model. Empirically, our data support this claim.
Progress in understanding the task objective as one of identifying the effects of each of the individual features (which we took as an index of an accurate mental model of multivariable causality) showed significant effects of both time and experimental condition and, in analyses of individual patterns, was associated with good strategy usage. An implication for research on scientific reasoning is that investigatory intent is at least as important as the controlled comparison strategy as a topic of

In examining individual students' patterns of performance, we found mixture (of levels) and gradual change to be the rule rather than the exception, consistent with the findings of microgenetic research (Kuhn, 1995; Siegler & Crowley, 1991). Because students worked with a changing set of partners who produced a collaborative performance, these results do not allow microgenetic analysis of individual change patterns. Also, it is not obvious exactly what the parallel of strategy mixture might be in the domain of mental models. Students may display a confused or incoherent model in the course of transition from a less correct to a more correct model or, as they do in the case of strategies, they may rely on one approach (model) at one time and a different one at another. Our data do not allow us to choose definitively between these two alternatives, but they do suggest that a shift in mental models, like strategy shifts, is not abrupt and total, but more likely takes place slowly and in gradual steps.
The Process-Content Debate

The inquiry activity that students in this study engaged in was deliberately designed as "content-lean," in the sense that we were not undertaking to teach students any significant body of scientific knowledge. Instead, our approach was to examine in as simple a context as possible the strategies, metastrategic understanding, and attendant generic mental models required for productive inquiry regarding relations among variables. If faulty strategies and mental models are observed in this context, it is likely that they will be present as well in a more complex, content-rich environment (though they will be harder to identify and examine in that context).

A contrasting point of view is that a more content-rich context would have facilitated the reasoning observed in this study. In other domains of inferential reasoning, for example, Wason's (1983) four-card problem, performance has been shown to improve dramatically when the problem is situated in a familiar context. There is an important difference, however, between that reasoning paradigm and the one investigated here. In the former, the objective in providing a familiar context is to facilitate reasoners' recognition and, hence, application of a form of inference they already know well (e.g., permission and obligation).

This situation, in contrast, is a bit more complex because we are looking to do more than invoke a well-established reasoning scheme. The broad-level process skill in question, the coordination of theory with new evidence, can proceed in several different ways. If new evidence is entirely compatible with an existing theory, the evidence may readily be integrated into it and become part of its representation. However, this does not guarantee that this new evidence will be represented independently of the theory and brought to bear on it, which we identify as a hallmark of mature or skilled scientific thinking (Kuhn, in press-a; Kuhn & Pearsall, 2000).
Instead, evidence may be integrated as an "illustration" of what is already accepted as true, or it may simply be assimilated without

The more interesting case, because it allows a clearer assessment of scientific thinking as a process, is one in which evidence conflicts with theory and, hence, is not readily assimilable, forcing the individual to ignore, dismiss, or distort it or, alternatively, to represent it accurately and evaluate its bearing on the theory. In the case in which the theoretical representation is richly elaborated and highly familiar, it is not clear that scientific thinking (again, as a process skill, in contrast to scientific understanding or knowledge) will be enhanced. Available evidence comparing scientific reasoning strategies across more and less familiar content suggests that contextually rich, highly elaborated, and highly familiar content, especially to the extent that it invokes entrenched beliefs, is motivating as a topic for contemplation but can resist the impingement of new evidence and, hence, work against proficient scientific thinking (Kuhn et al., 1995).
Implications for Science Education

An implication that should not be drawn from this research is that inquiry activity is inappropriate in the elementary or middle school science curriculum because students do not have the requisite skills to engage in it productively. The message we hope our work will convey is a different one, which is that supporting the design of inquiry curriculum for these critical years in science education should be identification of a sequence of well-delineated cognitive competencies that become the objective of this curriculum. In the absence of an explicit sequence of this nature, inquiry learning risks becoming a vacuous practice, one embraced without clear evidence of the cognitive processes or outcomes that it is likely to foster.

We believe this study makes a contribution in this respect, but such an effort is far from complete. The skills and understanding we have highlighted here lie somewhere in the middle of an extended developmental hierarchy. The kinds of elementary skills in posing questions and representing data that Lehrer and Schauble (in press) have studied form the initial levels of this hierarchy and are its essential foundation. At its upper levels are the skills and understanding needed to construct data-based models of causal systems that include multiple layers of causality and multiple variables (and variable levels) that interactively influence probabilistic outcomes. These are skills integral to the scientific inquiry that occurs in professional science.
The intervention aspect of this research similarly leaves much still to be learned. From an educational perspective, the major question is not exactly why the intervention was effective but why it was not more effective. At best, we can speculate as to what kinds of interventions might have been more effective for the sizable minority of students who showed little or no evident benefit from the experience we provided. Our work does point to (a) investigatory intent, (b) a mental model of multivariable causality, and (c) metalevel understanding as promising targets of future intervention efforts. However, more and different kinds of efforts certainly seem warranted, especially in view of the enormous current interest in inquiry as a teaching method.

A final comment has to do with the connection between scientific thinking and science education. A view emphasized in this work, and reflected in Figure 2, is that scientific thinking encompasses a good deal more than the controlled comparison strategy that has been the focus of attention in most developmental research on scientific thinking. A related view has been expressed in recent writing on science education that emphasizes the importance of formulating productive questions, representing observations in insight-generating ways, and advancing and debating claims in a framework of scientific argument (Lehrer, Carpenter, Schauble, & Putz, 2000). Mastering the coordination of questions, data representations, and argument, Lehrer et al. claimed, "puts students on the road to becoming authors of scientific knowledge" (p. 97).
Despite its comprehensiveness in encompassing all phases of scientific activity (from inquiry through argument), the kind of inquiry activity featured in this study is by itself far from a model of what a comprehensive science curriculum should be. Still, we do see such an activity as valuable as one strand interwoven into a rich middle school science curriculum. Its value as an educational tool, we believe, lies in its focusing attention on the forms of question asking and answering that are central to scientific thinking. By directing students' attention to the thinking they do in addressing scientific questions, we not only implicitly convey values and standards of science ("How do you know?"), but we also develop metalevel awareness and, ultimately, regulation of questions, of data representations, and of inferences that do, and especially that do not, follow from what is observed. Of course, we want students to acquire rich and deep understanding of the world around them as a goal of their science education, but awareness and understanding of their own and others' thinking about scientific questions seem important enough to warrant a prominent place in this curriculum.
REFERENCES
Bransford, J., Brown, A., & Cocking, R. (Eds.). (1999). How people learn: Brain, mind, experience, and school (Report of the National Research Council). Washington, DC: National Academy Press.
Case, R. (1974). Structures and strictures: Some functional limitations on the course of cognitive growth. Cognitive Psychology, 6, 544-573.
Cavalli-Sforza, V., Weiner, A., & Lesgold, A. (1994). Software support for students engaging in scientific activity and scientific controversy. Science Education, 78, 577-599.
Chan, C., Burtis, J., & Bereiter, C. (1997). Knowledge-building as a mediator of conflict in conceptual change. Cognition and Instruction, 15, 1-40.
Chen, Z., & Klahr, D. (1999). All other things being equal: Acquisition and transfer of the control of variables strategy. Child Development, 70, 1098-1120.
Crowley, K., & Siegler, R. (1999). Explanation and generalization in young children's strategy learning. Child Development, 70, 304-316.
de Jong, T., & van Joolingen, W. R. (1998). Scientific discovery learning with computer simulations of conceptual domains. Review of Educational Research, 68(2), 179-201.
DeLoache, J., Miller, K., & Pierroutsakos, S. (1998). Reasoning and problem solving. In W. Damon (Series Ed.) & D. Kuhn & R. Siegler (Vol. Eds.), Handbook of child psychology: Vol. 2. Cognition, language, and perception (5th ed., pp. 801-850). New York: Wiley.
Eisenhart, M., Finkel, E., & Marion, S. (1996). Creating the conditions for scientific literacy: A re-examination. American Educational Research Journal, 33, 261-295.
Gentner, D., & Stevens, A. (Eds.). (1983). Mental models. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Klahr, D. (2000). Exploring science: The cognition and development of discovery processes. Cambridge, MA: MIT Press.
Klahr, D., Fay, A. L., & Dunbar, K. (1993). Heuristics for scientific experimentation: A developmental study. Cognitive Psychology, 25, 111-146.
Kuhn, D. (1995). Microgenetic study of change: What has it told us? Psychological Science, 6, 133-139.
Kuhn, D. (in press-a). What is scientific thinking and how does it develop? In U. Goswami (Ed.), Handbook of childhood cognitive development. Oxford, England: Blackwell.