Soma Roy, Karen McGaughey (Cal Poly)
- Alex Herrington (Cal Poly undergrad)
John Holcomb (Cleveland State), George Cobb (Mt. Holyoke), Nathan Tintle, Jill VanderStoep, Todd Swanson (Hope College)
This project has been supported by the National Science Foundation, DUE/CCLI #0633349
Motivation/Goals
Examples
- Binomial process
- Randomized experiment, binary response
- Randomized experiment, quantitative response
- Series of lab assignments
- Discussion points
Student feedback, evaluation results
Design principles & implementation
Observations, open questions
Cobb (2007) – 12 reasons to teach permutation tests…
- Model is “simple and easily grasped”
- Matches production process, links data production and inference
- Role for tactile and computer simulations
- Easily extendible to other designs (e.g., blocking)
- Fisherian logic
- “The Introductory Statistics Course: A Ptolemaic Curriculum” (TISE)
Develop an introductory curriculum that focuses on a randomization-based approach to inference
- vs. using simulation to teach traditional inference
- From beginning of course, permeate all topics
Improve understanding of inference and the statistical process in general
- More modern (computer-intensive) and flexible approach to inferential analysis
Case-study focus
Pre-lab
- Background, review questions submitted in advance
50-minute (computer) lab period
Online instructions
- Directed questions following statistical process
- Embedded applets or statistical software
Application/Extension
Lab report with partner
Videos
Research question
Pre-lab
Descriptive analysis
Introduction of null hypothesis, p-value terminology
Plausible values
Conclusions
Can this be done on day one?
- Yes, if we can motivate the simulation
- Loaded dice
- Before revealing the data?
Many students can reason inferentially
- “If a choice is made at complete random, then having 13 infants would be highly unlikely”
- “Based on the coin flipping experiment, the results stated that at/over 12 was extremely rare. Therefore, at least 12 infants …
- “Would be around 12-16 because it seems highly unlikely that given a 50-50 option 12-16 would choose the helper toy”
But maybe not as well “distributionally”
- Is it unusual? = “barely over half”
- vs. unusual compared to distribution
Examine language carefully
- “Unlikely that choice is random”
- “Prove”
- “Simulate”, “Repeated this study”
- “At random” = 50/50, “model”
“Random” = anything is possible
Can this be done on day one?
- Yes, if we can motivate the simulation
- Loaded dice
- Before revealing the data?
- Enough understanding of “chance model”?
- Use of class data instead? (“observed” vs. research study)
- Yes, if return to and build on the ideas throughout the course
Tactile simulation
- One coin 16 times vs. 16 coins
Population vs. process
3Ss: statistic, simulate, strength of evidence
Fill-in-the-blank wording
Timing of final report
- Follow-up in-class discussion
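The "3Ss" for the binomial-process example can be sketched in a few lines of Python. This is only an illustration of the coin-tossing logic, not the applet used in the labs: the function names, the number of repetitions, and the observed count of 14 are all assumptions chosen for the example (the slides give the sample size of 16 but not the observed result).

```python
import random

def simulate_counts(n_flips=16, n_reps=1000, seed=0):
    """Simulate: number of heads in each of n_reps sets of n_flips fair coin flips."""
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(n_flips))
            for _ in range(n_reps)]

def strength_of_evidence(counts, observed):
    """Proportion of simulated counts at least as large as the observed statistic."""
    return sum(c >= observed for c in counts) / len(counts)

# Statistic: the observed count (14 is a placeholder, NOT a value from the slides).
observed = 14
counts = simulate_counts()
print("approximate p-value:", strength_of_evidence(counts, observed))
```

Flipping one coin 16 times and flipping 16 coins once are the same chance model here; each simulated count is one pass through either version.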
Is Yawning Contagious?
- Modelling entire process: data collection, descriptive statistics, inferential analysis, conclusions
- Parallels to first example
- Could random assignment alone produce a difference in the group proportions at least this extreme?
- Card shuffling, recreate two-way table
- Extend to own data
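The card-shuffling idea (pool the outcomes, re-deal them to the two groups, recreate the two-way table) might look like the sketch below. The group sizes and counts are made-up placeholders, not the study's data, and `shuffle_diffs` is a hypothetical helper name.

```python
import random

def shuffle_diffs(yes_a, n_a, yes_b, n_b, n_reps=1000, seed=0):
    """Re-deal the pooled outcomes to groups A and B and return the
    simulated differences in proportions (A minus B)."""
    rng = random.Random(seed)
    total_yes = yes_a + yes_b
    # One "card" per subject: 1 = outcome occurred, 0 = did not.
    cards = [1] * total_yes + [0] * (n_a + n_b - total_yes)
    diffs = []
    for _ in range(n_reps):
        rng.shuffle(cards)            # random assignment alone
        sim_a = sum(cards[:n_a])      # first n_a cards go to group A
        sim_b = total_yes - sim_a     # the rest go to group B
        diffs.append(sim_a / n_a - sim_b / n_b)
    return diffs

# Placeholder two-way table (assumed for illustration only):
# group A: 12 of 30, group B: 5 of 20.
observed_diff = 12 / 30 - 5 / 20
diffs = shuffle_diffs(12, 30, 5, 20)
# How often does shuffling alone give a difference at least this extreme?
p_value = sum(d >= observed_diff - 1e-12 for d in diffs) / len(diffs)
print("approximate p-value:", p_value)
```

Swapping in a class's own counts only requires changing the four table entries.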
Horizontal axis
Shade p-value
Make up a research question
Starting with a significant result, but when are students ready to discuss an insignificant one?
How critical is authentic data?
Choice of statistic (count vs. difference in proportions)
Role of traditional symbols and notation?
Visualization of bar graphs from trial to trial
Implementation of predict and test
Are there lingering effects of sleep deprivation?
Possible follow-ups/extensions: what if -4.33?, medians, plausible values
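For a quantitative response, the same shuffling logic applies to a difference in group means, and the "medians" extension amounts to swapping the statistic. The sketch below uses invented scores, not the sleep deprivation data, and `rerandomize` is a hypothetical helper name.

```python
import random
from statistics import mean, median

def rerandomize(group_a, group_b, stat=mean, n_reps=1000, seed=0):
    """Shuffle the pooled responses into two groups of the original sizes
    and return the simulated differences stat(A) - stat(B)."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    sims = []
    for _ in range(n_reps):
        rng.shuffle(pooled)
        sims.append(stat(pooled[:n_a]) - stat(pooled[n_a:]))
    return sims

# Made-up responses for two treatment groups (illustration only).
group_a = [12, 15, 9, 20, 17, 14, 22, 11]
group_b = [8, 10, 5, 13, 9, 7, 15, 6]

observed = mean(group_a) - mean(group_b)
sims = rerandomize(group_a, group_b)
p_value = sum(s >= observed for s in sims) / len(sims)
print("difference in means:", observed, "approximate p-value:", p_value)

# Extension: same process with the difference in medians as the statistic.
median_sims = rerandomize(group_a, group_b, stat=median)
```

Asking "what if the observed difference had been some other value?" only changes `observed`; the simulated distribution stays the same.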
Role of tactile simulation
Scaffolding of lab report
- Introductory sentences, labeling of graphs
- Write conclusion to journal
When should “normal-based” methods be introduced?
- Alternative approximation to simulation
- Position, method for confidence intervals
Choice of technology
- Advantages/Disadvantages
- Applets, Minitab, R, Fathom
Following the lab comparing two groups on a quantitative variable (65 responses)
- Discuss the purpose of the simulation process
- What information does the simulation process reveal to help you answer the research question?
Essentially correct: 35.4% demonstrated understanding of the big picture (looking at repeated shuffles to assess whether the observed results happened by chance)
Partially correct: 38.5% (one of null or comparison)
Incorrect: 26.1% (“better understand the data”)
Did students address the null hypothesis?
- 33.9% E / 38.5% P / 27.7% I
Did students reference the random assignment?
- 36.9% E / 36.9% P / 26.2% I
Did students focus on comparing the observed result?
- 64.6% E / 13.8% P / 21.5% I
Did students explain how they would link the pieces together and draw their conclusion?
Example 3 simulation
Helper/Hinderer (Winter 2011) – Did the lab help you understand the overall process of a statistical investigation?
Did subsequent labs increase understanding?
Lab 4: Random babies
Lab 5: Reese’s Pieces (demo)
- Normal approximation, CLT for binary
- Transition to formal test of significance (6 steps)
Lab 6: Sleepless nights (finite population)
- t approximation, CLT for quantitative, confidence interval
Lab 7: Simulation of matched pairs
Lab 8: Simulation of regression sampling
Chi-square, ANOVA
Google Docs survey during last week of course
Two instructors
Instructor A
- Is Yawning Contagious?
- Heart Rates (matched pairs)
Instructor B
- Friend or Foe
- Is Yawning Contagious?
- Reese’s Pieces
Most helpful:
Least helpful (Instructor B):
- Random babies
- Melting away (intro two-sample t, paired)
In a recent Gallup survey of 500 randomly selected US adult Republicans, 390 said they believe their congressional representative should vote to repeal the Healthcare Law. Suppose we wish to determine if significantly more than three-quarters (75%) of US adult Republicans favor repeal.
The coin tossing simulation applet was used to generate the following two dotplots (A) and (B). Which, if either, of the two plots (A) and (B) was created using the correct procedure? Explain how you know.
35% picked B (usually citing null .75 × 500)
- But some look at shape, or later p-value
29% picked A (observed result)
23% neither (wanted .5 × 500 = 250)
13% other responses: 0, .75, 50, can’t tell, anything possible, label is wrong
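The survey question can be checked with a short simulation under the null value of 0.75 with n = 500. The sketch below is not the coin tossing applet; the function name, seed, and number of repetitions are assumptions. Under this null model the simulated counts should center near 0.75 × 500 = 375, not at the observed 390 or at 250.

```python
import random

def simulate_null_counts(n=500, p_null=0.75, n_reps=1000, seed=0):
    """Number of 'successes' in each of n_reps samples of size n,
    assuming the null proportion p_null is true."""
    rng = random.Random(seed)
    return [sum(rng.random() < p_null for _ in range(n))
            for _ in range(n_reps)]

observed = 390  # from the survey description
null_counts = simulate_null_counts()
center = sum(null_counts) / len(null_counts)
# One-sided: how often does the null model give 390 or more?
p_value = sum(c >= observed for c in null_counts) / len(null_counts)
print("center of null distribution:", center)
print("approximate p-value:", p_value)
```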
Heights of females are known to follow a normal distribution with a mean of 64 inches and a standard deviation of 3 inches. Consider the behavior of sample means. Each of the graphs below depicts the behavior of the sample mean heights of females.
- a. One graph shows the distribution of sample means for many, many samples of size 10. The other graph shows the distribution of sample means for many, many samples of size 50. Which graph goes with which sample size?
85% matched n=10 and n = 50
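The matching task rests on the fact that sample means vary less as n grows: for a population SD of 3, the SD of the sample mean is 3/√n, about 0.95 for n = 10 and about 0.42 for n = 50. A minimal simulation sketch follows; the function name, seed, and number of repetitions are assumptions.

```python
import random
from statistics import mean, stdev

def sample_means(n, n_reps=2000, mu=64.0, sigma=3.0, seed=0):
    """Means of n_reps random samples of size n from Normal(mu, sigma)."""
    rng = random.Random(seed)
    return [mean([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(n_reps)]

sd_10 = stdev(sample_means(10))
sd_50 = stdev(sample_means(50))
print("SD of sample means, n = 10:", round(sd_10, 3))  # theory: 3 / 10**0.5
print("SD of sample means, n = 50:", round(sd_50, 3))  # theory: 3 / 50**0.5
```

The graph with the narrower spread therefore corresponds to the larger sample size.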
Suppose we wish to test the following hypotheses about the population of Cal Poly undergraduate women:
For which graph (A or B) would you expect the p-value to be smaller? Explain using the appropriate statistical reasoning.
77% picked B
- Mixture of appealing to smaller SD/outliers, larger sample size means smaller p-value, and thinking in terms of test statistic
- A few choices not internally consistent
CAOS questions (final exam)
- Statistically significant results correspond to small p-values
- Traditional (National/Hope/CP): 69/86/41%
- Randomization (Hope/CP): 95%/95%
- Recognize valid p-value interpretation
- Traditional (National/Hope/CP): 57/41/74%
- Randomization (Hope/CP): 60/72%
- p-value as probability of H0 – Invalid
- Traditional (National/Hope/CP): 59/69/68%
- Randomization (Hope/CP): 80%/89%
CAOS questions (final exam)
- p-value as probability of Ha – Invalid
- Traditional (National/Hope/CP): 54/48/72%
- Randomization (Hope/CP): 45/67%
- Recognize a simulation approach to evaluate significance (simulate with no preference vs. repeating the experiment)
- Traditional (National/Hope/CP): 20/20/30%
- Randomization (Hope/CP): 32%/40%
p-value interpretation in regression (final exam)
Video game question (Final exam: NCSU, Hope, Cal Poly, UCLA, Rhodes College)
- What is the explanation for the process the student followed?
- Which of the following was used as a basis for simulating the data 1000 times?
- What does the histogram tell you about whether $5 incentives are effective in improving performance on the video game?
- Which of the following could be the approximate p-value in this situation?
Simulation process
- Fall: over 40% chose “This process allows her to determine how many times she needs to replicate the experiment for valid results.”
- About 70% pick “The $5 incentive and verbal encouragement are equally effective at improving performance.” as underlying assumption
- Still evidence that some look at center at zero or shape as evidence of no treatment effect
- 1/3 to 1/2 could estimate p-value from graph
A consumer organization would like a method for measuring the skewness of the data. One possible statistic for measuring skewness is the ratio mean/median….
- Calculate statistic for sample data…
- Draw conclusion from simulated data …
Tactile simulation
Visual, contextual animation of tactile simulation
Intermediate animation capability
Level of student construction
- Ease of changing inputs
- Connect elements between graphs
Carefully designed, spiraling activities
- “Stop!”
- Thought questions
Allow for student exploration
Early in course
Repetition through course, connections
Lab assignments
- Focus on entire statistical process
- Motivating research question
- Follow-up application
- Thought questions
- Screen captures
- Pre-lab questions
- Minitab demos (Adobe Captivate)
Exam questions
Students quickly get a sense of trying to determine whether a result could be “just due to chance”
Still struggle with more technical understanding
- Under the null hypothesis
- Observed vs. hypothesized value
Students may fail to see connections between scenarios
Begin with class discussion/brainstorming on how to evaluate data before showing class results
- Loaded dice, biased coin tossing
- Thought questions
Student data vs. genuine research article
- “the result” vs. “your result”
Choice of first exposure
- Significant?
- Random sampling or random assignment
Scaffolding
- Observational units, variable
- How would you add one more dot to graph?
- At some point, require students to enter the correct “observed result” (e.g., Captivate)
- At some point, ask students to design the simulation?
- Start with fill in the blank interpretation?
One crank or more?
When to connect to normal approximations?
- How to make sure traditional methods don’t take over once they are introduced?
- How much to discuss exact methods?
- Confidence intervals
Very promising but also need to be very careful, and need a strong cycle of repetition closely tied to rest of course…