The CSAP See-Saw:
Understanding the Fluctuations in CSAP Test Scores
(A Physicist's Perspective)

W. Lowell Morgan
morgan@kinema.com

Colorado Association of School Boards 62nd Annual Convention

December 6, 2002

We all know that CSAP (Colorado Student Assessment Program) test results fluctuate from year to year. Random fluctuations are a natural consequence of the small numbers of students taking the tests in any given school. In this presentation I will demonstrate the origins of the fluctuations. I will provide guidelines to help determine when changes in test outcomes from one year to another are statistically significant, and I will provide a number of real examples.

A comprehensive document on this material, the PowerPoint presentation, and this handout are available in electronic form at www.kinema.com.

Copyright © Kinema Research & Software, L.L.C. (2002)





Example:
  If 30 students in a class of 100 achieve Proficient or Advanced on a 2002 CSAP test, you can expect that in 2003 the number will be between 22 and 41. That is, if 100 students again take the test, the percentage achieving Proficient or Advanced should lie between 22% and 41%. Since the odds against lying outside that range are 19 to 1 (a 95% confidence interval), changes greater than about 11 percentage points might be considered significant.
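The kind of calculation behind such a range can be sketched in a few lines of Python. This is only an illustration using the textbook normal approximation to the binomial distribution; this handout does not state which method produced the 22 to 41 range, and the function name proportion_interval is an assumption made for this sketch.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Approximate 95% interval for a proportion.

    Uses the normal approximation to the binomial distribution
    (an assumption of this sketch, not necessarily the handout's method).
    z = 1.96 corresponds to odds of 19 to 1 against falling outside.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)

# 30 of 100 students at Proficient or Advanced
low, high = proportion_interval(30, 100)
print(f"{100 * low:.0f}% to {100 * high:.0f}%")
```

For 30 of 100 students this approximation gives roughly 21% to 39%, close to but not identical to the range quoted in the example, which is asymmetric about 30%.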





Caution: This may be a slippery slope! That a value lies outside the confidence interval should only be taken as suggestive and worthy of further inquiry. There may easily be multitudinous other explanations besides real educational improvement.


3 Expected Magnitudes of Fluctuations

Error Bars:
  No Knowledge is Complete or Perfect

"Every time a scientific paper presents a bit of data, it's accompanied by an error bar…It's a calibration of how much we trust what we think we know…a pervasive, visible self-assessment of the reliability of our knowledge."                               

                                              Carl Sagan  The Demon-Haunted World

We now have a means of assigning confidence intervals, also known as error bars, to the CSAP test results. The figure below shows an example of the expected range of uncertainty for two 2001 high school CSAP tests. The band for 10th grade math is wider because the standard deviation of the raw test scores is greater. The high school tests involved some 300 students, so the bands of uncertainty (error bars) are smaller than would be expected for grade schools or for the many high schools testing only 50 students in each grade.
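How the width of an error bar depends on the spread of the scores and on the number of students can be sketched as follows. The standard deviation used here (SIGMA = 40 score points) is a made-up value chosen only for illustration, and error_bar is a hypothetical helper; the sketch assumes the usual standard error of a mean, sigma divided by the square root of N.

```python
import math

def error_bar(sigma, n, z=1.96):
    """Half-width of an approximate 95% error bar on a mean score.

    sigma: standard deviation of the raw scores
    n:     number of students tested
    """
    return z * sigma / math.sqrt(n)

SIGMA = 40.0  # hypothetical standard deviation, not a CSAP figure

for n in (50, 300):
    print(f"N = {n:3d}: +/- {error_bar(SIGMA, n):.1f} score points")

# The ratio of the two error bars does not depend on sigma.
ratio = error_bar(SIGMA, 50) / error_bar(SIGMA, 300)
print(f"error bar for 50 students is {ratio:.2f} times that for 300")
```

Whatever the standard deviation, the error bar for 50 students is the square root of 6, or about 2.4 times, wider than for 300 students, and a larger standard deviation widens the band in direct proportion.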


4 Changes May Not Be What They Seem

Although the percentage of students achieving Proficient or Advanced on the 3rd grade reading test at PLES has varied by 20 points over the last five years, the results are within the range to be expected for a class size of about 40 students. The 20-point drop between 2001 and 2002 was not the result of a significant drop in the average score. Indeed, there is a high probability of the average score decreasing while the percentage of students achieving Proficient or Advanced increases, and vice versa!
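That the average score and the percentage at Proficient or Advanced can move in opposite directions is easy to show with a toy example. All numbers below are invented for illustration, including the cut score of 500; they are not CSAP data.

```python
from statistics import mean

CUT_SCORE = 500  # hypothetical Proficient cut score

def percent_at_or_above(scores, cut):
    """Percentage of students scoring at or above the cut score."""
    return 100.0 * sum(s >= cut for s in scores) / len(scores)

# Two invented classes of five students each.
year_a = [700, 700, 490, 490, 490]  # two high scorers, three just below the cut
year_b = [510, 510, 510, 400, 400]  # three just above the cut, two well below

for label, scores in (("Year A", year_a), ("Year B", year_b)):
    print(label, "mean:", mean(scores),
          "percent proficient:", percent_at_or_above(scores, CUT_SCORE))
```

In this made-up case the average falls from 574 to 466 while the percentage at or above the cut rises from 40% to 60%, because the percentage depends only on which side of the cut each student lands.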

Points To Remember

  1. Random fluctuations in CSAP performance are unavoidable
  2. Confidence intervals can be calculated and are proportional to 1/√N
  3. Low scores are subject to large fluctuations
  4. CSAP results are extremely sensitive to cut score boundaries
  5. Random fluctuations can mask any real quantitative educational improvement
  6. These fluctuations can lead to great uncertainty in the total overall score used to rank entire schools.
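
Point 4 can be illustrated with a short sketch. It assumes, purely for illustration, that scores follow a normal distribution with a mean of 500 and a standard deviation of 50; these values and the helper percent_above are hypothetical and are not taken from CSAP data.

```python
from statistics import NormalDist

def percent_above(cut, mu, sigma):
    """Percentage of a normally distributed population scoring above cut."""
    return 100.0 * (1.0 - NormalDist(mu, sigma).cdf(cut))

MU, SIGMA = 500.0, 50.0  # hypothetical score distribution

# Move the cut score by 10 points (one fifth of a standard deviation).
for cut in (490, 500, 510):
    print(f"cut = {cut}: {percent_above(cut, MU, SIGMA):.1f}% at or above")
```

Under these assumptions, moving the cut score by only 10 points shifts the percentage at or above it by about 8 percentage points, because the cut sits where the students are most densely clustered.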

Details of my research and many examples can be found in the documents available on www.kinema.com.
