 Affiliated with PerformanceAssessment.Org Contact us at info@timeoutfromtesting.org


  Updated June 10




STANDARDIZED TESTS: TERMS & DEFINITIONS

Standardized test:
a test in which the questions, format, instructions, scoring and reporting of scores are the same for all test takers. The procedures for creating, administering and analyzing such tests are standardized as well.

High-stakes test:
a test that determines a critically important decision in a child's education, e.g. promotion or graduation. The Mayor of New York made the 3rd grade reading and math tests “high-stakes tests” by tying promotion to the test score. Regents exams in New York State are “high-stakes” because failing to pass any one of five exams will prevent a student from earning a high school diploma.

Norm:
sometimes referred to as the 50th percentile. The norm divides test takers into two groups: those above and those below the 50th percentile. Though the word is often used differently, the norm in test scores merely denotes a place in the middle of the distribution of scores. Once it is determined, all other scores are described in reference to this norm -- hence “norm-referenced.” Establishing national norms in this way has become controversial because, by definition, half of all people who take the test must score below the norm.
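
To make the idea concrete, here is a minimal sketch of how a norm and a norm-referenced score could be computed. The scores are hypothetical, invented purely for illustration, and the helper name `percentile_rank` is an assumption of this sketch, not a function any test maker is known to use.

```python
from statistics import median

# Hypothetical raw scores for a small norming group (not real data).
norming_scores = [41, 48, 52, 55, 59, 60, 63, 67, 72, 80]

def percentile_rank(score, group):
    """Return the percentage of the group that scored strictly below `score`."""
    below = sum(1 for s in group if s < score)
    return 100.0 * below / len(group)

# The norm is the middle of the distribution (the 50th percentile).
norm = median(norming_scores)

# By construction, half of the norming group falls below the norm.
below_norm = [s for s in norming_scores if s < norm]
above_norm = [s for s in norming_scores if s > norm]

# A student's score is then reported relative to the norming group.
student_rank = percentile_rank(63, norming_scores)
```

However well or poorly the group as a whole performs, the split into "above" and "below" stays the same size, which is the source of the controversy described above.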

Reliability:
in testing, reliability is a measure of consistency. So, if a group of people took a test on two different occasions, they should get pretty much the same score both times. If a test is not reliable, it has to be abandoned. How do we know if a test is reliable? Test makers hope that the correlation between scores from two administrations will reach .85 or higher. Of course, testing the same people twice is often inconvenient, and timing is a concern. So, often the scores on one group of questions are correlated with the scores on another group of questions on the same test.
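
The check described here can be sketched in a few lines. The example assumes the usual Pearson correlation coefficient is what is meant by "correlation," and the two score lists are hypothetical, made up for illustration. The same function could be applied to two halves of a single test instead of two sittings.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)

# Hypothetical scores for the same six students on two occasions.
first_sitting = [55, 62, 70, 48, 81, 66]
second_sitting = [58, 60, 73, 50, 79, 69]

r = pearson_r(first_sitting, second_sitting)

# The .85 target mentioned in the text.
meets_target = r >= 0.85
```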

Validity:
a test can be reliable without being valid. Validity defines how accurately a test really measures what its promoters claim it measures. Some call this "meaningfulness." Are tests always "meaningful" in the same way? No, there are different types of validity:

  • predictive validity (a form of criterion-related validity) measures how well a test predicts future school performance. For example, though it is generally assumed that students' performance on state standardized exit tests indicates how well they will fare in college, there is much evidence to the contrary.

  • content validity measures how well the test covers the subject content being tested.

  • consequential validity refers to the inferences that are made from test results. Some experts have expressed concerns about the consequential validity of NY State's Regents exams in Global and US History, English, Science, and Math. They believe that a student who scores well on a Regents test may not necessarily be well-prepared to pursue that subject at the college level, and vice versa.

Obviously, we should care a lot about how valid tests are. A standardized reading test may measure how successfully a child can answer the set of questions on that specific test, but may tell us very little about whether the child enjoys using the skill tested. Such an omission prevents us from knowing how well the child will continue to develop the skill.

There are different kinds of test scores:

  • Norm-referenced scores:
    these compare a student's test performance to the performance of a clearly defined reference group called a "norming group." The scores of the norming group are used to devise test norms: normal, below-normal and above-normal performance. The tests should have been normed on a population similar to the one taking the test.

  • Criterion-referenced scores:
    these scores say something about how the person tested performed relative to an absolute performance standard determined by the test-maker. Many criterion-referenced tests would be better referred to as "content-referenced."

  • Cut score:
    the minimum level a test-taker must attain in order to "pass" a given exam. Just where to put that level can determine whether children are labeled passing or failing. In the June 2003 Math A Regents, the cut score was set so high that nearly 70 percent of New York State students failed the Math A test. In January 2004, the cut score for the Math A Regents exam allowed 80-90 percent of the students to pass.
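
The effect of moving a cut score can be shown with a small sketch. The scores and the two cut scores below are hypothetical and are not the actual Math A Regents figures; they only illustrate how the same set of results yields very different pass rates depending on where the line is drawn.

```python
# Hypothetical scores for ten students (not real Regents data).
scores = [38, 45, 51, 54, 58, 61, 64, 66, 72, 85]

def pass_rate(scores, cut_score):
    """Fraction of test takers at or above the cut score."""
    passed = sum(1 for s in scores if s >= cut_score)
    return passed / len(scores)

# Same students, same answers, two different cut scores.
high_cut_rate = pass_rate(scores, 65)
low_cut_rate = pass_rate(scores, 50)
```

With the higher cut score only 3 of the 10 students pass; with the lower one, 8 of 10 do.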

Measurement Error:
one way in which reliability or lack of reliability of a test is indicated; all tests have measurement error.
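
One common way to put a number on measurement error, taken from classical test theory rather than from this glossary, is the standard error of measurement: the spread of scores multiplied by the square root of one minus the reliability. The sketch below assumes that formula; the standard deviation of 10 points is a hypothetical value chosen for illustration.

```python
from math import sqrt

def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory estimate: SD * sqrt(1 - reliability)."""
    return score_sd * sqrt(1.0 - reliability)

# Hypothetical test: scores spread with a standard deviation of 10 points,
# and a reliability of .85.
sem = standard_error_of_measurement(10.0, 0.85)

# A perfectly reliable test would have no measurement error under this formula.
sem_perfect = standard_error_of_measurement(10.0, 1.0)
```

Under these assumptions the error is a little under 4 points, so a student just below a cut score could plausibly have landed just above it on another day.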

Scaling:
sometimes called "grading on a curve," scaling assigns scores received on standardized tests so they fit the classic "bell-shaped" curve used in statistics. Scaling ensures that there are the same number of bottom and top grades, with most people in the middle. When the grading criteria and the objectives being assessed are known in advance, the issue of "scaling" disappears: everyone can pass.

Sampling:
a way to get information about a group by examining only some members of the group or by giving all members only small parts of the whole test.

Performance Assessments, also known as Authentic Assessment or Alternative Assessment:
a complex approach to assessment that uses direct measures of learning – essay writing, research projects, term papers – rather than test-driven indicators of learning. An oral defense is frequently part of the performance assessment.

produced by Naava Katz Design