~ EPAA Vol. 7 No. 4: Haney, Fowler, Wheelock, Bebell & Malec "Massachusetts Teacher Test" ~

Background


        Massachusetts Commissioner of Education Robert Antonucci reported to the Massachusetts Board of Education on December 15, 1997, that he had selected National Evaluation Systems (NES), a test development company based in Amherst, MA, to develop and administer new teacher certification tests for Massachusetts (see appendix 1 for more details on the chronology of the development of the MTT). We learned of this in January 1998. One of us (Haney) was sufficiently concerned about this prospect that he faxed then-Associate Commissioner of Education David Driscoll a copy of a federal court decision in which NES had been found to have "violated the minimum requirements for professional test development" when it created teacher certification tests for the state of Alabama. (Note 3) Associate Commissioner Driscoll did not respond directly, but we learned through an intermediary that NES had been one of only three contractors to bid on the Request for Responses (RFR) issued on February 24, 1997. In February 1998, the Commonwealth commissioned NES to develop tests of communication and literacy skills and of subject matter knowledge. (Note 4)
        At that time, the plan was that there would be two trial administrations of the new tests (in April and July 1998) and that their scores would not count toward certification of eligible teacher candidates. As described in a study guide and registration bulletin released by the Massachusetts Department of Education (DOE) in January 1998, candidates would be required to achieve qualifying scores in order to be certified only beginning with the third administration in October 1998. But then a change was made: on March 25, 1998, the DOE announced that candidates taking the April and July tests would not qualify automatically; instead, they would have to achieve passing scores to be provisionally certified.
        The first MTT tests were given on April 4, 1998. In the ensuing weeks NES convened scoring panels to help determine where passing scores on the new tests should be set. On June 22, acting on the recommendation of NES, the Massachusetts Board of Education voted to set passing scores one standard error of measurement below those that resulted from NES's analyses of scoring panel reviews. This meant that 44% of the 1,800 prospective teachers who took the April tests would have failed.
        Then, in the midst of his campaign for election as Governor, Massachusetts Acting Governor Paul Cellucci had Board of Education Chairman John Silber convene a special meeting of the Board to reconsider its vote on qualifying scores. Reversing its decision of less than two weeks earlier, the Board voted on July 2 for higher passing scores; now 59% of the April test-takers would fail. (At this meeting, Interim Commissioner of Education Frank Haydu, appointed after Antonucci had taken a job in the private sector, resigned. Haydu had recommended the lower qualifying scores because the test was untried and subject to measurement error.)
        The results of the first MTT administration were mailed to test-takers in early July. They showed that 70% of examinees passed the reading test, 59% passed the writing test, and widely varying percentages passed the 32 subject matter tests. Because candidates had to pass all three tests to pass the MTT overall, the overall passing rate was only 41%.
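        The arithmetic of a pass-everything rule can be sketched as follows. This is only an illustration, not an analysis from this report: the function name `conjunctive_bounds` is hypothetical, and the sketch uses just the reading and writing pass rates quoted above, since the subject test rates varied by test. Without knowing how results on the tests overlap, the share passing both can only be bounded, not computed.

```python
def conjunctive_bounds(pass_rates):
    """Return (lower, upper) bounds on the share of examinees passing every test.

    Hypothetical helper for illustration. The upper bound is the lowest single
    pass rate; the lower bound assumes failures overlap as little as possible.
    """
    n = len(pass_rates)
    lower = max(0.0, sum(pass_rates) - (n - 1))
    upper = min(pass_rates)
    return lower, upper


# Reading (70%) and writing (59%) pass rates from the April administration.
lower, upper = conjunctive_bounds([0.70, 0.59])
print(f"Share passing both reading and writing: {lower:.2f} to {upper:.2f}")
```

        Requiring a subject matter test as well can only lower the share who pass everything, so the reported overall rate of 41% could not have exceeded the 59% ceiling set by the writing test alone.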
        This overall result, with a majority of candidates failing, led to immense negative publicity. Even before the passing scores were adjusted in July, in a speech to the Greater Boston Chamber of Commerce on June 25, Massachusetts Speaker of the House Tom Finneran commented, "I'll tell you who won't be a teacher. The idiots who flunked that test and flunked so miserably--and, of course, the idiots who passed them." (As reported in the Boston Herald on 6/26/98, by Darrell S. Pressley, "Dumb struck: Finneran slams 'idiots' who failed teacher tests.")
        The second MTT administration took place on July 11, just days after April test-takers had received their results. In a press release dated July 23, 1998, the DOE said:
MALDEN--State Commissioner of Education David P. Driscoll today released the results of the April 4 Massachusetts Teacher tests showing the pass and fail rates for prospective teachers by the institution they attended.

The 1795 prospective teachers who took tests in April in communication and literacy and in various subjects entered on their registration sheets information concerning the 56 colleges and universities they attended. That data was then verified by the institutions; 54 checked the data and made corrections where necessary.

Because in some cases there is a very small sample of students taking specific subject tests and a small sample of graduates from some of the institutions, results need to be interpreted carefully. Many schools have fewer than ten students who took parts of the test, so making a broad statistical conclusion in those cases is not sound.


        Release of results by institution prompted much public hand-wringing among teacher preparation institutions. A front page story in the Boston Globe on July 24, 1998, for example, reported, "Colleges vow to retool after failures on teacher tests" (O'Brien, 1998).
        All this heightened our concern that important decisions were being made on the basis of the hastily developed MTT before their technical merits had been established. Hence in July, at the suggestion of Clarke Fowler, the three of us (Fowler, Haney, and Wheelock) decided to found the Ad Hoc Committee to Test the Teacher Test. Our initial idea was to ask people who had taken both the MTT and some post-collegiate national examination, such as the Praxis (the successor to the National Teacher Examination or NTE) or the Graduate Record Examination (GRE), to send us copies of their score reports. Comparing the two sets of scores would allow us to test the validity of the new MTT against established tests such as the Praxis or GRE. The principle was simple: if the new MTT were valid and reliable, then scores on the new Massachusetts test should be comparable with those from other tests.
        Flyers inviting test-takers to send us copies of their score reports were distributed at many of the testing sites for the July 11 administration of the MTT. As of December 1998, more than 30 individuals had generously provided us with copies of their score reports on both the MTT and some other test (including the Praxis, the Miller Analogies Test, the New York State Liberal Arts and Sciences teacher certification test, and the Texas teacher certification test, the ExCET). Among them, however, there were only twelve cases in which people had taken the MTT and the same other test, leaving us with samples that were too small for reasonable statistical analysis. Nonetheless, many individuals who sent us scores spontaneously offered us comments on the new MTT. Hence we decided to interview those willing to be interviewed. These interviews are summarized in part 4 and appendix 3 of this report.
        On August 17, detailed results of the July administration were released to the institutions of higher education whose students took the tests. Seeing a copy of these results for one institution, which listed students' MTT scores from both April and July, we realized that we could use these data to estimate the reliability of the MTT tests. Adults' basic skills in reading and writing would not change much in a three-month period; people could not cram for the July test since no study guide was available; and, in any case, candidates did not learn they would have to retake the test until just days before the second administration, when the April results were mailed to examinees. Hence, we set out to acquire data from institutions that had students take the MTT in both April and July. The results of this inquiry are presented in part 3 of this report.
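        A common way to summarize reliability from paired scores of this kind is the correlation between the first and second administrations. The following is a minimal sketch of that calculation, not the procedure used in this report: the function name `pearson_r` and the score lists are hypothetical placeholders, not MTT data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical scaled scores for six examinees who took a test twice.
april_scores = [52, 61, 68, 70, 75, 83]
july_scores = [58, 60, 71, 66, 79, 80]

r = pearson_r(april_scores, july_scores)
print(f"Test-retest correlation: {r:.2f}")
```

        A coefficient near 1 would mean examinees were ranked almost identically on both occasions; lower values indicate that scores shifted for reasons other than the skill being measured.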
        However, before describing the nature and results of this inquiry, we point out that many share our misgivings about the lack of technical documentation on the new MTT. For example, testing experts such as George Madaus of Boston College and Ron Hambleton of the University of Massachusetts at Amherst have publicly commented on the lack of documentation about the new MTT (Madaus, 1998; Hambleton, 1999); and in both press releases and letters, the Commonwealth Education Deans' Council has expressed concern about the lack of evidence on the MTT's validity (Jackson, 1998).
        On August 24, 1998, Board of Higher Education Chairman James Carlin called a meeting at Framingham State College for deans and presidents of Massachusetts institutions of higher education to discuss implications of the MTT results. At this meeting, DOE spokesman Alan Safran said, "We're committed to full disclosure about this test," and stated that the Department was willing to release all "non-proprietary" data. The Ad Hoc Committee therefore wrote to Acting Commissioner of Education David Driscoll to request three sets of data from the Massachusetts Teacher Tests that would allow us to analyze their psychometric properties. Specifically, we asked for:
  1. Item level and total test scores (both raw scores and scaled scores) for all examinees who took the Massachusetts Teacher Tests in April. We seek these data in order to be able to analyze the psychometric properties of items on the April test.
  2. Item level and total test scores (both raw scores and scaled scores) for all examinees who took the Massachusetts Teacher Tests in July. We seek these data in order to be able to analyze the psychometric properties of items on the July test.
  3. Test scores of all examinees (both raw scores and scaled scores) for all examinees who took the Massachusetts Teacher Tests in both April and July. We seek these data in order to analyze the test-retest reliability of the Massachusetts Teacher Tests. (Ad Hoc Committee letter to Driscoll, September 7, 1998)

        Acting Commissioner Driscoll did not reply.


        More recently, in a November 22, 1998, commentary article in the Boston Globe, "Good teachers need a good test," Emanuel Mason, the chair of Northeastern University's Department of Counseling Psychology, Rehabilitation, and Special Education, wrote:

The primary problem with the MTT in all its versions is its validity. Validity is the degree to which a test measures what it was designed to measure. . . . If a test does not have validity, it does not measure anything. Further, validity should be demonstrated before a test is used as the basis for decisions on licensing or any other outcome. The MTT's developer has yet to provide evidence of validity, and the test already has been administered three times.

        Ignoring well-established professional standards concerning testing, the DOE and NES have not made available any documentation on the reliability and validity of the MTT. We therefore set out to study the technical merits of the tests, to begin to answer the question Mason and others have raised: How good are the new MTT tests?
