Writing and Revising Multiple Choice Questions

September 5, 2019 | September 27, 2024 | Staff

By Carie Cardamone, Associate Director for STEM & Professional Schools, Tufts Center for the Enhancement of Learning and Teaching

While tests are commonly thought of as a final evaluation tool, they can also be an important formative feedback mechanism for students and faculty, because they allow instructors to track students’ progress towards achieving course goals & objectives.

Tests with multiple choice questions can be an efficient way to provide frequent feedback to students about their progress, particularly in larger classes where there might not be other opportunities for individualized feedback. By targeting each learning objective with questions that probe different levels of knowledge, these tests can reveal an individual’s progress towards multiple course objectives. For example, one question could probe students’ ability to recall discipline-specific terms, while another might ask students to make a judgement based on a context they might encounter in the future.

By providing opportunities for students to check their current knowledge and the effectiveness of their study strategies, multiple choice questions are also a learning tool. Tests increase students’ long-term retention of the material when they encourage students to practice retrieval (the Testing Effect). However, to maximize student learning, multiple choice questions must be carefully designed to target fundamental concepts in the course and to keep students’ thinking focused on those concepts rather than on the language of the questions themselves.

Writing effective multiple choice questions is neither easy nor quick, but taking the time to construct them carefully is critical to creating a test that provides useful feedback and encourages study practices that deepen students’ learning. Below we list several key points to keep in mind as you craft your own questions.

Target learning objectives: Exams are an important opportunity for students to consolidate key knowledge for future retrieval and to focus their learning on important objectives. To achieve this, design each question to target a key learning objective for the course. “Gotcha” questions that target unimportant details may encourage shallow retention by focusing students on memorizing details rather than building a deeper understanding of key concepts. Instead, identify which learning objective each test question targets, as this can provide you with feedback on student progress in specific areas of the curriculum. (See the Table of Specifications for more information about tying test questions to learning objectives.)
Write a clear question stem: Each question should present a single, clearly formulated problem in simple language in its stem. Start by placing as much of the question content and wording as possible in the stem (to avoid verbose response alternatives). Then eliminate excessive or irrelevant details. After writing a draft, revise it to remove unnecessarily wordy or confusing language, and omit vague terms such as seldom, usually, frequently, or likely. Use positive framing whenever possible, so that students focus on the concepts being tested rather than on difficult language in the question. If a negative word must be included, make it noticeable, e.g. capitalized and bold (like NOT or EXCEPT).
Create ~3 or 4 parallel response alternatives: Response alternatives should be uniform in grammatical construction and length, and presented in a logical order. This helps avoid verbal cues (e.g. in grammar, syntax, or word choice) that enable test-wise students to select the correct answer or eliminate an incorrect alternative, and that shift student thinking toward the best game strategy for answering the question instead of the concepts and knowledge being tested. Research shows that 3 or 4 options are sufficient to differentiate student learning on a class exam. Adding implausible alternatives, or ones with lengthy wording, creates an unnecessary cognitive distraction without providing additional information on student knowledge of the course objectives. Avoid options such as “none of the above” or “all of the above”, which are often selected by students with incomplete knowledge and targeted by the ‘test-wise.’ Overlapping answers (where one option is a sub-category of another) should also be avoided, as they can confuse students and lead to inaccurate assessments of knowledge.
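
Some of the points above are mechanical enough to check automatically before a human review. The following is a minimal sketch in Python, not an established tool: the function name check_item, the word lists, and the max_length_ratio threshold are all hypothetical choices made for illustration, and a script like this cannot judge whether a question targets a learning objective or whether its distractors are plausible.

```python
import re

# Illustrative word lists drawn from the guidelines above (assumptions, not exhaustive).
VAGUE_TERMS = {"seldom", "usually", "frequently", "likely"}
NEGATIVE_WORDS = {"not", "except"}
CATCH_ALL_OPTIONS = {"all of the above", "none of the above"}


def check_item(stem, options, max_length_ratio=2.0):
    """Return a list of warnings for one multiple choice question.

    max_length_ratio is an arbitrary threshold (an assumption): the longest
    option may have at most this many times the words of the shortest one.
    """
    warnings = []

    # The guidelines suggest roughly 3 or 4 response alternatives.
    if len(options) not in (3, 4):
        warnings.append(f"has {len(options)} options; 3 or 4 are usually sufficient")

    # Flag catch-all options such as "none of the above".
    for option in options:
        if option.strip().lower().rstrip(".") in CATCH_ALL_OPTIONS:
            warnings.append(f"avoid the option '{option}'")

    # Flag vague terms, and negative words that are not capitalized for emphasis.
    for word in re.findall(r"[A-Za-z']+", stem):
        if word.lower() in NEGATIVE_WORDS and not word.isupper():
            warnings.append(f"negative word '{word}' in the stem is not emphasized")
        if word.lower() in VAGUE_TERMS:
            warnings.append(f"vague term '{word}' in the stem")

    # Flag options whose lengths (in words) differ widely.
    lengths = [len(option.split()) for option in options]
    if lengths and min(lengths) > 0 and max(lengths) / min(lengths) > max_length_ratio:
        warnings.append("options differ widely in length")

    return warnings


# Usage: a deliberately flawed question, followed by a cleaner one.
flawed = check_item(
    "Which planet is not usually visible to the naked eye?",
    ["Mars", "Venus", "Neptune, the eighth and most distant planet from the Sun",
     "None of the above", "Jupiter"],
)
clean = check_item("Which planet is closest to the Sun?", ["Mars", "Mercury", "Venus"])
for message in flawed:
    print(message)
```

A check like this is best treated as a first pass that catches surface flaws, leaving the substantive review of each question to the instructor.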

Once a multiple choice test has been given to students, statistical measures of student performance can identify questions that need revision. Software such as ExamSoft will often produce reports with a variety of statistical measures, but the value of these statistics may vary with individual context. A difficulty index (sometimes referred to as a p-value, not to be confused with the p-value used in statistical significance testing) is the fraction of students answering a question correctly. It therefore ranges between 0 and 1, and a difficulty of 0.25 would equal pure chance for a question with 4 response alternatives. Low values may point to questions that are too difficult or contain confusing wording. Questions with high p-values (e.g., p=0.9) do not separate higher performing students from lower performing students, but may be useful for making sure all students understand core ideas. Another useful statistic is the discrimination of a question, which captures the difference in performance on that question between top and bottom performing students. It ranges from -1 to 1 and reflects the correlation between students’ scores on the question and their overall scores on the exam. Larger positive numbers indicate that students who do well on the exam overall also do well on that question, while negative numbers often point towards questions that need to be revised.
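
For instructors whose exam software does not report these statistics, they can be computed from a table of scored responses. The sketch below is one possible implementation under stated assumptions, and commercial tools may define these measures differently. The function names are hypothetical, each item is scored 1 (correct) or 0 (incorrect), the top and bottom groups are taken as 27% of the class (a common convention, but an assumption here), and the scores are made-up illustrative data.

```python
from statistics import mean, pstdev


def difficulty_index(item_scores):
    """Fraction of students who answered the item correctly (1 = correct, 0 = incorrect)."""
    return mean(item_scores)


def discrimination_index(item_scores, total_scores, group_fraction=0.27):
    """Upper-lower discrimination: fraction correct in the top group minus
    fraction correct in the bottom group, with groups ranked by total exam score.

    group_fraction=0.27 is a conventional choice and an assumption here.
    """
    n = len(item_scores)
    k = max(1, round(n * group_fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])  # lowest total first
    bottom = [item_scores[i] for i in order[:k]]
    top = [item_scores[i] for i in order[-k:]]
    return mean(top) - mean(bottom)


def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between the item score and the total exam score."""
    mx, my = mean(item_scores), mean(total_scores)
    sx, sy = pstdev(item_scores), pstdev(total_scores)
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined without variation; reported as 0 here
    cov = mean((x - mx) * (y - my) for x, y in zip(item_scores, total_scores))
    return cov / (sx * sy)


# Made-up data for eight students: one item's scores and their total exam scores.
item = [1, 1, 1, 0, 1, 0, 0, 0]
totals = [95, 88, 82, 79, 70, 65, 58, 40]

print("difficulty:", difficulty_index(item))
print("discrimination:", discrimination_index(item, totals))
print("item-total correlation:", round(item_total_correlation(item, totals), 2))
```

With these data, half the class answers correctly and the strongest students outperform the weakest on the item, so both discrimination measures come out positive. Note that the total score used here includes the item itself; some analyses exclude it, which slightly lowers the correlation.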

Selected References

Ali, S. H., and K. G. Ruit. “The Impact of Item Flaws, Testing at Low Cognitive Level, and Low Distractor Functioning on Multiple-Choice Question Quality.” Perspectives on Medical Education 4, no. 5 (October 2015): 244–51.

Coughlin, P. A., and C. R. Featherstone. “How to Write a High Quality Multiple Choice Question (MCQ): A Guide for Clinicians.” European Journal of Vascular and Endovascular Surgery 54, no. 5 (November 1, 2017): 654–58.

Danielson, J., and K. Hecker. “Written Assessment.” In Veterinary Medical Education: A Practical Guide, 2017.

DiDonato-Barnes, Nicole, Helenrose Fives, and Emily S. Krause. “Using a Table of Specifications to Improve Teacher-Constructed Traditional Tests: An Experimental Design.” Assessment in Education: Principles, Policy & Practice 21, no. 1 (January 2, 2014): 90–108.

Frey, Bruce B. “The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation”, September 3, 2019.

Haladyna, Thomas M., and Steven M. Downing. “Validity of a Taxonomy of Multiple-Choice Item-Writing Rules.” Applied Measurement in Education 2, no. 1 (January 1, 1989): 51–78.

Morrison, S., and K. W. Free. “Writing Multiple-Choice Test Items That Promote and Measure Critical Thinking.” The Journal of Nursing Education 40, no. 1 (January 2001): 17–24.

Thorndike, Robert M., George K. Cunningham, Robert Ladd Thorndike, and Elizabeth P. Hagen. Measurement and Evaluation in Psychology and Education, 5th Ed. New York, NY, England: Macmillan Publishing Co, Inc, 1991.

Vyas, R., and A. Supe. “Multiple Choice Questions: A Literature Review on the Optimal Number of Options.” The National Medical Journal of India 21, no. 3 (2008).

Image credit: Students in class in the Agnes Varis Campus Center at the Cummings School of Veterinary Medicine, Grafton, MA (Tufts University)