Cheating is undesirable and unethical, but unfortunately it does sometimes occur. Recent events at 2 ABMS specialty boards1,2 have illustrated that the medical certification industry is not immune to this phenomenon. Although cheating has numerous moral and professional implications, we wish to address its implications from a psychometric perspective. Our intent is to highlight some of the less obvious ways in which all ABFM diplomates could be affected should diplomates and candidates resort to cheating on our examinations.
So, what is cheating? Cizek3 defines it as “any action that violates the rules for administering a test, any behavior that gives an examinee an unfair advantage over other examinees, or any action on the part of an examinee or test administrator that decreases the accuracy of the intended inferences arising from the examinee’s test score or performance.” The ABFM goes to great lengths to ensure a fair test for all examinees. When examinees register for ABFM exams, they promise to adhere to both the ethical and legal standards associated with the administration of the examination. This compact between the ABFM and the candidates minimizes the risk of compromised examination scores. Unfortunately, when members of either party fail to adhere to the agreed-upon standards, problems can arise.
Cheating as a Threat to the Validity of Examination Scores
Validity is perhaps the most important aspect of any test.4 The concept of validity refers to the extent to which interpretations and inferences gleaned from a score are accurate. When cheating occurs, estimates of an examinee’s performance are no longer accurate. Perhaps the most obvious example of cheating as a threat to validity occurs when an individual is unduly advantaged and receives a score that is higher than his or her true estimate. The inflated score would essentially be a misrepresentation of that individual’s performance, thus yielding an inaccurate estimate of performance. More subtle ways in which cheating can affect validity exist as well.
The most overt threat to examination validity would be associated with the leakage of exam items. Most testing organizations, the ABFM included, possess item banks with a large pool of items readily available for appearance on an examination. Items often vary with regard to how many times they may be used; some items are only used once, whereas others may be used perpetually provided they remain valid from a content perspective and continue to function in a psychometrically sound manner. Some overlap of items across administrations almost always exists, although the amount of overlap varies considerably across testing organizations. In any instance, exam items that are leaked from the item bank could give those with access a significant advantage. Regardless of how the test is constructed, if a single item has been compromised it could result in some examinees receiving a score that misrepresents their actual estimates of performance. Of course, the more items that are leaked, the greater the threat to the validity of the examination.
Because most high-stakes examinations are scored with some form of item response theory (IRT) methodology, the difficulty of the items plays an important role in estimating an examinee’s performance. Cheaters can therefore distort item calibrations by making items appear easier than they actually are. Although isolated incidents of cheating would have negligible effects on these calibrations, wide-scale cheating would severely affect them. In fact, the more rampant the cheating, the greater the negative consequences for all other examinees, as they would in turn need to get more items correct in order to pass the exam. Thus, anyone who cheats on a high-stakes exam is not only selfishly influencing his or her own score, but doing so at the expense of others as well.
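To make the effect on item calibrations concrete, the following is a minimal sketch, not a description of the ABFM's scoring procedures. It assumes a Rasch model (one common form of IRT), a hypothetical cohort, and a hypothetical share of examinees who saw the item in advance and answer it correctly with probability 0.95. The difficulty "estimate" here is simply the negative log-odds of the proportion correct, a crude stand-in for a real calibration. All function and parameter names are illustrative.

```python
import math


def rasch_probability(theta, b):
    """Rasch model: chance that an examinee of ability theta answers an
    item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def apparent_difficulty(true_b, cheat_fraction, abilities, p_cheater=0.95):
    """Crude difficulty estimate from the expected proportion correct.

    Assumptions (illustrative only): a share `cheat_fraction` of examinees
    saw the item in advance and answers correctly with probability
    `p_cheater`; everyone else follows the Rasch model. The estimate is the
    negative log-odds of the proportion correct, not a full IRT calibration.
    """
    honest_p = sum(rasch_probability(t, true_b) for t in abilities) / len(abilities)
    observed_p = (1.0 - cheat_fraction) * honest_p + cheat_fraction * p_cheater
    return -math.log(observed_p / (1.0 - observed_p))


if __name__ == "__main__":
    abilities = [0.0]  # hypothetical cohort, every examinee at ability 0
    for fraction in (0.0, 0.01, 0.10, 0.30):
        estimate = apparent_difficulty(1.0, fraction, abilities)
        print(f"cheating share {fraction:.2f}: apparent difficulty {estimate:.2f}")
```

Under these assumptions, a 1% share of cheaters barely moves the estimate, whereas a 30% share makes a fairly hard item look close to average, which mirrors the contrast between isolated and wide-scale cheating described above.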
The notion of item difficulty calibrations becoming altered can lead to other adverse effects. For instance, exams are typically equated, or brought onto the same scale, by using a number of common items across the exams. These common items are referred to as item anchors. If the items used in the anchoring process have been tainted by widespread cheating, the newly constructed test will tend to be considered easier from a measurement perspective. Under most IRT traditions, easier tests require more correct answers in order to pass. In the aforementioned scenario, all examinees would be affected and would need to answer more items correctly in order to pass the exam. With some IRT scoring methods, items are scored in such a way that credit is given (or not) based upon one’s response to each individual item. In instances where a particular item has been affected by inaccurate calibrations, examinees who correctly answer the question will receive less credit than they actually deserve and examinees who incorrectly answer the question will be punished more severely as the scoring method attempts to fine-tune a performance estimate. Regardless of the scoring method used, widespread cheating would have the potential to negatively impact all examinees in such a scenario.
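The link between items that appear easier and a higher number of correct answers needed to pass can be sketched as follows. This is an illustration under stated assumptions rather than the ABFM's equating method: it assumes a Rasch model, a passing standard fixed on the ability scale, and a hypothetical rule that the raw cut score is the expected raw score at that standard, rounded up. The item difficulties and the size of the distortion (0.8 logits) are invented for the example.

```python
import math


def rasch_probability(theta, b):
    """Rasch model: chance that an examinee of ability theta answers an
    item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_raw_score(theta, difficulties):
    """Expected number of correct answers at ability theta
    (the test characteristic curve evaluated at theta)."""
    return sum(rasch_probability(theta, b) for b in difficulties)


def raw_cut_score(theta_cut, difficulties):
    """Assumed rule: the smallest whole raw score at or above the expected
    raw score of an examinee sitting exactly at the passing standard."""
    return math.ceil(expected_raw_score(theta_cut, difficulties))


if __name__ == "__main__":
    theta_cut = 0.0  # hypothetical passing standard on the ability scale
    accurate = [-1.0, -0.5, 0.0, 0.5, 1.0] * 4   # 20 hypothetical items
    tainted = [b - 0.8 for b in accurate]        # same items, calibrated as easier
    print("raw cut with accurate calibrations:", raw_cut_score(theta_cut, accurate))
    print("raw cut with tainted calibrations: ", raw_cut_score(theta_cut, tainted))
```

Because the standard stays fixed while the calibrations say the form is easier, the raw score treated as equivalent to that standard rises, and it rises for every examinee, honest or not.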
Deterring Cheating—A Call for Assistance
The ABFM works diligently to ensure that a fair and psychometrically sound examination is administered and that all resulting scores are valid. In addition to some of the more straightforward safeguards against cheating provided by our testing vendor and standardized exam process, our psychometric staff also have a number of sophisticated methods and techniques to detect cheating. For security purposes we will not reveal the specifics of the various tools and techniques we use, but we assure all examinees that we work very hard to ensure the accuracy of our examination results. Unfortunately, our means of detecting cheating have limitations. For this reason, we ask our candidates and diplomates to help ensure everyone is given a fair test so that all score results are as accurate as possible. We ask that anyone with knowledge of misconduct related to the administration of the ABFM examination immediately report this information to the ABFM Test Security Group. For more information about suspected cheating and how you may contact the ABFM, please refer to the “Suspected Cheating” page on our website.5
Threats to the validity of the ABFM’s examination results are minimized when cheating does not occur. However, any instance of cheating could generate significant consequences not only for the examinees who benefited from the unfair advantage, but also for the honest and ethically responsible examinees who did not. The old adage that “one bad apple destroys the entire bunch” applies in many ways to the accuracy of information yielded from test scores. While the overwhelming majority of family physicians conduct themselves in ethically responsible ways, we as a certification organization remain vigilant with regard to cheating and respectfully ask that anyone with knowledge of others who have cheated (or are planning to cheat) on ABFM examinations report this information to us as soon as possible.
- © 2012 Annals of Family Medicine, Inc.