Grade inflation has been a matter of concern at Williams for many years and has recently become the subject of intense discussion across higher education. In light of our upcoming decennial accreditation process, the Steering Committee and President Mandel are considering whether this is an appropriate moment to evaluate this issue on our campus. We are exploring the formation of a committee (potentially a special subcommittee of the CEA or an ad hoc committee) to examine grading trends at the College and, if necessary, propose policy recommendations.
We are writing to gauge your support for this initiative. We would be grateful if you would respond to the brief set of questions linked below, which will be used to inform our next steps. Responses are anonymous unless you choose to provide your contact information. [Survey link]
We would appreciate your response by 5pm on Monday, May 18.
Thank you,
The Faculty Steering Committee and President Maud S. Mandel
Why the email itself is the problem, not the survey
An email this short — under 200 words of substance — is doing a lot of work in a small space, and most of that work locks in a conclusion before the survey opens. Six things go wrong.
1. The framing presupposes its conclusion. Grade inflation is one of several contested interpretations of why mean grades have risen. Others include grade compression, higher admissions selectivity, better teaching, and evaluation-driven incentives. There is also a substantial scholarly tradition that questions whether grades are a useful instrument at all (Kohn, 2002; Blum, 2020; Stommel, 2018). Once the President and the full Steering Committee name grade inflation as the concern in a faculty-wide email, that framing becomes the agenda — whatever the survey returns. The research on framing effects (Tversky & Kahneman, 1981), agenda-setting (McCombs & Shaw, 1972), and the continued-influence effect (Lewandowsky et al., 2012) points the same way: institutional communications shape what people are asked to think about before they shape what people think.
2. The accreditation framing is misleading. Williams' next NECHE comprehensive evaluation is Fall 2027. The 2026 Standards for Accreditation ask for evidence of academic quality, student achievement, and educational effectiveness — but none of them requires, mentions, or specifically invites an inquiry into grade inflation, grade distributions, or grade ranges. Invoking the decennial review makes the inquiry sound externally required when it is actually a discretionary choice.
3. Williams already has the aggregate data, and a plausible structural cause is already on the record. The 2024 CEA memo showed 76% of grades in the A range and 50% straight As. The aggregate trend is not in dispute. What is worth examining is the cause. Statistics professor Richard De Veaux pointed at the Student Course Survey regime and tenure incentives; Love and Kotchen (2010, Eastern Economic Journal) modeled the same dynamic formally. A committee that "examines grading trends" without engaging the existing diagnosis is starting from scratch on a question Williams has already partly answered.
4. The email is silent on what comparable committees have produced. Wellesley's 2004 grade cap widened racial and preparation-linked grade gaps (Butcher, McEwan, & Weerapana, 2014). Princeton's 2004 A-range targets were removed in 2014 after being "too often misinterpreted as quotas." Harvard's 2026 cap proposal faces strong opposition and has been delayed. The lesson is not that grading interventions are always bad — it's that numerical caps generate serious equity, behavioral, and legitimacy problems. Faculty asked to support a committee that might recommend such remedies should know that record before voting.
5. The survey is framed to register support, not to question the premise. The email describes the survey as gauging "support for this initiative" — positioning a skeptical respondent as opposing evaluation itself rather than questioning how the inquiry is framed.
6. The email doesn't meet the evidentiary standard we hold our students to. "A matter of concern at Williams for many years"; "the subject of intense discussion across higher education." No citations, no specification of whose concern, no engagement with counter-views. A student who opened a paper this way would be marked down. The standards we teach apply to institutional communications too — especially when those communications are proposing to evaluate our academic standards.
The annotated text
Faculty Steering Committee email on grade inflation
From the Faculty Steering Committee and President Maud S. Mandel · May 4, 2026
Dear Faculty Colleagues,
Grade inflation has been a matter of concern at Williams for many years and has recently become the subject of intense discussion across higher education. In light of our upcoming decennial accreditation process, the Steering Committee and President Mandel are considering whether this is an appropriate moment to evaluate this issue on our campus. We are exploring the formation of a committee (potentially a special subcommittee of the CEA or an ad hoc committee) to examine grading trends at the College and, if necessary, propose policy recommendations.
We are writing to gauge your support for this initiative. We would be grateful if you would respond to the brief set of questions linked below, which will be used to inform our next steps. Responses are anonymous unless you choose to provide your contact information. [Survey link] We would appreciate your response by 5pm on Monday, May 18.
Thank you, The Faculty Steering Committee and President Maud S. Mandel
Presupposition, and the framing harm
This sentence treats grade inflation as a settled, named phenomenon rather than a contested interpretation of grading data. It asserts concern without saying whose, how widespread, or whether well-founded. The phrase "for many years" suggests a long, validated tradition of worry. And once grade inflation at Williams is the named concern in a faculty-wide email from the President and the Steering Committee, that framing is now the agenda, whatever the survey returns. The relevant research is on framing effects (Tversky and Kahneman, 1981), agenda-setting (McCombs and Shaw, 1972), and the continued-influence effect (Lewandowsky et al., 2012). The short version: institutional communications shape what faculty are asked to think about before they shape what faculty think. Once a frame is set, dissent tends to happen inside it.
The phrase grade inflation is widely credited with entering public discourse through a March 13, 1972 New York Times article by Iver Peterson, which attributed the phrase to sociologist David Riesman. Riesman reportedly linked the phenomenon to anti-elitist faculty attitudes, embedding the term in the politics of its moment. Five decades later, the framing (that rising grades are inflation rather than improvement, compression, or better selection) has hardened into received wisdom without ever winning the underlying empirical argument. There is also a substantial scholarly tradition (Kohn, 2002; Blum, 2020; Stommel, 2018) that views grades themselves as arbitrary and pedagogically counterproductive. Williams faculty engaged that deeper question as recently as 2024, in debating a mandatory first-semester Credit/No Credit transcript policy.
Bandwagon framing, and a standard we hold our students to
"Intense discussion across higher education" is true but incomplete. The actual discussion includes both serious concern about grade compression and serious skepticism about the grade-inflation frame: scholars who argue the discourse is a manufactured panic, who think "grade compression" is the more accurate term, who say the focus on grades distracts from learning, and who note that comparable interventions have widened racial and preparation-linked grade gaps and exposed women faculty to evaluation pressures. None of that nuance survives the email's framing. "Intense discussion" is allowed to read as unanimous concern.
If a student submitted a paper opening "X has been a matter of concern for many years and has recently become the subject of intense discussion," with no citation, no specification of whose concern, and no engagement with counter-views, we would mark it down. The standards we teach apply to institutional communications too, especially when those communications are proposing to evaluate our academic standards. An appeal to vague consensus isn't an argument. We know it isn't because we tell our students it isn't.
See, e.g., Tannock (2019, LSE Higher Education Blog), "The destructive moral panic over university grade inflation"; Pattison, Grodsky, and Muller (2013, Educational Researcher), "Is the Sky Falling? Grade Inflation and the Signaling Power of Grades," which finds using nationally representative data that "the signaling power of grades has attenuated little, if at all"; and Alfie Kohn's longstanding writing in the Chronicle on the rhetorical structure of the discourse. The empirical literature distinguishes among three questions: rising mean grades (uncontroversial), the cause of those rising grades (genuinely contested), and whether rising grades are a problem at all (also contested).
Misleading invocation of accreditation
The email uses accreditation as an external pressure to justify the inquiry, but accreditation doesn't actually create that pressure. Williams' next comprehensive NECHE evaluation is in Fall 2027, under the 2026 Standards for Accreditation (effective July 1, 2026). The 2026 standards restructured the older nine-standard framework into five: Mission, Organization, Governance, and Planning; The Academic Program, Faculty, and Students; Institutional Resources; Educational Effectiveness and the Success of All Students; and Integrity, Transparency, and Public Disclosure. The closest 2026 language is about academic quality, student achievement, and evidence of learning. That language could justify a broad inquiry into whether grades validly reflect learning outcomes. It doesn't require, mention, or specifically invite an inquiry into grade inflation, grade compression, grade ranges, or grade distributions. Invoking the decennial review makes the inquiry sound externally mandated when it's actually a discretionary choice the Steering Committee and President are making.
NECHE's 2026 Standards for Accreditation were restructured into five standards (down from nine in the 2021 version). A full reading of the 2026 standards finds no language requiring, prompting, or even suggesting an institutional inquiry into grade inflation, grade compression, or grade distributions. Standard Two (The Academic Program, Faculty, and Students) addresses curriculum, credits, and academic integrity. Standard Four (Educational Effectiveness and the Success of All Students) addresses learning outcomes, retention, graduation, and post-enrollment success. Neither standard concerns grade levels. If Williams' internal accreditation preparation is still organized around the 2021 standards, the bottom line is the same: the older framework also supports broad educational-effectiveness inquiry, not a required grade-inflation investigation.
Locked frame, and a plausible cause is already named
The committee isn't asking whether grade inflation is a problem. It's asking whether now is the right moment to evaluate the issue — and the issue has already been named and accepted as real. A genuinely open inquiry would start with a different question: is grade inflation even the right diagnosis? Williams already has the aggregate data, and a plausible structural cause is already on the record. Statistics professor Richard De Veaux is quoted in the Record pointing at the Student Course Survey regime and tenure incentives, not faculty laxity. David Love (Williams) and Matthew Kotchen (2010, Eastern Economic Journal) modeled the same dynamic formally and reached a parallel conclusion.
Notice the inversion the email doesn't acknowledge. The President and Steering Committee bring the institutional weight. But the analysis on this question at Williams already exists in the public record and in the economics literature, and it points to the evaluation-and-tenure regime, not to faculty being lax. A committee that "examines grading trends" would be studying the symptom, not the cause. The email itself doesn't cite or engage the Love-Kotchen analysis, the De Veaux interview, or the broader course-evaluation bias literature.
Love and Kotchen (2010, abstract): "placing more emphasis on course evaluations exacerbates the problems of grade inflation and can even decrease a professor's teaching effort." (Note: the same paper also analyzes grade targets as a possible remedy. Love and Kotchen support the mechanism, not the whole anti-cap conclusion.) De Veaux to the Williams Record (2024): "young faculty perceive that to give low grades, to give Bs and B+s, is certainly not, they think, the road to a safe tenure evaluation." Mengel, Sauermann, and Zölitz (2019, JEEA) find that male students rate female instructors about 21 percent of a standard deviation lower than male instructors, while female students rate female instructors about 8 percent of a standard deviation lower. See also Boring, Ottoboni, and Stark (2016) and MacNell, Driscoll, and Hunt (2015).
"If necessary" as fig leaf, and the comparable interventions are cautionary
A committee charged to examine a named problem is under pressure to validate the premise of its charge, even when the formal recommendation stays conditional. "If necessary" is the qualifier that lets the recommendation appear conditional while all the institutional momentum moves in one direction.
The email is silent on what kind of policy recommendations a committee might produce. That silence matters, because comparable committees at peer institutions have produced numerical grade targets and caps, and the empirical record on those is cautionary. Wellesley capped course averages at B+ in Fall 2004. Butcher, McEwan, and Weerapana (2014, Journal of Economic Perspectives) studied what happened: grades dropped more for African-American students and students with low entering test scores, the probability of magna cum laude in treated departments fell from about 20% to 16%, major enrollments in those departments dropped roughly 30%, and student evaluations of the affected faculty went down (the share "strongly recommending" their professors fell about 5 percentage points). Princeton set numerical targets for A-range grades in 2004 (the widely discussed 35% guideline) and removed them in 2014 after the review committee said the targets had been "too often misinterpreted as quotas," raised stress, and pulled attention away from feedback. Harvard proposed in 2026 to cap "flat A" grades at 20 percent plus four students per class (no cap on A-minus). Per the Harvard Crimson, implementation has been pushed to fall 2027 if the proposal passes; the measure was still in faculty governance as of early May 2026 and faced strong opposition among survey respondents (84.9% "definitely" opposed in a Harvard Undergraduate Association survey of nearly 800 respondents). The lesson isn't that grading interventions are always bad. It's that numerical caps and forced decompression generate serious equity, behavioral, and legitimacy problems — and faculty asked to support a committee that might recommend such remedies deserve to know that record up front. None of it is in the email.
Butcher, McEwan, and Weerapana (2014): "the estimated drop in grades in treated departments is smaller for Latina students but much larger than average for black students (including African-Americans and foreign students who self-identify as black), those with low SAT verbal scores, and those with low Quantitative Reasoning scores." The paper finds "majors declined in the treated departments by about eight students, on average, representing a relatively large decline of about 30 percent." Within that decline, the fraction majoring in economics and the sciences increased; the fraction in other social sciences fell; humanities was roughly flat. Princeton's 2014 review report stated the 35-percent targets had been "too often misinterpreted as quotas." The Harvard FAS proposal would cap "flat A" grades at 20 percent plus four enrolled students per class (with no cap on A-minus). The Harvard Gazette describes the formula's effect as up to six A's in a 10-person seminar and 34 A's in a 150-person lecture.
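The 20-plus-four arithmetic is easy to check. Here is a minimal sketch in Python that reproduces the two Gazette figures quoted above; the helper name flat_a_cap is hypothetical, and rounding the 20-percent term down to a whole student before adding four is an assumption that is consistent with the quoted examples but not spelled out in the sources cited here.

```python
import math

def flat_a_cap(enrollment: int) -> int:
    """Approximate the proposed cap on 'flat A' grades:
    20 percent of enrollment plus four students.

    Rounding the percentage term down is an assumption; the reported
    examples are consistent with it, but the sources quoted above do
    not state a rounding rule.
    """
    return math.floor(0.20 * enrollment) + 4

# The two examples the Harvard Gazette gives for the formula:
print(flat_a_cap(10))   # 6  -> up to six A's in a 10-person seminar
print(flat_a_cap(150))  # 34 -> 34 A's in a 150-person lecture
```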
Loaded-question / push-poll-like framing
A push poll is a survey designed to plant a position rather than measure one: the question itself does the persuading by smuggling in a premise the respondent has to either accept or fight. The critique here is of the email's framing of the survey, not of the Qualtrics instrument itself. This is not a "push poll" in AAPOR's strict sense; AAPOR reserves that term for political telemarketing disguised as research. But the email describes the survey as gauging "support for this initiative" rather than asking, for example, "do you think grade inflation is the right name for what's happening?" or "do you think Williams should evaluate its grading?" By framing the respondent's task as registering support for an initiative whose diagnosis is already named, the email's setup positions a skeptical respondent as opposing "evaluation," being against rigor, or refusing to engage. Whatever the actual Qualtrics instrument contains, the email's framing is what most faculty will read first.
Survey link, assess on its own terms
The Qualtrics instrument is a separate object from the email and may differ from the email's framing of it. A survey that allows respondents to register that they think the inquiry is misframed, to note that Williams already has the aggregate data and the structural diagnosis, and to propose alternative framings would partly repair the problem above. A survey that only collects variants of agree/disagree on a presupposed diagnosis would not. Faculty should evaluate the instrument on its own terms before responding. The critique in this annotation is of the email; the survey is downstream of the framing the email has already set.
Wall of authority
Signing with the President and the full Steering Committee signals that institutional power is already aligned. Anyone disagreeing has to do so against the President plus six faculty colleagues. That's rhetorically expensive before any argument has been made. The email doesn't cite or engage the Wellesley/Princeton/Harvard literature, the course-evaluation research, or the empirical critique of the grade-inflation frame. The signature line is doing work that an argument would otherwise have to do.
What a responsible version of this email would have included
A short email can't cover everything. But it doesn't take many extra words to frame a hard question without presupposing the answer. Four things would have made this one substantially better.
1. Name the disagreement, not the verdict
The email could have said: "Mean grades at Williams have risen substantially since the 1980s. Scholars disagree about what this reflects — increased student preparation, improved pedagogy, criterion-referenced grading, evaluation-driven incentives, or some mix. A separate tradition questions whether grades are the right instrument at all. We propose to investigate." That framing acknowledges the debate. The email's framing skips it, treating grade inflation as a settled diagnosis. It also forecloses a question Williams faculty engaged as recently as 2024, when a mandatory first-semester Credit/No Credit transcript policy was on the table.
2. Be honest about accreditation
The decennial NECHE review is real. It is not a mandate to investigate grade inflation. The 2026 standards do not require, mention, or specifically invite such an inquiry. A responsible email would say: "We think the accreditation cycle is a good occasion for this work" — not imply that accreditation requires it.
3. Acknowledge what Williams already knows — including the structural cause
The 2024 CEA memo, the 76%-A-range figure, the longitudinal data: the aggregate trends are not in dispute. A plausible structural cause is also already on the record — De Veaux pointed at the Student Course Survey regime and tenure incentives, and Love and Kotchen (2010) modeled the same dynamic formally. A committee that "examines grading trends" without engaging the existing diagnosis is starting from scratch. And the national literature on gender bias in student evaluations (Mengel, Sauermann, & Zölitz, 2019; Boring, Ottoboni, & Stark, 2016) means any committee that proposes to study grading without first dealing with the evaluation regime risks adopting policies that compound bias against the faculty already most exposed to it.
4. Put the peer-institution record in front of the ask
The email says "policy recommendations" but doesn't say what kind. Comparable committees at Wellesley, Princeton, and Harvard have produced numerical grade caps. The empirical record on those — equity gaps at Wellesley, quota misinterpretation at Princeton, strong opposition and delayed implementation at Harvard — is cautionary and well-documented. Faculty being asked to support a committee that might recommend such remedies deserve to know that record before voting. A responsible email would also explicitly invite the prior-question objection: "If you think this inquiry is misframed, we'd especially welcome that argument."
The bottom line
Mean grades at Williams have risen. What the rise means is contested. A serious inquiry would say so, would engage the structural causes already on the record, and would put the cautionary evidence from peer institutions in front of the ask. This email does none of that. A college that takes ideas seriously can ask itself a hard question better than this.
References
American Association for Public Opinion Research. (n.d.). AAPOR Statements on "Push" Polls. aapor.org
Blum, S. D. (Ed.). (2020). Ungrading: Why Rating Students Undermines Learning (and What to Do Instead). West Virginia University Press.
Boring, A., Ottoboni, K., & Stark, P. B. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research. doi:10.14293/S2199-1006.1.SOR-EDU.AETBZC.v1
Butcher, K. F., McEwan, P. J., & Weerapana, A. (2014). The effects of an anti-grade-inflation policy at Wellesley College. Journal of Economic Perspectives, 28(3), 189–204. doi:10.1257/jep.28.3.189
Harvard Crimson. (2026, February 9). Nearly 85% of Harvard undergraduates oppose proposed cap on A grades, HUA survey finds. thecrimson.com
Harvard Crimson. (2026, March 31). Harvard College delays proposed A-grade cap to 2027, adds ‘SAT+’ designation. thecrimson.com
Harvard Crimson. (2026, May 1). Harvard faculty back 20-plus-four formula over square-root amendment in poll on A-grade cap. thecrimson.com
Harvard Gazette. (2026, March). ‘OK, I get it. This makes sense.’ Grade-inflation panel says updated plan focuses on reining in A’s, restoring integrity of system, freeing students to follow curiosity. news.harvard.edu
Kohn, A. (2002). The dangerous myth of grade inflation. The Chronicle of Higher Education, November 8, 2002. alfiekohn.org
Lewandowsky, S., Ecker, U. K. H., Seifert, C. M., Schwarz, N., & Cook, J. (2012). Misinformation and its correction: Continued influence and successful debiasing. Psychological Science in the Public Interest, 13(3), 106–131. doi:10.1177/1529100612451018
Love, D. A., & Kotchen, M. J. (2010). Grades, course evaluations, and academic incentives. Eastern Economic Journal, 36(2), 151–163. (Love is at Williams College Department of Economics; Kotchen is at UC Santa Barbara and NBER.) doi:10.1057/eej.2009.6
MacNell, L., Driscoll, A., & Hunt, A. N. (2015). What's in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40, 291–303. doi:10.1007/s10755-014-9313-4
McCombs, M. E., & Shaw, D. L. (1972). The agenda-setting function of mass media. Public Opinion Quarterly, 36(2), 176–187. academic.oup.com
Mengel, F., Sauermann, J., & Zölitz, U. (2019). Gender bias in teaching evaluations. Journal of the European Economic Association, 17(2), 535–566. doi:10.1093/jeea/jvx057
New England Commission of Higher Education. (2026). Standards for Accreditation (effective July 1, 2026). neche.org
New England Commission of Higher Education. Williams College institution page (lists Fall 2027 as the date of the next comprehensive evaluation). neche.org/institutions/williams-college
Pattison, E., Grodsky, E., & Muller, C. (2013). Is the sky falling? Grade inflation and the signaling power of grades. Educational Researcher, 42(5), 259–265. doi:10.3102/0013189X13481382
Peterson, I. (1972, March 13). Flunking is harder as college grades rise rapidly. The New York Times, 1, 21. (Widely credited with introducing David Riesman's phrase "grade inflation" into public discourse.)
Princeton University Office of Communications. (2014, August 7). Faculty committee recommends modifications to Princeton's assessment and grading. princeton.edu
Princeton University Office of Communications. (2014, October 6). Princeton faculty approves changes in grading policy. princeton.edu
Rojstaczer, S., & Healy, C. (2010). Grading in American colleges and universities. Teachers College Record, March 4, 2010. gradeinflation.com
Rojstaczer, S., & Healy, C. Grade Inflation at American Colleges and Universities (longitudinal data set). gradeinflation.com (Williams page: Williams.html)
Stommel, J. (2018, March 11). How to ungrade. Jesse Stommel (jessestommel.com). jessestommel.com
Tannock, S. (2019, August 27). The destructive moral panic over university grade inflation. LSE Higher Education Blog. blogs.lse.ac.uk
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211(4481), 453–458. doi:10.1126/science.7455683
Williams Record coverage:
Lin, M. (2021, February 17). Grade inflation continues rise through fall semester, some professors say. williamsrecord.com
Wignall, D. (2024, October 9). The birds and the B's: It's time to have 'the talk' about grades. williamsrecord.com
Zimmerman, H. (2024, November 6). Memo shows 76 percent of grades in A range last year, prompting faculty discussion. (Includes the De Veaux quote on the SCS regime.) williamsrecord.com