Adam Chilton, Peter Joy, Kyle Rozema, and James Thomas have written an excellent, meticulous new paper entitled “Improving the Signal Quality of Grades.” The central claim is that some law professors are better graders than others, with a better grader defined as one whose grades correlate more strongly with the student’s final law school GPA. As a result, some top students are unlucky enough to have their records tarnished by a professor whose grades are relatively haphazard. They then miss out on opportunities like law review and judicial clerkships, which are awarded relatively early, based on a small portion of the student’s record. The authors go on to consider various possible remedies for noisy grades, such as dropping bad graders from the 1L curriculum and adding more gradations to the grading scale.
A preliminary question might be whether law schools really want to improve the signal quality of their grades. It is probably fair to say that law schools mostly don’t care that much. For every student denied a clerkship in a pure meritocracy, there is another student who lucks into one. And it seems doubtful that a law school that implemented a better grading system would be more attractive to applicants. Aren’t students drawn to the lax grading systems at places like Yale Law School?
There is an argument, however, that schools should at least care about their strongest students. After all, judges and top law firms have repeat relationships with a law school and may hire its students year after year. If they have a great experience, they are more likely to come back and hire more students. If a student gets a job and does not perform well, the employer is less likely to hire from that school in later years. The stronger the signal, the more employers can rely on it. A strong signal is therefore especially important for employers willing to hire only the very best students at a school.
As one moves down the class, the importance of a strong signal to a school may diminish. Some employers may be quite happy to hire an average, or slightly below- or above-average, student from a school. While employers may appreciate being able to weed out the worst students, they may also be content with the degree alone, which serves as a form of certification that a student meets the school’s standards. That doesn’t mean grades are irrelevant, but as the old joke suggests, a school could reasonably conclude that it has a greater interest in pinpointing its best graduates than its worst. The desire to avoid looking particularly bad can explain both grade inflation and an opaque grading system like the University of Chicago Law School’s. Employers who need to identify the very best students still can, but many employers can simply be happy to hire a graduate of the school without worrying about whether a GPA of 175 is a good one.
Still, law schools presumably want to reward at least their best students, and that requires multiple grade levels and good graders. The authors base their analyses on data from a top-20 law school spanning four decades. (It quickly became clear to me which law school they are describing, but because they seem to want to preserve a degree of privacy for the school, I won’t name it.) Their data show that first-year grades are a pretty good though still imperfect predictor of final performance, with 68% of those in the top quartile after the first year staying there through graduation. They also show that with more semesters, the “share of missed students among the top 1 percent” – i.e., those who finish in the 99th percentile but were not there earlier – drops from 40% after two semesters to 22% after four semesters.
Overall, 1L grades are pretty meaningful, with the mean correlation between grades in a class and the student’s ultimate performance at 0.66. However, this number hides a great deal of variability among professors. For example, the following chart shows the predictive value of seven different professors. Each X refers to one time the professor taught the class, and each O indicates the professor’s average class correlation. It is noticeable that there is some variability even for an individual professor. Perhaps some exams work better than others. What is even more striking, however, is that some graders are better than others.
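The paper’s signal-quality measure – the correlation between one professor’s class grades and students’ ultimate performance – can be illustrated with a toy simulation. Everything below is a sketch with made-up noise levels and sample sizes, not the authors’ data: a “noisy” grader is modeled simply as one whose grades add more grader-specific noise to the same latent student quality.

```python
import numpy as np

rng = np.random.default_rng(0)

n_students = 500
ability = rng.normal(0, 1, n_students)            # latent quality (hypothetical)
final_gpa = ability + rng.normal(0, 0.3, n_students)  # final GPA tracks ability closely

def class_grades(grader_noise_sd):
    """One professor's grades: latent ability plus grader-specific noise."""
    return ability + rng.normal(0, grader_noise_sd, n_students)

# A "high-signal" grader (little noise) vs. a "noisy" grader (lots of noise)
r_signal = np.corrcoef(class_grades(0.5), final_gpa)[0, 1]
r_noisy = np.corrcoef(class_grades(2.0), final_gpa)[0, 1]
print(round(r_signal, 2), round(r_noisy, 2))  # the first is much larger
```

Both graders are measuring the same underlying quality; the noisy one simply conveys less of it per class, which is the sense in which the grade is a weaker signal.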
Or are they? The authors consider the most obvious counter-argument: that different professors measure different skills. Perhaps the professor at the bottom of the chart above is really the best grader, because that professor focuses on skills that actually matter in the workplace. The authors attack this possibility in several ways. First, they find that noisy graders’ grades tend to be poor predictors even of the same professor’s own future grades for the same students. Second, the noisy graders are not highly correlated with one another; in fact, their grades predict the grades of the high-signal graders better than the grades of their high-noise counterparts.
Third, the authors conduct a principal component analysis of all first-year class grades. This is a technique that reduces the dimensionality of data, for example converting eight-dimensional data representing many first-year students’ grades into four-dimensional data representing the underlying qualities that those grades measure. (For a nice explanation of principal component analysis and similar techniques as applied to Supreme Court opinions, see this great article by Joshua Fischman and Tonja Jacobi.) They find that the first principal component – that is, the most important thing that grades seem to measure – explains 61% of the variance in first-year grades, while the second component explains only 8%.
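The PCA step can be sketched in the same simulated style. Assuming, hypothetically, that eight first-year grades share one dominant latent factor plus class-specific noise, the first principal component soaks up most of the variance, which is the pattern the authors report (the specific numbers below are artifacts of my made-up parameters, not the paper’s 61%/8%):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated transcript: 300 students x 8 first-year classes.
# Grades share one latent factor ("general ability") plus class-specific noise.
n_students, n_classes = 300, 8
ability = rng.normal(0, 1, (n_students, 1))
grades = ability + 0.6 * rng.normal(0, 1, (n_students, n_classes))

# PCA via eigendecomposition of the covariance matrix of the grades
cov = np.cov(grades, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
explained = eigvals / eigvals.sum()  # share of variance per component
print([round(x, 2) for x in explained[:2]])  # first component dominates
```

If the noisy graders were measuring a genuinely distinct second skill, one would instead expect a substantial second component rather than a steep drop-off after the first.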
That’s pretty convincing, but I’m not quite sold. After all, 61% is not 100%, and the noisy graders may be measuring different parts of the remaining 39%. In addition, what they are measuring may simply be harder to evaluate. There’s a reason law professors love basing their exams on subtle teaching points (nothing like an exception to the exception for weeding out the best students): it’s relatively easy. If the noisy graders are struggling to spot something that is genuinely difficult to identify, one would expect their grades to be noisy predictors even of their own future grades (and those of their peers), but whatever they are identifying could still be important.
But yes, it could just be that the noisy graders are lazy. (My colleague with the most unusual technique for grading exams isn’t lazy outside of grading, however.) And if schools identified noisy graders, they might be trained or embarrassed into improving. One question I’m curious about is whether the noisy graders also tend to be professors with low teaching evaluations. The authors do not provide any information on this. Finding that less effective teachers are also worse graders would increase my confidence in the results. But it could also suggest that the real problem is teaching rather than grading. If a teacher does not convey clearly what students need to learn, different students may focus on different things, and that could make grades noisy even if the grading were done by a third party.
Improving grading is just one thing law schools could do to make grades fairer and more consistent. For example, law schools could do a better job of making the curve more consistent across classes – and of better reflecting the quality of the students in each class. The authors note that 1L students were randomly assigned to classes, but 2L and 3L students were not. In many schools, students are effectively punished for taking tough classes like Federal Jurisdiction. Meanwhile, some graders tend to have relatively flat grade distributions while others have much more humped ones. The authors explain why this does not affect their correlation measure, but schools could do more to reduce the impact of professor choice on students – unless it turns out, of course, that the professors who award the most tightly clustered grades are also the noisy graders.
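One standard way to put flat and humped curves on a common footing is to standardize grades within each class, so that a grade reports a student’s standing relative to classmates rather than the professor’s personal scale. This is my illustration of that idea, not a remedy the authors specifically propose; the two simulated graders and their parameters are invented:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two professors grade the same 200 students on very different curves:
# a "flat" grader with a wide spread, and a "humped" grader whose grades
# cluster tightly around a higher mean.
quality = rng.normal(0, 1, 200)
flat = 3.0 + 0.8 * quality + rng.normal(0, 0.2, 200)
humped = 3.6 + 0.1 * quality + rng.normal(0, 0.05, 200)

def zscore(g):
    """Standardize one class's grades: mean 0, standard deviation 1."""
    return (g - g.mean()) / g.std()

print(round(flat.std(), 2), round(humped.std(), 2))     # very different spreads
print(round(zscore(flat).std(), 2), round(zscore(humped).std(), 2))  # both 1.0
```

After standardization, an A- from the humped grader no longer outweighs a strong grade from the flat grader merely because of curve shape; of course, as the last paragraph notes, this does nothing about noise within a professor’s own rankings.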