Mastering Evolving-Hockey Pt 2: RAPM
Continuing Our Hockey Analytics Overview and Information Series
Welcome back to The Five Hohl after a bit of an unintentional break. I feel bad that the summer didn’t go as planned. Sure, I produced way more content than last summer, and this is just a silly hobby, but I always like to give back to those who support me and my writing.
Unfortunately, I'm only partially back into the swing of things, so this article is a bit short.
Due to everything, I’ve given a 30-day free extension to all paid-tier subscribers.
Thank you all for your understanding.
Now, back to our regularly scheduled (lolz) content.
Quick Update With Garret
I’m back to lifting weights semi-regularly—if you can call three weeks in a row at the bare minimum "regular." I had to adjust the program I mentioned a few weeks ago. Apparently, my back is what has atrophied the most, as it’s not enjoying high-bar squats after deadlifts compared to before.
My eldest just turned four, which scares the hell out of me with how fast time goes. The days are longer, but the years are shorter.
Somewhat related, I’ve been reading a lot about the intersection of health, longevity, and happiness—not the BS biohacking stuff, but actual academic research. I’m interested in discussing some of that when we get back to Featurette Fridays during the regular season.
Trending Winnipeg Jets Topics
This section will be on hold unless there is major, topical news to discuss. This will allow more time and space for the larger Analytics Overview Series.
Analytics Overview Series: Parsing Through Evolving-Hockey.com (Part 2)
RAPM PAGES
RAPM PLAYER TABLES
These tables work similarly to the player tables we looked at last time. The difference is that the values for goals, expected goals, and Corsi have been adjusted using a method called RAPM.
What is RAPM?
The TL;DR summary: Regularized Adjusted Plus-Minus is a way to account for the linemates and opponents a player faces when looking at their goal, expected goal, and Corsi differentials.
Now for the longer answer:
I’m going to give a bit of a look under the hood, with a hat tip to Justin Jacobs of the Squared 2020 blog. His write-up goes more in depth on the actual math behind RAPM, so read it here if you want to check it out. I’ll give a quick, introductory, hockey-centered version.
Imagine a hockey game with zero penalties called and no goalie pulled. In those sixty minutes, you'll see various combinations of 10 skaters (excluding goalies), five for each team.
A traditional differential like 5v5 goal +/-, xGoal +/-, or Corsi +/- would just be all the "for" events minus all the "against" events for any specific player while they were on the ice.
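As a minimal sketch of that raw differential, here is the calculation on a handful of made-up stints. The skater labels and shot-attempt counts are invented for illustration, not real data.

```python
# Made-up 5v5 stints: who was on the ice for one team, plus shot attempts
# for (cf) and against (ca) during that stint. All values are hypothetical.
stints = [
    {"on_ice": {"A", "B", "C"}, "cf": 6, "ca": 2},
    {"on_ice": {"A", "D", "E"}, "cf": 3, "ca": 5},
    {"on_ice": {"B", "D", "E"}, "cf": 4, "ca": 4},
]


def raw_corsi_diff(stints, player):
    """All 'for' events minus all 'against' events while the player was on the ice."""
    on = [s for s in stints if player in s["on_ice"]]
    return sum(s["cf"] for s in on) - sum(s["ca"] for s in on)


for name in ["A", "B", "D"]:
    print(name, raw_corsi_diff(stints, name))
```

Notice that nothing in this calculation knows who else was on the ice, which is exactly the gap adjusted plus-minus tries to close.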
However, let’s say you played almost the entire game with Nik Ehlers while playing almost exclusively against some AHL journeyman. Your differential would be partly your performance and partly due to the players on the ice at the same time. You’d probably look better than you should.
Adjusted plus-minus turns each sequence of 10 skaters into an equation.
Example: x1(Scheifele) + x2(Ehlers) + x3(Vilardi) + x4(Morrissey) + x5(DeMelo) - y1(op1) - y2(op2) - y3(op3) - y4(op4) - y5(op5) = +1.5
This equation means that when these 10 skaters were on the ice for one session, the Jets were at a +1.5 Corsi per 60 pace for that time.
You could then take all these different equations and solve using linear algebra. The x2 would give you the estimated impact for Ehlers.
If Scheifele and Vilardi carry more negative results when away from Ehlers, the math would then work out with Ehlers’ x2 value garnering more of the +1.5 positive value than Scheifele’s x1 or Vilardi’s x3.
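To make the "one equation per stint" idea concrete, here is a toy sketch in Python with NumPy. Everything in it is an assumption for illustration: six generic skaters, 2-on-2 stints instead of 5-on-5, invented "true" impacts, and no noise. It is not the model Evolving-Hockey runs.

```python
from itertools import combinations

import numpy as np

players = ["A", "B", "C", "D", "E", "F"]
# Invented "true" impacts (Corsi per 60 relative to average); they sum to zero.
true_impact = np.array([3.0, 1.0, 0.0, -1.0, -1.0, -2.0])

rows, targets = [], []
for ours in combinations(range(6), 2):
    rest = [p for p in range(6) if p not in ours]
    for theirs in combinations(rest, 2):
        x = np.zeros(6)
        x[list(ours)] = 1.0     # skaters on "our" side of the equation
        x[list(theirs)] = -1.0  # opposing skaters
        rows.append(x)
        targets.append(x @ true_impact)  # noiseless differential for the stint

X = np.array(rows)     # one row (equation) per stint
y = np.array(targets)  # observed differential for each stint

# Solve every equation at once with least squares. Only differences between
# skaters are identified, so lstsq returns the minimum-norm solution, which
# here is the one centered on zero.
estimate, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(players, estimate.round(2))))
```

With clean data and every combination observed, the solver recovers each skater's impact. Real seasons have neither, which is where the problems discussed next come from.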
This is great because you can penalize players who get easier minutes and lift up players who play tougher minutes. It’s a slightly better way than WOWY (With or Without You) or RelTM (Relative to Teammates) to see which players tend to make their teammates better.
However, the method comes with some issues. For one, you may have to exclude players, and you get massive confidence intervals for players who share huge overlapping minutes with each other. Essentially, two players who rarely play apart leave only a small sample of minutes to tell them apart, and that small sample produces large, unstable estimates. This is the problem of collinearity.
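Here is a small sketch of the extreme case, where two skaters are never apart. The stint matrix is toy data with hypothetical skaters P, Q, and R.

```python
import numpy as np

# Columns: skater P, skater Q, skater R. P and Q are never apart (toy data).
X = np.array(
    [
        [1, 1, 0],
        [1, 1, 1],
        [0, 0, 1],
        [1, 1, 0],
    ],
    dtype=float,
)

# Three skaters, but only two independent columns of information.
print(np.linalg.matrix_rank(X))

# Two very different ways of splitting credit between P and Q...
split_even = np.array([1.0, 1.0, 0.5])
split_wild = np.array([12.0, -10.0, 0.5])

# ...predict exactly the same result for every stint.
print(np.allclose(X @ split_even, X @ split_wild))
```

The data cannot tell those two splits apart, so an unregularized fit has no reason to prefer the sensible one.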
So, in comes ridge regression.
Again, if you want to delve into the actual math, check out Justin Jacobs’ excellent write-up. Since I’m writing both to those who enjoy mathematics and those who just trust the scientific method and evidence-based decision-making, I’ll simplify things a bit:
Ridge regression is a regularization technique, which can be read as a Bayesian prior, that shrinks the adjusted plus-minus estimates toward an expected distribution centered on zero (league average).
Even more simply: Players get less extreme results, but we gain a lot more confidence in those results.
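A minimal sketch of that shrinkage, using the textbook closed-form ridge solution on toy data where skaters P and Q are never apart. The penalty value `lam=1.0` is an arbitrary assumption; in practice it is typically tuned with cross-validation, and this is not the Younggren twins' actual code.

```python
import numpy as np


def ridge(X, y, lam):
    """Closed-form ridge: solve (X'X + lam * I) beta = X'y."""
    n_cols = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_cols), X.T @ y)


# Toy stints: columns are skaters P, Q, R; P and Q always play together.
X = np.array(
    [
        [1, 1, 0],
        [1, 1, 1],
        [0, 0, 1],
        [1, 1, 0],
    ],
    dtype=float,
)
# Results generated from hypothetical impacts of P=1, Q=1, R=0.5.
y = np.array([2.0, 2.5, 0.5, 2.0])

beta = ridge(X, y, lam=1.0)
print(beta.round(3))  # roughly [0.868 0.868 0.421]
```

Two things happen: the inseparable pair splits the credit evenly instead of getting wild offsetting values, and every estimate is pulled a bit toward zero.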
The Younggren twins also add some variables to adjust things further, such as zone starts, home-ice advantage, and whether a team is on the second game of a back-to-back.
That’s all RAPM is. It’s a mathematical way to look at “who makes their teammates better,” like RelTM, that simultaneously looks at “who makes their opponents worse” and accounts for environmental factors like zone starts.
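One way to picture those extra variables is as additional columns in each stint's equation. The sketch below uses hypothetical column names (`home_ice`, `off_zone_start`, `back_to_back`) and generic skaters; the real model's variables and encoding may differ.

```python
import numpy as np

skaters = ["A", "B", "C", "D"]
context = ["home_ice", "off_zone_start", "back_to_back"]  # hypothetical names
columns = skaters + context


def stint_row(on_for, on_against, **flags):
    """Build one equation: +1 for our skaters, -1 for opponents, plus context flags."""
    row = np.zeros(len(columns))
    for name in on_for:
        row[columns.index(name)] = 1.0
    for name in on_against:
        row[columns.index(name)] = -1.0
    for name, value in flags.items():
        row[columns.index(name)] = float(value)
    return row


# A and B against C and D, at home, on the second night of a back-to-back.
row = stint_row(["A", "B"], ["C", "D"], home_ice=1, back_to_back=1)
print(dict(zip(columns, row)))
```

The regression then estimates a coefficient for each context column alongside the skaters, so a player is not credited or blamed for the situation itself.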
It’s still just +/-, but you get punished for soft minutes and lifted for tough minutes.
RAPM TEAM TABLES
The team tables are the same thing, but the values are for teams instead of players.
I will point out that RAPM is a bit less useful for teams than it is for players. Teams have a greater number of events per game, so the raw numbers have a lot less luck in them.
Also, the variety in schedule difficulty for opponents and back-to-back situations is much smaller than the variety of opponents, teammates, and such that players experience.
A genuinely good Corsi team is much less likely to post a negative raw Corsi over a full season than a genuinely good Corsi skater is.
That said, it is still useful and helps one critically view a team’s performance compared to the league.
RAPM CHARTS
We turn our attention to the infamous and highly useful RAPM charts.
The actual UI in RAPM charts is fairly simple and intuitive.
You can look at a skater for even strength and power play, compare two players simultaneously (for either EV or PP), or look at a team’s performance.
The filters let you change the player(s) (or team), the year, and the time frame, which can be either a single season or three seasons (for players only).
Before I go into reading these actual graphs, a quick note and a refresher on a few things.
RAPM methodologies improve our confidence in player impact, but there are still fairly large confidence intervals. Three seasons are much better for estimating true talent and comparing players than one season when available. There is a trade-off in the estimations being less “current” but also less “muddied.”
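For a rough feel of why three seasons help, here is a back-of-the-envelope sketch. It treats each shot attempt as an independent coin flip, which is a simplifying assumption, and the attempt counts and 52% share are invented numbers. RAPM's actual uncertainty comes from the regression, not from this formula.

```python
import math


def corsi_share_se(true_share, attempts):
    """Binomial standard error of a shot-attempt share (simplifying assumption)."""
    return math.sqrt(true_share * (1 - true_share) / attempts)


# Hypothetical skater: 52% true share, about 2,000 on-ice attempts per season.
one_season = corsi_share_se(0.52, 2000)
three_seasons = corsi_share_se(0.52, 6000)

print(round(one_season, 4), round(three_seasons, 4))
print(round(one_season / three_seasons, 3))  # sqrt(3), about 1.732
```

Tripling the sample narrows the uncertainty by a factor of about 1.7 rather than 3, which is the "less muddied" half of the trade-off.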
Hockey can be simplified into three underlying factors (split into six when looking at offense and defense separately):
Shot quantity: Create more chances and prevent your opponent’s.
Shot quality: Time your chances at their most opportune moments, and prevent your opponent from doing the same.
Finishing/setting/goaltending: Make the best of your chances and the least of theirs.
RAPM PLAYER CARDS
Readers of The Five Hohl have seen this image before, but here’s how I interpret it.
Looking at Shot Quantity and Corsi For per 60 (CF/60), we see that Nikolaj Ehlers substantially tilts the ice in the Jets’ favor. The team generates more shot attempts with Ehlers on the ice; his linemates create more, and his opponents allow more. Meanwhile, Kyle Connor has a slightly negative effect.
Ehlers boosts the team’s chances and shot volume to the 99th percentile, while Connor is below average in this area.
Next, we examine shot quality. Note that shot quality is NOT expected goals; it's expected goals per shot. Expected goals combine both shot quantity AND quality.
A player’s impact on shot quality can be estimated by looking at the difference between their impact on xGoals and Corsi. Ehlers has a negligible effect on shot quality relative to the number of chances, while Connor has a very positive impact on the Jets’ shot quality.
The net result is that Ehlers remains far ahead when we look at expected goals, but we see differences in the players' styles—Ehlers produces through quantity, and Connor through quality. Note that this impact is on the team's shot quantity and quality, not just the individual’s.
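The quantity-versus-quality split is just multiplication: expected goals equal shot attempts times expected goals per attempt. A tiny sketch with two invented lines (the numbers are hypothetical, not Ehlers' or Connor's):

```python
def xgf_per_60(cf_per_60, xg_per_attempt):
    """Expected goals for = shot quantity x shot quality (xG per attempt)."""
    return cf_per_60 * xg_per_attempt


# Hypothetical "volume" line: lots of attempts, ordinary quality.
volume_line = xgf_per_60(cf_per_60=65.0, xg_per_attempt=0.040)
# Hypothetical "quality" line: fewer attempts, better looks.
quality_line = xgf_per_60(cf_per_60=52.0, xg_per_attempt=0.050)

print(round(volume_line, 2), round(quality_line, 2))  # both about 2.6
```

Two lines can land on the same expected goals by very different routes, which is why comparing the Corsi and xGoal columns tells you about style and not just results.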
Finally, we have the goal columns, which introduce finishing and setting, plus much more noise due to the rarity of goals compared to shot attempts.
Ehlers’ goal column drops relative to his expected goal column. We cannot and should not infer with certainty how much of this is due to his finishing or setting talents versus “luck,” but the drop is worth noting.
Connor, on the other hand, sees his goal column rise relative to his expected goal column. This suggests some combination of finishing talent, setting talent, and “luck” is boosting his impact on goals beyond what we'd expect from his shot quantity and quality factors.
Even with Connor, who is known as a great finisher, we cannot say with certainty how much of the rise is due to skill versus luck. Goals are rare, and that rarity introduces volatility, increasing the confidence intervals.
We might expect Connor’s xGF→GF column differences to be positive and Ehlers’ to be negative, and this RAPM model provides a good estimate, but there's still uncertainty. This is even more true for defenders, whose goal numbers generally have a lower signal-to-noise ratio.
Defensively, the two players show similar but not identical patterns. The CA→xGA column differences suggest both are poor at limiting shot quality; however, Ehlers mitigates this negative effect by reducing his opponents’ shot quantity.
Ehlers minimizes the amount of time he spends defending, which limits the damage from his similarly poor impact on defensive shot quality.
This look at the RAPM chart through the lens of goal factors helps explain why the Younggren twins do not include goals against on RAPM charts.
The Younggren twins stated they did not include it because repeatability testing suggested players have very little control over their GA/60 rates. Repeatability is key in measuring skill, just like when you bet a friend they can’t make that shot a second time in a basketball game.
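Repeatability testing usually boils down to a year-over-year correlation. The simulation below is a sketch on synthetic data only: the skill and noise spreads are assumptions chosen to contrast a mostly-skill metric with a mostly-noise one, not measurements of any real stat.

```python
import numpy as np

rng = np.random.default_rng(0)
n_players = 500


def year_over_year_r(skill_sd, noise_sd):
    """Correlation between two simulated seasons sharing the same true skill."""
    skill = rng.normal(0, skill_sd, n_players)
    season_1 = skill + rng.normal(0, noise_sd, n_players)
    season_2 = skill + rng.normal(0, noise_sd, n_players)
    return np.corrcoef(season_1, season_2)[0, 1]


# Metric driven mostly by skill: repeats well from one season to the next.
stable = year_over_year_r(skill_sd=1.0, noise_sd=0.5)
# Metric driven mostly by noise: barely repeats at all.
noisy = year_over_year_r(skill_sd=0.2, noise_sd=1.0)

print(round(stable, 2), round(noisy, 2))
```

A metric that behaves like the second case tells you very little about the player, no matter how important the underlying event is.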
That said, the goal-factor lens leads to the same conclusion. After shot quality and quantity, the major defensive factor is goaltending. While Connor or Ehlers could impact goaltending, perhaps by preventing finishing and setting plays that are more or less likely to challenge their specific goaltender (e.g., if the goaltender is weaker against cross-ice passes), that signal will come with a lot of noise that they have little real impact on.
RAPM TEAM CARDS
Using the same “Goal Factor” lens on RAPM metrics, we can analyze the Jets’ performance last season.
At even strength, the Jets were only slightly above average at creating chances. They fared better in shot quality per shot and even better when considering finishing and setting.
Defensively, they prevented chances fairly well, but their shot quality per shot could have been better. Their true strength in elite defensive play came from exceptional, generational goaltending—Connor Hellebuyck was the engine behind the Jets’ even-strength performance.
However, the special teams were pretty terrible, no matter how you look at it.
The Jets struggled to create chances, and those they did create weren’t great in terms of things like shot distance, rebounds, and tips. That said, they did finish and set chances well enough with what was given.
Defense on the PK was similar. The Jets struggled with shot quantity, and their shot quality per quantity made things even worse. Goaltending did help somewhat, but not enough to push the team above average.
Series Thus Far
Closing Thoughts
That’s all for now.
Next, we’ll look into GAR and xGAR. With multiple images, there’s a chance the post will be too long for email. If that happens, just open the article in your internet explorer or whatever to see the full write-up.
Thank you all for reading this summer format, which will continue through September until the regular season begins. We’ll return to our regular, more structured three posts a week starting October 14th, with the regular season in full swing.
If you'd like to support my work here at The Five Hohl, please consider liking, sharing, and subscribing. If you want to support even more, consider joining the paid tier; I offer an additional post each week in the regular season, plus other extras for paid subscribers. I hope to provide even more value to paid subscribers over the coming years.