Real or Myth?: An Exploration of the “Clutch Gene” in the NBA

By Eric Zhu

Introduction

The “Clutch Gene” is a term that is thrown around all too casually in barbershop conversations about the best players in the NBA. For example, the late Kobe Bryant, widely regarded as one of the most clutch players in NBA history, a poster child for the “killer mentality,” is just 17 for 50 (34%) in shots with two minutes remaining in playoff games to tie or take the lead. While this is by no means a bad percentage, especially when you consider the difficulty of these shots and the high-pressure situations in which they were taken, it probably isn’t what the casual NBA fan would expect from one of the all-time great clutch players. Now, my use of this metric is not to insinuate that Kobe is not clutch. It is simply an example of an interesting statistic that has motivated me to ask the questions: (1) how can we quantify clutchness in the NBA from NBA game data, (2) do players generally elevate or decline in terms of performance during clutch moments, and (3) what regular season statistics, if any, are most correlated with clutch performance?

These questions are important for a couple of reasons. For one, an effective way to quantify clutchness would change the discourse in NBA media about which players are clutch and which are not by providing a standardized way to compare players against one another. Today, most fans who argue that a player is clutch cite specific big moments in which the player rose to the occasion but ignore the countless times that he failed to. This is understandable, as no one, not even the biggest NBA super-fan, can be expected to remember all of a player’s late-game moments. However, introducing a standardized statistic with which to measure clutchness will allow the discourse to rest on fact and historical performance rather than anecdotal evidence and heuristics. Furthermore, an objective way to quantify clutchness has benefits for players, teams, and coaches, especially when deciding who should take the last shot or make the last play in close-game situations. Perhaps a superstar player consistently fails to play well in big moments whereas a role player is shooting with superstar efficiency down the stretch. My metric aims to capture these phenomena and can hopefully inform the decision-making process of coaches.

Officially, clutch time is defined as the final five minutes of the fourth quarter or overtime when the score is within five points. I will use this official definition from the NBA as my working designation for clutch time in this project.
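As a minimal sketch, the clutch-time definition above can be written as a single check. The function name and its three parameters are hypothetical, and it assumes that period 4 is the fourth quarter, that periods 5 and up are overtime, and that “within five points” means an absolute margin of five or fewer.

```python
def is_clutch_time(period: int, seconds_left_in_period: float, score_margin: int) -> bool:
    """Return True if a game state falls inside the clutch-time window."""
    # Assumption: period 4 is the fourth quarter, periods 5+ are overtime.
    in_late_period = period >= 4
    # "Final five minutes" is read as 300 seconds or fewer on the period clock.
    in_final_five = seconds_left_in_period <= 300
    # "Within five points" is read as an absolute margin of five or fewer.
    close_game = abs(score_margin) <= 5
    return in_late_period and in_final_five and close_game


# Hypothetical game states: (period, seconds left, score margin)
print(is_clutch_time(4, 120, -3))  # late fourth quarter, trailing by 3
print(is_clutch_time(3, 120, 0))   # third quarter, tied
```

Since an overtime period lasts five minutes, any overtime possession with the margin at five or fewer counts as clutch under this reading.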

Related Work

The paper “Clutchness in Basketball” by Arvin Barkazian is a master’s thesis that studies clutchness in basketball, with a specific focus on the effect of point differential on free throw shooting percentages down the stretch. In particular, Barkazian uses data from twenty NBA players over the past 20 NBA seasons to measure the impact of point differential on clutch free throws. Barkazian applies a generalized linear model with a logit link, with free throw percentage during the clutch as the dependent variable and point differential, minutes, field goal attempts, free throw attempts, and whether or not the player’s team is winning as independent variables. The GLM finds a coefficient of nearly 0.20 for point differential with a statistically significant p-value. That is, in situations where the score is very close, players shoot free throws at a lower percentage. Furthermore, the study found that when a player’s team is one point ahead, players shoot a high 89%, but this percentage drops significantly when the player’s team is trailing by one point. From the findings of this paper, it certainly seems that clutchness is a tangible quality with statistical significance and that free throw shooting percentages in close-game situations are a solid way to measure the clutchness of a player.

The 2012 article “A Statistical Analysis Of Clutch NBA Shooters Since 2000” by Jordan Sams analyzes the clutchness of NBA players since 2000. The article starts by defining clutch and explaining the nuances and subjectivity of the subject. Sams then compares the clutch play of a handful of NBA stars through three main statistics: field goal percentage in the clutch, three-point percentage in the clutch, and how many of their shots were assisted. These statistics were compiled for two different definitions of “clutch time.” The first definition is: during the fourth quarter with less than five minutes remaining and a point differential of five points or less. The second, more restrictive definition is: shots taken to tie or take the lead in the final minute of the fourth quarter or overtime of a game. The author then does some baseline analysis and comparisons between the different statistics of the players. This article is very closely related to my project as it attempts to quantify clutchness in a meaningful way. I liked that the author attempted to define clutch time in multiple ways. Perhaps I can weight statistics recorded in the final minute of the game more heavily, or something along those lines. The issue with this article is its limited scope: it looks at only three statistics while ignoring other important ones such as turnovers, steals, blocks, and many more, all of which, in my opinion, should be considered clutch plays when made during clutch time. My project will hopefully capture a broader, more nuanced sense of “clutchness.”

Data

The data for my project is scraped from the official NBA statistics website. In particular, I use player data from the past five completed seasons, including player-level data for each entire regular season, regular season clutch time, and playoffs. Each dataset contains data on the age, per-game counting stats, efficiency stats, wins, losses, and plus-minus of each player who played in the NBA during that season. I then combined the datasets across the five seasons for each scope (regular season, regular season clutch time, and playoffs) into aggregate datasets. Using these aggregate datasets, I calculated the Player Efficiency Rating for each player based on their per-game stats using the formula outlined online. Thus, I ended up with three datasets, one per scope, each with a plethora of per-game player statistics as well as PER. I filtered out players with fewer than 150 clutch minutes played across the past five seasons so that small samples would not produce outliers in the data.
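The aggregation and minimum-minutes filter described above might look roughly like the sketch below. The row layout, player names, and minute totals are made up for illustration and do not come from the scraped data; only the 150-minute threshold is taken from the text.

```python
from collections import defaultdict

MIN_CLUTCH_MINUTES = 150  # threshold stated in the text


def total_clutch_minutes(season_rows):
    """Sum clutch minutes per player across seasons.

    season_rows: iterable of (player, season, clutch_minutes) tuples.
    This row layout is an assumption made for the sketch.
    """
    totals = defaultdict(float)
    for player, _season, minutes in season_rows:
        totals[player] += minutes
    return dict(totals)


def eligible_players(season_rows, min_minutes=MIN_CLUTCH_MINUTES):
    """Keep players with at least `min_minutes` clutch minutes in total."""
    totals = total_clutch_minutes(season_rows)
    return sorted(p for p, m in totals.items() if m >= min_minutes)


# Hypothetical rows, not real NBA data.
rows = [
    ("Player A", "2019-20", 80.0),
    ("Player A", "2020-21", 95.5),
    ("Player B", "2019-20", 40.0),
    ("Player B", "2021-22", 60.0),
]
print(eligible_players(rows))  # Player A totals 175.5 minutes, Player B only 100
```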

Methods

The task of quantifying clutchness is a difficult and nuanced one. Even if we simplify the problem by defining clutch as the final five minutes of the fourth quarter or overtime when the score is within five points, we are still left with the issue of what statistic or statistics to look at during this clutch period as a measure of clutchness. A very basic idea would be to simply look at something like points per minute during clutch time. This fails to capture the essence of “clutchness” for a few reasons, however. For one, a player taking a very high volume of shots at poor efficiency may score more points than a player who takes fewer but more accurate shots. We typically associate clutchness with scoring at an efficient rate during clutch moments, not simply with taking a lot of shots. Looking at effective field goal percentage along with points per minute during clutch time is an improvement. However, it fails to take into account all of the other aspects of the game besides scoring. Turning the ball over during a clutch moment should be penalized, while making a perfect pass to create a wide-open layup at a crucial moment should be considered nearly as clutch as hitting a shot. Even rebounds and fouls are crucial during clutch time, yet are much less talked about. If only there were a statistic that could capture a player’s per-minute contributions holistically…

While there is no perfect all-in-one statistic, there do exist some advanced statistic options that are designed to capture the comprehensive per-minute contributions of a player which factor in many different statistics outside of points and efficiency. In particular, Player Efficiency Rating, created by ESPN columnist and sports analyst John Hollinger, provides us with a carefully crafted per-minute rating of a player’s performance on both the offensive and defensive sides of the game. PER certainly has its flaws. It doesn’t account for opponent strength, tends to favor volume over efficiency, and doesn’t capture any non-box score stats that can impact the game. However, it provides us with a tried and tested metric for overall player contribution that offers significant benefits over only looking at efficiency and points.

Therefore, I will be using PER during clutch time (‘clutch_PER’) as an important baseline measure for my project. We are still missing one key component of clutchness with this metric, however. Colloquially speaking, a player is clutch when they elevate their game and rise to clutch moments. A player is not necessarily clutch just because they perform well during clutch time. For example, going off clutch PER alone, Joel Embiid would be considered the most clutch player in the NBA across the past five regular seasons, yet he is notoriously unclutch in the eyes of the public. Perhaps this makes a case for Embiid being more clutch than people give him credit for, but it more likely reflects his overall strong performance and an issue with our current definition of clutch. Embiid happens to also have one of the highest regular season PERs, so he could perform worse than he normally does in the clutch and still be considered a clutch player, which doesn’t seem to capture the essence of clutchness. Thus, I have decided to use a player’s regular season PER as a baseline with which to calculate ‘clutch_elevation’, or the difference between clutch PER and regular PER. This difference serves as a measure of a player’s improvement or decline in high-stakes moments with respect to their normal performance during the regular season as a whole. This clutch elevation measure will serve as our statistic for how clutch a player is.
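To make the distinction between clutch_PER and clutch_elevation concrete, here is a small sketch. The two players and their ratings are invented for illustration; only the definition clutch_elevation = clutch PER minus regular season PER comes from the text.

```python
def clutch_elevation(clutch_per: float, regular_per: float) -> float:
    """Clutch elevation as defined in the text: clutch PER minus regular season PER."""
    return clutch_per - regular_per


# Hypothetical ratings, not real player data.
players = {
    "Star": {"PER": 28.0, "clutch_PER": 26.5},
    "Role player": {"PER": 12.0, "clutch_PER": 15.0},
}

elevations = {
    name: clutch_elevation(stats["clutch_PER"], stats["PER"])
    for name, stats in players.items()
}

# Ranking by raw clutch PER favors the star...
ranked_by_clutch_per = sorted(players, key=lambda n: players[n]["clutch_PER"], reverse=True)
# ...while ranking by elevation favors the player who improves on his own baseline.
ranked_by_elevation = sorted(elevations, key=elevations.get, reverse=True)

print(ranked_by_clutch_per)
print(ranked_by_elevation)
```

In this toy case the star still has the better clutch PER, yet his elevation is negative because he falls short of his own regular season standard, which is the situation the paragraph above describes.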

Now that I have defined a specific measure of clutchness, to answer my other research questions I explored and analyzed the data. Firstly, I plotted a histogram of clutch elevation for all players across the league with sufficient data. I then plotted a scatter plot of regular season PER against clutch_PER to determine how correlated the two metrics were. I approached the question of how to predict/model clutchness based on regular season performance using multiple linear regression with a train/test split and a robust quantitative feature selection method.

I first split the data randomly into a train and test set with a 75%/25% train/test split. I then fit a multiple linear regression model with clutch elevation as the outcome variable and all of the regular season statistics as the covariates. Using all the possible covariates led to egregious overfitting and a poor adjusted r-squared, as expected, so I ran a stepwise feature selection algorithm based on Akaike information criterion (AIC) optimization where AIC is the metric defined as

AIC = 2K – 2ln(L)

where K is the number of model parameters and ln(L) is the maximized log-likelihood of the model. AIC is designed to capture how well a model fits the given data, adjusted for how many covariates are used. Therefore, a model with an excessive number of parameters will be appropriately penalized. For the optimization itself, I used the stepAIC() function from the MASS package in R, which iteratively removes predictor variables from a regression model until no further removal lowers the AIC. Because the search is greedy, the resulting feature set is a local optimum rather than a guaranteed global minimum. After applying feature selection, I then reran the multiple linear regression with the selected set of features and calculated its prediction accuracy using the test set.
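The sketch below illustrates the idea behind AIC-driven backward elimination. It is not the MASS stepAIC() implementation and is written in Python rather than R. The feature names, the `FIT_GAIN` table, and the `toy_log_likelihood` function are hypothetical stand-ins for refitting a real regression at each step.

```python
def aic(num_params: int, log_likelihood: float) -> float:
    """AIC = 2K - 2 ln(L), as defined in the text."""
    return 2 * num_params - 2 * log_likelihood


def backward_eliminate(features, score):
    """Greedy backward selection: drop the feature whose removal lowers
    `score` the most, and stop when no single removal lowers it."""
    current = list(features)
    best = score(current)
    while current:
        candidates = [
            (score([f for f in current if f != drop]), drop) for drop in current
        ]
        candidate_score, drop = min(candidates)
        if candidate_score >= best:
            break  # no removal improves the score, so stop here
        current.remove(drop)
        best = candidate_score
    return current, best


# Hypothetical contribution of each feature to the log-likelihood.
FIT_GAIN = {"FT_PCT": 9.0, "TOV": 4.0, "REB": 3.0, "AGE": 0.5, "BLK": 1.0}


def toy_log_likelihood(features):
    # Stand-in for refitting the model; a real version would fit a regression.
    return -50.0 + 0.5 * sum(FIT_GAIN[f] for f in features)


def toy_aic(features):
    # K counts one coefficient per feature plus an intercept.
    return aic(len(features) + 1, toy_log_likelihood(features))


selected, final_aic = backward_eliminate(list(FIT_GAIN), toy_aic)
print(selected, final_aic)
```

With these made-up gains, the two weakest features are dropped because each costs more in the 2K penalty than it adds to the fit. The remaining three stay because removing any of them would raise the AIC.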

As for playoff data, I decided to look at the difference between playoff PER and regular season PER, which I call ‘playoff_elevation’, as another metric of interest. Since the entire playoffs are a higher-pressure environment in which the stakes are raised, comparative performance in the playoffs could also be another interesting metric to look at if we want to quantify clutchness. Thus, I did some baseline analysis of playoff elevation across the league and explored its relationship with clutch elevation to see if the two are correlated in any significant way.

Results

I decided to do an exploratory baseline examination of the relationship between PER and clutch time PER. Below is the scatter plot produced by this examination with the line of best fit plotted in red.

As we can see from the plot, there is certainly a positive correlation between PER and clutch PER, as one might expect. A good player is likely to be good in clutch moments simply because he is a good player. However, while there is a seemingly strong positive correlation, there is significant variance, motivating me to look at the gap between the two ratings, clutch_PER – PER, which I define as ‘clutch_elevation’. (Strictly speaking, this gap is a deviation from the line y = x rather than a residual from the fitted line.) This difference in clutch performance from baseline performance will serve as my working metric for clutchness in this project.
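For readers who want to reproduce a line of best fit like the one described, here is a minimal least-squares sketch using only the standard library. The four (PER, clutch PER) pairs are invented for illustration. The sketch also computes both the residual from the fitted line and the simple difference clutch_PER – PER, since the two quantities are related but not identical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ratings, not real player data.
per = [10.0, 15.0, 20.0, 25.0]
clutch_per = [8.0, 11.0, 18.0, 21.0]

slope, intercept = fit_line(per, clutch_per)

# Residual: distance from the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(per, clutch_per)]
# Clutch elevation: distance from the line y = x.
elevation = [y - x for x, y in zip(per, clutch_per)]

print(round(slope, 2), round(intercept, 2))
print(elevation)
```

Residuals from a least-squares line with an intercept always average to zero, whereas the elevation values here are all negative, which is why the two measures can tell different stories.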

The distribution of clutch elevation scores across the league for the past five seasons (for players that have played at least 150 clutch minutes) loosely follows a Normal distribution centered at mean -2.89 with standard deviation 4.46 as shown in the histogram below:

From this distribution, we can see that most players decline slightly in clutch time compared to their standard regular season performance as measured by PER. Some take a steeper fall, some improve slightly, and a select few improve significantly. This more or less aligns with my intuition going into the analysis. Since defense ramps up during the end of close games and there is higher pressure, I would expect the average NBA player to perform slightly worse than they normally would. The relatively high level of variance is likely due to a combination of an inherent clutchness factor and random variability.
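As a rough check on the claim that most players decline, one can treat the distribution as exactly Normal with the reported mean and standard deviation and ask what share falls below zero. This is only an approximation, since the text says the fit is loose, and the +5 cutoff for a “significant” improvement is an arbitrary choice made for this sketch.

```python
from statistics import NormalDist

# Mean and standard deviation of clutch elevation reported in the text.
clutch_elevation_dist = NormalDist(mu=-2.89, sigma=4.46)

# Share of players with negative elevation under the Normal approximation.
share_declining = clutch_elevation_dist.cdf(0.0)
share_improving = 1.0 - share_declining

# Share improving by more than 5 PER points (the cutoff is an assumption).
share_big_gain = 1.0 - clutch_elevation_dist.cdf(5.0)

print(round(share_declining, 3), round(share_improving, 3), round(share_big_gain, 3))
```

Under this approximation roughly three quarters of players decline in the clutch, about a quarter improve, and only a few percent improve by more than five points.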

Looking at the top players by clutch_PER, we see some big names. The players with the top ten highest clutch PER scores are:

  1. Joel Embiid
  2. Kyrie Irving
  3. Nikola Jokic
  4. Paul George
  5. Giannis Antetokounmpo
  6. D’Angelo Russell
  7. Damian Lillard
  8. James Harden
  9. DeMar DeRozan
  10. Bradley Beal

This list contains players that we more or less associate with being elite high-level players and some that are typically regarded as clutch (Damian Lillard, Kyrie Irving, DeMar DeRozan). However, there are also some names on here that are more known for coming up short in crucial moments or at least are not regarded for their clutch play such as Joel Embiid, Giannis Antetokounmpo, and James Harden. Since we are looking at clutch PER here and not clutch elevation, this list tells us nothing about how these players rise to the occasion. However, it does show us that even if certain players on this list do not exceed their normal standard of play in the clutch or do not have the clutch gene, they are so good overall that it doesn’t matter and that their “unclutch” level of play exceeds most of the league anyway. For the regular season, at least…

Looking at the top and bottom six players by clutch elevation, we see some names that might be surprises:

While there are some big names with high clutch elevation, there are also several role players. Furthermore, an interesting trend I noticed is that 4 out of the 6 top clutch players are guards whereas 4 out of the bottom 6 clutch players are centers.

Next, I ran a multiple linear regression model to fit my data with clutch elevation as the outcome variable and I naively selected all the per-game regular season stats as the independent variables. The results from the model are shown below in both the model summary and the scatter plot of predicted clutch elevation based on this naive model against observed clutch elevation.

As we can see from the model summary and plot, there does seem to be a somewhat significant level of correlation between our predictions and the observed values. However, when we take a look at our very low adjusted r-squared and consider the fact that we trained and tested on the same data, some questions arise about the true predictive power of such a model. For one, including so many variables, many of which are highly correlated, is bound to lead to overfitting. Furthermore, since we are training and testing on the same data, the correlation that we observe may just be the model fitting well to this particular training set, not necessarily capturing true patterns and correlations between variables. Thus, to remedy the first issue of having too many features, I narrowed down the feature set for the model using stepwise AIC optimization as detailed in the Methods section. The summary for the resulting model is as follows:

The summary of the updated model shows that free throw percentage has a strong positive association with clutch elevation. A positive coefficient for turnovers and a negative one for rebounds suggest that players who handle the ball more, such as guards, are generally more clutch than big men. The model has a higher, but still low, adjusted r-squared.

To remedy the issue of testing on the same data that I trained with, I split my dataset into a train set and a test set and trained another multiple linear regression model on my train set and again applied a stepwise AIC feature selection method to obtain the following model:

We see a similar set of features with similar coefficients. Most notably, free throw percentage has a very high positive coefficient again.

I then tested my model using the test set and plotted the predicted results against the actual observations from the test set as shown below.

When testing the predictive power of the model on the test set, we get a very low adjusted r-squared value (0.026), demonstrating a very weak relationship between our predictions and the observed values for the test set. Thus, there seems to be no robust way to predict clutchness based on regular season performance, at least with multiple linear regression. The one silver lining is that there does seem to be a consistent, non-negligible relationship between free throw percentage and clutchness.
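The held-out evaluation can be sketched as follows. This is a plain-Python illustration rather than the R code used for the project; the seed, the helper names, and the toy data are assumptions. Note that `r_squared` below is the plain, unadjusted version. The adjusted figure quoted in the text additionally corrects for the number of predictors.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Randomly split rows into (train, test); the seed is arbitrary."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(observed, predicted):
    """Plain R^2 = 1 - SS_res / SS_tot, computed on whatever data is passed in."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical player indices standing in for rows of the dataset.
train, test = train_test_split(range(100))
print(len(train), len(test))

# A model that always predicts the mean scores exactly 0;
# a model that predicts perfectly scores exactly 1.
observed = [1.0, 2.0, 3.0, 4.0]
print(r_squared(observed, [2.5] * 4), r_squared(observed, observed))
```

A value near zero on the test set, like the one reported above, means the model does little better than predicting the average clutch elevation for every player.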

Taking a closer look at this specific relationship, we see that there is a weak, but non-negligible correlation between the two variables:

When it comes to playoff data and playoff elevation, in particular, the distribution of playoff elevation looks to loosely resemble the distribution of clutch elevation:

We have a slightly more negative mean and a smaller standard deviation as well. Overall, though, the similarity between the distribution of clutch elevation and playoff elevation suggests that the two environments, clutch time and the playoffs, are similar in terms of how they affect the overall play of players. It seems that increased defensive intensity and pressure cause a slight decline in performance with variation based on the player.

Looking at the relationship between clutch elevation and playoff elevation, we see a slight positive correlation, but one that is less significant than I had expected.

This weak correlation suggests that the idea of the clutch gene is likely overstated. Yet the fact that there is some positive correlation at all is promising, if insufficient, evidence that certain players do simply handle pressure and increased defensive intensity better than others.
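The strength of the relationship between the two elevation measures can be summarized with a Pearson correlation coefficient. The sketch below is a standard-library implementation, and the five pairs of values are invented purely to show the call; they are not taken from the project data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical elevations for five players, not real data.
clutch_elev = [1.0, -2.0, 0.5, -4.0, 3.0]
playoff_elev = [-1.0, -3.0, -2.5, -2.0, 0.0]

r = pearson_r(clutch_elev, playoff_elev)
print(round(r, 3))
```

Values near 0 indicate little linear relationship and values near 1 a strong positive one, so the “slight positive correlation” described above would correspond to a small positive r.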

Conclusion & Further Studies

We have defined clutchness as the difference in performance between a player during clutch time and their normal regular season play, as measured by PER. Using this definition, we have found that, on average, players decline in performance slightly during clutch time, though there is much variation. Looking at the most clutch players by my definition, some of the names align with general perceptions but many are unexpected role players. In my attempts to build a robust clutchness prediction model, I have concluded that the “clutch gene,” if it exists, transcends standard NBA statistics and exists mostly independently of regular season per-game stats. However, there does seem to be a non-negligible correlation between free throw percentage and clutch elevation, as well as a general tendency for guards to rate higher than big men. I also conclude from my exploration of playoff elevation as another proxy for clutchness that the “clutch gene,” if it exists, is likely overstated in terms of importance. The weak relationship between my two measures of clutchness, clutch elevation and playoff elevation, suggests that the clutch gene is not as real as people think it is, or at least plays a relatively minor role in determining clutch time play. The trend towards guards having higher clutch elevation and the weak but positive correlation between clutch elevation and playoff elevation suggest that there is some sort of clutch gene. However, the weakness of all the aforementioned correlations suggests that while certain players may translate their games slightly better to big moments, the largest factor is still natural variance.

In terms of future work, I think it could be really interesting to quantify clutchness in other ways. Specifically, I would have liked to try to use some sort of metric that utilizes win probability added near the end of games, like the one used in this article. Using win probability added would allow us to better capture how clutch each play is and weight plays relative to each other, rather than weighting all plays of the same type equally within the last five minutes of a close game. For example, a buzzer-beating, game-winning shot should add much more clutchness than, say, that same shot with 4:55 left on the clock while up 2. Using win probability added would allow us to capture such context-based intricacies simply and accurately. I opted not to use such a method due to the time and data limitations of this project, but would certainly be interested in exploring it in the future.

Code & Sources

GitHub Repository

https://github.com/ericz23/stats_100_final_project

Data Sources

Data pulled from: www.nba.com/stats

Bibliography

Barkazian, Arvin. “Clutchness in Basketball.” Calstate, Calstate, 2019, scholarworks.calstate.edu/downloads/w9505257f

“NBA Advanced Stats.” Official NBA Stats, NBA, www.nba.com/stats. Accessed 10 Mar. 2024. 

Porreca, Jackson. “How to Calculate Player Efficiency Rating (PER).” Sports Betting Dime, 14 Nov. 2023, www.sportsbettingdime.com/guides/how-to/calculate-per/.  

Sams, Jordan. “A Statistical Analysis of Clutch NBA Shooters Since 2000.” Liberty Ballers, Liberty Ballers, 29 Feb. 2012, www.libertyballers.com/2012/2/29/2832299/lebron-james-kobe-bryant-dwyane-wade-clutch-nba-playoffs-4th-quarter.
