nflfastR


nflfastR is a set of functions to efficiently scrape NFL play-by-play data. nflfastR expands upon the features of nflscrapR.

We owe a debt of gratitude to the original nflscrapR team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, without whose contributions and inspiration this package would not exist.

Installation

The easiest way to get nflfastR is to install it from CRAN with:

install.packages("nflfastR")

To get a bug fix or use a feature that has not yet reached CRAN, you can install the development version of nflfastR either from GitHub with:

if (!require("pak")) install.packages("pak")
pak::pak("nflverse/nflfastR")

or prebuilt from the development repo with:

install.packages("nflfastR", repos = c("https://nflverse.r-universe.dev", getOption("repos")))

Usage

The Getting Started article provides some application examples, but these assume a basic knowledge of R. For anyone looking for an introduction to nflfastR and R together, we recommend the nflfastR beginner’s guide.

You can find column names and descriptions in the Field Descriptions article, or by accessing the field_descriptions data frame that ships with the package.
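As a minimal sketch of looking up column descriptions from R, assuming nflfastR is installed and that the data frame exposes columns named `Field` and `Description` (check `names(nflfastR::field_descriptions)` if your version differs):

```r
# `field_descriptions` is a data set shipped with nflfastR.
fields <- nflfastR::field_descriptions

# Show the descriptions for a few model-related columns.
# The column names `Field` and `Description` are assumed here.
wanted <- c("epa", "wp", "cpoe")
fields[fields$Field %in% wanted, c("Field", "Description")]
```

The same pattern works for any column you meet in the play-by-play data: filter on `Field` and read the matching `Description`.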

Data access

Even though nflfastR is very fast, we recommend downloading the prebuilt data sets or loading them with the nflreadr package rather than scraping them yourself. These data sets include play-by-play data for complete seasons going back to 1999 and are updated nightly during the season. The files contain both regular season and postseason data, and you can use game_type or week to identify which games occurred in the postseason.
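The sketch below shows one way to separate postseason plays once the data is loaded. The helper `postseason_plays()` is hypothetical (it is not part of nflfastR or nflreadr), and it assumes the play-by-play table marks the game type in either a `season_type` column (with `"POST"` for postseason) or a `game_type` column (with `"REG"` for regular season). Which of the two is present may depend on the data version, so the helper checks for both. The toy data frame uses made-up rows so the example runs without a download.

```r
# Hypothetical helper: keep only postseason rows of a play-by-play table.
postseason_plays <- function(pbp) {
  if ("season_type" %in% names(pbp)) {
    keep <- pbp$season_type %in% "POST"
  } else if ("game_type" %in% names(pbp)) {
    keep <- !is.na(pbp$game_type) & pbp$game_type != "REG"
  } else {
    stop("No `season_type` or `game_type` column found.")
  }
  pbp[keep, , drop = FALSE]
}

# Toy rows with made-up values, standing in for real play-by-play data.
toy <- data.frame(
  game_id     = c("a", "b", "c"),
  season_type = c("REG", "REG", "POST"),
  week        = c(1, 18, 19)
)
post <- postseason_plays(toy)

# With real data (requires the nflreadr package and a network connection):
# pbp  <- nflreadr::load_pbp(2023)
# post <- postseason_plays(pbp)
```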

nflfastR models

nflfastR uses its own models for Expected Points, Win Probability, Completion Probability, and Expected Yards After the Catch. To read about the models, please see the post on Open Source Football. For a more detailed description of the motivation for Expected Points models, we highly recommend the paper from the nflscrapR team.

Here is a visualization of the Expected Points model by down and yardline.

Here is a visualization of the Completion Probability model by air yards and pass direction.

nflfastR includes two win probability models: one with and one without incorporating the pre-game spread.
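To illustrate how the two models can be compared, here is a small sketch that assumes the play-by-play data stores the spread-free estimate in `wp` and the spread-aware estimate in `vegas_wp`. The helper `wp_model_gap()` is hypothetical, and the toy values are made up for illustration.

```r
# Hypothetical helper: per-play difference between the spread-aware and
# spread-free win probability estimates.
wp_model_gap <- function(pbp) {
  stopifnot(all(c("wp", "vegas_wp") %in% names(pbp)))
  pbp$vegas_wp - pbp$wp
}

# Toy rows with made-up values, standing in for real play-by-play data.
toy <- data.frame(wp = c(0.50, 0.62), vegas_wp = c(0.65, 0.70))
gap <- wp_model_gap(toy)
```

A large gap early in a game typically reflects the pre-game spread, since the two estimates otherwise see the same game state.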

Special thanks