Skip to contents

Loop over five days at a time to download data from the Statcast search API at Baseball Savant. This includes swing tracking data not available through the MLB statsapi.

Usage

download_baseballsavant(
  start_date,
  end_date,
  game_type = "R",
  cl = NULL,
  verbose = TRUE
)

Arguments

start_date

first date included in the download

end_date

last date included in the download

game_type

character vector of game types to include. Options are "R" (regular sesason), "F" (first-round playoff series, aka wild card), "D" (division series), "L" (league championship series), "W" (world series), "S" (spring training), "A" (all-star game), "E" (exhibition). Default is "R".

cl

optional cluster object for parallel computation, default is NULL (not parallel)

verbose

logical, should progress be printed to console?

Value

a dataframe with one row per pitch and all available columns

Details

The baseballsavant API limits queries to 25,000 rows, so we run several queries if necessary. If any of the queries returns exactly 25,000 rows, this indicates that the user probably has not gotten all of the expected data, so this function throws a warning.

Examples

if (FALSE) { # \dontrun{
   data_baseballsavant <- download_baseballsavant(
     start_date = "2024-07-01",
     end_date = "2024-07-01"
   )
} # }