Use req_cache() to automatically cache HTTP requests. Most API requests
are not cacheable, but static files often are.
req_cache() caches responses to GET requests that have status code 200 and
at least one of the standard caching headers (e.g. Expires, ETag,
Last-Modified, Cache-Control), unless caching has been expressly
prohibited with Cache-Control: no-store. Typically, a request will still
be sent to the server to check that the cached value is still up-to-date,
but it will not need to re-download the body.
To learn more about HTTP caching, I recommend the MDN article HTTP caching.
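The caching rule described above can be sketched as a small predicate. This is only an approximation for illustration, not httr2's internal logic: looks_cacheable() is a hypothetical helper, it does not check the request method, and the responses are synthetic objects built with response() so that no network access is needed.

```r
library(httr2)

# Hypothetical helper: approximates the rule described above.
# It is not how httr2 itself decides what to cache.
looks_cacheable <- function(resp) {
  caching_headers <- c("Expires", "ETag", "Last-Modified", "Cache-Control")

  # TRUE if at least one standard caching header is present
  has_caching_header <- any(vapply(
    caching_headers,
    function(h) resp_header_exists(resp, h),
    logical(1)
  ))

  # Cache-Control: no-store expressly prohibits caching
  cache_control <- resp_header(resp, "Cache-Control", default = "")
  forbids_store <- grepl("no-store", cache_control, fixed = TRUE)

  resp_status(resp) == 200 && has_caching_header && !forbids_store
}

# Synthetic responses for illustration (header values are made up)
with_etag <- response(status_code = 200, headers = list(ETag = '"abc123"'))
resp_no_store <- response(
  status_code = 200,
  headers = list(`Cache-Control` = "no-store")
)
no_headers <- response(status_code = 200)

looks_cacheable(with_etag)
looks_cacheable(resp_no_store)
looks_cacheable(no_headers)
```

Only the first response satisfies the rule; the second is excluded by no-store and the third carries no caching headers at all.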
Arguments
- req
A request.
- path
Path to cache directory.
- use_on_error
If the request errors, and there's a cached response, should req_perform() return that instead of generating an error?
- debug
When TRUE, will emit useful messages telling you about cache hits and misses. This can be helpful to understand whether or not caching is actually doing anything for your use case.
Value
A modified HTTP request.
Examples
# GitHub uses HTTP caching for all raw files.
url <- paste0(
"https://raw.githubusercontent.com/allisonhorst/palmerpenguins/",
"master/inst/extdata/penguins.csv"
)
# Here I set debug = TRUE so you can see what's happening
req <- request(url) %>% req_cache(tempdir(), debug = TRUE)
# First request downloads the data
resp <- req %>% req_perform()
#> Saving response to cache "d5d1ddd7f99f55dbc920c63f942804c0"
# Second request retrieves it from the cache
resp <- req %>% req_perform()
#> Found url in cache "d5d1ddd7f99f55dbc920c63f942804c0"
#> Cached value is fresh; retrieving response from cache
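The use_on_error argument described under Arguments might be combined with a dedicated cache directory as in the minimal sketch below. The directory name and the URL are placeholders chosen for illustration, and the request is only constructed here, not performed.

```r
library(httr2)

# Hypothetical cache location inside the session's temporary directory
cache_dir <- file.path(tempdir(), "httr2-cache")

# Placeholder URL; substitute a real, cacheable resource
req <- request("https://example.com/data.csv") %>%
  req_cache(path = cache_dir, use_on_error = TRUE, debug = TRUE)

# If req_perform(req) later errors and a cached response exists,
# the cached response is returned instead of an error.
```

Falling back to a cached copy suits data that changes rarely, where a slightly stale file is preferable to a failed script.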