This variation on req_perform_sequential() performs multiple requests in
parallel. Exercise caution when using this function; it's easy to pummel a
server with many simultaneous requests. Only use it with hosts designed to
serve many files at once, which are typically web servers, not API servers.
req_perform_parallel() has a few limitations:
- Will not retrieve a new OAuth token if it expires part way through the requests.
- Does not perform throttling with req_throttle().
- Does not attempt retries as described by req_retry().
- Only consults the cache set by req_cache() before/after all requests.
If any of these limitations are problematic for your use case, we recommend
req_perform_sequential() instead.
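When throttling or retries matter, the sequential route might look like the following minimal sketch. It assumes the local test server started by example_url(), as used in the Examples section; the rate and retry count are illustrative values, not recommendations.

```r
library(httr2)

# example_url() starts a local test server; swap in your own host.
request_base <- request(example_url()) |>
  req_throttle(rate = 2) |>   # illustrative rate of 2 requests per second
  req_retry(max_tries = 3)    # illustrative: up to 3 attempts per request

reqs <- list(
  request_base |> req_url_path("/status/200"),
  request_base |> req_url_path("/status/200")
)

# Requests run one at a time, so the throttle and retry policies apply.
resps <- req_perform_sequential(reqs, progress = FALSE)
length(resps)
```

The trade-off is speed: each request waits for the previous one to finish.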
Usage
req_perform_parallel(
  reqs,
  paths = NULL,
  pool = NULL,
  on_error = c("stop", "return", "continue"),
  progress = TRUE
)
Arguments
- reqs
A list of requests.
- paths
An optional character vector of paths, if you want to download the response bodies to disk. If supplied, must be the same length as reqs.
- pool
Optionally, a curl pool made by curl::new_pool(). Supply this if you want to override the defaults for total concurrent connections (100) or concurrent connections per host (6).
- on_error
What should happen if one of the requests fails?
  - stop, the default: stop iterating with an error.
  - return: stop iterating, returning all the successful responses received so far, as well as an error object for the failed request.
  - continue: continue iterating, recording errors in the result.
- progress
Display a progress bar for the status of all requests? Use TRUE to turn on a basic progress bar, use a string to give it a name, or see progress_bars to customize it in other ways. Not compatible with req_progress(), as httr2 can only display a single progress bar at a time.
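As a rough illustration of the paths and pool arguments together, the sketch below downloads each response body to a temporary file while capping concurrency below the defaults. It assumes the local test server from example_url() and the version of the function documented here, which accepts pool; the connection limits are illustrative.

```r
library(httr2)

request_base <- request(example_url())
reqs <- list(
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5")
)

# One destination file per request; paths must be the same length as reqs.
paths <- replicate(length(reqs), tempfile(fileext = ".json"))

# A smaller pool than the defaults (100 total, 6 per host); values are illustrative.
pool <- curl::new_pool(total_con = 4, host_con = 2)

resps <- req_perform_parallel(
  reqs,
  paths = paths,
  pool = pool,
  on_error = "continue",
  progress = FALSE
)
length(resps)
```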
Value
A list, the same length as reqs, containing responses and possibly
error objects, if on_error is "return" or "continue" and one of the
requests errors. If on_error is "return" and it errors on the ith
request, the ith element of the result will be an error object, and the
remaining elements will be NULL. If on_error is "continue", it will
be a mix of responses and error objects.
Only httr2 errors are captured; see req_error() for more details.
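To make the shape of the result concrete, the following sketch labels each element of the returned list as a response, an error object, or NULL. It assumes the local test server from example_url(); the kind variable is a hypothetical name used only for illustration. Which elements end up NULL can depend on the order in which the requests complete, so no output is shown.

```r
library(httr2)

request_base <- request(example_url())
reqs <- list(
  request_base |> req_url_path("/status/200"),
  request_base |> req_url_path("/status/404"),
  request_base |> req_url_path("/status/200")
)

resps <- req_perform_parallel(reqs, on_error = "return", progress = FALSE)

# Classify each element: a response, an error object, or NULL (not completed).
kind <- vapply(resps, function(x) {
  if (is.null(x)) {
    "not completed"
  } else if (inherits(x, "error")) {
    "error"
  } else {
    "response"
  }
}, character(1))
kind
```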
Examples
# Requesting these 4 pages one at a time would take 2 seconds:
request_base <- request(example_url())
reqs <- list(
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5")
)
# But it's much faster if you request in parallel
system.time(resps <- req_perform_parallel(reqs))
#> Iterating ■■■■■■■■■■■■■■■■ 50% | ETA: 1s
#> Iterating ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 100% | ETA: 0s
#>    user  system elapsed
#>   0.446   0.633   1.080
# req_perform_parallel() will fail on error
reqs <- list(
  request_base |> req_url_path("/status/200"),
  request_base |> req_url_path("/status/400"),
  request("FAILURE")
)
try(resps <- req_perform_parallel(reqs))
#> Error in req_perform_parallel(reqs) : HTTP 400 Bad Request.
# but can use on_error to capture all successful results
resps <- req_perform_parallel(reqs, on_error = "continue")
# Inspect the successful responses
resps |> resps_successes()
#> [[1]]
#> <httr2_response>
#> GET http://127.0.0.1:40803/status/200
#> Status: 200 OK
#> Content-Type: text/plain
#> Body: None
#>
# And the failed responses
resps |> resps_failures() |> resps_requests()
#> [[1]]
#> <httr2_request>
#> GET http://127.0.0.1:40803/status/400
#> Body: empty
#>
#> [[2]]
#> <httr2_request>
#> GET FAILURE
#> Body: empty
#>