Returns a configuration object to pass as the parallel argument
of genproc(). The object describes how to parallelize; the
actual execution is carried out by future.apply::future_lapply()
on top of the backend selected by future::plan().
Usage
parallel_spec(
workers = NULL,
strategy = NULL,
chunk_size = NULL,
seed = TRUE,
packages = NULL,
globals = TRUE
)Arguments
- workers
Integer >= 1, or
NULL. Number of workers to use. Ignored whenstrategy = "sequential". IfNULL, the currentfuture::plan()decides.- strategy
Character, or
NULL. One of"sequential","multisession","multicore","cluster". IfNULL(default), the currentfuture::plan()is used unchanged. If specified, genproc temporarily sets the corresponding plan for the run and restores the previous plan on exit.- chunk_size
Integer >= 1, or
NULL. Number of iteration cases bundled per future. Larger values reduce scheduling overhead at the cost of load-balance granularity.NULLdelegates tofuture.apply's default heuristic.- seed
Controls reproducible random-number generation across workers. Passed to
future.apply::future_lapply()'sfuture.seedargument. DefaultTRUEderives independent L'Ecuyer-CMRG streams from a random master seed. A single integer fixes the master seed.FALSEdisables reproducible RNG and is not recommended unless the user function is known to be RNG-free.- packages
Character vector, or
NULL. Extra packages to attach on each worker before running the user function. genproc itself is attached automatically for every strategy other than"sequential".- globals
Logical or character. Forwarded to
future.apply::future_lapply()'sfuture.globals. DefaultTRUEenables automatic detection, which is correct in almost all cases. Set to a character vector only to override detection.
Choosing a strategy
"sequential": runs in the current process, no workers. Exercises the parallel code path without the overhead; useful for deterministic testing."multisession": portable (works on Windows), launches R subprocesses via parallelly. The default recommendation for most workloads."multicore": forks the current process (Unix/macOS only, not available on Windows and not reliable inside RStudio). Faster startup than multisession but loses portability."cluster": explicit cluster of workers, possibly on other machines. For large-scale batch execution.
For most users, leaving strategy = NULL and calling
future::plan() once at the top of the session is the cleanest
setup.
RNG reproducibility
With seed = TRUE, each case receives an independent
L'Ecuyer-CMRG stream derived from a random master seed. Same
master seed -> identical results regardless of worker count or
chunking. To pin the master seed, pass an integer
(seed = 42L).
Examples
# Use whatever plan the caller has set
spec <- parallel_spec()
# One-off parallel call with 4 workers, reproducible RNG
spec <- parallel_spec(workers = 4, strategy = "multisession",
seed = 42L)
# Exercise the parallel code path deterministically in a test
spec <- parallel_spec(strategy = "sequential")