Changelog
Source: NEWS.md
genproc 0.2.0
CRAN release: 2026-05-12
New features
- genproc() now integrates with the progressr framework. When the calling code is wrapped in progressr::with_progress(...), one progression signal is emitted per completed case (in sequential and parallel modes; signals from worker subprocesses are propagated by future.apply). The user picks any handler (text bar, RStudio gadget, beeps, custom) via progressr::handlers(). Without with_progress(), the integration is a complete no-op. progressr is in Suggests; the integration is skipped when it is not installed. Live monitoring of non-blocking runs is on the roadmap.
- New errors(result) returns the failed-case rows of the log with all original columns (case_id, mask params, error_message, traceback, duration_secs). Replaces the boilerplate result$log[!result$log$success, ] pattern.
- New summary(result) (S3 method on genproc_result) produces a compact human-readable digest: status, success rate, per-case duration stats (mean, max, slowest case_id), and the top recurring error messages by occurrence (configurable via top_errors). Useful on runs with many cases where the raw log is too noisy to eyeball.
- New rerun_failed(r0, f) helper. Sibling of rerun_affected(): filters the original mask down to the cases that failed and re-runs genproc() on that subset only. Useful after fixing the cause of a transient failure.
- New rerun_affected(r0, diff, f) helper. Closes the reproducibility loop: when diff_inputs() reports drift between two runs, rerun_affected() filters the original mask down to the cases that referenced the impacted files and re-runs genproc() on that subset only. The resulting genproc_result is a small refresh, not a full re-run.
- diff_inputs() now returns a new $cases_affected field: a data.frame with columns case_id, path, column, change_type listing every (case, input column) pair impacted by the diff. Available both programmatically and as input to rerun_affected(). The print method also shows a concise summary (“Cases affected: N”) and a hint towards rerun_affected().
- print.genproc_input_diff now distinguishes small size variations whose human-readable rounding is identical: when the formatted size is the same on both sides, the byte delta is shown explicitly (size: 1.1 KB -> 1.1 KB (+6 B)).
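To make the errors() and summary() entries concrete, here is a minimal base-R sketch of the same operations on a hand-built log. The data.frame below is hypothetical (its values are made up and it carries only a few of the columns listed above), and the digest shown is an approximation of the idea, not the package's actual implementation.

```r
# Hypothetical log; a real genproc log carries more columns (mask params, traceback).
log <- data.frame(
  case_id       = sprintf("case_%04d", 1:5),
  success       = c(TRUE, FALSE, TRUE, FALSE, FALSE),
  error_message = c(NA, "file not found", NA, "timeout", "file not found"),
  duration_secs = c(0.8, 0.1, 1.4, 5.0, 0.1),
  stringsAsFactors = FALSE
)

# The boilerplate that errors(result) replaces: failed rows, all columns kept.
failed <- log[!log$success, ]

# A summary()-style digest: success rate, slowest case, most frequent errors.
success_rate <- mean(log$success)
slowest_case <- log$case_id[which.max(log$duration_secs)]
top_errors   <- head(sort(table(failed$error_message), decreasing = TRUE), 2)

print(failed)
print(top_errors)
```

Counting messages with table() and sorting by frequency is one plausible way to surface recurring failures; the cut-off of 2 plays the role of the top_errors setting mentioned above.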
UX improvements
- result$reproducibility$parallel now carries an effective_strategy field alongside the user-requested strategy. The two differ when the user passed workers without an explicit strategy, in which case genproc() auto-defaults to "multisession"; the snapshot now records both, preserving the audit trail of what was requested vs what was applied. The Mode line of print(result) now shows the effective strategy by default, so a sequential vs parallel multisession run is no longer ambiguous in the printed summary.
- status() now distinguishes "done" (the wrapper future resolved successfully) from "error" (the wrapper crashed), even before await() is called. Previously status() returned "done" as soon as the future was resolved, regardless of outcome, which led to the misleading Status: done (not collected) print on a job that had actually failed. The peek result is cached in a shared environment so that a subsequent await() does not re-materialize the future.
- print(result) is more informative: a Started line shows the run’s timestamp, a Mode line summarises the execution configuration (sequential, multisession parallel (4 workers), non-blocking + multisession parallel (6 workers), etc.), and the method emits errors(x) / summary(x) hints when failures occurred. The non-blocking print also distinguishes done (not collected) from error (not collected).
- When parallel was used but startup overhead clearly dominated the run, print(result) now emits a Note warning. It triggers on either of two metrics: parallel efficiency below 50% when workers is supplied (catches cases like parallel_spec(workers = 4) that yield no real speedup), or wall-clock above cumulative * 1.2 in power-user mode (workers unknown). Both require wall > 0.5s to avoid noise. Addresses the common surprise of activating parallel on a small workload and observing a slowdown.
- Tracebacks captured by the logged layer are now substantially shorter and easier to read. Internal dispatcher frames (execute_cases, do.call, FUN), invocation context frames (source, eval, withVisible), and PSOCK worker frames (workRSOCK, workLoop, workCommand, makeSOCKmaster) are now dropped from the head of the stack, so the first surviving frame is always user code. User calls to lapply() or do.call() from within their own function are preserved (the head-position filter only consumes leading frames).
- Composing parallel = parallel_spec(...) and nonblocking = nonblocking_spec(...) now works out of the box on Windows and in RStudio configurations where the wrapper subprocess inherits getOption("mc.cores") set to 1. Previously, the composed call failed with a parallelly “only 1 CPU cores available” error, and (less visibly) emitted a misleading soft-limit warning. genproc() now applies two surgical adjustments inside the wrapper subprocess in the composed case (only when the user has not set their own values): it sets R_PARALLELLY_AVAILABLECORES_METHODS = "system" to lift the hard limit, and raises options(mc.cores) to silence the soft-limit warning. The calling session is never modified.
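The head-position traceback filter described above can be sketched in a few lines of base R. This is an illustration only: it assumes each frame is reduced to a bare function name (real tracebacks hold full calls), and drop_leading_frames() is a hypothetical helper, not a function exported by genproc.

```r
# Frame names taken from the changelog entry above.
internal_frames <- c(
  "execute_cases", "do.call", "FUN",                        # dispatcher
  "source", "eval", "withVisible",                          # invocation context
  "workRSOCK", "workLoop", "workCommand", "makeSOCKmaster"  # PSOCK workers
)

# Hypothetical helper: drop internal frames from the head of the stack only.
drop_leading_frames <- function(frames, internal = internal_frames) {
  is_user <- !(frames %in% internal)
  # cumsum() turns positive at the first user frame and stays positive,
  # so later do.call()/lapply() frames inside user code are kept.
  frames[cumsum(is_user) > 0]
}

stack <- c("source", "eval", "execute_cases", "do.call", "FUN",
           "my_fun", "do.call", "helper")
trimmed <- drop_leading_frames(stack)
print(trimmed)
```

Filtering by position rather than by name everywhere is what keeps the second do.call in the example: it sits after the first user frame (my_fun), so it is treated as user code.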
genproc 0.1.0
First public release. The package consolidates the four execution layers (logged, reproducibility, parallel, non-blocking) and the building blocks (from_example_to_function(), from_function_to_mask(), rename_function_params(), add_trycatch_logrow()) under a stable API contract. The genproc_result S3 class fields are guaranteed forward-compatible across the 0.x series.
Execution layers
- New genproc() runs a function over an iteration mask, with two mandatory layers always active:
  - Logged: structured log with real traceback (captured via withCallingHandlers()) and per-case timing.
  - Reproducibility: environment snapshot at run start (R version, platform, loaded package versions, mask, and specs of any optional layer used).
- New parallel_spec() and the parallel argument of genproc(): optional parallel dispatch over future.apply::future_lapply(). Auto-defaults to "multisession" when workers is passed without an explicit strategy, and restores the previous plan on exit.
- New nonblocking_spec() and the nonblocking argument of genproc(): genproc() returns immediately with a genproc_result of status "running" while the run continues in a background future. Use status() to poll, await() to block until resolution. Composable with parallel.
- The reproducibility layer now records a stat-based fingerprint (size + mtime) of every input file referenced in the mask. Stored in result$reproducibility$inputs as (method, files, refs). Heuristic detection by default; explicit override via genproc(..., input_cols = ...) or skip_input_cols = .... Disable with track_inputs = FALSE.
- New diff_inputs(r0, r1) compares the input fingerprints of two runs and reports changed / unchanged / added / removed files, with a human-readable print method.
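As an illustration of what a stat-based fingerprint and its diff involve, the sketch below uses base R's file.info() to record size and mtime, then sorts paths into the four buckets named above. Both fingerprint() and classify_changes() are hypothetical names for this example, and the layout of the returned objects is an assumption, not the structure genproc stores in result$reproducibility$inputs.

```r
# Hypothetical helper: size + mtime for each path, no file contents read.
fingerprint <- function(paths) {
  info <- file.info(paths)
  data.frame(path = paths, size = info$size, mtime = info$mtime,
             stringsAsFactors = FALSE)
}

# Hypothetical helper: compare two fingerprints path by path.
classify_changes <- function(fp0, fp1) {
  both <- intersect(fp0$path, fp1$path)
  i0 <- match(both, fp0$path)
  i1 <- match(both, fp1$path)
  same <- fp0$size[i0] == fp1$size[i1] & fp0$mtime[i0] == fp1$mtime[i1]
  list(
    changed   = both[!same],
    unchanged = both[same],
    added     = setdiff(fp1$path, fp0$path),
    removed   = setdiff(fp0$path, fp1$path)
  )
}

# Two temporary input files for the first "run".
file_a <- tempfile(fileext = ".csv")
file_b <- tempfile(fileext = ".csv")
writeLines("x", file_a)
writeLines("y", file_b)
fp0 <- fingerprint(c(file_a, file_b))

# Second "run": file_a grows, file_b is no longer referenced, file_c is new.
cat("more\n", file = file_a, append = TRUE)
file_c <- tempfile(fileext = ".csv")
writeLines("z", file_c)
fp1 <- fingerprint(c(file_a, file_c))

delta <- classify_changes(fp0, fp1)
str(delta)
```

A stat-based check is cheap because it never opens the files, at the cost of missing an edit that preserves both size and modification time.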
Result object
- New S3 class genproc_result with stable fields: log, reproducibility, n_success, n_error, duration_total_secs, status.
- Per-case errors do not stop the run; they are captured in the log and surfaced in n_error.
- case_ids are index-based (case_0001, …) for now; a content-based variant is planned.
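The behaviour described above (errors captured per case, counts surfaced alongside the log, index-based ids) can be approximated with a short base-R loop. run_cases() is a hypothetical stand-in written for this example; it uses a plain tryCatch() and returns a bare list, whereas the package itself captures tracebacks via withCallingHandlers() and returns a genproc_result.

```r
# Hypothetical stand-in for the logged layer: one log row per mask row.
run_cases <- function(f, mask) {
  rows <- lapply(seq_len(nrow(mask)), function(i) {
    args <- as.list(mask[i, , drop = FALSE])
    started <- Sys.time()
    # NA means success; otherwise the error message is kept and the loop goes on.
    outcome <- tryCatch(
      {
        do.call(f, args)
        NA_character_
      },
      error = function(e) conditionMessage(e)
    )
    data.frame(
      case_id       = sprintf("case_%04d", i),  # index-based id
      success       = is.na(outcome),
      error_message = outcome,
      duration_secs = as.numeric(difftime(Sys.time(), started, units = "secs")),
      stringsAsFactors = FALSE
    )
  })
  log <- do.call(rbind, rows)
  list(log = log, n_success = sum(log$success), n_error = sum(!log$success))
}

mask <- data.frame(x = c(1, -1, 4))
safe_sqrt <- function(x) if (x < 0) stop("negative input") else sqrt(x)
res <- run_cases(safe_sqrt, mask)
print(res$log[, c("case_id", "success", "error_message")])
```

The second case fails, yet the third still runs, which is the property the changelog entry describes.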
Building blocks
- from_example_to_function(): turn an example expression that works for one case into a parameterized function. String literals and free symbols become parameters with the original value as default. Built on a dependency-free AST rewriter.
- from_function_to_mask(): derive a one-row template data.frame from a function’s signature, ready to be expanded into a full iteration mask.
- rename_function_params(): rename parameters in formals and body in one pass, without editing the function source.
- add_trycatch_logrow(): the standalone logging wrapper used by genproc(), exposed for users who want the logged layer outside the full pipeline.
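To show the idea behind deriving a mask template from a signature, the base-R sketch below reads a function's formals() and turns the defaults into a one-row data.frame, which is then expanded by hand. It assumes every parameter has a default value (as is the case for functions produced from an example, per the entry above); read_one() and the file paths are invented for illustration, and this is not the implementation of from_function_to_mask().

```r
# Hypothetical parameterized function; every argument has a default.
read_one <- function(path = "data/a.csv", sep = ",", skip = 0) NULL

# One-row template: one column per parameter, filled with its default.
template <- as.data.frame(lapply(formals(read_one), eval),
                          stringsAsFactors = FALSE)

# Expand the template into a three-case iteration mask.
mask <- template[rep(1, 3), , drop = FALSE]
mask$path <- c("data/a.csv", "data/b.csv", "data/c.csv")
rownames(mask) <- NULL

print(template)
print(mask)
```

Starting from the defaults means only the columns that actually vary between cases (here, path) need to be filled in.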