
Excludes operationally low-value system and workflow artifacts from analytical reconstruction workflows while preserving the original observational evidence.

The function is designed for provenance-aware analytical pipelines where certain operational artifacts may:

  • inflate duplication metrics;

  • distort reuse analysis;

  • obscure meaningful reconstruction patterns;

  • or reduce review efficiency.

Unlike destructive filtering, the function is intended to operate on contextual or analytical reconstruction layers after observational evidence has already been preserved.

This distinction is important for:

  • forensic reproducibility;

  • provenance-aware reconstruction;

  • archival transparency;

  • and Heritage Digital Twin workflows.

The function supports lightweight operational noise profiles.

Current built-in profiles include:

  • "generic"

  • "rstudio"

The "generic" profile targets common system and synchronization artifacts.

The "rstudio" profile targets operational artifacts commonly produced during R and RStudio workflows.
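As a rough illustration of how profile-based exclusion can work, the following base R sketch maps each profile name to a small vocabulary of filenames and drops matching rows. The `noise_profiles` list, its entries, and the `sketch_exclude_noise()` helper are all hypothetical and are not taken from the package; the actual built-in profiles may match different names or use the extension column as well.

```r
# Hypothetical noise vocabulary (assumption): the package's real profile
# contents are not documented on this page.
noise_profiles <- list(
  generic = c(".DS_Store", "Thumbs.db", "desktop.ini"),
  rstudio = c(".Rhistory", ".RData", ".Rproj.user")
)

# Hypothetical helper (assumption): keep rows whose filename does not appear
# in any of the selected profiles. The input data frame is not modified.
sketch_exclude_noise <- function(x,
                                 filename = "filename",
                                 profiles = c("generic", "rstudio")) {
  stopifnot(
    is.data.frame(x),
    filename %in% names(x),
    all(profiles %in% names(noise_profiles))
  )
  noise_names <- unlist(noise_profiles[profiles], use.names = FALSE)
  x[!(x[[filename]] %in% noise_names), , drop = FALSE]
}

files <- data.frame(
  filename = c(".DS_Store", ".Rhistory", "analysis.R"),
  extension = c("", "", "R"),
  stringsAsFactors = FALSE
)

# Applying one profile leaves the other profile's artifacts in place.
generic_only <- sketch_exclude_noise(files, profiles = "generic")
both <- sketch_exclude_noise(files, profiles = c("generic", "rstudio"))
```

In this sketch, selecting only "generic" keeps `.Rhistory`, while selecting both profiles leaves only `analysis.R`. Profiles combine as a simple union of vocabularies, which is one plausible design among several.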

Future versions may support:

  • workflow-specific profiles;

  • institution-specific registries;

  • YAML-based noise vocabularies;

  • preservation-oriented filtering policies.

The function is designed to be used as part of layered provenance-aware reconstruction workflows.

Usage

exclude_operational_noise(
  x,
  filename = "filename",
  extension = "extension",
  profiles = c("generic", "rstudio")
)

Arguments

x

A data.frame or tibble containing contextual or analytical reconstruction entities.

filename

Character scalar identifying the filename column.

Defaults to "filename".

extension

Character scalar identifying the extension column.

Defaults to "extension".

profiles

Character vector defining operational noise profiles to apply.

Current profiles include:

  • "generic"

  • "rstudio"

Value

A filtered data.frame or tibble containing the rows of x that are not matched by the selected operational noise profiles.

Details

The function intentionally excludes only operationally low-priority resources.

It does not:

  • delete observational evidence;

  • modify the original snapshot data;

  • infer preservation value;

  • determine archival significance;

  • replace curatorial review.

Resources excluded from analytical workflows may still remain important for:

  • forensic preservation;

  • synchronization reconstruction;

  • reproducibility auditing;

  • or operational environment analysis.
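The non-destructive behaviour described above can be sketched in base R by deriving two layers from a preserved snapshot: an analytical layer and a record of what was excluded. The `snapshot` data and the `assumed_noise` vector are illustrative assumptions, not the package's own data or vocabulary.

```r
# Hypothetical snapshot (assumption): stands in for the preserved
# observational layer.
snapshot <- data.frame(
  filename = c(".DS_Store", ".Rhistory", "analysis.R", "report.qmd"),
  extension = c("", "", "R", "qmd"),
  stringsAsFactors = FALSE
)

# Assumed noise names, for illustration only.
assumed_noise <- c(".DS_Store", ".Rhistory")

is_noise <- snapshot$filename %in% assumed_noise

# Derived layers; the snapshot itself is never altered.
analytical_layer <- snapshot[!is_noise, , drop = FALSE]
excluded_for_audit <- snapshot[is_noise, , drop = FALSE]

# Every snapshot row lands in exactly one of the two derived layers.
stopifnot(nrow(analytical_layer) + nrow(excluded_for_audit) == nrow(snapshot))
```

Keeping the excluded rows in a separate object means they remain available for forensic or reproducibility review even though they no longer appear in the analytical layer.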

Examples

toy_files <- tibble::tibble(
  filename = c(
    ".DS_Store",
    ".Rhistory",
    "analysis.R",
    "report.qmd"
  ),
  extension = c(
    "",
    "",
    "R",
    "qmd"
  )
)

exclude_operational_noise(
  toy_files,
  profiles = c(
    "generic",
    "rstudio"
  )
)
#> # A tibble: 2 × 2
#>   filename   extension
#>   <chr>      <chr>    
#> 1 analysis.R R        
#> 2 report.qmd qmd