
Observe a filesystem and construct a reproducible snapshot
Recursively scans a root folder and returns a data.frame in which each row represents one filesystem observation recorded at a specific time.
Usage
scan_storage(
root,
storage_id = "l480-1-ssd",
person_id = "antaldaniel",
scan_time = Sys.time(),
compute_signature = TRUE,
max_signature_size = 200 * 1024 * 1024
)
Arguments
- root
Character. Path to the root folder to observe.
- storage_id
Character. Identifier of the storage context.
- person_id
Character. Identifier of the observer or operator.
- scan_time
POSIXct. Timestamp of the observation. Defaults to Sys.time() if not provided.
- compute_signature
Logical. Whether to compute lightweight content signatures.
- max_signature_size
Numeric. Maximum file size (bytes) for signature computation.
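The interaction between compute_signature and max_signature_size can be illustrated with a minimal sketch. The helper capped_signature() is hypothetical and not part of the documented API, and a full-file MD5 from tools::md5sum() stands in for the package's own quick_sig algorithm, which this page does not describe.

```r
# Hypothetical helper (an assumption, not the package's implementation):
# files larger than the cap get NA instead of a signature.
capped_signature <- function(paths, max_signature_size = 200 * 1024 * 1024) {
  sizes <- file.size(paths)                       # NA for paths that do not exist
  eligible <- !is.na(sizes) & sizes <= max_signature_size
  sig <- rep(NA_character_, length(paths))
  if (any(eligible)) {
    # Full-file MD5 is a stand-in for a lightweight content signature.
    sig[eligible] <- unname(tools::md5sum(paths[eligible]))
  }
  sig
}

# Usage: a 5-byte file is signed, a 2000-byte file exceeds a 1000-byte cap.
demo_dir <- tempfile("sig_demo_")
dir.create(demo_dir)
small <- file.path(demo_dir, "small.txt")
large <- file.path(demo_dir, "large.bin")
writeBin(charToRaw("hello"), small)
writeBin(raw(2000), large)
sigs <- capped_signature(c(small, large), max_signature_size = 1000)
```

Capping by size keeps the scan cheap on large binary files, at the cost of leaving those rows without a signature.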
Details
The function implements a read-only filesystem observation model:
it records accessible filesystem state;
it does not interpret file contents;
it does not assume canonical, complete, or authoritative state.
Each observation records:
a relative filesystem locator (rel_path);
a storage context (storage_id);
an observation timestamp (scan_time).
Additional metadata may include:
filesystem properties (size, timestamps, permissions);
optional content signatures (quick_sig);
repository and version-control context (repo_root, repo_rel_path, git_tracked).
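The recording step described above can be approximated in base R. This is only a sketch under stated assumptions: observe_folder() is a hypothetical name, it covers the three core fields plus size and mtime, and it omits signatures and version-control context entirely.

```r
# Hypothetical sketch of a read-only observation pass (not the package's code).
observe_folder <- function(root, storage_id, scan_time = Sys.time()) {
  # Paths relative to root; radix sorting gives a locale-independent order.
  rel_path <- sort(
    list.files(root, recursive = TRUE, all.files = TRUE),
    method = "radix"
  )
  info <- file.info(file.path(root, rel_path), extra_cols = FALSE)
  data.frame(
    rel_path = rel_path,
    storage_id = rep(storage_id, length(rel_path)),
    scan_time = rep(scan_time, length(rel_path)),
    size = info$size,
    mtime = info$mtime,
    stringsAsFactors = FALSE
  )
}

# Usage on a throwaway folder with one nested file.
demo_root <- tempfile("scan_demo_")
dir.create(file.path(demo_root, "sub"), recursive = TRUE)
writeLines("a", file.path(demo_root, "a.txt"))
writeLines("b", file.path(demo_root, "sub", "b.R"))
snapshot <- observe_folder(
  demo_root,
  storage_id = "demo-storage",
  scan_time = as.POSIXct("2024-01-01 12:00:00", tz = "UTC")
)
```

Passing scan_time explicitly, rather than relying on the Sys.time() default, makes two runs over an unchanged folder return identical rows.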
The package deliberately records filesystem observations first and postpones documentary interpretation, Record Set construction, and RiC-aligned semantic assertions to later analytical stages.
This creates a reproducible observational snapshot suitable for:
forensic analysis of development environments;
reconstruction of activity patterns;
audit and compliance workflows;
alignment with version-controlled repositories.
The returned dataset is normalised to the canonical snapshot schema
via normalise_snapshot_schema().
At minimum, the result contains:
rel_path: relative filesystem locator within the observed root;
storage_path_id: deterministic storage-scoped identifier derived from storage_id::rel_path;
filename: basename of the observed file;
mtime: last modification timestamp;
extension: file extension.
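As an illustration of how the derived columns relate to rel_path, the sketch below builds them with base R. It assumes storage_path_id is the plain string storage_id::rel_path; the package may hash or otherwise normalise this value, and the paths and storage_id shown are made-up examples.

```r
# Illustrative inputs (assumptions, not output of scan_storage()).
rel_path <- c("R/scan_storage.R", "data/raw/notes.txt", "README")
storage_id <- "demo-storage"

derived <- data.frame(
  rel_path = rel_path,
  # Assumed form: literal concatenation with "::" as separator.
  storage_path_id = paste(storage_id, rel_path, sep = "::"),
  filename = basename(rel_path),
  # tools::file_ext() returns "" when a file has no extension.
  extension = tools::file_ext(rel_path),
  stringsAsFactors = FALSE
)
```

Because the identifier depends only on storage_id and rel_path, the same file observed in two scans of the same storage receives the same storage_path_id.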
Additional variables may be present depending on scan configuration.
The function is:
read-only and non-destructive;
deterministic for a given filesystem state and a fixed scan_time;
robust to inaccessible files, which are silently skipped.
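One way to skip inaccessible files without raising an error is to filter on read permission before collecting metadata. The helper keep_readable() below is hypothetical; it relies on base R's file.access(), whose results can be unreliable on some filesystems, so treat it as a sketch rather than the package's actual mechanism.

```r
# Hypothetical helper: keep only paths the current user can read.
keep_readable <- function(paths) {
  readable <- file.access(paths, mode = 4) == 0   # 0 means access is granted
  paths[readable]
}

# Usage: one existing file and one path that was never created.
present <- tempfile("readable_")
writeLines("x", present)
absent <- tempfile("absent_")
kept <- keep_readable(c(present, absent))
```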
The result represents observed filesystem state rather than complete historical provenance.