
Ensures that a snapshot data.frame conforms to the canonical schema used by the package.

Usage

normalise_snapshot_schema(df)

Arguments

df

A data.frame representing a filesystem snapshot.

Value

A data.frame conforming to the current snapshot schema.

Details

This function detects the schema version of the input (via the schema_version attribute) and applies the necessary migration steps to bring it to the current version.

Currently supported:

  • snapshots without a schema_version attribute are assumed to originate from version 0.1.0 and are migrated accordingly

  • snapshots with schema_version = "0.1.0" are migrated to version 0.1.2
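The version check described above can be sketched in base R. This is only an illustration, assuming the version is stored as a plain attribute on the data.frame; detect_schema_version() is a hypothetical helper and not a function exported by the package.

```r
# Hypothetical helper (not part of the package): read the schema_version
# attribute and fall back to "0.1.0" when the attribute is absent.
detect_schema_version <- function(df) {
  version <- attr(df, "schema_version", exact = TRUE)
  if (is.null(version)) "0.1.0" else as.character(version)
}

# A snapshot without the attribute is treated as a 0.1.0 snapshot.
legacy <- data.frame(rel_path = "docs/report.txt", stringsAsFactors = FALSE)

# The same snapshot, tagged with an explicit version.
tagged <- legacy
attr(tagged, "schema_version") <- "0.1.2"

detect_schema_version(legacy)
detect_schema_version(tagged)
```

The first call falls back to "0.1.0" and the second returns the stored value. How the package itself dispatches its migration steps may differ.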

The function is safe to call on already-normalised data and may be used at the start of analytical workflows to ensure consistency.

The canonical schema includes:

  • rel_path as the primary file-instance identifier

  • filename as the basename of the file

  • dir_path derived from rel_path

Additional columns from scan_storage() are preserved.
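As a rough illustration of how the canonical columns relate to one another, the following sketch derives filename and dir_path from rel_path with base R. It assumes forward-slash relative paths; the size column is an invented stand-in for an extra scan_storage() column, and the package's own derivation (for example, how top-level files are represented in dir_path) may differ.

```r
# Illustrative data only: rel_path values and the 'size' column are made up.
snap <- data.frame(
  rel_path = c("docs/report.txt", "data/raw/a.csv", "README.md"),
  size = c(120, 2048, 64),
  stringsAsFactors = FALSE
)

# filename is the basename of rel_path; dir_path is its directory part.
snap$filename <- basename(snap$rel_path)
snap$dir_path <- dirname(snap$rel_path)

# Extra columns such as 'size' are left untouched.
snap
```

Note that base R's dirname() returns "." for a file with no directory component, so "README.md" gets dir_path ".".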

Examples

# requires the example data set 'test_snapshot' to be available
data(test_snapshot)

# normalise legacy snapshot
snap <- normalise_snapshot_schema(test_snapshot)