
Create a provenance-aware Record Set data frame
Construct a recordset_df, a provenance-aware contextual dataset
representing members of a Record Set.
A Record Set is a contextual aggregation concept defined by the
International Council on Archives (ICA) Records in Contexts
standard (RiC). In operational terms, a Record Set may represent:
a project workspace;
a research corpus;
a synchronized working environment;
a digital collection;
a reconstruction context;
or another contextual grouping of related digital resources.
The Records in Contexts (RiC) standard provides a flexible and provenance-aware approach for describing evolving digital records, their relationships, and their contextual environments.
Unlike rigid hierarchical archival models, RiC allows records and digital resources to participate in multiple overlapping contextual groupings while preserving provenance and contextual relationships.
More information:
ICA Records in Contexts overview: https://www.ica.org/ica-network/expert-groups/egad/records-in-contexts-ric/
RiC-O ontology repository: https://github.com/ica-egad/ric-o
A recordset_df extends the
dataset_df class with lightweight
contextual Record Set semantics suitable for:
filesystem observations;
synchronized cloud folders;
web archive members;
digital surrogate collections;
curation batches;
Digital Twin workspaces;
provenance-aware research collections;
contextual digital preservation workflows.
The class is designed to work together with:
while preserving the distinction between:
observed filesystem evidence;
contextual grouping of related resources;
later analytical interpretation;
and archival or semantic enrichment workflows.
In operational terms:
record_set_id identifies a contextual grouping of related digital resources (similar to a project workspace, collection, or reconstruction environment);
member_id identifies one observed or asserted member within that grouping.
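The relationship between the two identifiers can be illustrated with a plain base R data frame. This is only a sketch of the data shape, not of the package's internals; the values `workspace_a`, `corpus_b`, and `m1` to `m3` are made up for illustration.

```r
# Illustrative data only: two contextual groupings and their members.
members <- data.frame(
  record_set_id = c("workspace_a", "workspace_a", "corpus_b"),
  member_id = c("m1", "m2", "m3"),
  stringsAsFactors = FALSE
)

# List the members that belong to each contextual grouping.
members_by_set <- split(members$member_id, members$record_set_id)
members_by_set
```

Each element of `members_by_set` corresponds to one record_set_id and holds the member_id values observed or asserted within it.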
The resulting object inherits from:
recordset_df, dataset_df, tbl_df, tbl, data.frame
Usage
recordset_df(
...,
identifier = c(member = "http://example.com/recordset#member"),
var_labels = NULL,
units = NULL,
concepts = NULL,
dataset_bibentry = NULL,
dataset_subject = NULL
)
Arguments
- ...
Vectors (columns) to include in the record set.
- identifier
A named vector of URI prefixes used to generate row identifiers.
Defaults to:
c(member = "http://example.com/recordset#member")
- var_labels
Optional named list of human-readable variable labels.
- units
Optional named list of measurement units.
- concepts
Optional named list of semantic concept URIs.
- dataset_bibentry
Optional bibliographic metadata created with
dataset::dublincore() or dataset::datacite().
- dataset_subject
Optional dataset subject metadata.
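The printed output in the Examples section shows row identifiers of the form member1, member2, member3, which suggests that the name of the identifier element is combined with the row position. The base R sketch below reproduces that pattern. The expansion to a full URI (prefix followed by the row number) is an assumption made for illustration, not confirmed behaviour of the constructor.

```r
# The named prefix used in the Examples section.
identifier <- c(member = "https://example.org/recordset/heritage#member")
n_rows <- 3

# Short row identifiers, matching the printed rowid column (member1, ...).
row_ids <- paste0(names(identifier), seq_len(n_rows))

# Assumption: a full URI is the prefix followed by the row number.
row_uris <- paste0(unname(identifier), seq_len(n_rows))

row_ids
row_uris
```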
Details
The constructor requires at minimum the columns:
record_set_id;
member_id.
Validation and class assignment are delegated to
new_recordset_df.
The constructor is intentionally lightweight and does not:
infer authoritative archival hierarchy;
reconcile duplicate identities;
infer canonical resources;
construct ontology-complete provenance graphs;
or replace curatorial or archival interpretation.
Instead, it provides a stable contextual preservation layer for provenance-aware reconstruction and human-in-the-loop workflows.
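The minimum-column requirement described above can be sketched in base R. The helper `check_recordset_columns()` is hypothetical and not part of the package; the actual validation is delegated to new_recordset_df and may differ in its checks and messages.

```r
# Hypothetical helper, not part of the package: stops with an error when
# either of the two required columns is absent.
check_recordset_columns <- function(x) {
  required <- c("record_set_id", "member_id")
  missing_cols <- setdiff(required, names(x))
  if (length(missing_cols) > 0) {
    stop(
      "Missing required column(s): ",
      paste(missing_cols, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

members <- data.frame(
  record_set_id = rep("heritage_digitisation", 2),
  member_id = c("inst_001", "inst_002"),
  stringsAsFactors = FALSE
)

# Passes silently because both required columns are present.
check_recordset_columns(members)
```

Note that a check of this kind only confirms that the columns exist. In keeping with the lightweight design, it does not attempt to reconcile duplicates or infer hierarchy.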
Examples
toy_recordset <- recordset_df(
  record_set_id = c(
    "heritage_digitisation",
    "heritage_digitisation",
    "heritage_digitisation"
  ),
  member_id = c(
    "inst_001",
    "inst_002",
    "inst_003"
  ),
  member_path = c(
    "scans/photo_001.tif",
    "ocr/photo_001.txt",
    "reports/collection_summary.qmd"
  ),
  member_type = c(
    "file",
    "file",
    "file"
  ),
  source_type = c(
    "filesystem",
    "filesystem",
    "filesystem"
  ),
  identifier = c(
    member =
      "https://example.org/recordset/heritage#member"
  ),
  var_labels = list(
    record_set_id = "Record set identifier",
    member_id = "Member identifier",
    member_path = "Member path"
  ),
  concepts = list(
    record_set_id =
      "https://www.ica.org/standards/RiC/ontology#RecordSet",
    member_id =
      "https://www.ica.org/standards/RiC/ontology#Instantiation"
  ),
  dataset_bibentry = dataset::dublincore(
    title = "Toy Heritage Digitisation Record Set",
    creator = person("Jane", "Doe"),
    publisher = "fscontext"
  )
)
toy_recordset
#> Doe (2026): Toy Heritage Digitisation Record Set [dataset]
#> rowid record_set_id member_id member_path member_type source_type
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 member1 heritage_digitisation inst_001 scans/photo_0… file filesystem
#> 2 member2 heritage_digitisation inst_002 ocr/photo_001… file filesystem
#> 3 member3 heritage_digitisation inst_003 reports/colle… file filesystem