
Classify operational file types from observed resources
Classifies observed digital resources into operational file types using workflow-oriented classification profiles.
Unlike simple MIME-type or extension lookups, the function is designed for provenance-aware analytical and reconstruction workflows where file meaning depends on operational context.
The function supports lightweight operational classification for:
filesystem reconstruction;
digital preservation review;
repository analytics;
synchronized workspace inspection;
web archive inventories;
and Heritage Digital Twin workflows.
The current implementation provides a small set of built-in profiles intended as operational starting points.
These profiles are intentionally lightweight and extensible.
Future versions may support:
user-defined profiles;
YAML-based vocabularies;
institutional review profiles;
preservation-oriented classification schemes;
workflow-specific semantic enrichment.
The function is designed to work together with:
as part of layered provenance-aware reconstruction workflows.
Arguments
- x
A data.frame or tibble containing observed resources.
- extension
Character scalar identifying the column containing file extensions. Defaults to "extension".
- profile
Character scalar defining the operational classification profile.
Current built-in profiles include:
"r_development"
The "r_development" profile is designed for:
R package development;
Quarto and R Markdown workflows;
reproducible research repositories;
analytical reporting pipelines.
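The extension argument names a column of file extensions, so an inventory that only records paths needs that column derived first. The following is a minimal sketch of one way to do this with tools::file_ext() from base R; the resources data frame and its paths are hypothetical, and whether the function expects extensions in a particular case or with a leading dot is not stated here.

```r
# Hypothetical inventory of observed resources; the paths are made up for illustration.
resources <- data.frame(
  path = c("R/classify.R", "vignettes/intro.qmd", "data/raw.csv", "README"),
  stringsAsFactors = FALSE
)

# tools::file_ext() returns the text after the last dot, or "" when there is none.
resources$extension <- tools::file_ext(resources$path)

resources$extension
```

A data frame prepared this way could then be supplied as x, with extension = "extension" and profile = "r_development".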
Value
A character vector containing operational file type classifications.
Typical output categories include:
"code";
"markdown";
"workspace";
"data";
"artifact";
"document";
"website_generated";
"other".
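Because the result is a plain character vector, it can be stored next to the observed resources and summarised with base R. The sketch below uses a hand-written stand-in vector instead of a real call, and assumes the result has one element per row of x; both the vector and the resources data frame are illustrative.

```r
# Stand-in for the returned classifications (assumed: one element per input row).
file_type <- c("code", "markdown", "data", "code", "other")

# Hypothetical observed resources matching the stand-in vector above.
resources <- data.frame(
  extension = c("R", "qmd", "csv", "R", "xyz"),
  stringsAsFactors = FALSE
)

# Attach the classifications as a new column alongside the observed resources.
resources$file_type <- file_type

# Count resources per operational file type.
type_counts <- table(resources$file_type)
type_counts
```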
Details
The function intentionally performs lightweight operational classification only.
It does not:
infer authoritative media types;
inspect file contents;
perform preservation risk assessment;
infer documentary semantics;
replace curatorial review.
Classification is based primarily on operational workflow heuristics derived from file extensions and workflow profiles.
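The extension-driven approach described above can be sketched as a simple lookup table. This is not the package's implementation: sketch_profile and sketch_classify() are hypothetical names, and the extension-to-category pairs are illustrative assumptions that do not reproduce the actual "r_development" profile.

```r
# Hypothetical lookup table: these extension-to-category pairs are
# illustrative assumptions, not the package's real "r_development" profile.
sketch_profile <- c(
  r     = "code",
  rmd   = "markdown",
  qmd   = "markdown",
  md    = "markdown",
  rproj = "workspace",
  csv   = "data",
  rds   = "data",
  pdf   = "document"
)

# Hypothetical helper: classify by extension only, never reading file contents.
sketch_classify <- function(x, extension = "extension", profile = sketch_profile) {
  stopifnot(is.data.frame(x), extension %in% names(x))
  # Normalise: lower-case and drop a leading dot, so ".R" and "r" match.
  ext <- sub("^\\.", "", tolower(as.character(x[[extension]])))
  out <- unname(profile[ext])
  # Anything the profile does not cover falls back to "other".
  out[is.na(out)] <- "other"
  out
}

resources <- data.frame(
  path = c("R/classify.R", "vignettes/intro.qmd", "data/raw.csv", "logo.xyz"),
  extension = c("R", "qmd", "csv", "xyz"),
  stringsAsFactors = FALSE
)

sketch_classify(resources)
# "code" "markdown" "data" "other"
```

Keeping the profile as a named character vector means a different workflow can be supported by swapping the table, without touching the classification logic.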