Retrieves file-level change events from GitHub repositories referenced in a filesystem snapshot.

Usage

create_github_commit_journal(snapshot, since = NULL, per_page = 100)

Arguments

snapshot

A data frame containing at least git_remote and git_branch columns.

since

Optional character or datetime value passed to the GitHub API since parameter to restrict commits by time.

per_page

Number of commits retrieved per repository. Defaults to 100.
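
The GitHub commits API documents its since parameter as an ISO 8601 timestamp. The page does not say whether a datetime passed to since is converted internally, so the following is a minimal sketch of formatting one explicitly; the variable names since_time and since_iso are illustrative only.

```r
# Build a UTC datetime and format it as an ISO 8601 timestamp,
# the form the GitHub API documents for its `since` parameter.
since_time <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
since_iso <- format(since_time, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
since_iso
```

The resulting string could then be supplied as since = since_iso.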

Value

A tibble containing GitHub-derived file events.

Details

The function identifies unique GitHub repository and branch combinations from the snapshot and queries the GitHub API for commit and changed-file history.
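
The de-duplication step described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the path column, the second repository URL, and the name repo_branches are hypothetical.

```r
# Hypothetical snapshot with the two columns the function requires.
snapshot <- data.frame(
  path = c("a.R", "b.R", "c.R"),
  git_remote = c(
    "https://github.com/dataobservatory-eu/fscontextdemo",
    "https://github.com/dataobservatory-eu/fscontextdemo",
    "https://github.com/example-org/example-repo"
  ),
  git_branch = c("main", "main", "dev"),
  stringsAsFactors = FALSE
)

# Keep one row per repository/branch pair, dropping rows without a remote.
has_remote <- !is.na(snapshot$git_remote) & nzchar(snapshot$git_remote)
repo_branches <- unique(snapshot[has_remote, c("git_remote", "git_branch")])
repo_branches
```

Each remaining row corresponds to one repository and branch that would be queried.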

The resulting journal records the observed GitHub file events: additions, deletions, and modifications. These events are reconstructed from the commit history of the associated GitHub repositories; the snapshot itself supplies only the observational metadata that identifies which repositories and branches to query.

Event types currently include:

  • "git_add"

  • "git_delete"

  • "git_change"
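
One way such event types could be derived is from the per-file status values that the GitHub API reports for a commit (for example "added", "removed", "modified"). The mapping below is an assumption made for illustration; this page does not document how the package assigns event types, and status_to_event is a hypothetical name.

```r
# Assumed mapping from GitHub file status values to journal event types.
# Not confirmed by the package; statuses without an entry map to NA.
status_to_event <- c(
  added = "git_add",
  removed = "git_delete",
  modified = "git_change"
)

# Look up each status by name; unmatched statuses such as "renamed" yield NA.
statuses <- c("added", "modified", "removed", "renamed")
event_type <- unname(status_to_event[statuses])
event_type
```

Statuses outside the three listed event types are left as NA here rather than guessed.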

Repository identifiers are normalized with normalize_git_remote().

If no repositories are found, an empty commit journal is returned.

Examples

if (FALSE) { # \dontrun{

data("fscontextdemo_snapshot_02")

snapshot_df <- fscontextdemo_snapshot_02

snapshot_df$git_remote <-
  "https://github.com/dataobservatory-eu/fscontextdemo"

snapshot_df$git_branch <- "main"

github_journal <-
  create_github_commit_journal(
    snapshot_df,
    per_page = 5
  )

head(github_journal)
} # }