
For each row in result, queries the EBI OLS4 REST API by term_iri to confirm the term exists and retrieve an authoritative definition. All rows are retained; two columns are added:

validated

logical — TRUE if the IRI was found in OLS4, FALSE if not (possible hallucination or deprecated term)

definition

character — the OLS4-sourced definition for validated terms; NA for unvalidated rows
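The shape of the enriched result can be illustrated with a minimal sketch. It does not call OLS4: a hypothetical in-memory lookup (`known_terms`) stands in for the REST query, the IRIs are placeholders, and the `label` column name is an assumption about what map_concepts returns.

```r
# Hypothetical stand-in for the OLS4 lookup: known IRIs mapped to definitions.
# The real function obtains this information from the OLS4 REST API.
known_terms <- c(
  "http://example.org/TERM_0001" = "Definition text for term 1."
)

# Toy input shaped like a map_concepts() result (column names assumed).
result <- data.frame(
  label    = c("term one", "made-up term"),
  term_iri = c("http://example.org/TERM_0001", "http://example.org/TERM_9999"),
  stringsAsFactors = FALSE
)

# All rows are kept; two columns are appended.
enriched <- result
enriched$validated  <- enriched$term_iri %in% names(known_terms)
# Indexing by a name that is absent yields NA, matching the NA definition
# described for unvalidated rows.
enriched$definition <- unname(known_terms[enriched$term_iri])
```

In this sketch the second row is kept with validated = FALSE and definition = NA, rather than being dropped.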

Usage

ols4_enrich(result, label_match = FALSE)

Arguments

result

data.frame as returned by map_concepts.

label_match

logical(1). If TRUE, adds llm_label and label_match columns (a content-word overlap check between the LLM label and the OLS4 canonical label). Defaults to FALSE.

Value

result with validated and definition columns added and, when label_match = TRUE, llm_label and label_match as well.

Details

When label_match = TRUE, three columns are added or updated:

llm_label

The original label as produced by the LLM, preserved before being replaced by the OLS4 canonical label

label_match

logical — TRUE if the LLM label and the OLS4 canonical label share at least one content word (a basic semantic consistency check); FALSE flags cases where the LLM supplied a real but unrelated IRI (e.g. "variant calling" → "Cystatin-SN"); NA for unvalidated rows

definition

OLS4-sourced definition; NA for unvalidated rows
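The content-word overlap behind label_match could be sketched roughly as follows. This is an illustration, not the package's implementation: the stopword list, the tokenisation rule, and the helper names `content_words` and `labels_overlap` are all assumptions.

```r
# Assumed stopword list; the list actually used by the package is not
# documented here.
stopwords <- c("a", "an", "and", "of", "the", "in", "to", "for", "or")

# Lower-case a label, split on non-alphanumeric characters, and drop
# empty tokens and stopwords.
content_words <- function(x) {
  words <- strsplit(tolower(x), "[^a-z0-9]+")[[1]]
  setdiff(words[nzchar(words)], stopwords)
}

# TRUE when the two labels share at least one content word.
labels_overlap <- function(llm_label, ols_label) {
  length(intersect(content_words(llm_label), content_words(ols_label))) > 0
}

labels_overlap("variant calling", "Cystatin-SN")                # FALSE
labels_overlap("genome sequencing", "whole genome sequencing")  # TRUE
```

A check this coarse accepts any shared word, so it is best read as a filter for obvious mismatches rather than proof that two labels mean the same thing.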

Rows where validated = TRUE but label_match = FALSE are the most suspicious: the IRI exists in OLS4 but likely does not correspond to the extracted concept.
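One way to pull out those suspicious rows is shown below on a toy data frame. The values are illustrative only, not real OLS4 output.

```r
# Toy enriched result with the columns described above.
enr <- data.frame(
  llm_label   = c("atrial fibrillation", "variant calling", "made-up term"),
  validated   = c(TRUE, TRUE, FALSE),
  label_match = c(TRUE, FALSE, NA),
  stringsAsFactors = FALSE
)

# The is.na() guard keeps unvalidated rows (label_match = NA) out of the
# selection, so NA never reaches the row index.
suspicious <- enr[enr$validated & !is.na(enr$label_match) & !enr$label_match, ]
```

Here `suspicious` holds only the "variant calling" row, whose IRI exists in OLS4 but whose canonical label shares no content word with the LLM label.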

Examples

if (interactive()) {
    ch  <- ols4_chat()
    raw <- map_concepts("atrial fibrillation and genome sequencing", chat = ch)
    enr <- ols4_enrich(raw)
    enr[enr$validated, ]   # confirmed terms only
}