Curation of Bioconductor package metadata, targeting EDAM ontology and ELIXIR bio.tools metadata schemas
Vincent J. Carey, stvjc at channing.harvard.edu
November 07, 2024
Source: vignettes/curate.Rmd
Introduction
This vignette is derived almost entirely from collaborative code supplied by Anh Nguyet Vu of Sage Bionetworks. Its purpose is to illustrate the use of OpenAI-based transformation to organize and tag, in a systematic way, the content available for Bioconductor packages.
Example 1: chromVAR
library(biocEDAM)
library(listviewer)
library(reticulate)
docv = curate_bioc()
## Success after 0 attempts
## Success after 0 attempts
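# Retry once if base_final does not have the expected 18 elements
if (length(py_to_r(docv$base_final)) != 18) docv = curate_bioc()
# Retry once if edam_processed does not have the expected 2 elements
if (length(py_to_r(docv$edam_processed)) != 2) docv = curate_bioc()
# Stop if edam_processed is still incomplete after the retries
if (length(py_to_r(docv$edam_processed)) != 2) stop("curate_bioc failing on retry")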
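The retry-and-check pattern above can be factored into a small helper. The following is a minimal sketch, not part of biocEDAM: `retry_until`, `make_call`, and `is_ok` are hypothetical names, and the stand-in function `flaky` replaces the real `curate_bioc()` call so that the block runs without an API key.

```r
# Hypothetical helper (not part of biocEDAM): call make_call() until
# is_ok(result) is TRUE or max_tries calls have been made.
retry_until <- function(make_call, is_ok, max_tries = 3) {
  for (i in seq_len(max_tries)) {
    result <- make_call()
    if (isTRUE(is_ok(result))) {
      return(list(value = result, tries = i))
    }
  }
  stop("no acceptable result after ", max_tries, " tries")
}

# Stand-in for an unreliable call: the first call returns an incomplete
# list, later calls return a complete one.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 2) list(a = 1) else list(a = 1, b = 2)
}

out <- retry_until(flaky, function(x) length(x) == 2)
out$tries
```

With the objects used in this vignette, the same helper could presumably be called as `retry_until(function() curate_bioc(), function(d) length(py_to_r(d$edam_processed)) == 2)`, which makes the number of attempts explicit instead of repeating the `if` line.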
The curation step uses an OpenAI API key. The curated view of the chromVAR landing page at bioconductor.org is:
The README of the chromVAR GitHub repo provides additional metadata.
These can be “merged” for a richer metadata artifact.
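One way to picture such a merge, as a sketch only: if both curated views are converted to named R lists (for example with `py_to_r`), `utils::modifyList` can overlay one on the other. The field names and values below are invented for illustration and do not reflect the actual curated output, and this is not necessarily how biocEDAM performs the merge.

```r
# Invented stand-ins for two curated metadata views of one package.
landing <- list(
  name = "chromVAR",
  description = "short description from the landing page",
  topics = list("topic A")
)
readme <- list(
  description = "longer description from the README",
  homepage = "https://example.org/placeholder"
)

# modifyList() keeps fields found only in `landing`, adds fields found only
# in `readme`, and lets `readme` win where both define the same field.
merged <- utils::modifyList(landing, readme)
names(merged)
```

Which source should take precedence on a shared field is a design choice; swapping the two arguments would let the landing-page values win instead.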
Example 2: tximport
dotx = curate_bioc(
"tximport",
devurl = "https://raw.githubusercontent.com/thelovelab/tximport/refs/heads/devel/vignettes/tximport.Rmd")
## Success after 0 attempts
## Success after 0 attempts
Example 3: MSnbase
We added a simpler function, edamize, that uses chat completion to map input from a URL to the EDAM schema. It allows setting the completion “temperature” parameter, which defaults to zero. See this doc. edamize does not appear to be very robust and often returns a null result in examples; further work is needed.
nn = edamize("https://raw.githubusercontent.com/lgatto/MSnbase/refs/heads/master/vignettes/v04-benchmarking.Rmd")
## Success after 0 attempts
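Because `edamize` can come back empty, a caller may want to guard the result. The sketch below is an illustration under stated assumptions, not part of biocEDAM: `first_non_null` and `attempt` are hypothetical names, and the stand-in `fake_attempt` replaces the real call so the block runs offline. It tries a sequence of settings (for instance, temperature values) and returns the first non-null result, treating errors as null.

```r
# Hypothetical helper: try each setting in turn and return the first
# non-NULL result together with the setting that produced it.
first_non_null <- function(attempt, settings) {
  for (s in settings) {
    # An error in attempt() is treated the same as a NULL result.
    result <- tryCatch(attempt(s), error = function(e) NULL)
    if (!is.null(result)) {
      return(list(value = result, setting = s))
    }
  }
  NULL  # every setting failed
}

# Stand-in for a call that returns NULL at one setting, fails at another,
# and succeeds at a third.
fake_attempt <- function(temp) {
  if (temp == 0) return(NULL)
  if (temp == 0.2) stop("simulated failure")
  list(temperature = temp)
}

res <- first_non_null(fake_attempt, c(0, 0.2, 0.5))
res$setting
```

In real use, `attempt` would wrap a call to `edamize` on the URL of interest, passing the setting through whatever argument the installed version exposes for temperature; that argument name is not shown here, so it is left unspecified.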