Retrieve pre-computed EDAM term embeddings from AnnotationHub
Source: R/embed.R (get_edam_embeddings.Rd)
Downloads (and caches locally via AnnotationHub) a matrix of
text-embedding-3-small embeddings for all non-deprecated EDAM terms.
Each term is represented by its label concatenated with its
oio:hasDefinition text. On first call the file is downloaded;
subsequent calls in the same or future sessions use the local cache.
Value
a list with components ids, labels, types,
texts, embeddings (numeric matrix, terms × dimensions),
model, and created.
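As an illustration of how the returned components line up, the following sketch ranks terms by cosine similarity to one term's embedding. It runs on a toy list that only mimics the documented component names; the ids, labels, three-dimensional vectors, and the helper cosine_to are invented for the example and are not part of the package.

```r
# Toy stand-in using the documented component names (ids, labels, embeddings).
# All values below are invented; real embeddings have many more dimensions.
emb <- list(
  ids = c("toy_A", "toy_B", "toy_C"),
  labels = c("Toy topic A", "Toy topic B", "Toy operation C"),
  embeddings = rbind(c(1, 0, 0), c(0.8, 0.6, 0), c(0, 0, 1))
)

# Hypothetical helper: cosine similarity of each row of m to query vector q.
cosine_to <- function(m, q) {
  as.vector(m %*% q) / (sqrt(rowSums(m^2)) * sqrt(sum(q^2)))
}

# Rows of the matrix are terms, so row i corresponds to ids[i] and labels[i].
sims <- cosine_to(emb$embeddings, emb$embeddings[1, ])
names(sims) <- emb$labels
ranked <- sort(sims, decreasing = TRUE)
print(ranked)
```

With a real result from get_edam_embeddings(), the same indexing should apply, assuming the rows of embeddings follow the order of ids and labels as the terms × dimensions layout suggests.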
Details
Lookup order:
1. If EDAM_EMBEDDING_RDS is set to a readable .rds path, that file is loaded and returned immediately.
2. The file bundled with the package at inst/demo_embedding/edam_embeddings.rds is used (via system.file).
3. AnnotationHub is queried for a biocEDAM embedding resource.
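The lookup order above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: resolve_embedding_source is a hypothetical helper that only reports which source would be chosen, and the AnnotationHub step is left as a placeholder rather than queried.

```r
# Hypothetical helper: reports which source the documented lookup order
# would pick. It loads nothing and never contacts AnnotationHub.
resolve_embedding_source <- function(pkg = "biocEDAM") {
  env_path <- Sys.getenv("EDAM_EMBEDDING_RDS", unset = "")

  # Step 1: an explicit, readable .rds path takes priority.
  # file.access(mode = 4) returns 0 when the file can be read.
  if (nzchar(env_path) && file.access(env_path, mode = 4) == 0) {
    return(list(source = "env", path = env_path))
  }

  # Step 2: the bundled demo file. system.file() returns "" when the
  # package or the file is not installed.
  bundled <- system.file("demo_embedding", "edam_embeddings.rds", package = pkg)
  if (nzchar(bundled)) {
    return(list(source = "bundled", path = bundled))
  }

  # Step 3: fall back to the AnnotationHub resource (not queried here).
  list(source = "annotationhub", path = NA_character_)
}

print(resolve_embedding_source()$source)
```

Note that system.file looks up demo_embedding/edam_embeddings.rds without the inst/ prefix, because the contents of inst/ are copied to the package root at install time.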
To override the bundled demo file with a freshly generated artifact, set
EDAM_EMBEDDING_RDS to its path or ensure the resource is in
AnnotationHub (see make_edam_embeddings).
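A sketch of the environment-variable override, assuming the artifact is a list saved with saveRDS that carries the documented components. The toy values, including the types and created fields, are placeholders whose real formats are not specified here.

```r
# Toy artifact with the documented component names; every value is invented.
artifact <- list(
  ids = "toy_1",
  labels = "Toy term",
  types = "toy",
  texts = "Toy term. A placeholder definition.",
  embeddings = matrix(c(0.1, 0.2, 0.3), nrow = 1),
  model = "toy-model",
  created = Sys.time()
)
rds_path <- tempfile(fileext = ".rds")
saveRDS(artifact, rds_path)

# Point the lookup at the file for the current session only.
Sys.setenv(EDAM_EMBEDDING_RDS = rds_path)
# emb <- get_edam_embeddings()  # per the lookup order, this would load rds_path

# Stand-in for that call so the sketch runs without the package installed.
restored <- readRDS(Sys.getenv("EDAM_EMBEDDING_RDS"))

# Clear the override when finished.
Sys.unsetenv("EDAM_EMBEDDING_RDS")
```

To make the override persist across sessions, the variable could instead be set in .Renviron.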
Examples
emb <- get_edam_embeddings()
#> Loading bundled EDAM embeddings from /private/var/folders/yw/gfhgh7k565v9w83x_k764wbc0000gp/T/RtmpbnJ5Gy/temp_libpathed7e7dd7ef17/biocEDAM/demo_embedding/edam_embeddings.rds
cat(sprintf("%d terms | %d dimensions | model: %s\n",
            length(emb$ids), ncol(emb$embeddings), emb$model))
#> 2399 terms | 1536 dimensions | model: text-embedding-3-small