use output of cyclicSigset to generate a series of character vectors constituting OBO terms
Source: R/CLextend.R
ldfToTerms.Rd
Usage
ldfToTerms(
  ldf,
  propmap,
  sigels,
  prologMaker = function(id, ...) sprintf("id: %s", id)
)
Arguments
- ldf
a 'long format' data.frame as created by cyclicSigset
- propmap
a named character vector: the names are the 'abbreviated' relationship tokens (e.g. "hasExp") and the values are the full relationship-naming strings (e.g. "has_expression_of")
- sigels
a named character vector associating cell types (names) to genes expressed in a cyclic set, one element per type
- prologMaker
a function with arguments (id, ...), in which id is character(1), that returns a character vector of prolog lines (id, name, definition, and so on) used in each cell type-specific term.
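The prologMaker contract can be tried in isolation, without calling ldfToTerms. The sketch below is illustrative only: `defaultProlog` and `nameProlog` are hypothetical names, and supplying the gene symbol as the extra argument is an assumption based on the output shown in the Examples, not a documented guarantee.

```r
# Same body as the default shown under Usage: one "id:" line, extras ignored.
defaultProlog <- function(id, ...) sprintf("id: %s", id)

# Hypothetical custom prolog that forwards the extra argument(s) to sprintf().
# Assumption: the caller supplies one character value (e.g. a gene symbol) via '...'.
nameProlog <- function(id, ...) {
  c(
    sprintf("id: %s", id),
    sprintf("name: %s-expressing cell", ...)
  )
}

p1 <- defaultProlog("CL:X01", "GRIK3")
p2 <- nameProlog("CL:X01", "GRIK3")
print(p1)
print(p2)
```

Because the extra arguments are forwarded unchanged, a prolog with several `%s` placeholders in one format string would need correspondingly many values in `...`.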
Note
ldfToTerms is not general enough to produce terms for an arbitrary long data frame/propmap combination; it is a working example for the cyclic set context.
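To make the cyclic set pattern concrete, the following base R sketch builds a small long-format table by hand and expands abbreviated tokens through a propmap-style vector. It does not call cyclicSigset, and the column names (`gene`, `type`, `tag`, `relation`) are assumptions chosen for illustration; the real data frame may be laid out differently. Gene symbols stand in for the PR identifiers seen in the actual output.

```r
# Three hypothetical cell types, each marked by exactly one gene.
sig_demo <- c("CL:X01" = "GRIK3", "CL:X02" = "NTNG1", "CL:X03" = "BAGE2")
pmap_demo <- c(hasExp = "has_expression_of", lacksExp = "lacks_expression_of")

# One row per (cell type, gene) pair; the gene column varies fastest.
long_demo <- expand.grid(gene = unname(sig_demo), type = names(sig_demo),
                         stringsAsFactors = FALSE)

# A type "has" its own signature gene and "lacks" every other gene in the set.
own_gene <- unname(sig_demo[long_demo$type])
long_demo$tag <- ifelse(own_gene == long_demo$gene, "hasExp", "lacksExp")

# Expand the abbreviated token to the full relationship name.
long_demo$relation <- unname(pmap_demo[long_demo$tag])

# Relationship lines for the first cell type.
x01 <- long_demo[long_demo$type == "CL:X01", ]
x01_lines <- sprintf("%s: %s", x01$relation, x01$gene)
print(x01_lines)
```

With n cell types the table has n * n rows, of which n carry the "hasExp" token, which is why each term in the example output has one has_expression_of line and n - 1 lacks_expression_of lines.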
Examples
# a set of cell types -- names are cell type tokens, values are genes expressed in a
# cyclic set -- each cell type expresses exactly one gene in the set and fails to
# express all the other genes in the set. See Figs 3 and 4 of Bakken et al. [PMID 29322913].
sigels = c("CL:X01"="GRIK3", "CL:X02"="NTNG1", "CL:X03"="BAGE2",
   "CL:X04"="MC4R", "CL:X05"="PAX6", "CL:X06"="TSPAN12", "CL:X07"="hSHISA8",
   "CL:X08"="SNCG", "CL:X09"="ARHGEF28", "CL:X10"="EGF")
# create the associated long data frame
ldf = cyclicSigset(sigels)
# describe the abbreviations
pmap = c(hasExp="has_expression_of", lacksExp="lacks_expression_of")
# now define the prolog for each cell type
makeIntnProlog = function(id, ...) {
  # make type-specific prologs as key-value pairs
  c(
    sprintf("id: %s", id),
    sprintf("name: %s-expressing cortical layer 1 interneuron, human", ...),
    sprintf("def: '%s-expressing cortical layer 1 interneuron, human described via RNA-seq observations' [PMID 29322913]", ...),
    "is_a: CL:0000099 ! interneuron",
    "intersection_of: CL:0000099 ! interneuron")
}
tms = ldfToTerms(ldf, pmap, sigels, makeIntnProlog)
cat(tms[[1]], sep="\n")
#> [Term]
#> id: CL:X01
#> name: GRIK3-expressing cortical layer 1 interneuron, human
#> def: 'GRIK3-expressing cortical layer 1 interneuron, human described via RNA-seq observations' [PMID 29322913]
#> is_a: CL:0000099 ! interneuron
#> intersection_of: CL:0000099 ! interneuron
#> has_expression_of: PR:000008242 ! GRIK3
#> lacks_expression_of: PR:000011467 ! NTNG1
#> lacks_expression_of: PR:000004625 ! BAGE2
#> lacks_expression_of: PR:000001237 ! MC4R
#> lacks_expression_of: PR:000012318 ! PAX6
#> lacks_expression_of: PR:000016738 ! TSPAN12
#> lacks_expression_of: PR:B8ZZ34 ! hSHISA8
#> lacks_expression_of: PR:000015325 ! SNCG
#> lacks_expression_of: PR:000013942 ! ARHGEF28
#> lacks_expression_of: PR:000006928 ! EGF
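Once the terms exist, one plausible next step is to write them to a file with a blank line between stanzas. The sketch below assumes the result is a list of character vectors, one per term, as the `tms[[1]]` call above suggests; `tms_demo` is a hypothetical, abbreviated stand-in so the snippet runs without the package.

```r
# Stand-in for the result of ldfToTerms(): a list of character vectors,
# one per term (hypothetical, abbreviated content).
tms_demo <- list(
  c("[Term]", "id: CL:X01",
    "name: GRIK3-expressing cortical layer 1 interneuron, human"),
  c("[Term]", "id: CL:X02",
    "name: NTNG1-expressing cortical layer 1 interneuron, human")
)

# Append an empty string to each term so stanzas are separated by a blank line.
stanzas <- unlist(lapply(tms_demo, function(x) c(x, "")))

# Write to a temporary file and read it back to inspect the result.
obo_file <- tempfile(fileext = ".obo")
writeLines(stanzas, obo_file)
cat(readLines(obo_file), sep = "\n")
```

This produces term stanzas only; a complete OBO file would normally also carry header lines, which are outside the scope of this sketch.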