R/cache_citeseq_pbmcs.R
cacheCiteseq5k10kPbmcs.Rd
Retrieve scvi-tools-processed PBMC CITE-seq data in AnnData format (gzipped) from the Open Storage Network.
cacheCiteseq5k10kPbmcs()
The path to the .h5ad file, returned invisibly.
The original h5ad files were obtained using scvi-tools 0.18.0 (`scvi.data.pbmcs_10x_cite_seq`) and then processed according to the steps in the scviR vignette, which follow the [scvi-tools tutorial](https://colab.research.google.com/github/scverse/scvi-tutorials/blob/0.18.0/totalVI.ipynb) by Gayoso et al.
On a relatively slow network connection, it may help to set `options(timeout=3600)`, or an even larger value, to allow more time for the download.
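The timeout adjustment suggested above can be scoped to a single call rather than changed for the whole session. The following is a minimal sketch in base R; `with_download_timeout` is a hypothetical helper introduced here for illustration and is not part of scviR.

```r
# Hypothetical helper (not part of scviR): evaluate `expr` with a download
# timeout of at least `seconds`, then restore the previous setting.
with_download_timeout <- function(expr, seconds = 3600) {
  # options() returns the previous values, which are restored on exit
  old <- options(timeout = max(seconds, getOption("timeout", 60)))
  on.exit(options(old), add = TRUE)
  # `expr` is a lazily evaluated promise, so it runs only now,
  # after the longer timeout is in place
  force(expr)
}

# Intended use (assumes scviR is installed and a network is available):
# h5path <- with_download_timeout(scviR::cacheCiteseq5k10kPbmcs())
```

Because the previous option is restored on exit, later downloads in the same session keep whatever timeout was configured before the call.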
h5path <- cacheCiteseq5k10kPbmcs()
cmeta <- rhdf5::h5ls(h5path)
dim(cmeta)
#> [1] 52 5
head(cmeta, 17)
#>                       group               name       otype  dclass          dim
#>  0                        /                  X H5I_DATASET   FLOAT 4000 x 10849
#>  1                        /             layers   H5I_GROUP
#>  2                  /layers             counts H5I_DATASET   FLOAT 4000 x 10849
#>  3                        /                obs   H5I_GROUP
#>  4                     /obs        _scvi_batch H5I_DATASET INTEGER        10849
#>  5                     /obs       _scvi_labels H5I_DATASET INTEGER        10849
#>  6                     /obs              batch   H5I_GROUP
#>  7               /obs/batch         categories H5I_DATASET  STRING            2
#>  8               /obs/batch              codes H5I_DATASET INTEGER        10849
#>  9                     /obs              index H5I_DATASET  STRING        10849
#> 10                     /obs           n_counts H5I_DATASET   FLOAT        10849
#> 11                     /obs            n_genes H5I_DATASET INTEGER        10849
#> 12                     /obs       percent_mito H5I_DATASET   FLOAT        10849
#> 13                        /               obsm   H5I_GROUP
#> 14                    /obsm protein_expression   H5I_GROUP
#> 15 /obsm/protein_expression    CD127_TotalSeqB H5I_DATASET INTEGER        10849
#> 16 /obsm/protein_expression     CD14_TotalSeqB H5I_DATASET INTEGER        10849
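One way to use the listing above is to filter it by group, for example to collect the protein names stored under `/obsm/protein_expression`. The sketch below runs on a small hand-built data frame that copies a few rows of the output shown above, so it needs neither the download nor rhdf5; `cmeta_toy` and `protein_names` are hypothetical names used only for illustration.

```r
# Toy stand-in for a few rows of the rhdf5::h5ls() listing shown above
# (values copied from that output; this is not the full 52-row table).
cmeta_toy <- data.frame(
  group  = c("/", "/obs", "/obsm",
             "/obsm/protein_expression", "/obsm/protein_expression"),
  name   = c("X", "n_genes", "protein_expression",
             "CD127_TotalSeqB", "CD14_TotalSeqB"),
  otype  = c("H5I_DATASET", "H5I_DATASET", "H5I_GROUP",
             "H5I_DATASET", "H5I_DATASET"),
  dclass = c("FLOAT", "INTEGER", "", "INTEGER", "INTEGER"),
  dim    = c("4000 x 10849", "10849", "", "10849", "10849"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of datasets directly under the protein group
protein_names <- function(listing) {
  keep <- listing$group == "/obsm/protein_expression" &
    listing$otype == "H5I_DATASET"
  listing$name[keep]
}

protein_names(cmeta_toy)
#> [1] "CD127_TotalSeqB" "CD14_TotalSeqB"
```

With the real file, the same function could be applied to `cmeta` as returned by `rhdf5::h5ls(h5path)`, assuming the listing has the columns shown above.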