Very preliminary analog of mapIds from AnnotationDbi; uses a remote query against a parquet representation of NCBI gene_info. The NG connotes NCBI Gene.
Source: R/mapIdsNG.R
Note
At present this function only uses `remote_gene_query`. A `left_join` is performed between a data.frame built from the keys and the query result, with the `multiple` argument set to "first", so each key keeps only its first matching row.
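The effect of `multiple = "first"` can be illustrated with a minimal sketch that uses only `dplyr::left_join` (dplyr 1.1.0 or later). The data frames `keys` and `hits` and the identifiers in them are made up for illustration; they are not output of `remote_gene_query`, and this is not the function's actual implementation.

```r
library(dplyr)

# Hypothetical keys, one row per requested identifier (illustration only)
keys <- data.frame(Ensembl = c("ENSG_A", "ENSG_B", "ENSG_C"))

# Hypothetical stand-in for a query result:
# ENSG_A matches two rows, ENSG_C matches none
hits <- data.frame(
  Ensembl = c("ENSG_A", "ENSG_A", "ENSG_B"),
  map_location = c("1p1", "1p2", "2q1")
)

# multiple = "first" keeps only the first match per key, so the result
# has exactly one row per key, in the original key order
joined <- left_join(keys, hits, by = "Ensembl", multiple = "first")

# ENSG_A keeps "1p1", ENSG_B gets "2q1", unmatched ENSG_C gets NA
joined$map_location
```

Because the join returns one value per key in key order, a result of this shape can be assigned directly as a new column, which is how the examples below use `mapIdsNG` inside `dplyr::mutate`.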
Examples
if (is_online()) {
  # Call with default arguments
  mapIdsNG()
  # The remaining steps run only if both optional packages are installed
  if (requireNamespace("airway") && requireNamespace("tidySummarizedExperiment")) {
    data(airway, package="airway")
    tse = as(airway, "tidySummarizedExperiment")
    print(tse)
    # Annotate each feature (Ensembl gene ID) with the map_location and MIM columns
    tse = tse |> dplyr::mutate(map_location=mapIdsNG(keys=.feature, keytype="Ensembl", column="map_location"))
    tse = tse |> dplyr::mutate(MIM=mapIdsNG(keys=.feature, keytype="Ensembl", column="MIM"))
    print(tse)
    # Tabulate the first few map locations
    head(table(SummarizedExperiment::rowData(tse)$map_location))
  }
}
#> Loading required namespace: airway
#> Loading required namespace: tidySummarizedExperiment
#> # A SummarizedExperiment-tibble abstraction: 509,416 × 23
#> # Features=63677 | Samples=8 | Assays=counts
#> .feature .sample counts SampleName cell dex albut Run avgLength
#> <chr> <chr> <int> <fct> <fct> <fct> <fct> <fct> <int>
#> 1 ENSG00000000003 SRR10395… 679 GSM1275862 N613… untrt untrt SRR1… 126
#> 2 ENSG00000000005 SRR10395… 0 GSM1275862 N613… untrt untrt SRR1… 126
#> 3 ENSG00000000419 SRR10395… 467 GSM1275862 N613… untrt untrt SRR1… 126
#> 4 ENSG00000000457 SRR10395… 260 GSM1275862 N613… untrt untrt SRR1… 126
#> 5 ENSG00000000460 SRR10395… 60 GSM1275862 N613… untrt untrt SRR1… 126
#> 6 ENSG00000000938 SRR10395… 0 GSM1275862 N613… untrt untrt SRR1… 126
#> 7 ENSG00000000971 SRR10395… 3251 GSM1275862 N613… untrt untrt SRR1… 126
#> 8 ENSG00000001036 SRR10395… 1433 GSM1275862 N613… untrt untrt SRR1… 126
#> 9 ENSG00000001084 SRR10395… 519 GSM1275862 N613… untrt untrt SRR1… 126
#> 10 ENSG00000001167 SRR10395… 394 GSM1275862 N613… untrt untrt SRR1… 126
#> # ℹ 40 more rows
#> # ℹ 14 more variables: Experiment <fct>, Sample <fct>, BioSample <fct>,
#> # gene_id <chr>, gene_name <chr>, entrezid <int>, gene_biotype <chr>,
#> # gene_seq_start <int>, gene_seq_end <int>, seq_name <chr>, seq_strand <int>,
#> # seq_coord_system <int>, symbol <chr>, GRangesList <list>
#> # A SummarizedExperiment-tibble abstraction: 509,416 × 25
#> # Features=63677 | Samples=8 | Assays=counts
#> .feature .sample counts SampleName cell dex albut Run avgLength
#> <chr> <chr> <int> <fct> <fct> <fct> <fct> <fct> <int>
#> 1 ENSG00000000003 SRR10395… 679 GSM1275862 N613… untrt untrt SRR1… 126
#> 2 ENSG00000000005 SRR10395… 0 GSM1275862 N613… untrt untrt SRR1… 126
#> 3 ENSG00000000419 SRR10395… 467 GSM1275862 N613… untrt untrt SRR1… 126
#> 4 ENSG00000000457 SRR10395… 260 GSM1275862 N613… untrt untrt SRR1… 126
#> 5 ENSG00000000460 SRR10395… 60 GSM1275862 N613… untrt untrt SRR1… 126
#> 6 ENSG00000000938 SRR10395… 0 GSM1275862 N613… untrt untrt SRR1… 126
#> 7 ENSG00000000971 SRR10395… 3251 GSM1275862 N613… untrt untrt SRR1… 126
#> 8 ENSG00000001036 SRR10395… 1433 GSM1275862 N613… untrt untrt SRR1… 126
#> 9 ENSG00000001084 SRR10395… 519 GSM1275862 N613… untrt untrt SRR1… 126
#> 10 ENSG00000001167 SRR10395… 394 GSM1275862 N613… untrt untrt SRR1… 126
#> # ℹ 40 more rows
#> # ℹ 16 more variables: Experiment <fct>, Sample <fct>, BioSample <fct>,
#> # gene_id <chr>, gene_name <chr>, entrezid <int>, gene_biotype <chr>,
#> # gene_seq_start <int>, gene_seq_end <int>, seq_name <chr>, seq_strand <int>,
#> # seq_coord_system <int>, symbol <chr>, map_location <chr>, MIM <chr>,
#> # GRangesList <list>
#>
#>  - 10p11.1 10p11.21 10p11.21-p11.1 10p11.22
#> 12       8       29              1       24
#> 10p11.22-p11.21
#>               1