EXPERIMENTAL This function parses several JSON metadata files at once
and combines them into a data.frame, so the datasets can easily be
filtered by categorization, tags, number of classifications, etc.
Arguments
- server
the OGD server to be used: "ext" (the default) for the external server or "prod" for the production server.
- local
If TRUE (the default), the catalogue is created based on cached JSON metadata. Otherwise, the cache is updated prior to creating the catalogue using a "bulk download" of the metadata files.
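As a minimal sketch of how the two arguments interact, the wrapper below maps a single "refresh" switch onto `local`. The name `build_catalogue()` is hypothetical and only serves as an illustration, and the `STATcubeR::` prefix assumes that this is the package exporting `od_catalogue()`.

```r
# build_catalogue() is a hypothetical helper, not part of the documented API.
# Assumption: od_catalogue() is exported by the STATcubeR package.
build_catalogue <- function(refresh = FALSE) {
  # local = TRUE reads the cached metadata only;
  # local = FALSE updates the cache before the catalogue is built.
  STATcubeR::od_catalogue(server = "ext", local = !refresh)
}
```

Calling `build_catalogue(refresh = TRUE)` would then correspond to `od_catalogue(local = FALSE)`, which is likely slower because the metadata files are downloaded first.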
Value
A data.frame with the following structure:
| Column | Type | Description |
| title | chr | Title of the dataset |
| measures | int | Number of measure variables |
| fields | int | Number of classification fields |
| modified | datetime | Timestamp when the dataset was last modified |
| created | datetime | Timestamp when the dataset was created |
| id | ogd_id | ID of the dataset |
| database | chr | ID of the corresponding STATcube database |
| title_en | chr | English title |
| notes | chr | Description for the dataset |
| update_frequency | chr | How often the dataset is updated |
| categorization | chr | Category of the dataset |
| tags | list<chr> | Tags assigned to the dataset |
| json | list<od_json> | Full JSON metadata |
The type datetime refers to the POSIXct format as returned by Sys.time().
The last column "json" contains the full json metadata as returned by
od_json().
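The datetime and list columns described above can be filtered with base R alone. The sketch below uses a small stand-in data frame, `catalogue_mock`, whose rows and tag values are made up for illustration; it mimics only a few of the documented columns and is not real catalogue data.

```r
# Made-up rows that mimic a few documented columns; not real catalogue data.
catalogue_mock <- data.frame(
  title = c("Dataset A", "Dataset B"),
  measures = c(1L, 5L),
  modified = as.POSIXct(c("2024-01-25 16:03:34", "2022-03-24 11:29:48"), tz = "UTC"),
  stringsAsFactors = FALSE
)
# List columns are added after construction, wrapped in I() to keep them as-is.
catalogue_mock$tags <- I(list(c("health", "sex"), c("earnings", "sex")))

# POSIXct values compare directly, so a cutoff date selects recent datasets.
cutoff <- as.POSIXct("2023-01-01", tz = "UTC")
recent <- catalogue_mock[catalogue_mock$modified >= cutoff, ]

# For a list column, test each element and subset with the logical vector.
has_tag <- vapply(catalogue_mock$tags, function(x) "earnings" %in% x, logical(1))
tagged <- catalogue_mock[has_tag, ]

# Most recently modified datasets first.
by_date <- catalogue_mock[order(catalogue_mock$modified, decreasing = TRUE), ]
```

`vapply()` is used instead of `sapply()` so that the result is guaranteed to be a logical vector with one entry per row, even when a dataset has no tags.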
Examples
catalogue <- od_catalogue()
catalogue
#> # A data frame: 2 × 13
#> title measures fields modified created
#> <chr> <int> <int> <dttm> <dttm>
#> 1 Krebsstatis… 1 4 2024-01-25 16:03:34 2019-08-08 11:09:49
#> 2 Verdienstst… 5 4 2022-03-24 11:29:48 2017-08-02 20:00:00
#> # ℹ 8 more variables: id <ogd_id>, database <chr>, title_en <chr>,
#> # notes <chr>, update_frequency <chr>, tags <I<list>>,
#> # categorization <chr>, json <I<list>>
table(catalogue$update_frequency)
#>
#> jährlich nicht geplant
#> 1 1
table(catalogue$categorization)
#>
#> Arbeit Gesundheit
#> 1 1
catalogue[catalogue$categorization == "Gesundheit", 1:4]
#> # A data frame: 1 × 4
#> title measures fields modified
#> * <chr> <int> <int> <dttm>
#> 1 Krebsstatistik 1 4 2024-01-25 16:03:34
catalogue[catalogue$measures >= 70, 1:3]
#> # A data frame: 0 × 3
#> # ℹ 3 variables: title <chr>, measures <int>, fields <int>
catalogue$json[[1]]
#> Krebsstatistik
#>
#> Krebsstatistik nach Krebslokalisation (ICD10), Geschlecht und
#> Wohnbundesland
#>
#> Measures: Anzahl der Datensätze F-KRE
#> Fields: Tumore ICD/10 3-Steller, Berichtsjahr, Bundesland, Geschlecht
#> Updated: 2024-01-25 16:03:34
#> Tags: Krebsstatistik, Krebslokalisation-ICD10, Geschlecht,
#> Wohnbundesland
#> Categories: Gesundheit
head(catalogue$database)
#> [1] "dekrebs_ext" "deveste309"
