sc_tabulate()
extracts the data in the table and turns it into a tidy
data.frame. It applies labeling of the data and transforms time variables
into a Date
format if they satisfy certain STATcube Standards.
sc_tabulate(table, ...)
is just an alias for table$tabulate(...)
and
was added so this rather complicated method can have a separate documentation
page. It is recommended to use the table$tabulate()
syntax
the ...
argument decides which measures and/or fields should be included
in the output. If no measures are given, all measures are included. The same
is true for fields.
Usage
sc_tabulate(
table,
...,
.list = NULL,
raw = FALSE,
parse_time = TRUE,
recode_zeros = inherits(table, "sc_table"),
language = NULL,
sort = FALSE
)
Arguments
- table
An object of class
sc_data
- ...
Names of measures and/or fields
- .list
allows to define the arguments for
...
as a character vector.- raw
If FALSE (the default), apply labeling to the dataset. Otherwise, return codes.
- parse_time
Should time variables be converted into a
Date
format? Ignored ifraw
is set toTRUE
.- recode_zeros
turn zero values into
NA
s- language
The language to be used for labelling. By default, the dataset language (
table$language
) is used.- sort
If
TRUE
, the resulting data will be sorted by all provided field values
Details
Aggregation is done as follows
First, all columns that priovide a total code via
table$total_codes()
will be used to filter forcolumn == total_code
orcolumn != total_code
Then, the remaining data is aggregated using
rowsum()
The ellipsis (...
) supports partial matching of codes and labels.
See Examples
For objects of class sc_table
two additional operations are performed.
zeros are recoded to
NA
srounding is done according to the precision of each measure. Ronding happens after the recoding to
NA
values
Examples
############################ OGD Data #######################################
table <- od_table("OGD_veste309_Veste309_1")
# no arguments -> same output as `table$data`
table$tabulate()
#> # A STATcubeR tibble: 72 x 9
#> Sex Citizen…¹ Regio…² Form …³ Arith…⁴ 1st q…⁵ 2nd q…⁶ 3rd q…⁷
#> * <fct> <fct> <fct> <fct> <dbl> <dbl> <dbl> <dbl>
#> 1 Sum total Total Total "Total" 17.6 11.6 15.1 20.1
#> 2 Sum total Total Total "Stand… 19.0 12.8 16.2 21.7
#> 3 Sum total Total Total "Non-s… 15.2 10.2 13.0 17.2
#> 4 Sum total Total Total "Non-s… 15.9 10.7 13.8 18.3
#> 5 Sum total Total Total "Non-s… 16.8 10.2 13.4 18.6
#> 6 Sum total Total Total "Non-s… 12 9 10.1 12.3
#> 7 Sum total Total Total "Non-s… 14.0 11.5 13.5 15.7
#> 8 Sum total Total AT11 B… "Total" 15.7 11.0 13.7 17.5
#> 9 Sum total Total AT12 L… "Total" 16.4 11.4 14.3 18.5
#> 10 Sum total Total AT13 V… "Total" 19.2 11.4 15.9 22.6
#> # … with 62 more rows, 1 more variable: `Number of employees` <dbl>,
#> # and abbreviated variable names ¹Citizenship, ²`Region (NUTS2)`,
#> # ³`Form of employment`, ⁴`Arithmetic mean`, ⁵`1st quartile`,
#> # ⁶`2nd quartile (median)`, ⁷`3rd quartile`
# provide some fields -> aggregate to keep only these fields
table$tabulate("Sex", "Citizenship")
#> # A STATcubeR tibble: 27 x 7
#> Sex Citizenship Arith…¹ 1st q…² 2nd q…³ 3rd q…⁴ Numbe…⁵
#> * <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Sum total "Total" 266. 181. 230. 301. 8917965
#> 2 Sum total "Austrian" 18.5 12.4 15.9 21.2 2079655
#> 3 Sum total "Non-Austrian " 14.4 9.72 12.3 15.9 571282
#> 4 Sum total "EU-28/EWR/Switz… 15.2 10.0 12.8 16.6 380327
#> 5 Sum total "EU-15/EWR befor… 19.6 11.9 15.9 21.9 122572
#> 6 Sum total "EU-13 EU Member… 13.0 9.59 11.9 14.6 257755
#> 7 Sum total "Former Yugoslav… 12.8 9.58 12.0 14.7 96065
#> 8 Sum total "Turkey" 12.2 9.35 11.3 14.4 34759
#> 9 Sum total "Other" 13.8 9.07 10.9 14.6 60131
#> 10 Male "Total" 290. 195. 246. 327. 4952512
#> # … with 17 more rows, and abbreviated variable names
#> # ¹`Arithmetic mean`, ²`1st quartile`, ³`2nd quartile (median)`,
#> # ⁴`3rd quartile`, ⁵`Number of employees`
# provide some measures -> drop all other measures from the output
table$tabulate("Arithmetic mean")
#> # A STATcubeR tibble: 72 x 5
#> Sex Citizenship `Region (NUTS2)` Form of employme…¹ Arith…²
#> * <fct> <fct> <fct> <fct> <dbl>
#> 1 Sum total Total Total "Total" 17.6
#> 2 Sum total Total Total "Standard employm… 19.0
#> 3 Sum total Total Total "Non-standard emp… 15.2
#> 4 Sum total Total Total "Non-standard emp… 15.9
#> 5 Sum total Total Total "Non-standard emp… 16.8
#> 6 Sum total Total Total "Non-standard emp… 12
#> 7 Sum total Total Total "Non-standard emp… 14.0
#> 8 Sum total Total AT11 Burgenland "Total" 15.7
#> 9 Sum total Total AT12 Lower Austria "Total" 16.4
#> 10 Sum total Total AT13 Vienna "Total" 19.2
#> # … with 62 more rows, and abbreviated variable names
#> # ¹`Form of employment`, ²`Arithmetic mean`
# mixture of measures and fields -> keep exactly those columns
table$tabulate("Sex", "Arithmetic mean")
#> # A STATcubeR tibble: 3 x 2
#> Sex `Arithmetic mean`
#> * <fct> <dbl>
#> 1 Sum total 385.
#> 2 Male 418.
#> 3 Female 342.
## define total codes
table$total_codes(
`C-A11-0` = "A11-1",
`C-STAATS-0` = "STAATS-9",
`C-VEBDL-0` = "VEBDL-10",
`C-BESCHV-0` = "BESCHV-1"
)
## alternatively, use partial matching to define totals
table$total_codes(
Sex = "Sum total",
Citizenship = "Total",
Region = "Total",
`Form of employment` = "Total"
)
# filter for totals in `Region (NUTS2)` and `Form of employment`. Drop totals
# in `Sex` and `Citizenship`.
table$tabulate("Sex", "Citizenship")
#> # A STATcubeR tibble: 16 x 7
#> Sex Citizenship Arith…¹ 1st q…² 2nd q…³ 3rd q…⁴ Numbe…⁵
#> * <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Male "Austrian" 20.4 13.8 17.4 23.5 1190627
#> 2 Male "Non-Austrian " 15.3 10.6 13.2 16.6 353961
#> 3 Male "EU-28/EWR/Switzerl… 16.0 10.9 13.5 17.1 236607
#> 4 Male "EU-15/EWR before 2… 21.2 12.6 16.7 24.0 75300
#> 5 Male "EU-13 EU Member St… 13.6 10.3 12.7 15.1 161308
#> 6 Male "Former Yugoslavia … 14.1 11.2 13.6 16.1 56474
#> 7 Male "Turkey" 13.3 10.4 12.8 15.4 22516
#> 8 Male "Other" 14.0 9.07 11.2 14.6 38364
#> 9 Female "Austrian" 15.9 11.1 14.0 18.3 889028
#> 10 Female "Non-Austrian " 13.0 9.19 10.6 14.3 217322
#> 11 Female "EU-28/EWR/Switzerl… 13.7 9.38 11.3 15.5 143719
#> 12 Female "EU-15/EWR before 2… 17.1 11.0 14.7 19.7 47272
#> 13 Female "EU-13 EU Member St… 12.1 9.06 10.3 13.2 96447
#> 14 Female "Former Yugoslavia … 10.9 9 9.78 11.8 39591
#> 15 Female "Turkey" 10.3 8.68 9.51 10.9 12244
#> 16 Female "Other" 13.3 9.07 10.6 14.6 21767
#> # … with abbreviated variable names ¹`Arithmetic mean`,
#> # ²`1st quartile`, ³`2nd quartile (median)`, ⁴`3rd quartile`,
#> # ⁵`Number of employees`
## switch language
table$language <- "de"
## `...` matches for codes and labels
table$tabulate("C-A11-0", "Staats", "2. Quartil (Median)")
#> # A STATcubeR tibble: 16 x 3
#> Geschlecht Staatsangehörigkeit 2. Quartil…¹
#> * <fct> <fct> <dbl>
#> 1 männlich "Österreich" 17.4
#> 2 männlich "Nicht-Österreich" 13.2
#> 3 männlich "EU-28/EWR/Schweiz (ohne Österreich)" 13.5
#> 4 männlich "EU-15/EWR vor 2004/Schweiz " 16.7
#> 5 männlich "EU-13 Beitrittsstaaten ab 2004" 12.7
#> 6 männlich "ehemaliges Jugoslawien (außerhalb der EU)" 13.6
#> 7 männlich "Türkei" 12.8
#> 8 männlich "Andere" 11.2
#> 9 weiblich "Österreich" 14.0
#> 10 weiblich "Nicht-Österreich" 10.6
#> 11 weiblich "EU-28/EWR/Schweiz (ohne Österreich)" 11.3
#> 12 weiblich "EU-15/EWR vor 2004/Schweiz " 14.7
#> 13 weiblich "EU-13 Beitrittsstaaten ab 2004" 10.3
#> 14 weiblich "ehemaliges Jugoslawien (außerhalb der EU)" 9.78
#> 15 weiblich "Türkei" 9.51
#> 16 weiblich "Andere" 10.6
#> # … with abbreviated variable name ¹`2. Quartil (Median)`
## Keep totals in the output by removing total codes
table$tabulate("C-A11-0") # -> 2 rows: "male" "female"
#> # A STATcubeR tibble: 2 x 6
#> Geschlecht `Arithmetisches Mittel` 1. Quar…¹ 2. Qu…² 3. Qu…³ Zahl …⁴
#> * <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 männlich 19.2 12.8 16.3 22.0 1544588
#> 2 weiblich 15.3 10.4 13.4 17.6 1106350
#> # … with abbreviated variable names ¹`1. Quartil`,
#> # ²`2. Quartil (Median)`, ³`3. Quartil`,
#> # ⁴`Zahl d unselbst Beschäftigten`
table$total_codes(`C-A11-0` = NA)
table$tabulate("C-A11-0") # -> 3 rows: "total", "male", "female"
#> # A STATcubeR tibble: 3 x 6
#> Geschlecht `Arithmetisches Mittel` 1. Quar…¹ 2. Qu…² 3. Qu…³ Zahl …⁴
#> * <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 insgesamt 17.6 11.6 15.1 20.1 2650938
#> 2 männlich 19.2 12.8 16.3 22.0 1544588
#> 3 weiblich 15.3 10.4 13.4 17.6 1106350
#> # … with abbreviated variable names ¹`1. Quartil`,
#> # ²`2. Quartil (Median)`, ³`3. Quartil`,
#> # ⁴`Zahl d unselbst Beschäftigten`
## table$tabulate(...) is an alias for sc_tabulate(table, ...)
sc_tabulate(table, "C-A11-0")
#> # A STATcubeR tibble: 3 x 6
#> Geschlecht `Arithmetisches Mittel` 1. Quar…¹ 2. Qu…² 3. Qu…³ Zahl …⁴
#> * <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 insgesamt 17.6 11.6 15.1 20.1 2650938
#> 2 männlich 19.2 12.8 16.3 22.0 1544588
#> 3 weiblich 15.3 10.4 13.4 17.6 1106350
#> # … with abbreviated variable names ¹`1. Quartil`,
#> # ²`2. Quartil (Median)`, ³`3. Quartil`,
#> # ⁴`Zahl d unselbst Beschäftigten`
######################### STATcube REST API #################################
table_tourism <- sc_example("accomodation.json") %>% sc_table("de")
table_tourism$tabulate()
#> # A STATcubeR tibble: 2,448 x 5
#> `Saison/Tourismusmonat` Herkunftsland Beherbergun…¹ Übern…² Ankün…³
#> * <date> <fct> <fct> <dbl> <dbl>
#> 1 1999-11-01 Österreich Hotels gesamt 858602 371215
#> 2 1999-11-01 Österreich Ferienwohnun… 22398 5365
#> 3 1999-11-01 Österreich Sonstige 386865 67272
#> 4 1999-11-01 Deutschland Hotels gesamt 620973 223856
#> 5 1999-11-01 Deutschland Ferienwohnun… 56012 13095
#> 6 1999-11-01 Deutschland Sonstige 85583 24262
#> 7 1999-11-01 andere Länder Hotels gesamt 544074 235343
#> 8 1999-11-01 andere Länder Ferienwohnun… 30265 6493
#> 9 1999-11-01 andere Länder Sonstige 41945 14433
#> 10 1999-12-01 Österreich Hotels gesamt 928834 346040
#> # … with 2,438 more rows, and abbreviated variable names
#> # ¹Beherbergungsbetrieb, ²Übernachtungen, ³Ankünfte
table_tourism$tabulate("Saison/Tourismusmonat")
#> # A STATcubeR tibble: 272 x 3
#> `Saison/Tourismusmonat` Übernachtungen Ankünfte
#> * <date> <dbl> <dbl>
#> 1 1999-11-01 2646717 961334
#> 2 1999-12-01 7373034 1771530
#> 3 2000-01-01 12437812 2462258
#> 4 2000-02-01 13480000 2532404
#> 5 2000-03-01 11340174 2273436
#> 6 2000-04-01 6138146 1666777
#> 7 2000-05-01 5150347 1673232
#> 8 2000-06-01 8908931 2438518
#> 9 2000-07-01 14618147 3118576
#> 10 2000-08-01 16498662 3248084
#> # … with 262 more rows
table_tourism$tabulate("Saison/Tourismusmonat", "Ankünfte")
#> # A STATcubeR tibble: 272 x 2
#> `Saison/Tourismusmonat` Ankünfte
#> * <date> <dbl>
#> 1 1999-11-01 961334
#> 2 1999-12-01 1771530
#> 3 2000-01-01 2462258
#> 4 2000-02-01 2532404
#> 5 2000-03-01 2273436
#> 6 2000-04-01 1666777
#> 7 2000-05-01 1673232
#> 8 2000-06-01 2438518
#> 9 2000-07-01 3118576
#> 10 2000-08-01 3248084
#> # … with 262 more rows
table_tourism$tabulate("Ankünfte")
#> # A STATcubeR tibble: 2,448 x 4
#> `Saison/Tourismusmonat` Herkunftsland Beherbergungsbetrieb Ankün…¹
#> * <date> <fct> <fct> <dbl>
#> 1 1999-11-01 Österreich Hotels gesamt 371215
#> 2 1999-11-01 Österreich Ferienwohnungen insg… 5365
#> 3 1999-11-01 Österreich Sonstige 67272
#> 4 1999-11-01 Deutschland Hotels gesamt 223856
#> 5 1999-11-01 Deutschland Ferienwohnungen insg… 13095
#> 6 1999-11-01 Deutschland Sonstige 24262
#> 7 1999-11-01 andere Länder Hotels gesamt 235343
#> 8 1999-11-01 andere Länder Ferienwohnungen insg… 6493
#> 9 1999-11-01 andere Länder Sonstige 14433
#> 10 1999-12-01 Österreich Hotels gesamt 346040
#> # … with 2,438 more rows, and abbreviated variable name ¹Ankünfte
## TODO: param annotations does not work currently
if (FALSE) {
table_trade <- sc_example("foreign_trade.json") %>% sc_table("de")
tt <- sc_tabulate(table_trade, "Berichtsjahr", "Import, Wert in Euro",
annotations = TRUE)
tt
tt[['Import, Wert in Euro_a']] %>% str()
}