Getting Started with ilschooldata

Installation

Install from GitHub:

# install.packages("remotes")  # uncomment if remotes is not installed yet
remotes::install_github("almartin82/ilschooldata")

Quick Example

Fetch the most recent year of Illinois enrollment data:

library(ilschooldata)
library(dplyr)

# Fetch 2025 enrollment data (2024-25 school year)
enr <- fetch_enr(2025, use_cache = TRUE)

head(enr)

## # A tibble: 6 × 18
##   end_year rcdts     type  district_id district_name school_id school_name city 
##      <int> <chr>     <chr> <chr>       <chr>         <chr>     <chr>       <chr>
## 1     2025 65-000-0… Stat… 65-000-0000 NA            -80-      NA          NA   
## 2     2025 01-009-2… Dist… 01-009-2620 A-C Central … -26-      NA          Ashl…
## 3     2025 01-009-2… Scho… 01-009-2620 A-C Central … -26-      A-C Centra… Chan…
## 4     2025 01-009-2… Scho… 01-009-2620 A-C Central … -26-      A-C Centra… Ashl…
## 5     2025 01-009-2… Scho… 01-009-2620 A-C Central … -26-      A-C Centra… Ashl…
## 6     2025 33-048-2… Dist… 33-048-2760 Abingdon-Avo… -26-      NA          Abin…
## # ℹ 10 more variables: county <chr>, grade_level <chr>, subgroup <chr>,
## #   n_students <dbl>, pct <dbl>, is_state <lgl>, is_district <lgl>,
## #   is_school <lgl>, is_charter <lgl>, aggregation_flag <chr>

Understanding the Data

The data is returned in tidy (long) format by default:

Each row is one subgroup and grade level for one school, district, or the state
subgroup identifies the demographic group (e.g., “total_enrollment”, “white”, “hispanic”, “econ_disadv”)
grade_level shows the grade (“TOTAL”, “K”, “01”, “02”, etc.)
n_students is the enrollment count
pct is the percentage of total enrollment

enr %>%
  filter(is_state) %>%
  select(end_year, type, subgroup, grade_level, n_students) %>%
  head(10)

## # A tibble: 10 × 5
##    end_year type      subgroup         grade_level n_students
##       <int> <chr>     <chr>            <chr>            <dbl>
##  1     2025 Statewide total_enrollment TOTAL          1848560
##  2     2025 Statewide white            TOTAL           818912
##  3     2025 Statewide black            TOTAL           301315
##  4     2025 Statewide hispanic         TOTAL           528688
##  5     2025 Statewide asian            TOTAL           105368
##  6     2025 Statewide native_american  TOTAL             3697
##  7     2025 Statewide pacific_islander TOTAL             1849
##  8     2025 Statewide multiracial      TOTAL            83185
##  9     2025 Statewide special_ed       TOTAL           375258
## 10     2025 Statewide lep              TOTAL           323498
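To show how the tidy layout supports quick calculations, here is a minimal sketch that derives each subgroup's share of total enrollment from n_students. It runs on a small hand-built tibble that copies four statewide values from the output above rather than calling fetch_enr(). The names state_rows, total, and share are illustrative and not part of the package; share is computed here as a 0-1 proportion, which may not match the scale of the package's own pct column.

```r
library(dplyr)

# Toy stand-in for a few of the statewide rows printed above
# (values copied from that output; not fetched from the package).
state_rows <- tibble::tibble(
  end_year    = 2025L,
  subgroup    = c("total_enrollment", "white", "black", "hispanic"),
  grade_level = "TOTAL",
  n_students  = c(1848560, 818912, 301315, 528688)
)

# Pull the statewide total once.
total <- state_rows %>%
  filter(subgroup == "total_enrollment") %>%
  pull(n_students)

# Express every other subgroup as a proportion of that total.
shares <- state_rows %>%
  filter(subgroup != "total_enrollment") %>%
  mutate(share = n_students / total)

shares
```

The same pattern should carry over to district or school rows, provided the total and the subgroups come from the same entity and grade level.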

Filtering by Level

Use the is_state, is_district, and is_school flags to filter by aggregation level:

# State totals
state <- enr %>% filter(is_state, subgroup == "total_enrollment", grade_level == "TOTAL")
state %>% select(end_year, n_students)

## # A tibble: 1 × 2
##   end_year n_students
##      <int>      <dbl>
## 1     2025    1848560

# All districts
districts <- enr %>% filter(is_district, subgroup == "total_enrollment", grade_level == "TOTAL")
nrow(districts)

## [1] 864

# All schools
schools <- enr %>% filter(is_school, subgroup == "total_enrollment", grade_level == "TOTAL")
nrow(schools)

## [1] 3825
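The level flags can also be combined with district_id to list the schools inside one district. The sketch below assumes that school rows carry their parent district's district_id, as the head(enr) output suggests. It uses a hypothetical toy tibble: the IDs follow the format shown earlier, but the school names and counts are invented for illustration.

```r
library(dplyr)

# Hypothetical rows: the IDs mimic the format shown above,
# but the school names and enrollment counts are invented.
toy <- tibble::tibble(
  district_id = c("01-009-2620", "01-009-2620", "01-009-2620", "33-048-2760"),
  school_name = c(NA, "School A", "School B", NA),
  subgroup    = "total_enrollment",
  grade_level = "TOTAL",
  n_students  = c(300, 180, 120, 900),
  is_district = c(TRUE, FALSE, FALSE, TRUE),
  is_school   = c(FALSE, TRUE, TRUE, FALSE)
)

# Keep only school-level total-enrollment rows for a single district.
one_district <- toy %>%
  filter(
    is_school,
    district_id == "01-009-2620",
    subgroup == "total_enrollment",
    grade_level == "TOTAL"
  ) %>%
  select(school_name, n_students)

one_district
```

With real data, replacing toy with enr should give the same kind of result, assuming the columns behave as shown in the earlier output.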

Simple Analysis: Top 10 Districts

enr %>%
  filter(is_district, subgroup == "total_enrollment", grade_level == "TOTAL") %>%
  arrange(desc(n_students)) %>%
  select(district_name, city, n_students) %>%
  head(10)

## # A tibble: 10 × 3
##    district_name                       city       n_students
##    <chr>                               <chr>           <dbl>
##  1 Chicago Public Schools District 299 Chicago        323047
##  2 SD U-46                             Elgin           33525
##  3 Rockford SD 205                     Rockford        28162
##  4 Indian Prairie CUSD 204             Aurora          25932
##  5 Plainfield SD 202                   Plainfield      24411
##  6 CUSD 300                            Algonquin       20392
##  7 CUSD 308                            Oswego          16601
##  8 Naperville CUSD 203                 Naperville      15899
##  9 Schaumburg CCSD 54                  Schaumburg      15266
## 10 Valley View CUSD 365U               Romeoville      14529
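As an alternative to arrange() followed by head(), dplyr's slice_max() selects the largest rows in one step. The sketch below runs on a small tibble built from four of the rows printed above, deliberately out of order. The object names districts_toy and top3 are illustrative.

```r
library(dplyr)

# Four districts copied from the table above, deliberately unsorted.
districts_toy <- tibble::tibble(
  district_name = c(
    "Rockford SD 205",
    "Chicago Public Schools District 299",
    "Plainfield SD 202",
    "SD U-46"
  ),
  n_students = c(28162, 323047, 24411, 33525)
)

# slice_max() keeps the n rows with the largest n_students,
# returned in descending order; with_ties = FALSE caps the result at n rows.
top3 <- districts_toy %>%
  slice_max(n_students, n = 3, with_ties = FALSE)

top3
```

Setting with_ties = FALSE guarantees exactly n rows. The default keeps tied rows, which can return more than n.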

Wide Format

If you prefer wide format (one column per demographic), set tidy = FALSE:

enr_wide <- fetch_enr(2025, tidy = FALSE, use_cache = TRUE)

enr_wide %>%
  filter(type == "Statewide") %>%
  select(end_year, row_total, white, black, hispanic, asian, econ_disadv)

## # A tibble: 1 × 7
##   end_year row_total  white  black hispanic  asian econ_disadv
##      <int>     <dbl>  <dbl>  <dbl>    <dbl>  <dbl>       <dbl>
## 1     2025   1848560 818912 301315   528688 105368      918734
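To make the relationship between the two layouts concrete, the sketch below reshapes a few tidy rows into one wide row with tidyr::pivot_wider(). This illustrates the general idea only; it is not necessarily how fetch_enr(tidy = FALSE) builds its output, and the toy tibble copies values from the statewide results above.

```r
library(dplyr)
library(tidyr)

# A few tidy statewide rows, with values copied from the output above.
tidy_rows <- tibble::tibble(
  end_year   = 2025L,
  type       = "Statewide",
  subgroup   = c("white", "black", "hispanic", "asian"),
  n_students = c(818912, 301315, 528688, 105368)
)

# One column per subgroup; end_year and type identify the row.
wide_rows <- tidy_rows %>%
  pivot_wider(names_from = subgroup, values_from = n_students)

wide_rows
```

The wide form is convenient for side-by-side comparison of subgroups, while the tidy form tends to suit grouped summaries and plotting.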

Historical Data

Fetch multiple years to analyze trends:

# Fetch 5 years of data
years <- 2021:2025
all_enr <- purrr::map(years, ~fetch_enr(.x, use_cache = TRUE)) %>%
  purrr::list_rbind()

# State enrollment trend
all_enr %>%
  filter(is_state, subgroup == "total_enrollment", grade_level == "TOTAL") %>%
  select(end_year, n_students)

## # A tibble: 5 × 2
##   end_year n_students
##      <int>      <dbl>
## 1     2021    1887316
## 2     2022    1869325
## 3     2023    1857790
## 4     2024    1851290
## 5     2025    1848560
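A natural next step with a multi-year series is the year-over-year change. The sketch below computes it with dplyr::lag() on a tibble that copies the five statewide totals printed above. The column names change and pct_change are illustrative and not part of the package.

```r
library(dplyr)

# Statewide totals copied from the trend output above.
trend <- tibble::tibble(
  end_year   = 2021:2025,
  n_students = c(1887316, 1869325, 1857790, 1851290, 1848560)
)

# lag() shifts the series by one row, so each year is compared
# with the year before it; the first year has no predecessor and gets NA.
trend_change <- trend %>%
  arrange(end_year) %>%
  mutate(
    change     = n_students - lag(n_students),
    pct_change = 100 * change / lag(n_students)
  )

trend_change
```

Sorting by end_year before calling lag() matters: lag() works on row order, not on the year values themselves.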

Next Steps

See vignette("diagnostic-plots") for visualization examples
See vignette("cohort-survival-forecasts") for cohort analysis methodology
Use ?fetch_enr for full function documentation

Session Info

sessionInfo()

## R version 4.5.2 (2025-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## time zone: UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] dplyr_1.2.0        ilschooldata_0.1.0
## 
## loaded via a namespace (and not attached):
##  [1] vctrs_0.7.1       cli_3.6.5         knitr_1.51        rlang_1.1.7      
##  [5] xfun_0.56         purrr_1.2.1       generics_0.1.4    textshaping_1.0.5
##  [9] jsonlite_2.0.0    glue_1.8.0        htmltools_0.5.9   ragg_1.5.1       
## [13] sass_0.4.10       rappdirs_0.3.4    rmarkdown_2.30    tibble_3.3.1     
## [17] evaluate_1.0.5    jquerylib_0.1.4   fastmap_1.2.0     yaml_2.3.12      
## [21] lifecycle_1.0.5   compiler_4.5.2    codetools_0.2-20  fs_1.6.7         
## [25] pkgconfig_2.0.3   systemfonts_1.3.2 digest_0.6.39     R6_2.6.1         
## [29] utf8_1.2.6        tidyselect_1.2.1  pillar_1.11.1     magrittr_2.0.4   
## [33] bslib_0.10.0      withr_3.0.2       tools_4.5.2       pkgdown_2.2.0    
## [37] cachem_1.1.0      desc_1.4.3