School Data Category Taxonomy
Source: DATA-CATEGORY-TAXONOMY.md
Canonical taxonomy of data types that US state Departments of Education publish about schools. Built from an inventory of njschooldata (the mothership package) and research across state DOE portals (CA CDE, TX TEA, NY NYSED, OH DEW, FL DOE), ESSA requirements, and CRDC reporting.
Current State of Affairs
njschooldata covers ~13 data domains with 170+ functions. The other 49 state packages mostly cover enrollment only; a handful also cover assessments (~5 states), graduation rates (~3 states), or a school/district directory (~2 states).
Tier 1 — Core (every state DOE publishes this)
| # | Category | njschooldata? | 49 states? | Notes |
|---|---|---|---|---|
| 1 | Enrollment & Demographics | Yes | Yes (all 50) | Race, gender, grade level, FRPL |
| 2 | Assessments / Test Scores | Yes | ~5 states | State tests, proficiency levels, growth |
| 3 | Graduation Rates | Yes | ~3 states | 4-yr ACGR, extended rates, diploma types |
| 4 | School/District Directory | Yes | ~2 states | Names, IDs, addresses, coordinates, locale |
1. Enrollment & Demographics
- Total headcount by school, district, state
- Enrollment by grade level (PK through 12)
- Enrollment by race/ethnicity (White, Black, Hispanic, Asian, Native American, Pacific Islander, Multiracial)
- Enrollment by gender
- Gender-race intersections (e.g., White_M, Black_F)
- Free and Reduced-Price Lunch (FRPL) eligibility
- Direct certification (SNAP, TANF, foster, homeless, migrant)
- Community Eligibility Provision (CEP) participation
- English Learners / LEP counts by grade and home language
- Students with Disabilities / IEP counts by disability category
- Gifted and Talented enrollment
- Homeless, foster youth, migrant, military-connected students
- Charter, magnet, virtual school enrollment
- Pre-K / early childhood enrollment
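Gender-race intersection columns such as White_M and Black_F are usually easier to analyze once reshaped into one row per race and gender. The following is a minimal sketch in Python, for illustration only; it does not use njschooldata, and the `school_id` field, the `total` column, and the `<Race>_<M|F>` naming rule are assumptions rather than a documented file layout.

```python
def tidy_intersections(row, id_field="school_id"):
    """Split wide <Race>_<Gender> count columns into long records.

    Assumes (hypothetically) that intersection columns end in _M or _F.
    """
    records = []
    for column, count in row.items():
        if column == id_field:
            continue
        # rpartition splits on the LAST underscore, so a race label that
        # itself contains underscores (e.g. "Pacific_Islander_F") survives.
        race, sep, gender = column.rpartition("_")
        if not sep or gender not in ("M", "F"):
            continue  # not an intersection column; skip it
        records.append(
            {id_field: row[id_field], "race": race, "gender": gender, "n": count}
        )
    return records


# Hypothetical wide row for one school.
wide = {"school_id": "0001", "White_M": 120, "White_F": 115, "Black_F": 60, "total": 295}
long_rows = tidy_intersections(wide)
```

Columns that do not match the naming rule (here, `total`) are dropped rather than guessed at, which keeps the long table free of mislabeled rows.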
2. Assessments / Test Scores
- State standardized test proficiency rates (ELA, Math, Science, Social Studies)
- Performance level distributions (Below Basic through Advanced)
- Disaggregated by race, gender, economic status, disability, EL status
- Student growth measures (growth percentiles, value-added)
- End-of-course exams (Algebra I, Biology, US History)
- Alternate assessments for students with significant cognitive disabilities
- Kindergarten readiness assessments
- Third-grade reading proficiency benchmarks
- Legacy assessment systems (states transition every ~10 years)
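A proficiency rate is typically derived from the performance level distribution by summing the levels a state counts as meeting the standard. The sketch below assumes a four-level scale in which "Proficient" and "Advanced" count as proficient; level names and cut points vary by state, so both are labeled assumptions here.

```python
# Assumption: which levels count as "proficient" differs by state and test.
PROFICIENT_LEVELS = frozenset({"Proficient", "Advanced"})


def proficiency_rate(level_counts, proficient_levels=PROFICIENT_LEVELS):
    """Return the percent of tested students at a proficient level, or None."""
    tested = sum(level_counts.values())
    if tested == 0:
        return None  # nothing to report (or the cell was suppressed)
    met = sum(n for level, n in level_counts.items() if level in proficient_levels)
    return 100.0 * met / tested


# Hypothetical counts for one school, grade, and subject.
counts = {"Below Basic": 20, "Basic": 30, "Proficient": 35, "Advanced": 15}
rate = proficiency_rate(counts)
```

Keeping the raw level counts alongside the derived rate makes it possible to recompute proficiency when a state changes its definition.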
Student Growth Measures
Growth data quantifies how much individual students improve year-over-year, independent of absolute proficiency level. Three major model families are used across states:
- SGP (Student Growth Percentiles): ~24 states. Compares each student’s growth to academic peers. Scale: 1-99.
- EVAAS/VAM (Value-Added Models): ~8 states. Regression-based teacher/school effectiveness measures.
- Custom models: ~6 states use growth-to-proficiency, gain scores, or status-and-change approaches.
- Availability: regardless of model, ~20 states publish school-level growth data in downloadable bulk files.
See GROWTH-SGP-VAM-LANDSCAPE.md for the full 50-state survey with download URLs and implementation priorities.
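To make the SGP idea concrete, the sketch below ranks one student's current score against a group of academic peers and clips the result to the 1-99 scale. This is a deliberate simplification: operational SGP models estimate percentiles with quantile regression over prior score histories, whereas this example assumes the peer group has already been selected and only illustrates the percentile-ranking step. The function name and scores are hypothetical.

```python
def simple_growth_percentile(student_current, peer_currents):
    """Percentile rank of a current score among academic peers, clipped to 1-99.

    Simplified illustration only; not the quantile-regression SGP method.
    """
    if not peer_currents:
        raise ValueError("need at least one peer score")
    below = sum(1 for s in peer_currents if s < student_current)
    ties = sum(1 for s in peer_currents if s == student_current)
    # Mid-rank percentile: count half of the tied scores as "below".
    pct = 100.0 * (below + 0.5 * ties) / len(peer_currents)
    return max(1, min(99, round(pct)))


# Hypothetical current-year scores for students with similar prior scores.
peers = [410, 425, 430, 440, 455, 460, 470, 480, 495, 510]
sgp = simple_growth_percentile(470, peers)
```

A student can have low absolute proficiency and still earn a high growth percentile, because the comparison is only against peers who started from a similar place.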
3. Graduation Rates
- 4-year adjusted cohort graduation rate (ACGR) — federally required
- 5-year and 6-year extended graduation rates
- Disaggregated by all ESSA subgroups
- Diploma types (standard, advanced, honors, alternative, GED)
- Graduation pathway endorsements (career-ready, college-ready, seal of biliteracy)
- Dropout rates (annual and cohort)
- Grade retention rates
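The 4-year ACGR divides the number of students who graduate within four years by an adjusted cohort: first-time 9th graders, plus students who transfer in, minus students who transfer out, emigrate, or die. A minimal sketch of that arithmetic follows; the function name, parameter names, and counts are hypothetical.

```python
def acgr(first_time_ninth, transfers_in, removals, graduates):
    """Four-year adjusted cohort graduation rate, as a percent.

    removals = students who transferred out, emigrated, or died.
    Returns None when the adjusted cohort is empty.
    """
    cohort = first_time_ninth + transfers_in - removals
    if cohort <= 0:
        return None
    return 100.0 * graduates / cohort


# Hypothetical cohort: 200 entering 9th graders, 15 in, 25 out, 171 graduates.
rate = acgr(first_time_ninth=200, transfers_in=15, removals=25, graduates=171)
```

The same function covers 5-year and 6-year extended rates if `graduates` counts students from the same cohort who finished within the longer window.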
4. School/District Directory
- School name, address, phone, website
- NCES and state school/district IDs
- Geographic coordinates (lat/long)
- Locale classification (city, suburban, town, rural)
- County, congressional district
- School type (traditional, charter, magnet, alternative, virtual, CTE)
- Title I status, grade span, enrollment capacity
Tier 2 — ESSA-Required (federally mandated reporting)
| # | Category | njschooldata? | 49 states? | Notes |
|---|---|---|---|---|
| 5 | Per-Pupil Expenditure | No | No | School-level spending; reporting required by ESSA |
| 6 | Accountability Ratings | Partial (ESSA status) | No | A-F grades, star ratings, CSI/TSI lists |
| 7 | Chronic Absenteeism | Yes | No | Most common ESSA “5th indicator” |
| 8 | English Learner Progress | Yes (ACCESS) | No | ELP assessment, reclassification rates |
| 9 | Special Ed Classification | Yes | No | IDEA disability categories, placement settings |
5. Per-Pupil Expenditure
- Total per-pupil expenditure at school level (new under ESSA)
- Spending by funding source (federal, state, local)
- Instructional vs. non-instructional spending
- Personnel vs. non-personnel costs
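Per-pupil figures come from dividing each spending total by the school's enrollment. The sketch below breaks a school's spending down by funding source; the dollar amounts, the enrollment count, and the choice of enrollment as the denominator are assumptions for illustration, since states differ on which student count they use.

```python
def per_pupil_by_source(spending_by_source, enrollment):
    """Return per-pupil spending for each funding source plus a total."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    ppe = {source: amount / enrollment for source, amount in spending_by_source.items()}
    ppe["total"] = sum(spending_by_source.values()) / enrollment
    return ppe


# Hypothetical school: dollars by funding source and a 450-student enrollment.
school = {"federal": 450_000, "state": 2_250_000, "local": 1_800_000}
ppe = per_pupil_by_source(school, enrollment=450)
```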
6. Accountability Ratings
- Overall school/district ratings (A-F, stars, color-coded)
- Component scores (achievement, growth, gap closing, graduation, etc.)
- CSI (Comprehensive Support and Improvement) school lists
- TSI (Targeted Support and Improvement) school lists
- ATSI (Additional Targeted Support and Improvement) school lists
- Improvement plan status and exit criteria
7. Chronic Absenteeism
- Chronic absenteeism rate (10%+ of school days missed)
- Disaggregated by race, gender, economic status, disability, EL
- Average daily attendance (ADA) and average daily membership (ADM)
- Truancy rates
- Student mobility/stability rates
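The 10% threshold above is applied per student, using the days that student was enrolled, and the school rate is the share of students flagged. The following is a minimal sketch under that definition; the record layout (`days_absent`, `days_enrolled`) is hypothetical, and states may add minimum-enrollment rules not modeled here.

```python
CHRONIC_THRESHOLD = 0.10  # share of enrolled days missed


def is_chronically_absent(days_absent, days_enrolled, threshold=CHRONIC_THRESHOLD):
    """True when a student missed at least `threshold` of their enrolled days."""
    if days_enrolled <= 0:
        return False
    return days_absent / days_enrolled >= threshold


def chronic_absenteeism_rate(students):
    """Percent of students with any enrolled days who are chronically absent."""
    eligible = [s for s in students if s["days_enrolled"] > 0]
    if not eligible:
        return None
    chronic = sum(
        is_chronically_absent(s["days_absent"], s["days_enrolled"]) for s in eligible
    )
    return 100.0 * chronic / len(eligible)


# Hypothetical student records.
students = [
    {"days_absent": 20, "days_enrolled": 180},  # about 11% missed
    {"days_absent": 5, "days_enrolled": 180},
    {"days_absent": 12, "days_enrolled": 90},   # mid-year arrival, about 13% missed
    {"days_absent": 2, "days_enrolled": 175},
]
rate = chronic_absenteeism_rate(students)
```

Using each student's own enrolled days as the denominator means a mid-year arrival can be flagged with far fewer absences than a full-year student.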
8. English Learner Progress
- ELP assessment results (WIDA ACCESS, ELPAC, or state-specific)
- Reclassification / redesignation rates (EL to Fluent English Proficient)
- Long-term English Learner counts
- Home language distribution
- Recently arrived immigrant counts
9. Special Education
- Students with disabilities by IDEA disability category (13 categories)
- Educational placement/setting (regular class 80%+, resource, self-contained, separate)
- Least Restrictive Environment (LRE) percentages
- IDEA performance indicators (SPP/APR)
- Disproportionate representation by race/ethnicity
- Dispute resolution data (mediations, due process)
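Disproportionate representation is commonly summarized with a risk ratio: the rate at which one group is identified for special education divided by the rate for all other students. The sketch below shows that calculation with hypothetical counts; states set their own thresholds and minimum cell sizes, which are not modeled here.

```python
def risk_ratio(group, counts):
    """Risk ratio for `group` versus all other groups combined.

    counts maps group name -> (students identified, students enrolled).
    Returns None when a denominator would be zero.
    """
    g_identified, g_enrolled = counts[group]
    others = [v for name, v in counts.items() if name != group]
    o_identified = sum(identified for identified, _ in others)
    o_enrolled = sum(enrolled for _, enrolled in others)
    if g_enrolled == 0 or o_enrolled == 0 or o_identified == 0:
        return None
    return (g_identified / g_enrolled) / (o_identified / o_enrolled)


# Hypothetical district counts: (identified, enrolled) per group.
counts = {"Black": (30, 200), "White": (40, 500), "Hispanic": (20, 300)}
rr = risk_ratio("Black", counts)  # group risk 15% vs. 7.5% for all others
```

A ratio near 1 means the group is identified at about the same rate as everyone else; values well above 1 indicate over-representation.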
Tier 3 — Commonly Published (most state DOEs have this)
| # | Category | njschooldata? | 49 states? | Notes |
|---|---|---|---|---|
| 10 | Discipline | Yes | No | Suspensions, expulsions, by demographic |
| 11 | Teacher/Staff Data | Yes | No | Demographics, experience, qualifications, ratios |
| 12 | College-Going Rates | Yes (postsecondary) | No | Enrollment in 2yr/4yr within 16 months |
| 13 | Dropout Rates | No | No | Annual and cohort dropout rates |
| 14 | SAT/ACT Scores | Yes | No | Participation and performance |
| 15 | AP/IB Participation | Yes | No | Enrollment and pass rates |
| 16 | CTE (Career/Technical Ed) | Partial | No | Programs, concentrators, credentials |
10. Discipline
- In-school suspensions (count and rate)
- Out-of-school suspensions (count and rate)
- Expulsions
- Disciplinary alternative education placements
- All disaggregated by race, gender, disability, EL status
- Restraint and seclusion incidents
- Referrals to law enforcement, school-related arrests
- Bullying/harassment incidents (including cyberbullying)
- Violent incidents, weapon possession, drug/alcohol incidents
11. Teacher/Staff Data
- Teacher FTE counts by school/district
- Student-teacher ratios
- Teacher race/ethnicity and gender
- Years of experience distribution
- Certification type (standard, provisional, emergency, alternative)
- In-field vs. out-of-field teaching rates
- National Board Certified teacher counts
- Advanced degree attainment
- Support staff ratios (counselors, psychologists, nurses, social workers, librarians)
- Principal demographics and turnover
12. College-Going Rates
- Postsecondary enrollment within 12 or 16 months of high school graduation
- Institution type (2-year, 4-year, public, private)
- College persistence / retention after first year
- Remediation rates (developmental course placement)
13. Dropout Rates
- Annual dropout rate by grade
- Cohort dropout rate
- Dropout reasons (where collected)
- Dropout recovery program metrics
14. SAT/ACT Scores
- Participation rates by school/district
- Average scores (composite and by section)
- College readiness benchmark attainment
- PSAT/NMSQT results
Tier 4 — Rich Data (well-resourced state DOEs publish this)
| # | Category | njschooldata? | 49 states? | Notes |
|---|---|---|---|---|
| 17 | Course Enrollment | Yes | No | Math, science, CS, arts, world language |
| 18 | School Finance (full) | No | No | Revenue sources, expenditures by function |
| 19 | Teacher Salaries/Vacancies | No | No | Salary schedules, shortage areas, turnover |
| 20 | Class Size | No | No | By grade and subject |
| 21 | School Climate Surveys | No | No | Student/parent/staff perception surveys |
| 22 | Gifted & Talented | No | No | Identification rates, program enrollment |
17. Course Enrollment
- Math courses (Algebra, Geometry, Calculus, AP Math)
- Science courses (Biology, Chemistry, Physics, AP Science)
- Computer science courses
- Arts and music enrollment
- World language courses by language
- Social studies/history courses
- Physical education
18. School Finance (Full)
- Revenue by source (federal, state, local property tax, other local)
- Expenditure by function (instruction, support, admin, operations, transportation, food service)
- Fund balance / reserve levels
- Bond debt and capital spending
- Financial integrity/transparency ratings (e.g., TX FIRST)
- Audit findings
19. Teacher Salaries & Vacancies
- Average teacher salary by district and experience level
- Salary schedules (step-and-lane tables)
- Vacancy counts by subject area
- Teacher turnover/attrition rates
- Critical shortage area designations
- New hires by pathway (traditional, alternative, out-of-state)
- Emergency/temporary permit counts
- Teacher absenteeism rates
20. Class Size
- Average class size by grade level
- Average class size by subject
- Class size distribution
- State mandate compliance
Tier 5 — Emerging / Specialized
| # | Category | njschooldata? | 49 states? | Notes |
|---|---|---|---|---|
| 23 | Post-Secondary Outcomes | Partial (matriculation) | No | College persistence, remediation, completion |
| 24 | Workforce Outcomes | No | No | Employment, earnings (TX, FL, VA lead) |
| 25 | School Nutrition | No | No | NSLP/SBP participation, CEP status |
| 26 | Health/Fitness | No | No | BMI, fitness tests, immunization compliance |
| 27 | Facilities | No | No | Building age, capacity, condition |
| 28 | Technology/Connectivity | No | No | Device ratios, broadband, E-Rate |
| 29 | Transportation | No | No | Students bused, miles, fleet data |
| 30 | Pre-K Programs | No | No | State-funded pre-K, Head Start |
23. Post-Secondary Outcomes
- College completion rates (where state data systems link to NSC or state higher ed)
- College persistence by institution type
- Remediation rates in college
- Degree attainment by field
24. Workforce Outcomes
- Employment rates after graduation
- Earnings data (linked via state workforce agencies)
- Military enlistment rates
- Apprenticeship completion and placement
- Employment by industry sector
25. School Nutrition
- National School Lunch Program (NSLP) participation
- School Breakfast Program (SBP) participation
- Community Eligibility Provision (CEP) adoption
- Fresh Fruit and Vegetable Program participation
26. Health & Fitness
- Immunization compliance rates
- Health screening results (vision, hearing, dental, BMI)
- Physical fitness test results (CA and TX mandate these)
- School nurse staffing levels
27. Facilities
- Building age and condition assessments
- Square footage and capacity
- Americans with Disabilities Act (ADA) accessibility compliance
- Environmental hazards (lead, asbestos)
- Capital improvement tracking
- Deferred maintenance estimates
28. Technology & Connectivity
- Device-to-student ratios
- Broadband speed and connectivity
- Wi-Fi classroom coverage
- E-Rate program participation
- Home internet access surveys
Recommended Expansion Priority
Highest-impact order for expanding the 49 state packages beyond enrollment:
| Priority | Category | Rationale |
|---|---|---|
| 1 | Assessments | Most states publish this, huge analytical value, pairs with enrollment for equity stories |
| 2 | Graduation Rates | Second most requested after enrollment, universally available, strong story potential |
| 3 | Per-Pupil Expenditure | ESSA-mandated at school level, relatively new, underexplored, money stories get attention |
| 4 | Accountability Ratings | A-F grades are catnip for data stories, every state has a system |
| 5 | Chronic Absenteeism | Post-pandemic this is the hottest metric in education, ESSA 5th indicator |
| 6 | Teacher/Staff Data | Teacher shortages are a massive ongoing story, demographics + diversity |
| 7 | Discipline | Racial disproportionality in suspensions is evergreen, CRDC makes this high-profile |
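Combining this priority list with the njschooldata column of the tier tables shows which expansions can borrow from an existing implementation and which would start from scratch. The sketch below encodes the seven priority rows with the tier and coverage values from the tables above; the `Category` record and the helper function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    priority: int
    name: str
    tier: int
    nj_coverage: str  # "Yes", "Partial", or "No", copied from the tier tables


PRIORITIES = [
    Category(1, "Assessments", 1, "Yes"),
    Category(2, "Graduation Rates", 1, "Yes"),
    Category(3, "Per-Pupil Expenditure", 2, "No"),
    Category(4, "Accountability Ratings", 2, "Partial"),
    Category(5, "Chronic Absenteeism", 2, "Yes"),
    Category(6, "Teacher/Staff Data", 3, "Yes"),
    Category(7, "Discipline", 3, "Yes"),
]


def without_reference_implementation(categories):
    """Names of categories njschooldata does not cover at all, in priority order."""
    ordered = sorted(categories, key=lambda c: c.priority)
    return [c.name for c in ordered if c.nj_coverage == "No"]


gaps = without_reference_implementation(PRIORITIES)
```

Under these table values, per-pupil expenditure is the only priority category with no njschooldata counterpart to adapt, and accountability ratings are only partially covered.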
Coverage Matrix (njschooldata functions by category)
For reference, here’s what njschooldata already implements per category:
| Category | Key Functions |
|---|---|
| Enrollment | `fetch_enr()`, `tidy_enr()`, `fetch_enr_cached()` |
| Assessments | `fetch_parcc()`, `fetch_njgpa()`, `fetch_njask()`, `fetch_hspa()`, `fetch_gepa()` |
| EL Assessments | `fetch_access()`, `fetch_all_access()` |
| Graduation | `fetch_grad_rate()`, `fetch_grad_count()`, `fetch_6yr_grad_rate()` |
| Directory | `get_school_directory()`, `get_district_directory()` |
| Accountability | `fetch_essa_status()`, `fetch_essa_progress()` |
| Chronic Absence | `fetch_chronic_absenteeism()`, `fetch_days_absent()` |
| Discipline | `fetch_disciplinary_removals()`, `fetch_violence_vandalism_hib()` |
| Staff | `fetch_staff_demographics()`, `fetch_staff_ratios()`, `fetch_teacher_experience()` |
| College/Career | `fetch_sat_participation()`, `fetch_ap_participation()`, `fetch_postsecondary()` |
| Courses | `fetch_math_course_enrollment()`, `fetch_science_course_enrollment()`, `fetch_cs_enrollment()` |
| CTE | `fetch_cte_participation()`, `fetch_apprenticeship_data()`, `fetch_industry_credentials()` |
| Special Ed | `fetch_sped()`, `clean_sped_df()` |
| Report Cards | `get_one_rc_database()`, 63+ SPR sheets via `fetch_spr_data()` |
| Socioeconomic | `fetch_dfg()`, `add_dfg()` |
| Charter Analysis | `charter_market_share()`, `charter_sector_*_aggs()` |