Stats NZ data via eolas
Statistics New Zealand is the national statistical agency. eolas serves 415 datasets from Stats NZ — the largest single source in the catalogue — covering everything from quarterly CPI prints to 2023-vintage census meshblock boundaries.
This page is the orientation guide. For specific datasets, browse eolas.fyi/datasets?source=Stats+NZ.
What's in the catalogue
Stats NZ datasets fall into seven broad categories. Counts are approximate; check the live API for current totals.
Macroeconomic indicators
CPI, GDP, unemployment, balance-of-payments. Quarterly series for most. See also the OECD source guide — OECD provides the same headline indicators with international comparability.
Business demography (BDS)
Enterprise counts, business births and deaths, industry breakdowns, geographic units by region and size. ~30 datasets prefixed bds_.
df = client.statsnz("bds_enterprises_industry_size")
df = client.statsnz("bds_geographic_units_births_deaths")
Population estimates and projections
National + sub-national estimated resident population, projections to 2048, by age / sex / ethnicity / Māori-descent. Prefixed popes_ (estimates) and poppr_ (projections).
Productivity and earnings
Labour productivity, multifactor productivity, LEED (linked employer-employee data), wages by industry / occupation / region.
df = client.statsnz("prd_labour_productivity_growth")
df = client.statsnz("leed_q_measures_industry")
Justice and social
Charges, convictions, youth justice, household expenditure, household income. Prefixed jus_, hes_ (household expenditure), inc_ (income).
df = client.statsnz("jus_charges_by_offence_fiscal")
df = client.statsnz("hes_expenditure_category")
Iwi statistics (2018 census)
Iwi affiliation, iwi grouping counts for the Māori-descent population. 20+ datasets prefixed iwi18_. Note: 2023 census iwi frame not yet published; 2018 is current.
Geospatial boundaries + census
About 230 of the 415 Stats NZ datasets are geospatial: census boundaries at every vintage Stats NZ has published, plus the census and population data already tabulated against those boundaries. All ship with geometry_wkt and load as sf / GeoDataFrame when the geo extras are installed.
The set splits into three layers:
Boundary geometries. Pure polygons keyed to a vintage. Used to put data on a map.
| Geography | 2023 vintage | 2018 vintage | 2013 vintage |
|---|---|---|---|
| Meshblock (~50k blocks) | nz_meshblock_2023 | nz_meshblock_2018 | nz_meshblock_2013 |
| SA1 (~30k blocks) | nz_statistical_area_1_2023 | nz_statistical_area_1_2018 | — |
| SA2 (~2k blocks) | nz_statistical_area_2_2023 | nz_statistical_area_2_2018 | nz_statistical_area_2_2013 |
| SA3 | nz_statistical_area_3_2023 | — | — |
| Territorial authority | nz_territorial_authority_2023 | nz_territorial_authority_2018 | nz_territorial_authority_2013 |
| Regional council | nz_regional_council_2023 | nz_regional_council_2018 | — |
| Urban / rural | nz_urban_rural_2023 | nz_urban_rural_2018 | — |
| Urban area | nz_urban_area_2023 | nz_urban_area_2018 | — |
| Ward | nz_ward_2025 | nz_ward_2020 | — |
| Community board | nz_community_board_2025 | nz_community_board_2020 | — |
| Constituency / iwi / electoral | available per-vintage — search the catalogue | | |
Census tables, vintaged against boundaries. The 2023 and 2018 censuses each have roughly 15-20 datasets (2013 has only a few) in which the underlying census records have already been aggregated to a boundary geography. Use these for "show me a map of X by SA2" without doing the boundary join yourself.
# 2023 census usually-resident population, by SA2, ready to map
pop = client.statsnz("census_2023_pop_change_sa2", as_sf=True)
pop.plot(column="usually_resident_2023", legend=True, cmap="viridis")
Common 2018 datasets: census_2018_dwelling_sa1, census_2018_household_sa1, census_2018_individual_sa1_p1/p2/p3a/p3b (individual variables split across four parts due to row width), census_2018_pop_age_sa2, census_2018_rc_urban_accessibility, census_2018_electoral_mb_2020.
2023 has 18 equivalent tables; 2013 has 3 (the older census results are mostly available via SDMX time-series in the existing tables above).
Population estimates against boundaries (post-census). Stats NZ publishes Estimated Resident Population (ERP) updates between censuses. The popes_* and poppr_* tables come pre-tabulated against modern boundaries:
# Sub-national population estimates by SA2 (current frame)
sa2_pop = client.statsnz("popes_sub_rc_sa2")
# Projected population by SA2 to 2048 (2023 base)
proj = client.statsnz("poppr_sub_sa23_2023")
Boundary vintages matter — see the vintaged columns convention for joining historical census data to current geographies. Don't assume a meshblock or SA code from the 2013 vintage maps cleanly to 2018 or 2023 — Stats NZ redraws boundaries each census, so use the matched vintage on both sides of a join.
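The matched-vintage rule can be checked mechanically before a join. The following is a minimal sketch, assuming join keys follow the _YYYY suffix convention described on this page. The helper names vintage_of and check_same_vintage are hypothetical and not part of the eolas client.

```python
import re
from typing import Optional

# Matches a trailing four-digit vintage such as "_2023".
_VINTAGE_SUFFIX = re.compile(r"_(\d{4})$")


def vintage_of(column: str) -> Optional[int]:
    """Return the four-digit vintage suffix of a column name, or None."""
    match = _VINTAGE_SUFFIX.search(column)
    return int(match.group(1)) if match else None


def check_same_vintage(left_key: str, right_key: str) -> int:
    """Raise ValueError unless both join keys carry the same vintage suffix."""
    left, right = vintage_of(left_key), vintage_of(right_key)
    if left is None or right is None:
        raise ValueError(f"missing vintage suffix: {left_key!r}, {right_key!r}")
    if left != right:
        raise ValueError(f"vintage mismatch: {left_key!r} vs {right_key!r}")
    return left
```

Calling check_same_vintage("sa2_code_2018", "sa2_code_2023") raises, which is the point: the failure surfaces before the join rather than as a map with silently missing areas.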
Refresh schedule
Most Stats NZ pipelines run weekly, Wednesday morning NZ time. Stats NZ itself publishes on its own schedule — CPI is quarterly, business demography annual, population estimates quarterly. Our refresh fires once a week regardless; if the upstream hasn't changed, the data is identical to last week.
You can check the freshness of any specific dataset via the metadata endpoint:
meta = client.info("nz_cpi")
meta["last_refreshed_at"] # our last pull
meta["source_last_modified_at"] # when Stats NZ last touched the file (where capturable)
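Comparing the two timestamps tells you whether our copy lags the upstream file. This is a rough sketch, assuming both fields are ISO 8601 strings with a UTC offset (the format is not stated on this page) and that source_last_modified_at may be absent. The helper pull_is_behind is hypothetical.

```python
from datetime import datetime
from typing import Mapping, Optional


def pull_is_behind(meta: Mapping[str, Optional[str]]) -> Optional[bool]:
    """True if upstream changed after our last pull; None if either field is missing."""
    modified = meta.get("source_last_modified_at")
    refreshed = meta.get("last_refreshed_at")
    if not modified or not refreshed:
        return None
    return datetime.fromisoformat(modified) > datetime.fromisoformat(refreshed)


# Illustrative metadata only, not real values from the API.
example = {
    "last_refreshed_at": "2024-05-01T19:00:00+00:00",
    "source_last_modified_at": "2024-04-17T22:45:00+00:00",
}
print(pull_is_behind(example))  # False: our pull is newer than the upstream change
```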
License
All Stats NZ data is published under Creative Commons Attribution 4.0 (CC-BY 4.0). You can use it commercially, derive from it, and redistribute — with attribution. eolas serves the data unchanged; attribution requirements transfer to you when you redistribute.
Recommended attribution: "Source: Stats NZ, served via eolas (eolas.fyi). CC-BY 4.0."
Common patterns
CPI over time
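One common use of a quarterly index is the annual percentage change. The sketch below assumes rows in the long format described under Source-specific notes, with period and value fields. The index values are placeholders, not published CPI figures, and annual_change is a hypothetical helper.

```python
from typing import Dict, List


def annual_change(rows: List[dict]) -> Dict[str, float]:
    """Map each quarterly period to its % change on the same quarter a year earlier."""
    by_period = {row["period"]: row["value"] for row in rows}
    changes: Dict[str, float] = {}
    for period, value in by_period.items():
        year, quarter = period.split("Q")
        previous = by_period.get(f"{int(year) - 1}Q{quarter}")
        # Skip periods with no usable year-earlier value.
        if previous and value is not None:
            changes[period] = round((value / previous - 1) * 100, 2)
    return changes


# Placeholder index values for illustration only.
rows = [
    {"period": "2023Q1", "value": 1000.0},
    {"period": "2023Q2", "value": 1010.0},
    {"period": "2024Q1", "value": 1040.0},
    {"period": "2024Q2", "value": 1040.3},
]
print(annual_change(rows))
```

In practice the rows would come from a CPI dataset loaded through the client rather than a hand-written list.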
Census boundary + value join
The classic geo-demographic pattern is to load boundaries, then join your own data:
# 2023 SA2 boundaries (about 2,000 polygons, fits in Free tier)
sa2 = client.statsnz_geo("nz_statistical_area_2_2023", as_sf=True)
# Your own analysis data (with sa2_code_2023 column)
import pandas as pd
survey = pd.read_csv("my_survey.csv")
merged = sa2.merge(survey, on="sa2_code_2023")
merged.plot(column="my_metric", legend=True, cmap="viridis")
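pandas merges are inner joins by default, so rows whose codes do not match are dropped without warning. A minimal sketch of a pre-join coverage check using plain sets follows. The codes are made up, and join_coverage is a hypothetical helper.

```python
from typing import Iterable, Set, Tuple


def join_coverage(
    boundary_codes: Iterable[str], data_codes: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (data codes with no polygon, polygon codes with no data)."""
    boundaries, data = set(boundary_codes), set(data_codes)
    return data - boundaries, boundaries - data


# Made-up codes for illustration; real ones would come from the sa2_code_2023 columns.
orphans, empty_polygons = join_coverage(
    boundary_codes=["100100", "100200", "100300"],
    data_codes=["100100", "100300", "999999"],
)
print(sorted(orphans))         # ['999999']
print(sorted(empty_polygons))  # ['100200']
```

With DataFrames you would pass sa2["sa2_code_2023"] and survey["sa2_code_2023"]. If every code shows up as an orphan, check the column types: pd.read_csv infers integers for digit-only columns unless you pass dtype=str, and that may not match the type of the boundary codes.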
Discovering Stats NZ datasets
# CLI
eolas datasets list --source "Stats NZ" --search population
# Python
[d for d in client.list("Stats NZ") if "population" in d["name"]]
# R
sn <- eolas_list_statsnz()
sn[grepl("population", sn$name), ]
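Because most dataset families share a name prefix, grouping a catalogue listing by prefix gives a quick overview. This is a small sketch that uses only the prefixes named on this page; the live catalogue may use others. group_by_prefix is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Prefixes named on this page; anything else is bucketed as "other".
KNOWN_PREFIXES = ("bds_", "popes_", "poppr_", "jus_", "hes_", "inc_", "iwi18_")


def group_by_prefix(names: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket dataset names by known prefix, preserving input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        prefix = next((p for p in KNOWN_PREFIXES if name.startswith(p)), "other")
        groups[prefix].append(name)
    return dict(groups)


names = [
    "bds_enterprises_industry_size",
    "popes_sub_rc_sa2",
    "poppr_sub_sa23_2023",
    "jus_charges_by_offence_fiscal",
    "nz_meshblock_2023",
]
print(group_by_prefix(names))
```

With the Python client, the names would come from something like [d["name"] for d in client.list("Stats NZ")].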
Pipeline use
Stats NZ datasets are full-snapshot — the upstream SDMX source replaces the whole table on each publish. When you call eolas sync on a Stats NZ dataset, it issues a lightweight HEAD check and only re-downloads when the snapshot has changed. If the snapshot is unchanged, no bytes are transferred.
In practice, CPI and GDP tables are 1–5 MB — a weekly re-download is negligible. The geospatial boundary tables (meshblocks, SA2s) can be larger, but they change infrequently; most weeks the sync call returns "unchanged".
from eolas_data import Client
client = Client("your_eolas_key")
# First call: full download; subsequent calls are no-ops when snapshot is unchanged
result = client.sync_bulk("nz_cpi", path="/data/nz_cpi.parquet")
print(result.status) # "downloaded" (first time) or "unchanged"
import pandas as pd
df = pd.read_parquet("/data/nz_cpi.parquet")
See the Bulk downloads guide for cron, Airflow, and dbt integration recipes.
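For a scheduled job that syncs several datasets, tallying the returned statuses gives a quick view of what changed this week. The sketch below assumes only the sync_bulk call and the "downloaded" / "unchanged" status values shown above. StubClient is a hypothetical stand-in so the example runs offline, and sync_many is not part of the eolas client.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Iterable


def sync_many(client, datasets: Iterable[str], directory: str) -> Counter:
    """Sync each dataset to <directory>/<name>.parquet and tally the statuses."""
    tally: Counter = Counter()
    for name in datasets:
        result = client.sync_bulk(name, path=f"{directory}/{name}.parquet")
        tally[result.status] += 1
    return tally


class StubClient:
    """Offline stand-in: the first sync of a name downloads, later ones are unchanged."""

    def __init__(self) -> None:
        self._seen = set()

    def sync_bulk(self, name: str, path: str) -> SimpleNamespace:
        status = "unchanged" if name in self._seen else "downloaded"
        self._seen.add(name)
        return SimpleNamespace(status=status)


stub = StubClient()
print(sync_many(stub, ["nz_cpi", "nz_meshblock_2023"], "/data"))  # first run
print(sync_many(stub, ["nz_cpi", "nz_meshblock_2023"], "/data"))  # second run
```

Swap StubClient for a real Client instance to run it against the API.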
Source-specific notes
- SDMX origin: most Stats NZ time-series come from their SDMX API. Multi-dimensional series (e.g. CPI by quarter × group) are flattened into a long-format (date, period, value, ...) table. The period column carries the original SDMX period code (e.g. 2024Q1).
- Vintage-suffixed columns: any geospatial column drawn from the 2023 census frame carries _2023 (e.g. meshblock_id_2023, sa2_code_2023). 2018-frame columns carry _2018. Don't assume codes match across vintages, because boundaries are redrawn every 5 years.
- Suppressed counts: for privacy, Stats NZ rounds many small-count cells to base-3 and suppresses cells below threshold (typically <6). These appear as null with a companion _suppressed=true flag in the response.
- Provisional vs final: some boundary tables exist in both forms (e.g. nz_ward_2025_v2_provisional and nz_ward_2025). The provisional version is usually the working draft Stats NZ released for public consultation; the un-suffixed version is the final.
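Two of the notes above translate directly into small helpers: splitting a quarterly period code and summing values while keeping track of suppressed cells. This is a minimal sketch; the row shape, including a _suppressed key alongside value, is an assumption based on the description above, and both function names are hypothetical.

```python
import re
from typing import List, Tuple

# Quarterly SDMX-style period code, e.g. "2024Q1".
_QUARTER = re.compile(r"(\d{4})Q([1-4])")


def parse_quarter(period: str) -> Tuple[int, int]:
    """Split a quarterly period code such as "2024Q1" into (year, quarter)."""
    match = _QUARTER.fullmatch(period)
    if match is None:
        raise ValueError(f"not a quarterly period code: {period!r}")
    return int(match.group(1)), int(match.group(2))


def total_with_caveat(rows: List[dict]) -> Tuple[int, int]:
    """Sum the published values and count how many cells were suppressed."""
    published = [row["value"] for row in rows if row["value"] is not None]
    suppressed = sum(1 for row in rows if row.get("_suppressed"))
    return sum(published), suppressed


# Made-up rows; the exact response shape is an assumption.
rows = [
    {"period": "2024Q1", "value": 12, "_suppressed": False},
    {"period": "2024Q1", "value": None, "_suppressed": True},
    {"period": "2024Q1", "value": 9, "_suppressed": False},
]
print(parse_quarter("2024Q1"))  # (2024, 1)
print(total_with_caveat(rows))  # (21, 1)
```

Returning the suppressed count next to the sum keeps the caveat visible: with rounding and suppression in play, a total built from cells is approximate and may not equal a published total.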
Where to find more
- All Stats NZ datasets on eolas: eolas.fyi/datasets?source=Stats+NZ
- Stats NZ's own data portal: www.stats.govt.nz — SDMX API, downloads, methodology
- Stats NZ Geospatial (Datafinder): datafinder.stats.govt.nz — the WFS/Koordinates source for our boundary tables
- Open data licence summary: data.govt.nz/about/open-data-nzgoal
Related
- OECD source guide — international-comparable indicators
- Examples — worked code recipes
- Authentication — how to set your API key