Load synthesizer
Produce a credible county-level hourly active power demand profile for the six Johor LAs of interest (Iskandar Puteri, Johor Bahru, Pasir Gudang, Kulai, Kota Tinggi, Pontian), spanning 2020-01-01 to the latest backfilled month, in UTC and aligned to the bronze weather grid.
The synthesizer is the methodological spine for everything downstream (DC anchor BTM economics, ENEGEM arbitrage proxy, dispatch sim). Without it those models are unanchored.
Why we need to synthesize at all
JB has no public hourly load curve. The available signals are:
| Source | Cadence | Granularity | Status |
|---|---|---|---|
| DOSM electricity_consumption | monthly | national peninsular only | implemented |
| TNB ETOU | tariff windows only | implicit time-of-day weights | static reference, no real-time signal |
| ICPT | 6-monthly | flat sen/kWh adjustment | not yet ingested |
| Senai METAR | hourly | point | implemented |
| NASA POWER | hourly | 0.25° gridded | implemented |
| ST annual reports | yearly | state-level mix | not yet ingested |
| DOSM gdp_state_real_supply | yearly | state × sector GDP | implemented |
Every public Malaysian load signal is monthly or coarser (DOSM, ST), implicit (ETOU windows), or both. None is exposed at hourly resolution. We bridge the gap with a weather-driven model whose monthly integral is forced to equal the DOSM total times a Johor allocation share.
Architecture (signal flow)
                                ┌─ POWER 9-pt × hourly T2M, GHI, RH ──┐
    JB metro climate ─────────► │                                     │── (silver)
                                └─ METAR WMKJ × hourly tmpc, dwpc ────┘   weather_hourly
                                                                                │
                                                                                ▼
                                                        base hourly shape = CDH_t × ETOU_w(t) × DoW_w(t)
                                                                                │
    DOSM peninsular monthly ─► Johor allocation share ─► monthly target         ▼
    (sector="local")           (GDP-weighted, +DC adj)                      normalize
                                                                                │
                                                                                ▼
                                                       county allocation ─► 6 county-level hourly profiles
                                                       (population × industrial-park weights)
                                                                                │
                                                                                ▼
                                                                     gold.load_hourly_county

Inputs
| Input | Unit | Source dataset | Notes |
|---|---|---|---|
| Hourly temperature at JB centroid | °C | bronze.weather.nasa-power (lat=1.5, lon=103.75) + METAR validation | POWER is the primary feed; METAR is used for validation only, not as a fallback or for bias correction |
| ETOU peak/off-peak mask | binary | reference/tariffs/tnb_etou.yaml | Peak = Mon–Fri 14:00–22:00 MYT, excluding federal + Johor PHs |
| Day-of-week class | enum | derived from MYT calendar | {weekday, saturday, sunday, ph} |
| Monthly peninsular consumption | GWh | bronze.macro.dosm.electricity-consumption (sector="local") | Excludes T&D losses and exports |
| Annual Johor share of peninsular GDP | unitless | bronze.macro.dosm.gdp-state-real-supply (sector="p0") | Empirical 2023: 11.1% |
| Johor county population | thousands | bronze.macro.dosm.population-state | Annual; we interpolate to monthly |
| Johor industrial park land area | km² | manual reference (from IRDA data) | TODO: extract; for v0 use uniform |
| Data center anchor MW per county | MW | dc_tracker/jb_data_centers.yaml | Adjustment to GDP-based allocation |
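
The two calendar inputs (ETOU mask and day-of-week class) can be derived from UTC timestamps with a fixed UTC+8 offset, since MYT has no daylight saving. The sketch below is a minimal illustration under assumptions, not the pipeline's code: `PUBLIC_HOLIDAYS`, `day_class`, and `is_etou_peak` are hypothetical names, and the one-entry holiday set is a placeholder for the list kept in reference/tariffs/tnb_etou.yaml.

```python
from datetime import date, datetime, timedelta, timezone

MYT = timezone(timedelta(hours=8))  # Malaysia Time: fixed UTC+8, no DST

# Placeholder holiday set (assumption); the real list lives in tnb_etou.yaml.
PUBLIC_HOLIDAYS = {date(2023, 8, 31)}


def day_class(ts_utc: datetime) -> str:
    """Classify a tz-aware UTC timestamp on the MYT calendar."""
    local = ts_utc.astimezone(MYT)
    if local.date() in PUBLIC_HOLIDAYS:
        return "ph"
    return {5: "saturday", 6: "sunday"}.get(local.weekday(), "weekday")


def is_etou_peak(ts_utc: datetime) -> bool:
    """True for hours starting in Mon-Fri 14:00-22:00 MYT, excluding PHs."""
    local = ts_utc.astimezone(MYT)
    return day_class(ts_utc) == "weekday" and 14 <= local.hour < 22
```

Note that the calendar join happens after localization: 17:00 UTC on a Friday is already Saturday 01:00 in MYT.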
Mathematical formulation
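Step 1 — Hourly base shape
For each hour $t$ (in UTC, MYT-localized for calendar joins), define the unnormalized base shape

$$
s_t = \left(\alpha_0 + \alpha_1\,\mathrm{CDH}_t\right)\, w_{\text{ETOU}}(t)\, w_{\text{DoW}}(t)
$$

where:
- $\mathrm{CDH}_t = \max(0,\ T_t - T_{\text{base}})$ is the cooling degree-hours for hour $t$, with $T_{\text{base}}$ the cooling base temperature. JB centroid 5y mean T is 26.94 °C and the base sits just below that mean, so the cooling term is positive in most hours.
- $\alpha_0$ = base load coefficient (always-on industrial + lighting)
- $\alpha_1$ = cooling sensitivity (load response per °C above $T_{\text{base}}$)
- $w_{\text{ETOU}}(t)$ = ETOU peak/off-peak weight: captures industrial duty cycle compression into the peak window
- $w_{\text{DoW}}(t)$ = day-of-week weight: Saturday is partial-shift in Johor (note: Johor weekend is Fri–Sat for state government; private sector is more mixed)

The $\alpha$ and $w$ coefficients are fit by Step 3 calibration.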
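
A minimal sketch of the base shape for a single hour, assuming the multiplicative form from the architecture diagram (CDH × ETOU weight × DoW weight) with an affine cooling term. Every number in `PARAMS` is a placeholder chosen for illustration, including the base temperature; none is a fitted coefficient, and the function and key names are hypothetical.

```python
def cooling_degree_hours(temp_c: float, t_base_c: float) -> float:
    """CDH for one hour: degrees above the base temperature, floored at zero."""
    return max(0.0, temp_c - t_base_c)


def base_shape(temp_c: float, is_peak: bool, day_cls: str, params: dict) -> float:
    """Unnormalized hourly shape: (alpha0 + alpha1 * CDH) * w_ETOU * w_DoW."""
    cdh = cooling_degree_hours(temp_c, params["t_base_c"])
    w_etou = params["w_peak"] if is_peak else 1.0
    w_dow = params["w_dow"][day_cls]
    return (params["alpha0"] + params["alpha1"] * cdh) * w_etou * w_dow


# Placeholder values only (assumptions); real coefficients come from the fit.
PARAMS = {
    "alpha0": 1.0,
    "alpha1": 0.1,
    "t_base_c": 26.0,
    "w_peak": 1.2,
    "w_dow": {"weekday": 1.0, "saturday": 0.85, "sunday": 0.75, "ph": 0.75},
}

afternoon = base_shape(28.5, True, "weekday", PARAMS)   # warm weekday peak hour
night = base_shape(25.7, False, "sunday", PARAMS)       # cool Sunday off-peak hour
```

Because Step 3 rescales every month to a fixed energy total, only the relative size of these values within a month matters, not their absolute level.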
Step 2 — Johor monthly target
For month $m$ in year $y$:

$$
E^{\text{JHR}}_{m,y} = \sigma_y\, E^{\text{pen}}_{m,y} + \Delta^{\text{DC}}_{m,y}
$$

- $E^{\text{pen}}_{m,y}$ = DOSM electricity_consumption “local” sector for that month (peninsular)
- $\sigma_y$ = Johor’s share of peninsular GDP for the year (we use GDP as the best public proxy for state electricity share; caveat: Johor’s industrial energy intensity differs from the peninsular average, so this is biased low; see Limitations below for the planned correction)
- $\Delta^{\text{DC}}_{m,y}$ = DC anchor adjustment: when DCs come online, peninsular load ramps but the GDP share doesn’t move accordingly. For each DC commissioned by month $m$, add its anchor load, converted to energy for that month, outside the GDP allocation.
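
The monthly target can be sketched as a GDP-share allocation plus a DC term. This is an illustration under assumptions: the function name, the `dc_load_factor` parameter, and the MW-to-GWh conversion (MW × hours in month × load factor) are not taken from the source, and the inputs in the example are round illustrative numbers apart from the 11.1% share.

```python
def johor_monthly_target_gwh(
    peninsular_gwh: float,
    johor_gdp_share: float,
    dc_anchor_mw: list[float],
    hours_in_month: int,
    dc_load_factor: float = 1.0,  # assumption: fraction of anchor MW actually drawn
) -> float:
    """GDP-share allocation of peninsular energy plus an additive DC adjustment."""
    gdp_allocated = johor_gdp_share * peninsular_gwh
    # MW * h = MWh; divide by 1000 to express the adjustment in GWh.
    dc_energy = sum(dc_anchor_mw) * dc_load_factor * hours_in_month / 1000.0
    return gdp_allocated + dc_energy


# Illustrative month: 10,000 GWh peninsular, 11.1% share, two DCs, 31-day month.
target = johor_monthly_target_gwh(10_000.0, 0.111, [100.0, 50.0], 744)
```

Here the GDP allocation contributes 1,110 GWh and the DC term 111.6 GWh, which shows how quickly commissioned anchors become material relative to the share-based figure.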
Step 3 — Calibration (per month, per year)
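For each month, solve for a single scaling constant $k_{m,y}$ such that:

$$
\sum_{t \in (m,y)} k_{m,y}\, s_t = E^{\text{JHR}}_{m,y}
$$

where $s_t$ is the Step 1 base shape and $E^{\text{JHR}}_{m,y}$ is the Step 2 Johor monthly target. This is closed-form: $k_{m,y} = E^{\text{JHR}}_{m,y} \big/ \sum_{t \in (m,y)} s_t$.

The $(\alpha, w)$ coefficients are fit once (across all 5 years of data) by minimizing the residual sum of squared errors against the targets in § “Calibration targets” below.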
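
The closed-form scaling can be sketched in a few lines. Function names are hypothetical, and the example uses a toy three-hour "month" so the arithmetic is easy to follow.

```python
def monthly_scale(shape_values: list[float], target_energy: float) -> float:
    """Closed-form k: target energy divided by the sum of the base shape."""
    total = sum(shape_values)
    if total <= 0:
        raise ValueError("base shape must have a positive monthly sum")
    return target_energy / total


def calibrate_month(shape_values: list[float], target_energy: float) -> list[float]:
    """Scale the hourly base shape so it sums exactly to the monthly target."""
    k = monthly_scale(shape_values, target_energy)
    return [k * s for s in shape_values]


# Toy example: shape sums to 4, target is 8, so k = 2.
hourly = calibrate_month([1.0, 2.0, 1.0], 8.0)
```

With hourly steps, the sum of hourly loads is the month's energy in the same energy unit as the target, so unit conversion (for example GWh to MWh) should happen before calling the function.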
Step 4 — County allocation
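For each hour $t$ and county $c$:

$$
L_{c,t} = \omega_c\, k_{m,y}\, s_t
$$

where $k_{m,y}\, s_t$ is the calibrated Johor hourly load from Step 3, with allocation weight

$$
\omega_c = \beta_{\text{pop}}\,\frac{\text{pop}_c}{\sum_{c'} \text{pop}_{c'}} + \beta_{\text{ind}}\,\frac{\text{ind}_c}{\sum_{c'} \text{ind}_{c'}} + \beta_{\text{DC}}\,\frac{\text{DC}_c}{\sum_{c'} \text{DC}_{c'}}
$$

Here pop, ind, and DC are county population, industrial park land area, and DC anchor MW. The DC weight defaults to 0.2 and rises mechanically as more DCs commission; the population and industrial weights share the remainder so that the county weights sum to 1.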
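
One way to sketch the county weights is as a convex blend of normalized shares. This is an assumed functional form, not a confirmed one: the function names are hypothetical, the 0.5 / 0.3 split between population and industry is a placeholder (only DC = 0.2 comes from the text), and counties "A" and "B" with their inputs are made-up toy values.

```python
def normalized_shares(values: dict[str, float]) -> dict[str, float]:
    """Convert raw county values into shares that sum to 1."""
    total = sum(values.values())
    if total <= 0:
        # No signal (e.g. no DC commissioned yet): fall back to a uniform split.
        return {c: 1.0 / len(values) for c in values}
    return {c: v / total for c, v in values.items()}


def county_weights(population, industrial, dc_mw, beta_pop, beta_ind, beta_dc):
    """Blend population, industrial, and DC shares; betas must sum to 1."""
    if abs(beta_pop + beta_ind + beta_dc - 1.0) > 1e-9:
        raise ValueError("blend weights must sum to 1 to conserve energy")
    pop = normalized_shares(population)
    ind = normalized_shares(industrial)
    dc = normalized_shares(dc_mw)
    return {c: beta_pop * pop[c] + beta_ind * ind[c] + beta_dc * dc[c] for c in population}


# Toy inputs: uniform industrial weights mirror the v0 choice in the Inputs table.
weights = county_weights(
    population={"A": 3.0, "B": 1.0},
    industrial={"A": 1.0, "B": 1.0},
    dc_mw={"A": 0.0, "B": 100.0},
    beta_pop=0.5, beta_ind=0.3, beta_dc=0.2,
)
```

Requiring the blend weights to sum to 1 is what keeps the county totals equal to the Johor monthly target after allocation.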
Calibration targets
| Target | Tolerance | Source |
|---|---|---|
| Sum across counties × hours per month = Johor monthly target (Step 2) | exact (algebraic by construction) | DOSM monthly |
| ETOU peak share of monthly energy ≥ 0.30 | ±5 pp | inferred from TNB tariff economics + 5y CDH ratio (peak/offpeak = 1.72×) |
| CDH sensitivity ∈ [0.5, 1.5] % per °C above base | as range | published industrial CDH studies for tropical Asia |
| Saturday/Sunday valley = 0.85 / 0.75 of weekday daily total | ±10 pp | analog reference (Singapore EMA shape) |
| 24-hour autocorrelation > 0.6 | as bound | smoothness sanity |
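
Two of these targets (ETOU peak share and 24-hour autocorrelation) are simple statistics of the hourly series. The sketch below shows one way to compute them; the function names are hypothetical, and the synthetic 10-day series is a toy input, not synthesizer output.

```python
def peak_energy_share(load: list[float], peak_mask: list[bool]) -> float:
    """Fraction of total energy that falls in hours flagged as ETOU peak."""
    peak = sum(l for l, is_peak in zip(load, peak_mask) if is_peak)
    return peak / sum(load)


def autocorrelation(series: list[float], lag: int) -> float:
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


# Toy series: 10 identical days, load 2 during hours 14-21 and 1 otherwise.
day = [2.0 if 14 <= h < 22 else 1.0 for h in range(24)]
series = day * 10
mask = [14 <= h % 24 < 22 for h in range(len(series))]

share = peak_energy_share(series, mask)
acf24 = autocorrelation(series, 24)
```

For a perfectly periodic series the biased estimator returns (n − lag)/n rather than 1, so the bound of 0.6 should be read against that ceiling when the test window is short.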
Validation strategy
- Total-energy reconciliation: the monthly sum of synthesized county loads must equal the Johor monthly target from Step 2 exactly (algebraic). Test on every month.
- Shape backtest against analog markets: compare hourly profile shape (peak hour, peak/valley ratio, weekend depth) against Singapore EMA system load — same climate, different demand structure. Target Pearson r > 0.7 on hour-of-week pattern.
- Cross-source weather sensitivity: re-run synthesizer using METAR temperatures instead of POWER. Output should be within 5% in monthly energy and within 1% in daily peak hour. This catches over-reliance on a single weather source.
- Single-customer ground truth (when obtained): any single Johor industrial consumer with a 15-min profile is gold. The synthesizer’s profile for that customer’s county should correlate r > 0.6 with the real curve at hourly resolution.
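
The shape backtest reduces to a Pearson correlation between two 168-point hour-of-week profiles. A minimal sketch, assuming an hour-of-week index where 0 is Monday 00:00 MYT (a convention chosen here, not stated in the source); function names are hypothetical and the inputs are synthetic.

```python
import math


def hour_of_week_profile(loads: list[float], hours_of_week: list[int]) -> list[float]:
    """Mean load for each of the 168 hours of the week."""
    sums = [0.0] * 168
    counts = [0] * 168
    for load, how in zip(loads, hours_of_week):
        sums[how] += load
        counts[how] += 1
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic two-week series and a linearly rescaled "analog" profile.
loads = [float(h % 24) for h in range(336)]
index = [h % 168 for h in range(336)]
profile = hour_of_week_profile(loads, index)
analog = [2.0 * v + 1.0 for v in profile]
r = pearson(profile, analog)
```

Pearson r is invariant to scale and offset, which suits this check: Singapore and Johor differ in absolute load, and only the shape is being compared.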
Empirical climate priors (5y backfill, JB centroid lat=1.5, lon=103.75)
These are baseline expectations the synthesizer should reproduce:
| Metric | Empirical value | Implication |
|---|---|---|
| Diurnal T range | 25.7 °C (04:00 MYT) → 28.5 °C (15:00 MYT) | ~3 °C swing — modest but consistent |
| Annual T range | 26.2 °C (Jan mean) → 28.5 °C (May mean) | secondary peak in Oct (27.7 °C) — equatorial bimodal |
| Mean GHI peak month | March (220 W/m² hourly mean) | NE monsoon transition |
| Mean GHI trough month | November (174 W/m² hourly mean) | NE monsoon onset clouds |
| Max instantaneous GHI | 989 W/m² | clear-sky boundary, March |
| ETOU peak window CDH (sum 5y) | 23,542 °C·h over 10,440 hrs | per-hour mean 2.25 °C·h |
| ETOU off-peak CDH (sum 5y) | 43,893 °C·h over 33,408 hrs | per-hour mean 1.31 °C·h |
| Peak/off-peak CDH ratio | 1.72× | physical justification for weighting the ETOU peak window above off-peak |
| POWER vs METAR T mean bias | +0.10 °C (POWER warmer) | small enough to use POWER directly without bias correction |
| POWER vs METAR T std delta | 1.87 °C | grid-cell vs point variance — expected |
| POWER vs METAR T correlation | 0.769 | use METAR for validation, not as primary input (point obs can miss spatial structure) |
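
The per-hour CDH means and the 1.72× ratio follow directly from the 5y sums in the table. A quick arithmetic check, using only the table's figures (variable names are illustrative):

```python
# 5y CDH sums and hour counts, copied from the table above.
peak_cdh, peak_hours = 23_542, 10_440
offpeak_cdh, offpeak_hours = 43_893, 33_408

peak_mean = peak_cdh / peak_hours            # mean CDH per peak hour
offpeak_mean = offpeak_cdh / offpeak_hours   # mean CDH per off-peak hour
ratio = peak_mean / offpeak_mean             # peak/off-peak CDH ratio

# The two windows together cover 1,827 days of hourly data,
# i.e. five calendar years including one leap year.
total_days = (peak_hours + offpeak_hours) / 24
```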
Empirical macro priors (DOSM 2018-06 to 2024-06)
| Metric | Value | Note |
|---|---|---|
| Peninsular local consumption 2018 | 145.2 TWh | pre-COVID baseline |
| Peninsular local consumption 2020 | 144.8 TWh | COVID dip |
| Peninsular local consumption 2023 | 163.3 TWh | full-year recovery + growth |
| Peninsular local H1 2024 annualized | ~173 TWh | +6% YoY; DC buildout signal |
| Johor share of peninsular GDP 2023 | 11.1% | starting allocation share |
| 2018→2023 CAGR of local | +2.4 % | sub-GDP-growth; efficiency gains and pre-DC |
| 2023→2024-H1 implied YoY | +6 % | DC ramp visible |
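
The growth figures in the table can be reproduced from the annual totals. A short check using only the table's values (variable names are illustrative):

```python
# Peninsular "local" consumption in TWh, copied from the table above.
twh_2018, twh_2023, twh_2024_annualized = 145.2, 163.3, 173.0

# Compound annual growth rate over the five years 2018 -> 2023.
cagr = (twh_2023 / twh_2018) ** (1 / 5) - 1

# Implied year-over-year growth from 2023 to annualized H1 2024.
yoy = twh_2024_annualized / twh_2023 - 1

cagr_pct = round(cagr * 100, 1)
yoy_pct = round(yoy * 100, 1)
```

The annualized 2024 figure is approximate (~173 TWh), so the YoY result of about 5.9% is consistent with the table's rounded +6%.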
Limitations and when each binds
| Limitation | When it binds | Mitigation |
|---|---|---|
| GDP-share allocation underweights Johor’s industrial intensity | Always; ~5 pp underestimate likely | Replace with ST annual report Johor-share when ingested (Stage 2 PDF parser) |
| County weights are crude (population + DC, no industrial mix) | Always for sub-state work | Manually curate industrial park MW estimates from IRDA / MIDA |
| No real customer profile to calibrate against | Until we obtain one | Treat synthesizer as a structural model, not a forecast |
| ICPT and tariff revisions not modeled | Stage 4 onwards (BTM economics) | Static reference YAML for now; ICPT ingester next |
| Public holiday calendar manual (annual update) | Each new year | Cron alert in Stage 2; for now we maintain in tnb_etou.yaml |
| ENEGEM clearing not in synthesizer (it’s downstream demand response) | Stage 5 dispatch sim | Modeled as price-driven export, not load |
| MERRA-2 native res ~0.5° — our 0.25° grid is interpolation | Spatial allocation across the 9 points | Use POWER for time-shape only; do not infer spatial gradients within JB |
| Saturday-as-weekend depends on customer (Johor state govt: Fri-Sat; private: Sat-Sun) | When modeling specific industrial customers | Default to Sat-Sun; flag DC anchors as 7-day |
Phased implementation
| Phase | What | Output |
|---|---|---|
| 3a | Build silver.weather_hourly view: join POWER (centroid) + METAR + derived MYT calendar | parquet view + DuckDB query layer |
| 3b | Coefficient fit via OLS: $(\alpha, w)$ vs implied monthly-disaggregated demand | gold.synth_coefficients + diagnostics report |
| 3c | County allocation as a separate transform that consumes 3b output + dc_tracker | gold.load_hourly_county |
| 3d | Validation harness: all calibration targets and validation checks above; produce a one-page red/green report each run | reports/load_synth_<run_id>.md |
Phases 3a and 3b are the MVP. 3c is gated on having county-level industrial park data (IRDA). 3d is mandatory before any downstream model consumes this output.
Decisions to NOT relitigate (until evidence forces a revisit)
- POWER as primary, METAR as validator only. 5y bias is +0.10 °C, too small to justify bias correction in production. METAR is for validation, not a fallback temperature feed.
- GDP-share allocation, not direct ST state consumption. ST publishes Johor state electricity in their annual report PDF, but it lags 12–18 months and the categorical breakdown changes year-over-year. GDP share is faster, more consistent, and can be bias-corrected once we obtain a single anchor calibration.
- Linear CDH model, not quadratic. Tropical demand is well-modelled by a piecewise-linear response above the base temperature $T_{\text{base}}$; quadratic terms over-fit for an equatorial 3 °C diurnal range.
- No HDH (heating) term. JB never sees heating demand.
- Public holidays treated as Sundays for ETOU. Aligns with TNB’s actual tariff schedule.