
Raw data sources & manifests

Every public dataset feeding this research is logged in the manifest table below. The site reads the table directly from a Cloudflare D1 database (jb-vpp-data), so the ledger is always live: re-running the ingest pipeline updates the page.

Why this matters for the IC: the model isn’t built on opinion or vendor brochures. Every number traces back to a public source URL, a timestamped ingest run, and a row count. If a regulator asks “where did this come from?” we answer with a hash and a URL.
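A provenance row of this kind could be assembled at ingest time roughly as follows. This is a minimal sketch, not the pipeline's actual code: the function name `build_manifest_row`, the field names, and the choice of SHA-256 are assumptions made for illustration. Only the idea of recording a source URL, an ingest timestamp, a row count, and a hash comes from the text.

```python
import hashlib
from datetime import datetime, timezone


def build_manifest_row(source, dataset, source_url, raw_bytes, row_count):
    """Return one provenance record for a freshly downloaded raw file.

    Hypothetical helper: field names mirror the manifest table columns,
    plus a content hash (SHA-256 is an assumption, not confirmed by the source).
    """
    return {
        "source": source,
        "dataset": dataset,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "source_url": source_url,
        "rows": row_count,
        "bytes": len(raw_bytes),
        "sha256": hashlib.sha256(raw_bytes).hexdigest(),
    }


# Toy payload standing in for a downloaded CSV (not real data).
payload = b"timestamp,t2m\n2024-01-01T00:00Z,26.1\n"
row = build_manifest_row(
    source="example-source",
    dataset="example-hourly",
    source_url="https://example.org/data.csv",
    raw_bytes=payload,
    row_count=1,
)
print(row["bytes"], row["sha256"][:12])
```

Hashing the raw bytes rather than a parsed table means the digest can be checked against a fresh download of the same URL without re-running any parsing step.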

All raw data sources — live from D1 via /api/manifests
Source | Dataset | Ingested at | Source URL | Rows | Bytes
(Rows are populated at page load.)
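For notebook work, the same ledger could be pulled from the endpoint and summarised. The sketch below assumes that /api/manifests returns a JSON array of objects keyed by the table columns (`source`, `dataset`, `rows`, `bytes`, and so on). That response shape, the `base_url` argument, and both function names are assumptions, not documented behaviour.

```python
import json
from urllib.request import urlopen


def fetch_manifests(base_url):
    """Fetch the manifest rows; assumes a JSON array response (unverified)."""
    with urlopen(f"{base_url}/api/manifests") as resp:
        return json.load(resp)


def summarise(manifests):
    """Total rows and bytes per source from a list of manifest dicts."""
    totals = {}
    for m in manifests:
        entry = totals.setdefault(m["source"], {"rows": 0, "bytes": 0})
        entry["rows"] += m["rows"]
        entry["bytes"] += m["bytes"]
    return totals


# Offline toy rows in the assumed shape (values are made up, not real data).
sample = [
    {"source": "A", "dataset": "a1", "rows": 10, "bytes": 200},
    {"source": "A", "dataset": "a2", "rows": 5, "bytes": 100},
    {"source": "B", "dataset": "b1", "rows": 7, "bytes": 70},
]
print(summarise(sample))
```

Keeping the network call separate from the summary step lets the summary run against saved rows when the database is not reachable.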
  • NASA POWER: 5-year hourly meteorology over a 9-point Johor grid (T2M, GHI, DNI, DHI, wind, RH). Timestamps verified as UTC.
  • Senai METAR (WMKJ): 5-year hourly METAR observations, cross-validated against NASA POWER (temperature bias 0.10 °C over 43,818 aligned hours).
  • DOSM: peninsular monthly electricity consumption and supply, GDP-state-real, population-state. Underlies the synthesised Johor load profile.
  • Ember: Malaysia monthly electricity mix (44% coal, 33% gas, 23% other → 631 gCO₂/kWh grid intensity).
  • TNB ICPT / ETOU / E-rates: manual extracts from official tariff PDFs, version-bumped when the source changes. Static reference YAML in reference/.
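
The cross-validation mentioned for Senai METAR amounts to aligning two hourly series on their UTC timestamps and averaging the temperature difference. A minimal sketch of that calculation follows. The function name, the dict-based input format, the sign convention (METAR minus NASA POWER), and the toy values are assumptions for illustration, and they do not reproduce the 0.10 °C figure.

```python
from statistics import mean


def temperature_bias(metar, power):
    """Mean (METAR - POWER) temperature over timestamps present in both.

    Both inputs map a UTC timestamp string to a temperature in deg C.
    Returns (bias, number_of_aligned_hours).
    """
    aligned = sorted(set(metar) & set(power))
    if not aligned:
        raise ValueError("no overlapping timestamps")
    diffs = [metar[t] - power[t] for t in aligned]
    return mean(diffs), len(aligned)


# Toy values only; not real observations.
metar = {
    "2024-01-01T00:00Z": 26.0,
    "2024-01-01T01:00Z": 25.5,
    "2024-01-01T02:00Z": 25.0,
}
power = {
    "2024-01-01T00:00Z": 25.8,
    "2024-01-01T01:00Z": 25.5,
    "2024-01-01T03:00Z": 24.0,
}

bias, n = temperature_bias(metar, power)
print(f"bias = {bias:.2f} deg C over {n} aligned hours")
```

Only hours present in both series count toward the average, which is why the number of aligned hours is reported alongside the bias.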

Below are the same datasets bundled directly into this site (no DB call required), useful for offline reading or notebook prototyping: