planning

Sprint Overview

Sprint tracking and deliverables. Updated after each sprint review.

Sprint 1: Setup (Feb 19 – Feb 26, 2026)

status: completed

goal

Set up the project infrastructure — repo, tooling, documentation, architecture decisions, and data structure analysis.

deliverables

• GitHub repo structured and scaffolded
• 36 issues created with labels and milestones
• GitHub Project board live
• Notion ↔ GitHub sync (auto + manual)
• Architecture diagram finalized
• Data structure fully documented from real JSON files
• Project website live on Vercel
• Stack decided: Python + custom watcher loop, PostgreSQL, KNIME, Power BI
• Architecture decisions recorded (ADR-001 to ADR-004)

done

✓ Create the repo
✓ Add teammates as collaborators
✓ Create labels and milestones
✓ Create 36 issues
✓ Link repo to Project board
✓ Link GitHub to Notion
✓ Verify everyone can connect to the VM
✓ Verify MySQL DB (pidb) connection works
✓ Verify access to sFTP / Meteo2 folder
✓ Verify access to Raspberry Pi / sensor JSON endpoint
✓ Look at actual JSON sensor files to understand the structure

pending / carry over

○ Install chosen tools (Python, KNIME, Power BI Desktop, etc.)
○ Set up Bronze folder structure on the VM
○ Gold layer storage engine decision
○ ETL orchestration tool decision
○ SAC export mechanism decision
○ Threshold for the presence derivation logic
○ Who owns which part
○ Definition of Done (DoD) & Definition of Ready (DoR)

Sprint 2: Bronze + Silver Pipeline (Feb 27 – Mar 5, 2026)

status: completed

goal

Build the full ingestion pipeline — Bronze raw files to Silver cleaned tables in PostgreSQL — for the two apartment sources and the MySQL master registry.

deliverables

• Bronze raw storage (#3): timestamped folder layout on the VM
• MySQL → Silver static metadata import (#4)
• Sensor JSON ingestion: Jimmy (#5) and Jérémie (#6)
• Weather forecast → raw store (#7)
• Silver clean storage (#13) + load flow with quality checks (#14)
• Fast-flow scheduler retrieving the latest data each minute (#11a)
• Watcher optimization: pipeline cycle time cut from 5 min to 6 s
• create_silver.py: auto DB creation + admin privileges
• Documentation: SETUP.md, ETL.md, ARCHITECTURE.md
• Architecture diagram updated
• Star schema v2 submitted for teacher feedback
• Dashboard mockups: tenant view + admin view

done

✓ Bronze folder structure on VM
✓ Watcher downloading JSON sensor files (Jimmy + Jérémie)
✓ MySQL → Silver static-metadata flow
✓ Silver tables created in PostgreSQL
✓ Flattening + cleaning pipeline working
✓ Deduplication logic implemented
✓ Watcher perf fix: cycle time cut from 5 min to 6 s
✓ create_silver.py with DB_ADMIN_URL support
✓ Documentation written (SETUP, ETL, ARCHITECTURE)
✓ Star schema v2 designed and sent for review
✓ Bronze predictive ingestion: ~5 ms .exists() check vs full SMB scan
✓ COPY-into-TEMP-TABLE upsert pattern adopted (50–150× faster than per-row INSERT)
✓ silver.etl_watermark table: idempotent re-runs from day one
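
The watermark idea in the last item can be illustrated with a small, database-free sketch. Everything here is an assumption made for illustration: the real silver.etl_watermark table lives in PostgreSQL and its columns are not described on this page, so `load_new_rows`, the `(timestamp, value)` row shape and the in-memory `watermarks` and `store` objects are hypothetical stand-ins.

```python
from datetime import datetime

def load_new_rows(rows, watermarks, source, store):
    """Load only rows newer than the stored watermark, then advance it.

    rows       : iterable of (timestamp, value) tuples (hypothetical shape)
    watermarks : dict mapping source -> last loaded timestamp
                 (stand-in for a watermark table)
    store      : list acting as the target table
    Returns the number of rows loaded in this run.
    """
    last = watermarks.get(source)  # None on the very first run
    fresh = sorted(r for r in rows if last is None or r[0] > last)
    store.extend(fresh)
    if fresh:
        watermarks[source] = fresh[-1][0]  # highest timestamp loaded so far
    return len(fresh)

rows = [
    (datetime(2026, 3, 1, 10, 0), 21.5),
    (datetime(2026, 3, 1, 10, 1), 21.6),
]
watermarks, store = {}, []
first = load_new_rows(rows, watermarks, "jimmy", store)
second = load_new_rows(rows, watermarks, "jimmy", store)  # re-run, same input
print(first, second, len(store))
```

The second call loads nothing because every row is at or below the stored watermark, which is what makes a re-run safe.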

pending / carry over

○ Gold layer (blocked on star schema v2 validation)
○ Weather Bronze→Silver flow (Sacha)
○ Slow-flow scheduler (#11b)
○ ML model exploration (Johann)
○ Dashboard mockup finalisation
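
For readers unfamiliar with the COPY-into-TEMP-TABLE upsert pattern adopted this sprint, the sketch below builds the three PostgreSQL statements such a pattern typically involves. It is not the project's code: `build_copy_upsert_sql`, the staging table name, and the example table and column names are all hypothetical, and running the statements through a database driver is left out.

```python
def build_copy_upsert_sql(target, columns, key_columns, staging="tmp_staging"):
    """Return the three statements of a COPY-into-temp-table upsert.

    1. create a temp table shaped like the target,
    2. bulk-load it with COPY ... FROM STDIN,
    3. merge it into the target with INSERT ... ON CONFLICT.
    Identifiers are interpolated as-is, so pass trusted names only.
    """
    cols = ", ".join(columns)
    keys = ", ".join(key_columns)
    updates = ", ".join(
        f"{c} = EXCLUDED.{c}" for c in columns if c not in key_columns
    )
    # With no non-key columns there is nothing to update: skip duplicates.
    action = f"DO UPDATE SET {updates}" if updates else "DO NOTHING"
    return [
        f"CREATE TEMP TABLE {staging} (LIKE {target} INCLUDING DEFAULTS) "
        "ON COMMIT DROP",
        f"COPY {staging} ({cols}) FROM STDIN WITH (FORMAT csv)",
        f"INSERT INTO {target} ({cols}) SELECT {cols} FROM {staging} "
        f"ON CONFLICT ({keys}) {action}",
    ]

statements = build_copy_upsert_sql(
    "silver.sensor_readings",        # hypothetical table name
    ["sensor_id", "ts", "value"],    # hypothetical columns
    ["sensor_id", "ts"],             # hypothetical unique key
)
for sql in statements:
    print(sql + ";")
```

All three statements are meant to run in one transaction, with the second one fed through the driver's COPY interface. The target needs a unique constraint on the key columns, and the staged batch should not contain the same key twice, otherwise PostgreSQL rejects the `ON CONFLICT ... DO UPDATE` step.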

Sprint 3: KNIME data import + watcher hardening (Mar 7 – Mar 13, 2026)

status: completed

goal

Wire the first half of the KNIME data flow (Silver → KNIME import), fix watcher edge-cases, refresh tooling — while Gold remained blocked on star-schema-v2 validation.

deliverables

• KNIME data flow: import side (#25a)
• create_silver.py: auto DB creation + admin privileges (DB_ADMIN_URL)
• Watcher fix: date-based filename comparison (DD.MM sorting bug)
• --scan flag added to watcher for full-rescan triggers
• Dashboard mockups: tenant + admin views
• Code review: Sacha's weather_download.py PR
• Per-flow workflow docs on the website

done

✓ #25a: Build a data flow to import (closed Mar 13)
✓ create_silver.py with DB_ADMIN_URL support
✓ Watcher DD.MM sort bug resolved
✓ --scan flag added to watcher
✓ Dashboard mockups delivered (tenant + admin)
✓ Code review of Sacha's weather_download.py PR
✓ Silver → Gold ETL workflow documented on the site
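
The DD.MM sorting bug is only named on this page, not described. One plausible reading is that filenames carrying a day.month fragment were compared as plain strings, which orders them by day first. The sketch below shows that failure mode and a date-based comparison; the filename pattern, the `date_key` helper and the assumption that all files fall within one year are hypothetical.

```python
import re

# Hypothetical naming scheme: day and month appear as "DD.MM" in the name.
DATE_IN_NAME = re.compile(r"(\d{2})\.(\d{2})")

def date_key(filename):
    """Return (month, day) parsed from a DD.MM fragment in the filename."""
    match = DATE_IN_NAME.search(filename)
    if match is None:
        raise ValueError(f"no DD.MM date found in {filename!r}")
    day, month = int(match.group(1)), int(match.group(2))
    return (month, day)

names = ["01.03.json", "28.02.json", "15.02.json"]

print(sorted(names))                # plain string sort: day-first order
print(sorted(names, key=date_key))  # chronological within one year
print(max(names, key=date_key))     # newest file by date
```

With a plain string comparison, `28.02.json` looks newer than `01.03.json`; comparing on the parsed `(month, day)` key restores the chronological order.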

pending / carry over

○ BLOCKER: Gold layer (star schema v2 sent for review, awaiting 2nd round of feedback)
○ weather_download.py: final fixes after PR review (Sacha)
○ Presence prediction model: KNIME workflow in progress (Johann)

Sprint 4: Gold layer build-out + ML kickoff (Mar 14 – Mar 27, 2026, 2 weeks)

status: completed

goal

Unblock Gold (proceed without final external feedback), iterate on clean_weather, kick off the first ML model-selection workflow on KNIME. Energy pricing decision required for the cost dimension.

deliverables

• Gold layer implementation, first cut (Dehlya)
• Energy pricing decision: Oiken tariffs (both apartments on Oiken's network)
• clean_weather.py implementation (Sacha)
• Code review for Sacha's Bronze→Silver weather work (Dehlya)
• Presence model selection workflow on KNIME (#26a)
• Scrum management updates (Johann)

done

✓ #26a: Build a workflow to select the best presence model (closed Mar 26)
✓ Gold layer scaffolding in place
✓ Energy pricing: Oiken tariffs adopted
✓ clean_weather.py: first implementation
✓ Bronze→Silver weather code review
✓ ML: model selection workflow for presence prediction
✓ Star schema v2 finalised after teacher feedback
✓ dim_tariff design with provider × year grain (Oiken 2023–2025 @ 0.34 CHF/kWh)
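
As a worked example of the provider × year grain, the sketch below looks up a tariff by `(provider, year)` and multiplies it by a consumption figure. Only the provider, the years and the 0.34 CHF/kWh rate come from this page; the `TARIFFS` dictionary, the `energy_cost_chf` helper and the 12.5 kWh input are hypothetical and do not reflect the real dim_tariff columns.

```python
from decimal import Decimal

# Tariff rows keyed by (provider, year); the rate is the one quoted above.
TARIFFS = {("Oiken", year): Decimal("0.34") for year in (2023, 2024, 2025)}

def energy_cost_chf(provider, year, kwh):
    """Return the cost in CHF for `kwh` consumed under the matching tariff."""
    try:
        rate = TARIFFS[(provider, year)]
    except KeyError:
        raise LookupError(f"no tariff for {provider} in {year}") from None
    # Convert through str so a float such as 12.5 becomes an exact decimal.
    return Decimal(str(kwh)) * rate

print(energy_cost_chf("Oiken", 2024, 12.5))  # 12.5 kWh is a made-up input
```

Keying on the pair rather than on the provider alone means a later price change only needs a new row for the new year, and existing cost figures stay tied to the rate that applied at the time.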

pending / carry over

○ Gold fact tables: implementation continuing
○ weather_download.py: date-handling issues
○ Energy consumption prediction model: to start

Sprint 5: Gold facts + clean_weather refinement (Mar 28 – Apr 2, 2026)

status: completed

goal

Push Gold facts forward, fix the weather pipeline's date/path handling, advance both ML models, decide on prediction storage strategy.

deliverables

• Gold fact tables: implementation in progress (Dehlya)
• clean_weather.py: updated after code review (Sacha)
• Decision on prediction storage strategy (history vs overwrite)
• Presence + energy prediction models: continued work (Johann)

done

✓ Code reviews exchanged across the team
✓ Gold fact tables progressing
✓ clean_weather.py: second pass after review feedback
✓ Decision: predictions kept as history (no overwrite on re-run)
✓ sFTP folder-selection bug identified; fix scheduled for next sprint
✓ Weather pipeline date-format issue scoped

pending / carry over

○ sFTP folder selection: date/path bug to resolve
○ Date formatting in the weather pipeline
○ ML presence + energy models: finalise
○ Gold layer completion: fact tables, materialised view

Sprint 6: Gold + Weather + KNIME ML landing (Apr 3 – Apr 17, 2026, 2 weeks)

status: completed

goal

Land Gold end-to-end, complete both weather pipelines, scaffold both KNIME ML workflows. The biggest single sprint — twelve issues closed.

deliverables

• Gold OLAP modelisation (#15)
• Gold database created (#16)
• Silver → Gold ETL flow (#17)
• Weather Sources → Bronze (#7a)
• Weather Bronze → Silver (#7b)
• Weather raw-store flow (#7): full chain wired
• Slow-flow scheduler: daily weather + nightly catch-up (#11b)
• Presence model: workflow with selected model (#26a follow-up)
• Energy model: best-model-selection workflow (#27a)
• Energy model: workflow with selected model (#27b)
• KNIME data flow: export side (#25b)
• Mockups delivered (#12)

done

✓ #7: Weather raw-store flow
✓ #7a: Weather Sources → Bronze
✓ #7b: Weather Bronze → Silver
✓ #11b: Slow-flow scheduler
✓ #12: Mockups
✓ #15: OLAP modelisation
✓ #16: Gold database
✓ #17: Silver → Gold ETL flow
✓ #25b: Data flow to export (KNIME)
✓ #26a: Workflow with selected presence model
✓ #27a: Best-model-selection workflow (energy)
✓ #27b: Workflow with selected energy model
✓ Gold tables: 7 dims + 5 facts + mv_energy_with_cost
✓ 9-step populate process (populate_dimensions, populate_sensors, populate_weather)
✓ KNIME Variable → Credentials pattern adopted for runtime credential injection
✓ silver.weather_forecasts with weather_watermark for idempotent re-runs
✓ scripts/run_knime_predictions.py: batch-mode invocation with stdout/stderr capture
✓ sFTP folder-selection bug fixed alongside the weather pipeline

pending / carry over

○ Predictions back to Gold (next sprint: #28)
○ Power BI dashboards (next sprint: #19, #20, #29)
○ Row-level security (next sprint: #24)

Sprint 7: Predictions → DB + Power BI dashboards + RLS (Apr 18 – Apr 24, 2026)

status: completed

goal

Persist KNIME predictions back to Gold, build the three Power BI dashboards, implement row-level security per apartment.

deliverables

• Predictions written back to gold.fact_prediction_motion / fact_prediction_consumption (#28)
• Power BI energy consumption dashboard (#19)
• Power BI environment dashboard: temperature, humidity, CO₂, door/window status (#20)
• Power BI prediction visualisation dashboard (#29)
• Power BI row-level security: apartment-scoped views (#24)
• KNIME data export/import flow: both directions complete (#25)
• Presence model running headless via run_knime_predictions.py (#26)
• Energy/consumption model running headless (#27)

done

✓ #19: Power BI energy dashboard
✓ #20: Power BI environment dashboard
✓ #24: Row-level security per apartment
✓ #25: KNIME data flow export/import
✓ #26: Presence prediction model in production
✓ #27: Energy consumption prediction model in production
✓ #28: Predictions loaded back to data warehouse
✓ #29: Power BI prediction dashboard

pending / carry over

○ Anonymisation / masking (next sprint: #18)
○ GDPR writeup (next sprint: #32)
○ Scalability forecast (next sprint: #31)
○ Customer deployment package (next sprint: #33)

Sprint 8: Anonymisation + GDPR + Scalability + Packaging (Apr 25 – May 1, 2026)

status: completed

goal

Wrap up customer-facing deliverables: anonymisation, GDPR/ethics writeup, scalability forecast, and the self-contained installer for the customer environment.

deliverables

• Data masking / anonymisation at the silver → gold boundary (#18)
• GDPR & ethics written assessment (#32)
• Scalability forecast: storage + compute projections (#31)
• Full data-flow scheduling: all flows in one watcher process (#11)
• Self-contained customer deployment package: Python installer (#33)
• Compress-after-silver: 10–15× bronze shrink while preserving audit trail
• Postgres tuning (shared_buffers = 4 GB) + COPY-into-TEMP-TABLE upsert
• Drop-constraint backfill script: backfill cut from ~4 h on first install to ~6 min on re-run
• Power BI filters across the dashboards (Sacha)
• User guide written (Sacha)
• Scrum management, project coordination, documentation review, slide support (Johann)

done

✓ #11: Full scheduling end-to-end
✓ #18: Anonymisation (building_name → 'Building <id>', owner_user_id stripped, first names kept as RLS pseudonyms; ADR-005)
✓ #31: Scalability forecast delivered
✓ #32: GDPR / ethics assessment delivered
✓ #33: Customer installer tested end-to-end on the VM
✓ Compress-after-silver shipped (with .json.gz read-back support)
✓ Postgres tuning + COPY upsert path
✓ fast_silver_backfill.py drop-constraint script + pre-flight duplicate check
✓ Sacha: Power BI filters across all dashboards
✓ Sacha: user guide written
✓ Johann: scrum management, project coordination
✓ Johann: documentation contributions and review
✓ Johann: presentation slides structure + content support
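
The anonymisation rule recorded for #18 can be sketched as a small row transform. Only building_name and owner_user_id are named on this page; the `building_id` and `first_name` keys, the `anonymise_building` helper and the sample values are hypothetical, and the real masking happens at the silver → gold boundary rather than on Python dictionaries.

```python
def anonymise_building(row):
    """Return a masked copy of a building row for the gold layer.

    - building_name is replaced by 'Building <id>'
    - owner_user_id is dropped
    - every other field (including the first name) passes through unchanged
    """
    masked = {k: v for k, v in row.items() if k != "owner_user_id"}
    masked["building_name"] = f"Building {row['building_id']}"
    return masked

silver_row = {
    "building_id": 7,                  # hypothetical key column
    "building_name": "Example House",  # made-up value
    "owner_user_id": 42,
    "first_name": "Jimmy",             # hypothetical column name
}
gold_row = anonymise_building(silver_row)
print(gold_row)
```

The function returns a new dictionary and leaves the input untouched, so the silver-side record stays complete while the gold-side copy carries only the masked fields.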

pending / carry over

○ Defense prep (next sprint: #36)
○ User guide polish (next sprint: #34)
○ Technical documentation polish (next sprint: #35)
○ Decision: SAP SAC track (#21, #22) abandoned; Power BI focus for the defense
○ Decision: storage encryption (#9), monitoring/alerts (#10), full IAM (#23), i18n (#30) deferred to future work

Sprint 9: Defense Prep (May 2 – May 8, 2026)

status: current

goal

Polish the customer-facing deliverables (technical doc, installation guide, user guide), finalise the slide deck, rehearse the defense.

deliverables

• Technical documentation polish (#35): Word + Markdown, 19 chapters
• Installation guide polish (#34): embedded screenshots
• User guide polish: review pass against the actual UI
• Defense slide deck (#36): Bronze→Gold deep-dive owned by Dehlya (slides 7–13), with speaker notes
• Personal cheatfile / study packet for Q&A (19 sections, ~600 lines)
• Final installer end-to-end test on the VM
• Last-minute fixes from the test run (e.g. .json.gz read-back, KNIME version pin)

done

✓ Documentation regenerated to .docx with screenshots
✓ Slide deck polished (slides 7–13) with technical speaker notes
✓ AI declaration paragraph added to all customer-facing docs
✓ KNIME workflows pinned to the VM's KNIME version (5.8)
✓ Bronze .json.gz read-back path fixed (audit-trail story now real)
✓ /scrum/devops page removed (production deployment was abandoned)
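
The .json.gz read-back mentioned above comes down to one requirement: a reader must accept a bronze file whether or not it has been compressed. A minimal sketch using only the standard library follows; the `compress_json` and `read_bronze_json` helpers, the file name and the payload are hypothetical and are not taken from the project's code.

```python
import gzip
import json
import tempfile
from pathlib import Path

def compress_json(path):
    """Gzip a .json file next to itself and delete the original."""
    path = Path(path)
    gz_path = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        dst.write(src.read())
    path.unlink()
    return gz_path

def read_bronze_json(path):
    """Load a bronze file whether it is plain .json or compressed .json.gz."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)

with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "reading.json"                   # hypothetical file name
    payload = {"sensor": "demo", "temperature": 21.5}  # made-up content
    raw.write_text(json.dumps(payload), encoding="utf-8")

    before = read_bronze_json(raw)
    archived = compress_json(raw)
    after = read_bronze_json(archived)
    print(archived.name, before == after)
```

Choosing the opener from the file suffix keeps a single read path for both states, so files compressed after the silver load remain usable as an audit trail.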

pending / carry over

○ Final rehearsal of the defense
○ Last polish pass on installer prompt copy ("5-course tasting menu ☕")
○ Light test coverage (#37): idempotency tested by re-run rather than pytest

Sprints are bounded by the weekly Friday review meetings — see the meeting log for the agenda and decisions from each. Sprints 4 and 6 ran two weeks each; the other sprints were one-week cycles.