USE CASE 02 — Group 14, HES-SO Valais

Turning raw IoT signals into apartment intelligence

An end-to-end data pipeline processing 245,000+ sensor files from two smart apartments in Valais, Switzerland. Energy monitoring, environment tracking, presence detection — from raw JSON to business dashboards.

sources: JSON / CSV / MySQL
bronze: 245k files
silver: 15M+ rows
gold: star schema
bi / ml: dashboards

245k+

sensor files ingested

15M+

rows in Silver

6

sensor types

2

smart apartments

~6s

watcher cycle

4

Gold fact tables

the challenge

Smart apartments generate massive amounts of data

Every minute, each apartment produces a JSON file containing readings from plugs, motion sensors, door/window sensors, meteo stations, humidity sensors, and consumption meters. Over 10 weeks, that's 245,000+ files and 15 million sensor readings.

Raw data is useless without a pipeline

Nested JSON, inconsistent room names, outlier values, missing readings, sentinel values. The data needs to be ingested, cleaned, normalized, aggregated, and structured into a star schema before BI tools and ML models can consume it.
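As an illustration of the cleaning steps listed above, here is a minimal Python sketch. The payload shape, the room alias table, the sentinel value (-999.0), and the valid range are all assumptions made for this example; the real sensor files may be structured differently.

```python
from typing import Any, Iterator

# Hypothetical values: the real alias table, sentinel, and valid range
# are not given in the text above.
ROOM_ALIASES = {"living room": "living_room", "bdroom": "bedroom"}
SENTINEL = -999.0
VALID_RANGE = (-30.0, 60.0)  # plausible indoor temperature range in degrees Celsius


def normalize_room(name: str) -> str:
    """Map inconsistent room spellings to one canonical snake_case name."""
    key = name.strip().lower()
    return ROOM_ALIASES.get(key, key.replace(" ", "_"))


def flatten(payload: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield one flat row per reading from a nested {room: {sensor: value}} payload."""
    for room, sensors in payload["rooms"].items():
        for sensor, value in sensors.items():
            # Sentinel values become missing readings instead of fake numbers.
            if value == SENTINEL:
                value = None
            # Outliers are flagged, not dropped, so they stay auditable.
            is_outlier = value is not None and not (
                VALID_RANGE[0] <= value <= VALID_RANGE[1]
            )
            yield {
                "ts": payload["ts"],
                "room": normalize_room(room),
                "sensor": sensor,
                "value": value,
                "is_outlier": is_outlier,
            }


sample = {
    "ts": "2026-01-15T10:32:00",
    "rooms": {
        "Living Room": {"temperature": 21.5},
        "Bdroom": {"temperature": -999.0},
        "kitchen": {"temperature": 140.0},
    },
}
rows = list(flatten(sample))
```

Flagging outliers rather than deleting them keeps the cleaned layer at full resolution while still letting downstream queries filter them out.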

what we deliver

Four analytics domains

01

Energy Monitoring

Track power consumption per device, per room, per apartment. Watt-hour to kWh conversion, cost estimation, anomaly detection.

fact_energy_minute
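
The watt-hour to kWh conversion and cost estimation can be sketched in a few lines of Python. The tariff below is a made-up figure for illustration, and the assumption that each reading is the energy used during one minute is ours, not something the pipeline confirms.

```python
WH_PER_KWH = 1000
PRICE_CHF_PER_KWH = 0.30  # hypothetical tariff, not an actual Valais price


def wh_to_kwh(wh: float) -> float:
    """Convert watt-hours to kilowatt-hours."""
    return wh / WH_PER_KWH


def estimate_cost(wh: float, price_per_kwh: float = PRICE_CHF_PER_KWH) -> float:
    """Estimated cost in CHF for a given energy amount in watt-hours."""
    return round(wh_to_kwh(wh) * price_per_kwh, 4)


# Assumed per-minute energy readings for one plug, in watt-hours.
minute_readings_wh = [375.0, 375.0, 375.0, 375.0]
total_wh = sum(minute_readings_wh)
total_kwh = wh_to_kwh(total_wh)
cost_chf = estimate_cost(total_wh)
```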

02

Environment Tracking

Temperature, humidity, CO2, noise, atmospheric pressure. Window and door status. Indoor climate quality at a glance.

fact_environment_minute

03

Presence Detection

Motion sensors and door activity combined into a presence signal per room. ML-ready for predictive occupancy models.

fact_presence_minute
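
One simple way to combine motion and door activity into a per-room presence signal is shown below. The rule used here (a room counts as occupied in any minute with at least one motion or door event) is an assumption for illustration; the actual rule behind fact_presence_minute may be more refined.

```python
from typing import Iterable

Event = tuple[str, str, str]  # (minute, room, kind); hypothetical event shape


def presence_signal(
    events: Iterable[Event], minutes: list[str], rooms: list[str]
) -> dict[tuple[str, str], int]:
    """Return 1 for each (minute, room) with motion or door activity, else 0."""
    active = {
        (minute, room)
        for minute, room, kind in events
        if kind in {"motion", "door"}
    }
    # Emit a value for every minute and room so the signal has no gaps.
    return {(m, r): int((m, r) in active) for m in minutes for r in rooms}


events = [
    ("10:00", "kitchen", "motion"),
    ("10:01", "hall", "door"),
    ("10:01", "kitchen", "battery"),  # not a presence event, ignored
]
signal = presence_signal(events, ["10:00", "10:01"], ["kitchen", "hall"])
```

Producing an explicit 0 for quiet minutes gives ML models a dense, regularly spaced series to train on.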

04

Sensor Health

Battery levels, uptime tracking, error detection. Know when a sensor is failing before it stops reporting.

fact_device_health_day
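
A sensor health check could look roughly like the Python sketch below. Both thresholds are hypothetical placeholders chosen for the example, not values taken from the project.

```python
from datetime import datetime, timedelta

LOW_BATTERY_PCT = 20                 # hypothetical threshold
MAX_SILENCE = timedelta(minutes=15)  # hypothetical threshold


def health_status(battery_pct: float, last_seen: datetime, now: datetime) -> str:
    """Classify a sensor as 'silent', 'low_battery', or 'ok'."""
    # A sensor that stopped reporting is the most urgent case, so check it first.
    if now - last_seen > MAX_SILENCE:
        return "silent"
    if battery_pct < LOW_BATTERY_PCT:
        return "low_battery"
    return "ok"


now = datetime(2026, 1, 15, 12, 0)
status_ok = health_status(80, now - timedelta(minutes=1), now)
status_low = health_status(12, now - timedelta(minutes=1), now)
status_silent = health_status(80, now - timedelta(hours=2), now)
```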

the pipeline

Medallion architecture

Data flows through five stages, each adding structure and value. Every script is idempotent and resume-capable — interrupt and re-run safely at any point.

01
Sources: 3 external sources

SMB share (sensor JSON, every minute), MySQL (static metadata, 10 tables), and SFTP (weather CSV, daily).

02
Bronze: Raw, immutable

245k+ JSON files stored in a timestamped folder structure. Never modified, always auditable.
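
A write-once Bronze store can be sketched as follows. The folder layout (apartment/year/month/day/HHMM.json) and the function names are assumptions for illustration; the text only says the structure is timestamped.

```python
from datetime import datetime
from pathlib import Path


def bronze_path(root: Path, apartment: str, ts: datetime) -> Path:
    """Build a timestamped target path (hypothetical layout)."""
    return root / apartment / f"{ts:%Y}" / f"{ts:%m}" / f"{ts:%d}" / f"{ts:%H%M}.json"


def store_raw(root: Path, apartment: str, ts: datetime, raw: bytes) -> bool:
    """Write the raw bytes once; return False if the file already exists."""
    target = bronze_path(root, apartment, ts)
    if target.exists():
        return False  # never overwrite: Bronze files are immutable
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(raw)
    return True
```

Because an existing file is never overwritten, re-running the ingestion after an interruption is safe.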

03
Silver: Clean, full resolution

15M+ rows in PostgreSQL. Deduplicated, normalized, outliers flagged. Watermark-based incremental processing.

04
Gold: Business-ready

Star schema with minute-grain fact tables. 5 dimensions, 4 facts. Optimized for BI and ML consumption.
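
To show how a minute-grain fact table is queried through a dimension, here is a small sketch using Python's built-in sqlite3. Only the fact table name fact_energy_minute comes from this page; the dim_room table and all column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical dimension: one row per room.
    CREATE TABLE dim_room (room_key INTEGER PRIMARY KEY, apartment TEXT, room TEXT);
    -- Fact table at minute grain, pointing at the dimension by key.
    CREATE TABLE fact_energy_minute (
        minute_ts TEXT,
        room_key INTEGER REFERENCES dim_room(room_key),
        energy_wh REAL
    );
    INSERT INTO dim_room VALUES (1, 'apt_a', 'kitchen'), (2, 'apt_a', 'bedroom');
    INSERT INTO fact_energy_minute VALUES
        ('2026-01-15T10:00', 1, 10.0),
        ('2026-01-15T10:01', 1, 15.0),
        ('2026-01-15T10:00', 2, 5.0);
""")

# Typical BI query: aggregate the fact, label it with dimension attributes.
kwh_per_room = conn.execute("""
    SELECT d.room, SUM(f.energy_wh) / 1000.0 AS energy_kwh
    FROM fact_energy_minute AS f
    JOIN dim_room AS d USING (room_key)
    GROUP BY d.room
    ORDER BY d.room
""").fetchall()
```

Keeping descriptive attributes in dimensions means the fact tables stay narrow, which helps when they hold millions of minute-level rows.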

05
BI / ML: Insights

Power BI dashboards, SAP Analytics Cloud, KNIME prediction models. Row-level security per apartment.

get started

Deploy in minutes, maintain with confidence

Whether you're a building manager or a data engineer, we've got you covered.

Quick Install

coming soon

No technical knowledge required. Fill in a form, download the installer, run it. Guided .env configuration, automatic schema creation, one-click pipeline start.

download installer
configure .env

Developer Setup

available

Full control. Clone the repo, configure your .env manually, run each script step by step. 10 commands from zero to a running pipeline. Detailed setup guide included.

view setup guide →

the team

Group 14

Dehlya

Data Engineer & Architect

Pipeline, Gold layer, website, orchestration, deployment

Sacha

Data Engineer

Weather ingestion, BI dashboards, security

Johann

Data Analyst & Scientist

ML models (KNIME), Power BI reports, user guide

HES-SO Valais / Wallis — Haute Ecole de Gestion — 64-61 Data Cycle — 2026
