data-cycle / uc2
Technical Documentation
Apartment Domotics & IoT Sensors — End-to-end data pipeline
version: 1.0
date: April 2, 2026
team: Group 14, HES-SO Valais
project: 64-61 Data Cycle
System Overview
The Data Cycle project (Use Case 2) processes IoT sensor data from two smart apartments in Valais, Switzerland. The system ingests raw sensor readings (temperature, humidity, CO2, motion, door/window, energy consumption) and weather forecasts, transforms them through a medallion architecture (Bronze, Silver, Gold), and serves them to BI dashboards and ML models.
| key | value |
|---|---|
| Project | Data Cycle -- UC2 Apartment Domotics & IoT Sensors |
| Team | Group 14 -- HES-SO Valais, Haute Ecole de Gestion, 2026 |
| Data period | August 18 -- October 27, 2023 |
| Apartments | 2 (Jimmy, Jeremie) -- ~245,000 JSON sensor files |
| Data volume | 15M+ rows in Silver, 6 sensor types, 4 weather measurements |
| VM | Windows 11, on-premise (school network) |
| Repository | github.com/dehlya/data-cycle-domotic |
Architecture
The system follows the medallion architecture pattern with three data layers, each serving a distinct purpose. Data flows from external sources through Bronze (raw), Silver (cleaned), and Gold (aggregated) before reaching BI tools and ML models.
| layer | purpose | storage | grain |
|---|---|---|---|
| Bronze | Raw, immutable copy of source data | File system (NTFS) | Per-file (JSON, CSV) |
| Silver | Cleaned, deduplicated, full-resolution data | PostgreSQL (silver schema) | Per-reading (1 row per sensor per minute) |
| Gold | Aggregated, business-ready star schema | PostgreSQL (gold schema) | Per-minute (facts) and per-day (health) |
Key principle—Every ETL script is idempotent and resume-capable. Scripts can be interrupted and re-run safely without creating duplicates, thanks to watermark tracking and ON CONFLICT upserts.
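The watermark-plus-upsert pattern can be sketched with SQLite's compatible ON CONFLICT syntax (illustrative table definition, not the project's real DDL; PostgreSQL accepts the same statement):

```python
import sqlite3

# In-memory stand-in for silver.sensor_events (illustrative columns only;
# PostgreSQL accepts the same ON CONFLICT DO UPDATE statement).
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE sensor_events (
        apartment TEXT, room TEXT, sensor_type TEXT, field TEXT,
        value REAL, timestamp TEXT,
        UNIQUE (apartment, room, sensor_type, field, timestamp)
    )
""")

def upsert(row):
    # Re-running the same row updates in place instead of duplicating,
    # which is what makes an interrupted script safe to restart.
    con.execute("""
        INSERT INTO sensor_events
            (apartment, room, sensor_type, field, value, timestamp)
        VALUES (?, ?, ?, ?, ?, ?)
        ON CONFLICT (apartment, room, sensor_type, field, timestamp)
        DO UPDATE SET value = excluded.value
    """, row)

row = ("jimmy", "Kitchen", "plug", "power", 42.0, "2023-08-18T10:00:00Z")
upsert(row)
upsert(row)  # second run leaves exactly one row behind
count = con.execute("SELECT COUNT(*) FROM sensor_events").fetchone()[0]
```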
Technology Stack
| layer | technology | version | purpose |
|---|---|---|---|
| Ingestion | Python | 3.11+ | All ETL scripts, async I/O, parallel processing |
| Orchestration | watcher.py | custom | 60s loop, prediction-based, nightly scan, daily weather |
| Bronze storage | NTFS file system | Windows 11 | Timestamped directory structure, immutable raw files |
| Silver / Gold | PostgreSQL | 15+ | Relational DB with schema separation (silver, gold) |
| ETL | pandas + SQLAlchemy | 2.1 / 2.0 | DataFrame transforms, parameterized SQL, connection pooling |
| sFTP | paramiko | 3.4 | SSH transport for weather CSV download |
| BI | Power BI + SAP SAC | latest | Energy/environment dashboards + presence analytics |
| ML | KNIME | latest | Energy forecasting, presence prediction workflows |
| Website | Next.js + Tailwind | 14.1 / 3 | Project documentation and team hub |
Data Sources
source 1 — sensor JSON (SMB share)
Two apartments (Jimmy, Jeremie) generate one JSON file per minute per apartment, stored on an SMB network share (Z:\). Each file contains nested sensor readings: plugs, doors/windows, motions, meteo stations, humidity sensors, and consumption meters.
Filename format—DD.MM.YYYY HHMM_ApartmentName_received.json — e.g. 18.08.2023 1000_JimmyLoup_received.json
| apartment (filename) | mapped to | bronze path |
|---|---|---|
| JimmyLoup | jimmy | bronze/jimmy/YYYY/MM/DD/HH/ |
| JeremieVianin | jeremie | bronze/jeremie/YYYY/MM/DD/HH/ |
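A filename of this shape can be split into a timestamp, a normalized apartment slug, and a Bronze destination path. `parse_sensor_filename` below is a hypothetical helper for illustration, not the project's actual code:

```python
from datetime import datetime

# Filename-to-slug mapping from the table above.
APARTMENT_MAP = {"JimmyLoup": "jimmy", "JeremieVianin": "jeremie"}

def parse_sensor_filename(name: str):
    """Split 'DD.MM.YYYY HHMM_ApartmentName_received.json' into a
    timestamp, an apartment slug, and a Bronze destination path.
    Hypothetical helper for illustration."""
    stamp, apartment, _suffix = name.removesuffix(".json").split("_")
    ts = datetime.strptime(stamp, "%d.%m.%Y %H%M")
    slug = APARTMENT_MAP[apartment]
    return ts, slug, f"bronze/{slug}/{ts:%Y/%m/%d/%H}/{name}"

ts, slug, path = parse_sensor_filename("18.08.2023 1000_JimmyLoup_received.json")
```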
JSON structure (top-level fields)
| field | type | description |
|---|---|---|
| user | string | Apartment owner identifier |
| api_token | string | Auth token (masked/removed in Silver) |
| datetime | string | Collection timestamp — DD.MM.YYYY HH:MM |
| plugs | object | Smart plugs keyed by room name |
| doorsWindows | object | Door/window sensors keyed by room name |
| motions | object | Motion sensors keyed by room name |
| meteos | object | Environmental sensors nested under meteo key |
| humidities | object | Humidity sensors keyed by room name |
| consumptions | object | Whole-house energy consumption keyed by House |
sensor types — fields extracted per category
| sensor type | JSON key | fields extracted | units |
|---|---|---|---|
| plug | plugs | power, total, temperature | W, Wh, °C |
| door | doorsWindows (type=door) | open, battery | bool, % |
| window | doorsWindows (type=window) | open, battery | bool, % |
| motion | motions | motion, light, temperature | bool, lux, °C |
| meteo | meteos.meteo | temperature_c, co2_ppm, humidity_pct, noise_db, pressure_hpa, battery | °C, ppm, %, dB, hPa, % |
| humidity | humidities | temperature, humidity, battery | °C, %, % |
| consumption | consumptions | total_power, power1-3, current1-3, voltage1-3 | W, A, V |
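The per-category extraction can be sketched as a small flatten() covering two of the seven sensor types (the real function also handles motions, meteos, humidities, and consumptions, and attaches units):

```python
def flatten(doc: dict):
    """Extract (room, sensor_type, field, value) rows from one raw JSON
    document. Sketch covering two of the seven categories."""
    rows = []
    for room, plug in doc.get("plugs", {}).items():
        for field in ("power", "total", "temperature"):
            if field in plug:
                rows.append((room, "plug", field, plug[field]))
    for room, sensor in doc.get("doorsWindows", {}).items():
        kind = sensor.get("type", "door")  # splits into door vs window rows
        for field in ("open", "battery"):
            if field in sensor:
                rows.append((room, kind, field, sensor[field]))
    return rows

sample = {
    "plugs": {"Kitchen": {"power": 12.5, "total": 840}},
    "doorsWindows": {"Bedroom": {"type": "window", "open": True, "battery": 88}},
}
rows = flatten(sample)
```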
source 2 — MySQL database (school network)
Static reference tables from the school’s MySQL database (pidb at 10.130.25.152:3306). Contains building metadata, room details, sensor mappings, device inventories, and energy profiles.
| MySQL table | Silver table | description |
|---|---|---|
| buildings | dim_buildings | Apartment metadata, location, building year |
| buildingtype | dim_building_types | Maison / Appartement lookup |
| rooms | dim_rooms | Room details, sensor counts, orientation, m² |
| sensors | dim_sensors | Sensor IPs mapped to rooms |
| devices | dim_devices | Appliances per room (fridge, washer, etc.) |
| profilereference | ref_energy_profiles | Reference energy consumption kWh/yr |
| profile | ref_power_snapshots | Power consumption snapshots over time |
| parameters | ref_parameters | Threshold configs per building |
| parameterstype | ref_parameters_type | Parameter type lookup |
| dierrors | log_sensor_errors | Sensor error logs — null values, failures |
GDPR—Tables skipped: users (names, emails, passwords, phones), actions, achievements, badges, events, eventsgeneric, categories, userrelationships — personal data or irrelevant for analytics.
source 3 — weather CSV (sFTP)
Daily weather forecast files from an sFTP server (/Meteo2). File format: Pred_YYYY-MM-DD.csv. Each CSV has columns: Time, Value, Prediction, Site, Measurement, Unit.
| measurement code | mapped column | unit |
|---|---|---|
| PRED_T_2M_ctrl | temperature_c | °C |
| PRED_RELHUM_2M_ctrl | humidity_pct | % |
| PRED_TOT_PREC_ctrl | precipitation_mm | mm |
| PRED_GLOB_ctrl | radiation_wm2 | W/m² |
Sentinel value—-99999.0 in the Value column means missing data. Converted to NULL during cleaning.
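Sentinel handling can be illustrated with the standard library alone; the real pipeline does the same substitution in pandas before loading silver.weather_clean:

```python
import csv
import io

SENTINEL = -99999.0  # provider's missing-data marker

def clean_values(csv_text: str):
    """Parse weather CSV rows and convert the sentinel to None, which
    becomes NULL on load. Illustrative subset of clean_weather.py."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = float(row["Value"])
        row["Value"] = None if value == SENTINEL else value
        cleaned.append(row)
    return cleaned

sample = (
    "Time,Value,Measurement\n"
    "2023-08-18T00:00Z,-99999.0,PRED_T_2M_ctrl\n"
    "2023-08-18T01:00Z,17.2,PRED_T_2M_ctrl\n"
)
rows = clean_values(sample)
```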
Pipeline: Sources to Bronze
watcher.py — orchestrator
Continuous loop running every 60 seconds. Instead of scanning 245k+ files on SMB (~72s), it predicts the next expected filenames from the last known timestamp and checks with .exists() calls (~0.01s). A nightly full scan at midnight catches any missed files. Daily at 07:30, it triggers the weather pipeline as a subprocess.
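The prediction step amounts to generating the next expected filenames from the last known timestamp; predict_candidates is a hypothetical name, and the real watcher then tests each candidate with Path.exists():

```python
from datetime import datetime, timedelta

def predict_candidates(last_seen: datetime, apartment: str, horizon_min: int = 5):
    """Generate the filenames expected after the last known timestamp.
    Hypothetical sketch: the real watcher checks each name with
    Path.exists() instead of scanning the whole share."""
    return [
        f"{last_seen + timedelta(minutes=i):%d.%m.%Y %H%M}_{apartment}_received.json"
        for i in range(1, horizon_min + 1)
    ]

cands = predict_candidates(datetime(2023, 8, 18, 10, 0), "JimmyLoup", 2)
```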
bulk_to_bronze.py — sensor file copy
Copies new JSON files from SMB to local Bronze storage. Two modes: prediction (default, fast) and full scan (--full flag). Uses 16 parallel threads. Files are stored in timestamped folders: bronze/jimmy/YYYY/MM/DD/HH/filename.json. Skips files that already exist locally.
weather_download.py — sFTP download
Connects to the sFTP server via paramiko, lists CSV files in the remote directory, downloads new ones to bronze/weather/YYYY/MM/DD/. Configurable retry logic (default: 3 attempts, 600s delay). Logs to logs/weather_download.log.
| metric | prediction mode | full scan mode |
|---|---|---|
| Files on SMB | 246,000 | 246,000 |
| New files = 2 | 0.4s | ~80s |
| New files = 1,245 | 20s | ~85s |
Pipeline: Bronze to Silver
flatten_sensors.py — JSON to sensor_events
Parses Bronze JSON files, extracts all sensor readings, normalizes room names (Bhroom to Bathroom, Bdroom to Bedroom, etc.), flags outliers, and upserts into silver.sensor_events. Uses 8 parallel workers with batches of 2,000 files. Watermark tracking via silver.etl_watermark prevents reprocessing.
1. Load watermark -- reads processed filenames from silver.etl_watermark
2. Find new files (newest first) -- walks Bronze folders backwards, stops after 50 consecutive already-processed files
3. Parallel processing -- splits into batches of 2,000, processed by 8 workers (ProcessPoolExecutor)
4. Parse -- each worker opens the JSON, parses the datetime, and calls flatten() to extract all sensor readings
5. Upsert -- inserts into silver.sensor_events with ON CONFLICT DO UPDATE
6. Mark done -- adds processed filenames to the watermark table
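The newest-first discovery with early stop can be sketched as follows (hypothetical helper; the real script reads the watermark set from silver.etl_watermark):

```python
def find_new_files(candidates, watermark, stop_after=50):
    """Walk filenames newest-first and collect those not yet in the
    watermark set; a run of already-processed names ends the scan."""
    fresh, seen_streak = [], 0
    for name in sorted(candidates, reverse=True):
        if name in watermark:
            seen_streak += 1
            if seen_streak >= stop_after:
                break  # everything older is assumed already processed
        else:
            seen_streak = 0
            fresh.append(name)
    return fresh

new = find_new_files(["a.json", "b.json", "c.json"], {"a.json", "b.json"}, stop_after=2)
```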
room normalization
Raw JSON room names are inconsistent. The flatten function maps them to standardized names:
| raw name | normalized |
|---|---|
| Bhroom | Bathroom |
| Bdroom | Bedroom |
| Livingroom | Living Room |
| Office | Office |
| Kitchen | Kitchen |
| Laundry | Laundry |
| Outdoor | Outdoor |
| House | House |
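In code, the mapping is just a lookup with a pass-through default (sketch with a subset of the table above):

```python
# Subset of the normalization table; unmapped names pass through unchanged.
ROOM_MAP = {
    "Bhroom": "Bathroom",
    "Bdroom": "Bedroom",
    "Livingroom": "Living Room",
}

def normalize_room(raw: str) -> str:
    return ROOM_MAP.get(raw, raw)

normalized = [normalize_room(r) for r in ("Bhroom", "Kitchen")]
```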
outlier detection bounds
Values outside these physical bounds are flagged with is_outlier = TRUE but not removed. This preserves data for audit while allowing BI/Gold to filter.
| field | min | max | unit |
|---|---|---|---|
| temperature_c | -20 | 60 | °C |
| humidity_pct | 0 | 100 | % |
| co2_ppm | 300 | 5,000 | ppm |
| noise_db | 0 | 140 | dB |
| pressure_hpa | 870 | 1,085 | hPa |
| power | 0 | 10,000 | W |
| battery | 0 | 100 | % |
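The flag-don't-drop rule reduces to a bounds lookup; fields without configured bounds are never flagged (sketch using a subset of the bounds above):

```python
# Subset of the physical bounds table above: field -> (min, max).
BOUNDS = {
    "temperature_c": (-20, 60),
    "humidity_pct": (0, 100),
    "co2_ppm": (300, 5000),
}

def is_outlier(field: str, value: float) -> bool:
    """Flag readings outside physical bounds; rows are kept either way,
    only the is_outlier column is set."""
    lo, hi = BOUNDS.get(field, (float("-inf"), float("inf")))
    return not (lo <= value <= hi)
```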
import_mysql_to_silver.py — dimension tables
Snapshots 10 reference tables from the school MySQL database into PostgreSQL Silver. All columns imported as TEXT (type casting done in Gold). Idempotent: DROP + CREATE on each run.
clean_weather.py — weather CSV processing
Reads raw weather CSVs from Bronze and validates the required columns (Time, Value, Prediction, Site, Measurement, Unit). Timestamps are parsed as UTC, rows before WEATHER_MIN_YEAR are dropped, and duplicates are removed by timestamp/site/measurement. The four measurement codes are mapped to standardized columns, sentinel values (-99999.0) are replaced with NULL, and the data is pivoted from long to wide format. Outliers are flagged, then rows are upserted into silver.weather_clean. Watermark tracking uses silver.weather_watermark; row-drop counts at each step are logged to logs/clean_weather.log.
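The long-to-wide pivot can be sketched without pandas; the real script uses a DataFrame pivot, but the shape change is the same ("Sion" below is an assumed site name for illustration):

```python
# Measurement-code mapping from the source 3 table.
MEASUREMENT_MAP = {
    "PRED_T_2M_ctrl": "temperature_c",
    "PRED_RELHUM_2M_ctrl": "humidity_pct",
    "PRED_TOT_PREC_ctrl": "precipitation_mm",
    "PRED_GLOB_ctrl": "radiation_wm2",
}

def pivot_long_to_wide(rows):
    """Collapse long-format (timestamp, site, measurement, value) rows
    into one dict per (timestamp, site) key. Stand-in for the pandas
    pivot in clean_weather.py."""
    wide = {}
    for ts, site, measurement, value in rows:
        col = MEASUREMENT_MAP.get(measurement)
        if col is None:
            continue  # unknown measurement codes are dropped
        wide.setdefault((ts, site), {})[col] = value
    return wide

wide = pivot_long_to_wide([
    ("2023-08-18T00:00Z", "Sion", "PRED_T_2M_ctrl", 17.2),
    ("2023-08-18T00:00Z", "Sion", "PRED_RELHUM_2M_ctrl", 64.0),
])
```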
weather outlier bounds
| field | min | max |
|---|---|---|
| temperature_c | -50 | 60 |
| humidity_pct | 0 | 100 |
| precipitation_mm | 0 | 500 |
| radiation_wm2 | 0 | 1,500 |
| scenario | files | rows | time |
|---|---|---|---|
| First run (full backfill) | 245,000 | 15M+ | ~3.5 hours |
| Incremental (after watcher) | 2-20 | 100-1,500 | 2-4 seconds |
| Nothing new | 0 | 0 | 1-2 seconds |
Pipeline: Silver to Gold
create_gold.py — schema creation
Creates the gold schema with admin privileges, then creates all dimension and fact tables as the app user (so the app user owns them). Idempotent: uses CREATE TABLE IF NOT EXISTS. Creates indexes on date_key, apartment_key, room_key for fast BI queries.
populate_gold.py — 9-step process
1. dim_date -- extract unique dates from sensor_events, compute day_of_week, week, month, year, is_weekend
2. dim_datetime -- extract minute-level timestamps, truncate to minute grain, compute time attributes
3. dim_apartment -- distinct apartments, enriched from silver.dim_buildings (building_name, building_id)
4. dim_room -- distinct rooms per apartment, linked via apartment_key
5. dim_device -- synthetic device_id (apartment_room_sensortype), linked to room_key
6. fact_energy_minute -- GROUP BY minute/device: MAX(power), MAX(total energy), Wh to kWh conversion
7. fact_environment_minute -- GROUP BY minute/room: MAX per metric, BOOL_OR for window/door flags
8. fact_presence_minute -- GROUP BY minute/room: SUM(motion), BOOL_OR(motion OR door open) = presence
9. fact_device_health_day -- GROUP BY day/device: MIN/AVG battery percentage
Pattern—All fact tables use GROUP BY date_trunc('minute', timestamp) with CASE WHEN pivots. Upserts via ON CONFLICT DO UPDATE. Currently a full reload each run -- future: watermark-based incremental.
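The GROUP BY / CASE WHEN pattern can be illustrated with SQLite, whose strftime stands in for PostgreSQL's date_trunc('minute', ...) (illustrative schema, not the real Silver DDL):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Minimal stand-in for silver.sensor_events (illustrative columns only).
con.execute("CREATE TABLE sensor_events (room TEXT, field TEXT, value REAL, ts TEXT)")
con.executemany("INSERT INTO sensor_events VALUES (?, ?, ?, ?)", [
    ("Kitchen", "temperature_c", 21.0, "2023-08-18 10:00:10"),
    ("Kitchen", "temperature_c", 21.4, "2023-08-18 10:00:40"),
    ("Kitchen", "co2_ppm", 560.0, "2023-08-18 10:00:40"),
])

# strftime truncates to the minute, playing the role of date_trunc('minute', ts);
# the CASE WHEN pivot turns one row per field into one column per field.
row = con.execute("""
    SELECT strftime('%Y-%m-%d %H:%M', ts) AS minute, room,
           MAX(CASE WHEN field = 'temperature_c' THEN value END) AS temperature_c,
           MAX(CASE WHEN field = 'co2_ppm' THEN value END) AS co2_ppm
    FROM sensor_events
    GROUP BY minute, room
""").fetchone()
```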
Pipeline: Gold to BI / ML
BI dashboards (planned)
| dashboard | tool | fact table | key metrics |
|---|---|---|---|
| Energy | Power BI | fact_energy_minute | Power (W), energy (kWh), cost (CHF) |
| Environment | Power BI | fact_environment_minute | Temperature, humidity, CO2, noise, pressure |
| Presence | SAP Analytics Cloud | fact_presence_minute | Motion count, presence flag, door activity |
| Sensor Health | Power BI | fact_device_health_day | Battery levels, uptime, error counts |
row-level security (RLS)
Each apartment owner only sees their own data. Implemented via Power BI RLS roles filtering on apartment_key. Three roles: role_jimmy, role_jeremie (filtered), and role_admin (unfiltered, all apartments).
ML models (in progress)
Built with KNIME Analytics Platform. Two models: energy consumption prediction (considering weather forecasts and historical patterns) and room presence prediction (from motion/door sensor patterns). Results will be loaded back into gold.fact_ml_predictions (future table).
Database Schemas
silver.sensor_events (main fact table)
| column | type | description |
|---|---|---|
| id | BIGSERIAL | Primary key |
| apartment | VARCHAR(20) | 'jimmy' or 'jeremie' |
| room | VARCHAR(50) | Normalized: 'Bathroom', 'Kitchen', 'Living Room', etc. |
| sensor_type | VARCHAR(20) | 'plug', 'motion', 'meteo', 'door', 'window', 'humidity', 'consumption' |
| field | VARCHAR(50) | 'power', 'temperature_c', 'co2_ppm', 'open', 'motion', etc. |
| value | FLOAT | The sensor reading |
| unit | VARCHAR(10) | 'W', '°C', 'ppm', '%', 'bool', etc. |
| timestamp | TIMESTAMPTZ | When the reading was taken (UTC) |
| is_outlier | BOOLEAN | TRUE if value outside physical bounds |
Constraints—UNIQUE (apartment, room, sensor_type, field, timestamp). Indexes on timestamp, apartment, sensor_type.
silver.weather_clean
| column | type | description |
|---|---|---|
| id | BIGSERIAL | Primary key |
| timestamp | TIMESTAMPTZ | Forecast timestamp (UTC) |
| site | VARCHAR(50) | Weather station name |
| temperature_c | FLOAT | Predicted temperature |
| humidity_pct | FLOAT | Predicted humidity |
| precipitation_mm | FLOAT | Predicted precipitation |
| radiation_wm2 | FLOAT | Predicted solar radiation |
| is_outlier | BOOLEAN | TRUE if any value outside bounds |
Constraints—UNIQUE (timestamp, site). Index on timestamp.
silver — other tables
| table | purpose | source |
|---|---|---|
| etl_watermark | Tracks processed sensor JSON filenames | flatten_sensors.py |
| weather_watermark | Tracks processed weather CSV filenames | clean_weather.py |
| dim_buildings | Building metadata | MySQL |
| dim_building_types | Maison / Appartement lookup | MySQL |
| dim_rooms | Room details, sensor counts, m² | MySQL |
| dim_sensors | Sensor IPs mapped to rooms | MySQL |
| dim_devices | Appliances per room | MySQL |
| ref_energy_profiles | Reference kWh/yr by type | MySQL |
| ref_power_snapshots | Power consumption snapshots | MySQL |
| ref_parameters | Threshold configs per building | MySQL |
| ref_parameters_type | Parameter type lookup | MySQL |
| log_sensor_errors | Sensor error logs | MySQL |
gold schema — dimension tables
| table | grain | PK | unique constraint | key columns |
|---|---|---|---|---|
| dim_datetime | 1 minute | BIGINT (YYYYMMDDHHMM) | timestamp_utc | date_key, hour, minute, day_of_week, week, month, year, is_weekend |
| dim_date | 1 day | INT (YYYYMMDD) | date | day_of_week, week, month, year, is_weekend, is_holiday |
| dim_apartment | apartment | SERIAL | apartment_id | name, building_id, building_name, floor |
| dim_room | room | SERIAL | (room_name, apartment_key) | room_type, apartment_key (FK) |
| dim_device | device | SERIAL | (device_id, sensor_type) | room_key (FK), device_type, is_active |
gold schema — fact tables
| table | grain | PK | FK references | measures |
|---|---|---|---|---|
| fact_energy_minute | 1 min / device | (datetime_key, device_key) | dim_datetime, dim_date, dim_device, dim_room, dim_apartment | power_w, energy_wh, energy_kwh, cost_chf, counter_total, is_valid |
| fact_environment_minute | 1 min / room | (datetime_key, room_key) | dim_datetime, dim_date, dim_room, dim_apartment | temperature_c, humidity_pct, co2_ppm, noise_db, pressure_hpa, window_open_flag, door_open_flag, is_anomaly |
| fact_presence_minute | 1 min / room | (datetime_key, room_key) | dim_datetime, dim_date, dim_room, dim_apartment | motion_count, door_open_flag, presence_flag, presence_prob (NULL until ML) |
| fact_device_health_day | 1 day / device | (date_key, device_key) | dim_date, dim_device, dim_room, dim_apartment | error_count, missing_readings, uptime_pct, battery_min_pct, battery_avg_pct |
Indexes—All fact tables have indexes on date_key and apartment_key for fast BI filtering. Energy also indexed on room_key.
Scripts Reference
| script | path | description | flags |
|---|---|---|---|
| watcher.py | ingestion/fast_flow/ | Main loop: sensor ingestion (60s) + weather (daily) | --scan, --weather |
| bulk_to_bronze.py | ingestion/fast_flow/ | Copy sensor JSON from SMB to Bronze | --full |
| flatten_sensors.py | etl/bronze_to_silver/ | Bronze JSON to silver.sensor_events | |
| create_silver.py | etl/bronze_to_silver/ | Create Silver schema + tables | |
| import_mysql_to_silver.py | etl/bronze_to_silver/ | MySQL dimensions to Silver | |
| weather_download.py | ingestion/slow_flow/ | Download weather CSV from sFTP | |
| clean_weather.py | etl/bronze_to_silver/ | Clean weather CSV to silver.weather_clean | |
| create_gold.py | etl/silver_to_gold/ | Create Gold star schema | |
| populate_gold.py | etl/silver_to_gold/ | Populate Gold dimensions + facts | |
Configuration
Security—The .env file contains credentials and must never be committed to git. It is listed in .gitignore. Share securely (e.g. via Teams, encrypted channel).
| variable | required | description |
|---|---|---|
| DB_URL | yes | PostgreSQL app user connection string |
| DB_ADMIN_URL | optional | Admin connection for schema/DB creation (first run only) |
| MYSQL_URL | yes | MySQL source for dimension tables (school network) |
| SMB_PATH | yes | Mounted SMB share path (sensor JSON files) |
| BRONZE_ROOT | yes | Local Bronze storage folder |
| SFTP_HOST | optional | sFTP server for weather data |
| SFTP_PORT | optional | sFTP port (default: 22) |
| SFTP_USER | optional | sFTP username |
| SFTP_PASSWORD | optional | sFTP password |
| SFTP_PATH | optional | Remote sFTP directory (e.g. /Meteo2) |
| WEATHER_MIN_YEAR | optional | Ignore weather data before this year (default: 2023) |
| WEATHER_HOUR | optional | Hour to trigger daily weather pipeline (default: 7) |
| WEATHER_MIN | optional | Minute to trigger daily weather pipeline (default: 30) |
| LOG_DIR | optional | Directory for ETL log files (default: logs/) |
| SFTP_MAX_RETRIES | optional | Max sFTP connection retries (default: 3) |
| SFTP_RETRY_DELAY | optional | Seconds between retries (default: 600) |
Deployment
Development
Database: domotic_dev
Config: .env
Purpose: testing changes, schema experiments
Production
Database: domotic_prod
Config: .env.prod
Purpose: BI dashboards connect here, stable data
deployment steps (dev to prod)
1. Pull latest code on VM: git pull origin main
2. Run schema scripts against prod: python create_silver.py / create_gold.py
3. Run import_mysql_to_silver.py against prod (refreshes dimension tables)
4. Run populate_gold.py against prod (rebuilds fact tables from Silver)
5. Verify row counts: SELECT COUNT(*) FROM gold.fact_energy_minute;
6. Restart the watcher if running: stop the old process, start a new one with .env.prod
Services & Processes
| service | description | start | stop |
|---|---|---|---|
| watcher.py | Main pipeline orchestrator | python ingestion/fast_flow/watcher.py | Ctrl+C |
| PostgreSQL | Silver + Gold storage | Windows service (auto) | services.msc |
| Next.js site | Documentation website | Auto-deploys on git push (Vercel) | n/a |
Normal operation—Only the watcher needs to be running. PostgreSQL starts automatically as a Windows service. The website is hosted on Vercel and deploys automatically on git push to main.
Monitoring & Logs
| log file | source | what to check |
|---|---|---|
| logs/weather_download.log | weather_download.py | sFTP connection errors, download failures, file counts |
| logs/clean_weather.log | clean_weather.py | Row drop counts per step, outlier counts, processing errors |
| stdout (terminal) | watcher.py | Pipeline runs, skip counts, timing, weather triggers |
| stdout (terminal) | flatten_sensors.py | Batch progress, row counts, errors per file |
health checks
- Is the watcher running? Check the terminal or process list for python watcher.py
- Is data flowing? SELECT MAX(timestamp) FROM silver.sensor_events -- should be recent
- Is Gold up to date? SELECT COUNT(*) FROM gold.fact_energy_minute -- should match expectations
- Any weather errors? Check logs/weather_download.log for recent entries
- Disk space? Check the storage\bronze size -- grows ~2GB/month
Troubleshooting
| problem | cause | fix |
|---|---|---|
| Watcher says 'SMB path not found' | Z: drive not mounted | Map network drive in File Explorer or check VPN |
| flatten_sensors.py is slow | First run processes ~245k files | Normal -- takes ~3.5h. Subsequent runs are seconds. |
| Schema creation fails with permission error | App user can't create schema | Set DB_ADMIN_URL in .env with superuser credentials |
| weather_download.py connection refused | sFTP unreachable or wrong credentials | Check SFTP_HOST, SFTP_USER, SFTP_PASSWORD. Check VPN. |
| populate_gold.py returns 0 rows | Silver is empty | Run flatten_sensors.py first |
| Duplicate key errors in upserts | Normal behavior | ON CONFLICT handles this. Not an error. |
| Log files not appearing | logs/ directory issue | Scripts auto-create it. Check LOG_DIR and write permissions. |
Maintenance
| task | frequency | steps |
|---|---|---|
| Rotate database password | Quarterly | Update PostgreSQL password, update DB_URL in .env, restart watcher |
| Rotate sFTP credentials | On change | Update SFTP_USER / SFTP_PASSWORD in .env, restart watcher |
| Check disk space (Bronze) | Monthly | Bronze grows ~2GB/month. Archive old years if needed. |
| PostgreSQL VACUUM | Weekly (auto) | autovacuum enabled by default. Manual: VACUUM ANALYZE silver.sensor_events; |
| Review log files | Weekly | Check logs/weather_download.log and logs/clean_weather.log |
| Backup database | Daily | pg_dump -Fc domotic_prod > backup_YYYYMMDD.dump |
Security
Credentials
All credentials stored in .env (gitignored). No hardcoded passwords in source code. Shared via secure channels only (Teams, not email).
SQL injection
All database queries use SQLAlchemy parameterized statements. No string concatenation or f-strings in SQL.
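The parameterized pattern is shown here with the standard library's DB-API placeholders; SQLAlchemy's text() bind parameters behave the same way. An injection attempt is bound as a literal string and matches nothing:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sensor_events (apartment TEXT, value REAL)")
con.execute("INSERT INTO sensor_events VALUES ('jimmy', 1.0)")

# The attempted injection is bound as a literal parameter, so it matches
# no rows instead of rewriting the query.
user_input = "jimmy' OR '1'='1"
matched = con.execute(
    "SELECT COUNT(*) FROM sensor_events WHERE apartment = ?",
    (user_input,),
).fetchone()[0]
```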
Data privacy (GDPR)
MySQL tables containing personal data (users, relationships, emails, passwords) are explicitly skipped during import. Apartment names used as pseudonymized identifiers.
Network
VM runs on school internal network. SMB and MySQL are internal-only. sFTP weather uses encrypted SSH transport (paramiko). No public endpoints.
Architecture Decisions
decision 1 -- Python for parallel ETL
Context: Need to process 245k+ files from SMB with I/O-bound operations (file copy, DB writes).
Decision: Use Python with ThreadPoolExecutor for parallel file copy and ProcessPoolExecutor for CPU-bound JSON parsing; asyncio is reserved for future real-time ingestion.
Consequence: Team familiarity and a rich ecosystem (pandas, SQLAlchemy, paramiko), with acceptable performance for batch processing.
decision 2 -- PostgreSQL with schema separation
Context: Need a relational DB that supports schema separation, OLAP queries, and a Power BI connector.
Decision: PostgreSQL 15+ with separate silver and gold schemas in the same database, and two databases for dev/prod isolation.
Consequence: Free, multi-user, native Power BI connector, excellent OLAP performance. Hosted on the VM.
decision 3 -- lightweight watcher instead of Airflow
Context: Evaluated Apache Airflow for orchestration, but the single-VM deployment made it overkill.
Decision: Lightweight Python watcher script with a 60s loop, prediction-based file discovery, and a daily weather trigger.
Consequence: Zero infrastructure overhead, easy to understand and maintain. Trade-off: no built-in alerting or DAG visualization.
decision 4 -- file system as the Bronze layer
Context: Need immutable, auditable raw data storage that is easy to inspect and replay.
Decision: Local NTFS file system with a timestamped directory structure (YYYY/MM/DD/HH/).
Consequence: Zero overhead, easy to browse in File Explorer, cloud-agnostic. Trade-off: no built-in versioning.