🔧 Data Quality · June 8, 2026

Building Digital Twins: Why the Data Model Is the Hard Part

Digital-twin platforms are easy to buy and hard to feed. Why the underlying building data model — not the visualisation — decides whether a twin actually works.

A building digital twin is supposed to be the live, queryable mirror of a physical building: every asset, every sensor, every space, connected the way they are in reality, so you can ask questions and run analytics across the whole thing. The demos are seductive — a 3D model, live data flowing, fault alerts surfacing themselves. What the demos quietly assume is the part that actually decides whether a twin succeeds: a clean, structured, validated data model underneath. That model is the hard part, and it's where most twin projects stall.

A twin is only as good as its graph

Strip away the visualisation and a digital twin is a graph: nodes for equipment, points and spaces, and edges for the relationships between them. Every useful question a twin answers is really a graph query:

"show me every VAV downstream of AHU-3"
"which zones are affected if this chiller trips?"
"compare energy use across every air-handling unit in the portfolio"

None of these can be answered by a 3D model or a pile of live values. They can only be answered if the relationships are present and correct in the underlying graph. A twin with a beautiful interface and a broken graph is a very expensive dashboard.
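To make the point concrete, here is a minimal sketch of the first question above as a graph traversal. The equipment names, the type labels and the `FEEDS` edge table are all made up for illustration; a real twin would hold this in a graph store rather than Python dictionaries.

```python
from collections import deque

# Hypothetical toy model: node id -> equipment type.
NODE_TYPES = {
    "CH-1": "Chiller",
    "AHU-3": "AHU",
    "VAV-3-01": "VAV",
    "VAV-3-02": "VAV",
    "ZONE-101": "Zone",
    "ZONE-102": "Zone",
}

# Directed "feeds" edges, upstream -> list of downstream nodes.
FEEDS = {
    "CH-1": ["AHU-3"],
    "AHU-3": ["VAV-3-01", "VAV-3-02"],
    "VAV-3-01": ["ZONE-101"],
    "VAV-3-02": ["ZONE-102"],
}


def downstream(start, feeds):
    """Return every node reachable from start by following feeds edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in feeds.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def downstream_of_type(start, wanted_type):
    """Filter the downstream set to one equipment type, sorted for stable output."""
    return sorted(n for n in downstream(start, FEEDS) if NODE_TYPES[n] == wanted_type)


print(downstream_of_type("AHU-3", "VAV"))  # every VAV downstream of AHU-3
print(downstream_of_type("CH-1", "Zone"))  # zones affected if the chiller trips
```

Delete one entry from `FEEDS` and both answers silently shrink, which is exactly the failure a 3D model cannot reveal.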

This is why an ontology like the Brick Schema sits at the centre of serious twin work: it gives the graph standard types and typed relationships, so the same query runs across any building modelled to the standard.
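A rough illustration of why shared types matter: the sketch below stores two invented buildings as plain (subject, predicate, object) tuples and runs one query over both. The class and relationship names follow Brick's naming style but are treated here as ordinary strings, so check them against the Brick version you target. The building prefixes and equipment names are hypothetical.

```python
# Two hypothetical buildings with different local naming but one shared vocabulary.
BUILDING_A = [
    ("bldgA:AHU-3", "rdf:type", "brick:AHU"),
    ("bldgA:VAV-3-01", "rdf:type", "brick:VAV"),
    ("bldgA:AHU-3", "brick:feeds", "bldgA:VAV-3-01"),
]

BUILDING_B = [
    ("bldgB:RoofUnit_North", "rdf:type", "brick:AHU"),
    ("bldgB:Box_2F_East", "rdf:type", "brick:VAV"),
    ("bldgB:Box_2F_West", "rdf:type", "brick:VAV"),
    ("bldgB:RoofUnit_North", "brick:feeds", "bldgB:Box_2F_East"),
    ("bldgB:RoofUnit_North", "brick:feeds", "bldgB:Box_2F_West"),
]


def vavs_fed_by_ahus(triples):
    """Return sorted (ahu, vav) pairs, matching on type rather than on local names."""
    types = {s: o for s, p, o in triples if p == "rdf:type"}
    return sorted(
        (s, o)
        for s, p, o in triples
        if p == "brick:feeds"
        and types.get(s) == "brick:AHU"
        and types.get(o) == "brick:VAV"
    )


# The same function works on both buildings because it never looks at the names.
for building in (BUILDING_A, BUILDING_B):
    print(vavs_fed_by_ahus(building))
```

In practice this would be a SPARQL query against an RDF store; the tuple list only stands in for that to keep the example dependency-free.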

The data doesn't arrive twin-ready

Here's the gap the platform vendors don't dwell on. The twin platform is the easy purchase. Feeding it is the project.

Real building data arrives as a raw BACnet scrape full of cryptic abbreviations, a controls contractor's equipment schedule, a BMS export, and as-built drawings that disagree with all three. Before the twin can mirror anything, every point has to be classified, every piece of equipment related to its points and parent systems, every space placed, and the whole thing validated. That mapping-and-cleaning work is unglamorous, slow, and exactly where the timeline goes.
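As a sketch of what the classification step can look like, the snippet below splits raw point names on common separators and looks up the last token in an abbreviation table. The abbreviations, the naming convention and the target type names are assumptions for illustration; every site has its own conventions, and anything the table does not recognise should go to a human rather than be guessed.

```python
import re

# Hypothetical abbreviation table; a real project builds this per site.
ABBREVIATIONS = {
    "SAT": "Supply_Air_Temperature_Sensor",
    "RAT": "Return_Air_Temperature_Sensor",
    "ZNT": "Zone_Air_Temperature_Sensor",
}


def classify_point(raw_name):
    """Return (equipment_hint, point_type); point_type is None if unrecognised."""
    tokens = [t for t in re.split(r"[.\-_:/\s]+", raw_name.upper()) if t]
    if len(tokens) < 2:
        return (None, None)
    # Assumption: the last token is the point abbreviation, the rest names the equipment.
    equipment_hint = "".join(tokens[:-1])
    return (equipment_hint, ABBREVIATIONS.get(tokens[-1]))


def triage(raw_names):
    """Split names into confidently classified points and a manual review queue."""
    classified, needs_review = {}, []
    for name in raw_names:
        hint, point_type = classify_point(name)
        if point_type is None:
            needs_review.append(name)
        else:
            classified[name] = (hint, point_type)
    return classified, needs_review


classified, needs_review = triage(["AHU3.SAT", "ahu-3_rat", "AI-17"])
print(classified)
print(needs_review)
```

The review queue is the important part: a lookup table handles the easy majority, and the cryptic remainder is where the slow work goes.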

Load that data in raw and the twin doesn't fail loudly; it fails subtly. The model renders, data flows, and the answers are quietly wrong because a sensor is attached to the wrong unit or a feed relationship is missing. Those errors propagate into every rule and alert the twin generates. It's the building-data equivalent of pushing untested code into a live system: cheap to prevent, expensive to unpick after the fact.
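Both failure modes named above can be caught with simple checks before loading. The sketch below is one possible shape for them, using invented data; the rule that a point's name should start with its owner's id is a hypothetical naming heuristic, not a general standard.

```python
def validate(equipment, feeds, point_owner):
    """Return human-readable issues.

    equipment:   equipment id -> type
    feeds:       list of (upstream, downstream) pairs
    point_owner: point name -> equipment id it is attached to
    """
    issues = []
    fed = {down for _, down in feeds}

    # Every feeds edge must connect two known pieces of equipment.
    for up, down in feeds:
        for node in (up, down):
            if node not in equipment:
                issues.append(f"feeds edge {up} -> {down}: unknown equipment {node}")

    # Every VAV should have something upstream of it.
    for eq_id, eq_type in sorted(equipment.items()):
        if eq_type == "VAV" and eq_id not in fed:
            issues.append(f"{eq_id}: VAV has no upstream feed")

    # Every point must hang off known equipment, and (hypothetical heuristic)
    # its name should start with that equipment's id.
    for point, owner in sorted(point_owner.items()):
        if owner not in equipment:
            issues.append(f"{point}: attached to unknown equipment {owner}")
        elif not point.startswith(owner + "."):
            issues.append(f"{point}: name does not match owner {owner}")
    return issues


EQUIPMENT = {"AHU-3": "AHU", "VAV-3-01": "VAV", "VAV-3-02": "VAV"}
FEEDS = [("AHU-3", "VAV-3-01")]  # the edge to VAV-3-02 is missing
POINT_OWNER = {
    "VAV-3-01.ZNT": "VAV-3-01",
    "VAV-3-02.ZNT": "VAV-3-01",  # sensor attached to the wrong unit
}

ISSUES = validate(EQUIPMENT, FEEDS, POINT_OWNER)
for issue in ISSUES:
    print(issue)
```

Neither problem would stop the twin from rendering, which is why the checks have to run before the load, not after.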

Connectivity is what makes it a twin, not a list

The single feature that separates a twin from an asset register is connectivity — the "what feeds what" graph. A flat list or a single-parent tree can tell you a chiller exists and where it lives. Only a connectivity graph can tell you what that chiller feeds and therefore what's affected when it fails.

And a building needs both graphs at once: spatial containment (where things live) and flow connectivity (what feeds what), with the same asset appearing in both. This is the structural requirement most first attempts miss: they collapse everything into one hierarchy and lose the ability to reason about impact and flow. To model a building properly, your working layer has to support multiple typed parent relationships per node, spatial and flow at the same time. The same capability that models a power network or a pipeline (a connectivity graph with branches, not a tidy tree) is what a building twin needs too.
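One way to picture "multiple typed parents per node" is a graph keyed by (node, relation), so an asset can have a spatial parent and one or more flow parents at the same time. This is a minimal sketch with invented names; the relation labels `located_in` and `fed_by` are placeholders, not terms from any particular standard.

```python
from collections import defaultdict


class BuildingGraph:
    """Nodes may have several parents, each under a named relation."""

    def __init__(self):
        self._parents = defaultdict(set)   # (child, relation) -> parents
        self._children = defaultdict(set)  # (parent, relation) -> children

    def add(self, child, relation, parent):
        self._parents[(child, relation)].add(parent)
        self._children[(parent, relation)].add(child)

    def parents(self, node, relation):
        return sorted(self._parents.get((node, relation), ()))

    def descendants(self, node, relation):
        """Everything below node when following a single relation type."""
        seen, stack = set(), [node]
        while stack:
            current = stack.pop()
            for child in self._children.get((current, relation), ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


def affected_spaces(graph, failed_asset):
    """Walk the flow graph down, then read each asset's place in the spatial graph."""
    hit = graph.descendants(failed_asset, "fed_by")
    return sorted({space for asset in hit for space in graph.parents(asset, "located_in")})


g = BuildingGraph()
g.add("CH-1", "located_in", "Plant Room")
g.add("AHU-3", "located_in", "Roof")
g.add("AHU-3", "fed_by", "CH-1")
g.add("AHU-3", "fed_by", "CH-2")  # two flow parents: a branch, not a tree
g.add("VAV-3-01", "located_in", "Room 101")
g.add("VAV-3-01", "fed_by", "AHU-3")
g.add("VAV-3-02", "located_in", "Room 102")
g.add("VAV-3-02", "fed_by", "AHU-3")

print(g.parents("AHU-3", "located_in"))  # where it lives
print(g.parents("AHU-3", "fed_by"))      # what feeds it
print(affected_spaces(g, "CH-1"))        # spaces touched if CH-1 trips
```

A single-parent tree has to pick one of those two parent sets for AHU-3 and discard the other, which is where the impact question becomes unanswerable.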

Stage the model before you feed the twin

The lesson that keeps repeating across twin projects: the work is in the data, not the platform. So treat the data accordingly. Build, relate and validate the model in a staging layer — a working space where you can reconcile conflicting sources, classify points to standard types, construct both relationship graphs, validate against the ontology's rules, and fix errors in bulk — before anything loads into the twin.
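A staging layer can start as something quite small. The sketch below merges records from several sources, keeps the fields they agree on, flags the ones they do not, and refuses to load while anything is unresolved. The source names, field names and the all-or-nothing gate are assumptions for illustration; how conflicts get resolved is a project decision this code deliberately does not make.

```python
def reconcile(sources):
    """Merge per-source records.

    sources: source name -> {asset_id: {field: value}}
    Returns (merged, conflicts). merged holds only fields every reporting
    source agrees on; conflicts lists (asset_id, field, {value: [sources]}).
    """
    seen = {}  # (asset_id, field) -> {value: [source names]}
    for source_name, records in sources.items():
        for asset_id, fields in records.items():
            for field, value in fields.items():
                seen.setdefault((asset_id, field), {}).setdefault(value, []).append(source_name)

    merged, conflicts = {}, []
    for (asset_id, field), values in seen.items():
        if len(values) == 1:
            merged.setdefault(asset_id, {})[field] = next(iter(values))
        else:
            conflicts.append((asset_id, field, values))
    return merged, conflicts


def ready_to_load(conflicts, validation_issues):
    """Gate: nothing goes to the twin until both lists are empty."""
    return not conflicts and not validation_issues


# Hypothetical inputs: two sources that disagree about where AHU-3 lives.
SOURCES = {
    "bms_export": {"AHU-3": {"type": "AHU", "location": "Roof"}},
    "contractor_schedule": {"AHU-3": {"type": "AHU", "location": "Level 5 Plant"}},
}

merged, conflicts = reconcile(SOURCES)
print(merged)
print(conflicts)
print(ready_to_load(conflicts, validation_issues=[]))
```

The design choice worth noting is that disagreement is surfaced rather than silently resolved by picking a winner, so the person fixing it in bulk can see which sources said what.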

Then hand the twin a model you trust. The platform was never the bottleneck. A validated graph, modelled once and exported cleanly, is what turns an expensive visualisation into a twin that actually answers questions.

Planning a building twin and staring at the data problem first? See how Brick Schema staging and the smart-buildings approach close the gap from raw data to a validated model, or get in touch.