MicroLink × NVIDIA · Working Session Reference
Prepared 4 May 2026 / Version 0.2 Draft / Confidential
Zone 03 · Build · Cluster A

Monitoring and control

For the NVIDIA platform team · Pazos · Fassiotti

In a conventional data centre, the IT stack and the BMS stack are two systems that share a building. In a closed-loop deployment, they share physics. NVIDIA's stack is the only one that runs across all four layers as one. We deploy it as the unifying control surface.

Owner
NVIDIA platform team
Adjacent: Alex Pazos, Claudio Fassiotti
MicroLink lead
Shane Pather
CTO · Engineering
Stack scope
Four layers · one surface
GPU · cluster · building · twin
Strategic centrepiece
DSX Blueprint
Closed-loop twin extension
Working session
5 May 2026
30 min · Teams
03 The Thesis
IT and BMS are usually two stacks that share a building. In our deployment they share physics. The closed thermodynamic loop only works if monitoring and control are coupled in software. DCGM at the GPU. Mission Control at the cluster. Metropolis at the building. Omniverse and DSX Blueprint at the twin. One stack, four layers, one surface.
01
Layers in the stack
4
GPU · cluster · building · twin. Each anchored to a published NVIDIA product, observed and controlled from one surface.
Architectural Reference
02
IT-BMS coupling
Real time
Sub-second at the GPU layer. Tens of seconds at the building. Minutes at the twin. Coupled cadences, one surface.
Per layer · Honest cadence
03
Twin scope
DSX Blueprint
Extended to model thermal and biogas loops. Not in the published reference today. The novel thing.
Co-developed · Fassiotti
04
Audit metrics
By design
ERE, WUE, heat utilisation factor, attestation: emitted by the same stack that runs operations.
SB 253 / 261 · EU EED 2027

Four layers, one stack

GPU at the bottom, twin at the top, building and cluster between. Each layer is a published NVIDIA product. The novelty is running them as one coupled control surface.

DCGM is table stakes. Metropolis applied to a data centre as a smart building is more interesting. DSX Blueprint extended to the thermal coupling layer is genuinely new.

The four-layer stack is built from existing NVIDIA platform products. At the GPU layer, DCGM and NVML provide telemetry for power, temperature, utilisation, and per-tenant attribution. At the cluster layer, Mission Control handles fleet orchestration, GPU lifecycle, workload scheduling, and observability across pods. At the building layer, Metropolis treats the facility as a managed asset, fusing sensor data from BMS, leak detection, ingress monitoring, and security cameras. At the twin layer, Omniverse with DSX Blueprint models the data centre's physical state as a real-time simulation.
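The GPU layer is concrete enough to sketch. The fragment below is a minimal sample of per-GPU power, temperature, and utilisation through the NVML Python bindings (pynvml); the layer tag and record shape are illustrative assumptions rather than the deployed schema, and per-tenant attribution would happen upstream in the cluster layer.

```python
# Minimal GPU-layer telemetry sample via the NVML Python bindings (pynvml).
# The "layer" tag and record shape are illustrative, not the deployed schema.
import time
import pynvml

pynvml.nvmlInit()
try:
    samples = []
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        samples.append({
            "layer": "gpu",                  # Layer 01 in the four-layer stack
            "gpu_index": i,
            "timestamp": time.time(),
            "power_w": pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0,  # mW -> W
            "temp_c": pynvml.nvmlDeviceGetTemperature(
                handle, pynvml.NVML_TEMPERATURE_GPU),
            "util_pct": util.gpu,
        })
    # Upstream, these samples would be attributed per tenant and forwarded
    # to the cluster layer (Mission Control) at roughly one-second cadence.
    print(samples)
finally:
    pynvml.nvmlShutdown()
```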

The bridge between the IT and facility worlds runs on Jetson and IGX Orin at the cabinet edge. These read facility sensors, push BMS state up into Metropolis, and execute control loops back down into chillers, valves, and CDU setpoints. Every cabinet is a Jetson node. The fleet is observable and controllable from one place.
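A minimal sketch of what one cabinet-edge control cycle could look like, assuming hypothetical read_sensor, publish, and write_setpoint helpers and topic names; the 45 °C secondary-loop target comes from the thermal figures used elsewhere in this document, and the deadband and valve step are illustrative placeholders.

```python
# Hypothetical cabinet-edge bridge cycle (Jetson-class node): read facility
# sensors, publish state upward to the building layer, apply a setpoint
# decision downward. The helpers and topic names are assumptions for
# illustration, not a real BMS or Metropolis API.
import time

SECONDARY_LOOP_TARGET_C = 45.0   # PHE secondary target from the thermal design
DEADBAND_C = 1.0                 # hysteresis band to avoid valve hunting

def edge_bridge_cycle(read_sensor, publish, write_setpoint):
    supply_c = read_sensor("cdu/secondary_supply_temp_c")
    flow_lpm = read_sensor("cdu/secondary_flow_lpm")

    # Telemetry up: cabinet-local facility state toward the building layer.
    publish("building/bms/cabinet_state", {
        "ts": time.time(),
        "secondary_supply_c": supply_c,
        "secondary_flow_lpm": flow_lpm,
    })

    # Control down: nudge the CDU mixing valve toward the secondary-loop target.
    if supply_c > SECONDARY_LOOP_TARGET_C + DEADBAND_C:
        write_setpoint("cdu/mixing_valve_pct", delta=+5)
    elif supply_c < SECONDARY_LOOP_TARGET_C - DEADBAND_C:
        write_setpoint("cdu/mixing_valve_pct", delta=-5)
```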

What makes this architectural rather than incremental: the twin layer simulates physical processes that the published DSX Blueprint reference does not currently cover. Heat flow into a host digester loop. Biogas yield response to thermal demand. PHE saturation under workload shifts. Dry-cooler engagement under host-loop drop-off. Building this extension is the white-paper opportunity.

Figure 01 · The four-layer monitoring and control stack
GPU, cluster, building, twin. One operator surface across all four.
Telemetry flows up. Control flows down. Jetson at the edge bridges IT and BMS. Each layer anchored to a published NVIDIA product.
Confidence · architectural
Diagram layers, top to bottom:
  • Layer 04 · Twin · Omniverse + DSX Blueprint · real-time simulation of physical processes, workload-thermal coupling, 24 h forecast · thermal loops, biogas yield, PHE state · extension, co-developed
  • Layer 03 · Building · Metropolis · facility-as-managed-asset, sensor fusion · BMS, leak detection, video, ingress, environment telemetry · smart-building role
  • Layer 02 · Cluster · Mission Control · fleet orchestration, GPU lifecycle, workload scheduling, k0rdent · fleet-level observability, per-tenant attribution
  • Layer 01 · GPU · DCGM + NVML · per-GPU telemetry: power, temperature, utilisation, attribution · table stakes, standard
  • Edge bridge, cabinet-resident · Jetson + IGX Orin · reads sensors → pushes BMS state up → executes control loops down
Source · NVIDIA platform reference architectures · MicroLink monitoring stack v0.4 Method · Four-layer abstraction · NVIDIA products mapped to MicroLink ops surface

The coupled physics made observable

In our deployment the IT load and the host process are physically coupled by the closed thermodynamic loop. Two parallel monitoring stacks cannot represent that coupling. The software has to know what the physics knows.

Every physical node corresponds to a software node. Every software node corresponds to a physical reality. The coupling is the architecture.

In a conventional data centre, IT and BMS are weakly coupled. CRAC setpoints respond to room temperature, which responds to GPU power. The feedback loop is closed by the building, not by software, and the control system does not need to know what the GPUs are doing — just how hot the room is. That model breaks the moment heat becomes an output product, not just waste. When the digester wants 6 MW of thermal at 45 °C continuously, the IT layer is not free to ignore the building layer.

The coupling has to be made observable in software. DCGM tells Mission Control about workload patterns. Mission Control tells Metropolis about expected thermal load over the next ten seconds. Metropolis reads PHE state, digester demand, and dry-cooler engagement. Omniverse runs ahead of all of it as a forecast: if this scheduling decision is taken at the cluster layer, what does the building layer look like in five minutes? That is the kind of question the coupled stack answers.
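As a sketch of the Mission Control to Metropolis hand-off described above: a short expected-thermal-load message derived from planned GPU power over the next scheduling window. The dataclass, field names, and the first-order power-to-heat assumption are illustrative simplifications, not the deployed interface.

```python
# Illustrative coupling message from the cluster layer to the building layer:
# an expected-thermal-load estimate for the next scheduling window, derived
# from planned GPU power. Names and the 1:1 power-to-heat simplification are
# assumptions for this sketch.
from dataclasses import dataclass

@dataclass
class ThermalForecast:
    window_s: int            # horizon of the estimate, e.g. the next 10 s
    expected_heat_kw: float  # heat the building layer should expect to absorb

def forecast_from_schedule(planned_gpu_power_kw: list[float],
                           window_s: int = 10) -> ThermalForecast:
    # Nearly all electrical power drawn by the pod leaves as heat into the
    # liquid loop, so planned power is a usable first-order heat forecast.
    return ThermalForecast(window_s=window_s,
                           expected_heat_kw=sum(planned_gpu_power_kw))
```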

Figure 02 · Coupled physics · physical and software flow
Two flows. One coupling. Vertical correspondence.
Physical flow on top: heat moving through the system. Software flow on bottom: telemetry observing it. Dotted lines mark where the layers correspond.
Confidence · architectural
Diagram flows:
  • Physical (what moves through the system): GPU server heat (35–65 °C) → CDU mix and control (N redundant) → PHE cross-over (45 °C → 37 °C) → host primary delivery (~6 MW thermal) → digester consumption at 37 °C (biogas yield) → biogas to MCFC (fuel feedback, freed from cogen)
  • Software (what observes and controls): DCGM + NVML (~1 s, GPU layer) · Mission Control + k0rdent (~10 s, cluster) · Jetson / IGX Orin edge bridge (cabinet) · Metropolis (~1 min, building) · DSX Blueprint extended (~5 min, twin) · Omniverse forecast (24 h horizon)
  • Correspondence notes: silicon · scheduling input · cabinet sensor · building sensor · demand model · yield forecast
Source · MicroLink monitoring stack v0.4 · WWTP Thermal Memo §03 Method · Physical-software correspondence diagram · 1:1 node mapping
§
The coupling is the architecture
In a conventional deployment, the physical and software stacks operate independently and the building closes the loop. In a closed-loop deployment, the software has to close the loop because the building cannot. Every physical node has a software node that observes or controls it. That correspondence is what the four-layer stack delivers.

The digital twin extended for closed-loop ops

Omniverse and DSX Blueprint already model the IT envelope. We extend the twin to model the thermal and biogas loops as well. This is the architecturally novel work, and the white-paper opportunity.

The twin lets the operator ask: if I take this scheduling decision now, what does the host loop look like in five minutes? What does biogas yield look like in five hours?

The DSX Blueprint reference architecture today covers compute, networking, power, and IT-side cooling. It does not cover the host-coupled thermal loops, the biogas feedback path, or the integrated source mix (MCFC plus LFP plus PEM plus grid) that this deployment runs. The extension we propose adds those elements so the twin reflects the full physical reality of a closed-loop deployment.

What the extended twin lets the operator do: simulate workload pattern shifts and see PHE saturation in advance. Predict digester thermal demand against current host operating state. Forecast biogas yield in response to thermal delivery. Plan dry-cooler engagement under predicted host-loop drop-off. Schedule tenant jobs against predicted thermal availability and gas displacement value. Each of these is a real operations question that the published reference twin cannot currently answer.
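To make the PHE-saturation question concrete, here is a deliberately over-simplified lumped estimate of time-to-saturation under a workload shift. All numbers and the single-buffer model are hypothetical illustrations; the real twin answers the same question as a physics co-simulation in Omniverse.

```python
# Lumped-model sketch of one operator question: under a planned workload
# shift, when does the PHE approach saturation (the host side can no longer
# absorb the delivered heat)? All figures below are hypothetical.
from typing import Optional

def minutes_to_phe_saturation(it_heat_kw: float,
                              host_demand_kw: float,
                              buffer_kwh: float,
                              dry_cooler_kw: float = 0.0) -> Optional[float]:
    """Minutes until the secondary loop's thermal buffer is exhausted, or
    None if host demand plus dry cooler can absorb the load indefinitely."""
    surplus_kw = it_heat_kw - host_demand_kw - dry_cooler_kw
    if surplus_kw <= 0:
        return None
    return buffer_kwh / surplus_kw * 60.0

# Hypothetical numbers: 8 MW of IT heat against ~6 MW of host demand and a
# 500 kWh buffer saturates in ~15 minutes unless the scheduler shifts load
# or the dry cooler engages.
print(minutes_to_phe_saturation(it_heat_kw=8_000, host_demand_kw=6_000, buffer_kwh=500))
```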

Figure 03 · Digital twin scope · what's in the model
Inside the twin. Outside the twin: what the operator does with it.
The boundary box is the load-bearing element. Elements inside are co-modelled in real time. Elements outside are operator workflows that consume the twin.
Confidence · architectural · subject to engineering review
Diagram contents:
  • Inside the twin (DSX Blueprint twin · Omniverse real-time co-simulation): IT envelope (pod compute, 576 GPU, 11.2 MW · published reference) · thermal transfer (CDU + PHE state, 45 °C secondary · extension) · host process (primary loop, ~6 MW thermal demand · extension) · digester model (sludge thermal, 15 → 37 °C lift · extension) · biogas feedback (yield → MCFC, closed-loop physics · extension, novel) · source mix (MCFC, LFP, PEM, grid, 2N power topology · extension) · rejection fallback (dry-cooler engagement, simulated load envelope) · forecast (24 h workload simulation, scheduling decisions)
  • Operator input: workload schedule and tenant job placement · host state and digester demand profile · live telemetry (sensors, BMS, DCGM)
  • Twin outputs: PHE saturation predicted ahead · biogas yield forecast (24 h) · schedule signal to Mission Control
  • Questions the twin answers, not in the published reference: if I take this scheduling decision now, what does host loop temperature look like in five minutes? What is the predicted biogas yield from the digester over the next 24 hours given current thermal delivery? When will the PHE saturate under the current workload pattern, and what scheduling shift defers it? Should the dry cooler engage now, in anticipation of host-loop drop-off?
Source · NVIDIA DSX Blueprint reference · MicroLink twin extension v0.4 Method · Boundary diagram · published vs co-developed scope · operator workflow

Audit metrics by construction

The disclosure regime for state, federal, and international public-sector deployments asks for specific metrics. The four-layer stack emits them as operational telemetry, not as bolt-on reporting.

Public-sector deployments under California SB 253, SB 261, and the EU Energy Efficiency Directive 2027 are required to disclose operational and environmental metrics that the conventional data centre playbook treats as a separate audit exercise. The four-layer stack generates these metrics as a side effect of operations: ERE alongside PUE drops out of the GPU and building layers in real time. WUE drops out of the building layer's loop telemetry. Heat utilisation factor drops out of the PHE state in the twin. Per-tenant attestation drops out of BlueField-3 and DCGM. The audit footing comes by construction.
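As a sketch of how those metrics fall out of telemetry the stack already carries: PUE, ERE, and WUE below follow their standard Green Grid definitions; the heat utilisation factor shown (reused heat over total heat rejected) is one plausible formulation, not a confirmed spec for this deployment.

```python
# Audit metrics as derived quantities over telemetry the stack already emits.
# PUE, ERE, WUE follow standard Green Grid definitions; the heat utilisation
# factor here is an assumed formulation for illustration.

def pue(total_facility_kwh: float, it_kwh: float) -> float:
    # Power Usage Effectiveness: total facility energy over IT energy.
    return total_facility_kwh / it_kwh

def ere(total_facility_kwh: float, reused_kwh: float, it_kwh: float) -> float:
    # Energy Reuse Effectiveness: credit heat exported to the host loop.
    return (total_facility_kwh - reused_kwh) / it_kwh

def wue(water_litres: float, it_kwh: float) -> float:
    # Water Usage Effectiveness in litres per kWh of IT energy.
    return water_litres / it_kwh

def heat_utilisation(reused_heat_kwh: float, total_heat_kwh: float) -> float:
    # Share of rejected heat actually delivered to the host process.
    return reused_heat_kwh / total_heat_kwh
```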

This matters strategically because every additional public-sector deployment of this pattern faces the same disclosure framework. Once we publish the reference for how the four-layer stack maps to the disclosure regime, that mapping becomes part of the deployment template. Pazos's portfolio benefits from a published audit-by-construction approach across the public-sector accounts NVIDIA touches.

Source layer · GPU + Building
ERE · PUE · WUE
Real-time telemetry from DCGM and Metropolis. Published as standard operational metrics.
Source layer · Twin
Heat factor · grid position
DSX Blueprint extension emits heat utilisation factor and net grid energy position continuously.
Source layer · Cluster + GPU
Attestation · CO₂
Per-tenant attestation from BlueField-3 plus per-GPU CO₂ attribution from DCGM, both published as continuous metrics.
§
The audit footing is the operational footing
There is no separate audit data pipeline. There is no quarterly assembly exercise. The metrics required by SB 253, SB 261, and EU EED 2027 are emitted by the same stack that runs the deployment. That alignment is rare in this industry, and it is the right published reference for any public-sector deployment that follows.

The ask and what we bring

Co-develop the four-layer stack as the reference for closed-loop and public-sector deployments. The platform team owns the products. We bring the deployment that exercises all four layers.

Tier · Monitoring and control reference architecture
One stack, four layers · The DSX Blueprint extension for closed-loop deployments

DCGM at the GPU. Mission Control at the cluster. Metropolis at the building. Omniverse and DSX Blueprint at the twin. We deploy them as one surface and contribute the extension that closes the loop.

From the platform team
  • Four-layer stack architecture review
  • Metropolis deployment guidance for facility-as-asset
  • Mission Control + k0rdent integration spec
  • Jetson + IGX Orin reference for facility-side BMS
  • Cadence with the platform team through Q4 2026
From Fassiotti specifically
  • DSX Blueprint extension scope
  • Co-authored closed-loop twin reference
  • Omniverse simulation for thermal coupling
  • Joint white paper opportunity
From Pazos specifically
  • Audit-by-construction template review
  • Public-sector disclosure mapping
  • SB 253 / SB 261 / EU EED 2027 alignment
  • Bridge to public-sector deployments that follow