Executive summary
Defence and intelligence operations generate vast, heterogeneous data streams — text reports, imagery, video, audio, RF and signals intelligence, sensor telemetry, drone-derived measurements, and emerging modalities including quantum-derived measurements. These streams remain siloed in practice, limiting real-time situational awareness and slowing operational decision-making.
Traditional fusion stacks are rule-based, schema-rigid, and brittle under modality drift. The path forward, articulated by the CAF Digital Campaign Plan and the DND/CAF AI Strategy and converged on by allied research programs, is AI-driven fusion that learns relationships across modalities, propagates uncertainty, respects classification boundaries, and delivers explainable outputs operators can defend.
We propose an ontology-first fusion architecture. A shared eighteen-type operational ontology provides the entity backbone. A six-type ISR extension grammar — Sensor, ContactReport, Detection, Track, PlatformContact, EngagementZone — extends the spine for multi-domain situational awareness. AI scorers running over this graph perform spatiotemporal alignment, uncertainty propagation, and persistent object tracking. A governance-lens binding enforces classification and policy at runtime, not in documentation. A counterfactual attribution layer produces explainable outputs with full lineage to a tamper-evident audit chain.
Why fusion is an entity-resolution problem
The dominant pattern in defence software for two decades has been point-to-point integration. The radar talks to the C2 system. The SIGINT collector talks to the intelligence database. The drone video feed talks to the operator console. Each integration is bespoke, brittle, and shaped by the schemas of its endpoints, not by the questions the operator wants answered.
The result is a graph of pipes that resists the only operation ever asked of it: cross-modal reasoning. When an operator wants to know whether a flagged RF signature in the Northwest Passage correlates with a satellite-imaged surface contact, and whether that contact in turn correlates with a crew-pairing disruption on a CP-140 sortie, the answer requires four systems and a watch officer with intuition. The systems are not at fault. The integration pattern is.
An ontology inverts the pattern. Every modality contributes to a single shared graph through ingest. The fusion layer reads against the ontology, not against the source systems. Every cross-modal question becomes a traversal in a single graph. Persistent object tracking — the property defence operations need most — is the natural output of entity resolution against a stable schema.
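As an illustration of a cross-modal question becoming a traversal, the sketch below walks a tiny in-memory graph from one ContactReport to the Track it supports, then back out to every other report on that Track. The entity types come from the ISR extension described in the executive summary. The identifiers, relation names, and dictionary layout are hypothetical and are not the platform's storage model.

```python
from collections import defaultdict

# Hypothetical edge list: (source_id, relation, target_id). The relation
# names are illustrative, not taken from the ontology specification.
EDGES = [
    ("cr-rf-1", "supports", "det-1"),
    ("cr-eo-7", "supports", "det-1"),
    ("det-1", "resolves_to", "trk-42"),
    ("cr-radar-3", "supports", "det-2"),
    ("det-2", "resolves_to", "trk-42"),
    ("cr-rf-9", "supports", "det-3"),
    ("det-3", "resolves_to", "trk-77"),
]

# Index edges in both directions so traversals are dictionary lookups.
forward = defaultdict(list)
backward = defaultdict(list)
for src, rel, dst in EDGES:
    forward[(src, rel)].append(dst)
    backward[(dst, rel)].append(src)


def tracks_of(report_id):
    """Follow ContactReport -> Detection -> Track and return the track ids."""
    tracks = set()
    for det in forward[(report_id, "supports")]:
        tracks.update(forward[(det, "resolves_to")])
    return tracks


def reports_on(track_id):
    """Walk back from a Track to every ContactReport that supports it."""
    reports = set()
    for det in backward[(track_id, "resolves_to")]:
        reports.update(backward[(det, "supports")])
    return reports


def correlated_reports(report_id):
    """Reports from any modality that share a Track with the given report."""
    related = set()
    for trk in tracks_of(report_id):
        related |= reports_on(trk)
    related.discard(report_id)
    return related
```

Here `correlated_reports("cr-rf-1")` answers "what else saw the object behind this RF hit" without touching any source system. That is the property the ontology-first pattern is meant to provide.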
The fusion stack, six layers
01 Ingest
Eight source-type connectors: text/CSV, JSON, REST, database, stream (Kafka, Kinesis, MQTT, AMQP), object storage, webhook, and synthetic. Each boots against a sandbox tenant by default. Heterogeneous modalities including imagery, video, audio, RF, and telemetry feed through these connectors as ContactReports.
02 Alignment
Spatiotemporal alignment across modalities. ContactReports from different sensors join when they share a window in space and time, weighted by sensor reliability priors. The output is a Detection — a candidate fused observation with a confidence score and an uncertainty bound.
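A minimal sketch of space-time gating follows, under stated assumptions. Positions are planar kilometres in a local frame. Fused position uses inverse-variance weighting. Confidence combines reliability priors with a noisy-OR. The field names, gate sizes, and greedy grouping rule are hypothetical stand-ins, not the production alignment algorithm.

```python
import math
from dataclasses import dataclass


@dataclass
class ContactReport:
    sensor: str
    x_km: float
    y_km: float
    t_s: float
    sigma_km: float      # 1-sigma position error of the sensor
    reliability: float   # prior probability the report is a real contact


@dataclass
class Detection:
    x_km: float
    y_km: float
    sigma_km: float      # uncertainty bound on the fused position
    confidence: float
    sources: tuple


def in_gate(a, b, gate_km, gate_s):
    """True when two reports fall inside the same space-time window."""
    dist = math.hypot(a.x_km - b.x_km, a.y_km - b.y_km)
    return dist <= gate_km and abs(a.t_s - b.t_s) <= gate_s


def fuse(reports):
    """Inverse-variance position fusion plus noisy-OR confidence."""
    weights = [1.0 / r.sigma_km ** 2 for r in reports]
    total = sum(weights)
    x = sum(w * r.x_km for w, r in zip(weights, reports)) / total
    y = sum(w * r.y_km for w, r in zip(weights, reports)) / total
    miss = 1.0
    for r in reports:
        miss *= 1.0 - r.reliability
    sources = tuple(r.sensor for r in reports)
    return Detection(x, y, math.sqrt(1.0 / total), 1.0 - miss, sources)


def align(reports, gate_km=5.0, gate_s=600.0):
    """Greedy grouping: a report joins the first group whose seed it gates with."""
    groups = []
    for r in sorted(reports, key=lambda rep: rep.t_s):
        for group in groups:
            if in_gate(group[0], r, gate_km, gate_s):
                group.append(r)
                break
        else:
            groups.append([r])
    return [fuse(group) for group in groups]
```

Two agreeing sensors shrink the position bound and raise confidence. A lone report passes through with its own sigma and prior. A fielded system would likely replace the greedy grouping with a proper assignment step.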
03 Resolution
Entity resolution against the ontology. Detections resolve to existing Tracks where they fit; new Tracks materialize where they do not. Resolution is dynamic: the graph updates as new evidence arrives. Persistent multi-modal object tracking is the property the resolution layer guarantees.
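The resolve-or-materialize behaviour can be sketched as nearest-neighbour association with a distance gate. This is a deliberately simplified stand-in. The `TrackStore` class, the gate value, and the "latest position wins" update are hypothetical, and the sketch ignores kinematics, classification, and competing hypotheses.

```python
import math
from itertools import count


class TrackStore:
    """Toy resolver: attach a Detection to the nearest Track inside the gate."""

    def __init__(self, gate_km=10.0):
        self.gate_km = gate_km
        self.tracks = {}          # track_id -> {"x_km", "y_km", "hits"}
        self._ids = count(1)

    def resolve(self, x_km, y_km):
        """Return the id of the Track this Detection resolves to."""
        best_id, best_dist = None, self.gate_km
        for track_id, trk in self.tracks.items():
            dist = math.hypot(trk["x_km"] - x_km, trk["y_km"] - y_km)
            if dist <= best_dist:
                best_id, best_dist = track_id, dist
        if best_id is None:
            # Nothing fits: materialize a new Track.
            best_id = f"trk-{next(self._ids)}"
            self.tracks[best_id] = {"x_km": x_km, "y_km": y_km, "hits": 1}
        else:
            # Fits an existing Track: update it with the new evidence.
            trk = self.tracks[best_id]
            trk["x_km"], trk["y_km"] = x_km, y_km
            trk["hits"] += 1
        return best_id
```

Calling `resolve` repeatedly as Detections arrive gives the dynamic behaviour described above: the same object keeps its Track id while it stays within the gate of its last known position.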
04 Fusion
AI-driven scoring across resolved entities. Three scorer classes: spatiotemporal correlators (radar + EO + RF), anomaly detectors (RF without correlated visual return), and platform classifiers (Track to PlatformContact via radio fingerprint, kinematics, signature). Each scorer emits confidence and uncertainty.
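To make the scorer contract concrete, here is a hedged sketch of one anomaly detector: RF emission on a Track with no correlated visual return. The scoring rule is an assumption chosen for illustration. Confidence is the RF confidence scaled by the probability that a visual sensor would have seen a real object. Uncertainty is the standard deviation of a Bernoulli variable with that probability. The `Score` type and modality labels are hypothetical.

```python
import math
from typing import NamedTuple


class Score(NamedTuple):
    scorer: str
    confidence: float
    uncertainty: float


# Assumption: which modality labels count as a "visual return".
VISUAL = {"eo", "imagery"}


def rf_without_visual(modalities, rf_confidence, visual_detect_prob):
    """Score a Track whose RF emission has no correlated visual return.

    modalities         -- set of modality labels contributing to the Track
    rf_confidence      -- confidence in the RF contact, 0..1
    visual_detect_prob -- chance a visual sensor would have seen a real object
    """
    if "rf" not in modalities or modalities & VISUAL:
        confidence = 0.0   # no RF at all, or RF is corroborated visually
    else:
        confidence = rf_confidence * visual_detect_prob
    uncertainty = math.sqrt(confidence * (1.0 - confidence))
    return Score("rf_without_visual", confidence, uncertainty)
```

The point is the return shape, not the formula: every scorer hands the decision layer a confidence and an uncertainty together, so downstream gates never see a bare number.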
05 Decision
Severity-banded action gates. Auto-tier executes; review-tier surfaces to operator with a defended recommendation; critical-tier holds for explicit authorization. Every decision carries reasoning visible to the operator and a reversibility window during which it can be undone.
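The three tiers and the reversibility window can be sketched as a small gate function. The severity thresholds (0.3 and 0.7), the 900-second window, and the status strings are hypothetical values chosen for the example. Only the tier behaviour itself comes from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AUTO = "auto"
    REVIEW = "review"
    CRITICAL = "critical"


@dataclass(frozen=True)
class Decision:
    action: str
    tier: Tier
    status: str
    reasoning: str             # always carried, so the operator can see why
    reversible_until_s: float  # end of the reversibility window


def tier_for(severity):
    """Map a 0..1 severity onto a band. Thresholds are illustrative."""
    if severity < 0.3:
        return Tier.AUTO
    if severity < 0.7:
        return Tier.REVIEW
    return Tier.CRITICAL


def gate(action, severity, reasoning, now_s, window_s=900.0, authorized=False):
    """Apply the severity-banded gate and return the resulting Decision."""
    tier = tier_for(severity)
    if tier is Tier.AUTO:
        status = "executed"
    elif tier is Tier.REVIEW:
        status = "awaiting_operator"
    else:
        status = "executed" if authorized else "held_for_authorization"
    return Decision(action, tier, status, reasoning, now_s + window_s)


def can_undo(decision, now_s):
    """An executed decision can be undone until its window closes."""
    return decision.status == "executed" and now_s <= decision.reversible_until_s
```

Making `reasoning` a required field rather than an optional one is the design choice worth noting: a decision without an explanation cannot be constructed.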
06 Audit
Hash-chained lineage. Every entity, score, decision, and action references its predecessor in an FNV-1a chain anchored to a genesis block per tenant. The chain is tamper-evident: altering any record breaks verification of the chain from that record onward. Provenance for every output is one query away.
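A minimal sketch of such a chain follows, using the standard 64-bit FNV-1a constants. The record layout, the JSON canonicalization, and the way the genesis hash is derived from the tenant name are assumptions made for the example. FNV-1a is a fast non-cryptographic hash, so the sketch shows how edits are detected by recomputation and makes no claim about resistance to a deliberate collision search.

```python
import json

FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


def genesis_hash(tenant):
    # Assumption: the per-tenant genesis block hashes the tenant name.
    return fnv1a_64(f"genesis|{tenant}".encode("utf-8"))


def link_hash(prev_hash, payload):
    """Hash a record together with its predecessor's hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return fnv1a_64(f"{prev_hash:016x}|{body}".encode("utf-8"))


def new_chain(tenant):
    return [{"prev": None, "payload": {"tenant": tenant}, "hash": genesis_hash(tenant)}]


def append(chain, payload):
    prev = chain[-1]["hash"]
    chain.append({"prev": prev, "payload": payload, "hash": link_hash(prev, payload)})


def verify(chain, tenant):
    """Recompute every link; any edited record makes this return False."""
    if not chain or chain[0]["hash"] != genesis_hash(tenant):
        return False
    for prev, rec in zip(chain, chain[1:]):
        if rec["prev"] != prev["hash"]:
            return False
        if rec["hash"] != link_hash(rec["prev"], rec["payload"]):
            return False
    return True
```

Provenance for an output is then a walk back along `prev` pointers to the genesis block, and `verify` is the check an auditor would run before trusting that walk.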
Policy-aware fusion and cross-classification integration
Defence operations span classification levels. Unclassified satellite imagery, Protected B operational reporting, Secret signals intelligence, Top Secret special-access compartments — each modality can carry data at a different classification, and fusion must respect the lattice.
We model classification as a runtime property on every entity. A scorer running at level N can only read entities at level ≤ N. A fused output is labeled at the highest classification among its inputs. Outputs are visible only to operators cleared at or above the output's level. Misuse of cross-classification fusion is therefore blocked at the read path: the platform refuses to surface an output to a viewer cleared below its classification.
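The three rules (read at or below N, label at the maximum, surface only at or above the label) fit in a few lines. The sketch assumes the four levels named above form a single total order, which is a simplification. Compartments and caveats would need a lattice join rather than a plain `max`. The entity dictionaries and function names are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumption: a single total order, for illustration only.
    UNCLASSIFIED = 0
    PROTECTED_B = 1
    SECRET = 2
    TOP_SECRET = 3


def readable(entities, scorer_level):
    """Rule 1: a scorer at level N reads only entities at level <= N."""
    return [e for e in entities if e["level"] <= scorer_level]


def fused_label(inputs):
    """Rule 2: a fused output carries the maximum level of its inputs."""
    return max(e["level"] for e in inputs)


def visible_to(output_level, viewer_clearance):
    """Rule 3: only viewers cleared at or above the label may see it."""
    return viewer_clearance >= output_level


def surface(output, viewer_clearance):
    """Enforce rule 3 on the read path itself, not in a separate layer."""
    if not visible_to(output["level"], viewer_clearance):
        raise PermissionError("viewer is not cleared for this output")
    return output
```

Because `surface` is the only way to hand an output to a viewer in this sketch, the check cannot be skipped by a caller who forgets to ask.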
The lens binding model extends to allied interoperability. A Track shared with a Five Eyes partner carries a release marking (REL TO USA, GBR, AUS, CAN, NZL) as a property the runtime enforces at every read. The audit chain records which partner received which Track at which time, which is the property interoperability auditors most often ask for and rarely receive.
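Release markings can be enforced with the same read-path pattern. The sketch below parses a REL TO marking, checks the requesting partner against it, and records the attempt either way. The record fields and function names are hypothetical, and a plain list stands in for the audit chain.

```python
def parse_rel_to(marking):
    """Turn 'REL TO USA, GBR' into a frozenset of country codes."""
    prefix = "REL TO "
    if not marking.startswith(prefix):
        raise ValueError(f"not a release marking: {marking!r}")
    return frozenset(code.strip() for code in marking[len(prefix):].split(","))


def read_track(track, partner, audit_log, now_s):
    """Release a Track to a partner only if its marking allows it.

    Every attempt is logged, including refusals, so the log can answer
    which partner received which Track at which time.
    """
    allowed = partner in parse_rel_to(track["release"])
    audit_log.append(
        {"track": track["id"], "partner": partner, "time_s": now_s, "released": allowed}
    )
    if not allowed:
        raise PermissionError(f"{track['id']} is not releasable to {partner}")
    return track
```

Logging before raising means a denied request still leaves evidence, which is often as useful to an interoperability auditor as the successful reads.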
None of this requires a separate security layer bolted on. It is the same code path the runtime uses for any other constraint. Classification is a runtime property, not a documentation property. If the system does not enforce it on every read, it is not enforced.
The Arctic anchor
Persistent Arctic domain awareness is the use case where this architecture demonstrates the most value. Traffic is sparse, objects move slowly, and sensor coverage is mixed: RADARSAT satellite passes last minutes, NORAD long-range radar is continuous, and RF SIGINT is intermittent. It is also a strategic priority anchored in NORAD modernization, the Arctic and Northern Policy Framework, and Operation NANOOK.
Our reference deployment ingests satellite imagery, NORAD-class radar telemetry, and RF SIGINT. The alignment layer joins reports across the three modalities within space-time gates calibrated for Arctic operational tempo. Detections resolve to Tracks that persist for days as objects transit the Northwest Passage (NWP). EngagementZones (NORAD ADIZ segments, NWP chokepoints) trigger policy lenses: an unidentified Track entering an ADIZ segment automatically opens an investigation against the Authorization extension.
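The zone-entry trigger can be sketched as a point-in-polygon test applied to successive Track positions. The sketch uses standard ray casting on planar coordinates. Real ADIZ geometry at high latitude would need geodesic handling, which is out of scope here. The zone shape, event name, and `identified` flag are hypothetical.

```python
def inside(point, polygon):
    """Ray-casting point-in-polygon test on planar (x, y) coordinates."""
    x, y = point
    hit = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def entry_events(path, zone, identified):
    """Emit an event each time an unidentified Track crosses into the zone.

    path       -- successive (x, y) positions of one Track
    zone       -- polygon vertices of an EngagementZone
    identified -- True when the Track already has an identity
    """
    events = []
    was_inside = False
    for step, point in enumerate(path):
        now_inside = inside(point, zone)
        if now_inside and not was_inside and not identified:
            events.append((step, "open_investigation"))
        was_inside = now_inside
    return events
```

Triggering on the outside-to-inside transition rather than on every position inside the zone keeps a slow-moving contact from opening a new investigation at each update.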
SWaP-aware edge deployment
Cloud and command-post deployments are not enough. The tactical edge (wearable systems for forward operators, vehicle-mounted compute, forward command posts under degraded connectivity) has hard Size, Weight, and Power (SWaP) constraints. The fusion stack must fit in a 5–20 MB binary, run in 256 MB to 2 GB of RAM, and draw 1–10 W, often offline.
Our edge variant compiles a subset of the runtime to a Rust binary linked against an embedded SQLite. Heavier scorers and the audit aggregator run in the command post or cloud, with the edge syncing opportunistically over low-bandwidth secure channels. The edge seeds its state from a deterministic RNG, so the same input produces the same fused output regardless of network state.
Roadmap and companion papers
The eighteen-type core ontology and the ISR extension are operational on MAIA's decision platform today. The runtime, the spatiotemporal correlator, the entity-resolution layer, the counterfactual outcome ledger, and the hash-chained audit trail are all in production. The classification-aware fusion layer and the SWaP-aware edge variant are on a six-month milestone for Phase 2 deployment.
The companion specification is The Operational Ontology. The reproducible measurement for end-to-end latency is the Decision Latency Benchmark. The field study that grounds all three is The State of Operational Decision-Making.
