Version: v1.1

Multi-Node Operation

Introduction

AosEdge supports Units composed of multiple Nodes — distinct computing elements (boards, VMs, or hardware partitions) that work together as a single managed system. Multi-Node operation enables OEMs to build systems from heterogeneous hardware while presenting a unified management interface to the cloud.

This section covers how Nodes coordinate within a Unit, how the Main Node orchestrates the system, and how services and files are distributed across Nodes.

Why Multi-Node?

Real-world edge systems often combine hardware with different capabilities:

A gateway board with network connectivity handles cloud communication and management
A compute board with a powerful CPU or GPU runs intensive workloads
A sensor board with specialized I/O handles real-time data acquisition
A safety-critical partition runs certified software in isolation

Rather than requiring a separate cloud connection and management stack for each board, AosEdge groups them into a single Unit. The cloud sees one entity, sends one desired state, and receives one unified status — regardless of how many Nodes compose the Unit internally.

The Main Node Model

Every multi-Node Unit has exactly one Main Node that serves as the coordination point:

Responsibility	How It Works
Cloud connectivity	The Main Node's Communication Manager (CM) maintains the single WebSocket connection to AosCloud
Desired-state distribution	CM receives the desired state for the entire Unit and distributes service assignments to the appropriate Nodes
Status aggregation	CM collects status from all Nodes' Service Managers and reports a unified `UnitStatus` to the cloud
Configuration management	CM manages the Unit configuration and pushes per-Node configurations to each Node
Update orchestration	CM coordinates firmware and service updates across all Nodes
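
The distribution and aggregation rows above can be pictured with a small in-process model. This is only an illustrative sketch, not the CM implementation: `NodeStatus`, `distribute`, and `aggregate` are hypothetical names, and it assumes the placement of each service on a Node has already been decided.

```python
from dataclasses import dataclass, field


@dataclass
class NodeStatus:
    """Hypothetical per-Node status as reported by a Service Manager."""
    node_id: str
    running: list[str] = field(default_factory=list)


def distribute(desired_state: dict[str, str]) -> dict[str, list[str]]:
    """Group a Unit-wide {service: node_id} mapping into per-Node assignments."""
    assignments: dict[str, list[str]] = {}
    for service, node_id in desired_state.items():
        assignments.setdefault(node_id, []).append(service)
    return assignments


def aggregate(statuses: list[NodeStatus]) -> dict[str, list[str]]:
    """Merge per-Node statuses into one Unit-level report keyed by Node."""
    return {status.node_id: sorted(status.running) for status in statuses}


# Assumed example placement: service name -> Node that should run it.
desired = {"telemetry": "gateway", "vision": "compute", "logger": "gateway"}
assignments = distribute(desired)

# Pretend every Node started exactly what it was assigned.
unit_status = aggregate(
    [NodeStatus(node_id, services) for node_id, services in assignments.items()]
)
```

The point of the sketch is the shape of the flow: one desired state goes in, per-Node assignments fan out, and one merged status comes back.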

Each Secondary Node runs its own Service Manager (SM) and Identity and Access Manager (IAM) locally, along with a Message Proxy (MP), but relies on the Main Node's CM for cloud interaction and coordination.

How Nodes Communicate

Nodes within a Unit communicate through two mechanisms:

SM-to-CM Connection (Service Coordination)

Each Node's Service Manager connects to the Main Node's CM via gRPC. Through this connection:

SM registers itself and reports available resources, runtimes, and running instances
CM pushes service deployment commands (run, stop, update) to the Node
SM sends back status, monitoring data, alerts, and logs

On the Main Node, SM connects to CM locally. On Secondary Nodes, the Message Proxy (MP) bridges this connection across the inter-Node transport.
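
The register, command, and status exchange described above can be modelled without any networking. The sketch below is an assumption-laden illustration, not the real gRPC API: the class and method names, the message fields, and the `"container"` runtime label are all hypothetical, and direct method calls stand in for the gRPC connection.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceManager:
    """Hypothetical SM: tracks running instances on one Node."""
    node_id: str
    runtimes: list[str]
    instances: set[str] = field(default_factory=set)

    def registration(self) -> dict:
        """What SM reports when it first connects to CM."""
        return {
            "node_id": self.node_id,
            "runtimes": self.runtimes,
            "instances": sorted(self.instances),
        }

    def handle(self, command: str, service: str) -> dict:
        """Apply a CM command and return the resulting status."""
        if command == "run":
            self.instances.add(service)
        elif command == "stop":
            self.instances.discard(service)
        elif command != "update":
            # In this sketch "update" leaves the set of instances unchanged.
            raise ValueError(f"unknown command: {command}")
        return {"node_id": self.node_id, "instances": sorted(self.instances)}


class CommunicationManager:
    """Hypothetical CM: keeps a registry of connected SMs."""

    def __init__(self) -> None:
        self.nodes: dict[str, ServiceManager] = {}

    def register(self, sm: ServiceManager) -> None:
        self.nodes[sm.registration()["node_id"]] = sm

    def push(self, node_id: str, command: str, service: str) -> dict:
        """Send a command to one Node and return the status SM sends back."""
        return self.nodes[node_id].handle(command, service)


cm = CommunicationManager()
cm.register(ServiceManager("compute", ["container"]))
status_after_run = cm.push("compute", "run", "vision")
status_after_stop = cm.push("compute", "stop", "vision")
```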

IAM Registration Stream (Node Management)

Each Secondary Node's IAM maintains a persistent bidirectional gRPC stream (RegisterNode) to the Main Node's IAM. This stream carries:

Node identity and state information
Provisioning and certificate operations
Pause/resume commands from the cloud
Connection state tracking
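
A bidirectional stream of this kind can be approximated with two queues, one per direction. The following is a rough sketch only: `SecondaryIAM`, the message fields, and the command strings are hypothetical, and the assumption that a resume returns the Node to the provisioned state is made purely for illustration.

```python
import queue
from dataclasses import dataclass


@dataclass
class SecondaryIAM:
    """Hypothetical Secondary Node IAM holding the Node's current state."""
    node_id: str
    state: str = "provisioned"

    def _info(self) -> dict:
        return {"type": "node_info", "node_id": self.node_id, "state": self.state}

    def open_stream(self, uplink: queue.Queue) -> None:
        """First message on the stream: announce identity and current state."""
        uplink.put(self._info())

    def drain(self, downlink: queue.Queue, uplink: queue.Queue) -> None:
        """Apply each pending command, then report the new state upstream."""
        while not downlink.empty():
            command = downlink.get()
            if command == "pause":
                self.state = "paused"
            elif command == "resume":
                # Assumption: resuming returns the Node to "provisioned".
                self.state = "provisioned"
            uplink.put(self._info())


uplink: queue.Queue = queue.Queue()    # Secondary Node -> Main Node
downlink: queue.Queue = queue.Queue()  # Main Node -> Secondary Node

iam = SecondaryIAM("sensor")
iam.open_stream(uplink)
downlink.put("pause")  # the Main Node relays a pause command from the cloud
iam.drain(downlink, uplink)

messages = [uplink.get() for _ in range(uplink.qsize())]
```

Both directions stay open for the lifetime of the connection, which is why state reports and commands can interleave freely.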

The Message Proxy's Role

The Message Proxy (MP) runs on Secondary Nodes and bridges the gap between the local components and the Main Node's CM. From the cloud's perspective, the Unit is a single entity — MP is purely an internal implementation detail that enables multi-Node operation.

MP provides:

Message routing — relays CM commands to the local SM and forwards status back to the Main Node
File distribution — downloads service images from CM's file server and delivers them to the local SM
Transport abstraction — supports multiple inter-Node transports (Xen vchan for virtualized environments, TCP sockets for networked boards)
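
One way to picture the transport abstraction and message routing together is an interface that the proxy depends on, with the concrete transport swapped in behind it. This is a minimal sketch under stated assumptions: `Transport`, `LoopbackTransport`, and `MessageProxy` are hypothetical names, the loopback class is an in-memory stand-in for a real vchan or TCP transport, and the local SM is reduced to a plain callable.

```python
from typing import Callable, Protocol


class Transport(Protocol):
    """Anything that can move opaque frames between Nodes."""

    def send(self, payload: bytes) -> None: ...

    def receive(self) -> bytes: ...


class LoopbackTransport:
    """In-memory stand-in for a real inter-Node transport."""

    def __init__(self) -> None:
        self._frames: list[bytes] = []

    def send(self, payload: bytes) -> None:
        self._frames.append(payload)

    def receive(self) -> bytes:
        return self._frames.pop(0)


class MessageProxy:
    """Hypothetical MP: relays commands to SM and forwards replies upstream."""

    def __init__(self, to_main: Transport, local_sm: Callable[[bytes], bytes]) -> None:
        self._to_main = to_main
        self._local_sm = local_sm

    def relay_command(self, payload: bytes) -> None:
        """Hand a CM command to the local SM and send SM's reply to the Main Node."""
        reply = self._local_sm(payload)
        self._to_main.send(reply)


transport = LoopbackTransport()
proxy = MessageProxy(transport, lambda command: b"ack:" + command)
proxy.relay_command(b"run vision")
forwarded = transport.receive()
```

Because `MessageProxy` only depends on the `Transport` interface, the routing logic stays the same whichever transport is plugged in.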

Component Distribution Summary

Component	Main Node	Secondary Node	Purpose
CM	✓	—	Cloud connectivity, orchestration
SM	✓	✓	Local service lifecycle management
IAM	✓	✓	Per-Node identity and certificates
MP	—	✓	Inter-Node communication bridge
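
The table above can also be expressed as data, which is handy when checking that a Node of a given role runs the expected components. The role strings and the `components_for` helper are hypothetical and exist only for this sketch.

```python
# Which Node roles run each component, taken from the summary table.
COMPONENTS: dict[str, set[str]] = {
    "CM": {"main"},
    "SM": {"main", "secondary"},
    "IAM": {"main", "secondary"},
    "MP": {"secondary"},
}


def components_for(role: str) -> list[str]:
    """Return the components expected on a Node with the given role."""
    if role not in {"main", "secondary"}:
        raise ValueError(f"unknown role: {role}")
    return [name for name, roles in COMPONENTS.items() if role in roles]


main_components = components_for("main")
secondary_components = components_for("secondary")
```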

In This Section

Multi-Node Architecture — detailed topology, transport layers, and communication patterns between Nodes

Node Lifecycle — Node state machine (unprovisioned, provisioned, paused, error), transitions, and connection state
Inter-Node File Distribution — how service images are distributed from the Main Node to Secondary Nodes via MP
Dynamic Node Registration — how Nodes dynamically join and leave a Unit at runtime

Related Pages

Unit and Node Model — foundational explanation of the Unit/Node hierarchy
Architecture Overview — system-wide component architecture and interactions
Message Proxy — detailed MP component documentation
Node Identity — how Nodes establish and report their identity
Configuration — Unit and Node configuration reference
