Version: v1.1

Multi-Node Operation

Introduction

AosEdge supports Units composed of multiple Nodes — distinct computing elements (boards, VMs, or hardware partitions) that work together as a single managed system. Multi-Node operation enables OEMs to build systems from heterogeneous hardware while presenting a unified management interface to the cloud.

This section covers how Nodes coordinate within a Unit, how the Main Node orchestrates the system, and how services and files are distributed across Nodes.

Why Multi-Node?

Real-world edge systems often combine hardware with different capabilities:

  • A gateway board with network connectivity handles cloud communication and management
  • A compute board with a powerful CPU or GPU runs intensive workloads
  • A sensor board with specialized I/O handles real-time data acquisition
  • A safety-critical partition runs certified software in isolation

Rather than requiring a separate cloud connection and management stack for each board, AosEdge groups them into a single Unit. The cloud sees one entity, sends one desired state, and receives one unified status — regardless of how many Nodes compose the Unit internally.

The Main Node Model

Every multi-Node Unit has exactly one Main Node that serves as the coordination point:

| Responsibility | How It Works |
| --- | --- |
| Cloud connectivity | The Main Node's Communication Manager (CM) maintains the single WebSocket connection to AosCloud |
| Desired-state distribution | CM receives the desired state for the entire Unit and distributes service assignments to the appropriate Nodes |
| Status aggregation | CM collects status from all Nodes' Service Managers and reports a unified UnitStatus to the cloud |
| Configuration management | CM manages the Unit configuration and pushes per-Node configurations to each Node |
| Update orchestration | CM coordinates firmware and service updates across all Nodes |
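The distribution and aggregation responsibilities can be pictured with a small sketch. This is not the CM implementation: `ServiceAssignment`, `NodeStatus`, `distribute`, and `aggregate` are hypothetical names, and the sketch assumes each service already has a target Node chosen for it.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceAssignment:
    service_id: str
    node_id: str  # hypothetical: the Node this service is assigned to


@dataclass
class NodeStatus:
    node_id: str
    running: list[str] = field(default_factory=list)


def distribute(desired_state: list[ServiceAssignment]) -> dict[str, list[str]]:
    """Split a Unit-wide desired state into per-Node service lists."""
    per_node: dict[str, list[str]] = {}
    for item in desired_state:
        per_node.setdefault(item.node_id, []).append(item.service_id)
    return per_node


def aggregate(statuses: list[NodeStatus]) -> dict[str, list[str]]:
    """Merge per-Node statuses into one Unit-level view keyed by Node ID."""
    return {status.node_id: sorted(status.running) for status in statuses}
```

In this toy model the cloud would only ever see the input of `distribute` and the output of `aggregate`; the per-Node split stays internal to the Unit.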

Secondary Nodes run their own Service Manager (SM) and Identity and Access Manager (IAM) locally, but rely on the Main Node's CM for cloud interaction and coordination.

How Nodes Communicate

Nodes within a Unit communicate through two mechanisms:

SM-to-CM Connection (Service Coordination)

Each Node's Service Manager connects to the Main Node's CM via gRPC. Through this connection:

  • SM registers itself and reports available resources, runtimes, and running instances
  • CM pushes service deployment commands (run, stop, update) to the Node
  • SM sends back status, monitoring data, alerts, and logs
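The exchange listed above can be sketched as a toy, in-process model. It deliberately avoids gRPC, and every name here (`Registration`, `Command`, `Status`, `ToyServiceManager`) is an assumption made for illustration, not part of the AosEdge API.

```python
from dataclasses import dataclass


@dataclass
class Registration:
    node_id: str
    runtimes: list[str]
    running: list[str]


@dataclass
class Command:
    action: str  # "run", "stop", or "update"
    service_id: str


@dataclass
class Status:
    node_id: str
    running: list[str]


class ToyServiceManager:
    """Hypothetical stand-in for a Node's SM, tracking instances in memory."""

    def __init__(self, node_id: str, runtimes: list[str]):
        self.node_id = node_id
        self.runtimes = runtimes
        self.instances: set[str] = set()

    def register(self) -> Registration:
        # First message to CM: who this Node is and what it can run.
        return Registration(self.node_id, self.runtimes, sorted(self.instances))

    def handle(self, cmd: Command) -> Status:
        # Apply a CM command, then report the resulting state back.
        if cmd.action in ("run", "update"):
            self.instances.add(cmd.service_id)
        elif cmd.action == "stop":
            self.instances.discard(cmd.service_id)
        else:
            raise ValueError(f"unknown action: {cmd.action}")
        return Status(self.node_id, sorted(self.instances))
```

A real SM would also stream monitoring data, alerts, and logs; the sketch covers only registration, commands, and status.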

On the Main Node, SM connects to CM locally. On Secondary Nodes, the Message Proxy (MP) bridges this connection across the inter-Node transport.

IAM Registration Stream (Node Management)

Each Secondary Node's IAM maintains a persistent bidirectional gRPC stream (RegisterNode) to the Main Node's IAM. This stream carries:

  • Node identity and state information
  • Provisioning and certificate operations
  • Pause/resume commands from the cloud
  • Connection state tracking
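One way to picture connection state tracking over a long-lived stream is to tie a Node's "connected" flag to the lifetime of the stream itself. The sketch below models that idea with a context manager; `ToyNodeRegistry` and its methods are hypothetical and do not reflect the actual RegisterNode message format.

```python
from contextlib import contextmanager


class ToyNodeRegistry:
    """Hypothetical Main-Node-side view of Secondary Node streams."""

    def __init__(self):
        self.nodes: dict[str, dict] = {}

    @contextmanager
    def stream(self, node_id: str, node_type: str):
        # Opening the stream registers the Node and marks it connected.
        self.nodes[node_id] = {"type": node_type, "connected": True, "paused": False}
        try:
            yield self
        finally:
            # Closing the stream, cleanly or on error, marks it disconnected.
            self.nodes[node_id]["connected"] = False

    def set_paused(self, node_id: str, paused: bool) -> None:
        # Pause/resume only makes sense while the stream is open.
        node = self.nodes[node_id]
        if not node["connected"]:
            raise RuntimeError(f"node {node_id} is not connected")
        node["paused"] = paused
```

Because the flag is cleared in the `finally` branch, a dropped stream and a clean shutdown look the same to the registry.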

The Message Proxy's Role

The Message Proxy (MP) runs on Secondary Nodes and bridges the gap between the local components and the Main Node's CM. From the cloud's perspective, the Unit is a single entity — MP is purely an internal implementation detail that enables multi-Node operation.

MP provides:

  • Message routing — relays CM commands to the local SM and forwards status back to the Main Node
  • File distribution — downloads service images from CM's file server and delivers them to the local SM
  • Transport abstraction — supports multiple inter-Node transports (Xen vchan for virtualized environments, TCP sockets for networked boards)
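The transport abstraction can be illustrated with a minimal interface that hides how bytes move between Nodes. This is a sketch under assumptions: `Transport`, `LoopbackTransport`, and `relay` are hypothetical names, and the in-memory loopback stands in for a real Xen vchan or TCP socket transport.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Callable


class Transport(ABC):
    """Hypothetical interface that a vchan or TCP transport would implement."""

    @abstractmethod
    def send(self, payload: bytes) -> None: ...

    @abstractmethod
    def recv(self) -> bytes: ...


class LoopbackTransport(Transport):
    """In-memory stand-in used only to exercise the interface."""

    def __init__(self):
        self._queue: deque[bytes] = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload)

    def recv(self) -> bytes:
        # Messages come out in the order they were sent.
        return self._queue.popleft()


def relay(transport: Transport, deliver: Callable[[bytes], None]) -> None:
    """Read one message from the transport and hand it to a local callback."""
    deliver(transport.recv())
```

Code written against `Transport` does not need to change when the underlying link does, which is the point of the abstraction.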

Component Distribution Summary

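| Component | Main Node | Secondary Node | Purpose |
| --- | --- | --- | --- |
| CM | Yes | No | Cloud connectivity, orchestration |
| SM | Yes | Yes | Local service lifecycle management |
| IAM | Yes | Yes | Per-Node identity and certificates |
| MP | No | Yes | Inter-Node communication bridge |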
