Version: v1.1

Unit and Node Model

Introduction

AosEdge organizes edge computing resources into a two-level hierarchy: Units and Nodes. This model enables AosCore to manage heterogeneous hardware — from single-board deployments to complex multi-board systems — through a consistent abstraction.

This page explains what constitutes a Unit, what constitutes a Node, how Nodes are identified and classified, and how AosCore components are distributed across the Nodes within a Unit. Understanding this model is essential for configuring multi-Node deployments and interpreting system status information.

The Unit

A Unit is the complete AosEdge edge system — the top-level entity that the cloud manages as a single logical endpoint. A Unit:

  • Has a unique identity (system_id) and a model designation (unit_model)
  • Connects to AosCloud as a single entity
  • Receives desired-state updates as a whole
  • Reports its status (all Nodes, all services, all components) back to the cloud as a unified UnitStatus

The Unit is the management boundary. The cloud interacts with Units, not with individual Nodes: it sends a desired state describing what should run across the entire Unit, and the Unit's internal coordination (handled by the Communication Manager) distributes work to the appropriate Nodes.

Unit Identity

The Unit is identified by the SystemInfo structure:

Field      | Description
system_id  | Unique identifier for this Unit instance
unit_model | Hardware/platform model designation (e.g., a product SKU)

The system_id is established during provisioning and remains stable for the lifetime of the Unit. The unit_model describes the hardware platform type, allowing the cloud to target deployments to specific hardware configurations.

Unit Configuration

The Unit's configuration (UnitConfig) defines the set of Nodes that belong to the Unit and their per-Node settings:

Field          | Description
format_version | Configuration format version for compatibility
version        | Configuration revision (incremented on each update from cloud)
nodes[]        | Array of NodeConfig entries, one per Node in the Unit

The Unit configuration is managed by the Communication Manager (CM) and distributed to all Nodes. When the cloud sends a new Unit configuration, CM validates it, stores it, and pushes the relevant NodeConfig to each Node.

The Unit configuration can be in one of three states:

State     | Meaning
absent    | No configuration has been applied yet
installed | Configuration is active and applied to all Nodes
failed    | Configuration could not be applied (error recorded)
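
The validate, store, and push flow and the three configuration states can be sketched roughly as follows. This is an illustrative Python model, not AosCore code: the class names, the apply_unit_config function, and the "every entry must target a known Node" validation rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Set


class UnitConfigState(Enum):
    ABSENT = "absent"        # nothing applied yet
    INSTALLED = "installed"  # active on all Nodes
    FAILED = "failed"        # could not be applied


@dataclass
class NodeConfig:
    node_id: str
    node_type: str


@dataclass
class UnitConfig:
    format_version: str
    version: str
    nodes: List[NodeConfig] = field(default_factory=list)


@dataclass
class UnitConfigStatus:
    state: UnitConfigState = UnitConfigState.ABSENT
    version: str = ""
    error: str = ""


def apply_unit_config(
    config: UnitConfig,
    known_node_ids: Set[str],
    push: Callable[[NodeConfig], None],
) -> UnitConfigStatus:
    """Validate the config, then push each NodeConfig to its Node."""
    # Hypothetical validation rule: every entry must target a known Node.
    unknown = [n.node_id for n in config.nodes if n.node_id not in known_node_ids]
    if unknown:
        return UnitConfigStatus(
            UnitConfigState.FAILED, config.version, f"unknown nodes: {unknown}"
        )
    for node_config in config.nodes:
        push(node_config)
    return UnitConfigStatus(UnitConfigState.INSTALLED, config.version)
```

Before any configuration arrives, a default UnitConfigStatus() reports the absent state; a rejected configuration keeps its error text alongside the failed state.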

The Node

A Node is a computing element within a Unit. Each Node represents a distinct execution environment — typically a physical board, a virtual machine, or a hardware partition — that runs AosCore components and hosts service instances.

Node Identification

Every Node is identified by two key properties:

Property  | Description
node_id   | Unique identifier for this Node within the Unit
node_type | Classification of the Node's role or hardware type

The node_id uniquely distinguishes each Node. The node_type groups Nodes by capability — for example, a Unit might have Nodes of type "main" and "compute", allowing the cloud to target specific services to specific Node types during scheduling.
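
As a small illustration of how node_type groups Nodes, the sketch below maps each type to the node_id values that carry it. The function name and the sample Node IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_nodes_by_type(nodes: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each node_type to the node_ids that carry it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for node_id, node_type in nodes:
        groups[node_type].append(node_id)
    return dict(groups)


# Hypothetical Unit with one "main" Node and two "compute" Nodes.
unit_nodes = [("node-0", "main"), ("node-1", "compute"), ("node-2", "compute")]
compute_targets = group_nodes_by_type(unit_nodes).get("compute", [])
```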

Node Information

Each Node reports comprehensive information about its capabilities through the NodeInfo structure:

Field        | Description
node_id      | Unique Node identifier
node_type    | Node classification
title        | Human-readable Node name
max_dmips    | Maximum processing capacity (Dhrystone MIPS)
total_ram    | Total RAM available
os_info      | Operating system type, version, and features
cpus[]       | CPU information (model, cores, threads, architecture)
partitions[] | Storage partitions (name, types, path, total size)
attrs[]      | Node attributes (key-value pairs for role designation)
state        | Current Node state
is_connected | Whether the Node is currently reachable

This information is collected by the Identity and Access Manager (IAM) on each Node and reported to the Communication Manager on the main Node, which aggregates it into the Unit status.

Node States

A Node is in one of the following states:

State         | Description
unprovisioned | Node has not completed initial provisioning (no certificates, no identity)
provisioned   | Node is provisioned and operational
paused        | Node is intentionally paused (not accepting new workloads)
error         | Node has encountered an error condition

State transitions are managed by the IAM node manager. The cloud can pause and resume Nodes through the IAMNodesService interface (PauseNode / ResumeNode RPCs).
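
The states and the pause/resume operations can be modeled as below. This is a sketch only: the page does not list which transitions are legal, so the rules here (pause only from provisioned, resume only from paused) are assumptions, and pause_node / resume_node are hypothetical stand-ins for the RPCs rather than their real signatures.

```python
from enum import Enum


class NodeState(Enum):
    UNPROVISIONED = "unprovisioned"
    PROVISIONED = "provisioned"
    PAUSED = "paused"
    ERROR = "error"


def pause_node(state: NodeState) -> NodeState:
    """Assumed rule: only a provisioned Node can be paused."""
    if state is not NodeState.PROVISIONED:
        raise ValueError(f"cannot pause a Node in state {state.value!r}")
    return NodeState.PAUSED


def resume_node(state: NodeState) -> NodeState:
    """Assumed rule: only a paused Node can be resumed."""
    if state is not NodeState.PAUSED:
        raise ValueError(f"cannot resume a Node in state {state.value!r}")
    return NodeState.PROVISIONED
```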

Node Attributes

Node attributes are key-value pairs that define a Node's role within the Unit. Two attributes have special significance:

Attribute     | Purpose
MainNode      | Designates this Node as the main Node (runs CM)
AosComponents | Comma-separated list of AosCore components running on this Node (e.g., "CM,SM,IAM")

The MainNode attribute is checked by the IsMainNode() method in the codebase. The AosComponents attribute is parsed by ContainsComponent() to determine which core components (CM, SM, IAM, MP) are present on a given Node.
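
A rough Python analogue of these two checks is shown below. It is not the codebase implementation: attributes are modeled as a plain dictionary, and the trimming and case-sensitive comparison of AosComponents entries are assumptions.

```python
from typing import Dict

MAIN_NODE_ATTR = "MainNode"
AOS_COMPONENTS_ATTR = "AosComponents"


def is_main_node(attrs: Dict[str, str]) -> bool:
    """A Node is the main Node when the MainNode attribute is present."""
    return MAIN_NODE_ATTR in attrs


def contains_component(attrs: Dict[str, str], component: str) -> bool:
    """Check the comma-separated AosComponents list for one component."""
    listed = attrs.get(AOS_COMPONENTS_ATTR, "")
    # Assumption: entries are trimmed and compared case-sensitively.
    return component in [item.strip() for item in listed.split(",") if item.strip()]
```

For a Node whose attributes are {"MainNode": "", "AosComponents": "CM,SM,IAM"}, is_main_node returns True, contains_component(..., "CM") returns True, and contains_component(..., "MP") returns False.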

Main Node vs. Secondary Nodes

The main Node is the Node that runs the Communication Manager (CM). It serves as the coordination point for the entire Unit:

  • Maintains the connection to AosCloud
  • Receives desired-state updates and distributes them to other Nodes
  • Aggregates status from all Nodes into a unified UnitStatus
  • Manages the Unit configuration and pushes per-Node configurations
  • Coordinates update orchestration across all Nodes

A Unit has exactly one main Node. The main Node is identified by the presence of the MainNode attribute in its NodeInfo.attrs[] array.
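
The "exactly one main Node" rule lends itself to a simple check. The helper below is a hypothetical sketch that takes a mapping from node_id to that Node's attributes and raises if the Unit has zero or several main Nodes.

```python
from typing import Dict, List


def find_main_node(nodes: Dict[str, Dict[str, str]]) -> str:
    """Return the node_id of the single main Node, or raise ValueError."""
    main_ids: List[str] = [
        node_id for node_id, attrs in nodes.items() if "MainNode" in attrs
    ]
    if len(main_ids) != 1:
        raise ValueError(f"expected exactly one main Node, found {len(main_ids)}")
    return main_ids[0]
```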

Secondary Nodes are all other Nodes in the Unit. They:

  • Run SM and IAM locally
  • Connect to the main Node's CM via internal communication channels
  • Report their status (Node info, service instances, monitoring data) to CM
  • Receive commands from CM (run instances, apply configuration, update components)

Component Distribution Across Nodes

AosCore consists of four core components. Their distribution across Nodes follows a specific pattern:

Component                         | Runs on                                   | Role
Communication Manager (CM)        | Main Node only                            | Cloud connectivity, update orchestration, Unit-level coordination
Service Manager (SM)              | Every Node                                | Service lifecycle management, image handling, resource management
Identity and Access Manager (IAM) | Every Node                                | Certificate management, Node identity, provisioning
Message Proxy (MP)                | Nodes requiring inter-Node message routing | Routes messages between Nodes and between Nodes and CM
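
The distribution pattern in the table can be expressed as a consistency check over a planned layout. This is a hypothetical validation helper, not something AosCore is known to ship; MP is left unchecked because its placement depends on routing needs.

```python
from typing import Dict, List, Set, Tuple

REQUIRED_ON_EVERY_NODE = {"SM", "IAM"}


def check_component_distribution(
    nodes: Dict[str, Tuple[bool, Set[str]]],
) -> List[str]:
    """Return a list of problems; an empty list means the layout is consistent.

    `nodes` maps node_id to (is_main, set of component names on that Node).
    """
    problems: List[str] = []
    for node_id, (is_main, components) in sorted(nodes.items()):
        missing = REQUIRED_ON_EVERY_NODE - components
        if missing:
            problems.append(f"{node_id}: missing {', '.join(sorted(missing))}")
        if is_main and "CM" not in components:
            problems.append(f"{node_id}: main Node without CM")
        if not is_main and "CM" in components:
            problems.append(f"{node_id}: CM on a secondary Node")
    return problems
```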

Why SM and IAM Run on Every Node

Each Node needs its own SM because services execute locally — SM manages the container runtime, storage, networking, and monitoring for services running on that specific Node.

Each Node needs its own IAM because cryptographic identity is per-Node — each Node has its own certificates, manages its own key material (potentially in a local hardware security module), and handles its own provisioning lifecycle.

How CM Coordinates Multiple Nodes

The CM on the main Node uses the SM Controller module to manage connections to SM instances on all Nodes. When a secondary Node's SM connects to CM:

  1. CM receives the Node's SM info (available resources, runtimes, running instances)
  2. CM's Node Info Provider combines this with IAM-reported Node info to build a complete picture
  3. CM can then send commands to that Node's SM (run instances, stop instances, apply network config)

If a secondary Node's SM disconnects, CM marks that Node's state as error after a configurable timeout. When the SM reconnects and reports its info, the Node state returns to its IAM-reported state.
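
The disconnect handling described above can be sketched as a small tracker. The class and method names are hypothetical, and time is passed in explicitly so the behavior is easy to follow: the Node keeps its IAM-reported state until the timeout elapses, shows error afterwards, and returns to the IAM-reported state on reconnect.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackedNode:
    iam_state: str                           # state last reported by IAM
    disconnected_at: Optional[float] = None  # None while the SM is connected

    def on_sm_disconnect(self, now: float) -> None:
        # Keep the first disconnect time if several events arrive.
        if self.disconnected_at is None:
            self.disconnected_at = now

    def on_sm_reconnect(self) -> None:
        self.disconnected_at = None

    def state(self, now: float, timeout_s: float) -> str:
        """Effective state as CM would report it under these assumptions."""
        if self.disconnected_at is not None and now - self.disconnected_at >= timeout_s:
            return "error"
        return self.iam_state
```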

Node Configuration

Each Node receives its own configuration (NodeConfig) as part of the Unit configuration:

Field           | Description
node_id         | Which Node this configuration applies to
node_type       | Node type classification
version         | Configuration version
alert_rules     | Monitoring alert thresholds (CPU, RAM, partitions, network)
resource_ratios | Resource allocation ratios (CPU, RAM, storage, state)
labels[]        | Labels for service scheduling and placement
priority        | Node priority for scheduling decisions

The labels field enables service placement constraints — the cloud can specify that a service should only run on Nodes with certain labels. The priority field influences which Node is preferred when multiple Nodes are eligible to run a service.

The resource_ratios field defines what fraction of the Node's physical resources is allocated to AosCore-managed services (as opposed to the host OS or other non-managed workloads).
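
To make the effect of resource_ratios concrete, the sketch below computes the share of each resource available to managed services. It assumes ratios are expressed as fractions between 0 and 1 and that a resource without a ratio is fully available; the actual encoding and defaults may differ, and the function and key names are hypothetical.

```python
from typing import Dict


def allocatable_resources(
    totals: Dict[str, int], ratios: Dict[str, float]
) -> Dict[str, int]:
    """Scale each physical total by its ratio (assumed to be in [0, 1])."""
    result: Dict[str, int] = {}
    for name, total in totals.items():
        # Assumption: a missing ratio means the whole resource is available.
        ratio = ratios.get(name, 1.0)
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"ratio for {name!r} must be between 0 and 1")
        result[name] = int(total * ratio)
    return result


# Hypothetical Node: 10000 DMIPS, 4 GiB RAM, 32 GiB storage.
totals = {"cpu": 10_000, "ram": 4 * 1024**3, "storage": 32 * 1024**3}
shares = allocatable_resources(totals, {"cpu": 0.5, "ram": 0.75})
```

With these sample values, half of the CPU capacity and three quarters of the RAM are available to managed services, while storage is left unrestricted.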

Multi-Node Deployment Scenarios

Single-Node Unit (Simplest Case)

A single board runs all four components (CM, SM, IAM, MP). The Node is both the main Node and the only Node. This is the minimal deployment:

Unit
└── Node (main): runs CM + SM + IAM + MP
    ├── Service A
    ├── Service B
    └── Service C

Two-Node Unit (Typical Embedded)

A primary board handles coordination and connectivity, while a secondary board provides additional compute capacity:

Unit
├── Node 1 (main): runs CM + SM + IAM
│   ├── Service A (management services)
│   └── Service B (connectivity services)
└── Node 2 (secondary): runs SM + IAM + MP
    ├── Service C (compute-intensive workload)
    └── Service D (real-time workload)

MP runs on the secondary Node to route messages between the secondary Node's SM and the main Node's CM.

Multi-Node Unit (Complex System)

A vehicle or industrial system with multiple heterogeneous boards:

Unit
├── Node 1 (main, type: "gateway"): runs CM + SM + IAM
│   └── Connectivity and management services
├── Node 2 (secondary, type: "compute"): runs SM + IAM + MP
│   └── High-performance compute services
└── Node 3 (secondary, type: "sensor"): runs SM + IAM + MP
    └── Sensor processing and data collection services

Each Node type can have different hardware capabilities (CPU architecture, RAM, storage, attached peripherals), and the cloud uses node_type and labels to place services on appropriate Nodes.
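
Placement by node_type, labels, and priority might look like the following. This is a simplified sketch, not the actual scheduler: it assumes a service's required labels must all be present on the Node, that a larger priority value is preferred, and that ties are broken by node_id. The Candidate class and pick_node function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Candidate:
    node_id: str
    node_type: str
    labels: Set[str] = field(default_factory=set)
    priority: int = 0


def pick_node(
    candidates: List[Candidate],
    required_labels: Set[str],
    node_type: Optional[str] = None,
) -> Optional[Candidate]:
    """Filter by type and labels, then prefer the highest priority."""
    eligible = [
        c
        for c in candidates
        if required_labels <= c.labels
        and (node_type is None or c.node_type == node_type)
    ]
    if not eligible:
        return None
    # Assumption: larger priority wins; ties go to the smallest node_id.
    return min(eligible, key=lambda c: (-c.priority, c.node_id))
```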

Node Registration and Discovery

Nodes register with the Unit through the IAM RegisterNode streaming RPC. During registration:

  1. The Node's IAM sends its NodeInfo to the main Node's IAM
  2. The main Node's IAM stores the Node information and notifies listeners
  3. CM's Node Info Provider picks up the new Node and begins tracking it
  4. When the Node's SM connects to CM, the Node becomes fully operational
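
The registration steps above can be modeled in memory as follows. This sketch replaces the streaming RPC with direct method calls; MainNodeRegistry and NodeInfoProvider are hypothetical stand-ins for the main Node's IAM and CM's Node Info Provider, and ignoring an SM connection from an unregistered Node is an assumption.

```python
from typing import Callable, Dict, List

NodeInfo = Dict[str, object]


class MainNodeRegistry:
    """Stand-in for the main Node's IAM: stores NodeInfo, notifies listeners."""

    def __init__(self) -> None:
        self._nodes: Dict[str, NodeInfo] = {}
        self._listeners: List[Callable[[NodeInfo], None]] = []

    def subscribe(self, listener: Callable[[NodeInfo], None]) -> None:
        self._listeners.append(listener)

    def register_node(self, info: NodeInfo) -> None:
        self._nodes[str(info["node_id"])] = info  # step 2: store
        for listener in self._listeners:          # step 2: notify
            listener(info)

    def get_all_node_ids(self) -> List[str]:
        return sorted(self._nodes)


class NodeInfoProvider:
    """Stand-in for CM's tracker: a Node is operational once its SM connects."""

    def __init__(self, registry: MainNodeRegistry) -> None:
        self.operational: Dict[str, bool] = {}
        registry.subscribe(self._on_node_changed)

    def _on_node_changed(self, info: NodeInfo) -> None:
        # Step 3: start tracking; keep the flag if the Node is already known.
        self.operational.setdefault(str(info["node_id"]), False)

    def on_sm_connected(self, node_id: str) -> None:
        # Step 4. Assumption: SM connections from unknown Nodes are ignored.
        if node_id in self.operational:
            self.operational[node_id] = True
```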

The IAMPublicNodesService provides APIs for discovering and managing Nodes:

RPC                  | Purpose
GetAllNodeIDs        | List all registered Node IDs in the Unit
GetNodeInfo          | Get detailed information for a specific Node
SubscribeNodeChanged | Stream notifications when Node info changes
RegisterNode         | Bidirectional stream for Node registration