Version: v1.1

Desired State Model

Introduction

AosCore operates on a declarative desired-state model. Rather than receiving imperative commands ("start service X", "stop service Y"), the system receives a complete description of what should be running and continuously reconciles toward that target. This page documents the desired-state data structure, the reconciliation state machine within the Communication Manager (CM), and how the system converges even after failures, restarts, or network interruptions.

Understanding this model is essential for OEMs because it defines how workload changes propagate from the cloud to the edge — every service deployment, configuration change, and firmware update flows through this mechanism.

Desired State Structure

The desired state is a JSON message sent from AosCloud over the WebSocket connection. It describes the complete target configuration for the Unit. The CM deserializes this into a DesiredStatus structure containing the following sections:

| Field | Type | Description |
|---|---|---|
| Nodes | Array of node states | Target state for each Node (provisioned or paused) |
| Unit Config | Optional config object | New Unit-level configuration (version, vendor config) |
| Deployable Items | Array of item descriptors | Service images and firmware components that should be present |
| Instances | Array of desired instances | Which services should run, how many replicas, with what priority and labels |
| Subjects | Array of subject info | Identity subjects (users/services) associated with instances |
| Certificates | Array of certificate info | Certificates needed for verification of Deployable Items |
| Certificate Chains | Array of chain info | Certificate chains for trust validation |

Desired Instance Info

Each entry in the Instances array specifies:

  • Item ID — identifies which Deployable Item (service image) to run
  • Subject ID — the identity subject under which the instance runs
  • Priority — scheduling priority (higher priority instances are placed first)
  • Num Instances — how many replicas of this service should run across the Unit
  • Labels — placement constraints that match against Node labels

Desired Node State

Each Node can be in one of two desired states:

| State | Description |
|---|---|
| Provisioned | Node is active and should run assigned instances |
| Paused | Node is intentionally idle; no instances should be scheduled to it |
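The structure described above can be pictured with a small data model. This is a minimal sketch only: the class names, field names, and value spellings below are hypothetical and chosen for illustration, not taken from the AosCore wire format or source code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class NodeState(Enum):
    """The two desired Node states named in this page."""
    PROVISIONED = "provisioned"
    PAUSED = "paused"


@dataclass(frozen=True)
class DesiredNodeState:
    node_id: str          # hypothetical field name
    state: NodeState


@dataclass(frozen=True)
class DesiredInstance:
    item_id: str          # which Deployable Item to run
    subject_id: str       # identity subject the instance runs under
    priority: int         # higher values are placed first
    num_instances: int    # replicas across the whole Unit
    labels: tuple = ()    # placement constraints matched against Node labels


@dataclass
class DesiredStatus:
    """Illustrative container mirroring the sections in the table above."""
    nodes: list = field(default_factory=list)
    unit_config: Optional[dict] = None
    items: list = field(default_factory=list)
    instances: list = field(default_factory=list)
    subjects: list = field(default_factory=list)
    certificates: list = field(default_factory=list)
    certificate_chains: list = field(default_factory=list)


def schedulable_nodes(status: DesiredStatus) -> list:
    """Return the IDs of Nodes that may receive instances (not paused)."""
    return [n.node_id for n in status.nodes if n.state is NodeState.PROVISIONED]


def total_replicas(status: DesiredStatus) -> int:
    """Sum the requested replica count over all desired instances."""
    return sum(i.num_instances for i in status.instances)
```

Because the message describes the complete target, a consumer of such a structure never needs to know what was running before; it only needs the current snapshot and this one.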

Reconciliation Flow Diagram

The following diagram shows the end-to-end reconciliation flow from cloud desired-state reception through convergence:

[Diagram: end-to-end reconciliation flow, from desired-state reception to convergence]

Reconciliation State Machine

When the CM receives a new desired state, the Update Manager's DesiredStatusHandler drives reconciliation through a sequential state machine. The current state is persisted to storage at each transition so the process can resume after a restart.

Update States

| State | Action | Next State |
|---|---|---|
| Downloading | Download all Deployable Items specified in the desired state via the Image Manager | Pending |
| Pending | Transition point: ready to apply changes | Installing |
| Installing | Apply Node state changes (pause/resume) and Unit configuration updates | Launching |
| Launching | Build run-instance requests and invoke the Launcher to start/stop instances on Nodes | WaitingActive |
| WaitingActive | Wait for all instances to transition out of the Activating state (timeout: 10 minutes) | Finalizing |
| Finalizing | Commit Deployable Items as installed, clean up obsolete items | None (complete) |

After reaching the None state (reconciliation complete), the Unit Status Handler sends a full status report to AosCloud.
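The sequence of update states can be expressed as a simple transition table. The sketch below is illustrative: the state names come from this page, while `UpdateState`, `NEXT_STATE`, and `run` are hypothetical names, and the per-state actions are placeholders rather than the real handler logic.

```python
from enum import Enum


class UpdateState(Enum):
    NONE = "None"
    DOWNLOADING = "Downloading"
    PENDING = "Pending"
    INSTALLING = "Installing"
    LAUNCHING = "Launching"
    WAITING_ACTIVE = "WaitingActive"
    FINALIZING = "Finalizing"


# Strictly sequential: every state has exactly one successor.
NEXT_STATE = {
    UpdateState.DOWNLOADING: UpdateState.PENDING,
    UpdateState.PENDING: UpdateState.INSTALLING,
    UpdateState.INSTALLING: UpdateState.LAUNCHING,
    UpdateState.LAUNCHING: UpdateState.WAITING_ACTIVE,
    UpdateState.WAITING_ACTIVE: UpdateState.FINALIZING,
    UpdateState.FINALIZING: UpdateState.NONE,
}


def run(start: UpdateState, actions: dict) -> list:
    """Run from `start` until NONE, calling the action registered per state.

    `actions` maps a state to a zero-argument callable (assumed interface);
    states without an entry are treated as no-ops.
    """
    visited = []
    state = start
    while state is not UpdateState.NONE:
        actions.get(state, lambda: None)()
        visited.append(state)
        state = NEXT_STATE[state]
    return visited
```

Starting `run` from any intermediate state walks only the remaining phases, which is the property the persistence mechanism relies on.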

State Persistence and Recovery

The desired status and current update state are persisted to storage at each transition:

  1. When a new desired state arrives, it is stored before processing begins
  2. Each state transition is written to storage
  3. On restart, the handler reads the stored state and resumes from where it left off

This ensures that even if the Unit loses power mid-update, reconciliation continues from the last completed phase rather than restarting from scratch.
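A minimal sketch of this persist-then-resume pattern follows. It assumes a JSON file as the storage backend and an atomic rename for durability; both are illustrative choices, and `save`, `load`, `reconcile`, and `SimulatedCrash` are hypothetical names, not AosCore APIs.

```python
import json
import os
from pathlib import Path

PHASES = ["Downloading", "Pending", "Installing",
          "Launching", "WaitingActive", "Finalizing"]


class SimulatedCrash(Exception):
    """Stands in for a power loss in the middle of a phase."""


def save(path: Path, desired: dict, state: str) -> None:
    """Write desired state and current phase, replacing the file atomically."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps({"desired": desired, "state": state}))
    os.replace(tmp, path)


def load(path: Path):
    """Return the stored record, or None if nothing was ever stored."""
    return json.loads(path.read_text()) if path.exists() else None


def reconcile(path: Path, new_desired=None, crash_in=None) -> list:
    """Run the remaining phases and return the ones executed in this call."""
    if new_desired is not None:
        # Step 1: store the desired state before any processing begins.
        save(path, new_desired, PHASES[0])
    record = load(path)
    if record is None or record["state"] == "None":
        return []  # nothing pending
    executed = []
    # Step 3: resume from the stored phase rather than from the beginning.
    for index in range(PHASES.index(record["state"]), len(PHASES)):
        phase = PHASES[index]
        if phase == crash_in:
            raise SimulatedCrash(phase)
        executed.append(phase)  # the real work of the phase would happen here
        next_state = PHASES[index + 1] if index + 1 < len(PHASES) else "None"
        # Step 2: record every transition.
        save(path, record["desired"], next_state)
    return executed
```

In this sketch a phase interrupted partway is re-run on restart, so each phase's work would need to be safe to repeat.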

Comparison and Change Detection

Before starting the reconciliation state machine, the CM determines whether an update is actually needed. The comparison logic checks:

  1. Node state changes — are any Nodes being paused or resumed?
  2. Unit configuration — is there a new Unit config version?
  3. Deployable Item changes — are there new items to download, or installed items that should be removed?
  4. Instance changes — do the desired instances differ from what is currently running?

For Deployable Items, the comparison checks each desired item against the currently installed items by ID, type, and version. If any desired item is not installed, or any installed item is not in the desired set, an update is required.

For instances, the comparison verifies that each desired instance (by item ID, subject ID, and instance index) exists in the current running set, and that no running instances exist that are not in the desired set.

If no changes are detected, the desired state is acknowledged without triggering the state machine.
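The item and instance comparisons described above amount to set equality over identifying keys. The following sketch assumes items are keyed by (ID, type, version) and instances by (item ID, subject ID, instance index), as stated in the text; the type names and the `update_required` function are hypothetical.

```python
from typing import NamedTuple


class Item(NamedTuple):
    item_id: str
    item_type: str
    version: str


class InstanceIdent(NamedTuple):
    item_id: str
    subject_id: str
    index: int


def expand(desired) -> set:
    """Turn (item_id, subject_id, num_instances) triples into per-index idents."""
    return {
        InstanceIdent(item_id, subject_id, index)
        for item_id, subject_id, count in desired
        for index in range(count)
    }


def update_required(desired_items, installed_items,
                    desired_instances, running_instances,
                    nodes_changed=False, unit_config_changed=False) -> bool:
    """Return True if any of the four checks detects a difference."""
    if nodes_changed or unit_config_changed:
        return True
    # Missing desired items and leftover installed items both count.
    if set(desired_items) != set(installed_items):
        return True
    # Same rule for instances: no missing and no extra entries.
    return set(desired_instances) != set(running_instances)
```

Set equality covers both directions at once: a desired entry that is absent and a present entry that is no longer desired each make the sets differ.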

Cancellation and Superseding Updates

A new desired state can arrive while a previous reconciliation is in progress. The system handles this as follows:

  1. If the new desired state is identical to the one being processed, it is ignored
  2. If the new desired state differs, the current update is cancelled:
    • If in the Downloading or Installing state, the Image Manager's download operations are cancelled
  3. The new desired state replaces the pending one, the state machine resets to Downloading, and reconciliation restarts

This ensures the Unit always converges toward the most recent desired state from the cloud, even if intermediate states were partially applied.
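The decision logic for an incoming desired state can be sketched as below. The `Reconciler` class and its return values are hypothetical, and download cancellation is reduced to a counter; the change-detection step that runs before the state machine starts is deliberately left out.

```python
class Reconciler:
    """Toy model of latest-wins handling for incoming desired states."""

    def __init__(self):
        self.desired = None
        self.state = "None"            # "None" means no update in progress
        self.cancelled_downloads = 0   # stands in for Image Manager cancellation

    def on_desired_state(self, new_desired) -> str:
        in_progress = self.state != "None"
        if in_progress and new_desired == self.desired:
            return "ignored"           # duplicate of the update being processed
        if in_progress:
            if self.state in ("Downloading", "Installing"):
                self.cancelled_downloads += 1
            outcome = "superseded"
        else:
            outcome = "started"
        # The newest desired state always wins and restarts from the top.
        self.desired = new_desired
        self.state = "Downloading"
        return outcome
```

For example, two different desired states arriving back to back yield `"started"` followed by `"superseded"`, and only the second one is reconciled to completion.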

Component Interactions During Reconciliation

CM → Image Manager (Downloading Phase)

The CM's Image Manager handles downloading and verifying Deployable Items:

  • Downloads OCI image indexes and layers from the configured image registry
  • Verifies image integrity using the certificates and certificate chains from the desired state
  • Reports per-item status (success or failure with error details)

CM → Node Handler (Installing Phase)

For Node state changes, the CM instructs the IAM client's Node Handler to pause or resume Nodes:

  • Pause — the Node stops accepting new instance assignments
  • Resume — the Node returns to active scheduling

CM → Unit Config (Installing Phase)

If the desired state includes a new Unit configuration:

  1. The configuration is validated (CheckUnitConfig)
  2. If valid, it is applied (UpdateUnitConfig)
  3. The result (installed or failed) is recorded for status reporting
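The validate-then-apply sequence might look like the sketch below. CheckUnitConfig and UpdateUnitConfig are named in this page, but their signatures are not, so they are passed in here as plain callables that raise on failure; that calling convention and the shape of the result record are assumptions.

```python
def apply_unit_config(config: dict, check_unit_config, update_unit_config) -> dict:
    """Validate, then apply, and return a record suitable for status reporting.

    Both callables take the config and raise an exception on failure
    (assumed convention for this sketch).
    """
    result = {"version": config.get("version"), "state": "installed", "error": None}
    try:
        check_unit_config(config)    # step 1: validation only, no side effects
        update_unit_config(config)   # step 2: runs only if validation passed
    except Exception as err:
        # Step 3: record the failure instead of propagating it.
        result["state"] = "failed"
        result["error"] = str(err)
    return result
```

Separating the check from the update means an invalid configuration is rejected before anything on the Unit changes.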

CM → Launcher (Launching Phase)

The Launcher receives a complete list of RunInstanceRequest entries built from the desired state:

  • Each request includes the item ID, version, owner, subject, priority, number of instances, and labels
  • The Launcher handles scheduling instances to Nodes and communicating with each Node's Service Manager
  • Instance statuses are returned immediately (instances may still be in the Activating state)
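As an illustration of how such requests could be assembled, the sketch below joins each desired instance with its item descriptor to fill in the version and owner. The `RunInstanceRequest` fields follow the list above, but the field names, the dictionary-based inputs, and the priority ordering applied here are assumptions; actual placement is the Launcher's responsibility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunInstanceRequest:
    item_id: str
    version: str
    owner_id: str
    subject_id: str
    priority: int
    num_instances: int
    labels: tuple = ()


def build_requests(desired_instances: list, items_by_id: dict) -> list:
    """Build one request per desired instance entry.

    `desired_instances` holds dicts with item_id, subject_id, priority,
    num_instances and optional labels; `items_by_id` maps an item ID to a
    dict with version and owner_id (hypothetical shapes for this sketch).
    """
    requests = []
    for inst in desired_instances:
        item = items_by_id[inst["item_id"]]  # KeyError if the item is unknown
        requests.append(RunInstanceRequest(
            item_id=inst["item_id"],
            version=item["version"],
            owner_id=item["owner_id"],
            subject_id=inst["subject_id"],
            priority=inst["priority"],
            num_instances=inst["num_instances"],
            labels=tuple(inst.get("labels", ())),
        ))
    # Highest priority first; sorted() is stable, so ties keep input order.
    return sorted(requests, key=lambda r: r.priority, reverse=True)
```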

SM → CM (WaitingActive Phase)

The Service Manager on each Node reports instance status changes back to the CM via a listener callback. The CM waits until all instances have transitioned from Activating to either Active or Failed, with a 10-minute timeout.
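One way to model this wait is a condition variable that the status callback signals. The sketch below uses Python's `threading.Condition`; the `ActivationWaiter` class and the status strings are hypothetical, and only the 10-minute default comes from this page.

```python
import threading


class ActivationWaiter:
    """Blocks until no tracked instance is still in the Activating state."""

    def __init__(self, instance_ids):
        self._status = {instance_id: "Activating" for instance_id in instance_ids}
        self._cond = threading.Condition()

    def on_status_changed(self, instance_id: str, status: str) -> None:
        """Listener callback: record the new status and wake any waiter."""
        with self._cond:
            self._status[instance_id] = status
            self._cond.notify_all()

    def wait(self, timeout: float = 600.0) -> bool:
        """Return True once all instances left Activating, False on timeout."""
        def settled() -> bool:
            return all(s != "Activating" for s in self._status.values())

        with self._cond:
            return self._cond.wait_for(settled, timeout=timeout)
```

Note that a Failed instance also ends the wait: the phase waits for a final status per instance, not for success.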

Status Reporting

After reconciliation completes (or fails), the Unit Status Handler sends a comprehensive status report to AosCloud containing:

  • Node information — current state and capabilities of each Node
  • Deployable Item statuses — installed version and state of each item
  • Instance statuses — running state of each instance across all Nodes
  • Unit config status — current configuration version and any errors
  • Update Node statuses — per-Node errors from the reconciliation process

The status report is also sent when the cloud connection is established (or re-established), ensuring the cloud always has an accurate view of the Unit's current state.

Convergence Guarantees

The desired-state model provides the following convergence properties:

| Property | Mechanism |
|---|---|
| Crash recovery | State persisted at each phase; resumes on restart |
| Network resilience | Desired state stored locally; reconciliation continues offline |
| Idempotency | Duplicate desired states are detected and ignored |
| Monotonic progress | Each phase completes fully before advancing; partial states are not committed |
| Latest-wins | New desired state supersedes in-progress reconciliation |
| Status visibility | Full status report sent after each reconciliation and on reconnect |