Version: v1.1

Launcher

Introduction

The Launcher is the Service Manager (SM) module responsible for starting and stopping service instances on a Node. Rather than implementing a single execution model, the Launcher uses a pluggable runtime architecture — each configured runtime handles a specific type of Deployable Item (container services, boot-level firmware, or root filesystem components).

The Launcher does not decide which instances to run. It receives UpdateInstances commands from the SM (originating from the Communication Manager, CM) and delegates execution to the appropriate runtime based on the instance's runtime assignment. It also manages instance status reporting, enforces offline TTLs, and coordinates reboots when required by firmware-level runtimes.

Architecture

The Launcher sits between the SM's control logic and the underlying execution runtimes:

┌─────────────────────────────────────────────────────────┐
│                      SM (smclient)                      │
│                            │                            │
│                     UpdateInstances                     │
│                            ▼                            │
│  ┌──────────────────────────────────────────────────┐   │
│  │                     Launcher                     │   │
│  │                                                  │   │
│  │  • Instance lifecycle management                 │   │
│  │  • Runtime dispatch                              │   │
│  │  • Status aggregation and reporting              │   │
│  │  • Offline TTL enforcement                       │   │
│  │  • Reboot coordination                           │   │
│  └──────┬──────────────┬──────────────┬─────────────┘   │
│         │              │              │                 │
│         ▼              ▼              ▼                 │
│   ┌────────────┐ ┌────────────┐ ┌────────────┐          │
│   │ Container  │ │    Boot    │ │   Rootfs   │          │
│   │  Runtime   │ │  Runtime   │ │  Runtime   │          │
│   └────────────┘ └────────────┘ └────────────┘          │
│         │              │              │                 │
│   systemd units    EFI boot    squashfs images          │
│    (crun/runc)    partitions  + systemd reboot          │
└─────────────────────────────────────────────────────────┘

Key Interfaces

The Launcher implements and consumes several interfaces:

| Interface | Direction | Purpose |
| --- | --- | --- |
| LauncherItf | Implements | Receives UpdateInstances commands (stop/start instance lists) |
| InstanceStatusReceiverItf | Implements | Receives status updates from runtimes |
| RuntimeInfoProviderItf | Implements | Provides runtime capability information to CM during registration |
| InstanceInfoProviderItf | Implements | Provides instance monitoring parameters and data |
| RuntimeItf | Consumes | Delegates instance start/stop to individual runtimes |
| StorageItf | Consumes | Persists current instance state across restarts |
| SenderItf | Consumes | Sends instance status updates to CM |

Instance Lifecycle

Initialization

On SM startup, the Launcher:

  1. Retrieves previously-running instances from persistent storage
  2. Starts each configured runtime
  3. Restarts instances that were active before the previous shutdown
  4. Reports the resulting instance statuses to CM

Instances are processed in parallel using a thread pool, enabling concurrent startup of multiple services.

Update Flow

When the Launcher receives an UpdateInstances command:

  1. Guard check — if a previous launch is still in progress, returns an error (eWrongState)
  2. Stop phase — stops all instances in the stop list by delegating to their assigned runtimes
  3. Cache update — removes stopped instances from local cache and storage
  4. Start phase — starts all instances in the start list:
    • Determines the correct runtime for each instance
    • Delegates StartInstance to the runtime
    • Persists successfully-started instances to storage
  5. Status report — sends the complete Node instance status to CM

Within the stop phase and within the start phase, instances are processed in parallel via a thread pool, as during initialization.
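The update flow can be sketched in Python as follows. This is an illustrative model only, not the Launcher's actual implementation: `LauncherSketch`, `FakeRuntime`, and `WrongStateError` are hypothetical names, a plain dictionary stands in for StorageItf, and the returned list stands in for the status report sent to CM.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class WrongStateError(Exception):
    """Stands in for the eWrongState error returned by the guard check."""


class FakeRuntime:
    """Hypothetical runtime that only records which instances are running."""

    def __init__(self):
        self.running = set()
        self._lock = threading.Lock()

    def start_instance(self, instance_id):
        with self._lock:
            self.running.add(instance_id)

    def stop_instance(self, instance_id):
        with self._lock:
            self.running.discard(instance_id)


class LauncherSketch:
    def __init__(self, runtimes):
        self.runtimes = runtimes  # runtime name -> runtime object
        self.storage = {}         # instance id -> runtime name (stand-in for StorageItf)
        self._launch_lock = threading.Lock()

    def update_instances(self, stop, start):
        """stop: list of instance ids; start: list of (instance id, runtime name)."""
        # Guard check: reject the command while a previous launch is running.
        if not self._launch_lock.acquire(blocking=False):
            raise WrongStateError("launch already in progress")
        try:
            with ThreadPoolExecutor(max_workers=4) as pool:
                # Stop phase: all stops run in parallel and finish first.
                list(pool.map(self._stop, stop))
                # Start phase: begins only after the stop phase is complete.
                list(pool.map(lambda item: self._start(*item), start))
            # Status report: the complete list of instances known to the node.
            return sorted(self.storage)
        finally:
            self._launch_lock.release()

    def _stop(self, instance_id):
        runtime_name = self.storage.get(instance_id)
        if runtime_name is not None:
            self.runtimes[runtime_name].stop_instance(instance_id)
            del self.storage[instance_id]  # cache update

    def _start(self, instance_id, runtime_name):
        self.runtimes[runtime_name].start_instance(instance_id)
        self.storage[instance_id] = runtime_name  # persisted only after a successful start
```

Calling `update_instances(["a"], [("c", "container")])` on a launcher that already runs `a` and `b` leaves `b` and `c` running. The non-blocking lock is one simple way to model the guard check; the real module may track launch state differently.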

Offline TTL

Each instance can have an offline TTL (Time-To-Live) — a duration after which the instance is automatically stopped if the Node loses cloud connectivity. This prevents services from running indefinitely without cloud oversight.

  • When the cloud connection drops, the Launcher starts a TTL timer
  • When the timer expires, instances whose TTL has elapsed are stopped
  • When connectivity is restored, the Launcher reports current statuses to CM
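The expiry decision made when the TTL timer fires can be illustrated with a small sketch. The names `OfflineInstance` and `expired_instances` are hypothetical, and representing the TTL as a number of seconds is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OfflineInstance:
    instance_id: str
    offline_ttl_s: Optional[float]  # None means no offline TTL is configured


def expired_instances(instances, offline_since, now):
    """Return ids of instances whose offline TTL has elapsed.

    offline_since and now are timestamps in seconds taken from the same clock.
    """
    offline_for = now - offline_since
    return [
        inst.instance_id
        for inst in instances
        if inst.offline_ttl_s is not None and offline_for >= inst.offline_ttl_s
    ]
```

Instances without a TTL are never selected, so they keep running regardless of how long the node stays offline.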

Reboot Coordination

Some runtimes (boot, rootfs) require a system reboot to apply updates. The Launcher handles this through a reboot queue:

  1. A runtime signals RebootRequired after preparing an update
  2. The Launcher queues the reboot request
  3. After the current launch operation completes, the Launcher triggers the reboot via the runtime's Reboot() method
  4. After reboot, the runtime verifies the update succeeded (health check)
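A minimal sketch of the reboot queue idea follows. `RebootQueue` and `RebootableStub` are hypothetical names; queueing each runtime at most once and draining the whole queue after the launch are assumptions of this sketch rather than documented behavior.

```python
from collections import deque


class RebootQueue:
    """Collects reboot requests raised during a launch and replays them afterwards."""

    def __init__(self):
        self._pending = deque()

    def request(self, runtime):
        # Called when a runtime signals RebootRequired.
        # Assumption: a runtime is queued at most once.
        if runtime not in self._pending:
            self._pending.append(runtime)

    def run_after_launch(self):
        # Called once the current launch operation has completed.
        rebooted = []
        while self._pending:
            runtime = self._pending.popleft()
            runtime.reboot()
            rebooted.append(runtime.name)
        return rebooted


class RebootableStub:
    """Stand-in for a boot or rootfs runtime; counts reboot calls."""

    def __init__(self, name):
        self.name = name
        self.reboots = 0

    def reboot(self):
        self.reboots += 1
```

The point of the queue is ordering: a reboot requested in the middle of a launch does not interrupt the remaining start and stop operations.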

Pluggable Runtime Architecture

Each runtime is a plugin identified by a string name in the configuration. The Runtimes factory instantiates the appropriate runtime class based on the mPlugin field:

| Plugin Name | Class | Description |
| --- | --- | --- |
| "container" | ContainerRuntime | OCI container execution via systemd |
| "boot" | BootRuntime | EFI boot partition management |
| "rootfs" | RootfsRuntime | Root filesystem image deployment |

All runtimes implement the common RuntimeItf interface:

  • Start() / Stop() — runtime lifecycle
  • GetRuntimeInfo() — reports capabilities (architecture, OS, max instances, DMIPS, RAM)
  • StartInstance() / StopInstance() — instance lifecycle
  • Reboot() — trigger system reboot (supported by boot and rootfs; not supported by container)
  • GetInstanceMonitoringData() — per-instance resource metrics
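The plugin lookup and the common interface can be modeled with a short Python sketch. It mirrors only the instance lifecycle and reboot parts of RuntimeItf. The names `RuntimeSketch`, `PLUGINS`, and `create_runtime` are hypothetical and do not reflect the actual factory code.

```python
from abc import ABC, abstractmethod


class RuntimeSketch(ABC):
    """Python stand-in for part of RuntimeItf (instance lifecycle and reboot only)."""

    def __init__(self):
        self.instances = set()
        self.reboot_requested = False

    def start_instance(self, instance_id):
        self.instances.add(instance_id)

    def stop_instance(self, instance_id):
        self.instances.discard(instance_id)

    @abstractmethod
    def reboot(self):
        """Trigger a system reboot, or raise if the runtime does not support it."""


class ContainerRuntime(RuntimeSketch):
    def reboot(self):
        raise NotImplementedError("container runtime does not support reboot")


class BootRuntime(RuntimeSketch):
    def reboot(self):
        self.reboot_requested = True  # a real runtime would reboot the system


class RootfsRuntime(RuntimeSketch):
    def reboot(self):
        self.reboot_requested = True


# Plugin name (as used in the configuration) -> runtime class.
PLUGINS = {"container": ContainerRuntime, "boot": BootRuntime, "rootfs": RootfsRuntime}


def create_runtime(plugin):
    try:
        return PLUGINS[plugin]()
    except KeyError:
        raise ValueError(f"unknown runtime plugin: {plugin!r}") from None
```

A registry keyed by plugin name keeps the Launcher independent of concrete runtime classes: adding a runtime means adding one entry to the mapping.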

Runtime Configuration

Each runtime is configured with:

{
    "plugin": "container",
    "type": "crun",
    "isComponent": false,
    "workingDir": "/var/aos/sm/launcher/container",
    "config": { /* runtime-specific configuration */ }
}

  • plugin — selects the runtime implementation
  • type — runtime type identifier reported to CM (e.g., "crun", "boot", "rootfs")
  • isComponent — whether this runtime handles component-type items
  • workingDir — base directory for runtime working files
  • config — runtime-specific JSON configuration (parsed by each runtime)
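As an illustration, the configuration object above could be read into a typed structure as shown below. `RuntimeConfig` and `parse_runtime_config` are hypothetical names. Treating `plugin` and `type` as required and the remaining fields as optional is an assumption of this sketch.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RuntimeConfig:
    plugin: str
    type: str
    is_component: bool = False
    working_dir: str = ""
    config: dict = field(default_factory=dict)  # passed through to the runtime unparsed


def parse_runtime_config(text):
    raw = json.loads(text)
    # Assumption: plugin and type are mandatory; the other fields have defaults.
    return RuntimeConfig(
        plugin=raw["plugin"],
        type=raw["type"],
        is_component=raw.get("isComponent", False),
        working_dir=raw.get("workingDir", ""),
        config=raw.get("config", {}),
    )
```

The nested `config` object is kept as an opaque dictionary because each runtime parses its own section.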

Container Runtime

The container runtime is the primary execution environment for standard AosEdge services. It runs OCI-compatible containers, each managed as an instance of a systemd template unit.

Execution Model

Each container instance is managed through a systemd template unit (aos-service@.service):

[Service]
Type=forking
Restart=always
ExecStartPre=/usr/bin/<runner> delete -f %i
ExecStart=/usr/bin/<runner> run -d --pid-file /run/aos/runtime/%i/.pid -b /run/aos/runtime/%i %i
ExecStop=/usr/bin/<runner> kill %i SIGKILL
ExecStopPost=/usr/bin/<runner> delete -f %i

The <runner> is an OCI-compatible container runtime binary (e.g., crun or runc). The Runner module manages systemd unit lifecycle:

  1. Start — creates a systemd drop-in with runtime parameters, then starts the unit
  2. Monitor — polls unit status periodically to detect state changes
  3. Stop — stops the systemd unit, which sends SIGKILL to the container process
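The monitoring step amounts to comparing consecutive polls. The sketch below shows that comparison only. `unit_name` and `changed_units` are hypothetical helpers, and how the Runner actually queries systemd is not covered here.

```python
def unit_name(instance_id):
    """Instance of the aos-service@.service template for the given instance id."""
    return f"aos-service@{instance_id}.service"


def changed_units(previous, current):
    """Return {unit: new_state} for units whose state differs from the previous poll.

    previous and current map unit names to systemd state strings
    such as "active" or "failed".
    """
    return {
        unit: state
        for unit, state in current.items()
        if previous.get(unit) != state
    }
```

Only the changed entries need to be forwarded as status updates, which keeps periodic polling from generating redundant reports.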

Instance Setup

When starting a container instance, the container runtime:

  1. Loads OCI configs — retrieves the image config and item config from the Image Manager
  2. Prepares the root filesystem — creates an overlay mount combining the service image layers with host filesystem binds (default: bin, sbin, lib, lib64, usr)
  3. Generates the OCI runtime config (config.json) including:
    • Process configuration (entrypoint, args, environment variables, working directory)
    • Linux namespaces and cgroup configuration
    • Resource limits (CPU quota/period, memory limit) from the item config
    • Device access rules from the Resource Manager
    • Mount points for state and storage directories
    • Network configuration (hostname, DNS, hosts file)
  4. Sets up networking — delegates to the Network Manager for CNI plugin execution
  5. Registers permissions — registers instance permissions with IAM
  6. Starts the container — instructs the Runner to start the systemd unit
  7. Starts monitoring — begins collecting per-instance resource metrics

Environment Variables

Each container receives standard AosEdge environment variables:

| Variable | Description |
| --- | --- |
| AOS_ITEM_ID | Service item identifier |
| AOS_SUBJECT_ID | Subject (owner) identifier |
| AOS_INSTANCE_INDEX | Instance index within the service |
| AOS_INSTANCE_ID | Unique instance identifier |
| AOS_SECRET | Instance secret for permission-based API access |
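A sketch of how these variables might be merged into the `process` section of the OCI runtime config follows. The `args`, `env`, and `cwd` keys come from the OCI runtime specification. The helper names and the ordering of image-defined variables before AosEdge variables are assumptions.

```python
def aos_environment(item_id, subject_id, instance_index, instance_id, secret):
    """Standard AosEdge variables for one instance, in a fixed order."""
    return {
        "AOS_ITEM_ID": item_id,
        "AOS_SUBJECT_ID": subject_id,
        "AOS_INSTANCE_INDEX": str(instance_index),
        "AOS_INSTANCE_ID": instance_id,
        "AOS_SECRET": secret,
    }


def build_process(entrypoint, args, cwd, image_env, aos_env):
    """Build the OCI config.json "process" section (subset of fields)."""
    return {
        "args": [*entrypoint, *args],
        "cwd": cwd,
        # OCI expects the environment as a list of KEY=value strings.
        "env": [*image_env, *(f"{key}={value}" for key, value in aos_env.items())],
    }
```

For example, `build_process(["/bin/app"], ["--verbose"], "/", ["PATH=/usr/bin"], aos_environment(...))` yields a dictionary that can be serialized with `json.dumps` as part of config.json.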

State and Storage

Container instances can have persistent state and storage:

  • State (/state.dat) — a single file for instance state, preserved across restarts and updates of the same service version
  • Storage (/storage/) — a directory for persistent data, preserved across service version updates

These are backed by dedicated partitions managed by the Resource Manager, mounted into the container at runtime.

Resource Limits

The container runtime enforces resource limits via Linux cgroups:

  • CPU — quota/period-based limiting (minimum quota: 1000µs per 100000µs period)
  • RAM — memory limit in bytes
  • PID — maximum number of processes

Limits are derived from the item configuration provided by the Image Manager.
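The limits map onto the `linux.resources` section of the OCI runtime config (`cpu.quota`, `cpu.period`, `memory.limit`, `pids.limit`). The sketch below assumes, for illustration only, that the CPU limit arrives as a fraction of one CPU; the actual unit used in the item configuration is not stated here. The function names are hypothetical.

```python
CPU_PERIOD_US = 100_000   # period used for quota calculation
MIN_CPU_QUOTA_US = 1_000  # minimum quota per period, as stated above


def cpu_quota_us(cpu_fraction):
    """Convert a CPU share (1.0 == one full CPU, an assumed unit) into a quota."""
    return max(MIN_CPU_QUOTA_US, int(cpu_fraction * CPU_PERIOD_US))


def oci_resources(cpu_fraction=None, ram_bytes=None, max_pids=None):
    """Build the OCI "linux.resources" section; unset limits are omitted."""
    resources = {}
    if cpu_fraction is not None:
        resources["cpu"] = {"quota": cpu_quota_us(cpu_fraction), "period": CPU_PERIOD_US}
    if ram_bytes is not None:
        resources["memory"] = {"limit": ram_bytes}
    if max_pids is not None:
        resources["pids"] = {"limit": max_pids}
    return resources
```

Clamping to the minimum quota means a very small CPU limit still lets the instance be scheduled, at 1% of one CPU.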

Boot Runtime

The boot runtime manages firmware-level components that require EFI boot partition updates. It uses an A/B partition scheme for safe updates with automatic rollback.

Update Mechanism

  1. Receive update — the Launcher calls StartInstance with a new manifest digest
  2. Install to inactive partition — the boot runtime writes the new image to the currently-inactive boot partition
  3. Set boot target — configures EFI boot variables to boot from the updated partition
  4. Request reboot — signals RebootRequired to the Launcher
  5. After reboot — the boot runtime verifies the update:
    • Calls SetBootOK() to confirm successful boot
    • Runs health checks (configured systemd services must reach active state)
    • On success: marks the update as installed
    • On failure: reverts to the previous partition
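The A/B decision logic can be sketched as two small functions. Both names are hypothetical. Treating a boot from the wrong partition as a failed update is an assumption of this sketch.

```python
def update_target(partitions, current_index):
    """Index of the inactive partition that receives the new image."""
    if len(partitions) != 2:
        raise ValueError("A/B scheme expects exactly two partitions")
    if current_index not in (0, 1):
        raise ValueError("current_index must be 0 or 1")
    return 1 - current_index


def verify_update(booted_index, target_index, health_ok):
    """Outcome after reboot: "installed" on success, "reverted" otherwise."""
    # Assumption: booting from a partition other than the target counts as failure.
    if booted_index == target_index and health_ok:
        return "installed"
    return "reverted"
```

Because the running partition is never overwritten, a failed update can always fall back to the previous, known-good image.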

Partition Management

The boot runtime manages two boot partitions (A/B scheme):

  • Tracks which partition is currently active (mMainPartition) and which was booted (mCurrentPartition)
  • Supports both full and incremental image types
  • Persists installed and pending instance data as JSON files in the working directory
  • Uses an EFI boot controller to manipulate boot order and boot-success flags

Configuration

{
    "workingDir": "/var/aos/sm/launcher/boot",
    "loader": "systemd-boot",
    "detectMode": "auto",
    "versionFile": "/etc/os-release",
    "partitions": ["/dev/sda1", "/dev/sda2"],
    "healthCheckServices": ["aos_servicemanager.service"]
}

  • loader — boot loader type (e.g., "systemd-boot")
  • detectMode — how to detect the current boot partition ("auto" or none)
  • versionFile — path to read the current installed version
  • partitions — list of boot partition devices (A/B)
  • healthCheckServices — systemd services that must be active for the update to be considered successful

Rootfs Runtime

The rootfs runtime manages root filesystem components — system-level software deployed as squashfs images that require a reboot to activate.

Update Mechanism

  1. Receive update — StartInstance is called with a new component image
  2. Copy image — extracts the squashfs image from the OCI layer to the working directory
  3. Store action — writes a do_update action file indicating the update type (full or incremental)
  4. Save pending state — persists the pending instance info
  5. Request reboot — signals RebootRequired to the Launcher
  6. After reboot — the system applies the rootfs image during boot, then:
    • The rootfs runtime reads the action state (updated or failed)
    • On updated: runs health checks (configured systemd services must be active)
    • On health check success: writes do_apply action, requests another reboot to finalize
    • On health check failure: writes failed action, requests reboot to rollback
    • On final boot with no action: promotes pending to installed, reports success
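The post-reboot handling behaves like a small state machine, sketched below. The function name and the tuple layout are hypothetical. The text above does not say what happens when the action state read after reboot is failed, so reporting the update as failed in that case is an assumption.

```python
def after_reboot(action, health_ok, has_pending):
    """Decide what the rootfs runtime does after a reboot.

    action: action state found on disk ("updated", "failed", or None).
    Returns (action_to_write, reboot_required, result), where result is
    "installed", "failed", or None while the update is still in progress.
    """
    if action == "updated":
        if health_ok:
            return ("do_apply", True, None)  # finalize on the next boot
        return ("failed", True, None)        # roll back on the next boot
    if action == "failed":
        return (None, False, "failed")       # assumption: report the failure
    if action is None and has_pending:
        return (None, False, "installed")    # promote pending to installed
    return (None, False, None)               # nothing to do
```

A successful update therefore takes two reboots: one to try the new image and one to finalize it.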

Image Types

The rootfs runtime supports two image types, identified by OCI media type:

| Media Type Prefix | Update Type | Description |
| --- | --- | --- |
| vnd.aos.image.component.full | Full | Complete filesystem image replacement |
| vnd.aos.image.component.inc | Incremental | Delta update applied to existing filesystem |
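Selecting the update type from the media type is a prefix match, as in this sketch. The function name is hypothetical, and the media type suffix used in the usage note is made up for illustration.

```python
FULL_PREFIX = "vnd.aos.image.component.full"
INCREMENTAL_PREFIX = "vnd.aos.image.component.inc"


def update_type(media_type):
    """Map an OCI layer media type to the rootfs update type."""
    if media_type.startswith(FULL_PREFIX):
        return "full"
    if media_type.startswith(INCREMENTAL_PREFIX):
        return "incremental"
    raise ValueError(f"unsupported media type: {media_type!r}")
```

For instance, `update_type("vnd.aos.image.component.full.example")` returns `"full"`, while an unrelated media type raises `ValueError`.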

Configuration

{
    "workingDir": "/var/aos/sm/launcher/rootfs",
    "versionFilePath": "/etc/os-release",
    "healthCheckServices": ["aos_servicemanager.service", "aos_communicationmanager.service"]
}

  • workingDir — directory for update artifacts and state files
  • versionFilePath — path to read the current rootfs version (format: VERSION="x.y.z")
  • healthCheckServices — systemd services that must reach active state for the update to be considered successful

Health Checks and Update Verification

Both the boot and rootfs runtimes use a common SystemdUpdateChecker utility to verify update success:

  1. After reboot, the checker monitors configured systemd services
  2. It polls service states with retry logic (up to 5 attempts with exponential backoff from 10s to 60s)
  3. Success — all configured services reach active state
  4. Failure — any service reaches failed state, or retries are exhausted
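The polling loop can be sketched as follows. This is not the SystemdUpdateChecker code. `backoff_delays` and `check_services` are hypothetical names, the `poll` callback stands in for a systemd query, and doubling the delay between attempts is an assumption (the text only gives the 10s and 60s bounds).

```python
import time


def backoff_delays(attempts=5, initial_s=10.0, max_s=60.0, factor=2.0):
    """Delays between polls: exponential growth, capped at max_s."""
    delays = []
    delay = initial_s
    for _ in range(attempts):
        delays.append(min(delay, max_s))
        delay *= factor
    return delays


def check_services(poll, services, delays, sleep=time.sleep):
    """Return True if all services become active, False on failure or exhaustion.

    poll(service) returns a systemd state string such as "active" or "failed".
    """
    for delay in delays:
        states = [poll(service) for service in services]
        if any(state == "failed" for state in states):
            return False  # any failed service fails the update immediately
        if all(state == "active" for state in states):
            return True
        sleep(delay)      # still activating: wait, then retry
    return False          # retries exhausted
```

With the default arguments, `backoff_delays()` produces delays of 10, 20, 40, 60, and 60 seconds. Passing a no-op `sleep` makes the loop easy to exercise in tests.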

The SystemdRebooter utility triggers reboots via the aos-reboot.service systemd unit with replace-irreversibly mode, ensuring the reboot cannot be cancelled.

Runtime Capabilities Reporting

Each runtime reports its capabilities to CM during SM registration. This information enables the cloud to make scheduling decisions:

  • Runtime ID — unique identifier for this runtime instance
  • Runtime type — the configured type string (e.g., "crun", "boot", "rootfs")
  • Architecture — CPU architecture (e.g., aarch64, x86_64)
  • OS — operating system (e.g., linux)
  • Max instances — maximum concurrent instances (unlimited for container; 1 for boot and rootfs)
  • DMIPS — processing capability metric from Node info
  • RAM — available memory from Node info

See Also

  • Runtime Types — detailed comparison of runtime characteristics
  • Key Concepts — terminology including Deployable Item and Verification