Launcher
Introduction
The Launcher is the Service Manager (SM) module responsible for starting and stopping service instances on a Node. Rather than implementing a single execution model, the Launcher uses a pluggable runtime architecture — each configured runtime handles a specific type of Deployable Item (container services, boot-level firmware, or root filesystem components).
The Launcher does not decide which instances to run. It receives UpdateInstances commands from the SM (originating
from CM) and delegates execution to the appropriate runtime based on the instance's runtime assignment. It also manages
instance status reporting, offline TTL enforcement, and coordinates reboots when required by firmware-level runtimes.
Architecture
The Launcher sits between the SM's control logic and the underlying execution runtimes:
┌─────────────────────────────────────────────────────────┐
│                     SM (smclient)                       │
│                           │                             │
│                    UpdateInstances                      │
│                           ▼                             │
│  ┌──────────────────────────────────────────────────┐   │
│  │                     Launcher                     │   │
│  │                                                  │   │
│  │  • Instance lifecycle management                 │   │
│  │  • Runtime dispatch                              │   │
│  │  • Status aggregation and reporting              │   │
│  │  • Offline TTL enforcement                       │   │
│  │  • Reboot coordination                           │   │
│  └──────┬──────────────┬──────────────┬─────────────┘   │
│         │              │              │                 │
│         ▼              ▼              ▼                 │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐           │
│  │ Container  │ │    Boot    │ │   Rootfs   │           │
│  │  Runtime   │ │  Runtime   │ │  Runtime   │           │
│  └────────────┘ └────────────┘ └────────────┘           │
│         │              │              │                 │
│   systemd units    EFI boot    squashfs images          │
│    (crun/runc)    partitions  + systemd reboot          │
└─────────────────────────────────────────────────────────┘
Key Interfaces
The Launcher implements and consumes several interfaces:
| Interface | Direction | Purpose |
|---|---|---|
| `LauncherItf` | Implements | Receives UpdateInstances commands (stop/start instance lists) |
| `InstanceStatusReceiverItf` | Implements | Receives status updates from runtimes |
| `RuntimeInfoProviderItf` | Implements | Provides runtime capability information to CM during registration |
| `InstanceInfoProviderItf` | Implements | Provides instance monitoring parameters and data |
| `RuntimeItf` | Consumes | Delegates instance start/stop to individual runtimes |
| `StorageItf` | Consumes | Persists current instance state across restarts |
| `SenderItf` | Consumes | Sends instance status updates to CM |
Instance Lifecycle
Initialization
On SM startup, the Launcher:
- Retrieves previously-running instances from persistent storage
- Starts each configured runtime
- Restarts instances that were active before the previous shutdown
- Reports the resulting instance statuses to CM
Instances are processed in parallel using a thread pool, enabling concurrent startup of multiple services.
Update Flow
When the Launcher receives an UpdateInstances command:
- Guard check: if a previous launch is still in progress, returns an error (`eWrongState`)
- Stop phase: stops all instances in the stop list by delegating to their assigned runtimes
- Cache update: removes stopped instances from local cache and storage
- Start phase: starts all instances in the start list:
  - Determines the correct runtime for each instance
  - Delegates `StartInstance` to the runtime
  - Persists successfully-started instances to storage
- Status report: sends the complete Node instance status to CM
The two phases run one after the other, but within each phase the instances are processed in parallel via a thread pool.
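The update flow can be sketched in a few lines of Python. This is an illustrative model only, not the SM implementation: `SketchLauncher`, `WrongStateError`, the `start_instance`/`stop_instance` method names, and the in-memory dictionary standing in for both cache and storage are all assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class WrongStateError(Exception):
    """Hypothetical stand-in for the eWrongState error."""


class SketchLauncher:
    def __init__(self, runtimes, max_workers=4):
        self.runtimes = runtimes      # runtime name -> runtime object
        self.instances = {}           # instance id -> runtime name (cache and storage)
        self._launching = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _stop_one(self, instance_id):
        runtime_name = self.instances.get(instance_id)
        if runtime_name is not None:
            self.runtimes[runtime_name].stop_instance(instance_id)

    def _start_one(self, item):
        instance_id, runtime_name = item
        try:
            self.runtimes[runtime_name].start_instance(instance_id)
            return True
        except Exception:
            return False  # a failed start is simply not persisted

    def update_instances(self, stop, start):
        # Guard check: refuse to overlap with a launch that is still running.
        if not self._launching.acquire(blocking=False):
            raise WrongStateError("previous launch still in progress")
        try:
            # Stop phase: all stops run in parallel and finish before any start.
            list(self._pool.map(self._stop_one, stop))
            # Cache update: forget the stopped instances.
            for instance_id in stop:
                self.instances.pop(instance_id, None)
            # Start phase: `start` is a list of (instance id, runtime name) pairs.
            started = list(self._pool.map(self._start_one, start))
            for (instance_id, runtime_name), ok in zip(start, started):
                if ok:
                    self.instances[instance_id] = runtime_name
            # Status report: return the full picture, not just the delta.
            return dict(self.instances)
        finally:
            self._launching.release()
```

The non-blocking lock acquisition models the guard check: a second caller fails immediately instead of waiting for the first launch to finish.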
Offline TTL
Each instance can have an offline TTL (Time-To-Live) — a duration after which the instance is automatically stopped if the Node loses cloud connectivity. This prevents services from running indefinitely without cloud oversight.
- When the cloud connection drops, the Launcher starts a TTL timer
- When the timer expires, instances whose TTL has elapsed are stopped
- When connectivity is restored, the Launcher reports current statuses to CM
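The expiry decision itself is simple to model. The sketch below assumes TTLs are expressed in seconds and that an instance without a TTL is represented by `None`; the function name and data layout are hypothetical and are not taken from the Launcher source.

```python
def expired_instances(offline_since, now, ttls):
    """Return the ids of instances whose offline TTL has elapsed.

    offline_since / now: timestamps in seconds.
    ttls: instance id -> TTL in seconds, or None when no TTL is configured.
    """
    offline_for = now - offline_since
    return sorted(
        instance_id
        for instance_id, ttl in ttls.items()
        if ttl is not None and offline_for >= ttl
    )
```

For example, with TTLs of 60 s and 3600 s and a connection that has been down for 120 s, only the first instance would be stopped.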
Reboot Coordination
Some runtimes (boot, rootfs) require a system reboot to apply updates. The Launcher handles this through a reboot queue:
- A runtime signals `RebootRequired` after preparing an update
- The Launcher queues the reboot request
- After the current launch operation completes, the Launcher triggers the reboot via the runtime's `Reboot()` method
- After reboot, the runtime verifies the update succeeded (health check)
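A minimal sketch of such a queue follows. The class and method names are hypothetical, and the choice to trigger only the first queued runtime is an assumption made here, on the reasoning that a single system reboot serves every pending request.

```python
from collections import deque


class RebootQueue:
    def __init__(self):
        self._pending = deque()

    def reboot_required(self, runtime):
        # Called when a runtime signals RebootRequired; duplicates are ignored.
        if runtime not in self._pending:
            self._pending.append(runtime)

    def on_launch_complete(self):
        # Called once the current launch operation has finished.
        if not self._pending:
            return None
        runtime = self._pending.popleft()
        runtime.reboot()
        return runtime
```

Deferring the reboot until `on_launch_complete` keeps a firmware update from interrupting instance starts that are still in flight.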
Pluggable Runtime Architecture
Each runtime is a plugin identified by a string name in the configuration. The Runtimes factory instantiates the
appropriate runtime class based on the mPlugin field:
| Plugin Name | Class | Description |
|---|---|---|
| `"container"` | `ContainerRuntime` | OCI container execution via systemd |
| `"boot"` | `BootRuntime` | EFI boot partition management |
| `"rootfs"` | `RootfsRuntime` | Root filesystem image deployment |
All runtimes implement the common RuntimeItf interface:
- `Start()` / `Stop()`: runtime lifecycle
- `GetRuntimeInfo()`: reports capabilities (architecture, OS, max instances, DMIPS, RAM)
- `StartInstance()` / `StopInstance()`: instance lifecycle
- `Reboot()`: trigger system reboot (supported by boot and rootfs; not supported by container)
- `GetInstanceMonitoringData()`: per-instance resource metrics
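The factory pattern described here can be illustrated with a small Python model. It is a sketch only: the real interface is `RuntimeItf`, while the `Runtime` base class, the snake_case method names, the `PLUGINS` registry, and `create_runtime` below are hypothetical names chosen for the example, and only a subset of the interface is modelled.

```python
from abc import ABC, abstractmethod


class Runtime(ABC):
    """Hypothetical stand-in for RuntimeItf (instance methods and reboot only)."""

    def __init__(self):
        self.running = set()

    @abstractmethod
    def start_instance(self, instance_id): ...

    def stop_instance(self, instance_id):
        self.running.discard(instance_id)

    def reboot(self):
        # Default mirrors the container runtime: reboot is not supported.
        raise NotImplementedError(f"{type(self).__name__} does not support reboot")


class ContainerRuntime(Runtime):
    def start_instance(self, instance_id):
        self.running.add(instance_id)


class BootRuntime(Runtime):
    reboot_requested = False

    def start_instance(self, instance_id):
        self.running.add(instance_id)

    def reboot(self):
        self.reboot_requested = True


class RootfsRuntime(BootRuntime):
    pass


# Plugin name (as written in the configuration) -> runtime class.
PLUGINS = {"container": ContainerRuntime, "boot": BootRuntime, "rootfs": RootfsRuntime}


def create_runtime(plugin):
    try:
        return PLUGINS[plugin]()
    except KeyError:
        raise ValueError(f"unknown runtime plugin: {plugin!r}") from None
```

Keying the registry by plugin name means a new runtime type only needs a new class and one registry entry; the dispatch code in the Launcher stays unchanged.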
Runtime Configuration
Each runtime is configured with:
{
"plugin": "container",
"type": "crun",
"isComponent": false,
"workingDir": "/var/aos/sm/launcher/container",
"config": { /* runtime-specific configuration */ }
}
- plugin: selects the runtime implementation
- type: runtime type identifier reported to CM (e.g., `"crun"`, `"boot"`, `"rootfs"`)
- isComponent: whether this runtime handles component-type items
- workingDir: base directory for runtime working files
- config: runtime-specific JSON configuration (parsed by each runtime)
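As an illustration of how such an entry might be loaded, the sketch below parses one runtime entry into a typed record. The `RuntimeConfig` dataclass, the `parse_runtime_config` helper, and the decision to treat `plugin` and `type` as required while defaulting the other fields are assumptions for this example; the actual parser may validate differently.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RuntimeConfig:
    plugin: str
    type: str
    is_component: bool = False
    working_dir: str = ""
    config: dict = field(default_factory=dict)  # left opaque for the runtime to parse


def parse_runtime_config(text):
    raw = json.loads(text)
    for key in ("plugin", "type"):  # assumed to be mandatory in this sketch
        if key not in raw:
            raise ValueError(f"runtime config is missing {key!r}")
    return RuntimeConfig(
        plugin=raw["plugin"],
        type=raw["type"],
        is_component=raw.get("isComponent", False),
        working_dir=raw.get("workingDir", ""),
        config=raw.get("config", {}),
    )
```

The nested `config` object is passed through untouched, which matches the description above that each runtime parses its own section.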
Container Runtime
The container runtime is the primary execution environment for standard AosEdge services. It runs OCI-compatible containers managed as systemd units.
Execution Model
Each container instance is managed through a systemd template unit (aos-service@.service):
[Service]
Type=forking
Restart=always
ExecStartPre=/usr/bin/<runner> delete -f %i
ExecStart=/usr/bin/<runner> run -d --pid-file /run/aos/runtime/%i/.pid -b /run/aos/runtime/%i %i
ExecStop=/usr/bin/<runner> kill %i SIGKILL
ExecStopPost=/usr/bin/<runner> delete -f %i
The <runner> is an OCI-compatible container runtime binary (e.g., crun or runc). The Runner module manages systemd
unit lifecycle:
- Start — creates a systemd drop-in with runtime parameters, then starts the unit
- Monitor — polls unit status periodically to detect state changes
- Stop — stops the systemd unit, which sends SIGKILL to the container process
Instance Setup
When starting a container instance, the container runtime:
- Loads OCI configs: retrieves the image config and item config from the Image Manager
- Prepares the root filesystem: creates an overlay mount combining the service image layers with host filesystem binds (default: `bin`, `sbin`, `lib`, `lib64`, `usr`)
- Generates the OCI runtime config (`config.json`) including:
  - Process configuration (entrypoint, args, environment variables, working directory)
  - Linux namespaces and cgroup configuration
  - Resource limits (CPU quota/period, memory limit) from the item config
  - Device access rules from the Resource Manager
  - Mount points for state and storage directories
  - Network configuration (hostname, DNS, hosts file)
- Sets up networking: delegates to the Network Manager for CNI plugin execution
- Registers permissions: registers instance permissions with IAM
- Starts the container: instructs the Runner to start the systemd unit
- Starts monitoring: begins collecting per-instance resource metrics
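To make the `config.json` generation step concrete, here is a rough sketch that assembles a minimal document using field names from the OCI runtime specification (`process`, `root`, `hostname`, `mounts`, `linux.namespaces`). The `build_oci_config` function, its parameters, the chosen namespace set, and the `ociVersion` value are assumptions for illustration; the config the container runtime actually emits contains considerably more.

```python
def build_oci_config(args, env, cwd, hostname, rootfs_path,
                     state_file=None, storage_dir=None):
    mounts = []
    if state_file:
        # Single-file state, bind-mounted at the path the instance sees.
        mounts.append({"destination": "/state.dat", "type": "bind",
                       "source": state_file, "options": ["bind", "rw"]})
    if storage_dir:
        mounts.append({"destination": "/storage", "type": "bind",
                       "source": storage_dir, "options": ["bind", "rw"]})
    return {
        "ociVersion": "1.0.2",
        "process": {
            "args": list(args),
            "env": [f"{key}={value}" for key, value in env.items()],
            "cwd": cwd,
        },
        "root": {"path": rootfs_path},
        "hostname": hostname,
        "mounts": mounts,
        "linux": {
            # Namespace selection is an assumption of this sketch.
            "namespaces": [{"type": t} for t in ("pid", "mount", "ipc", "uts", "network")],
        },
    }
```

Serializing the returned dictionary with `json.dump` into the instance's bundle directory would give the runner a file it can consume; the rootfs location inside that directory is not specified in this page and is left to the caller.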
Environment Variables
Each container receives standard AosEdge environment variables:
| Variable | Description |
|---|---|
| `AOS_ITEM_ID` | Service item identifier |
| `AOS_SUBJECT_ID` | Subject (owner) identifier |
| `AOS_INSTANCE_INDEX` | Instance index within the service |
| `AOS_INSTANCE_ID` | Unique instance identifier |
| `AOS_SECRET` | Instance secret for permission-based API access |
State and Storage
Container instances can have persistent state and storage:
- State (`/state.dat`): a single file for instance state, preserved across restarts and updates of the same service version
- Storage (`/storage/`): a directory for persistent data, preserved across service version updates
These are backed by dedicated partitions managed by the Resource Manager, mounted into the container at runtime.
Resource Limits
The container runtime enforces resource limits via Linux cgroups:
- CPU — quota/period-based limiting (minimum quota: 1000µs per 100000µs period)
- RAM — memory limit in bytes
- PID — maximum number of processes
Limits are derived from the item configuration provided by the Image Manager.
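The CPU limit arithmetic can be shown with a short sketch. It assumes the item config expresses the CPU limit as a fraction of one core, which this page does not state; the function names are hypothetical. The output layout follows the `linux.resources` section of the OCI runtime specification.

```python
CPU_PERIOD_US = 100_000   # period quoted above
MIN_CPU_QUOTA_US = 1_000  # minimum quota quoted above


def cpu_quota_us(cpu_fraction, period_us=CPU_PERIOD_US):
    """Convert a CPU share (1.0 = one full core) to a cgroup quota in microseconds."""
    return max(MIN_CPU_QUOTA_US, int(cpu_fraction * period_us))


def build_resources(cpu_fraction=None, ram_bytes=None, pids=None):
    resources = {}
    if cpu_fraction is not None:
        resources["cpu"] = {"quota": cpu_quota_us(cpu_fraction), "period": CPU_PERIOD_US}
    if ram_bytes is not None:
        resources["memory"] = {"limit": ram_bytes}
    if pids is not None:
        resources["pids"] = {"limit": pids}
    return resources
```

Half a core becomes a quota of 50000 µs per 100000 µs period, while a very small share such as 0.1 % of a core is clamped up to the 1000 µs minimum.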
Boot Runtime
The boot runtime manages firmware-level components that require EFI boot partition updates. It uses an A/B partition scheme for safe updates with automatic rollback.
Update Mechanism
- Receive update: the Launcher calls `StartInstance` with a new manifest digest
- Install to inactive partition: the boot runtime writes the new image to the currently-inactive boot partition
- Set boot target: configures EFI boot variables to boot from the updated partition
- Request reboot: signals `RebootRequired` to the Launcher
- After reboot: the boot runtime verifies the update:
  - Calls `SetBootOK()` to confirm successful boot
  - Runs health checks (configured systemd services must reach active state)
  - On success: marks the update as installed
  - On failure: reverts to the previous partition
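The A/B logic above can be modelled as a small state holder. This is a simplified sketch with hypothetical names (`BootState`, `install_update`, `after_reboot`); it tracks only partition indices and ignores image writing, EFI variables, and persistence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BootState:
    main: int = 0                  # index of the partition considered active
    pending: Optional[int] = None  # partition holding a not-yet-verified update

    def install_update(self):
        # The update always goes to the partition that is not currently active.
        self.pending = 1 - self.main
        return "reboot_required"

    def after_reboot(self, booted, health_ok):
        if self.pending is None:
            return "no_update"
        target, self.pending = self.pending, None
        if booted == target and health_ok:
            self.main = target     # the update becomes the new active partition
            return "installed"
        return "reverted"          # main is unchanged, so the old partition stays active
```

Because `main` changes only after a verified boot, a failed or interrupted update never touches the partition the system last ran successfully from.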
Partition Management
The boot runtime manages two boot partitions (A/B scheme):
- Tracks which partition is currently active (`mMainPartition`) and which was booted (`mCurrentPartition`)
- Supports both full and incremental image types
- Persists installed and pending instance data as JSON files in the working directory
- Uses an EFI boot controller to manipulate boot order and boot-success flags
Configuration
{
"workingDir": "/var/aos/sm/launcher/boot",
"loader": "systemd-boot",
"detectMode": "auto",
"versionFile": "/etc/os-release",
"partitions": ["/dev/sda1", "/dev/sda2"],
"healthCheckServices": ["aos_servicemanager.service"]
}
- loader: boot loader type (e.g., `"systemd-boot"`)
- detectMode: how to detect the current boot partition (`"auto"` or none)
- versionFile: path to read the current installed version
- partitions: list of boot partition devices (A/B)
- healthCheckServices: systemd services that must be active for the update to be considered successful
Rootfs Runtime
The rootfs runtime manages root filesystem components — system-level software deployed as squashfs images that require a reboot to activate.
Update Mechanism
- Receive update: `StartInstance` is called with a new component image
- Copy image: extracts the squashfs image from the OCI layer to the working directory
- Store action: writes a `do_update` action file indicating the update type (full or incremental)
- Save pending state: persists the pending instance info
- Request reboot: signals `RebootRequired` to the Launcher
- After reboot: the system applies the rootfs image during boot, then:
  - The rootfs runtime reads the action state (`updated` or `failed`)
  - On `updated`: runs health checks (configured systemd services must be active)
  - On health check success: writes `do_apply` action, requests another reboot to finalize
  - On health check failure: writes `failed` action, requests reboot to rollback
  - On final boot with no action: promotes pending to installed, reports success
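The post-reboot decisions form a small state machine, sketched below. The `Step` tuple and `after_reboot` function are hypothetical. How the runtime reacts when it reads a `failed` action is not spelled out above, so the sketch assumes it reports the failure without a further reboot.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    write_action: Optional[str]  # action to write for the next boot, if any
    reboot: bool                 # whether another reboot is requested
    result: Optional[str]        # final outcome, or None while the update is in progress


def after_reboot(action, health_ok=False):
    """Decide the next step from the action state found after a reboot."""
    if action == "updated":
        if health_ok:
            return Step("do_apply", True, None)  # finalize on the next boot
        return Step("failed", True, None)        # roll back on the next boot
    if action == "failed":
        return Step(None, False, "failed")       # assumption: just report the failure
    if action is None:
        return Step(None, False, "installed")    # promote pending to installed
    raise ValueError(f"unexpected action state: {action!r}")
```

A successful update therefore passes through two reboots: one to try the new image and one to finalize it.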
Image Types
The rootfs runtime supports two image types, identified by OCI media type:
| Media Type Prefix | Update Type | Description |
|---|---|---|
| `vnd.aos.image.component.full` | Full | Complete filesystem image replacement |
| `vnd.aos.image.component.inc` | Incremental | Delta update applied to existing filesystem |
Configuration
{
"workingDir": "/var/aos/sm/launcher/rootfs",
"versionFilePath": "/etc/os-release",
"healthCheckServices": ["aos_servicemanager.service", "aos_communicationmanager.service"]
}
- workingDir: directory for update artifacts and state files
- versionFilePath: path to read the current rootfs version (format: `VERSION="x.y.z"`)
- healthCheckServices: systemd services that must reach active state for the update to be considered successful
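Reading the version out of a file in the `VERSION="x.y.z"` format could look like the sketch below. The `read_version` helper is hypothetical, and tolerating unquoted values is a choice made for this example rather than documented behaviour.

```python
import re

# Matches VERSION="1.2.3" (or unquoted VERSION=1.2.3) but not VERSION_ID=...
_VERSION_LINE = re.compile(r'^VERSION="?([^"]*)"?\s*$')


def read_version(text):
    """Return the VERSION value from os-release style text, or None if absent."""
    for line in text.splitlines():
        match = _VERSION_LINE.match(line.strip())
        if match:
            return match.group(1)
    return None
```

Anchoring the pattern on `VERSION=` keeps similarly named keys such as `VERSION_ID` from being picked up by mistake.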
Health Checks and Update Verification
Both the boot and rootfs runtimes use a common SystemdUpdateChecker utility to verify update success:
- After reboot, the checker monitors configured systemd services
- It polls service states with retry logic (up to 5 attempts with exponential backoff from 10s to 60s)
- Success: all configured services reach `active` state
- Failure: any service reaches `failed` state, or retries are exhausted
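The retry schedule can be sketched as follows. This is not the `SystemdUpdateChecker` code: the function names are hypothetical, the doubling factor is an assumption (only the 10 s start and 60 s cap are given above), and service state lookup is injected as a callable so the logic can be exercised without systemd.

```python
import time


def backoff_delays(attempts=5, first=10.0, cap=60.0, factor=2.0):
    """Delays between consecutive attempts: exponential growth, capped."""
    return [min(cap, first * factor ** i) for i in range(attempts - 1)]


def check_update(services, get_state, sleep=time.sleep, attempts=5):
    """Return True once every service is active, False on failure or exhaustion."""
    delays = backoff_delays(attempts)
    for attempt in range(attempts):
        states = [get_state(service) for service in services]
        if any(state == "failed" for state in states):
            return False               # a failed service ends the check immediately
        if all(state == "active" for state in states):
            return True
        if attempt < attempts - 1:
            sleep(delays[attempt])     # wait before the next poll
    return False                       # retries exhausted
```

Under these assumptions the waits between the five attempts are 10 s, 20 s, 40 s, and 60 s. In practice `get_state` could wrap a query such as `systemctl is-active <unit>`.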
The SystemdRebooter utility triggers reboots via the aos-reboot.service systemd unit with replace-irreversibly
mode, ensuring the reboot cannot be cancelled.
Runtime Capabilities Reporting
Each runtime reports its capabilities to CM during SM registration. This information enables the cloud to make scheduling decisions:
- Runtime ID: unique identifier for this runtime instance
- Runtime type: the configured type string (e.g., `"crun"`, `"boot"`, `"rootfs"`)
- Architecture: CPU architecture (e.g., `aarch64`, `x86_64`)
- OS: operating system (e.g., `linux`)
- Max instances: maximum concurrent instances (unlimited for container; 1 for boot and rootfs)
- DMIPS: processing capability metric from Node info
- RAM: available memory from Node info
Related Pages
- Service Manager — SM overview and subcomponent summary
- Image Manager — how service images are acquired and stored before launching
- Resource Manager — resource allocation and device access for instances
- Network Manager — per-instance network setup via CNI
- Runtime Types — detailed comparison of runtime characteristics
- Service Instance States — instance state machine and transitions
- Key Concepts — terminology including Deployable Item and Verification