Image Deployment Pipeline
Introduction
The image deployment pipeline is the sequence of operations that transforms a cloud-published Deployable Item into a running service instance on a Node. This pipeline spans multiple AosCore components — the Communication Manager (CM) coordinates the overall update, the Service Manager's (SM) Image Manager handles download and storage, and the Launcher assembles the runtime environment and starts the container.
Understanding this pipeline is essential for OEMs because it governs how quickly services deploy, what happens when downloads fail, and how the system ensures that only verified, untampered images execute on the Unit.
Pipeline Overview
The image deployment pipeline consists of five stages:
| Stage | Component | Description |
|---|---|---|
| 1. Blob URL resolution | CM → SM | CM provides download URLs for image blobs when SM requests them via GetBlobsInfo |
| 2. Download | SM Image Manager + Downloader | Blobs are retrieved over HTTP/HTTPS with retry, resume, and progress reporting |
| 3. Verification | SM Image Manager | Each downloaded blob is verified against its SHA-256 content digest |
| 4. Layer unpacking | SM Image Manager + ImageHandler | Compressed tar layers are extracted to an overlay-compatible filesystem layout |
| 5. Rootfs assembly and launch | SM Launcher | Layers are stacked into an OverlayFS mount and the container runtime starts the service |
Stage 1: Blob URL Resolution
When CM receives a new desired state from AosCloud, it determines which Deployable Items need to be deployed to each Node. CM's own Image Manager downloads the image index and manifests from cloud-provided URLs, then makes these blobs available to SM through a local gRPC interface.
SM's Image Manager does not communicate directly with the cloud. Instead, when it needs to download a blob (identified
by its SHA-256 digest), it calls GetBlobsInfo on the SM client, which issues a gRPC request to CM. CM resolves the
digest to a download URL — either a cloud-hosted blob URL or a local file server URL (in multi-Node configurations where
the Message Proxy serves images to secondary Nodes).
SM Image Manager            SM Client (gRPC)                     CM
│                                   │                             │
│── GetBlobsInfo([digest1, ...]) ──▶│                             │
│                                   │── GetBlobsInfos(digests) ──▶│
│                                   │                             │── resolve digests
│                                   │◀── BlobsInfos(urls) ────────│   to URLs
│◀── urls[] ────────────────────────│                             │
│                                   │                             │
This indirection allows CM to control blob distribution — for example, serving images from a local cache rather than re-downloading from the cloud when multiple Nodes need the same image.
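The resolution step can be illustrated with a minimal sketch. It assumes CM keeps two lookup tables, one for blobs it can serve locally and one for cloud-hosted blobs. The function name, the table shapes, and the URLs are hypothetical and are not part of the AosCore API.

```python
from typing import Dict, List


def resolve_blob_urls(
    digests: List[str],
    local_urls: Dict[str, str],
    cloud_urls: Dict[str, str],
) -> List[str]:
    """Return one download URL per digest, preferring a local copy.

    Both tables are hypothetical stand-ins for whatever state CM keeps.
    """
    urls = []
    for digest in digests:
        if digest in local_urls:
            # A local file server URL avoids a second cloud download.
            urls.append(local_urls[digest])
        elif digest in cloud_urls:
            urls.append(cloud_urls[digest])
        else:
            raise KeyError(f"no URL known for blob {digest}")
    return urls
```

Because the caller only ever sees URLs, the choice between a cached copy and the cloud stays entirely on the CM side.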
Stage 2: Download
Once the SM Image Manager has a URL for a blob, it delegates the actual download to the Downloader module. The download process includes:
- Space allocation — the Image Manager reserves storage space via the Space Allocator before starting the download
- Duplicate detection — if the same blob (by digest) is already being downloaded (e.g., shared layer between two services), the second request waits for the first to complete
- HTTP/HTTPS retrieval — the Downloader fetches the blob with automatic retry (up to 3 attempts with exponential backoff) and resume support via HTTP range requests
- Progress tracking — CM-initiated downloads report progress to the cloud; SM-initiated downloads operate silently
Download Parameters
| Parameter | Value | Description |
|---|---|---|
| Max retries | 3 | Total download attempts before failure |
| Initial backoff | 1 second | Delay before first retry |
| Max backoff | 5 seconds | Upper bound on retry delay |
| Connection timeout | 10 seconds | Maximum time to establish connection |
If all retry attempts fail, the download error propagates up to the Launcher, which marks the instance as failed. The Space Allocator releases the reserved space.
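The retry parameters above can be sketched as follows. The sketch assumes that "3" means three attempts in total and that the delay doubles after each failure. The text only says "exponential backoff", so the factor of 2 is an assumption, as are the function names.

```python
import time
from typing import Callable, List


def backoff_delays(max_attempts: int = 3, initial: float = 1.0,
                   maximum: float = 5.0, factor: float = 2.0) -> List[float]:
    """Delays to wait between attempts; one fewer than the attempt count."""
    delays = []
    delay = initial
    for _ in range(max_attempts - 1):
        delays.append(min(delay, maximum))  # cap at the max backoff
        delay *= factor                     # assumed growth factor
    return delays


def download_with_retry(fetch: Callable[[], bytes],
                        sleep: Callable[[float], None] = time.sleep,
                        max_attempts: int = 3) -> bytes:
    """Call fetch() until it succeeds or the attempts are exhausted."""
    delays = backoff_delays(max_attempts)
    for attempt in range(max_attempts):
        try:
            return fetch()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # last attempt failed: propagate to the caller
            sleep(delays[attempt])
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch testable without real delays. Resume via HTTP range requests is left out here.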
Stage 3: Verification
After each blob is downloaded, the Image Manager computes its SHA-256 hash and compares it against the declared digest from the OCI manifest. This verification ensures:
- Integrity — the blob was not corrupted during transfer
- Authenticity — the blob matches what was declared in the signed manifest
Verification occurs at multiple points in the pipeline:
| Check | When | Action on failure |
|---|---|---|
| Blob digest | After download completes | Delete blob, return error, fail installation |
| Layer digest | After unpacking | Delete unpacked content, return error |
| Periodic integrity | Background timer (every 24 hours) | Remove corrupted item, reclaim space |
If verification fails during installation, the entire Deployable Item installation is aborted and partially downloaded content is cleaned up. The Launcher receives an error and reports the instance as failed to CM.
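The blob digest check can be sketched with the standard library. The function names are hypothetical, and the sketch assumes digests are written in the usual `sha256:<hex>` form. It covers only the "delete blob, return error" behaviour from the table, not signature validation of the manifest.

```python
import hashlib
import os


def sha256_digest(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large blobs do not need to fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as blob:
        for chunk in iter(lambda: blob.read(chunk_size), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def verify_or_delete(path: str, declared_digest: str) -> None:
    """Raise ValueError and remove the blob if its digest does not match."""
    actual = sha256_digest(path)
    if actual != declared_digest:
        os.remove(path)  # do not keep corrupted content on disk
        raise ValueError(
            f"digest mismatch: expected {declared_digest}, got {actual}"
        )
```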
Stage 4: Layer Unpacking
For service-type Deployable Items, each layer blob is a compressed tar archive that must be unpacked into a filesystem
tree suitable for OverlayFS. The platform-specific ImageHandler performs this operation:
- Extract tar archive — decompress (gzip) and extract the layer contents to a dedicated directory under
{imagePath}/layers/sha256/{diffID}/layer/
- Convert OCI whiteouts to OverlayFS format:
  - .wh.<filename> markers → character device nodes (major/minor 0) signaling file deletion
  - .wh..wh..opq markers → trusted.overlay.opaque extended attribute on the parent directory
- Set ownership — adjust file UID/GID to the configured service execution context
- Compute unpacked digest — calculate the digest of the unpacked layer content for future integrity checks
- Remove compressed blob — delete the original tar archive to reclaim space; the blob path is repurposed to store the diff ID as a pointer to the unpacked layer location
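The whiteout conversion step can be sketched as a pure classification function over tar member paths. The function name and the action labels are hypothetical. Actually creating the device node or setting the extended attribute needs elevated privileges, so the sketch only decides what should happen to each entry.

```python
import posixpath
from typing import Tuple

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def classify_entry(member_path: str) -> Tuple[str, str]:
    """Map a tar member path to an (action, target path) pair.

    "opaque"   -> set trusted.overlay.opaque on the target directory
    "whiteout" -> create a 0/0 character device at the target path
    "extract"  -> write the entry out unchanged
    """
    parent, name = posixpath.split(member_path)
    if name == OPAQUE_MARKER:
        # Must be tested first: the opaque marker also starts with ".wh.".
        return ("opaque", parent)
    if name.startswith(WHITEOUT_PREFIX):
        hidden = name[len(WHITEOUT_PREFIX):]
        return ("whiteout", posixpath.join(parent, hidden))
    return ("extract", member_path)
```

On Linux, a privileged process could apply the two special actions with `os.mknod(path, stat.S_IFCHR, os.makedev(0, 0))` and `os.setxattr(directory, "trusted.overlay.opaque", b"y")`. Whether the ImageHandler does it this way is not stated here.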
After unpacking, layers are stored by their diff ID (the uncompressed content digest declared in the image config's
rootfs.diffIDs array). This content-addressable layout enables layer sharing — if two services use the same base
layer, it is stored only once.
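A small sketch shows why this layout deduplicates layers: the path depends only on the diff ID. It assumes the {diffID} path component is the hex part of a `sha256:<hex>` digest. The function name and the example image path are hypothetical.

```python
from pathlib import PurePosixPath


def layer_path(image_path: str, diff_id: str) -> PurePosixPath:
    """Build {imagePath}/layers/sha256/{diffID}/layer from a diff ID."""
    algorithm, separator, hex_digest = diff_id.partition(":")
    if not separator or algorithm != "sha256" or not hex_digest:
        raise ValueError(f"unsupported diff ID: {diff_id!r}")
    return PurePosixPath(image_path) / "layers" / algorithm / hex_digest / "layer"
```

Two services whose image configs list the same diff ID resolve to the same directory, so the unpacked content exists once on disk.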
Supported Layer Formats
| Media Type | Description |
|---|---|
| application/vnd.oci.image.layer.v1.tar | Uncompressed tar |
| application/vnd.oci.image.layer.v1.tar+gzip | Gzip-compressed tar (most common) |
For non-service items (firmware components), layers are stored as raw blobs without unpacking — the boot and rootfs runtimes handle them differently.
Stage 5: Rootfs Assembly and Launch
Once all layers are verified and unpacked, the Launcher prepares the runtime environment and starts the container:
5.1 Load Configurations
The container runtime's Instance module loads three configuration files from the Image Manager's blob storage:
- Image Manifest — identifies the image config, item config, and layer references
- Image Config — provides the container's entrypoint, environment variables, working directory, and the
rootfs.diffIDs array listing layers in stack order
- Item Config — provides AosEdge-specific metadata: resource quotas, runtime selection, permissions, network rules, and alert thresholds
5.2 Assemble Root Filesystem
The Launcher creates an OverlayFS mount by stacking multiple directories:
OverlayFS mount (container rootfs)
├── Mount points directory ← top layer (proc, dev, sys mount points)
├── Image layer N ← uppermost service layer
├── Image layer N-1 ← ...
├── Image layer 1 ← base service layer
├── Host whiteouts directory ← masks host files not needed in container
└── Host root filesystem (/) ← provides system binaries (bin, sbin, lib, usr)
Each image layer path is resolved by calling GetLayerPath on the Image Manager with the layer's diff ID from the image
config. The layers are stacked in order — the first diff ID is the bottom layer, the last is the top.
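The stacking order can be made concrete with a sketch that builds the OverlayFS `lowerdir` option. OverlayFS lists lower directories from top to bottom, so the diff ID order (bottom first) has to be reversed. The function name and the example paths are hypothetical, and the sketch does not show whether the Launcher also sets an `upperdir`.

```python
from typing import List


def overlay_lowerdir(mount_points_dir: str,
                     layer_paths_bottom_first: List[str],
                     host_whiteouts_dir: str,
                     host_root: str = "/") -> str:
    """Return a lowerdir= option with the topmost directory listed first."""
    stack_top_first = [
        mount_points_dir,                     # top: proc, dev, sys mount points
        *reversed(layer_paths_bottom_first),  # image layers, uppermost first
        host_whiteouts_dir,                   # masks unneeded host files
        host_root,                            # bottom: host system binaries
    ]
    return "lowerdir=" + ":".join(stack_top_first)
```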
5.3 Generate Runtime Config
The Launcher generates an OCI Runtime Config (config.json) that includes:
- Process — entrypoint, arguments, environment variables (including
AOS_ITEM_ID, AOS_INSTANCE_ID, etc.), UID/GID
- Resource limits — CPU quota/period, memory limit, PID limit (from item config quotas)
- Namespaces — PID, mount, IPC, UTS, and optionally network namespace
- Mounts — state directory, storage directory, tmpfs, proc, dev, sys
- Devices — hardware access rules from the Resource Manager
- Network — hostname, DNS configuration, hosts file
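A reduced config.json covering the process, resource limit, and namespace parts of the list above might be built as follows. The field names follow the OCI Runtime Specification. The function name, its parameters, and all example values are hypothetical. Mounts, devices, and DNS files are omitted.

```python
import json
from typing import Dict, List


def build_runtime_config(*, args: List[str], env: Dict[str, str],
                         uid: int, gid: int, rootfs: str, hostname: str,
                         cpu_quota: int, cpu_period: int,
                         memory_limit: int, pids_limit: int,
                         with_network_ns: bool = True) -> dict:
    """Assemble a minimal OCI runtime config as a plain dictionary."""
    namespaces = [{"type": kind} for kind in ("pid", "mount", "ipc", "uts")]
    if with_network_ns:
        namespaces.append({"type": "network"})  # optional per the list above
    return {
        "ociVersion": "1.0.2",
        "process": {
            "args": list(args),
            "env": [f"{key}={value}" for key, value in env.items()],
            "cwd": "/",
            "user": {"uid": uid, "gid": gid},
        },
        "root": {"path": rootfs},
        "hostname": hostname,
        "linux": {
            "namespaces": namespaces,
            "resources": {
                "cpu": {"quota": cpu_quota, "period": cpu_period},
                "memory": {"limit": memory_limit},
                "pids": {"limit": pids_limit},
            },
        },
    }


def write_runtime_config(config: dict, path: str) -> None:
    """Serialize the config to the given path as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```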
5.4 Start Container
The container is started as a systemd transient unit (aos-service@<instanceID>.service) using an OCI-compatible
runtime binary (e.g., crun). The Runner module:
- Creates a systemd drop-in with instance-specific parameters
- Starts the systemd unit, which invokes the OCI runtime to create and run the container
- Begins monitoring the unit status for state changes
Once the container process is running, the Launcher reports the instance state as Active to CM.
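The unit naming can be sketched as below. The unit name pattern comes from the text above, and `<unit>.d/` is the standard systemd drop-in directory convention. The base directory choice, the drop-in file name, and the assumption that instance IDs need no systemd escaping are all hypothetical.

```python
from pathlib import PurePosixPath


def unit_name(instance_id: str) -> str:
    """Instantiate the aos-service@.service template for one instance."""
    return f"aos-service@{instance_id}.service"


def drop_in_path(instance_id: str,
                 base_dir: str = "/run/systemd/system",
                 file_name: str = "parameters.conf") -> PurePosixPath:
    """Location of a per-instance drop-in file (file name is an assumption)."""
    return PurePosixPath(base_dir) / f"{unit_name(instance_id)}.d" / file_name
```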
Error Handling
The pipeline handles failures at each stage as follows:
| Failure Point | Behavior |
|---|---|
| URL resolution fails | SM retries via GetBlobsInfo; if CM is unavailable, instance remains in Activating state |
| Download fails (all retries exhausted) | Space allocation released; instance reported as Failed |
| Verification fails | Corrupted blob deleted; installation aborted; instance reported as Failed |
| Layer unpacking fails | Partial content cleaned up; installation aborted; instance reported as Failed |
| Rootfs assembly fails | Mount cleaned up; instance reported as Failed |
| Container start fails | Runtime reports failure; Launcher marks instance as Failed |
In all failure cases, the instance status is reported back to CM, which forwards it to the cloud. The cloud can then decide whether to retry the deployment or take corrective action.
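The table can be read as a simple rule: the first failing stage decides the reported state. A minimal sketch under that reading follows. The exception type, the function name, and the stage labels are hypothetical, and per-stage cleanup is assumed to happen inside each step.

```python
from typing import Callable, List, Optional, Tuple


class ResolutionUnavailable(Exception):
    """Hypothetical signal that CM could not be reached for URL resolution."""


def deploy(stages: List[Tuple[str, Callable[[], None]]]) -> Tuple[str, Optional[str]]:
    """Run stages in order and return (instance state, failing stage name)."""
    for name, step in stages:
        try:
            step()
        except ResolutionUnavailable:
            # CM unavailable: nothing is marked failed, the instance waits.
            return ("Activating", name)
        except Exception:
            # Any other error aborts the pipeline at this stage.
            return ("Failed", name)
    return ("Active", None)
```

Returning the failing stage name alongside the state gives the caller enough detail to build the status report that is sent to CM.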
Concurrency and Parallelism
The pipeline runs independent work in parallel while keeping a fixed order within each update:
- Multiple items in parallel — when an
UpdateInstances command includes multiple services, their image installations run concurrently in a thread pool
- Shared layer deduplication — if two services share a common layer, only one download occurs; the second waits for the first to complete and reuses the result
- Stop before install — outdated instances are stopped first, then new images are installed, then new instances are started (stop → install → start ordering within a single update)
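Parallel installation with shared layer deduplication can be sketched with a thread pool and a table of in-flight downloads keyed by digest. The class and function names are hypothetical. The real implementation is not Python, so this only illustrates the waiting behaviour.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List


class BlobFetcher:
    """Ensure each digest is downloaded once; later callers wait and reuse."""

    def __init__(self, download: Callable[[str], str]) -> None:
        self._download = download
        self._lock = threading.Lock()
        self._results: Dict[str, Future] = {}

    def fetch(self, digest: str) -> str:
        with self._lock:
            future = self._results.get(digest)
            owner = future is None
            if owner:
                future = Future()
                self._results[digest] = future
        if owner:
            # Only the first caller performs the download.
            try:
                future.set_result(self._download(digest))
            except Exception as error:
                future.set_exception(error)
        return future.result()  # blocks until the owner has finished


def install_all(items: Dict[str, List[str]], fetcher: BlobFetcher,
                max_workers: int = 4) -> Dict[str, List[str]]:
    """Install every item concurrently; each item fetches its own layers."""
    def install(layers: List[str]) -> List[str]:
        return [fetcher.fetch(digest) for digest in layers]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {name: pool.submit(install, layers)
                   for name, layers in items.items()}
        return {name: future.result() for name, future in pending.items()}
```

The sketch keeps completed results in the table, so it doubles as a cache. A production version would also need to evict entries and retry failed downloads.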
Storage Lifecycle
After deployment, the Image Manager continues to manage stored images:
- Version retention — at most 2 versions of each Deployable Item are kept simultaneously
- TTL-based cleanup — removed items are permanently deleted after 30 days (configurable)
- Orphan removal — blobs and layers no longer referenced by any item are automatically cleaned up
- Space pressure eviction — the Space Allocator can trigger removal of outdated items when storage is constrained
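The retention and TTL rules can be expressed as two small predicates. The function names and the oldest-first list representation are hypothetical. The limits of 2 versions and 30 days come from the list above.

```python
from datetime import datetime, timedelta
from typing import List


def versions_to_remove(versions_oldest_first: List[str],
                       max_kept: int = 2) -> List[str]:
    """Return the versions that exceed the retention limit, oldest first."""
    if max_kept < 1:
        raise ValueError("max_kept must be at least 1")
    return list(versions_oldest_first[:-max_kept])


def is_expired(removed_at: datetime, now: datetime,
               ttl: timedelta = timedelta(days=30)) -> bool:
    """True once a removed item has outlived its TTL and can be deleted."""
    return now - removed_at >= ttl
```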
Related Pages
- Image Manager — detailed documentation of the SM Image Manager component
- Launcher — how the Launcher starts and manages service instances
- OCI Image Format — structure of OCI images used by AosCore
- Downloader — HTTP download subsystem with retry and resume
- Desired State Model — how desired-state updates trigger the deployment pipeline
- Runtime Types — container, boot, and rootfs runtime execution models
- Service Instance States — instance state transitions during deployment