SM Controller
Introduction
The SM Controller (smcontroller) is the Communication Manager's interface to all Service Manager (SM) instances in the
Unit. It runs a gRPC server implementing the SMService defined in servicemanager/v5/servicemanager.proto. Each SM
instance (one per Node) connects to this server as a client, establishing a bidirectional streaming channel for command
distribution and telemetry collection.
This page documents the SM Controller's architecture, how it manages per-Node connections, the commands it distributes, and the data it collects from SM instances.
Architecture
The SM Controller is composed of two layers:
| Component | Responsibility |
|---|---|
| SMController | gRPC server lifecycle, TLS credential management, Node lookup and routing, interface implementation for other CM modules |
| SMHandler (one per Node) | Per-connection message processing: sends commands to a specific SM and receives its responses, status, monitoring, alerts, and logs |
┌──────────────────────────────────────────────────────────────────┐
│                    Communication Manager (CM)                    │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │                 SMController (gRPC Server)                 │  │
│  │                                                            │  │
│  │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐    │  │
│  │  │  SMHandler   │   │  SMHandler   │   │  SMHandler   │    │  │
│  │  │  (Node A)    │   │  (Node B)    │   │  (Node C)    │    │  │
│  │  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘    │  │
│  │         │                  │                  │            │  │
│  └─────────┼──────────────────┼──────────────────┼────────────┘  │
│            │                  │                  │               │
└────────────┼──────────────────┼──────────────────┼───────────────┘
             │ gRPC (mTLS)      │ gRPC (mTLS)      │ gRPC (mTLS)
             ▼                  ▼                  ▼
       ┌────────────┐     ┌────────────┐     ┌────────────┐
       │ SM (Node A)│     │ SM (Node B)│     │ SM (Node C)│
       └────────────┘     └────────────┘     └────────────┘
Interfaces Implemented
The SM Controller implements multiple interfaces that other CM modules use to interact with Nodes:
| Interface | Consumer | Purpose |
|---|---|---|
| NodeConfigHandlerItf | Unit Config module | Check, get, and update Node configuration |
| InstanceRunnerItf | Launcher module | Start and stop service instances on a specific Node |
| MonitoringProviderItf | Monitoring module | Retrieve averaged monitoring data from a Node |
| NodeNetworkItf | Network Manager | Push network configuration updates to a Node |
| LogProviderItf | Log forwarding | Request system, instance, or crash logs from Nodes |
This design means other CM modules never interact with gRPC directly — they call typed interface methods on the SM
Controller, which routes the request to the appropriate Node's SMHandler.
Dependencies
The SM Controller requires the following external interfaces:
| Interface | Purpose |
|---|---|
| CloudConnectionItf | Subscribes to cloud connection events; forwards connection state to all connected SMs |
| CertProviderItf | Obtains TLS server certificates for the gRPC endpoint |
| CertLoaderItf | Loads certificate files from storage |
| x509::ProviderItf | Provides cryptographic operations for TLS |
| ItemInfoProviderItf | Resolves blob digests to download URLs (for GetBlobsInfos RPC) |
| AlertsReceiverItf | Receives alerts forwarded from SM instances |
| SenderItf (log) | Receives log data forwarded from SM instances |
| launcher::SenderItf | Receives environment variable status updates from SM instances |
| MonitoringReceiverItf | Receives instant monitoring data from SM instances |
| InstanceStatusReceiverItf | Receives instance lifecycle status changes from SM instances |
| SMInfoReceiverItf | Receives SM registration info (Node capabilities, runtimes, resources) |
gRPC Server
Service Definition
The SM Controller implements two RPCs from the SMService:
service SMService {
    rpc RegisterSM(stream SMOutgoingMessages) returns (stream SMIncomingMessages) {}
    rpc GetBlobsInfos(BlobsInfosRequest) returns (BlobsInfos) {}
}
- RegisterSM: A bidirectional streaming RPC. Each SM opens one stream that persists for the lifetime of the connection. CM sends commands through the return stream and receives status/telemetry through the request stream.
- GetBlobsInfos: A unary RPC that resolves content-addressable blob digests into downloadable URLs. Used by SM's image manager during image download.
Server Configuration
The gRPC server is configured through the smcontroller::Config structure:
| Parameter | Description |
|---|---|
| mCMServerURL | Listen address for the gRPC server (e.g., :8093 or 0.0.0.0:8093) |
| mCertStorage | Certificate storage identifier for obtaining server TLS credentials |
| mCACert | CA certificate path for verifying SM client certificates |
If the address starts with : (port only), the server binds to 0.0.0.0 on that port.
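The port-only rule can be illustrated with a few lines of C++. This is a minimal sketch of the behavior described above, not the project's code: the function name NormalizeListenAddress is hypothetical, and the signature of the real helper is not shown on this page.

```cpp
#include <string>

// Hypothetical helper that mirrors the rule described above:
// a port-only address gets the wildcard host prepended.
std::string NormalizeListenAddress(const std::string& address)
{
    // ":8093" has an empty host part, so bind to all interfaces.
    if (!address.empty() && address.front() == ':') {
        return "0.0.0.0" + address;
    }

    // Addresses that already name a host are returned unchanged.
    return address;
}
```

With this sketch, ":8093" becomes "0.0.0.0:8093", while "0.0.0.0:8093" passes through as-is.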
TLS Security
The server uses mutual TLS (mTLS) by default:
- Server credentials are loaded from the certificate storage identified by mCertStorage
- Client verification uses the CA certificate at mCACert to validate connecting SM instances
- Insecure mode is available for development (uses InsecureServerCredentials)
When certificates are renewed, the SM Controller receives an OnCertChanged callback from the certificate provider. It
then schedules a server restart (with a 10-second retry timeout) to load the new credentials. All connected SMs are
disconnected during the restart and automatically reconnect with their own renewed certificates.
Connection Management
SM Registration Flow
When an SM instance connects:
- The RegisterSM RPC handler creates a new SMHandler for the connection
- The handler starts two threads: a read thread (receives messages from SM) and a message processing thread (dispatches received messages to appropriate receivers)
- The handler immediately sends the current cloud connection status to the SM
- The SM sends its SMInfo message (Node ID, runtimes, resources) as the first message on the stream
- The handler extracts the Node ID and notifies the SMInfoReceiver that the Node is connected
- The handler is added to the active handlers list, making the Node addressable by other CM modules
Per-Node Routing
When a CM module calls an interface method (e.g., UpdateInstances(nodeID, ...)), the SM Controller:
- Acquires the handlers mutex
- Searches the active handlers list for one matching the requested nodeID
- If found, delegates the call to that handler
- If not found, returns an eNotFound error
This routing ensures commands are delivered to the correct Node even in multi-Node deployments.
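The lookup-and-delegate steps can be sketched as follows. The sketch is illustrative and makes several assumptions: NodeRouter, NodeHandler, and RouteError are hypothetical stand-ins (only the eNotFound error is named in the text), the command takes no payload, and the mutex is held while delegating, which may differ from the real implementation.

```cpp
#include <algorithm>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the error type; only eNotFound comes from the text.
enum class RouteError { eNone, eNotFound };

// Hypothetical stand-in for the per-Node SMHandler.
class NodeHandler {
public:
    explicit NodeHandler(std::string nodeID) : mNodeID(std::move(nodeID)) {}

    const std::string& GetNodeID() const { return mNodeID; }

    // Placeholder for sending an UpdateInstances command over the stream.
    RouteError UpdateInstances()
    {
        ++mCommandsSent;
        return RouteError::eNone;
    }

    int GetCommandsSent() const { return mCommandsSent; }

private:
    std::string mNodeID;
    int         mCommandsSent = 0;
};

class NodeRouter {
public:
    void AddHandler(std::shared_ptr<NodeHandler> handler)
    {
        std::lock_guard<std::mutex> lock {mMutex};
        mHandlers.push_back(std::move(handler));
    }

    RouteError UpdateInstances(const std::string& nodeID)
    {
        // Step 1: acquire the handlers mutex.
        std::lock_guard<std::mutex> lock {mMutex};

        // Step 2: search the active handlers for the requested Node.
        auto it = std::find_if(mHandlers.begin(), mHandlers.end(),
            [&nodeID](const std::shared_ptr<NodeHandler>& handler) { return handler->GetNodeID() == nodeID; });

        // Step 4: no handler means the Node is not connected.
        if (it == mHandlers.end()) {
            return RouteError::eNotFound;
        }

        // Step 3: delegate the call to the matching handler.
        return (*it)->UpdateInstances();
    }

private:
    std::mutex                                mMutex;
    std::vector<std::shared_ptr<NodeHandler>> mHandlers;
};
```

A caller that passes the ID of a Node whose SM has not registered receives eNotFound without any message being sent.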
Disconnection Handling
When an SM connection drops (network failure, SM restart, or certificate rotation):
- The read thread detects the stream closure and exits
- The OnNodeDisconnected callback fires, notifying the SMInfoReceiver
- The handler is removed from the active handlers list
- The RegisterSM RPC returns, releasing the gRPC thread
The SM is expected to reconnect automatically (handled by the SM's smclient reconnection loop).
Server Shutdown
When the SM Controller stops:
- All active SMHandler instances are stopped (their gRPC contexts are cancelled)
- The gRPC server is shut down
- The controller waits for all RegisterSM RPC threads to complete (all handlers removed from the list)
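The final wait step can be pictured as a condition-variable wait for the handlers list to become empty. The sketch below is illustrative only: HandlerRegistry and its methods are hypothetical names, and the timeout parameter is an assumption added so the example cannot block forever. The text above does not say whether the real wait is bounded.

```cpp
#include <algorithm>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical registry that tracks which Nodes still have an active handler.
class HandlerRegistry {
public:
    void Add(const std::string& nodeID)
    {
        std::lock_guard<std::mutex> lock {mMutex};
        mNodeIDs.push_back(nodeID);
    }

    // Called when a RegisterSM call finishes and its handler is dropped.
    void Remove(const std::string& nodeID)
    {
        {
            std::lock_guard<std::mutex> lock {mMutex};
            mNodeIDs.erase(std::remove(mNodeIDs.begin(), mNodeIDs.end(), nodeID), mNodeIDs.end());
        }

        // Wake up a shutdown thread that may be waiting for an empty list.
        mCondVar.notify_all();
    }

    // Returns true once every handler is gone, false if the timeout expires first.
    bool WaitAllRemoved(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock {mMutex};
        return mCondVar.wait_for(lock, timeout, [this] { return mNodeIDs.empty(); });
    }

private:
    std::mutex               mMutex;
    std::condition_variable  mCondVar;
    std::vector<std::string> mNodeIDs;
};
```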
Commands (CM → SM)
The SM Controller sends commands to SM instances through the SMIncomingMessages stream:
| Command | Purpose | Response |
|---|---|---|
| UpdateInstances | Start and/or stop service instances | UpdateInstancesStatus (async) |
| UpdateNetworks | Push network configuration (subnet, IP, VLAN) | None (fire-and-forget) |
| GetNodeConfigStatus | Query current Node configuration state | NodeConfigStatus (sync) |
| CheckNodeConfig | Validate a proposed configuration | NodeConfigStatus (sync) |
| SetNodeConfig | Apply a new Node configuration | NodeConfigStatus (sync) |
| SystemLogRequest | Request system-level logs | LogData (async, chunked) |
| InstanceLogRequest | Request instance-specific logs | LogData (async, chunked) |
| InstanceCrashLogRequest | Request crash logs for an instance | LogData (async, chunked) |
| GetAverageMonitoring | Request time-averaged monitoring data | AverageMonitoring (sync) |
| ConnectionStatus | Notify SM of cloud connection state | None (notification) |
Synchronous vs Asynchronous Commands
The SM Controller uses two communication patterns:
Synchronous (request-response): For commands that require an immediate answer — GetNodeConfigStatus,
CheckNodeConfig, SetNodeConfig, and GetAverageMonitoring. The handler sends the request and blocks (up to 5
seconds) waiting for the matching response message on the stream.
Asynchronous (fire-and-forget): For commands where the response arrives later or not at all — UpdateInstances,
UpdateNetworks, log requests, and ConnectionStatus. The handler sends the message and returns immediately. Responses
(like UpdateInstancesStatus) arrive asynchronously and are dispatched to the appropriate receiver interface.
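The synchronous pattern amounts to a bounded wait for a response that another thread delivers. The following is a minimal sketch under stated assumptions: PendingResponse is a hypothetical class, the response is modeled as a plain string, and cResponseTime is declared here only to mirror the 5-second limit (its actual type and location are not shown on this page).

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Mirrors the 5-second limit mentioned above; the declaration itself is an assumption.
constexpr std::chrono::seconds cResponseTime {5};

// Hypothetical one-slot mailbox shared by the requesting thread and the read thread.
class PendingResponse {
public:
    // Called by the read thread when the matching response arrives.
    void Deliver(std::string response)
    {
        {
            std::lock_guard<std::mutex> lock {mMutex};
            mResponse = std::move(response);
        }

        mCondVar.notify_one();
    }

    // Called by the requesting thread after sending the command.
    // Returns std::nullopt if no response arrives within the timeout.
    std::optional<std::string> Wait(std::chrono::milliseconds timeout = cResponseTime)
    {
        std::unique_lock<std::mutex> lock {mMutex};

        if (!mCondVar.wait_for(lock, timeout, [this] { return mResponse.has_value(); })) {
            return std::nullopt;
        }

        std::optional<std::string> result = std::move(mResponse);
        mResponse.reset();

        return result;
    }

private:
    std::mutex                 mMutex;
    std::condition_variable    mCondVar;
    std::optional<std::string> mResponse;
};
```

An asynchronous command skips the Wait call entirely: the sender returns as soon as the message is written to the stream.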
Telemetry Collection (SM → CM)
The SM Controller receives the following data from connected SM instances through the SMOutgoingMessages stream:
| Message | Content | Receiver |
|---|---|---|
| SMInfo | Node ID, available runtimes (with DMIPS, RAM, OS/arch info), host resources | SMInfoReceiverItf |
| UpdateInstancesStatus | Per-instance status after a deployment command (state, errors, env var status) | InstanceStatusReceiverItf |
| NodeInstancesStatus | Complete snapshot of all instance states on the Node | InstanceStatusReceiverItf |
| InstantMonitoring | Real-time Node and per-instance resource metrics (CPU, RAM, disk, network) | MonitoringReceiverItf |
| AverageMonitoring | Time-averaged metrics (response to GetAverageMonitoring) | Returned synchronously to caller |
| Alert | Threshold violations and system events (6 alert types) | AlertsReceiverItf |
| LogData | Requested log content, delivered in parts with correlation IDs | SenderItf (log) |
Message Processing
Each SMHandler processes incoming messages using two dedicated threads, a read thread and a processing thread:
- The read thread continuously reads from the gRPC stream
- Messages that are responses to synchronous requests are matched and delivered to the waiting caller
- All other messages are queued for the processing thread
- The processing thread dispatches each message to the appropriate receiver interface
This two-thread design allows the handler to receive messages concurrently with sending commands, preventing deadlocks on the bidirectional stream.
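The read thread's decision for each message (hand it to a waiting caller or queue it) can be sketched as below. This is an illustration, not the actual implementation: IncomingDispatcher, IncomingMsg, and MsgType are hypothetical names, messages carry a string payload, and matching a response by message type alone is an assumption.

```cpp
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Small illustrative subset of message kinds.
enum class MsgType { eNodeConfigStatus, eAverageMonitoring, eInstantMonitoring, eAlert };

struct IncomingMsg {
    MsgType     mType;
    std::string mPayload;
};

class IncomingDispatcher {
public:
    // Called by a requesting thread before it sends a synchronous command.
    void ExpectResponse(MsgType type)
    {
        std::lock_guard<std::mutex> lock {mMutex};
        mExpected = type;
        mResponse.reset();
    }

    // Called by the read thread for every message read from the stream.
    // Returns true if the message was consumed as a synchronous response.
    bool OnMessage(IncomingMsg msg)
    {
        std::lock_guard<std::mutex> lock {mMutex};

        if (mExpected.has_value() && *mExpected == msg.mType) {
            mResponse = std::move(msg);
            mExpected.reset();
            return true;
        }

        // Everything else waits for the processing thread.
        mQueue.push_back(std::move(msg));
        return false;
    }

    // Called by the requesting thread to collect its response, if any.
    std::optional<IncomingMsg> TakeResponse()
    {
        std::lock_guard<std::mutex> lock {mMutex};
        std::optional<IncomingMsg>  result = std::move(mResponse);
        mResponse.reset();
        return result;
    }

    // Called by the processing thread to fetch the next message to dispatch.
    std::optional<IncomingMsg> PopQueued()
    {
        std::lock_guard<std::mutex> lock {mMutex};

        if (mQueue.empty()) {
            return std::nullopt;
        }

        IncomingMsg msg = std::move(mQueue.front());
        mQueue.pop_front();
        return msg;
    }

private:
    std::mutex                 mMutex;
    std::optional<MsgType>     mExpected;
    std::optional<IncomingMsg> mResponse;
    std::deque<IncomingMsg>    mQueue;
};
```

In this sketch an Alert that arrives while a caller waits for AverageMonitoring is queued rather than dropped, so the waiting caller and the receivers never compete for the same message.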
Cloud Connection Forwarding
The SM Controller subscribes to cloud connection events from CM's CloudConnectionItf. When the cloud connection state
changes:
- OnConnect() or OnDisconnect() is called on the SM Controller
- The controller iterates over all active SMHandler instances
- Each handler sends a ConnectionStatus message to its SM with the new state (CONNECTED or DISCONNECTED)
This allows SM instances to know whether the Unit currently has cloud connectivity, which can influence local behavior (e.g., buffering monitoring data during disconnection).
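The fan-out described above can be sketched in a few lines. The names ConnectionForwarder and StatusHandler are hypothetical, the handler only records what it would send instead of writing to a gRPC stream, and the initial state is assumed to be disconnected.

```cpp
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

enum class CloudStatus { eDisconnected, eConnected };

// Hypothetical stand-in for SMHandler: records the status it would send to its SM.
class StatusHandler {
public:
    void SendConnectionStatus(CloudStatus status)
    {
        mLastSent = status;
        ++mSentCount;
    }

    CloudStatus GetLastSent() const { return mLastSent; }
    int         GetSentCount() const { return mSentCount; }

private:
    CloudStatus mLastSent  = CloudStatus::eDisconnected;
    int         mSentCount = 0;
};

class ConnectionForwarder {
public:
    // A newly registered handler is told the current state right away.
    void AddHandler(std::shared_ptr<StatusHandler> handler)
    {
        std::lock_guard<std::mutex> lock {mMutex};
        handler->SendConnectionStatus(mStatus);
        mHandlers.push_back(std::move(handler));
    }

    void OnConnect() { Broadcast(CloudStatus::eConnected); }
    void OnDisconnect() { Broadcast(CloudStatus::eDisconnected); }

private:
    // Remember the new state and forward it to every active handler.
    void Broadcast(CloudStatus status)
    {
        std::lock_guard<std::mutex> lock {mMutex};
        mStatus = status;

        for (auto& handler : mHandlers) {
            handler->SendConnectionStatus(status);
        }
    }

    std::mutex                                  mMutex;
    CloudStatus                                 mStatus = CloudStatus::eDisconnected;
    std::vector<std::shared_ptr<StatusHandler>> mHandlers;
};
```

Storing the last known state in the forwarder is what lets a handler created later report the correct status as its first message.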
Multi-Node Operation
In a multi-Node Unit, the SM Controller manages connections from multiple SM instances simultaneously:
- Main Node SM connects directly to the local CM process
- Secondary Node SMs connect through the network (their traffic may traverse the Message Proxy depending on network topology)
Each SM is identified by its Node ID (extracted from the SMInfo registration message). CM modules address specific
Nodes by passing the nodeID parameter to SM Controller interface methods. The controller routes each request to the
correct handler.
The SM Controller does not limit the number of concurrent SM connections — it dynamically creates and removes handlers as SMs connect and disconnect.
Related Pages
- Communication Manager: parent overview of CM architecture and all subcomponents
- Update Manager: uses SM Controller (via InstanceRunnerItf) to launch instances after downloading updates
- Network Manager: uses SM Controller (via NodeNetworkItf) to push network configuration to Nodes
- Unit Configuration: uses SM Controller (via NodeConfigHandlerItf) to manage Node configs
- Service Manager: the component that connects to SM Controller as a client
- Client Communication (smclient): SM's client-side counterpart to this server
- Architecture Overview: high-level view of all AosCore components
- Key Concepts: terminology definitions for Unit, Node, Deployable Item
Open Questions
- None
Assumptions
- The 5-second response timeout (cResponseTime) for synchronous messages is a hardcoded constant, not configurable
- The 10-second reconnect retry timeout (cReconnectRetryTimeout) for server restart after certificate change is also hardcoded
- There is no limit on concurrent SM connections — the mSMHandlers vector grows dynamically
- The CorrectAddress function confirms that a port-only address (e.g., ":8093") is expanded to "0.0.0.0:8093"
- Secondary Node SMs connect over the network; the SM Controller does not distinguish between local and remote connections
Human Review Checklist
- Technical accuracy verified against source code
- Terminology compliance (no deprecated terms)
- Cross-references resolve to correct targets
- Interface list matches SMControllerItf inheritance chain
- gRPC service description matches v5 proto definition
- Command and telemetry tables accurately reflect message types
- Synchronous vs asynchronous pattern correctly described
- Connection lifecycle (register, disconnect, restart) accurately reflects implementation
- Content appropriate for OEM audience level