Dynamic Node Registration
Introduction
In a multi-Node AosEdge Unit, Secondary Nodes must register themselves with the Main Node's Identity and Access Manager (IAM) before they can participate in the system. This registration is dynamic — Nodes can join and leave the Unit at runtime without requiring a restart of the Main Node or any static configuration of the Node roster.
Dynamic node registration uses a bidirectional gRPC streaming connection (RegisterNode) between the Secondary
Node's IAM client and the Main Node's IAM server. This persistent stream serves as both the registration mechanism and
the ongoing command channel for provisioning, certificate management, and lifecycle operations.
This page describes the registration protocol, the message exchange, how the system handles disconnection and reconnection, and how provisioning state affects which server endpoint a Node connects to.
Registration Protocol Overview
The registration protocol operates as follows:
- The Secondary Node's IAM client opens a RegisterNode bidirectional stream to the Main Node
- The client immediately sends its NodeInfo as the first outgoing message
- The Main Node's Node Controller receives the NodeInfo, validates the Node's state, and links the stream to the Node ID
- The Main Node's Node Manager persists the Node information and marks it as connected
- The stream remains open: the Main Node can send commands (provisioning, certificate operations, pause/resume) and the Secondary Node can send an updated NodeInfo when its state changes
┌──────────────────┐ ┌──────────────────┐
│ Secondary Node │ │ Main Node │
│ (IAM Client) │ │ (IAM Server) │
└────────┬─────────┘ └────────┬─────────┘
│ │
│──── Open RegisterNode stream ──────────────▶│
│ │
│──── IAMOutgoingMessages { NodeInfo } ──────▶│
│ │ HandleNodeInfo()
│ │ LinkNodeIDToHandler()
│ │ SetNodeInfo() → persist
│ │
│◀─── IAMIncomingMessages { commands } ───────│
│ │
│──── IAMOutgoingMessages { responses } ─────▶│
│ │
│ ... stream remains open ... │
│ │
Connection Establishment
Endpoint Selection
The IAM client selects which Main Node endpoint to connect to based on the Secondary Node's provisioning state:
| Node State | Target Endpoint | Connection Security |
|---|---|---|
| unprovisioned | mainIAMPublicServerURL | TLS (server-only authentication) or insecure |
| provisioned / paused | mainIAMProtectedServerURL | mTLS (mutual authentication with client certificate) |
This separation ensures that:
- Unprovisioned Nodes can connect without client certificates (they have none yet) — they use the public endpoint
- Provisioned Nodes use mutual TLS for stronger authentication — they connect to the protected endpoint using their IAM-issued certificates
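The selection rule in the table can be sketched as a small function. This is only an illustration, not the IAM client's actual code: the IAMClientConfig class, the select_endpoint function, the security labels, and the example addresses are all hypothetical. Only the two URL parameter names and the state names come from this page. The page does not say which endpoint a Node in the error state uses, so the sketch rejects that state rather than guessing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IAMClientConfig:
    """Hypothetical holder for the two endpoint parameters named on this page."""
    main_iam_public_server_url: str
    main_iam_protected_server_url: str


def select_endpoint(state: str, config: IAMClientConfig) -> tuple[str, str]:
    """Return (url, security) for a provisioning state, following the table above."""
    if state == "unprovisioned":
        # No client certificate exists yet, so only the public endpoint is usable.
        return config.main_iam_public_server_url, "tls-or-insecure"
    if state in ("provisioned", "paused"):
        # The Node holds an IAM-issued certificate and authenticates with it.
        return config.main_iam_protected_server_url, "mtls"
    # The page does not define an endpoint for other states (for example "error").
    raise ValueError(f"no endpoint defined for state {state!r}")


config = IAMClientConfig("main-node:8090", "main-node:8089")  # example addresses only
print(select_endpoint("unprovisioned", config))
print(select_endpoint("paused", config))
```

Keeping the decision in one pure function makes it easy to re-evaluate the endpoint each time the connection loop starts a new attempt, which matters because the state can change between attempts.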
Credential Strategy
The IAM client supports a credential fallback mechanism for robust connectivity:
- Insecure mode (unprovisioned): Attempts insecure connection first, with TLS as a fallback. This handles the case where the Main Node may already be serving a secure listener.
- Secure mode (provisioned): Uses mTLS with the configured certificate storage. If the connection fails, the client cycles through available credentials before retrying.
When a credential fails, the client advances to the next available credential and rebuilds the gRPC stub before the next connection attempt.
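A minimal sketch of the "advance to the next credential" step, assuming the available credentials form a fixed ordered list and that the client wraps around to the first entry after the last one. Both points are assumptions, as are the CredentialCycler name and the string labels. The page specifies only the order for unprovisioned Nodes (insecure first, then TLS).

```python
class CredentialCycler:
    """Hypothetical helper that walks a fixed list of credential options."""

    def __init__(self, options: list[str]) -> None:
        if not options:
            raise ValueError("at least one credential option is required")
        self._options = list(options)
        self._index = 0

    @property
    def current(self) -> str:
        """Credential to use for the next connection attempt."""
        return self._options[self._index]

    def advance(self) -> str:
        """Move to the next option after a failure, wrapping around (assumed)."""
        self._index = (self._index + 1) % len(self._options)
        return self.current


# Unprovisioned Node: insecure first, TLS as the fallback.
cycler = CredentialCycler(["insecure", "tls"])
attempts = [cycler.current, cycler.advance(), cycler.advance()]
print(attempts)
```

In a real client, each call to advance would be followed by rebuilding the gRPC stub with the newly selected credential before the next attempt.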
Connection Loop
The IAM client runs a dedicated connection thread that implements a persistent connection loop:
ConnectionLoop:
    while not stopped:
        1. Create gRPC channel with current credentials
        2. Wait for channel to become ready (with timeout)
        3. Create RegisterNode bidirectional stream
        4. Notify: OnConnected()
           └── Send NodeInfo as first message
        5. Read incoming messages until stream closes
        6. Notify: OnDisconnected()
        7. If connection failed, try next credential
        8. Wait reconnect interval (3 seconds)
        9. Retry from step 1
The connection loop runs continuously until explicitly stopped. Any stream closure — whether from network failure, server shutdown, or explicit cancellation — triggers a reconnection attempt after a brief delay.
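The loop can be sketched in Python without any real gRPC calls. Here open_stream is a hypothetical stand-in for steps 1 to 3 (create the channel, wait for readiness, open the RegisterNode stream); it returns an iterable of incoming messages or raises ConnectionError. The callback names mirror OnConnected and OnDisconnected, but their signatures are assumptions. The sketch also assumes OnDisconnected fires only after a stream was actually opened.

```python
import threading
from typing import Callable, Iterable

RECONNECT_INTERVAL_SEC = 3.0  # value taken from this page


def connection_loop(
    open_stream: Callable[[str], Iterable[str]],
    credentials: list[str],
    on_connected: Callable[[], None],
    on_message: Callable[[str], None],
    on_disconnected: Callable[[], None],
    stop: threading.Event,
    reconnect_interval: float = RECONNECT_INTERVAL_SEC,
) -> None:
    """Keep reconnecting until `stop` is set."""
    index = 0
    while not stop.is_set():
        try:
            stream = open_stream(credentials[index])  # steps 1-3
        except ConnectionError:
            # Step 7: the attempt failed, so use the next credential next time.
            index = (index + 1) % len(credentials)
        else:
            on_connected()  # step 4: the caller sends NodeInfo here
            for message in stream:  # step 5: read until the stream closes
                on_message(message)
            on_disconnected()  # step 6
        # Step 8: wait, but wake up early if a stop is requested.
        stop.wait(reconnect_interval)


events: list[str] = []
stop = threading.Event()


def fake_open_stream(credential: str) -> Iterable[str]:
    """Simulated server that refuses insecure connections."""
    if credential == "insecure":
        raise ConnectionError("server only accepts TLS")
    return iter(["StartProvisioningRequest"])


def stop_after_disconnect() -> None:
    events.append("disconnected")
    stop.set()  # end the demo after the first full connect/disconnect cycle


connection_loop(
    fake_open_stream,
    ["insecure", "tls"],
    on_connected=lambda: events.append("connected"),
    on_message=events.append,
    on_disconnected=stop_after_disconnect,
    stop=stop,
    reconnect_interval=0.01,
)
print(events)
```

Waiting on the stop event instead of sleeping lets an explicit stop interrupt the reconnect delay immediately.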
NodeInfo Message
The first message sent on the stream is always a NodeInfo containing the Secondary Node's complete identity:
| Field | Description |
|---|---|
| node_id | Unique identifier for this Node within the Unit |
| node_type | Node classification (e.g., "secondary") |
| title | Human-readable Node name |
| max_dmips | Maximum processing capacity |
| total_ram | Total RAM available |
| os_info | Operating system type and version |
| cpus[] | CPU information (model, cores, threads, architecture) |
| partitions[] | Storage partitions with names, types, and sizes |
| attrs[] | Custom key-value attributes |
| state | Current provisioning state (unprovisioned, provisioned, paused, error) |
The NodeInfo is also re-sent whenever the Node's state changes (e.g., after provisioning completes, or when the Node
is paused/resumed).
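For illustration, the message can be modelled as an immutable record. The field names follow the table above, but the Python types, the byte unit for total_ram, and the example values are assumptions. The actual message is defined by the IAM gRPC schema, which this page does not reproduce.

```python
from dataclasses import dataclass, field, replace

VALID_STATES = {"unprovisioned", "provisioned", "paused", "error"}


@dataclass(frozen=True)
class NodeInfo:
    """Field names follow the table above; the types are assumptions."""
    node_id: str
    node_type: str
    title: str
    max_dmips: int
    total_ram: int  # unit assumed to be bytes
    os_info: str
    cpus: tuple = ()  # per-CPU records (model, cores, threads, architecture)
    partitions: tuple = ()  # per-partition records (name, types, size)
    attrs: dict = field(default_factory=dict)  # custom key-value attributes
    state: str = "unprovisioned"

    def __post_init__(self) -> None:
        if self.state not in VALID_STATES:
            raise ValueError(f"unknown provisioning state: {self.state!r}")


info = NodeInfo(
    node_id="node-1",
    node_type="secondary",
    title="Example secondary node",
    max_dmips=10000,
    total_ram=2 * 1024**3,
    os_info="linux",
)
# A state change yields a new NodeInfo, which would be re-sent on the stream.
provisioned = replace(info, state="provisioned")
print(info.state, "->", provisioned.state)
```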
Server-Side Registration Handling
Node Controller
The Main Node's Node Controller (NodeController) manages all active RegisterNode streams. It maintains a
registry mapping Node IDs to their stream handlers.
When a new stream arrives:
- A NodeStreamHandler is created and stored in the controller's handler map
- The handler enters a read loop, waiting for messages from the Secondary Node
- The first message must be a NodeInfo, which triggers HandleNodeInfo()
HandleNodeInfo Processing
When the Node Controller receives a NodeInfo message:
- State validation: The controller checks whether the Node's provisioning state is appropriate for the connection type:
  - Public endpoint: Only unprovisioned Nodes are accepted
  - Protected endpoint: Only provisioned or paused Nodes are accepted
  - If the state doesn't match, the handler is unlinked (but the stream is not forcibly closed)
- Node Manager update: The Node information is passed to the Node Manager, which:
  - Stores the Node info in its cache and persistent database
  - Marks the Node as connected (is_connected = true)
  - Notifies all registered listeners of the new or updated Node
- Handler linking: The Node ID is linked to the stream handler. If a previous handler was linked to the same Node ID (stale connection), it is unlinked first.
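The three steps can be sketched with in-memory stand-ins. The class and method names below are simplified, hypothetical Python counterparts of the components named on this page, not their real interfaces. The sketch assumes that a rejected NodeInfo is not passed on to the Node Manager; the page does not state this explicitly.

```python
ACCEPTED_STATES = {
    "public": {"unprovisioned"},
    "protected": {"provisioned", "paused"},
}


class NodeManager:
    """In-memory stand-in; the Node Manager described above also persists data."""

    def __init__(self) -> None:
        self.nodes: dict[str, dict] = {}
        self.listeners: list = []

    def set_node_info(self, node_id: str, state: str) -> None:
        self.nodes[node_id] = {"state": state, "is_connected": True}
        for listener in self.listeners:
            listener(node_id, state)


class NodeController:
    def __init__(self, endpoint: str, manager: NodeManager) -> None:
        self.endpoint = endpoint  # "public" or "protected"
        self.manager = manager
        self.handlers: dict[str, object] = {}  # Node ID -> linked stream handler

    def handle_node_info(self, handler: object, node_id: str, state: str) -> bool:
        # 1. State validation against the endpoint the stream arrived on.
        if state not in ACCEPTED_STATES[self.endpoint]:
            if self.handlers.get(node_id) is handler:
                del self.handlers[node_id]  # unlink only; the stream stays open
            return False
        # 2. Node Manager update: store, mark connected, notify listeners.
        self.manager.set_node_info(node_id, state)
        # 3. Handler linking: assigning the key replaces any stale handler.
        self.handlers[node_id] = handler
        return True


manager = NodeManager()
seen = []
manager.listeners.append(lambda node_id, state: seen.append((node_id, state)))
controller = NodeController("public", manager)

old_handler, new_handler = object(), object()
print(controller.handle_node_info(old_handler, "node-1", "unprovisioned"))
print(controller.handle_node_info(new_handler, "node-1", "unprovisioned"))
print(controller.handlers["node-1"] is new_handler)
print(controller.handle_node_info(new_handler, "node-1", "provisioned"))
```

The last call shows a provisioned Node being rejected on the public endpoint: the handler is unlinked, while the stored Node record is left untouched.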
Command Forwarding
Once a Node is registered, the Main Node can forward operations to it through the stream. The Node Controller provides methods that:
- Look up the stream handler by Node ID
- Send a request message (IAMIncomingMessages) on the stream
- Wait for the corresponding response (IAMOutgoingMessages) with a timeout
Supported operations forwarded to Secondary Nodes:
| Operation | Request | Response |
|---|---|---|
| Get certificate types | GetCertTypesRequest | CertTypes |
| Start provisioning | StartProvisioningRequest | StartProvisioningResponse |
| Finish provisioning | FinishProvisioningRequest | FinishProvisioningResponse |
| Deprovision | DeprovisionRequest | DeprovisionResponse |
| Pause node | PauseNodeRequest | PauseNodeResponse |
| Resume node | ResumeNodeRequest | ResumeNodeResponse |
| Create key | CreateKeyRequest | CreateKeyResponse |
| Apply certificate | ApplyCertRequest | ApplyCertResponse |
Each forwarded operation uses a request-response pattern with a configurable timeout. If the Secondary Node does not respond within the timeout, the operation returns a timeout error.
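The request-response pattern over a stream can be sketched with a future per outstanding request. This is an assumption-laden illustration: the NodeStreamHandler below is a simplified Python stand-in, requests are matched to responses by response type name (the page does not say how correlation works), and messages are plain dictionaries rather than protobuf objects.

```python
from concurrent.futures import Future, TimeoutError as FutureTimeout
from typing import Callable


class NodeStreamHandler:
    """Hypothetical handler; `send` is whatever writes a message to the stream."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self._send = send
        self._pending: dict[str, Future] = {}  # response type -> waiting future

    def request(self, response_type: str, message: dict, timeout: float) -> dict:
        """Send a request and block until the matching response or the timeout."""
        future: Future = Future()
        self._pending[response_type] = future
        self._send(message)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout:
            raise TimeoutError(f"no {response_type} within {timeout}s") from None
        finally:
            self._pending.pop(response_type, None)

    def on_response(self, response_type: str, message: dict) -> None:
        """Called by the read loop when a response arrives from the Node."""
        future = self._pending.get(response_type)
        if future is not None:
            future.set_result(message)


# A simulated Node that answers immediately.
handler = NodeStreamHandler(
    send=lambda msg: handler.on_response("PauseNodeResponse", {"ok": True})
)
print(handler.request("PauseNodeResponse", {"type": "PauseNodeRequest"}, timeout=1.0))

# A simulated Node that never answers.
silent = NodeStreamHandler(send=lambda msg: None)
try:
    silent.request("PauseNodeResponse", {"type": "PauseNodeRequest"}, timeout=0.05)
except TimeoutError as err:
    print("timed out:", err)
```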
Disconnection and Reconnection
Disconnection Detection
Disconnection is detected when:
- The gRPC stream read operation returns false (stream closed by peer or network failure)
- The server context is cancelled (server shutdown)
- A write operation fails (broken pipe)
Server-Side Cleanup
When a Secondary Node disconnects:
- The NodeStreamHandler destructor calls SetNodeConnected(nodeID, false) on the Node Manager
- The handler is removed from the Node Controller's registry
- The Node's information remains in the Node Manager's persistent storage; only the connected flag changes
- All pending response promises are cleared (any in-flight operations receive cancellation)
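The cleanup effects listed above can be sketched as a single function. The page describes a destructor doing this work; Python has no deterministic destructors, so the sketch uses an explicit function instead. All names and data shapes here are hypothetical.

```python
from concurrent.futures import Future


def cleanup_on_disconnect(
    node_id: str,
    handlers: dict[str, object],
    nodes: dict[str, dict],
    pending: list[Future],
) -> None:
    """Apply the four cleanup effects for a disconnected Node."""
    nodes[node_id]["is_connected"] = False  # stored info stays; only the flag changes
    handlers.pop(node_id, None)  # drop the handler from the controller's registry
    for future in pending:
        future.cancel()  # waiters observe a cancellation instead of a response
    pending.clear()


handlers = {"node-1": object()}
nodes = {"node-1": {"state": "provisioned", "is_connected": True}}
in_flight: Future = Future()
pending = [in_flight]

cleanup_on_disconnect("node-1", handlers, nodes, pending)
print(nodes["node-1"], handlers, in_flight.cancelled())
```

Cancelling the pending futures means a caller blocked on a forwarded operation is released promptly rather than waiting for the full operation timeout.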
Client-Side Reconnection
The IAM client handles disconnection automatically:
- OnDisconnected() is called, notifying the CurrentNode handler that the Node is no longer connected
- The connection loop waits for the reconnect interval (3 seconds by default)
- A new connection attempt begins, cycling to the next credential if the previous attempt failed
- On successful reconnection, OnConnected() fires and the client sends a fresh NodeInfo
Certificate-Triggered Reconnection
The IAM client also subscribes to certificate change notifications. When the Node's IAM certificate is renewed:
- The GRPCClientCertListener receives the certificate change event
- A reconnection is scheduled (with a 10-second retry timeout)
- The client rebuilds its credentials and reconnects with the new certificate
This ensures that certificate rotation does not permanently break the registration stream.
Provisioning State Transitions
The registration stream plays a central role during provisioning of Secondary Nodes:
Initial Registration (Unprovisioned)
Secondary Node Main Node (Public Endpoint)
│ │
│── RegisterNode stream (insecure/TLS) ───────▶│
│── NodeInfo { state: "unprovisioned" } ──────▶│
│ │ Node registered as unprovisioned
│◀── StartProvisioningRequest ─────────────────│
│── StartProvisioningResponse ────────────────▶│
│◀── CreateKeyRequest ─────────────────────────│
│── CreateKeyResponse { CSR } ────────────────▶│
│◀── ApplyCertRequest { signed cert } ─────────│
│── ApplyCertResponse ────────────────────────▶│
│◀── FinishProvisioningRequest ────────────────│
│── FinishProvisioningResponse ───────────────▶│
│ │
│── NodeInfo { state: "provisioned" } ────────▶│
│ │ State updated
Re-Registration (Provisioned)
After provisioning completes, the Node has certificates and reconnects to the protected endpoint:
- The IAM client detects the state change to provisioned
- The certificate change triggers a reconnection
- The client reconnects to mainIAMProtectedServerURL using mTLS
- A fresh NodeInfo with state: "provisioned" is sent
- The Node Controller validates the state against the protected endpoint (accepted)
Node Departure
A Node "leaves" the Unit in one of two ways:
Graceful Departure (Deprovisioning)
- A DeprovisionRequest is sent to the Node through the registration stream
- The Node clears its provisioning state and certificates
- The Node sends updated NodeInfo with state: "unprovisioned"
- The Node Controller detects the state mismatch (unprovisioned on protected endpoint) and unlinks the handler
- The Node Manager removes the unprovisioned Node from persistent storage
Ungraceful Departure (Network Loss)
- The stream read fails — the handler detects disconnection
- The Node is marked as disconnected, but its information remains in the Node Manager
- If the Node never reconnects, its entry persists in the Node Manager's database
- The Node can be explicitly removed through administrative action
Configuration
The IAM client configuration controls registration behavior:
| Parameter | Description | Default |
|---|---|---|
| mainIAMPublicServerURL | Main Node's public IAM endpoint (for unprovisioned Nodes) | (none) |
| mainIAMProtectedServerURL | Main Node's protected IAM endpoint (for provisioned Nodes) | (none) |
| certStorage | Certificate storage name for mTLS credentials | (none) |
Connection timing constants (defined in the implementation):
| Constant | Value | Purpose |
|---|---|---|
| Reconnect interval | 3 seconds | Delay between reconnection attempts |
| Connect timeout | 3 seconds | Maximum time to wait for channel handshake |
| Service timeout | 10 seconds | Timeout for individual RPC operations |
| Cert reconnect retry | 10 seconds | Timeout for certificate-triggered reconnection |
Related Pages
- Node Identity — how Nodes establish their identity (NodeInfo structure, CurrentNode handler, Node Manager)
- Provisioning and Enrollment — the provisioning workflow that uses the registration stream
- Architecture Overview — system-wide component relationships and inter-node communication
- Node Lifecycle — Node state machine from registration to removal
- Multi-Node Architecture — how multiple Nodes coordinate within a Unit
- Certificate Architecture — certificate hierarchy and trust chain used for mTLS