Version: v1.1

Kubernetes Deployment Architecture

AosCloud runs on Amazon EKS (Elastic Kubernetes Service) with a single managed node group of Bottlerocket-based worker nodes. A set of Helm charts covers the service mesh, observability, security scanning, certificate management, and the AosCloud application itself. This page documents the cluster configuration, the deployed workloads, and how Kubernetes integrates with AWS services via IRSA.

Prerequisites

Before reading this page, you should be familiar with:

EKS Cluster Configuration

Cluster Settings

| Parameter | Value |
| --- | --- |
| Kubernetes version | 1.35 |
| API endpoint access | Private (default), optionally public |
| Networking | VPC CNI (pod-level ENI attachment) |
| Services CIDR | Configurable |
| Logging | CloudWatch Log Group (/aws/eks/<name>/cluster) |

The cluster control plane runs under a dedicated IAM role with the AmazonEKSClusterPolicy, AmazonEKSServicePolicy, and AmazonEKSVPCResourceController managed policies attached.

Node Group

AosCloud uses a single managed node group with the following configuration:

| Parameter | Value |
| --- | --- |
| AMI type | BOTTLEROCKET_x86_64 |
| Default instance type | t3a.2xlarge (dev) / m5.2xlarge (prod) |
| Scaling: minimum nodes | 3 |
| Scaling: maximum nodes | 10 |
| Root volume | 50 GB gp3 EBS |
| Metadata service | IMDSv2 required (http_tokens = "required") |

Both the instance type and the scaling limits can be set per environment.

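Allowed instance types (validated): t3/t3a (small through 2xlarge), t4g (xlarge, 2xlarge), m5/m5a (large through 24xlarge), m6a.xlarge, c5 (large through 24xlarge), r5 (large through 24xlarge). Note that t4g instances are Arm-based (AWS Graviton2), so they cannot run the BOTTLEROCKET_x86_64 AMI type and would need an Arm AMI type such as BOTTLEROCKET_ARM_64.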
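The allowed list above amounts to a per-family size range check. The following is a minimal sketch of such a check, not the validation logic AosCloud actually ships; the names `SIZE_ORDER`, `ALLOWED_RANGES`, and `is_allowed_instance_type` are hypothetical, and the ranges are transcribed from the list above.

```python
# Size names in ascending order; used only to compare positions in a range.
SIZE_ORDER = [
    "nano", "micro", "small", "medium", "large", "xlarge", "2xlarge",
    "4xlarge", "8xlarge", "9xlarge", "12xlarge", "16xlarge", "18xlarge",
    "24xlarge",
]

# (smallest, largest) allowed size per family, transcribed from the list above.
ALLOWED_RANGES = {
    "t3": ("small", "2xlarge"),
    "t3a": ("small", "2xlarge"),
    "t4g": ("xlarge", "2xlarge"),
    "m5": ("large", "24xlarge"),
    "m5a": ("large", "24xlarge"),
    "m6a": ("xlarge", "xlarge"),
    "c5": ("large", "24xlarge"),
    "r5": ("large", "24xlarge"),
}


def is_allowed_instance_type(instance_type: str) -> bool:
    """Return True if instance_type falls inside an allowed family/size range."""
    family, _, size = instance_type.partition(".")
    if family not in ALLOWED_RANGES or size not in SIZE_ORDER:
        return False
    low, high = ALLOWED_RANGES[family]
    return SIZE_ORDER.index(low) <= SIZE_ORDER.index(size) <= SIZE_ORDER.index(high)
```

For example, `is_allowed_instance_type("m5.2xlarge")` passes while `is_allowed_instance_type("t3.micro")` does not. The sketch only compares positions in the size ordering; it does not verify that AWS actually offers a given size in a given family.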

Launch Template

The launch template configures Bottlerocket OS-specific settings:

  • Cluster name, endpoint, and CA certificate injected for kubelet registration
  • Admin container enabled for debugging access
  • Tag specifications applied to instances, volumes, and network interfaces
  • Autoscaler ownership tags (k8s.io/cluster-autoscaler/enabled, k8s.io/cluster-autoscaler/<cluster-id>) for automatic node scaling
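Bottlerocket reads its node settings from TOML user data, so the first two launch template items can be illustrated with a small renderer. This is a sketch under assumptions: the function name `render_bottlerocket_user_data` and its arguments are hypothetical, and the real launch template may set additional keys that this page does not list.

```python
def render_bottlerocket_user_data(cluster_name: str, endpoint: str, ca_data: str) -> str:
    """Render minimal Bottlerocket TOML user data for kubelet registration.

    ca_data is expected to be the base64-encoded cluster CA certificate.
    """
    lines = [
        "[settings.kubernetes]",
        f'cluster-name = "{cluster_name}"',
        f'api-server = "{endpoint}"',
        f'cluster-certificate = "{ca_data}"',
        "",
        # Enables the admin host container used for debugging access.
        "[settings.host-containers.admin]",
        "enabled = true",
    ]
    return "\n".join(lines) + "\n"
```

The launch template would carry the rendered string as base64-encoded user data. The three values come from the EKS cluster itself, which is why they are injected at provisioning time rather than hard-coded.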

EKS Managed Addons

The following addons are installed as EKS managed addons rather than Helm charts, so EKS validates each addon version against the cluster version:

| Addon | Version | Purpose | IRSA Role |
| --- | --- | --- | --- |
| kube-proxy | v1.35.3-eksbuild.11 | Network proxy on each node | None (DaemonSet) |
| aws-ebs-csi-driver | v1.60.0-eksbuild.1 | Persistent volumes via EBS | <cluster>-ebs-controller |
| aws-efs-csi-driver | v3.2.0-eksbuild.1 | Shared filesystem via EFS | <cluster>-efs-controller |
| coredns | v1.14.3-eksbuild.2 | Cluster DNS resolution | None |
| vpc-cni | v1.22.1-eksbuild.2 | Pod networking (VPC-native IPs) | <cluster>-vpc-cni-controller |
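The addon versions above follow the pattern `v<major>.<minor>.<patch>-eksbuild.<n>`. As a rough illustration, the sketch below parses that pattern and checks that the kube-proxy addon tracks the cluster's Kubernetes minor version, which is the case in the table (v1.35.3 on a 1.35 cluster). The helper names are hypothetical; other addons use their own version lines and are not comparable this way.

```python
import re

ADDON_VERSION = re.compile(r"^v(\d+)\.(\d+)\.(\d+)-eksbuild\.(\d+)$")


def parse_addon_version(version: str) -> tuple[int, int, int, int]:
    """Split an EKS addon version into (major, minor, patch, eksbuild)."""
    match = ADDON_VERSION.match(version)
    if match is None:
        raise ValueError(f"unrecognized addon version: {version!r}")
    major, minor, patch, build = (int(part) for part in match.groups())
    return major, minor, patch, build


def kube_proxy_matches_cluster(addon_version: str, cluster_version: str) -> bool:
    """Return True if the kube-proxy major.minor equals the cluster version."""
    major, minor, _, _ = parse_addon_version(addon_version)
    return f"{major}.{minor}" == cluster_version
```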

Helm Chart Catalog

All Helm charts are deployed to EKS as part of the infrastructure provisioning process.

Infrastructure Charts (Primary)

| Chart | Release Name | Namespace | Version | Purpose |
| --- | --- | --- | --- | --- |
| aws_autoscaler | cluster-autoscaler | kube-system | 9.57.0 | Scales node group based on pod scheduling demand |
| aws_csi_secrets_provider | secrets-store-csi-aws | kube-system | 0.0.4 (local tgz) | Mounts AWS Secrets Manager values as Kubernetes volumes |
| aws_for_fluent_bit | aws-for-fluent-bit | kube-system | 0.1.35 | Ships pod logs to CloudWatch Logs |
| aws_load_balancer_controller | aws-load-balancer-controller | kube-system | 1.14.1 | Provisions NLB/ALB for Kubernetes Services and Ingresses |
| cert_manager | cert-manager | cert-manager | (matches app version) | Automates TLS certificate provisioning (Let's Encrypt, ACM, custom CA) |
| istio_base | base | istio-system | (configurable) | Istio CRDs and base resources |
| istio_discovery | istio-discovery | istio-system | (configurable) | Istiod control plane (pilot) |
| istio_cni | istio-cni | istio-system | (configurable) | CNI plugin for ambient mode traffic interception |
| istio_ztunnel | ztunnel | istio-system | (configurable) | L4 transparent proxy for ambient mesh |
| metrics_server | metrics-server | kube-system | 3.13.0 | Exposes pod/node resource metrics for HPA and kubectl top |

Secondary Charts

These charts are deployed only after the primary charts are ready:

| Chart | Release Name | Namespace | Purpose |
| --- | --- | --- | --- |
| istio_ingress (ig_public) | istio-ingressgateway | istio-system | Public-facing NLB ingress gateway for external traffic |

Observability Charts

| Chart | Release Name | Namespace | Version | Purpose |
| --- | --- | --- | --- | --- |
| prometheus | <base>-prometheus | monitoring | 23.1.0 | Cluster metrics collection, alerting, kube-state-metrics, node-exporter |

Application & Operator Charts

| Chart | Release Name | Namespace | Purpose |
| --- | --- | --- | --- |
| rabbitmq-cluster-operator | (Bitnami operator) | configurable | Deploys and manages RabbitMQ clusters as Kubernetes-native CRDs |
| aos | <base_name> | <environment> | Main AosCloud application (all microservices, InfluxDB2 dependency) |

Note: RabbitMQ runs as pods inside EKS. It is deployed with the rabbitmq-cluster-operator Helm chart and managed through the Bitnami RabbitMQ Cluster Operator CRDs.

Istio Service Mesh — Ambient Mode

AosCloud deploys Istio in ambient mode, a sidecar-less architecture that uses per-node ztunnel proxies instead of per-pod sidecars. The four Istio charts together form the complete mesh:

| Component | Chart | Function |
| --- | --- | --- |
| Base | istio_base | CRDs (VirtualService, Gateway, PeerAuthentication, etc.) |
| Discovery (Istiod) | istio_discovery | Control plane: pushes configuration to ztunnel and waypoint proxies |
| CNI | istio_cni | Node-level network plugin that redirects traffic into the mesh without init containers |
| Ztunnel | istio_ztunnel | Per-node L4 proxy handling mTLS, authorization, and telemetry |

The CNI chart is configured with profile: "ambient", which activates the ambient-specific traffic interception rules.

Istio Ingress Gateway

The public Istio ingress gateway (istio-ingressgateway) is deployed as a secondary chart and provisions an internet-facing AWS Network Load Balancer (NLB). Its Service annotations configure the following settings:

  • External NLB with IP target type
  • Cross-zone load balancing enabled
  • Source IP stickiness
  • Proxy protocol enabled

All external HTTPS traffic enters the cluster through this gateway.
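The NLB settings listed above correspond to Service annotations understood by the AWS Load Balancer Controller. The sketch below shows one plausible mapping; the annotation keys are the controller's documented ones, but the exact values used in the istio_ingress chart are not shown on this page, so treat the dictionary as an assumption rather than the deployed configuration. The function name is hypothetical.

```python
PREFIX = "service.beta.kubernetes.io/aws-load-balancer-"


def nlb_service_annotations() -> dict[str, str]:
    """Build Service annotations for an internet-facing NLB with IP targets."""
    return {
        # External NLB managed by the AWS Load Balancer Controller.
        PREFIX + "type": "external",
        PREFIX + "scheme": "internet-facing",
        # Register pod IPs directly instead of node instances.
        PREFIX + "nlb-target-type": "ip",
        # Cross-zone load balancing.
        PREFIX + "attributes": "load_balancing.cross_zone.enabled=true",
        # Source IP stickiness on the target group.
        PREFIX + "target-group-attributes": (
            "stickiness.enabled=true,stickiness.type=source_ip"
        ),
        # Proxy protocol v2 on all target groups.
        PREFIX + "proxy-protocol": "*",
    }
```

With proxy protocol enabled on the NLB, the gateway's listeners must also be configured to accept it; otherwise the gateway cannot parse incoming connections.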

Pod-to-AWS-Service Mapping

The following table shows how AosCloud pods connect to AWS managed services:

| Pod / Service Account | AWS Service | Access Method | Purpose |
| --- | --- | --- | --- |
| Backend, API, Auth (sa_app) | S3 (backend bucket) | IRSA | Deployable Item storage |
| Backend, API, Auth (sa_app) | KMS | IRSA | Encryption/decryption of stored objects |
| Task runner, Message Handler (sa_task) | S3, KMS, Secrets Manager | IRSA | Full access to secrets and storage for deployment tasks |
| Task runner (sa_task) | EC2 | IRSA | Unit management operations |
| Service Discovery (sa_sd) | S3 (backend bucket) | IRSA | Service registration data storage |
| Secrets updater (sa_secrets_manager) | Secrets Manager | IRSA | Synchronizes secrets to Kubernetes |
| Base services (sa_base) | S3, KMS, Secrets Manager, EC2, CloudWatch | IRSA | Infrastructure-wide access for operational services |
| Data services (sa_data_services) | S3 (backend bucket) | IRSA | Data pipeline access |
| Units Queues Management (sa_uqm) | S3, EC2 | IRSA | Queue management and Unit operations |
| Fluent Bit | CloudWatch Logs | Node role | Ships container logs to CloudWatch |
| Cluster Autoscaler | Auto Scaling Groups, EC2 | IRSA | Scales node group |
| Load Balancer Controller | EC2, ELB | IRSA | Provisions NLB/ALB resources |
| EBS CSI Driver | EBS | IRSA (EKS addon) | Persistent volume lifecycle |
| EFS CSI Driver | EFS | IRSA (EKS addon) | Shared filesystem mounts |
| VPC CNI | EC2 (ENI management) | IRSA (EKS addon) | Pod IP address allocation |
| Prometheus | - | (in-cluster scraping) | Metrics collection from all pods |
| All pods | Aurora PostgreSQL | Network (security group) | Primary database |
| All pods | ElastiCache Redis | Network (security group) | Caching and session storage |
| Alert Handler | DocumentDB | Network (security group) | Alert storage (MongoDB-compatible) |
| All pods | RabbitMQ (in-cluster) | Kubernetes Service DNS | Message queue (via rabbitmq-cluster-operator) |

EKS OIDC Provider and IRSA

IRSA (IAM Roles for Service Accounts) is the mechanism that grants Kubernetes pods fine-grained AWS permissions without embedding credentials.

How OIDC Integration Works

  1. OIDC Provider Creation: When the EKS cluster is created, the cluster's OIDC issuer URL is registered as an IAM OIDC identity provider.

  2. Trust Policy: Each IAM role defines a trust policy that allows sts:AssumeRoleWithWebIdentity from the OIDC provider, scoped to specific Kubernetes service account names:

    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "<OIDC_PROVIDER_ARN>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "<OIDC_ISSUER>:sub": "system:serviceaccount:<namespace>:<sa-name>"
        }
      }
    }

  3. Service Account Annotation: Kubernetes service accounts are annotated with eks.amazonaws.com/role-arn: <ROLE_ARN>. The EKS pod identity webhook then mounts a projected service account token into each pod that uses the account and sets the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables.

  4. Credential Injection: The AWS SDK in each pod reads the projected token and exchanges it for temporary credentials through sts:AssumeRoleWithWebIdentity, requiring no application-level configuration.
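The trust policy and annotation steps above can be sketched as two small builders. This is an illustration under assumptions, not the provisioning code itself: the function names are hypothetical, and the account ID, issuer, namespace, and role ARN passed in are placeholders.

```python
import json


def build_irsa_trust_policy(
    account_id: str, oidc_issuer: str, namespace: str, service_account: str
) -> dict:
    """Build an IAM trust policy scoped to one Kubernetes service account."""
    # IAM condition keys use the issuer without its URL scheme.
    issuer = oidc_issuer.removeprefix("https://")
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {"StringLike": {f"{issuer}:sub": subject}},
            }
        ],
    }


def service_account_manifest(name: str, namespace: str, role_arn: str) -> dict:
    """Build a ServiceAccount manifest carrying the IRSA role annotation."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = build_irsa_trust_policy(
        "111122223333",
        "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
        "kube-system",
        "cluster-autoscaler",
    )
    print(json.dumps(policy, indent=2))
```

Because the condition uses StringLike, the service account portion may contain wildcards, which is how a single role can be shared by several service accounts with a common prefix.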

IRSA Roles for Cluster Infrastructure

The following IRSA roles are created alongside the EKS cluster for its infrastructure components:

| Role Name | Service Accounts | Policies |
| --- | --- | --- |
| <cluster>-autoscaling-role | kube-system:aws-node, kube-system:cluster-autoscaler | Custom autoscaler policy |
| <cluster>-load-balancer-controller | kube-system:aws-load-balancer-controller | Custom LB controller policy |
| <cluster>-ebs-controller | kube-system:ebs-csi-* | AmazonEBSCSIDriverPolicy |
| <cluster>-efs-controller | kube-system:efs-csi* | AmazonEFSCSIDriverPolicy |
| <cluster>-vpc-cni-controller | kube-system:aws-node* | AmazonEKS_CNI_Policy |
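Several entries in the table above use wildcard service account patterns (for example ebs-csi-*), which work because the trust policy condition is StringLike. The sketch below approximates that matching to show which subjects a pattern would admit. The helper names are hypothetical, and the concrete service account names in the usage note are examples rather than names confirmed by this page.

```python
import re


def string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' matches any run, '?' exactly one character."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    # Matching is case-sensitive and must cover the whole value.
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def subject(namespace: str, service_account: str) -> str:
    """Format the OIDC 'sub' claim for a Kubernetes service account."""
    return f"system:serviceaccount:{namespace}:{service_account}"
```

For instance, `string_like(subject("kube-system", "ebs-csi-*"), subject("kube-system", "ebs-csi-controller-sa"))` is true, while the same pattern rejects an identically named account in another namespace. Wider patterns admit more service accounts, so they are best kept as narrow as the workload allows.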

IRSA Roles for Application Workloads

Application-level IRSA roles:

| Role Name | Assigned Pods | Key Permissions |
| --- | --- | --- |
| <project>-<env>-app | API, Auth, Backend | S3 read/write (backend bucket), KMS encrypt/decrypt |
| <project>-<env>-task | Task runner, Message Handler | S3, KMS, Secrets Manager (full), EC2 operations |
| <project>-<env>-sd | Service Discovery | S3 read/write (backend bucket) |
| <project>-<env>-secrets-manager | Secrets updater | Secrets Manager read/write |
| <project>-<env>-base | Operational services | S3, KMS, Secrets Manager, EC2, S3 (infra bucket), SaaS, CloudWatch |
| <project>-<env>-data-services | Data services | S3 read/write (backend bucket) |
| <project>-<env>-qm | Units Queues Management | S3, EC2 operations |

Deployment Ordering

The primary, secondary, and AOS charts are deployed in a strict dependency order; the Prometheus stack is deployed independently of that chain:

  1. Primary charts: All infrastructure charts install in parallel after ECR sync and auth patch complete
  2. Secondary charts: Istio ingress gateway installs after all primary charts
  3. AOS chart: The main application installs last, after secondary charts complete
  4. Prometheus stack: Deployed independently from the primary/secondary chain
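The ordering above is a small dependency graph, which can be sketched with the standard library's topological sorter. The stage names below are hypothetical labels for the steps in the list, and treating Prometheus as having no prerequisites at all is an assumption; the page only states that it is independent of the primary/secondary chain.

```python
from graphlib import TopologicalSorter

# Each key depends on every stage in its value set.
DEPENDENCIES = {
    "primary_charts": {"ecr_sync", "auth_patch"},
    "secondary_charts": {"primary_charts"},
    "aos_chart": {"secondary_charts"},
    "prometheus": set(),  # assumed to have no prerequisites
}


def deployment_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group stages into batches whose members can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorter.get_ready()
        batches.append(sorted(ready))
        sorter.done(*ready)
    return batches
```

Each batch contains stages whose prerequisites are all complete, so `deployment_batches(DEPENDENCIES)` places the primary charts before the secondary charts and the AOS chart last, with Prometheus free to start in the first batch.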

Additional Resources

EFS for InfluxDB2

A dedicated EFS filesystem is provisioned for InfluxDB2 persistent storage within the AOS chart:

  • Encrypted with the infrastructure KMS key
  • Access point at /influxdb2 (UID/GID 1000, permissions 775)
  • Mount targets created in the EKS pod subnets
  • Referenced in the AOS Helm values as an EFS volume handle
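With the EFS CSI driver, a volume handle that targets an access point takes the form `<file-system-id>::<access-point-id>`. The sketch below builds a PersistentVolume manifest using that format. It is an illustration only: the function name, the volume name, and the capacity value are hypothetical, and the AOS chart may wire the handle into its values differently.

```python
def influxdb_pv_manifest(
    file_system_id: str,
    access_point_id: str,
    name: str = "influxdb2-efs",  # hypothetical volume name
    capacity: str = "10Gi",  # required by Kubernetes, not enforced by EFS
) -> dict:
    """Build a PersistentVolume that mounts an EFS access point via the CSI driver."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": capacity},
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "csi": {
                "driver": "efs.csi.aws.com",
                # Empty middle field: no subpath, the access point sets the root.
                "volumeHandle": f"{file_system_id}::{access_point_id}",
            },
        },
    }
```

Because the access point enforces the /influxdb2 root directory and the UID/GID, the pod does not need to manage ownership of the mounted path itself.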