AWS Distro for OpenTelemetry

ADOT EKS Add-on v0.88.0 Advanced Configuration Migration Guide

ADOT EKS Add-on v0.88.0 Advanced Configuration Migration Guide

ADOT EKS Add-on v0.88.0 will be introducing a set of breaking changes to the add-on advanced configuration. These breaking changes will affect any user who has configured an Advanced Configuration which uses the collector.* section. Users with an incompatible advanced configuration will receive errors when upgrading their add-on version to v0.88.0 or beyond. These changes are intended to provide a better interface for configuring the ADOT Collector deployments that can be opted into through the advanced configuration.

This guide will provide an overview of what has changed and provide explicit steps on how you can migrate a pre v0.88.0-eksbuild.1 advanced configuration to the new schema. Please visit the v0.88.0 advanced configuration documentation for a full list of configurable options for Add-on version v0.88.0 and beyond.

Notable changes

Separated collector deployments for Prometheus metrics and OTLP trace ingestion

Previously, configuring a metrics (collector.amp.enabled or collector.emf.enabled) pipeline in addition to collector.xray.enabled would create a single OpenTelemetryCollector custom resource. This has now been split into two distinct OpenTelemetryCollector custom resource deployments, prometheusMetrics and otlpIngest. Doing this provides benefits such as the usage of distinct service accounts with a minimum set of permissions and individually configurable CPU and memory allocations.

Service accounts are now created with a fixed name and creation behavior

Previously users would be instructed to create service accounts as a prerequisite step and then set collector.serviceAccount.create to false while providing the previously created service account names. Service accounts will now always be created when opting into a preconfigured collector custom resource. Service accounts will also use a fixed, non configurable, name that is unique per collector custom resource
Users will now be required to provide an annotation eks.amazonaws.com/role-arn after creating an IAM Role associated to the provided Kubernetes service account name and namespace.

Deployment mode is no longer configurable

The option to configure a deployment mode has been removed in the current advanced configuration version. This is now automatically set for the preconfigured OpenTelmetery collector custom resource.

Replica count is no longer configurable

The option to configure the replica count of the advanced configuration has been removed in the current advanced configuration version. Increasing the replica count could lead to errors with certain pre-configured Collector configurations.

Increased resource configuration defaults

The default CPU and Memory resource requests and limits have been increased. The previous defaults did not cover basic use cases which could lead to CPU throttling or out of memory errors. It is highly recommended to configure resource advanced configuration values to match to the cluster workload.

Advanced Configuration Migration

The following sections will detail how to convert your pre v0.88.0 advanced configuration to a format for v0.88.0 and beyond. A scenario will be presented where Amazon Managed Service for Prometheus, X-Ray, and CloudWatch EMF have been configured in the same template.

Create new IAM roles for Kubernetes service accounts

Two new IAM roles will be needed in this scenario. One for the prometheusMetrics configuration and one for the otlpIngest configuration. Our examples will use eksctl for creating the IAM Roles but alternative options can be found in the IAM roles for service account documentation. In the following examples <cluster_name> must be substituted for the name of your EKS cluster.

eksctl create iamserviceaccount \
--name adot-col-prom-metrics \
--namespace opentelemetry-operator-system \
--cluster <cluster_name> \
--attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
--attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
--role-only \
--approve
eksctl create iamserviceaccount \
--name adot-col-otlp-ingest \
--namespace opentelemetry-operator-system \
--cluster <cluster_name> \
--attach-policy-arn arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess \
--role-only \
--approve

Pre v0.88.0 Advanced Configuration

collector:
serviceAccount:
create: false
name: "addon-test-sa"
xray:
enabled: true
amp:
enabled: true
remoteWriteEndpoint: "https://aps-workspaces.<region>.amazonaws.com/workspaces/<workspace_id>/api/v1/remote_write"
cloudwatch:
enabled: true
resources:
requests:
cpu: "1"
memory: "1G"
limits:
cpu: "2"
memory: "2G"

Post v0.88.0

collector:
prometheusMetrics:
pipelines:
metrics:
amp:
enabled: true
emf:
enabled: true
exporters:
prometheusremotewrite:
endpoint: "https://aps-workspaces.<region>.amazonaws.com/workspaces/<workspace_id>/api/v1/remote_write"
resources:
requests:
cpu: "1"
memory: "1G"
limits:
cpu: "2"
memory: "2G"
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: "<prom_metrics_iam_role_arn>"
otlpIngest:
pipelines:
traces:
xray:
enabled: true
resources:
requests:
cpu: "1"
memory: "1G"
limits:
cpu: "2"
memory: "2G"
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: "<otlp_ingest_iam_role_arn>"

Updating application OTLP Exporter endpoints

Previously, after configuring collector.xray.enabled to true users would configure their applications OTLP exporter to send trace signals to http://my-collector-collector:4317.

With the individual collector deployments introduced in v0.88.0 users will need to update their applications OTLP exporter endpoint to http://adot-col-otlp-ingest-collector:4317