AWS Distro for OpenTelemetry
Getting started using AWS ECS container metrics receiver in AWS Distro for OpenTelemetry Collector
The Amazon ECS container agent provides a method for customers to retrieve various task metadata and
Docker stats by using the ECS Task Metadata Endpoint.
The AWS Container Observability team developed a receiver in the OpenTelemetry Collector that scrapes this endpoint and
collects container metrics (such as CPU, memory, network, and disk). Customers can enable the awsecscontainermetrics
receiver in their OpenTelemetry configuration file to collect specific task- and container-level metrics and send the
data to desired monitoring tools such as Amazon CloudWatch.
This receiver works with ECS Task Metadata Endpoint V4, which requires Amazon ECS tasks on the Fargate launch type with
platform version 1.4.0 or later, or Amazon ECS tasks on the Amazon EC2 launch type with ECS container agent version
1.39.0 or later. For more information, see Amazon ECS Container Agent Versions.
Enabling the AWS ECS Container Metrics Receiver
To enable the awsecscontainermetrics receiver, add its name under the receivers section of the configuration file
(local/config.yaml).
By default, the receiver scrapes the ECS task metadata endpoint every 20 seconds and collects all metrics
(for the full list, see Available Metrics).
The following configuration collects AWS ECS resource usage metrics with the awsecscontainermetrics receiver and sends
them to CloudWatch with the awsemf exporter. See the SETUP section for configuring the AWS Distro for OpenTelemetry
Collector in Amazon Elastic Container Service.

    receivers:
      awsecscontainermetrics:
    exporters:
      awsemf:
        namespace: 'ECS/ContainerMetrics/OpenTelemetry'
        log_group_name: '/ecs/containermetrics/opentelemetry'

    service:
      pipelines:
        metrics:
          receivers: [awsecscontainermetrics]
          exporters: [awsemf]

Set Metrics Collection Interval
Customers can configure collection_interval under the awsecscontainermetrics receiver to scrape and gather metrics
at a specific interval. The following example configuration collects metrics every 40 seconds.

    receivers:
      awsecscontainermetrics:
        collection_interval: 40s
    exporters:
      awsemf:
        namespace: 'ECS/ContainerMetrics/OpenTelemetry'
        log_group_name: '/ecs/containermetrics/opentelemetry'

    service:
      pipelines:
        metrics:
          receivers: [awsecscontainermetrics]
          exporters: [awsemf]

Collect specific metrics and update metric names
The previous configuration collects all the metrics and sends them to Amazon CloudWatch using default names. Customers
can use the filter and metricstransform processors to send only specific metrics and to rename them, respectively.
The following configuration example collects only the ecs.task.memory.utilized metric and renames it to MemoryUtilized
before sending it to CloudWatch.

    receivers:
      awsecscontainermetrics:
    exporters:
      awsemf:
        namespace: 'ECS/ContainerMetrics/OpenTelemetry'
        log_group_name: '/ecs/containermetrics/opentelemetry'
    processors:
      filter:
        metrics:
          include:
            match_type: strict
            metric_names:
              - ecs.task.memory.utilized

      metricstransform:
        transforms:
          - metric_name: ecs.task.memory.utilized
            action: update
            new_name: MemoryUtilized

    service:
      pipelines:
        metrics:
          receivers: [awsecscontainermetrics]
          processors: [filter, metricstransform]
          exporters: [awsemf]

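
As a variation on the example above, the filter processor can also select metrics by regular expression instead of by exact name; the match_type: regexp mode is the one used in the task- and container-level configuration later on this page. The fragment below is a minimal sketch, not a recommended setup. It assumes you want every task-level memory metric rather than a single one, and the pattern itself is an illustrative assumption. Only the processors section is shown; receivers, exporters, and the service pipeline stay as in the example above.

```yaml
processors:
  filter:
    metrics:
      include:
        match_type: regexp
        metric_names:
          # Illustrative pattern (an assumption, not from the original guide):
          # keeps the ecs.task.memory.* metrics listed under Available Metrics
          - ecs\.task\.memory\..*
```

Metrics kept this way retain their default names unless a matching metricstransform entry renames them.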
Available Metrics
The following table lists all metrics emitted by the AWS ECS container metrics receiver.
Task Level Metrics | Container Level Metrics | Unit |
---|---|---|
ecs.task.memory.usage | container.memory.usage | Bytes |
ecs.task.memory.usage.max | container.memory.usage.max | Bytes |
ecs.task.memory.usage.limit | container.memory.usage.limit | Bytes |
ecs.task.memory.reserved | container.memory.reserved | Megabytes |
ecs.task.memory.utilized | container.memory.utilized | Megabytes |
ecs.task.cpu.usage.total | container.cpu.usage.total | Nanoseconds |
ecs.task.cpu.usage.kernelmode | container.cpu.usage.kernelmode | Nanoseconds |
ecs.task.cpu.usage.usermode | container.cpu.usage.usermode | Nanoseconds |
ecs.task.cpu.usage.system | container.cpu.usage.system | Nanoseconds |
ecs.task.cpu.usage.vcpu | container.cpu.usage.vcpu | vCPU |
ecs.task.cpu.cores | container.cpu.cores | Count |
ecs.task.cpu.onlines | container.cpu.onlines | Count |
ecs.task.cpu.reserved | container.cpu.reserved | vCPU |
ecs.task.cpu.utilized | container.cpu.utilized | Percent |
ecs.task.network.rate.rx | container.network.rate.rx | Bytes/Second |
ecs.task.network.rate.tx | container.network.rate.tx | Bytes/Second |
ecs.task.network.io.usage.rx_bytes | container.network.io.usage.rx_bytes | Bytes |
ecs.task.network.io.usage.rx_packets | container.network.io.usage.rx_packets | Count |
ecs.task.network.io.usage.rx_errors | container.network.io.usage.rx_errors | Count |
ecs.task.network.io.usage.rx_dropped | container.network.io.usage.rx_dropped | Count |
ecs.task.network.io.usage.tx_bytes | container.network.io.usage.tx_bytes | Bytes |
ecs.task.network.io.usage.tx_packets | container.network.io.usage.tx_packets | Count |
ecs.task.network.io.usage.tx_errors | container.network.io.usage.tx_errors | Count |
ecs.task.network.io.usage.tx_dropped | container.network.io.usage.tx_dropped | Count |
ecs.task.storage.read_bytes | container.storage.read_bytes | Bytes |
ecs.task.storage.write_bytes | container.storage.write_bytes | Bytes |
Resource Attributes and Metrics Labels
Metrics emitted by this receiver come with a set of resource attributes. These resource attributes can be converted to metric labels using appropriate processors and exporters (see the full configuration sections below). These labels can then be set as metric dimensions when exporting to the desired destination. The following table lists the resource attributes available for task- and container-level metrics. Container-level metrics have seven more attributes than task-level metrics.
Resource Attributes for Task Level Metrics | Resource Attributes for Container Level Metrics |
---|---|
aws.ecs.cluster.name | aws.ecs.cluster.name |
aws.ecs.task.family | aws.ecs.task.family |
aws.ecs.task.arn | aws.ecs.task.arn |
aws.ecs.task.id | aws.ecs.task.id |
aws.ecs.task.version | aws.ecs.task.version |
aws.ecs.service.name | aws.ecs.service.name |
cloud.zone | cloud.zone |
cloud.account.id | cloud.account.id |
cloud.region | cloud.region |
aws.ecs.task.pull_started_at | aws.ecs.task.pull_started_at |
aws.ecs.task.pull_stopped_at | aws.ecs.container.finished_at |
aws.ecs.task.known_status | aws.ecs.container.know_status |
aws.ecs.task.launch_type | aws.ecs.task.launch_type |
| | aws.ecs.container.created_at |
| | container.name |
| | container.id |
| | aws.ecs.docker.name |
| | container.image.tag |
| | aws.ecs.container.image.id |
| | aws.ecs.container.exit_code |
Note: Do not include more than 9 dimension keys in a dimension set. See the CloudWatch documentation for more information.
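
To make the path from resource attributes to dimensions more concrete, the fragment below sketches only the exporter settings involved, using awsemf options that also appear in the full configurations later on this page. Using the attribute names aws.ecs.cluster.name and aws.ecs.task.family directly as dimension keys, and the selector pattern, are assumptions for illustration; the full configurations rename the attributes first with the resource processor.

```yaml
exporters:
  awsemf:
    namespace: 'ECS/ContainerMetrics/OpenTelemetry'
    log_group_name: '/ecs/containermetrics/opentelemetry'
    # Copy resource attributes onto each metric as labels
    resource_to_telemetry_conversion:
      enabled: true
    dimension_rollup_option: NoDimensionRollup
    metric_declarations:
      # Illustrative dimension sets (assumed); each has far fewer than 9 keys
      - dimensions: [[aws.ecs.cluster.name], [aws.ecs.cluster.name, aws.ecs.task.family]]
        metric_name_selectors:
          - ecs\.task\..*
```

Each inner list is one dimension set, so the 9-key limit applies to each set separately rather than to the declaration as a whole.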
Full configuration for task level metrics
The following example shows a full configuration to get the most useful task-level metrics. It uses the
awsecscontainermetrics receiver to collect all the resource usage metrics from the ECS task metadata endpoint. It
applies the filter processor to select only 8 task-level metrics and updates the metric names using the metricstransform
processor. It also renames the resource attributes using the resource processor; the renamed attributes are used as
metric dimensions by the Amazon CloudWatch awsemf exporter.
Finally, it sends the metrics to CloudWatch using the awsemf exporter, under the ECS/ContainerInsights namespace and in
the /aws/ecs/containerinsights/{ClusterName}/performance log group, where the {ClusterName} placeholder is replaced with
the actual cluster name. Check the AWS EMF Exporter documentation to see and explore the metrics in Amazon CloudWatch.
Note: The AWS Distro for OpenTelemetry Collector has a default configuration baked into it for the Container Insights experience, which is similar to this one. Follow our setup doc to see how to use that default config.

    receivers:
      awsecscontainermetrics: # collect 52 metrics

    processors:
      filter: # filter metrics
        metrics:
          include:
            match_type: strict
            metric_names: # select only 8 task level metrics out of 52
              - ecs.task.memory.reserved
              - ecs.task.memory.utilized
              - ecs.task.cpu.reserved
              - ecs.task.cpu.utilized
              - ecs.task.network.rate.rx
              - ecs.task.network.rate.tx
              - ecs.task.storage.read_bytes
              - ecs.task.storage.write_bytes
      metricstransform: # update metric names
        transforms:
          - metric_name: ecs.task.memory.utilized
            action: update
            new_name: MemoryUtilized
          - metric_name: ecs.task.memory.reserved
            action: update
            new_name: MemoryReserved
          - metric_name: ecs.task.cpu.utilized
            action: update
            new_name: CpuUtilized
          - metric_name: ecs.task.cpu.reserved
            action: update
            new_name: CpuReserved
          - metric_name: ecs.task.network.rate.rx
            action: update
            new_name: NetworkRxBytes
          - metric_name: ecs.task.network.rate.tx
            action: update
            new_name: NetworkTxBytes
          - metric_name: ecs.task.storage.read_bytes
            action: update
            new_name: StorageReadBytes
          - metric_name: ecs.task.storage.write_bytes
            action: update
            new_name: StorageWriteBytes
      resource:
        attributes: # rename resource attributes which will be used as dimensions
          - key: ClusterName
            from_attribute: aws.ecs.cluster.name
            action: insert
          - key: aws.ecs.cluster.name
            action: delete
          - key: ServiceName
            from_attribute: aws.ecs.service.name
            action: insert
          - key: aws.ecs.service.name
            action: delete
          - key: TaskId
            from_attribute: aws.ecs.task.id
            action: insert
          - key: aws.ecs.task.id
            action: delete
          - key: TaskDefinitionFamily
            from_attribute: aws.ecs.task.family
            action: insert
          - key: aws.ecs.task.family
            action: delete
    exporters:
      awsemf:
        namespace: ECS/ContainerInsights
        log_group_name: '/aws/ecs/containerinsights/{ClusterName}/performance'
        log_stream_name: '{TaskId}' # TaskId placeholder will be replaced with actual value
        resource_to_telemetry_conversion:
          enabled: true
        dimension_rollup_option: NoDimensionRollup
        metric_declarations: # a list of declarations, so the entry starts with a dash
          - dimensions: [ [ ClusterName ], [ ClusterName, TaskDefinitionFamily ] ]
            metric_name_selectors: [ . ]
    service:
      pipelines:
        metrics:
          receivers: [awsecscontainermetrics]
          processors: [filter, metricstransform, resource]
          exporters: [awsemf]

Full configuration for task- and container-level metrics
The following example shows a full configuration to get the most useful task- and container-level metrics. It uses the
awsecscontainermetrics receiver to collect all the resource usage metrics from the ECS task metadata endpoint. It
applies the filter processor with regular expressions to select the same 8 metrics at both the task and container
level, and updates the task-level metric names using the metricstransform processor. It also renames the resource
attributes using the resource processor; the renamed attributes are used as metric dimensions by the Amazon CloudWatch
awsemf exporter.
Finally, it sends the metrics to CloudWatch using the awsemf exporter, under the ECS/ContainerInsights namespace and in
the /aws/ecs/containerinsights/{ClusterName}/performance log group, where the {ClusterName} placeholder is replaced with
the actual cluster name. Check the AWS EMF Exporter documentation to see and explore the metrics in Amazon CloudWatch.

    receivers:
      awsecscontainermetrics:

    processors:
      filter:
        metrics:
          include:
            match_type: regexp
            metric_names:
              - .*memory.reserved
              - .*memory.utilized
              - .*cpu.reserved
              - .*cpu.utilized
              - .*network.rate.rx
              - .*network.rate.tx
              - .*storage.read_bytes
              - .*storage.write_bytes
      metricstransform:
        transforms:
          - metric_name: ecs.task.memory.utilized
            action: update
            new_name: MemoryUtilized
          - metric_name: ecs.task.memory.reserved
            action: update
            new_name: MemoryReserved
          - metric_name: ecs.task.cpu.utilized
            action: update
            new_name: CpuUtilized
          - metric_name: ecs.task.cpu.reserved
            action: update
            new_name: CpuReserved
          - metric_name: ecs.task.network.rate.rx
            action: update
            new_name: NetworkRxBytes
          - metric_name: ecs.task.network.rate.tx
            action: update
            new_name: NetworkTxBytes
          - metric_name: ecs.task.storage.read_bytes
            action: update
            new_name: StorageReadBytes
          - metric_name: ecs.task.storage.write_bytes
            action: update
            new_name: StorageWriteBytes
      resource:
        attributes:
          - key: ClusterName
            from_attribute: aws.ecs.cluster.name
            action: insert
          - key: aws.ecs.cluster.name
            action: delete
          - key: ServiceName
            from_attribute: aws.ecs.service.name
            action: insert
          - key: aws.ecs.service.name
            action: delete
          - key: TaskId
            from_attribute: aws.ecs.task.id
            action: insert
          - key: aws.ecs.task.id
            action: delete
          - key: TaskDefinitionFamily
            from_attribute: aws.ecs.task.family
            action: insert
          - key: aws.ecs.task.family
            action: delete
          - key: ContainerName
            from_attribute: container.name
            action: insert
          - key: container.name
            action: delete
    exporters:
      awsemf:
        namespace: ECS/ContainerInsights
        log_group_name: '/aws/ecs/containerinsights/{ClusterName}/performance'
        log_stream_name: '{TaskId}'
        resource_to_telemetry_conversion:
          enabled: true
        dimension_rollup_option: NoDimensionRollup
        metric_declarations:
          - dimensions: [[ClusterName], [ClusterName, TaskDefinitionFamily]]
            metric_name_selectors:
              - MemoryUtilized
              - MemoryReserved
              - CpuUtilized
              - CpuReserved
              - NetworkRxBytes
              - NetworkTxBytes
              - StorageReadBytes
              - StorageWriteBytes
          - dimensions: [[ClusterName], [ClusterName, TaskDefinitionFamily, ContainerName]]
            metric_name_selectors: [container.*]

    service:
      pipelines:
        metrics:
          receivers: [awsecscontainermetrics]
          processors: [filter, metricstransform, resource]
          exporters: [awsemf]