AWS Open Distro for OpenTelemetry

Getting started using AWS ECS container metrics receiver in AWS OpenTelemetry Collector

The Amazon ECS container agent lets customers retrieve task metadata and Docker stats through the ECS Task Metadata Endpoint. The AWS Container Observability team developed a receiver for the OpenTelemetry Collector that scrapes this endpoint and collects container metrics (such as CPU, memory, network, and disk). Customers can enable the awsecscontainermetrics receiver in their OpenTelemetry configuration file to collect specific task- and container-level metrics and send the data to monitoring tools such as Amazon CloudWatch.

This receiver works with ECS Task Metadata Endpoint V4. That endpoint is available to Amazon ECS tasks on the Fargate launch type with platform version 1.4.0 or later, and to Amazon ECS tasks on the Amazon EC2 launch type running ECS container agent version 1.39.0 or later. For more information, see Amazon ECS Container Agent Versions.




Enabling the AWS ECS Container Metrics Receiver

To enable the awsecscontainermetrics receiver, add its name under the receivers section of the config file (local/config.yaml). By default, the receiver scrapes the ECS task metadata endpoint every 20 seconds and collects all metrics (for the full list, see Available Metrics).

The following configuration collects AWS ECS resource usage metrics with the awsecscontainermetrics receiver and sends them to CloudWatch with the awsemf exporter. See the SETUP section for how to configure the AWS Distro for OpenTelemetry Collector in Amazon Elastic Container Service.

receivers:
  awsecscontainermetrics:
exporters:
  awsemf:
    namespace: 'ECS/ContainerMetrics/OpenTelemetry'
    log_group_name: '/ecs/containermetrics/opentelemetry'

service:
  pipelines:
    metrics:
      receivers: [awsecscontainermetrics]
      exporters: [awsemf]



Set Metrics Collection Interval

Customers can set collection_interval under the awsecscontainermetrics receiver to scrape metrics at a specific interval. The following example configuration collects metrics every 40 seconds.

receivers:
  awsecscontainermetrics:
    collection_interval: 40s
exporters:
  awsemf:
    namespace: 'ECS/ContainerMetrics/OpenTelemetry'
    log_group_name: '/ecs/containermetrics/opentelemetry'

service:
  pipelines:
    metrics:
      receivers: [awsecscontainermetrics]
      exporters: [awsemf]



Collect specific metrics and update metric names

The previous configuration collects all the metrics and sends them to Amazon CloudWatch using default names. Customers can use the filter processor to send only specific metrics and the metricstransform processor to rename them.

The following configuration example collects only the ecs.task.memory.utilized metric and renames it to MemoryUtilized before sending it to CloudWatch.

receivers:
  awsecscontainermetrics:
exporters:
  awsemf:
    namespace: 'ECS/ContainerMetrics/OpenTelemetry'
    log_group_name: '/ecs/containermetrics/opentelemetry'
processors:
  filter:
    metrics:
      include:
        match_type: strict
        metric_names:
          - ecs.task.memory.utilized

  metricstransform:
    transforms:
      - metric_name: ecs.task.memory.utilized
        action: update
        new_name: MemoryUtilized

service:
  pipelines:
    metrics:
      receivers: [awsecscontainermetrics]
      processors: [filter, metricstransform]
      exporters: [awsemf]
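
The filter processor can also drop metrics instead of selecting them. The fragment below is a sketch, assuming the filter processor's exclude block accepts the same match_type and metric_names keys as the include block used above. The choice of metrics to drop is purely illustrative: it would keep every metric except the network packet, error, and dropped counters.

```yaml
processors:
  filter:
    metrics:
      exclude:
        match_type: regexp
        metric_names:
          # Illustrative pattern (an assumption, not from the original page):
          # matches the rx/tx packets, errors, and dropped counters at both
          # task and container level. The rx_bytes/tx_bytes counters are kept.
          - '.*network\.io\.usage\.(rx|tx)_(packets|errors|dropped)'
```

As in the configuration above, the filter processor only takes effect once it is listed under processors in the metrics pipeline.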



Available Metrics

The following table lists all metrics emitted by the AWS ECS container metrics receiver.

| Task Level Metrics | Container Level Metrics | Unit |
|---|---|---|
| ecs.task.memory.usage | container.memory.usage | Bytes |
| ecs.task.memory.usage.max | container.memory.usage.max | Bytes |
| ecs.task.memory.usage.limit | container.memory.usage.limit | Bytes |
| ecs.task.memory.reserved | container.memory.reserved | Megabytes |
| ecs.task.memory.utilized | container.memory.utilized | Megabytes |
| ecs.task.cpu.usage.total | container.cpu.usage.total | Nanoseconds |
| ecs.task.cpu.usage.kernelmode | container.cpu.usage.kernelmode | Nanoseconds |
| ecs.task.cpu.usage.usermode | container.cpu.usage.usermode | Nanoseconds |
| ecs.task.cpu.usage.system | container.cpu.usage.system | Nanoseconds |
| ecs.task.cpu.usage.vcpu | container.cpu.usage.vcpu | vCPU |
| ecs.task.cpu.cores | container.cpu.cores | Count |
| ecs.task.cpu.onlines | container.cpu.onlines | Count |
| ecs.task.cpu.reserved | container.cpu.reserved | vCPU |
| ecs.task.cpu.utilized | container.cpu.utilized | Percent |
| ecs.task.network.rate.rx | container.network.rate.rx | Bytes/Second |
| ecs.task.network.rate.tx | container.network.rate.tx | Bytes/Second |
| ecs.task.network.io.usage.rx_bytes | container.network.io.usage.rx_bytes | Bytes |
| ecs.task.network.io.usage.rx_packets | container.network.io.usage.rx_packets | Count |
| ecs.task.network.io.usage.rx_errors | container.network.io.usage.rx_errors | Count |
| ecs.task.network.io.usage.rx_dropped | container.network.io.usage.rx_dropped | Count |
| ecs.task.network.io.usage.tx_bytes | container.network.io.usage.tx_bytes | Bytes |
| ecs.task.network.io.usage.tx_packets | container.network.io.usage.tx_packets | Count |
| ecs.task.network.io.usage.tx_errors | container.network.io.usage.tx_errors | Count |
| ecs.task.network.io.usage.tx_dropped | container.network.io.usage.tx_dropped | Count |
| ecs.task.storage.read_bytes | container.storage.read_bytes | Bytes |
| ecs.task.storage.write_bytes | container.storage.write_bytes | Bytes |



Resource Attributes and Metrics Labels

Metrics emitted by this receiver come with a set of resource attributes. These resource attributes can be converted to metric labels using the appropriate processors and exporters (see the Full Configuration sections below), and the labels can then be set as metric dimensions when exporting to the desired destination. The following table lists the resource attributes available for task- and container-level metrics. Container-level metrics carry seven more attributes than task-level metrics.

| Resource Attributes for Task Level Metrics | Resource Attributes for Container Level Metrics |
|---|---|
| aws.ecs.cluster.name | aws.ecs.cluster.name |
| aws.ecs.task.family | aws.ecs.task.family |
| aws.ecs.task.arn | aws.ecs.task.arn |
| aws.ecs.task.id | aws.ecs.task.id |
| aws.ecs.task.version | aws.ecs.task.version |
| aws.ecs.service.name | aws.ecs.service.name |
| cloud.zone | cloud.zone |
| cloud.account.id | cloud.account.id |
| cloud.region | cloud.region |
| aws.ecs.task.pull_started_at | aws.ecs.task.pull_started_at |
| aws.ecs.task.pull_stopped_at | aws.ecs.container.finished_at |
| aws.ecs.task.known_status | aws.ecs.container.know_status |
| aws.ecs.task.launch_type | aws.ecs.task.launch_type |
| | aws.ecs.container.created_at |
| | container.name |
| | container.id |
| | aws.ecs.docker.name |
| | container.image.tag |
| | aws.ecs.container.image.id |
| | aws.ecs.container.exit_code |

Note: Do not include more than 9 dimension keys in a single dimension set. See the Amazon CloudWatch documentation for more information.
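
The path from resource attribute to CloudWatch dimension described above can be sketched in a short fragment. It assumes the resource processor and awsemf exporter options used in the full configurations later on this page. The single ClusterName dimension is an illustrative choice that stays well under the dimension-key limit.

```yaml
processors:
  resource:
    attributes:
      # Copy the receiver's resource attribute to a CloudWatch-friendly key,
      # then delete the original so only the renamed key is exported.
      - key: ClusterName
        from_attribute: aws.ecs.cluster.name
        action: insert
      - key: aws.ecs.cluster.name
        action: delete
exporters:
  awsemf:
    # Convert resource attributes into metric labels so that they can be
    # referenced as dimensions below.
    resource_to_telemetry_conversion:
      enabled: true
    metric_declarations:
      # One dimension set containing a single key.
      - dimensions: [[ClusterName]]
        metric_name_selectors: ['.*']
```

The resource processor must also be listed under processors in the metrics pipeline for the renamed attribute to reach the exporter.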




Full configuration for task level metrics

The following example shows a full configuration that collects the most useful task-level metrics. It uses the awsecscontainermetrics receiver to collect all the resource usage metrics from the ECS task metadata endpoint. It applies the filter processor to select only 8 task-level metrics and renames them with the metricstransform processor. It also renames the resource attributes with the resource processor so that they can be used as metric dimensions by the awsemf exporter. Finally, it sends the metrics to CloudWatch with the awsemf exporter under the ECS/ContainerInsights namespace, writing to the /aws/ecs/containerinsights/{ClusterName}/performance log group, where the {ClusterName} placeholder is replaced with the actual cluster name. Check the AWS EMF Exporter documentation to see how to explore the metrics in Amazon CloudWatch.

Note: The AWS OpenTelemetry Collector has a default configuration baked into it for the Container Insights experience, which is similar to this one. Follow our setup doc to see how to use that default config.

receivers:
  awsecscontainermetrics: # collect 52 metrics

processors:
  filter: # filter metrics
    metrics:
      include:
        match_type: strict
        metric_names: # select only 8 task level metrics out of 52
          - ecs.task.memory.reserved
          - ecs.task.memory.utilized
          - ecs.task.cpu.reserved
          - ecs.task.cpu.utilized
          - ecs.task.network.rate.rx
          - ecs.task.network.rate.tx
          - ecs.task.storage.read_bytes
          - ecs.task.storage.write_bytes
  metricstransform: # update metric names
    transforms:
      - metric_name: ecs.task.memory.utilized
        action: update
        new_name: MemoryUtilized
      - metric_name: ecs.task.memory.reserved
        action: update
        new_name: MemoryReserved
      - metric_name: ecs.task.cpu.utilized
        action: update
        new_name: CpuUtilized
      - metric_name: ecs.task.cpu.reserved
        action: update
        new_name: CpuReserved
      - metric_name: ecs.task.network.rate.rx
        action: update
        new_name: NetworkRxBytes
      - metric_name: ecs.task.network.rate.tx
        action: update
        new_name: NetworkTxBytes
      - metric_name: ecs.task.storage.read_bytes
        action: update
        new_name: StorageReadBytes
      - metric_name: ecs.task.storage.write_bytes
        action: update
        new_name: StorageWriteBytes
  resource:
    attributes: # rename resource attributes which will be used as dimensions
      - key: ClusterName
        from_attribute: aws.ecs.cluster.name
        action: insert
      - key: aws.ecs.cluster.name
        action: delete
      - key: ServiceName
        from_attribute: aws.ecs.service.name
        action: insert
      - key: aws.ecs.service.name
        action: delete
      - key: TaskId
        from_attribute: aws.ecs.task.id
        action: insert
      - key: aws.ecs.task.id
        action: delete
      - key: TaskDefinitionFamily
        from_attribute: aws.ecs.task.family
        action: insert
      - key: aws.ecs.task.family
        action: delete
exporters:
  awsemf:
    namespace: ECS/ContainerInsights
    log_group_name: '/aws/ecs/containerinsights/{ClusterName}/performance'
    log_stream_name: '{TaskId}' # TaskId placeholder will be replaced with actual value
    resource_to_telemetry_conversion:
      enabled: true
    dimension_rollup_option: NoDimensionRollup
    metric_declarations:
      # written as a list item, matching the task- and container-level example
      - dimensions: [[ClusterName], [ClusterName, TaskDefinitionFamily]]
        metric_name_selectors: ['.'] # regular expression; '.' selects every metric
service:
  pipelines:
    metrics:
      receivers: [awsecscontainermetrics]
      processors: [filter, metricstransform, resource]
      exporters: [awsemf]



Full configuration for task- and container-level metrics

The following example shows a full configuration that collects the most useful task- and container-level metrics. It uses the awsecscontainermetrics receiver to collect all the resource usage metrics from the ECS task metadata endpoint. It applies the filter processor with regular expressions to select the same 8 resource usage metrics at both the task and container level, and renames the task-level metrics with the metricstransform processor; the container-level metrics keep their default names. It also renames the resource attributes with the resource processor so that they can be used as metric dimensions by the awsemf exporter. Finally, it sends the metrics to CloudWatch with the awsemf exporter under the ECS/ContainerInsights namespace, writing to the /aws/ecs/containerinsights/{ClusterName}/performance log group, where the {ClusterName} placeholder is replaced with the actual cluster name. Check the AWS EMF Exporter documentation to see how to explore the metrics in Amazon CloudWatch.

receivers:
  awsecscontainermetrics:

processors:
  filter:
    metrics:
      include:
        match_type: regexp
        metric_names:
          - .*memory.reserved
          - .*memory.utilized
          - .*cpu.reserved
          - .*cpu.utilized
          - .*network.rate.rx
          - .*network.rate.tx
          - .*storage.read_bytes
          - .*storage.write_bytes
  metricstransform:
    transforms:
      - metric_name: ecs.task.memory.utilized
        action: update
        new_name: MemoryUtilized
      - metric_name: ecs.task.memory.reserved
        action: update
        new_name: MemoryReserved
      - metric_name: ecs.task.cpu.utilized
        action: update
        new_name: CpuUtilized
      - metric_name: ecs.task.cpu.reserved
        action: update
        new_name: CpuReserved
      - metric_name: ecs.task.network.rate.rx
        action: update
        new_name: NetworkRxBytes
      - metric_name: ecs.task.network.rate.tx
        action: update
        new_name: NetworkTxBytes
      - metric_name: ecs.task.storage.read_bytes
        action: update
        new_name: StorageReadBytes
      - metric_name: ecs.task.storage.write_bytes
        action: update
        new_name: StorageWriteBytes
  resource:
    attributes:
      - key: ClusterName
        from_attribute: aws.ecs.cluster.name
        action: insert
      - key: aws.ecs.cluster.name
        action: delete
      - key: ServiceName
        from_attribute: aws.ecs.service.name
        action: insert
      - key: aws.ecs.service.name
        action: delete
      - key: TaskId
        from_attribute: aws.ecs.task.id
        action: insert
      - key: aws.ecs.task.id
        action: delete
      - key: TaskDefinitionFamily
        from_attribute: aws.ecs.task.family
        action: insert
      - key: aws.ecs.task.family
        action: delete
      - key: ContainerName
        from_attribute: container.name
        action: insert
      - key: container.name
        action: delete
exporters:
  awsemf:
    namespace: ECS/ContainerInsights
    log_group_name: '/aws/ecs/containerinsights/{ClusterName}/performance'
    log_stream_name: '{TaskId}'
    resource_to_telemetry_conversion:
      enabled: true
    dimension_rollup_option: NoDimensionRollup
    metric_declarations:
      - dimensions: [[ClusterName], [ClusterName, TaskDefinitionFamily]]
        metric_name_selectors:
          - MemoryUtilized
          - MemoryReserved
          - CpuUtilized
          - CpuReserved
          - NetworkRxBytes
          - NetworkTxBytes
          - StorageReadBytes
          - StorageWriteBytes
      - dimensions: [[ClusterName], [ClusterName, TaskDefinitionFamily, ContainerName]]
        metric_name_selectors: [container.*]

service:
  pipelines:
    metrics:
      receivers: [awsecscontainermetrics]
      processors: [filter, metricstransform, resource]
      exporters: [awsemf]