In today's complex, distributed, cloud-native environments, observability is essential for ensuring system reliability, optimizing performance, and troubleshooting effectively. OpenTelemetry has become central to creating a unified standard for collecting, processing, and exporting telemetry data: traces, metrics, and logs. However, its flexibility and depth can make it hard to get started.
This document aims to guide both experienced developers and those new to the industry in configuring and using OpenTelemetry. We'll cover the fundamentals, advanced configurations, and tools that can streamline the OpenTelemetry experience.
OpenTelemetry, a Cloud Native Computing Foundation (CNCF) project, is an open-source effort that provides a single set of APIs, libraries, agents, and instrumentation for collecting telemetry data from your services. This data helps developers and operators gain deep insight into application performance and health.
Core OpenTelemetry Components
While OpenTelemetry is feature-rich and highly configurable, that configurability goes hand in hand with complexity—especially for those inexperienced in observability or in managing complex infrastructure. For example, the OpenTelemetry Collector can become overwhelming to configure given the vast number of receivers, processors, and exporters to choose from.
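As a mental model for how those pieces fit together, a Collector pipeline is just receivers feeding processors feeding exporters. Here is a toy Python sketch of that flow (the class and names are illustrative only, not the Collector's actual implementation):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Pipeline:
    """Toy model of a Collector pipeline: spans arrive via a receiver,
    pass through each processor in order, then go to every exporter."""
    processors: List[Callable[[Dict], Dict]] = field(default_factory=list)
    exporters: List[Callable[[Dict], None]] = field(default_factory=list)

    def receive(self, span: Dict) -> None:
        for proc in self.processors:   # processors transform the data in order
            span = proc(span)
        for exp in self.exporters:     # every exporter gets the processed data
            exp(span)

exported = []
pipeline = Pipeline(
    processors=[lambda s: {**s, "batched": True}],  # stand-in for a batch processor
    exporters=[exported.append],                    # stand-in for a backend exporter
)
pipeline.receive({"name": "GET /checkout"})
```

The real Collector adds concurrency, retries, and back-pressure on top of this shape, but the receiver → processor → exporter composition is the core idea.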
Example Setup
To deploy OpenTelemetry on Kubernetes, you deploy the OpenTelemetry Collector using standard Kubernetes resources: ConfigMaps, Deployments, and Services.
Create a Configuration File for the OpenTelemetry Collector
Create a YAML file (otel-collector-config.yaml) that contains the configuration for the OpenTelemetry Collector. This configuration defines the receivers, processors, and exporters that the Collector will use.
Here's an example configuration file:
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
  namespace: default
data:
  otel-collector-config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
        timeout: 10s
        send_batch_size: 512
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 10
    exporters:
      logging:
        loglevel: debug
      jaeger:
        endpoint: "jaeger-collector.default.svc:14250" # Replace with your Jaeger endpoint
        tls:
          insecure: true # Set to false for production with proper TLS setup
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch] # memory_limiter should run first in the pipeline
          exporters: [logging, jaeger]
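A common Collector misconfiguration is referencing a component in a pipeline that was never defined in the top-level sections. That check is easy to sketch in Python (plain dicts stand in for the parsed YAML, and `undefined_components` is a hypothetical helper, not a Collector API):

```python
# Dict mirroring the structure of a Collector config file.
config = {
    "receivers": {"otlp": {}},
    "processors": {"batch": {}, "memory_limiter": {}},
    "exporters": {"logging": {}, "jaeger": {}},
    "service": {"pipelines": {"traces": {
        "receivers": ["otlp"],
        "processors": ["memory_limiter", "batch"],
        "exporters": ["logging", "jaeger"],
    }}},
}

def undefined_components(cfg):
    """Return pipeline references that lack a top-level definition."""
    missing = []
    for name, pipe in cfg["service"]["pipelines"].items():
        for section in ("receivers", "processors", "exporters"):
            for ref in pipe.get(section, []):
                if ref not in cfg.get(section, {}):
                    missing.append(f"{name}/{section}/{ref}")
    return missing

print(undefined_components(config))  # an empty list means every reference resolves
```

The real Collector performs this validation (and much more) at startup; the sketch just shows why a typo in a pipeline list fails fast.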
Apply the ConfigMap to Your Kubernetes Cluster
kubectl apply -f otel-collector-config.yaml
Create a Deployment for the OpenTelemetry Collector
Create a YAML file (otel-collector-deployment.yaml) for the OpenTelemetry Collector Deployment. This Deployment uses the ConfigMap created in the previous step to configure the Collector.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: otel-collector
          image: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector:0.84.0
          command:
            - "/otelcol"
            - "--config=/etc/otel/config/otel-collector-config.yaml"
          volumeMounts:
            - name: otel-collector-config-vol
              mountPath: /etc/otel/config
      volumes:
        - name: otel-collector-config-vol
          configMap:
            name: otel-collector-config
This Deployment runs a single Collector replica and mounts the ConfigMap into the container at /etc/otel/config, which is where the --config flag points.
Apply the Deployment to Your Kubernetes Cluster
kubectl apply -f otel-collector-deployment.yaml
Create a Service for the OpenTelemetry Collector
Create a YAML file (otel-collector-service.yaml) to expose the OpenTelemetry Collector via a Kubernetes Service:
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: default
spec:
  selector:
    app: otel-collector
  ports:
    - name: otlp-grpc # port names are required when a Service exposes multiple ports
      protocol: TCP
      port: 4317 # OTLP gRPC port
      targetPort: 4317
    - name: otlp-http
      protocol: TCP
      port: 4318 # OTLP HTTP port
      targetPort: 4318
This Service exposes the OTLP gRPC and HTTP ports (4317 and 4318) to enable applications to send telemetry data to the OpenTelemetry Collector.
Apply the Service to Your Kubernetes Cluster
kubectl apply -f otel-collector-service.yaml
Install OpenTelemetry SDKs in Your Applications
Install the OpenTelemetry SDK for each application's language so the application can produce traces, metrics, and logs.
Configure the Application to Send Telemetry Data
Point each application's OTLP exporter at the Collector Service created above (otel-collector.default.svc:4317 for gRPC or otel-collector.default.svc:4318 for HTTP). Example environment variables configuration:
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector.default.svc:4317
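To illustrate how OTLP/HTTP exporters typically resolve that variable, here is a simplified Python sketch: the exporter reads the base endpoint and appends the per-signal path /v1/traces. (Real SDKs also honor signal-specific variables such as OTEL_EXPORTER_OTLP_TRACES_ENDPOINT; `resolve_traces_endpoint` is an illustrative helper, not an SDK function.)

```python
import os

def resolve_traces_endpoint(default="http://localhost:4318"):
    """Resolve the OTLP/HTTP traces URL from the environment.
    Falls back to a local default when the variable is unset."""
    base = os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", default)
    return base.rstrip("/") + "/v1/traces"  # OTLP/HTTP path for the traces signal

# Simulate the in-cluster configuration shown above (HTTP port 4318).
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://otel-collector.default.svc:4318"
print(resolve_traces_endpoint())
```

Note that the gRPC exporter uses port 4317 and does not append a URL path; the path-appending behavior shown here applies to OTLP over HTTP.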
Instrument Code Using the OpenTelemetry SDK
Check the Status of the OpenTelemetry Collector Pods
Ensure that the OpenTelemetry Collector pods are running without errors:
kubectl get pods -n default
Check Logs for Errors and Data Processing
View the logs of the OpenTelemetry Collector pod to check for any errors and confirm that it is receiving and exporting telemetry data:
kubectl logs <otel-collector-pod-name> -n default
Monitor Resource Usage
Monitor CPU, memory usage, and other metrics for the OpenTelemetry Collector to ensure it is running efficiently and adjust resource requests and limits as needed.
Numerous tools can make OpenTelemetry easier to adopt, helping with setup, configuration, and ongoing management in a given environment. Let's go over these tools and best practices for deploying OpenTelemetry.
A tool that eases OpenTelemetry configuration is Odigos, which is covered in more detail below.
Auto-Instrumentation Libraries
OpenTelemetry provides auto-instrumentation libraries and agents for popular languages like Java, Python, Node.js, and .NET. These libraries simplify instrumentation, enabling automatic collection of telemetry data from common frameworks and libraries.
Sample Java Auto-Instrumentation
Download the OpenTelemetry Java agent:
curl -L -o opentelemetry-javaagent.jar https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar
Run your Java application with the agent:
java -javaagent:path/to/opentelemetry-javaagent.jar -jar myapp.jar
Once configured, the agent automatically collects traces and sends them to the OpenTelemetry Collector.
Exporting telemetry to popular observability platforms such as Prometheus, Grafana, Datadog, and Jaeger is straightforward because the OpenTelemetry Collector and its exporters are highly flexible. For example, here is a Collector configuration that exports traces to Jaeger:
receivers:
  otlp:
    protocols:
      grpc:
      http:
processors:
  batch:
    timeout: 5s
    send_batch_size: 1024
exporters:
  jaeger:
    endpoint: "jaeger-collector.default.svc:14250" # Replace with your Jaeger Collector endpoint
    tls:
      insecure: true # Set to false for production with proper TLS setup
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
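Conceptually, the batch processor buffers spans and flushes them when either send_batch_size is reached or the timeout elapses. A toy Python sketch of that logic (class and method names are illustrative, not the Collector's code):

```python
import time

class BatchProcessor:
    """Simplified sketch of the Collector's batch processor: buffer spans,
    flush when the batch reaches send_batch_size or the timeout elapses."""
    def __init__(self, export, send_batch_size=1024, timeout=5.0):
        self.export = export
        self.send_batch_size = send_batch_size
        self.timeout = timeout
        self.buffer = []
        self.last_flush = time.monotonic()

    def on_span(self, span):
        self.buffer.append(span)
        if (len(self.buffer) >= self.send_batch_size
                or time.monotonic() - self.last_flush >= self.timeout):
            self.flush()

    def flush(self):
        if self.buffer:
            self.export(self.buffer)  # one network call per batch, not per span
            self.buffer = []
        self.last_flush = time.monotonic()

batches = []
bp = BatchProcessor(batches.append, send_batch_size=3, timeout=60.0)
for i in range(7):
    bp.on_span(f"span-{i}")
bp.flush()  # final flush, as on Collector shutdown
```

Seven spans with a batch size of 3 produce batches of 3, 3, and 1, which is why batching reduces network load compared with exporting each span individually.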
The receivers section defines how the OpenTelemetry Collector receives telemetry data. In this example, the otlp receiver is enabled with both the grpc and http protocols, which means the Collector will accept OTLP (OpenTelemetry Protocol) data over gRPC and HTTP.

The processors section includes a batch processor, which batches trace data before sending it to the exporter. This improves performance and reduces network load by sending data in batches rather than individually.

The exporters section defines how the telemetry data is sent to Jaeger. The jaeger exporter sends trace data to a Jaeger backend. The endpoint is set to the Jaeger Collector's gRPC endpoint (jaeger-collector.default.svc:14250), which you need to replace with your actual Jaeger Collector endpoint. The tls section's insecure field is set to true for development purposes, allowing communication without TLS. For production environments, you should configure proper TLS settings and set insecure to false.

The service section defines the pipelines that process telemetry data. The traces pipeline is defined with otlp as the receiver, batch as the processor, and jaeger as the exporter: the pipeline will receive OTLP trace data, batch it, and export it to Jaeger.

While manual configuration of the OpenTelemetry Collector offers flexibility, it can be cumbersome and error-prone, especially in dynamic environments. Odigos is a powerful tool that simplifies the entire process of setting up observability. With Odigos, you can easily configure observability backends without manually writing or managing complex configuration files.
Here's how to use Odigos to send traces to a variety of backends, including Jaeger: when you add Jaeger as a destination in Odigos, supply your Jaeger endpoint (http://jaeger-collector.default.svc:14268/api/traces for HTTP or jaeger-collector.default.svc:14250 for gRPC). Odigos automatically handles the creation and management of OpenTelemetry Collectors, applying the necessary configurations to your environment. This makes it significantly easier to set up and maintain observability for your applications, reducing the manual overhead and the potential for configuration errors. By leveraging Odigos, you can quickly achieve robust observability with minimal effort, keeping your focus on application development and performance tuning.
Given that OpenTelemetry setup involves a learning curve, it pays to be aware of common pitfalls and to lean on tooling that helps you avoid them.
OpenTelemetry is a powerful and flexible observability framework, but it can be hard to get started with. Experienced developers may find the setup straightforward, and for those just starting out, tools like the OpenTelemetry Operator, Odigos, and auto-instrumentation libraries make it approachable as well. With the help of those tools and by following best practices, you can build a well-configured observability pipeline that provides deep insight into the health and performance of your applications.