Our vision is seamless automatic instrumentation, provided as a service by the underlying compute platform. This will only happen if we can automatically instrument any container image without any prior knowledge of the application.

In the past few weeks, we have been working hard on improving the language detection in Odigos. We are excited to share the latest improvements with you today.

How the old language detection worked

Let's assume that we have a pod named my-app that we want to detect the language for. Odigos (specifically the Instrumentor) would do the following:

  1. Create a new pod named my-app-lang-detection, and schedule it to run on the same node as my-app.
  2. This new pod has the hostPID flag set to true, which means that it shares the node's PID namespace and can therefore see every process on the node, including those of my-app. This allows the new pod to examine the processes running inside my-app.
  3. The language detection pod determines the language of my-app by examining the following:
    • Process name
    • Linked libraries
    • Environment variables
    • Language-specific tools (e.g. go version for Go applications)
  4. Once the language detection pod determines the language of my-app, it reports it back to the instrumentor.
  5. The instrumentor then deletes the language detection pod, and instruments my-app using the correct instrumentation.
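
To make step 3 more concrete, here is a minimal Go sketch of how such checks could be combined. It is not the Odigos implementation: the ProcessInfo type, the DetectLanguage function, and the specific executable names, library names, and environment variables are all assumptions chosen for illustration. The language-specific tool check (such as go version) is left out.

```go
package main

import "strings"

// ProcessInfo is a hypothetical snapshot of one process in the target pod.
type ProcessInfo struct {
	ExeName string            // base name of the executable, e.g. "java"
	Libs    []string          // paths of shared libraries mapped by the process
	Env     map[string]string // environment variables of the process
}

// DetectLanguage applies illustrative heuristics in order: process name,
// linked libraries, then environment variables. It returns "unknown"
// when none of the checks match.
func DetectLanguage(p ProcessInfo) string {
	// 1. Process name.
	switch {
	case p.ExeName == "java":
		return "java"
	case p.ExeName == "node":
		return "javascript"
	case strings.HasPrefix(p.ExeName, "python"):
		return "python"
	}

	// 2. Linked libraries (assumed runtime library names).
	for _, lib := range p.Libs {
		switch {
		case strings.Contains(lib, "libjvm.so"):
			return "java"
		case strings.Contains(lib, "libpython"):
			return "python"
		case strings.Contains(lib, "libcoreclr.so"):
			return "dotnet"
		}
	}

	// 3. Environment variables (assumed hints; many images set these).
	if _, ok := p.Env["JAVA_HOME"]; ok {
		return "java"
	}
	if _, ok := p.Env["NODE_VERSION"]; ok {
		return "javascript"
	}
	if _, ok := p.Env["PYTHON_VERSION"]; ok {
		return "python"
	}
	return "unknown"
}
```

Ordering the checks from most to least specific keeps a stray environment variable from overriding a clear signal such as the executable name.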

This worked well at first, but as Odigos became more popular, we started to see some issues with this approach.

OutOfPods error

The first issue we encountered was the OutOfPods error. Kubernetes limits the number of pods that can be scheduled on a node, and this applies to managed Kubernetes providers (e.g. EKS) as well. A common value for this limit is 110 pods per node, which is the kubelet default. When the limit is reached, Kubernetes will not allow any new pods to be scheduled on the node, and it will return the OutOfPods error. Scheduling a pod on a specific node can also fail for other reasons, and whenever the language detection pod could not be scheduled, Odigos failed to detect languages for the applications running on that node.

Slow language detection

The second issue we encountered was slow language detection. Creating a new pod and deleting it after the language detection is complete takes time. In large clusters, where this cycle repeats for many applications, it can cause a significant delay in the language detection process.

Elevated privileges

The last issue we encountered was the need for elevated privileges. This method required the instrumentor component to have a ClusterRole that allows it to delete pods in any namespace. This is not ideal, and it can be a potential security risk.

A better way to detect languages

To solve the issues mentioned above, we rewrote the language detection mechanism and integrated it into the Odiglet. Odiglet is a small agent we developed that runs on every node in the cluster and is responsible for providing nodes with an instrumentation virtual device.

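Moving the language detection to Odiglet solved all the issues mentioned above. Because Odiglet already runs on every node, no extra pod has to be scheduled, so language detection can no longer fail with the OutOfPods error. Detecting languages is now much faster, and it does not require permissions to delete pods.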
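
This post does not describe how Odiglet inspects processes internally, so the following Go sketch is only an assumption about what a node-local agent could do: read a process's environment from the node's /proc filesystem instead of launching a helper pod. The names ParseEnviron and ReadProcessEnv are hypothetical. The one fact relied on is that Linux exposes /proc/<pid>/environ as NUL-separated KEY=VALUE records.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// ParseEnviron splits the NUL-separated KEY=VALUE records found in
// /proc/<pid>/environ into a map. Records without "=" are skipped.
func ParseEnviron(raw []byte) map[string]string {
	env := make(map[string]string)
	for _, rec := range bytes.Split(raw, []byte{0}) {
		if len(rec) == 0 {
			continue // trailing NUL produces an empty record
		}
		key, value, ok := bytes.Cut(rec, []byte("="))
		if !ok {
			continue
		}
		env[string(key)] = string(value)
	}
	return env
}

// ReadProcessEnv reads the environment of one process. procRoot is
// normally "/proc"; it is a parameter so the function can be pointed
// at a different mount path or a test directory.
func ReadProcessEnv(procRoot string, pid int) (map[string]string, error) {
	raw, err := os.ReadFile(fmt.Sprintf("%s/%d/environ", procRoot, pid))
	if err != nil {
		return nil, err
	}
	return ParseEnviron(raw), nil
}
```

An agent that is already running on the node pays only the cost of a few file reads per process, which is consistent with the speedup described above.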

Try out the latest version of Odigos, and let us know what you think!
