Why OpenTelemetry is driving a new wave of innovation on top of observability data



The past decade has brought a gradual transition from monolithic applications running on static infrastructure to microservices running on highly dynamic cloud native infrastructure. This change has driven the rapid emergence of many new technologies, frameworks, and architectures, along with a new set of monitoring and observability tools that give engineers complete visibility into the health and performance of these new systems.

Visibility is essential to ensure that a system and its dependencies behave as expected and to identify and expedite the resolution of any issues that may arise. To do this, teams need to collect comprehensive telemetry data on health and performance (metrics, logs, and traces) from all of these components. This is accomplished by instrumentation.

Why do we need OpenTelemetry?

For many years, there has been a wide variety of open source and proprietary instrumentation tools, such as StatsD, Nagios plugins, Prometheus exporters, Datadog integrations, and New Relic agents. Unfortunately, while there are many open source tools out there, the developer community and vendors have never aligned on common instrumentation standards (StatsD being only a partial exception). This makes interoperability a challenge.

The lack of instrumentation and interoperability standards has forced each monitoring and observability tool to create its own collection of integrations to instrument the technologies developers use and need visibility into. For example, many monitoring tools have built integrations to instrument widely used databases like MySQL, including the Prometheus MySQL exporter, the Datadog MySQL integration, and the New Relic MySQL integration.

This is also true for application code instrumentation, where New Relic, Dynatrace, Datadog, and other vendors have created complex agents that automatically instrument popular application frameworks and libraries. Developers spend years creating instrumentation, and it requires a significant investment to build a large enough catalog of integrations and to maintain it as new versions of the monitored technologies are released. Not only is this a very inefficient use of overall developer resources, it also creates vendor lock-in, since you have to re-instrument your systems if you want to change your observability tool.

Finally, the real value of innovation (and where customers benefit the most!) is not in the instrumentation itself, but in improvements and advancements in what you can do with the collected data. Requiring new tools to make a large investment in instrumentation (the area that offers the least benefit to end users) before they can enter the market has created a large barrier to entry and severely limited innovation in the space.

All of this is about to change drastically, thanks to OpenTelemetry: an emerging open source standard that democratizes instrumentation.

OpenTelemetry has already gained a lot of momentum, with all major observability vendors, cloud providers, and many end users contributing to the project. It has become the second most active CNCF project by number of contributions, behind only Kubernetes. (It was also recently accepted as a CNCF incubating project, which underscores its importance to the engineering community.)

Why is OpenTelemetry so popular?

OpenTelemetry approaches the instrumentation “problem” in a different way. Like other (usually proprietary) attempts, it provides plenty of out-of-the-box instrumentation for application frameworks and infrastructure components, as well as SDKs for developers to add their own instrumentation.

Unlike other instrumentation frameworks, OpenTelemetry covers metrics, traces, and logs, and defines an API, semantic conventions, and a standard wire protocol (the OpenTelemetry Protocol, or OTLP). In addition, it is completely vendor-neutral, with a plugin architecture for exporting data to any backend.
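To make the plugin idea concrete, here is a toy sketch of the pattern (not the real OpenTelemetry SDK; every name in it is illustrative): the instrumentation API stays fixed, while the exporter behind it is an interchangeable plugin.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Span:
    """Minimal stand-in for a trace span."""
    name: str
    duration_ms: float


class SpanExporter(Protocol):
    """The plugin interface: any backend implements export()."""
    def export(self, spans: list[Span]) -> None: ...


class ConsoleExporter:
    """One backend: print spans to stdout."""
    def export(self, spans: list[Span]) -> None:
        for s in spans:
            print(f"{s.name}: {s.duration_ms}ms")


class InMemoryExporter:
    """Another backend: collect spans in memory."""
    def __init__(self) -> None:
        self.spans: list[Span] = []

    def export(self, spans: list[Span]) -> None:
        self.spans.extend(spans)


class Tracer:
    """Instrumentation API: application code only ever talks to this."""
    def __init__(self, exporter: SpanExporter) -> None:
        self._exporter = exporter

    def record(self, name: str, duration_ms: float) -> None:
        self._exporter.export([Span(name, duration_ms)])


# Switching backends means swapping the exporter plugin,
# not re-instrumenting the application code.
tracer = Tracer(ConsoleExporter())
tracer.record("GET /users", 12.5)
```

The real SDKs follow the same shape at much larger scale: instrumented code depends only on the stable API, so changing observability vendors is a configuration change rather than a rewrite.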

Beyond that, OpenTelemetry’s goal is for developers who create technologies that others build on (for example, application frameworks, databases, web servers, and service meshes) to embed instrumentation directly into the code they produce. This makes the instrumentation readily available to anyone who uses that code in the future, and saves other developers from having to learn the technology and figure out how to write instrumentation for it (which in some cases requires complex techniques such as bytecode injection).

OpenTelemetry unlocks a lot of new value for all developers:

  1. Interoperability. Analyze the entire flow of requests through your application as they traverse your microservices, cloud services, and third-party SaaS, in the observability tool of your choice. Effortlessly send your observability data to a data warehouse to analyze it alongside your business data. OpenTelemetry’s common API, data semantics, and protocol make all of the above, and more, possible out of the box.
  2. Ubiquitous instrumentation. With a much larger community working together rather than duplicating effort in silos, everyone benefits from the broadest, deepest, and highest quality instrumentation available.
  3. Standing the test of time. You can instrument your code once and use it anywhere, because the vendor-neutral approach lets you send data to, and run analysis in, the backend of your choice. Before OpenTelemetry, switching observability backends usually required tedious re-instrumentation of your system.
  4. Reduced resource footprint. More and more instrumentation is built directly into frameworks and technologies instead of being injected, reducing CPU and memory usage.
  5. Improved uptime. With OpenTelemetry’s shared metadata, observability tools can better correlate metrics, traces, and logs, letting you troubleshoot and resolve production issues faster.

More importantly, companies no longer have to spend time, personnel, and money developing their own product-specific instrumentation, and can focus on improving the developer experience instead. With access to a broad, deep, and high-quality observability dataset of metrics, traces, and logs, without a multi-million dollar investment in instrumentation, a new wave of solutions that leverage observability data is coming.

Let’s look at a few examples to demonstrate what OpenTelemetry will enable (and in some cases already enables) developers to do:

  • AWS is integrating OpenTelemetry instrumentation across its services. For example, it released automatic trace instrumentation for Java Lambda functions that requires no code changes. This gives developers immediate visibility into the performance of their Java code and lets them send all collected data to the backend of their choice. As a result, they are not tied to a specific vendor and can send the data to multiple backends to cover different use cases.
  • Kubernetes and the popular Apollo GraphQL server have added initial OpenTelemetry tracing instrumentation to their code. This provides efficient out-of-the-box instrumentation that is integrated directly into the code via the OpenTelemetry Go and JavaScript libraries, and the instrumentation is written by the experts who built these technologies.
  • Jenkins, the open source CI/CD server, offers an OpenTelemetry plugin to monitor and troubleshoot jobs using distributed tracing. This gives developers visibility into the time spent on tasks and the errors that occur, helping them troubleshoot and improve those jobs.
  • Rookout, a debugger for cloud native applications, integrated OpenTelemetry traces to provide additional context in the debugger itself. This helps developers understand the entire request flow through the code they are troubleshooting, with additional context from the attributes in the OpenTelemetry data.
  • Timescale lets developers store their OpenTelemetry trace data in Postgres via OTLP. Developers can then use powerful SQL queries to analyze their traces and correlate them with other business data stored in Postgres. For example, if you run a SaaS service backed by a database, you can analyze database request latency by customer ARR band to ensure that your most valuable customers (who are the most likely to suffer from poor query performance, because they store more data in your application) see the best possible performance from your product.
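The last example can be sketched with a few lines of SQL. The snippet below uses Python’s built-in sqlite3 as a stand-in for Postgres, and the table and column names are hypothetical, not any vendor’s actual trace schema; the point is simply that spans stored in a relational database can be joined with business data.

```python
import sqlite3

# In-memory database standing in for Postgres; schema is illustrative only.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE spans (customer_id INT, name TEXT, duration_ms REAL);
CREATE TABLE customers (id INT, arr_band TEXT);
INSERT INTO customers VALUES (1, 'enterprise'), (2, 'self-serve');
INSERT INTO spans VALUES
  (1, 'db.query', 120.0), (1, 'db.query', 80.0), (2, 'db.query', 10.0);
""")

# Average database query latency per ARR band:
# trace data (spans) joined with business data (customers).
rows = db.execute("""
    SELECT c.arr_band, AVG(s.duration_ms) AS avg_ms
    FROM spans s JOIN customers c ON s.customer_id = c.id
    WHERE s.name = 'db.query'
    GROUP BY c.arr_band
    ORDER BY avg_ms DESC
""").fetchall()

for band, avg_ms in rows:
    print(band, avg_ms)  # highest-latency band prints first
```

With real trace data the same query shape immediately answers questions like “are our enterprise customers seeing slower queries?” without any extra tooling.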

OpenTelemetry is still under (very!) active development, so this is just the beginning. While many of the products and projects above improve the lives of engineers operating production environments, a whole field of new possibilities is opening up. With interoperability and ubiquitous instrumentation, there is huge potential for existing businesses to improve their products or develop new tools, and for newcomers and entrepreneurs to leverage OpenTelemetry instrumentation to solve existing problems with new, innovative approaches or to tackle entirely new ones.

