Telemetry: Unlocking the Hidden Power of Observability in Axon Server Applications

AxonIQ

Empower your business with AxonIQ—event-driven microservices, real-time insights, and seamless scalability.

Published Nov 15, 2024

As applications grow in complexity, understanding their performance and behavior becomes a critical challenge. At AxonIQ Conference 2024, Richard Bouška, CTO of ASSIST, delivered a compelling talk on how telemetry—often an overlooked aspect of system architecture—can transform the way we monitor and optimize Axon Server-centric applications.

What is Telemetry, and Why Does It Matter?

Telemetry encompasses the collection and analysis of operational data such as logs, metrics, and traces to answer the fundamental question: “What’s happening in my system?” For teams working with Axon Framework and Axon Server, telemetry becomes the key to achieving transparency, ensuring resilience, and fine-tuning performance across distributed applications.

“Without telemetry, you’re left reacting to user complaints instead of proactively addressing issues,” Richard explained. “It’s the mechanical sympathy of modern software systems.”

From Metrics to Mastery: Richard’s Journey with Telemetry

Richard walked the audience through ASSIST’s multi-year evolution with Axon technologies:

2021: Focused on CQRS and event sourcing for building domain-driven architectures.
2022: Adopted microservices to distribute applications more effectively.
2023: Scaled applications globally, making location transparency a priority.
2024: Observability emerged as a cornerstone for ensuring system reliability and optimizing performance.

Telemetry became essential as ASSIST deployed increasingly complex systems worldwide. Richard’s team used tools like Prometheus and Grafana to collect, visualize, and analyze metrics. These tools allowed them to spot anomalies, track resource usage, and even predict issues before they became critical.

Lessons Learned: The Challenges of Telemetry

Richard didn’t shy away from the hurdles:

Information Overload: With so many metrics to track—Java Virtual Machine (JVM) stats, Axon Server data, custom business metrics—it’s easy to drown in data. Teams must carefully choose what to monitor.
Complexity of Tools: Teams had to learn multiple query languages, statistical concepts, and dashboarding techniques. Richard humorously noted that adopting dark mode in Prometheus was a turning point for their younger developers.
Interpreting Data: Even simple metrics like averages can be misleading. Richard explained how poor aggregation could create resonances or skew results, leading to incorrect conclusions.
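
The point about averages can be made concrete with a small sketch. It is not taken from the talk: the class name LatencySummary and the sample durations are invented for illustration. It compares the mean of a set of command durations with nearest-rank percentiles of the same data.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Axon Framework or any monitoring library.
final class LatencySummary {

    /** Arithmetic mean of the samples in milliseconds (NaN for an empty array). */
    static double mean(long[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(Double.NaN);
    }

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Invented data: 95 fast commands at 30 ms and 5 slow ones at 1500 ms.
        long[] durations = new long[100];
        Arrays.fill(durations, 0, 95, 30L);
        Arrays.fill(durations, 95, 100, 1500L);

        System.out.println("mean = " + mean(durations) + " ms");
        System.out.println("p50  = " + percentile(durations, 50) + " ms");
        System.out.println("p99  = " + percentile(durations, 99) + " ms");
    }
}
```

With this invented data the mean is 103.5 ms, a value no single command actually took, while the median is 30 ms and the 99th percentile is 1500 ms. Looking at percentiles next to the mean is one common way to avoid the kind of wrong conclusion described above.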

The Benefits: Why Invest in Telemetry?

Despite the challenges, telemetry offers undeniable advantages:

Detecting Issues Early: By monitoring metrics like memory usage after garbage collection, teams can spot potential problems like memory leaks before they impact production.
Optimizing System Performance: Richard highlighted how snapshotting and proper configuration reduced command handling times from 1.5 seconds to 30 milliseconds.
Supporting Collaboration: Collocating development, DevOps, and observability teams enabled faster issue resolution and better system design alignment.
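
As a rough illustration of the "memory usage after garbage collection" metric, the sketch below reads post-collection heap usage through the standard java.lang.management API. It is an assumption about how such a value could be sampled, not a description of ASSIST's setup; the class name PostGcHeap is hypothetical, and in practice the number would be exported to a metrics system rather than printed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper built only on the JDK's management API.
final class PostGcHeap {

    /** Sums the bytes still in use in each heap pool after that pool's most recent collection. */
    static long usedAfterLastGcBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache and other non-heap pools
            }
            // getCollectionUsage() returns null when the pool does not support this measurement.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.gc(); // only a hint to the JVM; a collection is not guaranteed
        System.out.println("heap used after last GC: " + usedAfterLastGcBytes() + " bytes");
    }
}
```

Plotted over time, a post-GC value that keeps climbing is a typical sign of a leak, because it counts only objects that survived collection and so ignores the normal sawtooth of short-lived allocations.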

One of the most striking insights was Richard’s emphasis on “mechanical sympathy”—understanding how a system is designed to be used and aligning its operation with that design. Telemetry provides the visibility needed to achieve this harmony.

Practical Applications for Axon Server Users

Richard demonstrated how telemetry transformed their Axon Server deployments:

Node Connection Monitoring: By visualizing how applications connected to Axon Server nodes, they could identify and fix inconsistencies.
Event Processing Analysis: Metrics like the last token per context helped ensure event streams were processed correctly.
Command and Query Optimization: Real-time monitoring of command durations and query response times allowed for precise tuning and reduced latency.

He also encouraged teams to replay their event stores periodically. “It’s amazing what you can learn by observing patterns over millions of events,” Richard remarked. Replay data not only revealed performance bottlenecks but also provided insights into user behavior and system evolution.

Key Takeaways for Teams Using Axon Technologies

Telemetry Is Essential, Not Optional: Modern applications require visibility to ensure reliability and performance.
Start Simple, Then Iterate: Focus on key metrics like memory usage, event processing rates, and command durations before expanding to more complex analyses.
Collaboration Boosts Success: Observability isn’t just about tools; it’s about aligning teams and sharing knowledge.
Invest in the Right Tools: Tools like Prometheus, Grafana, and Axon Server’s telemetry capabilities provide powerful frameworks for monitoring distributed systems.

Closing Thoughts

As Richard concluded, “Telemetry is the most important feature of Axon.” While dashboards and graphs might seem overwhelming at first, they are invaluable tools for ensuring your systems remain efficient, resilient, and scalable. Whether you’re debugging a memory issue, optimizing event processing, or predicting user behavior, telemetry equips your team with the insights needed to stay ahead.

Ready to optimize your Axon Server deployments? Explore how AxonIQ’s solutions can help you leverage telemetry and gain deeper visibility into your systems.
