Metrics

E.D.D.I exposes internal JVM metrics in Prometheus format, along with comprehensive tool management metrics for declarative agents.

These metrics are viewable here:

http://<eddi-instance>/q/metrics
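To confirm the endpoint is reachable from a script, you can fetch the text directly. This is a minimal Python sketch using only the standard library. The helper names are illustrative, and the host in the usage comment is a placeholder for your own instance.

```python
from urllib.request import urlopen


def metrics_url(base_url: str) -> str:
    """Build the metrics URL for an instance, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/q/metrics"


def fetch_metrics(base_url: str, timeout: float = 5.0) -> str:
    """Return the raw metrics text served by the instance."""
    with urlopen(metrics_url(base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (placeholder host, replace with your own instance):
# print(fetch_metrics("http://localhost:7070")[:500])
```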

To visualize these metrics, you can use the predefined Grafana dashboard for E.D.D.I:

E.D.D.I dashboard for Grafana


Tool Management Metrics

Version: ≥5.6.0

EDDI provides comprehensive metrics for monitoring tool usage, performance, caching, costs, and rate limits.

Cache Metrics

Monitor the performance of the Infinispan-based smart cache for tool results.

Note: The metric names below use Micrometer's dotted format (e.g., eddi.tool.cache.hits) as seen at /q/metrics. When scraped by Prometheus, these are automatically converted to underscored format with a _total suffix for counters (e.g., eddi_tool_cache_hits_total). The Prometheus queries in this document use the Prometheus naming convention.
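The renaming described in the note can be sketched in a few lines of Python. This covers only the two rules stated above (dots become underscores, counters gain a _total suffix). Timers and other meter types may receive further suffixes that this sketch ignores, and the function name is illustrative.

```python
def to_prometheus_name(micrometer_name: str, is_counter: bool = False) -> str:
    """Map a dotted Micrometer name to the Prometheus convention described above."""
    name = micrometer_name.replace(".", "_")
    # Counters gain a _total suffix unless the name already carries one.
    if is_counter and not name.endswith("_total"):
        name += "_total"
    return name


print(to_prometheus_name("eddi.tool.cache.hits", is_counter=True))
# eddi_tool_cache_hits_total
print(to_prometheus_name("eddi.tool.cache.size"))
# eddi_tool_cache_size
```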

eddi.tool.cache.hits                    # Total cache hits
eddi.tool.cache.misses                  # Total cache misses
eddi.tool.cache.hits{tool="weather"}    # Hits per tool
eddi.tool.cache.misses{tool="weather"}  # Misses per tool
eddi.tool.cache.puts{tool="weather"}    # Cache puts per tool
eddi.tool.cache.get.duration            # Cache get duration (timer)
eddi.tool.cache.put.duration            # Cache put duration (timer)
eddi.tool.cache.size                    # Current cache size (gauge)
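For a quick look at these counters outside Prometheus, the hit and miss counters can be combined into a lifetime hit ratio, hits / (hits + misses). The Python sketch below assumes the samples arrive in Prometheus text format with the underscored names from the note, and that only per-tool series are exported. If an unlabeled aggregate series is exported as well, summing would double count. The sample text is made up for illustration.

```python
import re

# Matches lines such as: eddi_tool_cache_hits_total{tool="weather"} 42.0
# Simplification: label values containing "}" are not handled.
SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(?P<value>\S+)'
)


def sum_metric(exposition_text: str, metric_name: str) -> float:
    """Sum every sample of one metric across all label sets."""
    total = 0.0
    for line in exposition_text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        match = SAMPLE.match(line)
        if match and match.group("name") == metric_name:
            total += float(match.group("value"))
    return total


def cache_hit_rate(exposition_text: str) -> float:
    """Return hits / (hits + misses), or 0.0 when there were no lookups."""
    hits = sum_metric(exposition_text, "eddi_tool_cache_hits_total")
    misses = sum_metric(exposition_text, "eddi_tool_cache_misses_total")
    lookups = hits + misses
    return hits / lookups if lookups else 0.0


# Made-up sample data, not real output from an instance.
sample = """\
# TYPE eddi_tool_cache_hits_total counter
eddi_tool_cache_hits_total{tool="weather"} 30.0
eddi_tool_cache_hits_total{tool="search"} 45.0
eddi_tool_cache_misses_total{tool="weather"} 10.0
eddi_tool_cache_misses_total{tool="search"} 15.0
"""
print(cache_hit_rate(sample))  # 0.75
```

Because counters accumulate from process start, this ratio covers the whole uptime rather than a recent window.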

Example Prometheus Query - Cache Hit Rate:

Rate Limiting Metrics

Monitor rate limiting and abuse prevention:

Example Prometheus Query - Rate Limit Violations:

Cost Tracking Metrics

Monitor API costs and budget usage:

Example Prometheus Query - Cost Per Hour:

Tool Execution Metrics

Monitor tool performance and reliability:

Example Prometheus Query - Success Rate:

Example Prometheus Query - P95 Latency:


Sample Grafana Dashboard

Tool System Overview


Prometheus Alerts

Sample Alert Rules


Accessing Metrics

Via Prometheus

Via REST API

EDDI also provides REST endpoints for tool metrics:


Monitoring Best Practices

Key Metrics to Monitor

  1. Cache Hit Rate - Target: >70%
  2. Tool Success Rate - Target: >95%
  3. P95 Latency - Target: <2 seconds
  4. Cost Per Request - Target: <$0.001
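The targets above can also be checked mechanically, for example in a scheduled health script. The sketch below is a minimal Python illustration. The dictionary keys, the `Target` helper, and the measured values are all hypothetical, and only the thresholds come from the list.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Target:
    label: str
    meets: Callable[[float], bool]


# Thresholds copied from the list above; the keys are illustrative names.
TARGETS = {
    "cache_hit_rate": Target("Cache Hit Rate > 70%", lambda v: v > 0.70),
    "tool_success_rate": Target("Tool Success Rate > 95%", lambda v: v > 0.95),
    "p95_latency_seconds": Target("P95 Latency < 2 seconds", lambda v: v < 2.0),
    "cost_per_request_usd": Target("Cost Per Request < $0.001", lambda v: v < 0.001),
}


def failed_targets(measured: dict[str, float]) -> list[str]:
    """Return the label of every target that the measured values miss."""
    return [
        TARGETS[key].label
        for key, value in measured.items()
        if key in TARGETS and not TARGETS[key].meets(value)
    ]


# Hypothetical measurements for illustration only.
measured = {
    "cache_hit_rate": 0.64,
    "tool_success_rate": 0.97,
    "p95_latency_seconds": 2.4,
    "cost_per_request_usd": 0.0004,
}
print(failed_targets(measured))
# ['Cache Hit Rate > 70%', 'P95 Latency < 2 seconds']
```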


Additional Resources
