OpenTelemetry GenAI Semantic Conventions: Vendor-Neutral Standard for Agent and LLM Traces
By Vatsal Shah · June 22, 2026 · Open Source · Source: OpenTelemetry
AI SUMMARY
- The OpenTelemetry (OTel) GenAI Special Interest Group (SIG) stabilized the GenAI Semantic Conventions in June 2026, establishing the first formal, vendor-neutral telemetry standard for LLM and agentic workflows.
- The standard defines explicit specifications for System Spans (gen_ai.system), Model Spans (capturing prompt/completion tokens and hyperparameters), and Agent Spans (gen_ai.agent.name and gen_ai.tool.name).
- Enables production-grade distributed trace propagation across multi-agent systems and microservices using W3C Trace Context headers (traceparent), tracking request pathways through orchestrators, tool calls, and multiple models.
- Native adoption is rolling out across core frameworks (LangGraph, CrewAI, AutoGen) and observability backends (Datadog, Dynatrace, Grafana, Honeycomb), allowing enterprises to debug multi-step agent failures without vendor lock-in.
- Features support for the Model Context Protocol (MCP), instrumenting MCP-based tool discovery and execution with structured spans, and aligning with the latest OpenTelemetry semantic conventions specifications.
What Happened
The OpenTelemetry GenAI Special Interest Group (SIG) has released the stable specification of the GenAI Semantic Conventions for LLM and agentic traces. Developed in the semantic-conventions-genai repository, this release establishes a uniform telemetry schema that lets developers and site reliability engineers (SREs) monitor, debug, and optimize complex AI agent applications without relying on proprietary instrumentation.
Historically, monitoring AI agents required custom integrations for each LLM provider, framework, and APM vendor. The new OTel conventions standardize how LLM requests, model inputs/outputs, agent planning loops, and tool execution phases are recorded as spans.
This release provides concrete attributes and trace structures for logging token usage (gen_ai.usage.input_tokens, gen_ai.usage.output_tokens), model parameters (gen_ai.request.model, gen_ai.response.model, gen_ai.request.temperature), and agentic runtime context (gen_ai.agent.name, gen_ai.tool.name, and gen_ai.tool.status).

Key capabilities introduced in the OTel GenAI Semantic Conventions standard:
- Structured LLM Model Call Instrumentation: Standardizes span attributes for model name, request temperature, top-p, finish reason, and token usage, rendering them readable by any OTel-compliant backend.
- Hierarchical Agent Spans: Defines conventions for tracing agent planning and loop execution, grouping sub-steps like tool calls under a parent agent span.
- W3C Trace Context Propagation: Standardizes trace context passing across multi-agent environments using traceparent headers, facilitating tracing from front-end user actions down to background tools and model calls.
- Model Context Protocol (MCP) Support: Provides a schema to instrument tool-calling systems governed by the Model Context Protocol (MCP), recording tool discovery, parameters, and outputs.
- Compatibility Flagging: Implements OTEL_SEMCONV_STABILITY_OPT_IN environment variable integration, allowing existing OTel SDK instances to opt into the GenAI schemas.
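As a rough illustration of the opt-in mechanism, the sketch below reads a comma-separated OTEL_SEMCONV_STABILITY_OPT_IN value and checks for a GenAI token. The helper names parseStabilityOptIn and genAiConventionsEnabled are hypothetical, not SDK functions, and the token gen_ai_latest_experimental is an assumption based on the value documented while the conventions were still in development; the instrumentation library you use may expect a different one.

```typescript
// Hypothetical helper: split the comma-separated opt-in value into tokens.
function parseStabilityOptIn(raw: string | undefined): Set<string> {
  const tokens = (raw ?? "")
    .split(",")
    .map((token) => token.trim())
    .filter((token) => token.length > 0);
  return new Set(tokens);
}

// Assumed token name; check your instrumentation library's documentation.
const GEN_AI_OPT_IN_TOKEN = "gen_ai_latest_experimental";

// True when the GenAI token is present in the opt-in list.
function genAiConventionsEnabled(raw: string | undefined): boolean {
  return parseStabilityOptIn(raw).has(GEN_AI_OPT_IN_TOKEN);
}
```

In a Node.js service the argument would typically come from process.env.OTEL_SEMCONV_STABILITY_OPT_IN; taking it as a parameter keeps the sketch easy to test.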
Why It Matters
Observability as the Execution Gap Fix
In production, AI agents often fail silently or fall into unpredictable loops. Without tracing, a developer sees only that a request timed out or returned a bad response. They cannot pinpoint whether the failure was caused by an incorrect tool response, a model hallucination, or a latency bottleneck at the API gateway.
The OTel GenAI Semantic Conventions address this visibility gap. SREs can track the exact progression of an agentic workflow through a unified span hierarchy.
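To make the idea of a span hierarchy concrete, here is a minimal sketch that turns a flat list of spans into an indented tree using parent span IDs. The FlatSpan type, the renderSpanTree function, and the agent and tool names in the sample data are illustrative assumptions, not part of the conventions or any SDK.

```typescript
// Minimal span record; field names are illustrative, not an SDK type.
interface FlatSpan {
  spanId: string;
  parentSpanId?: string; // undefined for the root span
  label: string;
}

// Render spans as indented lines, each child listed under its parent.
// Spans whose parent is missing from the input are not rendered.
function renderSpanTree(spans: FlatSpan[]): string[] {
  const childrenOf = new Map<string | undefined, FlatSpan[]>();
  for (const span of spans) {
    const siblings = childrenOf.get(span.parentSpanId) ?? [];
    siblings.push(span);
    childrenOf.set(span.parentSpanId, siblings);
  }
  const lines: string[] = [];
  const visit = (parentId: string | undefined, depth: number): void => {
    for (const span of childrenOf.get(parentId) ?? []) {
      lines.push("  ".repeat(depth) + span.label);
      visit(span.spanId, depth + 1);
    }
  };
  visit(undefined, 0);
  return lines;
}

// Hypothetical trace: one agent span with two model calls and a tool call.
const exampleTrace: FlatSpan[] = [
  { spanId: "a1", label: "agent: research_agent" },
  { spanId: "b1", parentSpanId: "a1", label: "model call: plan" },
  { spanId: "b2", parentSpanId: "a1", label: "tool: web_search" },
  { spanId: "b3", parentSpanId: "a1", label: "model call: summarize" },
];

console.log(renderSpanTree(exampleTrace).join("\n"));
```

A tracing backend performs the same grouping when it draws a waterfall view: the agent span is the parent, and each model call or tool call appears as a child beneath it.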

With this hierarchy, APM platforms can automatically compute metrics such as:
- TCO (Total Cost of Ownership) per Trace: Grouping gen_ai.usage.input_tokens and gen_ai.usage.output_tokens across multiple model calls in a single trace to calculate costs.
- Tool Latency and Error Rate: Isolating gen_ai.tool.name spans to identify which tools cause bottlenecks or throw exceptions.
- Hyperparameter Impact: Analyzing agent performance by comparing trace latency against hyperparameter values like temperature and top_p.
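The first two metrics above can be sketched as plain aggregations over exported span data. Everything here is an assumption for illustration: the record shapes, the function names costPerTrace and toolStats, and the per-million-token prices, which are placeholders rather than real vendor rates.

```typescript
// Illustrative span records; field names are assumptions, not an SDK type.
interface ModelCallRecord {
  traceId: string;
  inputTokens: number; // value of the input-token usage attribute
  outputTokens: number; // value of the output-token usage attribute
}

interface ToolCallRecord {
  toolName: string; // value of gen_ai.tool.name
  durationMs: number;
  failed: boolean;
}

// Hypothetical prices in USD per million tokens, not real vendor rates.
interface TokenPrice {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface ToolStats {
  calls: number;
  errorRate: number;
  meanDurationMs: number;
}

// Sum tokens per trace first, then price the totals once per trace.
function costPerTrace(calls: ModelCallRecord[], price: TokenPrice): Map<string, number> {
  const tokens = new Map<string, { input: number; output: number }>();
  for (const call of calls) {
    const total = tokens.get(call.traceId) ?? { input: 0, output: 0 };
    total.input += call.inputTokens;
    total.output += call.outputTokens;
    tokens.set(call.traceId, total);
  }
  const costs = new Map<string, number>();
  tokens.forEach((total, traceId) => {
    const usd = (total.input * price.inputPerMillion + total.output * price.outputPerMillion) / 1e6;
    costs.set(traceId, usd);
  });
  return costs;
}

// Group tool spans by tool name and compute call count, error rate, and mean latency.
function toolStats(records: ToolCallRecord[]): Map<string, ToolStats> {
  const acc = new Map<string, { calls: number; errors: number; totalMs: number }>();
  for (const record of records) {
    const entry = acc.get(record.toolName) ?? { calls: 0, errors: 0, totalMs: 0 };
    entry.calls += 1;
    if (record.failed) entry.errors += 1;
    entry.totalMs += record.durationMs;
    acc.set(record.toolName, entry);
  }
  const stats = new Map<string, ToolStats>();
  acc.forEach((entry, toolName) => {
    stats.set(toolName, {
      calls: entry.calls,
      errorRate: entry.errors / entry.calls,
      meanDurationMs: entry.totalMs / entry.calls,
    });
  });
  return stats;
}
```

Summing tokens before pricing keeps the arithmetic in integers for as long as possible, which avoids accumulating floating-point error across many small model calls.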
Distributed Trace Propagation in Multi-Agent Systems
Many enterprise AI architectures deploy multiple cooperating agents as microservices. For instance, a user-facing Router Agent may delegate tasks to a Research Agent, which in turn calls an API service.
OTel GenAI conventions define how the W3C Trace Context propagates across service boundaries. By passing the traceparent header, the entire session is grouped under a single trace ID, tracking execution flows regardless of network boundaries.
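The mechanics of the traceparent header can be sketched without any SDK. The W3C format is version, trace ID, parent span ID, and flags, joined by hyphens. The sketch below handles only version 00 and only the sampled flag; the function names are hypothetical, and in practice an OpenTelemetry SDK's propagator does this work for you.

```typescript
// Parsed form of a version-00 traceparent header.
interface TraceContext {
  traceId: string; // 32 lowercase hex characters, shared by the whole trace
  parentId: string; // 16 lowercase hex characters, the caller's span ID
  sampled: boolean; // lowest bit of the flags byte
}

const TRACEPARENT_PATTERN = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Returns null for malformed headers or all-zero IDs, which the spec treats as invalid.
function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_PATTERN.exec(header.trim());
  if (match === null) return null;
  const [, traceId, parentId, flags] = match;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentId}-${ctx.sampled ? "01" : "00"}`;
}

// Header for an outgoing call: the trace ID is kept, and the current
// service's span ID replaces the parent ID.
function childTraceparent(incoming: string, currentSpanId: string): string | null {
  const ctx = parseTraceparent(incoming);
  if (ctx === null) return null;
  return formatTraceparent({ ...ctx, parentId: currentSpanId });
}
```

In the Router Agent and Research Agent example, the Router Agent would send a header built by childTraceparent with its own span ID, so the Research Agent's spans attach to the same trace ID as the original user request.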

This eliminates tracing gaps in complex agent loops, allowing developers to trace the prompt from the user interface, through custom microservices, and into the model.
Code Example: Instrumenting a GenAI Span
Implementing these conventions is straightforward. The following code snippet demonstrates how to instrument an LLM call using OpenTelemetry's TypeScript API, aligning with the new GenAI semantic attributes:
import { trace, SpanStatusCode } from '@opentelemetry/api';

// Placeholder for your provider SDK client; swap in the real client and types.
declare const modelProvider: {
  generate(req: { model: string; prompt: string; temperature: number }): Promise<{
    model: string;
    text: string;
    finish_reason: string;
    usage: { prompt_tokens: number; completion_tokens: number };
  }>;
};

const tracer = trace.getTracer('my-agent-application');

async function callLanguageModel(prompt: string) {
  return tracer.startActiveSpan('gen_ai.model.call', {
    attributes: {
      'gen_ai.system': 'anthropic',
      'gen_ai.request.model': 'claude-3-5-sonnet',
      'gen_ai.request.temperature': 0.7,
      // Prompts may contain sensitive data; record them only if your policy allows
      'gen_ai.prompt': prompt,
    }
  }, async (span) => {
    try {
      // Simulate API call to model provider
      const response = await modelProvider.generate({
        model: 'claude-3-5-sonnet',
        prompt,
        temperature: 0.7
      });
      // Record standard usage metrics on response completion
      span.setAttributes({
        'gen_ai.response.model': response.model,
        'gen_ai.usage.input_tokens': response.usage.prompt_tokens,
        'gen_ai.usage.output_tokens': response.usage.completion_tokens,
        // finish_reasons is an array-valued attribute
        'gen_ai.response.finish_reasons': [response.finish_reason]
      });
      span.setStatus({ code: SpanStatusCode.OK });
      return response.text;
    } catch (error: any) {
      // Attach the exception to the span and mark the span as failed
      span.recordException(error);
      span.setStatus({
        code: SpanStatusCode.ERROR,
        message: error.message
      });
      throw error;
    } finally {
      // Always end the span, on success or failure
      span.end();
    }
  });
}
What to Watch Next
- Framework Integrations: Watch for native OpenTelemetry GenAI exporters inside orchestrators like LangGraph, CrewAI, AutoGen, and Semantic Kernel. Native integration will allow developers to enable tracing with one environment variable instead of writing custom wrappers.
- Observability Vendor Features: Trace-centric vendors like Datadog, Honeycomb, and Grafana are launching dedicated APM tabs for AI agents, leveraging these conventions to build dashboards for LLM usage, cost estimation, and tool latency.
- Model Context Protocol (MCP) Auto-Telemetry: The MCP community is working to include OTel context propagation in the MCP protocol specification, which would automate tracing across any MCP server-client interaction.
Source
OpenTelemetry Semantic Conventions for GenAI Operations (2026)
Additional information: OpenTelemetry GenAI Spans Documentation