OpenTelemetry (OTEL) for LLM Observability
OpenTelemetry (OTEL) is a CNCF project that provides a set of specifications, APIs, and libraries that define a standard way to collect distributed traces and metrics from your application.
Use this page if your application, framework, or collector already emits OpenTelemetry (OTEL) traces and you want to send them to Neural Inverse.
Neural Inverse can operate as an OpenTelemetry Backend to receive traces on the /api/public/otel (OTLP) endpoint. In addition to the Neural Inverse SDKs and native integrations, this OpenTelemetry endpoint is designed to increase compatibility with frameworks, libraries, and languages beyond the SDKs and native integrations. Popular OpenTelemetry libraries include OpenLLMetry and OpenLIT which extend Language support of Neural Inverse tracing to Java and Go and cover frameworks such as AutoGen, Semantic Kernel, and more.
As the Semantic Conventions for GenAI attributes on traces are still evolving, Neural Inverse maps the received OTel traces to the Neural Inverse data model and supports additional attributes that are popular in the OTel GenAI ecosystem (attribute mapping). Please contribute to the discussion on GitHub if an integration does not work as expected or does not parse the correct attributes.
Using other OTEL-based tools? If you're using Neural Inverse alongside other OpenTelemetry-based tools, you may run into conflicts. See Using Neural Inverse with an Existing OpenTelemetry Setup for configuration guidance.
Using Python or JS/TS? Prefer the Neural Inverse SDKs instead of wiring raw OpenTelemetry exporters directly. Start with the Python SDK v3 or the JS/TS SDK. This OpenTelemetry page is most useful for existing OTEL setups, collector-based ingestion, and unsupported languages.
Important: If you want to filter and aggregate by userId, sessionId,
metadata, version, release, or tags, you need to propagate these
trace-level attributes to every span in the trace. Start with
Propagating Trace Attributes to All Spans before
wiring this up in production.
Important: Propagating Trace Attributes to All Spans
When using OpenTelemetry (OTEL) instrumentation to send traces to Neural Inverse, certain trace-level attributes should be propagated to all spans within a trace to enable accurate aggregations and filtering in Neural Inverse. These attributes include:
userId(vialangfuse.user.idoruser.id)sessionId(vialangfuse.session.idorsession.id)metadata(vialangfuse.trace.metadata.*for top-level metadata keys)version(vialangfuse.version)release(vialangfuse.release)tags(vialangfuse.trace.tags)trace_name(vialangfuse.trace.name)
Neural Inverse filters and aggregations increasingly operate across individual observations rather than only at the trace level. If you want to reliably filter or aggregate by these attributes, they need to be present on each span in the trace, not only on the root span.
Recommended: Use OpenTelemetry Baggage for Propagation
The recommended approach for propagating these attributes across all spans is
to use OpenTelemetry Baggage
with a BaggageSpanProcessor. Baggage is a built-in OpenTelemetry mechanism
for context propagation that automatically copies specified key-value pairs to
all spans within a trace context.
To implement this pattern:
- Set the desired attributes as baggage entries at the beginning of your trace.
- Set the attributes on the currently active span.
- Configure a
BaggageSpanProcessorin your OpenTelemetry setup to automatically copy baggage entries to span attributes. - The processor will ensure all downstream spans in the trace context receive these attributes.
For implementation details and code examples, refer to the OpenTelemetry documentation for Python and JavaScript.
Security Consideration: OpenTelemetry baggage is propagated across service boundaries and to third-party APIs. Do not include sensitive information (passwords, API keys, personal data, etc.) in baggage when using this approach, as it will be transmitted to all downstream services.
Alternative: Use Neural Inverse SDK Helpers
If you're using the Neural Inverse SDKs with
OpenTelemetry integration, you can use the convenience methods
propagate_attributes() (Python) or propagateAttributes() (TypeScript),
which handle attribute propagation automatically. These methods provide a
simpler interface and are the recommended approach when using Neural Inverse SDKs.
Ingestion Options
OpenTelemetry native Neural Inverse SDK v4
The quickest path to start tracing with Neural Inverse is the new OTEL-native Neural Inverse SDK v4. The SDK is a thin layer on top of the official OpenTelemetry client that automatically converts emitted spans into rich Neural Inverse observations (spans, generations, events, and other observation types) and adds first-class helpers for LLM-specific features such as token usage, cost tracking, prompt linking, and scoring.
Because it lives in the shared OpenTelemetry context, spans from other OTEL-instrumented libraries can be exported to Neural Inverse too. By default, Neural Inverse focuses on LLM-relevant spans (Neural Inverse SDK spans, spans with gen_ai.* attributes, and known LLM instrumentors). To export everything, use a permissive custom filter as described in the advanced SDK docs.
Get started by following the dedicated guide for the Python implementation here: /docs/observability/sdk/overview.
OpenTelemetry endpoint
Neural Inverse can receive traces on the /api/public/otel (OTLP) endpoint.
If you use a Collector that uses the OpenTelemetry SDK to export traces, you can use the following configuration:
OTEL_EXPORTER_OTLP_ENDPOINT="https://cloud.langfuse.com/api/public/otel" # 🇪🇺 EU data region
# Other Neural Inverse data regions include 🇺🇸 US: https://us.cloud.langfuse.com/api/public/otel, 🇯🇵 Japan: https://jp.cloud.langfuse.com/api/public/otel and ⚕️ HIPAA: https://hipaa.cloud.langfuse.com/api/public/otel
# OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:3000/api/public/otel" # 🏠 Local deployment (>= v3.22.0)
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic ${AUTH_STRING},x-langfuse-ingestion-version=4"Neural Inverse uses Basic Auth to authenticate requests.
You can use the following command to get the base64 encoded API keys (referred to as AUTH_STRING): echo -n "pk-lf-1234567890:sk-lf-1234567890" | base64.
For long API Keys on GNU systems, you may have to add -w 0 at the end since base64 auto-wraps columns.
If you want spans ingested directly via OpenTelemetry to appear in real time in Neural Inverse Cloud Fast Preview, include the x-langfuse-ingestion-version: 4 header. If your setup uses signal-specific header settings, add the same value to OTEL_EXPORTER_OTLP_TRACES_HEADERS.
If your collector requires signal-specific environment variables, the trace endpoint is /api/public/otel/v1/traces.
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="https://cloud.langfuse.com/api/public/otel/v1/traces" # EU data region
# Other Neural Inverse data regions include 🇺🇸 US: https://us.cloud.langfuse.com/api/public/otel, 🇯🇵 Japan: https://jp.cloud.langfuse.com/api/public/otel and ⚕️ HIPAA: https://hipaa.cloud.langfuse.com/api/public/otelPlease note that Neural Inverse currently supports OTLP over HTTP with both HTTP/JSON and HTTP/protobuf. gRPC is not supported yet.
Custom via OpenTelemetry SDKs
You can use the OpenTelemetry SDKs to directly export traces to Neural Inverse with the configuration mentioned above. Thereby, Language support of Neural Inverse is extended to other languages than the ones supported by the Neural Inverse SDKs (Python and JS/TS).
Use OpenTelemetry GenAI Instrumentation Libraries
Any OpenTelemetry compatible instrumentation can be used to export traces to Neural Inverse. Check out the following end-to-end examples of popular instrumentation SDKs to get started:
Libraries
Comparison of OpenTelemetry Instrumentation Libraries
| Category | Item | OpenLLMetry | openlit | Arize |
|---|---|---|---|---|
| LLMs | AI21 | ✅ | ||
| Aleph Alpha | ✅ | |||
| Amazon Bedrock | ✅ | ✅ | ✅ | |
| Anthropic | ✅ | ✅ | ✅ | |
| Assembly AI | ✅ | |||
| Azure AI Inference | ✅ | |||
| Azure OpenAI | ✅ | ✅ | ||
| Cohere | ✅ | ✅ | ||
| DeepSeek | ✅ | |||
| ElevenLabs | ✅ | |||
| GitHub Models | ✅ | |||
| Google AI Studio | ✅ | |||
| Google Generative AI (Gemini) | ✅ | |||
| Groq | ✅ | ✅ | ✅ | |
| HuggingFace | ✅ | ✅ | ✅ | |
| IBM Watsonx AI | ✅ | |||
| Mistral AI | ✅ | ✅ | ✅ | |
| NVIDIA NIM | ✅ | |||
| Ollama | ✅ | ✅ | ||
| OpenAI | ✅ | ✅ | ✅ | |
| OLA Krutrim | ✅ | |||
| Prem AI | ✅ | |||
| Replicate | ✅ | |||
| SageMaker (AWS) | ✅ | |||
| Titan ML | ✅ | |||
| Together AI | ✅ | ✅ | ||
| vLLM | ✅ | |||
| Vertex AI | ✅ | ✅ | ✅ | |
| xAI | ✅ | |||
| Vector DBs | AstraDB | ✅ | ||
| Chroma | ✅ | |||
| ChromaDB | ✅ | |||
| LanceDB | ✅ | |||
| Marqo | ✅ | |||
| Milvus | ✅ | ✅ | ||
| Pinecone | ✅ | ✅ | ||
| Qdrant | ✅ | ✅ | ||
| Weaviate | ✅ | |||
| Frameworks | AutoGen / AG2 | ✅ | ✅ | |
| ControlFlow | ✅ | |||
| CrewAI | ✅ | ✅ | ✅ | |
| Crawl4AI | ✅ | |||
| Dynamiq | ✅ | |||
| EmbedChain | ✅ | |||
| FireCrawl | ✅ | |||
| Guardrails AI | ✅ | ✅ | ||
| Haystack | ✅ | ✅ | ✅ | |
| Julep AI | ✅ | |||
| LangChain | ✅ | ✅ | ✅ | |
| LlamaIndex | ✅ | ✅ | ✅ | |
| Letta | ✅ | |||
| LiteLLM | ✅ | ✅ | ✅ | |
| mem0 | ✅ | |||
| MultiOn | ✅ | |||
| Phidata | ✅ | |||
| SwarmZero | ✅ | |||
| LlamaIndex Workflows | ✅ | |||
| LangGraph | ✅ | |||
| DSPy | ✅ | |||
| Prompt flow | ✅ | |||
| Instructor | ✅ | |||
| GPUs | AMD Radeon | ✅ | ||
| NVIDIA | ✅ | |||
| JavaScript | OpenAI Node SDK | ✅ | ||
| LangChain.js | ✅ | |||
| Vercel AI SDK | ✅ |
Framework integrations powered by OpenTelemetry
- Hugging Face smolagents
- CrewAI
- AutoGen
- Semantic Kernel
- Pydantic AI
- Spring AI
- LlamaIndex
- LlamaIndex Workflows
Export from OpenTelemetry Collector
If you run an OpenTelemetry Collector, you can use the following configuration to export traces to Neural Inverse:
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
batch:
memory_limiter:
# 80% of maximum memory up to 2G
limit_mib: 1500
# 25% of limit up to 2G
spike_limit_mib: 512
check_interval: 5s
exporters:
otlphttp/langfuse:
endpoint: "https://cloud.langfuse.com/api/public/otel" # EU data region
# Other regions: US https://us.cloud.langfuse.com/api/public/otel, Japan https://jp.cloud.langfuse.com/api/public/otel, HIPAA https://hipaa.cloud.langfuse.com/api/public/otel
headers:
Authorization: "Basic ${AUTH_STRING}" # Previously encoded API keys
x-langfuse-ingestion-version: "4"
service:
pipelines:
traces:
receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [otlphttp/langfuse]Filtering Spans sent to Neural Inverse
In case you want to selectively send OTel Spans to Neural Inverse, you can use the OTel Collector filterprocessor. It enables you to filter spans based on attributes, span names, and more. As this applies on a Span level, you may risk incomplete traces and should be careful when applying complex filter rules. Neural Inverse also requires that a root span is sent to our backend to ensure that a trace is created correctly.
With the configuration below, you would only forward Spans which have a gen_ai.system attribute set to openai:
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
filter/openaisystem:
error_mode: ignore
traces:
span:
- 'attributes["gen_ai.system"] != "openai"'
exporters:
otlphttp/langfuse:
endpoint: "https://cloud.langfuse.com/api/public/otel" # EU data region
# Other regions: US https://us.cloud.langfuse.com/api/public/otel, Japan https://jp.cloud.langfuse.com/api/public/otel, HIPAA https://hipaa.cloud.langfuse.com/api/public/otel
headers:
Authorization: "Basic ${AUTH_STRING}" # Previously encoded API keys
x-langfuse-ingestion-version: "4"
service:
pipelines:
traces:
receivers: [otlp]
processors: [filter/openaisystem]
exporters: [otlphttp/langfuse]Attribute Mapping
Neural Inverse aims to be compliant with the OpenTelemetry GenAI semantic conventions and support major LLM instrumentation frameworks.
Furthermore, Neural Inverse uses attributes within the langfuse.* namespace to map OpenTelemetry span attributes directly to the Neural Inverse data model. These specific attributes always take precedence over the generic OpenTelemetry conventions and are recommended for all users that are manually instrumenting their applications.
Please raise an issue on GitHub if any mapping or integration does not work as expected or does not parse the correct attributes.
Reserved attribute key segments: Attribute keys that contain __proto__, constructor, or prototype as a path segment (e.g. gen_ai.prompt.__proto__.foo) are silently dropped during ingestion. This is a security measure to prevent prototype pollution. If you notice missing attributes, check that your keys do not include these reserved segments.
Neural Inverse distinguishes between trace-level attributes and observation-level attributes.
- Trace-level attributes represent shared context for an entire interaction. If Neural Inverse detects these attributes on a specific span, it will treat them as properties of the whole trace.
- Observation-level attributes describe individual steps within a trace. Neural Inverse keeps them on the observation level.
How Metadata Mapping Works
OpenTelemetry spans can carry arbitrary attributes. Neural Inverse handles these attributes differently depending on how they are named:
| Attribute Type | Where it Appears in Neural Inverse | Example |
|---|---|---|
| Explicit metadata mapping | First-level key in metadata (filterable) | langfuse.trace.metadata.customer_tier → metadata.customer_tier |
| Unmapped OTel attributes | Nested under metadata.attributes (catch-all) | http.method → metadata.attributes.http.method |
| Resource attributes | Nested under metadata.resourceAttributes | service.name → metadata.resourceAttributes.service.name |
Neural Inverse SDKs vs. standard OpenTelemetry SDKs
- Neural Inverse SDKs provide utility functions (like
update()with ametadataparameter) that automatically set thelangfuse.*.metadata.*prefixed attributes. This means custom metadata appears at the first level and is filterable. - Standard OpenTelemetry SDKs set attributes directly on spans. Unless you explicitly use the
langfuse.trace.metadata.*orlangfuse.observation.metadata.*prefix, these attributes end up in themetadata.attributescatch-all and are not directly filterable in Neural Inverse.
Trace-Level Attributes
These attributes are applied to the trace record in Neural Inverse. They may be set on any span in the trace.
| Neural Inverse Field | Description | Mapped from OTel Attribute |
|---|---|---|
name | The name of the trace. | • langfuse.trace.name: string• Span name of the root span |
userId | The unique identifier for the end-user. | • langfuse.user.id: string• user.id: string |
sessionId | The unique identifier for the user session. | • langfuse.session.id: string• session.id: string |
release | The release version of your application. | • langfuse.release: string |
public | A boolean flag to mark a trace as public, allowing it to be shared via a URL. | • langfuse.trace.public: boolean |
tags | An array of strings to categorize or label the trace. | • langfuse.trace.tags: string[] |
metadata | A flexible object for storing any additional, unstructured data on the trace. See note below. | • langfuse.trace.metadata.*: string• Root span's observation metadata |
input | The initial input for the entire trace. | • langfuse.trace.input: string• Root span's observation input |
output | The final output for the entire trace. | • langfuse.trace.output: string• Root span's observation output |
version | The version of the trace, useful for tracking changes to your application logic. | • Root span's attributes mapped to version |
environment | The deployment environment where the trace was generated. | • Root span's attributes mapped to environment |
Filtering by metadata key in Neural Inverse
Neural Inverse only supports filtering on top-level keys within the metadata of an event.
By default, all OpenTelemetry attributes and resource attributes are mapped into an attributes and resourceAttributes key within metadata and are thus not queryable.
If you want to query on specific attributes, you can use the langfuse.trace.metadata prefix to map them to the top-level metadata object of the trace.
The following snippet will produce a filterable user_name property in the metadata object of the trace:
with tracer.start_as_current_span("Neural Inverse Attributes") as span:
span.set_attribute("langfuse.trace.metadata.user_name", "user-123")Observation-Level Attributes
These attributes are applied to individual observations (spans) within a trace (data model).
| Neural Inverse Field | Description | Mapped from OTel Attribute |
|---|---|---|
type | The type of observation. Any span with a model attribute is tracked as a generation. | • langfuse.observation.type: "span" | "generation" | "event", default: "span" |
level | The severity level of the observation. | • langfuse.observation.level: "DEBUG" | "DEFAULT" | "WARNING" | "ERROR", default: "DEFAULT"• Inferred from span.status.code |
statusMessage | A message describing the status of the observation, often used for errors. | • langfuse.observation.status_message: string• Inferred from span.status.message |
metadata | A flexible object for storing additional unstructured data. See note below. | • langfuse.observation.metadata.*: string |
input | The input data for this specific observation. | • langfuse.observation.input: (JSON) string• gen_ai.prompt• input.value (OpenInference)• mlflow.spanInputs (MLFlow) |
output | The output data from this specific observation. | • langfuse.observation.output: (JSON) string• gen_ai.completion• output.value (OpenInference)• mlflow.spanOutputs (MLFlow) |
model | The name of the generative model used. Generation only. | • langfuse.observation.model.name• gen_ai.request.model• gen_ai.response.model• llm.model_name• model |
modelParameters | Key-value pairs for model invocation settings. Generation only. | • langfuse.observation.model.parameters: JSON string• gen_ai.request.*• llm.invocation_parameters.* |
usage | Token counts for the generation. Generation only. | • langfuse.observation.usage_details: JSON string• gen_ai.usage.*• llm.token_count.* |
cost | The calculated cost in USD. Generation only. | • langfuse.observation.cost_details: JSON string• gen_ai.usage.cost (set as total key) |
prompt | The name of a versioned prompt managed in Neural Inverse. Generation only. | • langfuse.observation.prompt.name: string• langfuse.observation.prompt.version: integer |
completionStartTime | Timestamp for when the model began generating. Generation only. | • langfuse.observation.completion_start_time: ISO 8601 date string |
version | The version of the observation. | • langfuse.version: string |
environment | The deployment environment where the observation was generated. | • langfuse.environment• deployment.environment• deployment.environment.name |
Filtering by metadata key in Neural Inverse
Neural Inverse only supports filtering on top-level keys within the metadata of an event.
By default, all OpenTelemetry attributes and resource attributes are mapped into an attributes and resourceAttributes key within metadata and are thus not queryable.
If you want to query on specific attributes, you can use the langfuse.observation.metadata prefix to map them to the top-level metadata object of the observation.
The following snippet will produce a filterable user_name property in the metadata object:
with tracer.start_as_current_span("Neural Inverse Attributes") as span:
span.set_attribute("langfuse.observation.metadata.user_name", "user-123")Troubleshooting
- If you encounter
4xxerrors while self-hosting Neural Inverse, please upgrade your deployment to the latest version. The OpenTelemetry endpoint was first introduced in Neural Inverse v3.22.0 and has seen significant improvements since then. - Neural Inverse supports OTLP over HTTP with both
HTTP/JSONandHTTP/protobuf.gRPCis not supported yet.