fix(tracing): apply environments to processor spans#1727
Conversation
|
@claude review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 89f96b34ef
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ): | ||
| return | ||
|
|
||
| attributes[LangfuseOtelSpanAttributes.ENVIRONMENT] = environment |
There was a problem hiding this comment.
Avoid applying one client's environment to other projects
When two Langfuse clients with different public keys/environments share the default TracerProvider, every registered LangfuseSpanProcessor receives on_start for every span. The first processor now writes its own environment before the Langfuse wrapper/project filter runs, and the owning processor then preserves that already-present langfuse.environment, so spans created by the second project can be exported with the first project's environment. This needs to be gated to spans owned by this processor/project or deferred until export filtering can determine ownership.
Useful? React with 👍 / 👎.
| def _apply_default_environment( | ||
| self, *, span: Span, attributes: Dict[str, Any] | ||
| ) -> None: | ||
| """Apply the processor environment to spans without an explicit environment. | ||
|
|
||
| Langfuse-created wrapper spans set ``langfuse.environment`` themselves, and | ||
| ``propagate_attributes(environment=...)`` adds it to the active context for | ||
| request-scoped overrides. Third-party OpenTelemetry spans only pass through | ||
| this processor, so they need the client-level environment applied here when | ||
| neither of those more specific sources is present. | ||
| """ | ||
|
|
||
| if LangfuseOtelSpanAttributes.ENVIRONMENT in attributes: | ||
| return | ||
|
|
||
| environment = getattr(self, "_environment", None) | ||
| if environment is None: | ||
| return | ||
|
|
||
| if ( | ||
| get_string_span_attribute(span, LangfuseOtelSpanAttributes.ENVIRONMENT) | ||
| is not None | ||
| ): | ||
| return | ||
|
|
There was a problem hiding this comment.
🔴 In multi-Langfuse-client setups (two Langfuse(...) instances with different public_keys and different environments on the same TracerProvider), the new _apply_default_environment in the first-registered processor's on_start writes its own langfuse.environment onto every span before the second processor runs, then the second processor's get_string_span_attribute short-circuit bails out. The wrapper at langfuse/_client/span.py:134-143 then reads that leaked value via existing_environment or environment or client._environment and re-stamps it, so client_b's exporter ships spans labeled with client_a's environment. Fix: gate _apply_default_environment to skip when is_langfuse_span(span) but not _is_langfuse_project_span(span) — mirroring the isolation already done in on_end and _is_expected_exported_at_start.
Extended reasoning...
What the bug is
This PR introduces _apply_default_environment in LangfuseSpanProcessor.on_start to stamp the client-level langfuse.environment onto third-party OTel spans that do not already carry one. The implementation has no scope check: it runs on every span the TracerProvider sees, including spans created by other Langfuse clients sharing the same global TracerProvider. In multi-client setups this leaks one tenant's environment onto another tenant's spans before any project-scoping filter has a chance to run.
Why multi-client setups share a TracerProvider
LangfuseResourceManager._instances is keyed by public_key, so two Langfuse(public_key=...) calls produce two distinct resource managers. But _init_tracer_provider only constructs a new TracerProvider when otel_trace_api.get_tracer_provider() is a ProxyTracerProvider — the second client falls into the else branch and reuses the first client's provider, then calls tracer_provider.add_span_processor to attach its own LangfuseSpanProcessor. Both processors are now registered on the same provider. This is explicitly supported (see the 'multi-project setups' comment at span_processor.py:198 and TestMultiProjectSetup in tests/unit/test_otel.py).
Step-by-step proof
Setup:
client_a = Langfuse(public_key='pk-A', secret_key='sk-A', environment='env-a')
client_b = Langfuse(public_key='pk-B', secret_key='sk-B', environment='env-b')
with client_b.start_as_current_observation(name='work') as span:
...client_b._otel_tracer.start_as_current_spancreates the span. The OTel SDK'sSynchronousMultiSpanProcessor.on_startiterates processors in registration order and calls each one synchronously, beforestart_as_current_spanreturns.- proc_A (env-a) runs first.
_get_propagated_attributes_from_contextreturns an empty dict (nopropagate_attributesactive)._apply_default_environmentchecks: (a)ENVIRONMENTnot inattributes✓, (b)self._environment = 'env-a'is not None ✓, (c)get_string_span_attribute(span, ENVIRONMENT)returnsNonebecause nothing has written to the span yet ✓ — so it setsattributes[ENVIRONMENT] = 'env-a'. Thenon_startrunsif propagated_attributes: span.set_attributes(propagated_attributes), writinglangfuse.environment='env-a'onto the span synchronously. - proc_B (env-b) runs second on the same span. Its own
attributesdict is still empty, butget_string_span_attribute(span, ENVIRONMENT)now returns'env-a'(just written by proc_A), so_apply_default_environmentearly-returns.env-bis never applied. start_as_current_spanreturns. The wrapperLangfuseObservationWrapper.__init__atlangfuse/_client/span.py:134-143readsexisting_environment = get_string_span_attribute(self._otel_span, ENVIRONMENT)→'env-a'. It then short-circuitsself._environment = existing_environment or environment or self._langfuse_client._environmentto'env-a'and re-stamps that onto the span.on_end:_is_langfuse_project_spancorrectly rejects on proc_A (instrumentation-scopepublic_key='pk-B'≠'pk-A'), but proc_B accepts and exports the span to client B's backend — withlangfuse.environment='env-a', the other tenant's value.
Why existing safeguards don't catch this
_is_langfuse_project_span is only consulted in on_end and _is_expected_exported_at_start. The contamination happens at on_start before any project filtering runs. The Resource-level langfuse.environment set by _init_tracer_provider is also locked to the first-initialized client (only the first call creates a new TracerProvider with a Resource), so the per-span write was the only mechanism left to carry client B's environment — and the PR breaks that mechanism.
Pre-PR behavior was correct
Before this PR, nothing wrote langfuse.environment during on_start. existing_environment in the wrapper was None, so the short-circuit resolved to client._environment = 'env-b', correctly. This is a regression introduced by this PR specifically.
Worse for third-party spans
For non-Langfuse third-party spans (e.g. opentelemetry.instrumentation.openai), is_langfuse_span(span) returns False, so the on_end project filter is skipped entirely and both processors export the span. Both exported copies carry langfuse.environment='env-a' (whichever processor was registered first wins). This contradicts the PR's stated intent of applying each client's environment to its own third-party spans.
How to fix
Mirror the isolation already done in on_end and _is_expected_exported_at_start: skip _apply_default_environment when is_langfuse_span(span) but not _is_langfuse_project_span(span). This leaves Langfuse-created wrapper spans for their owning processor (the wrapper already applies the env correctly when nothing else has) and leaves third-party spans to whichever processor genuinely owns the surrounding trace context. Equivalently, narrow the apply to non-Langfuse spans only (which matches the docstring's stated intent — "Third-party OpenTelemetry spans only pass through this processor").
Summary
langfuse.environmentVerification
uv run --frozen pytest tests/unit/test_span_processor.py tests/unit/test_propagate_attributes.py::TestPropagateAttributesEnvironmentuv run --frozen ruff format --check langfuse/_client/span_processor.py langfuse/_client/resource_manager.py tests/unit/test_span_processor.pyuv run --frozen ruff check .uv run --frozen mypy langfuse --no-error-summaryGreptile Summary
This PR passes the client-level
environmentstring intoLangfuseSpanProcessorand applies it as a default to third-party OpenTelemetry spans that do not already carry an explicit or propagatedlangfuse.environmentattribute.span_processor.py— addsenvironment: Optional[str]to the constructor and a new_apply_default_environmenthelper that mutates the freshly-createdpropagated_attributesdict before it is applied to the span, respecting the priority order: explicit span attribute > baggage-propagated > processor default.resource_manager.py— threads the existingenvironmentvalue through to the processor constructor with a single-line addition.test_span_processor.py— three new focused tests cover the default, propagated-override, and explicit-attribute cases using a minimal in-memory exporter harness.Confidence Score: 4/5
Safe to merge — the change is narrowly scoped to defaulting the environment on third-party spans, with no mutations to shared state or existing attribute-setting paths.
The implementation correctly handles all three environment-priority cases and the tests exercise each one. The only finding is a cosmetic getattr that could silently swallow a future typo in the attribute name instead of raising an error.
No files require special attention.
Prompt To Fix All With AI
Reviews (1): Last reviewed commit: "fix(tracing): apply environments to proc..." | Re-trigger Greptile