Skip to content

fix(tracing): apply environments to processor spans#1727

Closed
hassiebp wants to merge 1 commit into
mainfrom
hassiebbot/processor-environment-fallback
Closed

fix(tracing): apply environments to processor spans#1727
hassiebp wants to merge 1 commit into
mainfrom
hassiebbot/processor-environment-fallback

Conversation

@hassiebp

@hassiebp hassiebp commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • pass the client environment into the Langfuse span processor
  • apply that environment to third-party OpenTelemetry spans that do not already have a propagated or explicit langfuse.environment
  • add processor tests for default, propagated, and explicit third-party span environments

Verification

  • uv run --frozen pytest tests/unit/test_span_processor.py tests/unit/test_propagate_attributes.py::TestPropagateAttributesEnvironment
  • uv run --frozen ruff format --check langfuse/_client/span_processor.py langfuse/_client/resource_manager.py tests/unit/test_span_processor.py
  • uv run --frozen ruff check .
  • uv run --frozen mypy langfuse --no-error-summary

Greptile Summary

This PR passes the client-level environment string into LangfuseSpanProcessor and applies it as a default to third-party OpenTelemetry spans that do not already carry an explicit or propagated langfuse.environment attribute.

  • span_processor.py — adds environment: Optional[str] to the constructor and a new _apply_default_environment helper that mutates the freshly-created propagated_attributes dict before it is applied to the span, respecting the priority order: explicit span attribute > baggage-propagated > processor default.
  • resource_manager.py — threads the existing environment value through to the processor constructor with a single-line addition.
  • test_span_processor.py — three new focused tests cover the default, propagated-override, and explicit-attribute cases using a minimal in-memory exporter harness.

Confidence Score: 4/5

Safe to merge — the change is narrowly scoped to defaulting the environment on third-party spans, with no mutations to shared state or existing attribute-setting paths.

The implementation correctly handles all three environment-priority cases and the tests exercise each one. The only finding is a cosmetic getattr that could silently swallow a future typo in the attribute name instead of raising an error.

No files require special attention.

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
langfuse/_client/span_processor.py:183-185
`getattr(self, "_environment", None)` is unnecessary here — `_environment` is unconditionally assigned in `__init__`, so a missing attribute can only happen if something has gone very wrong. Using `self._environment` directly is clearer and would surface a real `AttributeError` rather than silently returning `None` if the attribute name were ever misspelled.

```suggestion
        environment = self._environment
        if environment is None:
            return
```

Reviews (1): Last reviewed commit: "fix(tracing): apply environments to proc..." | Re-trigger Greptile

@github-actions

Copy link
Copy Markdown

@claude review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89f96b34ef

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

):
return

attributes[LangfuseOtelSpanAttributes.ENVIRONMENT] = environment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid applying one client's environment to other projects

When two Langfuse clients with different public keys/environments share the default TracerProvider, every registered LangfuseSpanProcessor receives on_start for every span. The first processor now writes its own environment before the Langfuse wrapper/project filter runs, and the owning processor then preserves that already-present langfuse.environment, so spans created by the second project can be exported with the first project's environment. This needs to be gated to spans owned by this processor/project or deferred until export filtering can determine ownership.

Useful? React with 👍 / 👎.

@hassiebp hassiebp closed this Jun 25, 2026
Comment on lines +168 to +192
def _apply_default_environment(
self, *, span: Span, attributes: Dict[str, Any]
) -> None:
"""Apply the processor environment to spans without an explicit environment.

Langfuse-created wrapper spans set ``langfuse.environment`` themselves, and
``propagate_attributes(environment=...)`` adds it to the active context for
request-scoped overrides. Third-party OpenTelemetry spans only pass through
this processor, so they need the client-level environment applied here when
neither of those more specific sources is present.
"""

if LangfuseOtelSpanAttributes.ENVIRONMENT in attributes:
return

environment = getattr(self, "_environment", None)
if environment is None:
return

if (
get_string_span_attribute(span, LangfuseOtelSpanAttributes.ENVIRONMENT)
is not None
):
return

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 In multi-Langfuse-client setups (two Langfuse(...) instances with different public_keys and different environments on the same TracerProvider), the new _apply_default_environment in the first-registered processor's on_start writes its own langfuse.environment onto every span before the second processor runs, then the second processor's get_string_span_attribute short-circuit bails out. The wrapper at langfuse/_client/span.py:134-143 then reads that leaked value via existing_environment or environment or client._environment and re-stamps it, so client_b's exporter ships spans labeled with client_a's environment. Fix: gate _apply_default_environment to skip when is_langfuse_span(span) but not _is_langfuse_project_span(span) — mirroring the isolation already done in on_end and _is_expected_exported_at_start.

Extended reasoning...

What the bug is

This PR introduces _apply_default_environment in LangfuseSpanProcessor.on_start to stamp the client-level langfuse.environment onto third-party OTel spans that do not already carry one. The implementation has no scope check: it runs on every span the TracerProvider sees, including spans created by other Langfuse clients sharing the same global TracerProvider. In multi-client setups this leaks one tenant's environment onto another tenant's spans before any project-scoping filter has a chance to run.

Why multi-client setups share a TracerProvider

LangfuseResourceManager._instances is keyed by public_key, so two Langfuse(public_key=...) calls produce two distinct resource managers. But _init_tracer_provider only constructs a new TracerProvider when otel_trace_api.get_tracer_provider() is a ProxyTracerProvider — the second client falls into the else branch and reuses the first client's provider, then calls tracer_provider.add_span_processor to attach its own LangfuseSpanProcessor. Both processors are now registered on the same provider. This is explicitly supported (see the 'multi-project setups' comment at span_processor.py:198 and TestMultiProjectSetup in tests/unit/test_otel.py).

Step-by-step proof

Setup:

client_a = Langfuse(public_key='pk-A', secret_key='sk-A', environment='env-a')
client_b = Langfuse(public_key='pk-B', secret_key='sk-B', environment='env-b')

with client_b.start_as_current_observation(name='work') as span:
    ...
  1. client_b._otel_tracer.start_as_current_span creates the span. The OTel SDK's SynchronousMultiSpanProcessor.on_start iterates processors in registration order and calls each one synchronously, before start_as_current_span returns.
  2. proc_A (env-a) runs first. _get_propagated_attributes_from_context returns an empty dict (no propagate_attributes active). _apply_default_environment checks: (a) ENVIRONMENT not in attributes ✓, (b) self._environment = 'env-a' is not None ✓, (c) get_string_span_attribute(span, ENVIRONMENT) returns None because nothing has written to the span yet ✓ — so it sets attributes[ENVIRONMENT] = 'env-a'. Then on_start runs if propagated_attributes: span.set_attributes(propagated_attributes), writing langfuse.environment='env-a' onto the span synchronously.
  3. proc_B (env-b) runs second on the same span. Its own attributes dict is still empty, but get_string_span_attribute(span, ENVIRONMENT) now returns 'env-a' (just written by proc_A), so _apply_default_environment early-returns. env-b is never applied.
  4. start_as_current_span returns. The wrapper LangfuseObservationWrapper.__init__ at langfuse/_client/span.py:134-143 reads existing_environment = get_string_span_attribute(self._otel_span, ENVIRONMENT)'env-a'. It then short-circuits self._environment = existing_environment or environment or self._langfuse_client._environment to 'env-a' and re-stamps that onto the span.
  5. on_end: _is_langfuse_project_span correctly rejects on proc_A (instrumentation-scope public_key='pk-B''pk-A'), but proc_B accepts and exports the span to client B's backend — with langfuse.environment='env-a', the other tenant's value.

Why existing safeguards don't catch this

_is_langfuse_project_span is only consulted in on_end and _is_expected_exported_at_start. The contamination happens at on_start before any project filtering runs. The Resource-level langfuse.environment set by _init_tracer_provider is also locked to the first-initialized client (only the first call creates a new TracerProvider with a Resource), so the per-span write was the only mechanism left to carry client B's environment — and the PR breaks that mechanism.

Pre-PR behavior was correct

Before this PR, nothing wrote langfuse.environment during on_start. existing_environment in the wrapper was None, so the short-circuit resolved to client._environment = 'env-b', correctly. This is a regression introduced by this PR specifically.

Worse for third-party spans

For non-Langfuse third-party spans (e.g. opentelemetry.instrumentation.openai), is_langfuse_span(span) returns False, so the on_end project filter is skipped entirely and both processors export the span. Both exported copies carry langfuse.environment='env-a' (whichever processor was registered first wins). This contradicts the PR's stated intent of applying each client's environment to its own third-party spans.

How to fix

Mirror the isolation already done in on_end and _is_expected_exported_at_start: skip _apply_default_environment when is_langfuse_span(span) but not _is_langfuse_project_span(span). This leaves Langfuse-created wrapper spans for their owning processor (the wrapper already applies the env correctly when nothing else has) and leaves third-party spans to whichever processor genuinely owns the surrounding trace context. Equivalently, narrow the apply to non-Langfuse spans only (which matches the docstring's stated intent — "Third-party OpenTelemetry spans only pass through this processor").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant