Skip to content

feat(tracing): allow custom OpenTelemetry ID generators#1724

Merged
hassiebp merged 3 commits into
mainfrom
hassiebbot/otel-id-generator-option
Jun 24, 2026
Merged

feat(tracing): allow custom OpenTelemetry ID generators#1724
hassiebp merged 3 commits into
mainfrom
hassiebbot/otel-id-generator-option

Conversation

@hassiebp

@hassiebp hassiebp commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Add an optional id_generator parameter to Langfuse for the OpenTelemetry TracerProvider that Langfuse creates.
  • Preserve default behavior by passing no custom generator unless the user provides one; OpenTelemetry still uses its default RandomIdGenerator when omitted.
  • Document the provider ownership boundary and add focused unit coverage for default and custom generator behavior.

Verification

  • uv run --frozen ruff check .
  • uv run --frozen mypy langfuse --no-error-summary
  • uv run --frozen pytest tests/unit/test_otel.py::TestOtelIdGeneration -q
  • uv run --frozen pytest tests/unit/test_resource_manager.py tests/unit/test_initialization.py -q

Greptile Summary

This PR adds an optional id_generator parameter to Langfuse so callers can plug in a custom OpenTelemetry IdGenerator when Langfuse creates its own TracerProvider; default behavior (random IDs) is preserved when the parameter is omitted.

  • id_generator is threaded through LangfuseLangfuseResourceManager_init_tracer_providerTracerProvider, with a clear ownership boundary: the parameter is only honoured when Langfuse creates a new provider; a warning is logged when a global provider is already registered.
  • get_client.py is updated to forward id_generator when reconstructing a client from an existing resource-manager singleton (safe because the singleton short-circuits re-initialization).
  • Two new unit tests cover the default and custom-generator paths, backed by a PredictableIdGenerator helper and the existing autouse OTel-reset fixture.

Confidence Score: 4/5

Safe to merge; the change is additive and the default path is unchanged.

The implementation is correct and well-tested. The one gap is that passing both an explicit tracer_provider and an id_generator silently drops the generator with no diagnostic, while the analogous globally-registered-provider case does warn. This is a usability inconsistency rather than a data or functional defect, but a user who misconfigures this combination will see no feedback.

The branch in _initialize_instance where tracer_provider is provided by the caller deserves a second look for the missing warning.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant User
    participant Langfuse
    participant LangfuseResourceManager
    participant _init_tracer_provider
    participant TracerProvider

    User->>Langfuse: "Langfuse(id_generator=custom_gen)"
    Langfuse->>LangfuseResourceManager: "__new__(public_key, id_generator=custom_gen)"
    LangfuseResourceManager->>LangfuseResourceManager: "_initialize_instance(id_generator=custom_gen)"
    
    alt tracer_provider arg is None
        LangfuseResourceManager->>_init_tracer_provider: "(id_generator=custom_gen)"
        alt Global provider is ProxyTracerProvider
            _init_tracer_provider->>TracerProvider: "TracerProvider(id_generator=custom_gen)"
            TracerProvider-->>_init_tracer_provider: new provider
            _init_tracer_provider-->>LangfuseResourceManager: new provider (custom_gen applied)
        else Global provider already registered
            _init_tracer_provider->>_init_tracer_provider: log warning (id_generator ignored)
            _init_tracer_provider-->>LangfuseResourceManager: existing global provider
        end
    else tracer_provider arg is provided
        Note over LangfuseResourceManager: id_generator silently ignored (no warning emitted)
        LangfuseResourceManager->>LangfuseResourceManager: "self.tracer_provider = arg"
    end

    LangfuseResourceManager-->>Langfuse: resource manager
    Langfuse-->>User: client ready
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant User
    participant Langfuse
    participant LangfuseResourceManager
    participant _init_tracer_provider
    participant TracerProvider

    User->>Langfuse: "Langfuse(id_generator=custom_gen)"
    Langfuse->>LangfuseResourceManager: "__new__(public_key, id_generator=custom_gen)"
    LangfuseResourceManager->>LangfuseResourceManager: "_initialize_instance(id_generator=custom_gen)"
    
    alt tracer_provider arg is None
        LangfuseResourceManager->>_init_tracer_provider: "(id_generator=custom_gen)"
        alt Global provider is ProxyTracerProvider
            _init_tracer_provider->>TracerProvider: "TracerProvider(id_generator=custom_gen)"
            TracerProvider-->>_init_tracer_provider: new provider
            _init_tracer_provider-->>LangfuseResourceManager: new provider (custom_gen applied)
        else Global provider already registered
            _init_tracer_provider->>_init_tracer_provider: log warning (id_generator ignored)
            _init_tracer_provider-->>LangfuseResourceManager: existing global provider
        end
    else tracer_provider arg is provided
        Note over LangfuseResourceManager: id_generator silently ignored (no warning emitted)
        LangfuseResourceManager->>LangfuseResourceManager: "self.tracer_provider = arg"
    end

    LangfuseResourceManager-->>Langfuse: resource manager
    Langfuse-->>User: client ready
Loading
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
langfuse/_client/resource_manager.py:251-258
**Silent `id_generator` drop when explicit `tracer_provider` is passed**

When a caller supplies both `tracer_provider=my_provider` and `id_generator=my_gen`, the `or`-short-circuit means `_init_tracer_provider` is never entered, so `id_generator` is stored on `self` but never wired up — no warning is emitted. The globally-registered-provider case (the `else` branch inside `_init_tracer_provider`) does emit a warning. Adding a symmetric guard directly in `_initialize_instance`, before the `or` expression, would give the same discoverability. For example, check `if tracer_provider is not None and id_generator is not None:` and log the same style of warning already used in the `else` branch.

Reviews (1): Last reviewed commit: "Merge branch 'main' into hassiebbot/otel..." | Re-trigger Greptile

@hassiebp hassiebp marked this pull request as ready for review June 24, 2026 07:52
@github-actions

Copy link
Copy Markdown

@claude review

@hassiebp hassiebp merged commit 688eb83 into main Jun 24, 2026
18 checks passed
@hassiebp hassiebp deleted the hassiebbot/otel-id-generator-option branch June 24, 2026 07:57

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — straightforward additive parameter for the OpenTelemetry ID generator, properly threaded through and well-tested.

Extended reasoning...

Overview

This PR adds an optional id_generator parameter to the Langfuse client constructor, threading it through LangfuseResourceManager and into _init_tracer_provider so callers can supply a custom OpenTelemetry IdGenerator when Langfuse creates its own TracerProvider. The parameter is also stored on the resource manager instance so _create_client_from_instance in get_client.py preserves it when reconstructing clients. Unit tests cover both the default (RandomIdGenerator) and a custom predictable generator case.

Security risks

None. The new parameter only affects ID generation semantics inside the local OpenTelemetry SDK. There is no auth, crypto, or data exposure surface touched.

Level of scrutiny

Low-to-moderate. The change is purely additive (new optional kwarg defaulting to None), preserves existing behavior in all code paths, and emits a clear warning when the generator is ignored because an OTel provider is already registered globally. The diff is small, follows the established pattern used by sibling parameters like tracer_provider and span_exporter, and the new tests directly exercise both branches.

Other factors

No outstanding reviewer comments, no prior review from me on this PR, the bug hunting system found nothing, and the author lists ruff/mypy/pytest verification in the PR description. The change is small enough and follows the existing conventions closely enough that human review is not strictly required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant