Skip to content

fix(with_columns): require exactly one of pass_dataframe_as / columns_to_pass / on_input#1654

Open
Hore01 wants to merge 1 commit into
apache:mainfrom
Hore01:fix/with-columns-arg-validation
Open

fix(with_columns): require exactly one of pass_dataframe_as / columns_to_pass / on_input#1654
Hore01 wants to merge 1 commit into
apache:mainfrom
Hore01:fix/with-columns-arg-validation

Conversation

@Hore01

@Hore01 Hore01 commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

with_columns_base.__init__ is meant to enforce mutual exclusivity between pass_dataframe_as, columns_to_pass, and on_input — the docstring on each of those parameters says so. The current check raises only when exactly two are set:

if (
    int(pass_dataframe_as is None) + int(columns_to_pass is None) + int(on_input is None)
    == 1
):
    raise ValueError("You must specify only one of ...")

So a caller who specifies all three silently passes the validation and crashes later with a confusing downstream error. A caller who specifies zero of them also silently passes.

Args set sum(is None) currently raises? should raise?
0 3 no yes
1 2 no no
2 1 yes yes
3 0 no yes

This PR replaces the boolean arithmetic with the intuitive form:

n_set = sum(arg is not None for arg in (pass_dataframe_as, columns_to_pass, on_input))
if n_set != 1:
    raise ValueError("You must specify exactly one of ...")

The error message opener is also tightened: "only one of""exactly one of".

Tests

Four new tests in tests/function_modifiers/test_recursive.py, exercising the base class directly via a minimal inline subclass:

  • 0 args set → ValueError
  • 2 args set → ValueError (regression of the case that was already caught)
  • 3 args set → ValueError (new — was the primary bug)
  • 1 arg set → no error (positive case across all three options)

Validation lives in the shared base class so no parallel h_polars / h_spark tests are necessary. Testing against with_columns_base directly (rather than e.g. h_pandas.with_columns) is necessary because some subclasses pre-reject certain arg combinations before delegating to the base.

Behaviour change

Callers passing 0 or 3 of the mutually-exclusive args used to silently pass the decoration step and crash later with confusing errors deep in the DAG-build path. After this PR they get a clean ValueError at decoration time. This is the only intentional behavioural change. The documented contract has always been mutual exclusivity, so no caller should be relying on the silent-pass behaviour — but flagging the change explicitly so reviewers aren't surprised.

Signed-off-by: Olajumoke Akinremi 106763970+Hore01@users.noreply.github.com

…_to_pass / on_input

Signed-off-by: Olajumoke Akinremi <106763970+Hore01@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant