Skip to content

Conversation

@ogrisel
Copy link
Member

@ogrisel ogrisel commented Dec 5, 2025

This is a follow-up to #32840. This should detect most possible silent breaking in future refactoring of helper functions that are used by both in array API supporting and not supporting code in scikit-learn.

AI usage disclosure

I used AI assistance for:

  • Mostly bad code suggestions that I had to re-edit manually to get the code working correctly
  • Test/benchmark generation
  • Documentation (including examples)
  • Research and understanding

@ogrisel ogrisel added the CUDA CI label Dec 5, 2025
@github-actions github-actions bot removed the CUDA CI label Dec 5, 2025
@ogrisel ogrisel changed the title Fix make sure enabling array_api_dispatch=True does not break any estimator on NumPy inputs Fix make sure enabling array_api_dispatch=True does not break any estimator on NumPy inputs Dec 5, 2025
@ogrisel ogrisel moved this to In Progress in Array API Dec 5, 2025
@ogrisel ogrisel added the CUDA CI label Dec 5, 2025
@github-actions github-actions bot removed the CUDA CI label Dec 5, 2025
…entation in the memoryviewslices unwrap helper
@ogrisel
Copy link
Member Author

ogrisel commented Dec 8, 2025

cc @lucyleeow @OmarManzoor @betatim @virchan

@betatim
Copy link
Member

betatim commented Dec 9, 2025

Only just getting to this. I asked @lesteve about the release schedule and he'd like to get 1.8.0 done more or less "now". I think it would be nice to have this PR in, but it will take more than five minutes to review. For me it is ok to send 1.8.0 off without this merged.

I can understand that at some point you need to decide "the 1.8 boat has left, you are too late". Even though it makes me sad to ship it with a bug that we know about. Maybe having a small bug is ok (there are probably more that we haven't yet found) - we will find out if someone tried array API support because they'll report an issue ;)

Copy link
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @ogrisel

@ogrisel
Copy link
Member Author

ogrisel commented Dec 9, 2025

@betatim I think the release-critical bugfixes have already all been merged. This one can indeed wait and this is why I did not milestone it. We can take our time to review carefully, no hurry.

ogrisel and others added 2 commits December 9, 2025 15:37
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
@ogrisel ogrisel added the CUDA CI label Dec 9, 2025
@github-actions github-actions bot removed the CUDA CI label Dec 9, 2025
@OmarManzoor
Copy link
Contributor

@ogrisel There is a CI failure which seems unrelated to this PR but still seems like something erroneous

Copy link
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you @ogrisel

@ogrisel
Copy link
Member Author

ogrisel commented Dec 10, 2025

There is a CI failure which seems unrelated to this PR but still seems like something erroneous

Indeed, there is a missing rng seed. Let me open a dedicated PR.

@OmarManzoor OmarManzoor added Waiting for Second Reviewer First reviewer is done, need a second one! and removed Waiting for Reviewer labels Dec 10, 2025
Copy link
Member

@lucyleeow lucyleeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the other fixes here. Nits only

Whether to ignore None objects passed in arrays.
remove_types : tuple or list, default=(str,)
remove_types : tuple or list, (str, list, tuple)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
remove_types : tuple or list, (str, list, tuple)
remove_types : tuple or list, default=(str, list, tuple)

Whether to ignore None objects passed in arrays.
remove_types : tuple or list, default=(str,)
remove_types : tuple or list, (str, list, tuple)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
remove_types : tuple or list, (str, list, tuple)
remove_types : tuple or list, default=(str, list, tuple)

remove_none : bool, default=True
Whether to ignore None objects passed in arrays.
remove_types : tuple or list, default=(str,)
remove_types : tuple or list, (str, list, tuple)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
remove_types : tuple or list, (str, list, tuple)
remove_types : tuple or list, default=(str, list, tuple)

@skip_if_array_api_compat_not_configured
def test_get_namespace_ndarray_with_dispatch():
@pytest.mark.parametrize("X", [numpy.asarray([1, 2, 3]), [1, 2, 3], (1, 2, 3)])
def test_get_namespace_ndarray_with_dispatch(X):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, to match test above

Suggested change
def test_get_namespace_ndarray_with_dispatch(X):
def test_get_namespace_ndarray_or_similar_default_with_dispatch(X):

Comment on lines +1078 to +1079
When expect_only_array_outputs is False, the check accepts non-array
outputs from estimator methods (e.g., sparse data structures). This is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on this? I was confused just reading this, but https://github.com/scikit-learn/scikit-learn/pull/32846/changes#r2602981393 cleared it up for me.

else:
assert inverse_result.shape == invese_result_xp_np.shape
assert inverse_result.dtype == invese_result_xp_np.dtype
assert array_device(inverse_result_xp) == array_device(X_xp)
Copy link
Member

@lesteve lesteve Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need a config_context here otherwise this assert statement will always pass, checking that None == None. This was noted some time ago by Stefanie in #31814.

Maybe we should do something about it because it's easy to fall into this trap. One potential idea was to add an assert_same_device that uses array_api_compat.device and does not depend on config_context(array_api_dispatch=True) (and a similar assert function like assert_same_namespace). Also maybe it would be nice to have a better name than device for our config_context-dependent device function since I find the clash with array_api_compat.device a tiny bit confusing.

Suggested change
assert array_device(inverse_result_xp) == array_device(X_xp)
with config_context(array_api_dispatch=True):
assert array_device(result_xp) == array_device(X_xp)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

6 participants