@abhiyushS
Fixes #32848

Summary

Ensures nan_euclidean_distances returns an exactly symmetric matrix when computing self-distances (X is Y) on data containing NaN values, eliminating asymmetries caused by floating-point rounding.

Problem

When computing self-distances on data with NaN values, nan_euclidean_distances produces slightly asymmetric matrices (differences of about 4e-16) due to floating-point rounding after the sqrt operation. This breaks downstream tools such as scipy.spatial.distance.squareform that expect exact symmetry.

Solution

  • Enforce exact (bitwise) symmetry when X is Y by replacing distances with (distances + distances.T) / 2 after the sqrt
  • Preserve NaN diagonal values for rows in which every feature is missing
  • Zero out only the non-NaN diagonal entries
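
The steps above can be sketched with plain NumPy. This is a minimal illustration, not the code in the PR: the helper name `symmetrize_self_distances` is hypothetical, and it assumes the input is the square matrix obtained after the sqrt step.

```python
import numpy as np


def symmetrize_self_distances(distances):
    """Hypothetical helper illustrating the approach; not scikit-learn code."""
    distances = np.asarray(distances, dtype=float)

    # Floating-point addition is commutative, so entry (i, j) and entry
    # (j, i) of the average are computed from the same two operands and
    # come out bit-for-bit identical.
    sym = (distances + distances.T) / 2

    # Zero out only the diagonal entries that are not NaN; a NaN on the
    # diagonal marks a row whose features are all missing and is kept.
    diag = np.diagonal(sym).copy()
    diag[~np.isnan(diag)] = 0.0
    np.fill_diagonal(sym, diag)
    return sym


# Hand-built example: a tiny asymmetry, a noisy diagonal, and one all-NaN row.
raw = np.array(
    [
        [1e-17, 1.0, np.nan],
        [1.0 + 4e-16, 0.0, np.nan],
        [np.nan, np.nan, np.nan],
    ]
)
fixed = symmetrize_self_distances(raw)
```

Averaging propagates NaN (NaN plus anything is NaN), so off-diagonal NaN entries stay NaN on both sides of the diagonal.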

Testing

  • Added a regression test on input with ~10% NaN entries
  • Uses assert_array_equal for strict bitwise symmetry check
  • Verifies diagonal handling for both zero and NaN cases
  • All 29 existing nan_euclidean tests pass
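
The kind of check described above might look like the following sketch. The function name `check_self_distance_matrix` and the hand-built matrix are assumptions for illustration; the actual test in the PR may be structured differently.

```python
import numpy as np
from numpy.testing import assert_array_equal


def check_self_distance_matrix(dist, all_nan_rows):
    """Hypothetical checker: strict symmetry plus diagonal handling."""
    dist = np.asarray(dist)

    # assert_array_equal compares exactly (no tolerance) and treats NaNs
    # in matching positions as equal.
    assert_array_equal(dist, dist.T)

    diag = np.diagonal(dist)
    # Rows with every feature missing keep a NaN self-distance ...
    assert np.all(np.isnan(diag[all_nan_rows]))
    # ... and every other self-distance is exactly zero.
    assert_array_equal(diag[~all_nan_rows], 0.0)


# Hand-built matrix whose last row stands for an all-NaN sample.
good = np.array(
    [
        [0.0, 2.0, np.nan],
        [2.0, 0.0, np.nan],
        [np.nan, np.nan, np.nan],
    ]
)
mask = np.array([False, False, True])
check_self_distance_matrix(good, mask)
```

In a real test, `dist` would come from `nan_euclidean_distances(X)` on random data with roughly 10% of entries set to NaN, and `all_nan_rows` from `np.isnan(X).all(axis=1)`.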

Note

Developed independently alongside PR #32851. This approach guarantees perfect symmetry rather than relying on allclose tolerance.
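
To illustrate the difference between a tolerance-based check and exact symmetry, here is a small sketch on a hand-built matrix (the values are made up to mimic the reported ~4e-16 mismatch):

```python
import numpy as np

# Off-diagonal entries differ by a few units in the last place.
d = np.array([[0.0, 1.0], [1.0 + 4e-16, 0.0]])

# A tolerance-based comparison accepts the matrix as symmetric ...
tolerant = np.allclose(d, d.T)

# ... while an exact comparison does not.
exact = np.array_equal(d, d.T)

# After averaging with the transpose, the exact comparison passes too.
averaged = (d + d.T) / 2
exact_after = np.array_equal(averaged, averaged.T)
```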

Fixes scikit-learn#32848

When computing distances with NaN values, nan_euclidean_distances produced
slightly asymmetric matrices due to floating-point rounding errors after
the sqrt operation. This caused issues with downstream tools expecting
perfect symmetry.

The fix ensures perfect symmetry by averaging the matrix with its transpose
when X is Y, and carefully preserves NaN diagonal values for rows with all
missing values.

Added regression test to verify symmetry and zero diagonal.

github-actions bot added the module:metrics and CI:Linter failure labels on Dec 9, 2025

Development

Successfully merging this pull request may close these issues.

nan_euclidean_distances producing distance matrix not symmetrical due to floating point precision