
Conversation

@jsignell
Contributor

Demo

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "xarray[complete]@git+https://github.com/jsignell/xarray.git@uncertainties",
#   "uncertainties",
#   "pint",
# ]
# ///
#
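# The block above is PEP 723 inline script metadata: tools that support it
# (e.g. uv) can resolve the dependencies and run this demo directly,
# for instance via `uv run demo.py` (filename hypothetical).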

from uncertainties import ufloat
import numpy as np
import pint
import xarray as xr

# create some uncertainties values
l1 = [ufloat(1, 0.1), ufloat(2, 0.2), ufloat(3, 0.3), ufloat(4, 0.4), ufloat(5, 0.5), ufloat(6, 0.6), ufloat(7, 0.7)]
l2 = [ufloat(10, 1.), ufloat(20, 2.), ufloat(30, 3.), ufloat(40, 4.), ufloat(50, 5.), ufloat(60, 6.), ufloat(70, 7.)]

# assemble a numpy array, we're at numpy(uncertainties)
numpy_array = np.array([l1, l2])

# attach a unit to it, we're at pint(numpy(uncertainties))
ureg = pint.UnitRegistry()
with_unit = numpy_array * ureg.second

# now assemble a DataArray, we're at xarray(pint(numpy(uncertainties)))
data_array = xr.DataArray(with_unit, dims=("a", "b"))

print(data_array.mean(dim="b"))
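With the branch pinned in the script header, this reduction is expected to propagate the uncertainties through the pint and uncertainties layers; on released xarray this is the operation that fails (see the linked issue at the bottom of this thread).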

@keewis
Collaborator

keewis commented Nov 12, 2025

I'll have to investigate this a bit more to really understand why we do the coercion to float in the first place (so far the only thing I know is that this part of the code was added in #1883 and #2236).

However, for now I don't think we should try to support the uncertainties package in its current state, since it breaks a number of assumptions. In particular, even though this is an object-dtype numpy array, calling standard numpy ufuncs like np.cos on it fails. Instead, we're supposed to call the functions in unumpy, but that is not declared through any of the array standards (and may be slow).
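For illustration, a minimal reproduction of that failure mode (a sketch; behavior observed with recent releases of numpy and uncertainties, and the exact exception type and message may vary):

import numpy as np
from uncertainties import ufloat, unumpy

arr = np.array([ufloat(1, 0.1), ufloat(2, 0.2)])  # object dtype

try:
    # numpy's ufunc machinery looks for a callable .cos() on each element and fails
    np.cos(arr)
except (TypeError, AttributeError) as err:
    print(err)

# the package-specific wrapper works and propagates the uncertainty
print(unumpy.cos(arr))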

So here's a bunch of options:

  • declare a new class that wraps the object dtype numpy array and defines the proper array standards (best would be the array API); see the first sketch below
  • define a custom dtype that contains value and std per element; see the second sketch below
  • use a different package

There's a bunch of additional discussion in xarray-contrib/pint-xarray#3, and for the third option there's AutoUncertainties.
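To make the first option concrete, here's a rough sketch (the class name UncertainArray and all of its details are hypothetical, and it only covers ufuncs that have an unumpy counterpart; a real implementation would target the full array API):

import numpy as np
from uncertainties import ufloat, unumpy

class UncertainArray:
    # hypothetical wrapper around an object-dtype array of ufloats that
    # reroutes numpy ufuncs to their unumpy equivalents
    def __init__(self, data):
        self._data = np.asarray(data, dtype=object)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            return NotImplemented
        func = getattr(unumpy, ufunc.__name__, None)
        if func is None:
            return NotImplemented  # no unumpy counterpart for this ufunc
        unwrapped = [x._data if isinstance(x, UncertainArray) else x for x in inputs]
        return UncertainArray(func(*unwrapped, **kwargs))

print(np.cos(UncertainArray([ufloat(1, 0.1)]))._data)  # dispatches to unumpy.cos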
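And a crude illustration of the second option's data layout, using a structured dtype (a real custom dtype would presumably go through numpy's newer DType machinery; the propagation below is hand-rolled and assumes independent values):

import numpy as np

uncertain = np.dtype([("value", "f8"), ("std", "f8")])
arr = np.array([(1.0, 0.1), (2.0, 0.2), (3.0, 0.3)], dtype=uncertain)

# mean of n independent values x_i ± s_i has std = sqrt(sum(s_i**2)) / n
mean_value = arr["value"].mean()
mean_std = np.sqrt((arr["std"] ** 2).sum()) / arr.size
print(mean_value, mean_std)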

@jsignell
Contributor Author

Thanks for taking a look at this @keewis. There was a reference in the code I deleted that mentioned this stemming from a dask issue. And indeed the tests on CI are failing because the dask arrays don't believe they can do division on certain dtypes. I think this only crops up with numpy >= 2.3, which is why I wasn't seeing the failures locally when running tests (I had numpy 2.2 in my env).

I was also feeling unsure about how much xarray wants to support dtypes that advertise themselves as objects.

@dcherian
Contributor

FWIW the object dtype arrays we really care about are the ones with cftime objects in them. Everything else is best-effort IMO



Development

Successfully merging this pull request may close these issues.

Taking a mean along some dimension with units fails.
