Pass .chunk/rechunk calls through for chunked arrays without ChunkManagers #9286

Open · wants to merge 34 commits into base: main

Changes from 15 commits (34 commits total)
f860ce5
test that rechunk can be passed through non registered chunked arrays
TomNicholas Jul 26, 2024
77947c3
remove chunks and rechunk from chunkmanager
TomNicholas Jul 26, 2024
da6799a
remove now-redundant chunks method from dask chunkmanager
TomNicholas Jul 26, 2024
34dadae
use is_chunked_array instead of hasattr(chunks)
TomNicholas Jul 26, 2024
6e26a2d
add has_chunkmanager function
TomNicholas Jul 26, 2024
78e8ff4
use has_chunkmanager
TomNicholas Jul 26, 2024
6feede3
fix errors in other tests
TomNicholas Jul 26, 2024
621ea0c
improve docs
TomNicholas Jul 26, 2024
c305b61
test that computation proceeds without dask on unregistered chunked a…
TomNicholas Jul 26, 2024
4f82d9d
type hinting fixes
TomNicholas Jul 27, 2024
a9bd35d
fix other issues revealed by typing
TomNicholas Jul 27, 2024
49e2b5f
Merge branch 'main' into non_registered_chunkedarrays
TomNicholas Jul 27, 2024
a24489b
ensure tests check that chunks are properly normalized
TomNicholas Jul 27, 2024
2cce6a0
remove now-redundant chunks and rechunk methods from DummyChunkManager
TomNicholas Jul 27, 2024
3cadd53
commented-out code indicating problem with chunk normalization
TomNicholas Jul 27, 2024
556161d
fixed bug with chunks passed as dict
TomNicholas Jul 29, 2024
0296f92
fix dodgy chunking patterns in tests
TomNicholas Jul 29, 2024
9c2ab5e
Revert "fixed bug with chunks passed as dict"
TomNicholas Jul 29, 2024
665727b
fixed bug with chunks passed as dict
TomNicholas Jul 29, 2024
54adae7
remove outdated comments
TomNicholas Jul 29, 2024
5863859
Merge branch 'main' into non_registered_chunkedarrays
TomNicholas Jul 29, 2024
b1024c9
refactor to always use the same codepath for chunk normalization
TomNicholas Jul 31, 2024
e926748
also use new codepath when determining preferred_chunks for backends
TomNicholas Jul 31, 2024
def3131
update TODO about what normalization is handled by dask
TomNicholas Jul 31, 2024
8db961e
remove normalize_chunks method from ChunkManagerEntrypoint
TomNicholas Jul 31, 2024
955c56e
vendor dask.array.normalize_chunks and its dependencies
TomNicholas Aug 12, 2024
58d1cc1
Merge branch 'main' into non_registered_chunkedarrays
TomNicholas Aug 12, 2024
aa3afff
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 12, 2024
bd29f79
use Union for python <3.10
TomNicholas Aug 12, 2024
7ad5f07
some pre-commit fixes
TomNicholas Aug 12, 2024
d14b705
add __init__.py's to avoid import problems
TomNicholas Aug 12, 2024
aac1566
add vendor/__init__.py
TomNicholas Aug 12, 2024
a0d1e84
try to shut mypy up
TomNicholas Aug 12, 2024
ce1df3a
Merge branch 'main' into non_registered_chunkedarrays
dcherian Aug 14, 2024
21 changes: 14 additions & 7 deletions doc/internals/chunked-arrays.rst
@@ -7,13 +7,13 @@ Alternative chunked array types

.. warning::

This is a *highly* experimental feature. Please report any bugs or other difficulties on `xarray's issue tracker <https://github.com/pydata/xarray/issues>`_.
This is an experimental feature. Please report any bugs or other difficulties on `xarray's issue tracker <https://github.com/pydata/xarray/issues>`_.
In particular see discussion on `xarray issue #6807 <https://github.com/pydata/xarray/issues/6807>`_

Xarray can wrap chunked dask arrays (see :ref:`dask`), but can also wrap any other chunked array type that exposes the correct interface.
Xarray can wrap chunked dask arrays (see :ref:`dask`), but can also wrap any other chunked array type which exposes the correct interface.
This allows us to support using other frameworks for distributed and out-of-core processing, with user code still written as xarray commands.
In particular xarray also supports wrapping :py:class:`cubed.Array` objects
(see `Cubed's documentation <https://tom-e-white.com/cubed/>`_ and the `cubed-xarray package <https://github.com/xarray-contrib/cubed-xarray>`_).
(see `Cubed's documentation <https://tom-e-white.com/cubed/>`_ via the `cubed-xarray package <https://github.com/xarray-contrib/cubed-xarray>`_).

The basic idea is that by wrapping an array that has an explicit notion of ``.chunks``, xarray can expose control over
the choice of chunking scheme to users via methods like :py:meth:`DataArray.chunk` whilst the wrapped array actually
@@ -25,11 +25,12 @@ Chunked array methods and "core operations"
A chunked array needs to meet all the :ref:`requirements for normal duck arrays <internals.duckarrays.requirements>`, but must also
implement additional features.

Chunked arrays have additional attributes and methods, such as ``.chunks`` and ``.rechunk``.
Furthermore, Xarray dispatches chunk-aware computations across one or more chunked arrays using special functions known
as "core operations". Examples include ``map_blocks``, ``blockwise``, and ``apply_gufunc``.
Chunked arrays will have additional attributes and methods, such as ``.chunks`` and ``.rechunk``.
If the wrapped class only implements these additional methods then xarray will handle them in the same way it handles other duck arrays - i.e. with no further action on the user's part.

However, to support applying computations across chunks, Xarray dispatches all chunk-aware computations across one or more chunked arrays using special functions known
as "core operations". The core operations are generalizations of functions first implemented in :py:mod:`dask.array`, and examples include ``map_blocks``, ``blockwise``, and ``apply_gufunc``.

The core operations are generalizations of functions first implemented in :py:mod:`dask.array`.
The implementation of these functions is specific to the type of arrays passed to them. For example, when applying the
``map_blocks`` core operation, :py:class:`dask.array.Array` objects must be processed by :py:func:`dask.array.map_blocks`,
whereas :py:class:`cubed.Array` objects must be processed by :py:func:`cubed.map_blocks`.
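The per-type dispatch described here can be sketched roughly as follows. This is a simplified illustration, not xarray's actual code: the real routing goes through the registered ``ChunkManagerEntrypoint`` subclasses discovered via entrypoints, and ``dispatch_map_blocks`` is a hypothetical name.

```python
def dispatch_map_blocks(func, arr, **kwargs):
    """Route map_blocks to the implementation matching the array's library.

    Simplified sketch: xarray's real dispatch uses ChunkManagerEntrypoint
    subclasses registered via entrypoints, not a hard-coded mapping.
    """
    # The top-level module name identifies which framework owns the array type.
    library = type(arr).__module__.split(".")[0]
    if library == "dask":
        from dask.array import map_blocks as impl  # dask.array.Array
    elif library == "cubed":
        from cubed import map_blocks as impl  # cubed.Array
    else:
        raise TypeError(f"no chunked-array implementation for {type(arr)}")
    return impl(func, arr, **kwargs)
```

Passing an array type with no matching implementation (here, a plain list) raises ``TypeError``, analogous to how xarray refuses to guess a chunkmanager for unknown types.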
@@ -100,3 +101,9 @@ To use a parallel array type that does not expose a concept of chunks explicitly
is theoretically required. Such an array type (e.g. `Ramba <https://github.com/Python-for-HPC/ramba>`_ or
`Arkouda <https://github.com/Bears-R-Us/arkouda>`_) could be wrapped using xarray's existing support for
:ref:`numpy-like "duck" arrays <userguide.duckarrays>`.

Chunks without parallel processing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some chunked array types do not support parallel processing.
These will define ``.chunks`` and possibly also ``.rechunk``, but do not require a ``ChunkManagerEntrypoint`` in order for these methods to be called by :py:meth:`DataArray.chunk`.
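The contract in this subsection can be illustrated with a toy class. This is hypothetical and for illustration only: a real wrapped type must also satisfy xarray's full duck-array requirements (``dtype``, ``ndim``, array protocols, and so on).

```python
class SerialChunkedArray:
    """Toy array type exposing .chunks/.rechunk with no parallel backend."""

    def __init__(self, data, chunks):
        self._data = list(data)
        # Chunks as a tuple of tuples of ints, one inner tuple per axis.
        self.chunks = chunks

    @property
    def shape(self):
        return (len(self._data),)

    def rechunk(self, chunks):
        # No task graph to rebuild: just record the new chunk layout.
        return SerialChunkedArray(self._data, chunks)


arr = SerialChunkedArray(range(10), chunks=((5, 5),))
assert arr.rechunk(((2, 2, 2, 2, 2),)).chunks == ((2, 2, 2, 2, 2),)
```

With the change in this PR, ``DataArray.chunk`` would call ``.rechunk`` directly on such an object rather than requiring a registered ``ChunkManagerEntrypoint``.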
6 changes: 3 additions & 3 deletions xarray/coding/strings.py
@@ -18,7 +18,7 @@
from xarray.core.utils import module_available
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.pycompat import has_chunkmanager, is_chunked_array

HAS_NUMPY_2_0 = module_available("numpy", minversion="2.0.0.dev0")

@@ -144,7 +144,7 @@ def bytes_to_char(arr):
if arr.dtype.kind != "S":
raise ValueError("argument must have a fixed-width bytes dtype")

if is_chunked_array(arr):
if has_chunkmanager(arr):
chunkmanager = get_chunked_array_type(arr)

return chunkmanager.map_blocks(
@@ -183,7 +183,7 @@ def char_to_bytes(arr):
# can't make an S0 dtype
return np.zeros(arr.shape[:-1], dtype=np.bytes_)

if is_chunked_array(arr):
if is_chunked_array(arr) and has_chunkmanager(arr):
Member Author:

This is_chunked_array(arr) and has_chunkmanager(arr) pattern becomes necessary because we are now considering the possibility that is_chunked_array(arr) == True but has_chunkmanager(arr) == False, whereas previously these were assumed to always be consistent.
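A rough sketch of the two predicates being distinguished (simplified stand-ins: the real implementations live in ``xarray.namedarray.pycompat``, and the manager registry is populated via entrypoints, not a hard-coded set):

```python
def is_chunked_array(x) -> bool:
    # Duck-typed check: anything exposing .chunks counts as chunked.
    return hasattr(x, "chunks")


# Illustrative stand-in for the real chunkmanager registry.
KNOWN_CHUNKMANAGER_LIBRARIES = {"dask", "cubed"}


def has_chunkmanager(x) -> bool:
    # True only if the array's library has a registered manager.
    return type(x).__module__.split(".")[0] in KNOWN_CHUNKMANAGER_LIBRARIES


class ZarrLikeArray:
    """Chunked storage, but no parallel-computation framework."""
    chunks = ((4, 4),)


assert is_chunked_array(ZarrLikeArray())
assert not has_chunkmanager(ZarrLikeArray())
```

The last two lines show exactly the newly possible combination: ``is_chunked_array(arr) == True`` with ``has_chunkmanager(arr) == False``.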

Member Author:

@headtr1ck I got a notification saying you commented saying

But doesn't has_chunkmanager(arr) == True imply is_chunked_array(arr) == True?

(But I can't find your comment.)

It's a good question though. I think there are some array types that don't define a .chunks where you might still want to use other ChunkManager methods.

In particular JAX is interesting - it has a top-level pmap function which applies a function over multiple axes of an array similar to apply_gufunc. It distributes computation, but not over .chunks (which JAX doesn't define), instead over a global variable jax.local_device_count.

This is why I think we should rename ChunkManager to ComputeManager.

cc @alxmrs

Collaborator:

@headtr1ck I got a notification saying you commented saying

But doesn't has_chunkmanager(arr) == True imply is_chunked_array(arr) == True?

(But I can't find your comment.)

It's a good question though. I think there are some array types that don't define a .chunks where you might still want to use other ChunkManager methods.

I came to the same conclusion, that's why I deleted the comment, sorry.

Member Author:

No worries! I prefer to leave all my half-baked thoughts in the open and double or triple-post 😅 If you were wondering it then other people will definitely have the same question!

This is why I think we should rename ChunkManager to ComputeManager.

I could leave this to a second PR, to isolate the breaking changes.

@TomNicholas FYI JAX does now support something a bit like chunking via sharding of jax.Array, there's a good summary here: https://jax.readthedocs.io/en/latest/notebooks/Distributed_arrays_and_automatic_parallelization.html
IIUC this is now preferred over pmap.

chunkmanager = get_chunked_array_type(arr)

if len(arr.chunks[-1]) > 1:
11 changes: 7 additions & 4 deletions xarray/coding/times.py
@@ -27,8 +27,11 @@
from xarray.core.pdcompat import nanosecond_precision_timestamp
from xarray.core.utils import emit_user_level_warning
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import T_ChunkedArray, get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.parallelcompat import (
T_ChunkedArray,
get_chunked_array_type,
)
from xarray.namedarray.pycompat import has_chunkmanager, is_chunked_array
from xarray.namedarray.utils import is_duck_dask_array

try:
@@ -719,7 +722,7 @@ def encode_cf_datetime(
cftime.date2num
"""
dates = asarray(dates)
if is_chunked_array(dates):
if is_chunked_array(dates) and has_chunkmanager(dates):
return _lazily_encode_cf_datetime(dates, units, calendar, dtype)
else:
return _eagerly_encode_cf_datetime(dates, units, calendar, dtype)
@@ -864,7 +867,7 @@ def encode_cf_timedelta(
dtype: np.dtype | None = None,
) -> tuple[T_DuckArray, str]:
timedeltas = asarray(timedeltas)
if is_chunked_array(timedeltas):
if is_chunked_array(timedeltas) and has_chunkmanager(timedeltas):
return _lazily_encode_cf_timedelta(timedeltas, units, dtype)
else:
return _eagerly_encode_cf_timedelta(timedeltas, units, dtype)
4 changes: 2 additions & 2 deletions xarray/coding/variables.py
@@ -13,7 +13,7 @@
from xarray.core import dtypes, duck_array_ops, indexing
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.pycompat import has_chunkmanager, is_chunked_array

if TYPE_CHECKING:
T_VarTuple = tuple[tuple[Hashable, ...], Any, dict, dict]
@@ -176,7 +176,7 @@ def lazy_elemwise_func(array, func: Callable, dtype: np.typing.DTypeLike):
-------
Either a dask.array.Array or _ElementwiseFunctionArray.
"""
if is_chunked_array(array):
if is_chunked_array(array) and has_chunkmanager(array):
chunkmanager = get_chunked_array_type(array)

return chunkmanager.map_blocks(func, array, dtype=dtype) # type: ignore[arg-type]
8 changes: 6 additions & 2 deletions xarray/core/common.py
@@ -19,8 +19,11 @@
is_scalar,
)
from xarray.namedarray.core import _raise_if_any_duplicate_dimensions
from xarray.namedarray.parallelcompat import get_chunked_array_type, guess_chunkmanager
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.parallelcompat import (
get_chunked_array_type,
guess_chunkmanager,
)
from xarray.namedarray.pycompat import has_chunkmanager, is_chunked_array

try:
import cftime
@@ -1717,6 +1720,7 @@ def _full_like_variable(

if (
is_chunked_array(other.data)
and has_chunkmanager(other.data)
or chunked_array_type is not None
or chunks is not None
):
4 changes: 2 additions & 2 deletions xarray/core/computation.py
@@ -26,7 +26,7 @@
from xarray.core.utils import is_dict_like, is_duck_dask_array, is_scalar, parse_dims
from xarray.core.variable import Variable
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.pycompat import has_chunkmanager, is_chunked_array
from xarray.util.deprecation_helpers import deprecate_dims

if TYPE_CHECKING:
@@ -2169,7 +2169,7 @@ def _calc_idxminmax(
indx = func(array, dim=dim, axis=None, keep_attrs=keep_attrs, skipna=skipna)

# Handle chunked arrays (e.g. dask).
if is_chunked_array(array.data):
if is_chunked_array(array.data) and has_chunkmanager(array.data):
chunkmanager = get_chunked_array_type(array.data)
chunks = dict(zip(array.dims, array.chunks))
dask_coord = chunkmanager.from_array(array[dim].data, chunks=chunks[dim])
6 changes: 4 additions & 2 deletions xarray/core/dataset.py
@@ -125,7 +125,7 @@
calculate_dimensions,
)
from xarray.namedarray.parallelcompat import get_chunked_array_type, guess_chunkmanager
from xarray.namedarray.pycompat import array_type, is_chunked_array
from xarray.namedarray.pycompat import array_type, has_chunkmanager, is_chunked_array
from xarray.plot.accessor import DatasetPlotAccessor
from xarray.util.deprecation_helpers import _deprecate_positional_args, deprecate_dims

@@ -856,7 +856,9 @@ def load(self, **kwargs) -> Self:
"""
# access .data to coerce everything to numpy or dask arrays
lazy_data = {
k: v._data for k, v in self.variables.items() if is_chunked_array(v._data)
k: v._data
for k, v in self.variables.items()
if is_chunked_array(v._data) and has_chunkmanager(v._data)
}
if lazy_data:
chunkmanager = get_chunked_array_type(*lazy_data.values())
6 changes: 3 additions & 3 deletions xarray/core/duck_array_ops.py
@@ -39,7 +39,7 @@
from xarray.core.utils import is_duck_array, is_duck_dask_array, module_available
from xarray.namedarray import pycompat
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import array_type, is_chunked_array
from xarray.namedarray.pycompat import array_type, has_chunkmanager, is_chunked_array

# remove once numpy 2.0 is the oldest supported version
if module_available("numpy", minversion="2.0.0.dev0"):
@@ -736,7 +736,7 @@ def first(values, axis, skipna=None):
dtypes.isdtype(values.dtype, "signed integer") or dtypes.is_string(values.dtype)
):
# only bother for dtypes that can hold NaN
if is_chunked_array(values):
if is_chunked_array(values) and has_chunkmanager(values):
return chunked_nanfirst(values, axis)
else:
return nputils.nanfirst(values, axis)
@@ -749,7 +749,7 @@ def last(values, axis, skipna=None):
dtypes.isdtype(values.dtype, "signed integer") or dtypes.is_string(values.dtype)
):
# only bother for dtypes that can hold NaN
if is_chunked_array(values):
if is_chunked_array(values) and has_chunkmanager(values):
return chunked_nanlast(values, axis)
else:
return nputils.nanlast(values, axis)
9 changes: 7 additions & 2 deletions xarray/core/indexing.py
@@ -28,7 +28,12 @@
to_0d_array,
)
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import array_type, integer_types, is_chunked_array
from xarray.namedarray.pycompat import (
array_type,
has_chunkmanager,
integer_types,
is_chunked_array,
)

if TYPE_CHECKING:
from numpy.typing import DTypeLike
@@ -1349,7 +1354,7 @@ def _masked_result_drop_slice(key, data: duckarray[Any, Any] | None = None):
new_keys = []
for k in key:
if isinstance(k, np.ndarray):
if is_chunked_array(data): # type: ignore[arg-type]
if is_chunked_array(data) and has_chunkmanager(data): # type: ignore[arg-type]
chunkmanager = get_chunked_array_type(data)
new_keys.append(
_chunked_array_with_chunks_hint(k, chunks_hint, chunkmanager)
4 changes: 2 additions & 2 deletions xarray/core/missing.py
@@ -24,7 +24,7 @@
from xarray.core.utils import OrderedSet, is_scalar
from xarray.core.variable import Variable, broadcast_variables
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.pycompat import has_chunkmanager, is_chunked_array

if TYPE_CHECKING:
from xarray.core.dataarray import DataArray
@@ -690,7 +690,7 @@ def interp_func(var, x, new_x, method: InterpOptions, kwargs):
else:
func, kwargs = _get_interpolator_nd(method, **kwargs)

if is_chunked_array(var):
if is_chunked_array(var) and has_chunkmanager(var):
chunkmanager = get_chunked_array_type(var)

ndim = var.ndim
6 changes: 4 additions & 2 deletions xarray/core/utils.py
@@ -1036,14 +1036,16 @@ def contains_only_chunked_or_numpy(obj) -> bool:

Expects obj to be Dataset or DataArray"""
from xarray.core.dataarray import DataArray
from xarray.namedarray.pycompat import is_chunked_array
from xarray.namedarray.pycompat import has_chunkmanager, is_chunked_array

if isinstance(obj, DataArray):
obj = obj._to_temp_dataset()

return all(
[
isinstance(var.data, np.ndarray) or is_chunked_array(var.data)
isinstance(var.data, np.ndarray)
or is_chunked_array(var.data)
and has_chunkmanager(var.data)
for var in obj.variables.values()
]
)
19 changes: 13 additions & 6 deletions xarray/namedarray/core.py
@@ -41,7 +41,7 @@
_SupportsReal,
)
from xarray.namedarray.parallelcompat import guess_chunkmanager
from xarray.namedarray.pycompat import to_numpy
from xarray.namedarray.pycompat import is_chunked_array, to_numpy
from xarray.namedarray.utils import (
either_dict_or_kwargs,
infix_dims,
@@ -819,11 +819,18 @@ def chunk(
if dim in chunks
}

chunkmanager = guess_chunkmanager(chunked_array_type)

data_old = self._data
if chunkmanager.is_chunked_array(data_old):
data_chunked = chunkmanager.rechunk(data_old, chunks) # type: ignore[arg-type]
if is_chunked_array(data_old):
print(f"problematic chunks = {chunks}")
# if is_dict_like(chunks) and chunks != {}:
# chunks = tuple(chunks.get(n, s) for n, s in enumerate(data_old.shape)) # type: ignore[assignment]

print(f"hopefully normalized chunks = {chunks}")
Member Author:

This is really irritating - if I keep these lines commented out then my test_rechunk on the DummyChunkedArray fails. But if I uncomment these lines (therefore doing exactly what happens in the other branch of the if is_chunked_array(data_old): statement) then dask rechunk tests fail!

There are so many possible valid argument types for chunks here, some of which are dicts but completely different, e.g. {0: (2, 3)} vs {'x': (2, 3)}.

It would be much nicer for all possible chunks to go through a single normalize_chunks function, but I'm getting confused even trying to work out what the current behaviour is.

The ChunkManager has a .normalize_chunks method, to call out to dask.array.normalize_chunks. Cubed vendors this function too, so perhaps instead xarray should vendor dask.array.normalize_chunks and remove it from the ChunkManager class?

Member Author:

test_rechunk on the DummyChunkedArray fails

This was actually mostly my fault for having a bug in that test, fixed by 0296f92.

It would be much nicer for all possible chunks to go through a single normalize_chunks function

But there is still some unnecessary complexity that would be nice to remove. The main reason the weird is_dict_like(chunks) sections that turn dicts of chunks into tuples are currently needed is this bug in dask.array.core.normalize_chunks: dask/dask#11261. Otherwise we could just use that function directly.

(If we do just use that we should perhaps vendor it though - as cubed does already).

Member Author:

I managed to sort this all out, so now everything goes through dask.array.core.normalize_chunks, which is much neater.

Question is now do I:

  1. Vendor dask.array.core.normalize_chunks (like cubed does), and use the vendored version no matter which ChunkManager is called
  2. Make all chunkmanagers define a normalize_chunks method and refer to that (what the main code currently does).

I think we actually have to do (1), because we now have a codepath which will try to call normalize_chunks even on chunked arrays that do not define a chunkmanager. But we want to vendor it without introducing any more dependencies (e.g. toolz).

Member Author:

@dcherian I would appreciate your input on this vendoring question before I move ahead with it ^

Contributor:

Vendor it! Sorry for the delay. We can generalize if it's ever needed
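For reference, the normalization being vendored behaves roughly like this. This is a heavily simplified stand-in for dask.array.core.normalize_chunks, covering only the int/dict/tuple cases discussed in this thread; the real function also handles "auto", byte limits, dtype, and previous_chunks.

```python
def normalize_chunks(chunks, shape):
    """Simplified sketch of chunk normalization (not the vendored code).

    Accepts an int, a {axis_index: size} dict, or a per-axis tuple, and
    returns fully explicit chunks as a tuple of tuples of ints.
    """
    if isinstance(chunks, dict):
        # {axis_index: chunk_size} -> positional tuple; missing axes unchunked.
        chunks = tuple(chunks.get(i, s) for i, s in enumerate(shape))
    if isinstance(chunks, int):
        # A single int applies to every axis.
        chunks = (chunks,) * len(shape)
    normalized = []
    for spec, size in zip(chunks, shape):
        if isinstance(spec, int):
            # Expand a uniform chunk size into explicit chunk lengths.
            full, rem = divmod(size, spec)
            spec = (spec,) * full + ((rem,) if rem else ())
        normalized.append(tuple(spec))
    return tuple(normalized)


assert normalize_chunks(2, (5,)) == ((2, 2, 1),)
assert normalize_chunks({0: 2}, (5, 3)) == ((2, 2, 1), (3,))
```

The dict case shows why a single normalization codepath helps: both ``{0: 2}`` and ``2`` collapse to the same explicit ``((2, 2, 1), ...)`` form before any chunkmanager is consulted.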


# Assume any chunked array supports .rechunk - if it doesn't then at least a clear AttributeError will be raised.
# Deliberately don't go through the chunkmanager so as to support chunked array types that don't need all the special computation methods.
# See GH issue #8733
data_chunked = data_old.rechunk(chunks) # type: ignore[union-attr]
else:
if not isinstance(data_old, ExplicitlyIndexed):
ndata = data_old
@@ -841,13 +848,13 @@
if is_dict_like(chunks):
chunks = tuple(chunks.get(n, s) for n, s in enumerate(ndata.shape)) # type: ignore[assignment]

chunkmanager = guess_chunkmanager(chunked_array_type)
data_chunked = chunkmanager.from_array(ndata, chunks, **from_array_kwargs) # type: ignore[arg-type]

return self._replace(data=data_chunked)

def to_numpy(self) -> np.ndarray[Any, Any]:
"""Coerces wrapped data to numpy and returns a numpy.ndarray"""
# TODO an entrypoint so array libraries can choose coercion method?
return to_numpy(self._data)

def as_numpy(self) -> Self:
3 changes: 0 additions & 3 deletions xarray/namedarray/daskmanager.py
@@ -41,9 +41,6 @@ def __init__(self) -> None:
def is_chunked_array(self, data: duckarray[Any, Any]) -> bool:
return is_duck_dask_array(data)

def chunks(self, data: Any) -> _NormalizedChunks:
return data.chunks # type: ignore[no-any-return]

def normalize_chunks(
self,
chunks: T_Chunks | _NormalizedChunks,
55 changes: 0 additions & 55 deletions xarray/namedarray/parallelcompat.py
@@ -218,30 +218,6 @@ def is_chunked_array(self, data: duckarray[Any, Any]) -> bool:
"""
return isinstance(data, self.array_cls)

@abstractmethod
def chunks(self, data: T_ChunkedArray) -> _NormalizedChunks:
"""
Return the current chunks of the given array.

Returns chunks explicitly as a tuple of tuple of ints.

Used internally by xarray objects' .chunks and .chunksizes properties.

Parameters
----------
data : chunked array

Returns
-------
chunks : tuple[tuple[int, ...], ...]

See Also
--------
dask.array.Array.chunks
cubed.Array.chunks
"""
raise NotImplementedError()

@abstractmethod
def normalize_chunks(
self,
@@ -305,37 +281,6 @@ def from_array(
"""
raise NotImplementedError()

def rechunk(
self,
data: T_ChunkedArray,
chunks: _NormalizedChunks | tuple[int, ...] | _Chunks,
**kwargs: Any,
) -> Any:
"""
Changes the chunking pattern of the given array.

Called when the .chunk method is called on an xarray object that is already chunked.

Parameters
----------
data : dask array
Array to be rechunked.
chunks : int, tuple, dict or str, optional
The new block dimensions to create. -1 indicates the full size of the
corresponding dimension. Default is "auto" which automatically
determines chunk sizes.

Returns
-------
chunked array

See Also
--------
dask.array.Array.rechunk
cubed.Array.rechunk
"""
return data.rechunk(chunks, **kwargs)

@abstractmethod
def compute(
self, *data: T_ChunkedArray | Any, **kwargs: Any