Open
Description
What happened?
Loading a netcdf4 file from a THREDDS server via dap4 failed with UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6202: ordinal not in range(128)
What did you expect to happen?
The dataset loads without error.
Minimal Complete Verifiable Example
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
import xarray as xr
url = "dap4://thredds.atmohub.kit.edu/thredds/dap4/iagos-caribic/IAGOS-CARIBIC_MS_files_collection_20250711/CARIBIC-2/MS_20200304_591_CPT_MUC_10s_V16.nc"
ds = xr.load_dataset(url, engine="pydap", decode_cf=False, decode_times=False, decode_timedelta=False, decode_coords=False)
print(ds)
Relevant log output
Traceback (most recent call last):
File "/home/user/Code/Python/pyTesting/netcdf/./dap4_xarray.py", line 18, in <module>
ds = xr.load_dataset(url, engine="pydap", decode_cf=False, decode_times=False, decode_timedelta=False, decode_coords=False)
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 165, in load_dataset
with open_dataset(filename_or_obj, **kwargs) as ds:
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 612, in open_dataset
ds = _dataset_from_backend_dataset(
backend_ds,
...<11 lines>...
**kwargs,
)
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 302, in _dataset_from_backend_dataset
ds = _maybe_create_default_indexes(backend_ds)
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/api.py", line 278, in _maybe_create_default_indexes
return ds.assign_coords(Coordinates(to_index))
~~~~~~~~~~~^^^^^^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/coordinates.py", line 315, in __init__
index, index_vars = create_default_index_implicit(var, list(coords))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexes.py", line 1638, in create_default_index_implicit
index = PandasIndex.from_variables(dim_var, options={})
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexes.py", line 720, in from_variables
data = var._data if isinstance(var._data, PandasIndexingAdapter) else var.data # type: ignore[redundant-expr]
^^^^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/variable.py", line 456, in data
duck_array = self._data.get_duck_array()
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 943, in get_duck_array
duck_array = self.array.get_duck_array()
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 897, in get_duck_array
return self.array.get_duck_array()
~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/coding/variables.py", line 71, in get_duck_array
return duck_array_ops.astype(self.array.get_duck_array(), dtype=self.dtype)
~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 737, in get_duck_array
array = self.array[self.key]
~~~~~~~~~~^^^^^^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/pydap_.py", line 51, in __getitem__
return indexing.explicit_indexing_adapter(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
key, self.shape, indexing.IndexingSupport.BASIC, self._getitem
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/core/indexing.py", line 1129, in explicit_indexing_adapter
result = raw_indexing_method(raw_key.tuple)
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/pydap_.py", line 56, in _getitem
result = robust_getitem(self.array, key, catch=ValueError)
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/xarray/backends/common.py", line 296, in robust_getitem
return array[key]
~~~~~^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/model.py", line 526, in __getitem__
data = self._get_data_index(index)
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/model.py", line 575, in _get_data_index
return self._get_data()[index]
~~~~~~~~~~~~~~~~^^^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/handlers/dap.py", line 548, in __getitem__
dataset = UNPACKDAP4DATA(r, self.checksums, self.user_charset).dataset
~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/handlers/dap.py", line 1002, in __init__
self.dmr, self.endianness = self.safe_dmr_and_data()
~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/user/.cache/uv/environments-v2/dap4-xarray-92e2e08d4d47f0ca/lib/python3.13/site-packages/pydap/handlers/dap.py", line 1062, in safe_dmr_and_data
dmr = self.raw.read(dmr_length).decode(self.user_charset)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 5394: ordinal not in range(128)
Anything else we need to know?
Related?
My guess
Looking at the traceback, the issue seems to be that pydap expects ASCII encoding while the file might not satisfy this expectation. As a test for this hypothesis, I replaced
dataset = UNPACKDAP4DATA(r, self.checksums, self.user_charset).dataset
with
dataset = UNPACKDAP4DATA(r, self.checksums, "UTF-8").dataset
in the pydap code and everything worked just fine!
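The failure mode itself can be reproduced without a server. This is only a minimal sketch: it assumes the metadata contains a UTF-8 encoded non-ASCII character such as "ä", and the XML fragment below is made up for illustration (the actual content of the server response is not shown in this report).

```python
# Hypothetical stand-in for a metadata fragment; not taken from the real response.
sample_dmr = '<Attribute name="institution"><Value>Universität</Value></Attribute>'

# In UTF-8, "ä" is encoded as the two bytes 0xc3 0xa4.
raw = sample_dmr.encode("utf-8")

# Decoding as ASCII fails on the first byte >= 0x80.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = err

print(ascii_error)

# Decoding the same bytes as UTF-8 round-trips cleanly.
decoded = raw.decode("utf-8")
print(decoded == sample_dmr)
```

The byte reported in the error (0xc3) is the lead byte of many two-byte UTF-8 sequences for Latin characters, which is consistent with the UTF-8 hypothesis but does not prove it.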
So I attempted to set the encoding as a keyword argument to load_dataset:
ds = xr.load_dataset(url, engine="pydap", decode_cf=False, decode_times=False, decode_timedelta=False, decode_coords=False, user_charset="UTF-8")
Unfortunately, this gets lost somewhere. It might be in the pydap code, so it is not necessarily xarray's fault.
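Independently of how the charset is passed along, it may help to see which part of the metadata carries the non-ASCII bytes. The helper below is a hypothetical diagnostic, not part of pydap or xarray: `non_ascii_context` is a name invented here, and it assumes the raw response bytes have already been obtained by some other means (for example with an HTTP client, which is left out).

```python
def non_ascii_context(raw: bytes, width: int = 20) -> list[tuple[int, str]]:
    """Return (offset, surrounding text) for each run of non-ASCII bytes in raw.

    Hypothetical helper for illustration; assumes the bytes are mostly UTF-8.
    """
    hits = []
    i = 0
    while i < len(raw):
        if raw[i] >= 0x80:
            start = max(0, i - width)
            end = min(len(raw), i + width)
            # errors="replace" keeps this robust if the window cuts a
            # multi-byte character or the data is not valid UTF-8.
            hits.append((i, raw[start:end].decode("utf-8", errors="replace")))
            # Skip the rest of this run of non-ASCII bytes.
            while i < len(raw) and raw[i] >= 0x80:
                i += 1
        else:
            i += 1
    return hits


# Usage with made-up input bytes:
for offset, context in non_ascii_context("aä bö".encode("utf-8")):
    print(offset, repr(context))
```

Comparing the reported offsets with the position in the `UnicodeDecodeError` message should show which attribute or value triggers the failure.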
Environment
>>> xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.13.7 (main, Aug 18 2025, 19:20:03) [Clang 20.1.4 ]
python-bits: 64
OS: Linux
OS-release: 6.12.48+deb13-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: 4.9.3
xarray: 2025.10.2.dev18+ge49cfc4f2
pandas: 2.3.3
numpy: 2.3.4
scipy: 1.16.2
netCDF4: 1.7.3
pydap: 3.5.8
h5netcdf: 1.7.3
h5py: 3.15.1
zarr: 3.1.3
cftime: 1.6.5
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.6.0
dask: 2025.10.0
distributed: 2025.10.0
matplotlib: 3.10.7
cartopy: 0.25.0
seaborn: 0.13.2
numbagg: 0.9.3
fsspec: 2025.9.0
cupy: None
pint: None
sparse: 0.17.0
flox: 0.10.7
numpy_groupies: 0.11.3
setuptools: None
pip: None
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None