Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify signature of xr.open_dataset using new decoding_kwargs dict #9379

Closed
TomNicholas opened this issue Aug 18, 2024 · 3 comments
Closed

Comments

@TomNicholas
Copy link
Member

TomNicholas commented Aug 18, 2024

What is your issue?

The signature of xr.open_dataset is quite complicated, but many of the kwargs are really just immediately passed on to the public xr.decode_cf function internally. Specifically mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, and drop_variables are all passed on. Whether or not xr.decode_cf is used at all is controlled by the decode_cf kwarg, which is currently a boolean.

We could instead group all of these kwargs into a single decoding_kwargs dictionary keyword argument. We could also replace the decode_cf kwarg with a general decode or decode_func kwarg, with a type something like Callable[Dataset | AbstractDataStore, Dataset], which by default would point to the xr.decode_cf function.

This would:

  • Greatly simplify the signature of xr.open_dataset,
  • More clearly separate concerns (opening data and decoding are separate steps),
  • Allow users to define their own decoding functions, even with existing backends (e.g. a netCDF file that follows non-CF conventions, such as can be found in plasma physics),
  • Follow the same pattern we already use in open_dataset for from_array_kwargs and backend_kwargs,
  • Avoid this old issue Can we clarify decode_cf option of open_dataset? #3020 (because once the deprecation cycle was complete you wouldn't be able to pass a specific decoding kwarg whilst also specifying that no decoding is to happen).

The downside of this is that there would be a fairly significant blast radius of warnings raised during the deprecation cycle.

@dcherian
Copy link
Contributor

Cam we close in favor of #4490?

@TomNicholas
Copy link
Member Author

Cam we close in favor of #4490?

Oops 🤦‍♂️ I'll add anything not already mentioned there then close this.

@TomNicholas
Copy link
Member Author

Closing as a duplicate, see #4490 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants