WIP: Proposed refactor of read API for backends #4477
Add necessary imports for this function.
# Conflicts: # xarray/backends/api.py
…ad-refactor # Conflicts: # xarray/backends/apiv2.py
- to be used in apiv2 without instantiating the object
- modify signature - move default setting inside backends
Note that this PR doesn't change xarray behaviour unless the user sets the environment variable. To test the new code paths you can use:
backend_ds = open_backend_dataset(
    filename_or_obj,
    **backend_kwargs,
    **{k: v for k, v in kwargs.items() if v is not None},
Passing everything directly as supplied by the user feels a little non-ideal to me here. I wonder if we could group together decoding options into a single argument first? See #4490 for a proposal of what that could look like.
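To make the suggestion concrete, here is a minimal sketch of grouping decoding options into one argument while still dropping unset values before calling the backend. Everything in it is hypothetical: `DecodingOptions`, `example_open_backend_dataset` and `open_with_options` are illustrative names, not xarray APIs and not necessarily what #4490 proposes.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class DecodingOptions:
    """Hypothetical grouping of decoding flags; not part of xarray."""

    mask_and_scale: Optional[bool] = None
    decode_times: Optional[bool] = None
    decode_coords: Optional[bool] = None

    def explicit(self) -> Dict[str, Any]:
        # Drop unset (None) options so the backend's own defaults apply.
        return {k: v for k, v in asdict(self).items() if v is not None}


def example_open_backend_dataset(
    filename_or_obj, *, mask_and_scale=True, decode_times=True, decode_coords=True
):
    # Stand-in for a backend entry point; it just echoes what it received.
    return {
        "source": filename_or_obj,
        "mask_and_scale": mask_and_scale,
        "decode_times": decode_times,
        "decode_coords": decode_coords,
    }


def open_with_options(filename_or_obj, decoders, backend_kwargs=None):
    # Backend-specific options and explicit decoding options are merged here.
    backend_kwargs = backend_kwargs or {}
    return example_open_backend_dataset(
        filename_or_obj, **backend_kwargs, **decoders.explicit()
    )


result = open_with_options("example.nc", DecodingOptions(decode_times=False))
```

Only `decode_times` is forwarded in this call; the other two flags fall back to the defaults declared by the stand-in backend function.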
xarray/backends/cfgrib_.py
Outdated
if kwargs:
    warnings.warn(
        "The following keywords are not supported by the engine "
        "and will be ignored: %r" % kwargs
    )
Issuing warnings for unrecognized keyword arguments seems dangerously permissive. Could we raise errors here?
(This might be easier once we switch to putting all the CF decoding conventions into a single argument, which could then be extended without breaking every backend.)
ds = Dataset(vars, attrs=attrs)
ds = ds.set_coords(coord_names.intersection(vars))
ds._file_obj = file_obj
It would be nice to make this a public API. I guess users still won't want to see this in most cases, so perhaps just keep this in mind for something to document in the new "how to write a backend" documentation?
xarray/backends/zarr.py
Outdated
@classmethod
def get_chunk(cls, name, var, chunks):
classmethod is fine, though I guess staticmethod would also work.
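For reference, a small sketch of the difference, assuming a helper that only needs its arguments. `ExampleStore`, `default_chunk` and both method names are made up for illustration and do not mirror the actual `get_chunk` in this diff.

```python
class ExampleStore:
    # Hypothetical class attribute used to show what `cls` gives access to.
    default_chunk = 4

    @classmethod
    def get_chunk_cm(cls, name, chunks):
        # `cls` lets subclasses override the default without redefining the method.
        return chunks.get(name, cls.default_chunk)

    @staticmethod
    def get_chunk_sm(name, chunks, default=4):
        # No access to the class; everything must arrive as an argument.
        return chunks.get(name, default)


class BigChunkStore(ExampleStore):
    default_chunk = 8


# Both forms can be called without instantiating the class.
from_classmethod = ExampleStore.get_chunk_cm("x", {"x": 10})
from_staticmethod = ExampleStore.get_chunk_sm("x", {"x": 10})
```

If the method never touches `cls`, the two are interchangeable; the classmethod form only pays off when subclasses are expected to change class-level state.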
xarray/backends/apiv2.py
Outdated
backend_kwargs,
**kwargs,
Could these be grouped into a single argument, something like extra_tokens?
I really like the direction this is going! See #4490 for a related design proposal that could build on top of this (and/or potentially make this easier).
xarray/backends/cfgrib_.py
Outdated
if kwargs:
    raise ValueError(
        "The following keywords are not supported by the engine "
        "and will be ignored: %r" % kwargs
    )
A simpler option is to just omit these lines and **kwargs entirely. Python will supply an error message like TypeError: open_backend_dataset_cfgrib() got an unexpected keyword argument 'something_unexpected'
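A quick way to see this behaviour, using a stand-in function rather than the real cfgrib backend. The name `open_backend_dataset_example` and its parameters are assumptions made for this sketch.

```python
def open_backend_dataset_example(
    filename_or_obj, *, mask_and_scale=True, decode_times=True
):
    # Explicit keyword-only parameters and no **kwargs: anything else is
    # rejected by Python itself before the function body runs.
    return {
        "source": filename_or_obj,
        "mask_and_scale": mask_and_scale,
        "decode_times": decode_times,
    }


try:
    open_backend_dataset_example("example.grib", something_unexpected=1)
except TypeError as err:
    message = str(err)
else:
    message = None

print(message)
```

The exact wording of the message can vary slightly between Python versions, but it names both the function and the offending keyword, so no hand-written check is needed.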
Yes, you are right! It doesn't make sense to keep these lines!
The warning was needed in the first version of the code because I didn't filter the input keys. But the idea was to remove it later.
…2.dataset_from_backend_dataset`
…lated error message
backend_kwargs=None,
**kwargs,
One thing that isn't quite clear to me: why have both **kwargs and backend_kwargs, when they are merged together?
Is the intention to only keep one of them in the long term?
Yes, the intention is to keep only one in the long term.
which one? :)
I would keep kwargs, but reading #4490 I suspect that you prefer to keep backend_kwargs.
If we keep backend_kwargs, the separation between xarray inputs and backend inputs may be clearer to the user. I also like the way you suggest using deprecate_kwargs.
The main pro for keeping kwargs is that the user can save 21 chars - backend_kwargs=dict() (and that is important!).
I also prefer flat structures, when possible. But that's a matter of taste.
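To compare the two call styles side by side, here is a rough sketch. `open_both` is a hypothetical function that accepts both forms during a transition and rejects options passed twice; it is not the signature used in this PR, and `indexpath` is only a sample option name.

```python
def open_both(filename_or_obj, backend_kwargs=None, **kwargs):
    # Hypothetical transition helper: accept nested and flat options, but
    # refuse ambiguous input where the same option appears in both places.
    backend_kwargs = backend_kwargs or {}
    overlap = set(backend_kwargs) & set(kwargs)
    if overlap:
        raise TypeError(f"options passed twice: {sorted(overlap)}")
    return {**backend_kwargs, **kwargs}


# Nested style: backend options are visibly separate from other inputs.
nested = open_both("example.grib", backend_kwargs=dict(indexpath=""))

# Flat style: shorter to type, but backend and xarray options share a namespace.
flat = open_both("example.grib", indexpath="")

# The saving mentioned above is the length of the wrapper itself.
saved_chars = len("backend_kwargs=dict()")
```

Both calls produce the same merged options here; the trade-off is purely about how the call site reads and how name clashes are handled.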
…ctor # Conflicts: # xarray/backends/zarr.py
Merging per offline discussion. This is all hidden behind an environment variable, so this will allow @aurghs and @alexamici to continue working on these new features.
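For readers unfamiliar with the pattern, a minimal sketch of gating a new code path behind an environment variable. The variable name `EXAMPLE_BACKEND_API` and the two stand-in functions are placeholders invented for this sketch; the actual variable name is not shown in this thread.

```python
import os


def open_dataset_v1(path):
    # Stand-in for the existing code path.
    return ("v1", path)


def open_dataset_v2(path):
    # Stand-in for the new, opt-in code path.
    return ("v2", path)


def open_dataset(path, environ=None):
    # `environ` is injectable so the switch can be exercised without
    # mutating the real process environment.
    environ = os.environ if environ is None else environ
    if environ.get("EXAMPLE_BACKEND_API", "v1") == "v2":
        return open_dataset_v2(path)
    return open_dataset_v1(path)


default_result = open_dataset("example.nc", environ={})
opt_in_result = open_dataset("example.nc", environ={"EXAMPLE_BACKEND_API": "v2"})
```

With the variable unset, callers keep the old behaviour, which is what makes it safe to merge work in progress.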
The first draft of the new backend API:
Dataset with BackendArray.
cc @jhamman @shoyer
isort . && black . && mypy . && flake8
whats-new.rst
api.rst