-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider how to deal with the proliferation of decoder options on open_dataset #939
Comments
We already have the dictionary. Users can make a decode_options dictionary, and then call what they want to with **decode_options. |
One advantage to creating classes to encapsulate these options is that it's easier to do error checking on field names and values than a dictionary. Though a dictionary is nice and lightweight. The simplest thing to do would be to move these options out of |
I would disagree with the form What do you mean when you say it's easier to do error checking on field names and values? The xarray implementation can still use fields instead of a dictionary, with the user saying |
My concern is that
|
It is considered poor software design to have 13 arguments in Java and other languages which do not have optional arguments. The same isn't necessarily true of Python, but I haven't seen much discussion or writing on this. I'd much rather have pandas.read_csv the way it is right now than to have a ReadOptions object that would need to contain exactly the same documentation and be just as hard to understand as read_csv. That object would serve only to separate the documentation of the settings for read_csv from the docstring for read_csv. If you really want to cut down on arguments, open_dataset should be separated into multiple functions. I wouldn't necessarily encourage these, but some possibilities are:
All of that aside, the |
Certainly I agree here. The alternative would be separating out related functionality into related groups, e.g., |
I agree that having too many keyword arguments is poor design; it's representative of either failing to abstract anything away or having the object/function just do too much. For a specific example, this jumps out to me as a problem: ds = conventions.decode_cf(
store, mask_and_scale=mask_and_scale, decode_times=decode_times,
concat_characters=concat_characters, decode_coords=decode_coords,
drop_variables=drop_variables) Already token = tokenize(file_arg, group, decode_cf, mask_and_scale,
decode_times, concat_characters, decode_coords,
engine, chunks, drop_variables) but again you're using all of these parameters together. If all of these variable values are needed to define the state, you already have an implicit object in your code; you're just not using the language syntax to help you by encapsulating it. I'd be in favor of having lightweight classes (essentially mutable named tuples) vs. dictionaries. The former allows more discoverability to the interface (i.e. tab completion in IPython) as well as better up-front error checking (you could use |
See #4490 for a concrete API proposal. I think we can move the discussion there. |
There are already lots of keyword arguments, and users want even more! (#843)
Maybe we should use some sort of object to encapsulate desired options?
The text was updated successfully, but these errors were encountered: