-
Notifications
You must be signed in to change notification settings - Fork 448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
%collect line magic for using inside loops #432
base: master
Are you sure you want to change the base?
%collect line magic for using inside loops #432
Conversation
Hey, I recently changed jobs and need to get cleared to work on OS projects. Will review soon I hope 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall 👍 thanks so much for doing this!
@argument("-e", "--coerce", type=str, default=None, help="Whether to automatically coerce the types (default, pass True if being explicit) " | ||
"of the dataframe or not (pass False)") | ||
@handle_expected_exceptions | ||
def collect(self, line, local_ns=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call it spark_collect
?
args = parse_argstring_or_throw(self.spark, line) | ||
coerce = get_coerce_value(args.coerce) | ||
if (len(args.command) > 0): | ||
command = " ".join(args.command) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please follow the 4-space indentation convention?
def collect(self, line, local_ns=None): | ||
args = parse_argstring_or_throw(self.spark, line) | ||
coerce = get_coerce_value(args.coerce) | ||
if (len(args.command) > 0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would args.command be > 0? Sorry if I'm missing something super obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, args.command is the part of the line that does not contain arguments. This part can be empty (for instance if I do %spark_collect -o my_dataframe
), or not (%spark_collect -c sql -o my_df SELECT * from my_table
).
If it is empty, the join command would throw an error, so this is a way to get around it, but maybe there's a better way...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused... where is args.command coming from? I thought that for you to be able to do that, you needed to define an @argument
called command
, and I don't see it there. Am I missing something?
else: | ||
self.ipython_display.send_error("Context '{}' not found".format(args.context)) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add unit tests in the style of:
https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/tests/test_remotesparkmagics.py#L374
if (len(args.command) > 0): | ||
command = " ".join(args.command) | ||
else: | ||
command = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a new line after every code block, like after this if/else code block.
else: | ||
command = "" | ||
if args.context == CONTEXT_NAME_SPARK: | ||
return self.execute_spark(command, args.output, args.samplemethod, args.maxrows, args.samplefraction, None, coerce) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a new line after every return call in this method
examples/CollectMagic_example.ipynb
Outdated
@@ -0,0 +1,329 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is amazing! Way to go!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! :)
Hey, I implemented all the suggested changes, and wrote some unit tests too.I know realize I have to change the sample notebook too, I'll do it ASAP. Let me know how it looks. Thanks! |
@@ -548,3 +548,290 @@ def test_logs_exception(): | |||
assert result is None | |||
ipython_display.send_error.assert_called_once_with(EXPECTED_ERROR_MSG | |||
.format(get_logs_method.side_effect)) | |||
|
|||
|
|||
# New Tests for the spark_collect line magic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to have this comment :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So nice to see tests! Thanks!
|
||
# New Tests for the spark_collect line magic | ||
|
||
#@with_setup(_setup, _teardown) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this commented out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first it was a mistake.. but actually I don't think this test is necessary in this case.. The analogous test for the %spark magic is meant to check for the "subcommand" parameter, which does not exist in this case. If anyone writes a bad command in %spark_collect line, it would always throw an error. What do you think?
spark_controller.run_command = run_cell_method | ||
|
||
name = None | ||
line = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable is never used in tests. Please remove in all tests where it's not needed.
method_name = "sample" | ||
commandline = "command" | ||
line = " ".join([context, context_name, meth, method_name, commandline]) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra white space
commandline = "command" | ||
line = " ".join([context, context_name, meth, method_name, | ||
output, output_var, coer, coerce_value, commandline]) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra white space
run_cell_method = MagicMock() | ||
run_cell_method.return_value = (True, "") | ||
spark_controller.run_sqlquery = run_cell_method | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra new line
def collect(self, line, local_ns=None): | ||
args = parse_argstring_or_throw(self.spark, line) | ||
coerce = get_coerce_value(args.coerce) | ||
if (len(args.command) > 0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused... where is args.command coming from? I thought that for you to be able to do that, you needed to define an @argument
called command
, and I don't see it there. Am I missing something?
Don't understand why it's failing... |
it seems to be tornadoweb/tornado#2331 although it was closed as environment specific issue.. Not sure if retesting this may help to pass the test. |
@itamarst perhaps that test is not broken against master? |
This should solve #425.
It is probably just a working starting point. I haven't modified any test yet.
I have also added a notebook with some sample code.
Looking forward to your feedback. Cheers!