[SDK]Support Docker image as objective in the tune API #2338

akhilsaivenkata · 2024-05-30T00:58:49Z

What this PR does / why we need it: Supporting Docker image as an objective in the tune API.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2326

Checklist:

Docs included if any changes are user facing

Signed-off-by: akhilsaivenkata <[email protected]>

google-oss-prow · 2024-05-30T00:59:04Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tenzen-y for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

akhilsaivenkata · 2024-05-30T01:06:47Z

Hi @andreyvelich, I made the changes to katib_client.py. Do we have any test cases that need to be updated or implemented for this change? To test the changes locally, I ran "make check" and "make test" on my machine, and there were no failures. Could you take a look and suggest any action items that need to be taken up here?

andreyvelich

Thank you for doing this @akhilsaivenkata!
I left a few comments.

andreyvelich · 2024-06-12T16:44:10Z

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

        parameters: Dict[str, Any],
-        base_image: str = constants.BASE_IMAGE_TENSORFLOW,
+        #base_image: str = constants.BASE_IMAGE_TENSORFLOW,


I think we should keep base_image, since we use it when the user sets the objective as a train function.

andreyvelich · 2024-06-12T16:47:13Z

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

@@ -400,12 +407,12 @@ def tune(
        trial_template = models.V1beta1TrialTemplate(
            primary_container_name=constants.DEFAULT_PRIMARY_CONTAINER_NAME,
            retain=retain_trials,
-            trial_parameters=trial_params,
+            trial_parameters=trial_params if callable(objective) else [],


trial_parameters is still required even when the user sets a Docker image.
You can check example here: https://github.com/kubeflow/katib/blob/master/examples/v1beta1/hp-tuning/random.yaml#L31-L36

Sure @andreyvelich, I will revert this change and keep the trial_parameters.

Also, I would like to know whether we need to write new unit test cases or change existing ones.

@akhilsaivenkata Yes, since we merged PR #2325, please add a unit test for the tune function.

cc @tariq-hasan
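
To illustrate the point made in this thread, here is a minimal sketch that uses plain dictionaries rather than the Katib SDK models. The helper names `build_trial_parameters` and `substitute`, the `lr` parameter, and its bounds are all hypothetical; only the `${trialParameters.<name>}` placeholder style is taken from the linked random.yaml example. It shows why the trial parameter list cannot be empty when the image's arguments still reference search-space values.

```python
from typing import Any, Dict, List


def build_trial_parameters(parameters: Dict[str, Any]) -> List[Dict[str, str]]:
    """Create one trial parameter entry per search-space parameter.

    Hypothetical helper: each entry points back to an Experiment
    parameter through its "reference" field.
    """
    return [
        {"name": name, "description": f"Value for {name}", "reference": name}
        for name in parameters
    ]


def substitute(args: List[str], assignments: Dict[str, str]) -> List[str]:
    """Replace ${trialParameters.<name>} placeholders with concrete values.

    This only imitates the substitution step for illustration.
    """
    resolved = []
    for arg in args:
        for name, value in assignments.items():
            arg = arg.replace("${trialParameters." + name + "}", value)
        resolved.append(arg)
    return resolved


# Hypothetical search space with a single parameter.
search_space = {"lr": {"min": "0.01", "max": "0.05"}}
trial_params = build_trial_parameters(search_space)

# Arguments passed to the user's image still refer to the trial parameter.
container_args = ["--lr=${trialParameters.lr}"]
resolved_args = substitute(container_args, {"lr": "0.03"})

print(trial_params)
print(resolved_args)
```

If `trial_params` were replaced by an empty list, the placeholder in `container_args` would have nothing to resolve against, which is the concern raised above.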

akhilsaivenkata · 2024-06-19T15:10:44Z

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

+            input_params = {}
+            experiment_params = []
+            trial_params = []
+            base_image = constants.BASE_IMAGE_TENSORFLOW,


Yes @andreyvelich, I did keep base_image at this line. I moved it into the if block, so this change got mixed up with the other lines.
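
As a rough sketch of the behavior discussed in this thread (not the code in this PR), the image selection could look like the following. It assumes that a non-callable objective is an image name given as a string. `resolve_image` and `DEFAULT_BASE_IMAGE` are hypothetical names, and the default value is a placeholder standing in for `constants.BASE_IMAGE_TENSORFLOW`.

```python
from typing import Any

# Placeholder standing in for constants.BASE_IMAGE_TENSORFLOW (hypothetical value).
DEFAULT_BASE_IMAGE = "example.com/tensorflow-base:latest"


def resolve_image(objective: Any, base_image: str = DEFAULT_BASE_IMAGE) -> str:
    """Pick the container image for a trial.

    Assumption: a callable objective runs inside base_image, while a
    string objective is treated as the Docker image to run.
    """
    if callable(objective):
        return base_image
    if isinstance(objective, str):
        return objective
    raise ValueError("objective must be a callable or a Docker image name")


def train(parameters):
    """Dummy train function used only for this example."""
    return None


print(resolve_image(train))
print(resolve_image("docker.io/user/train:v1"))
```

Keeping `base_image` as a keyword argument with a default means existing callers that pass a train function are unaffected.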

andreyvelich · 2024-06-24T14:00:13Z

@akhilsaivenkata Please rebase your PR.

Electronic-Waste · 2024-07-16T03:25:50Z

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

            trial_spec=trial_spec,
        )

        # Add parameters to the Katib Experiment.
-        experiment.spec.parameters = experiment_params
+        experiment.spec.parameters = experiment_params if callable(objective) else []


I think the parameters field is also needed, since trial_parameters is still required.

WDYT👀 @akhilsaivenkata @andreyvelich

You are right @Electronic-Waste, I have just reverted these two code changes and pushed them.

Signed-off-by: akhilsaivenkata <[email protected]>

Electronic-Waste

It looks great. Thank you @akhilsaivenkata! I left a question for you and @andreyvelich.

Electronic-Waste · 2024-07-16T06:05:53Z

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

+                                command=["bash", "-c"] if callable(objective) else None,
+                                args=[exec_script] if callable(objective) else None,


Also, I'm not sure whether we can assign None to command and args here when we use a Docker image as the objective.

As the example from @andreyvelich shows, we sometimes need to pass command and args to the training container to execute Python scripts with some parameters.

Could you explain your idea in detail so that I can understand it better? WDYT👀 @akhilsaivenkata @andreyvelich

I think initially we can just allow the user to set an image as the objective without command and args, similar to how we allow creating a training job using the base_image parameter: https://github.com/kubeflow/training-operator/blob/master/sdk/python/kubeflow/training/api/training_client.py#L327C35-L327C45.
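
The approach proposed in this thread can be sketched with a plain dictionary in place of the Kubernetes container model. `build_container`, the container name, and the image names are hypothetical and only serve the example. When `command` and `args` are left unset, Kubernetes runs the image's own ENTRYPOINT and CMD, which is why None can work for an image-based objective.

```python
from typing import Any, Dict


def build_container(objective: Any, base_image: str, exec_script: str) -> Dict[str, Any]:
    """Build a container description for a trial.

    Assumption: a callable objective is executed through a generated
    script, while a string objective is an image that runs as is.
    """
    is_function = callable(objective)
    return {
        "name": "training-container",  # placeholder for the primary container name
        "image": base_image if is_function else objective,
        # None means the image's default ENTRYPOINT and CMD are used.
        "command": ["bash", "-c"] if is_function else None,
        "args": [exec_script] if is_function else None,
    }


def train(parameters):
    """Dummy train function used only for this example."""
    return None


from_function = build_container(train, "example.com/base:latest", "python3 train.py")
from_image = build_container("docker.io/user/train:v1", "example.com/base:latest", "")

print(from_function)
print(from_image)
```

A limitation of this sketch, matching the question above, is that an image-based objective cannot receive per-trial arguments unless command and args are exposed later.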

andreyvelich · 2024-08-26T15:06:48Z

Hi @akhilsaivenkata, did you get a chance to finish this PR?

github-actions · 2024-11-24T20:05:55Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

andreyvelich · 2024-11-25T16:23:22Z

/remove-lifecycle stale

Support Docker image as objective in the tune API

e0ea84e

Signed-off-by: akhilsaivenkata <[email protected]>

google-oss-prow bot requested review from andreyvelich, anencore94 and gaocegege May 30, 2024 00:59

google-oss-prow bot added the size/L label May 30, 2024

andreyvelich reviewed Jun 12, 2024

akhilsaivenkata commented Jun 19, 2024

Electronic-Waste reviewed Jul 16, 2024

google-oss-prow bot added size/M and removed size/L labels Jul 16, 2024

resolving review comments

2ef15e1

Signed-off-by: akhilsaivenkata <[email protected]>

akhilsaivenkata force-pushed the master branch from 3e133aa to 2ef15e1 Compare July 16, 2024 05:26

Merge branch 'master' of github.com:kubeflow/katib

b2290f9

Electronic-Waste reviewed Jul 16, 2024

github-actions bot added the lifecycle/stale label Nov 24, 2024

google-oss-prow bot removed the lifecycle/stale label Nov 25, 2024
