
[python-package] simplify eval result printing #6749

Merged
merged 8 commits into master from python/format-eval-result on Dec 16, 2024

Conversation

jameslamb
Collaborator

Proposes a small refactor of _format_eval_result(), used to populate log messages like this:

result = "\t".join([_format_eval_result(x, self.show_stdv) for x in env.evaluation_result_list])
_log_info(f"[{env.iteration + 1}]\t{result}")

This is a first step in the broader refactoring proposed in #6748 ... see that issue for more details.
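For context, the change replaces the length-based if/elif/else branching in _format_eval_result() with a single unpack-and-format path. A rough sketch of that direction (names and exact formatting here are illustrative, not necessarily the merged code; see the naming discussion below, and note this assumes the 5-element cross-validation tuples carry the standard deviation as their last item):

def _format_eval_result(value, show_stdv: bool) -> str:
    """Format one evaluation-result tuple for the iteration log."""
    dataset_name, metric_name, metric_value, *_ = value
    out = f"{dataset_name}'s {metric_name}: {metric_value:g}"
    # 5-element tuples (from cv()) also carry the metric's standard deviation
    if show_stdv and len(value) == 5:
        out += f" + {value[4]:g}"
    return out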

@jameslamb jameslamb changed the title WIP: [python-package] simplify eval result printing [python-package] simplify eval result printing Dec 12, 2024
@jameslamb jameslamb marked this pull request as ready for review December 12, 2024 06:44
return f"{value[0]}'s {value[1]}: {value[2]:g}"
else:
raise ValueError("Wrong metric value")
eval_name, metric_name, eval_result, *_ = value
Collaborator

What do you think of using names similar to the ones here:

ret.append((data_name, eval_name, val, is_higher_better))

I find the first one a bit confusing, since it's the name of the validation set. Maybe something like dataset_name, metric_name, metric_value, *_ = value?

Collaborator Author

You're totally right... I think eval_name and eval_result are confusing. I love your suggested names and agree we should standardize on those (here and in future PRs like this).

Collaborator Author

updated in 6b63377

I'll do more of this renaming in future PRs.

(resolved review thread on python-package/lightgbm/callback.py)
        else:
            return f"{value[0]}'s {value[1]}: {value[2]:g}"
    else:
        raise ValueError("Wrong metric value")
Collaborator

We're losing this error. Not sure if it can happen, but it may be worth adding a check at the start like:

if len(value) not in [4, 5]:
    raise ValueError("Wrong metric value")

Collaborator Author

Honestly, I don't think this is a very helpful error message. "Wrong metric value" doesn't really describe the problem here, and you'll end up walking the stacktrace to find this point in the code anyway.

Originally in this PR, I was thinking we could just keep this simple and let Python's "too many values to unpack" or "not enough values to unpack" tuple-unpacking errors convey that information. But thinking about it more... removing this exception means that you could now implement a custom metric function that returns tuples with more than 5 elements and LightGBM would silently accept it. I think it's valuable to continue preventing that, to reserve the 6th, 7th, etc. elements for future LightGBM-internal purposes.

I'll put back an exception here using logic like you suggested... but changing the message to something a bit more helpful.
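For example, a check along those lines with a more descriptive message might look something like this (the wording here is just an illustration, not necessarily what ends up in the PR):

if len(value) not in (4, 5):
    raise ValueError(
        f"Expected an evaluation result tuple with 4 or 5 elements, got {len(value)}. "
        "This can happen if a custom metric function returns an unexpected number of values."
    )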

Collaborator Author

Alright so I tried to add an error message here, and then write a test to check that it's raised... and I couldn't find a public code path that'd allow a tuple with too few or too many elements to get to this point.

Here's what I tried:

import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=10_000, n_features=3)

def constant_metric(preds, train_data):
    # returns 4 elements (should be 3)
    return ("error", 0.0, False, "too-much")

dtrain = lgb.Dataset(X, label=y).construct()
dvalid = dtrain.create_valid(X, label=y)

bst = lgb.train(
    params={
        "objective": "regression"
    },
    train_set=dtrain,
    feval=[constant_metric],
    valid_sets=[dvalid],
    valid_names=["valid0"]
)

That gets stopped earlier:

[LightGBM] [Info] Total Bins 765
[LightGBM] [Info] Number of data points in the train set: 10000, number of used features: 3
[LightGBM] [Info] Start training from score -0.471765
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/engine.py", line 329, in train
    evaluation_result_list.extend(booster.eval_valid(feval))
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/basic.py", line 4445, in eval_valid
    return [
           ^
  File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/basic.py", line 4448, in <listcomp>
    for item in self.__inner_eval(self.name_valid_sets[i - 1], i, feval)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jlamb/miniforge3/envs/lgb-dev/lib/python3.11/site-packages/lightgbm/basic.py", line 5214, in __inner_eval
    eval_name, val, is_higher_better = feval_ret
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 3)

Here:

feval_ret = eval_function(self.__inner_predict(data_idx), cur_data)
if isinstance(feval_ret, list):
    for eval_name, val, is_higher_better in feval_ret:
        ret.append((data_name, eval_name, val, is_higher_better))
else:
    eval_name, val, is_higher_better = feval_ret
    ret.append((data_name, eval_name, val, is_higher_better))

So I think that maybe this error message we're talking about was effectively unreachable.

And I don't think we should add a custom and slightly more informative error message there in Booster.__inner_eval():

  • that code will raise a ValueError if anything other than exactly a 3-item tuple (or list of such tuples) is returned (see the short snippet after this list)... so no need to add more protection against tuples of other sizes
  • that part of the code is already kind of complicated
  • that part of the code runs on every iteration, so adding an if len(...): raise check would add a bit of extra overhead to every iteration
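To spell out that first point, plain tuple unpacking already rejects anything that isn't exactly 3 items, with a clear built-in error. A minimal standalone illustration (not LightGBM code, just the same unpacking pattern):

# hypothetical 4-item return value from a custom metric function
feval_ret = ("error", 0.0, False, "too-much")
try:
    eval_name, val, is_higher_better = feval_ret
except ValueError as err:
    print(err)  # too many values to unpack (expected 3)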

@jameslamb jameslamb requested a review from jmoralez December 14, 2024 05:17
Collaborator

@StrikerRUS StrikerRUS left a comment

LGTM! Nice refactoring!

@jameslamb
Collaborator Author

Thanks! I'll merge this and then continue with #6748 soon.

@jameslamb jameslamb merged commit 480600b into master Dec 16, 2024
48 checks passed
@jameslamb jameslamb deleted the python/format-eval-result branch December 16, 2024 16:45