-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quantulum incorrect for prefixes that are not capitilized if the unit is incorrectly capitilized #161
Comments
I see the issue, the main problem being that for no match given the actual capitalization of the input, the closest match for all-lower input is given (completely ignoring capitalization). The most consistent way around this would be to return the first match obtained with as little capitalization changes as necessary (potentially preferring changing from lower to upper over upper to lower). |
Why not leave capitalization intact for matching a no-match object? Especially for prefixes this is a huge difference. |
What do you mean by "leave capitalization intact"? If units were compared without changing capitalization, "mw" would neither match "MW" nor "mW" which are the only valid official versions of writing Mega- or milliwatt respectively. Switching to all-lower, all of "mw" is matched to "MW" as well as "mW" which both are valid units. I see that another option would be to give preference to the match that is closest to the original string (which would return mW here, as expected) |
Then I don't know exactly how quantulum works. I thought there was machine learning doing the matches, not exact matching? I also learned you will get random output. There is a 50 / 50 one you get either one. |
It depends. Most of the tool is a very complex regex-based mechanism, it will turn up with "1pw" or "1mw". The unit in the back is then compared to all known units. If multiple units match, either a simple heuristic or the machine learning part is used to decide which unit has the highest likelihood of being correct. Hmm random output sounds bad. On the exact same input? Once trained, the output of the trained network should be deterministic. |
Yes, random output is bad. At least a warning should be thrown if picking a random option. Anyway, I made a PR. |
Describe the bug
When a unit is incorrectly capitilized, the SI-prefix will be read wrong.
To Reproduce
Expected behavior
Note
pW
andPW
,mW
andMW
are correctly read.Additional information:
pw
will not be read if not installed)The text was updated successfully, but these errors were encountered: