Section 9.1 AvailableLocales shouldn't require base language if script is present #947

sffc · 2024-12-10T01:28:24Z

It seems like it should be allowed for AvailableLocales to support zh-Hant but not zh, since zh implies zh-Hans. However, the spec currently states:

Additionally, for each element with more than one subtag, it must also include a less narrow language tag with the same language subtag and a strict subset of the same following subtags (i.e., omitting one or more) to serve as a potential fallback from ResolveLocale.
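
As a rough illustration of what the quoted requirement implies, the sketch below reports the tags in a candidate list that lack a less narrow fallback. The helper name `findMissingFallbacks` is hypothetical and not part of ECMA-402, and it assumes canonicalized tags without extension or private-use sequences.

```typescript
// Hypothetical helper, not part of ECMA-402: returns the tags in `available`
// that lack the less narrow fallback described by the quoted requirement.
// Assumes canonicalized tags with no extension or private-use sequences.
function findMissingFallbacks(available: string[]): string[] {
  const missing: string[] = [];
  for (const tag of available) {
    const [language, ...rest] = tag.split("-");
    if (rest.length === 0) continue; // single-subtag tags need no fallback
    const covered = available.some((candidate) => {
      const [candLanguage, ...candRest] = candidate.split("-");
      return (
        candLanguage === language &&
        candRest.length < rest.length && // strictly fewer following subtags
        candRest.every((subtag) => rest.indexOf(subtag) !== -1)
      );
    });
    if (!covered) missing.push(tag);
  }
  return missing;
}

// "zh-Hant" has no covering "zh", so it is reported; "en-US" is covered by "en".
console.log(findMissingFallbacks(["zh-Hant", "en", "en-US"]));
```

Under this reading, a list containing zh-Hant without zh fails the requirement, which is the case this issue asks to allow.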

gibson042 · 2024-12-19T20:43:02Z

since zh implies zh-Hans

Is there any spec text supporting this claim? I would consider it perfectly reasonable for an implementation that has data for "zh-Hant" but not "zh-Hans" to use the former in service of requested locale "zh", and any application that specifically warrants "zh-Hans" should be specific.

On the other hand, it would seem bizarre and in violation of the spirit (if not also the letter) of BCP 47 to support narrow data in absence of covering broad data. Some excerpts:

«A language tag is composed from a sequence of one or more "subtags", each of which refines or narrows the range of language identified by the overall tag»
«In the lookup scheme, the language range is progressively truncated from the end until a matching language tag is located. Single letter or digit subtags (including both the letter 'x', which introduces private-use sequences, and the subtags that introduce extensions) are removed at the same time as their closest trailing subtag.»
«For example, a user who reads both Simplified and Traditional Chinese, but who prefers Simplified, might use the range "zh" for filtering (matching all items that user can read) but "zh-Hans" for lookup (making sure that user gets the preferred form if it's available, but the fallback to "zh" will still work)»
«Whether a subtag adds distinguishing value can depend on the context of the request… If the user cannot be sure which scheme is being used (or if more than one might be applied to a given request), the user SHOULD specify the most specific (largest number of subtags) range first and then supply shorter prefixes later in the list to ensure that filtering returns a complete set of tags.»
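
The lookup scheme quoted above can be sketched as follows. This is a simplified illustration rather than a complete RFC 4647 implementation: the function name `lookupTag` is hypothetical, and matching here is case-sensitive on the assumption that all tags are already canonicalized.

```typescript
// Simplified sketch of the truncation-based lookup scheme.
// Assumes canonicalized tags, so comparison is plain string equality.
function lookupTag(range: string, available: string[]): string | undefined {
  let parts = range.split("-");
  while (parts.length > 0) {
    const candidate = parts.join("-");
    if (available.indexOf(candidate) !== -1) return candidate;
    parts = parts.slice(0, -1); // truncate from the end
    // A single letter or digit subtag is removed together with the
    // subtag that trailed it.
    while (parts.length > 0 && parts[parts.length - 1].length === 1) {
      parts = parts.slice(0, -1);
    }
  }
  return undefined; // no match; the caller chooses a default
}

console.log(lookupTag("zh-Hans-CN", ["zh", "zh-Hant"])); // falls back to "zh"
console.log(lookupTag("zh-Hans", ["zh-Hant"])); // undefined: truncation never reaches "zh-Hant"
```

The second call shows why covering broad data matters under this scheme: truncation only ever moves toward the bare language subtag, so a narrow tag such as zh-Hant is unreachable from a request for zh or zh-Hans unless something beyond plain truncation is applied.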

I don't think that's invalidated by Unicode likelySubtags logic, which improves "best case" results but should not preempt such "worst case" scenarios.

sffc · 2024-12-21T05:03:09Z

TG2 discussion: https://github.com/tc39/ecma402/blob/main/meetings/notes-2024-12-19.md#section-91-availablelocales-shouldnt-require-base-language-if-script-is-present-947

sffc added the labels "c: meta" (Component: intl-wide issues) and "s: discuss" (Status: TG2 must discuss to move forward) Dec 10, 2024

sffc added this to ECMA-402 Meeting Topics Dec 10, 2024

sffc moved this to Priority Issues in ECMA-402 Meeting Topics Dec 10, 2024

sffc mentioned this issue Dec 10, 2024 in "Specify language tag fallback support" webmachinelearning/writing-assistance-apis#17 (merged)

sffc moved this from Priority Issues to Previously Discussed in ECMA-402 Meeting Topics Dec 19, 2024
