Aligning views on stability guarantees, their implications, and community perception #295
Replies: 15 comments 65 replies
-
As a (the?) proponent of 2, I'm going to slightly rephrase the argument, though I don't terribly disagree with it as-is. But if I were to strengthen it without elaborating too much beyond a simple initial paragraph:
-
Using semver semantics, the way I think about it is that we've been releasing a major version for each draft and patch versions for clarification/bugfix updates. So draft-07 is essentially 7.0.0 and the draft-07 update was 7.0.1. What we're talking about here is shifting from releasing major versions to releasing minor versions. So 2023 would be something like 10.0.0, 2024 would be 10.1.0, and clarification/bugfix updates to 2024 would be 10.1.1.

So option 1 is like making one last major-version release, and option 2 is like starting minor-version updates from the last release. The question is: do we want to release 9.1.0 or 10.0.0? Personally, I think it makes perfect sense to go with a new major version as we start a new way of operating, but mostly I just don't want to start off with the baggage of things we never would have included if we had known they would end up set in stone.

There are known problematic issues. The biggest is that collecting unknown keywords is incompatible with the forward-compatibility guarantees we want to provide. That has to change, which means we need a new major version anyway (option 1). So the question isn't "do we change things?" but rather "how much should we change things?" My position is that we change anything that needs to be changed. Anything with known issues should be changed, marked as experimental, or removed. That should be a relatively small set of changes.
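To make the numbering analogy concrete, here is a minimal Python sketch. The version numbers are the ones used in the comment above; the `Version` type, the `RELEASES` table, and the `is_backward_compatible` helper are hypothetical names introduced only for illustration, not part of any JSON Schema tooling.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Release-to-version mapping taken from the analogy in the comment above.
# The labels are illustrative; no release is actually numbered this way.
RELEASES = {
    "draft-07": Version(7, 0, 0),
    "draft-07 update": Version(7, 0, 1),
    "2023": Version(10, 0, 0),
    "2024": Version(10, 1, 0),
    "2024 update": Version(10, 1, 1),
}


def is_backward_compatible(old: Version, new: Version) -> bool:
    """Semver rule: a newer release keeps working for users of an older
    one only if both share the same major version."""
    return new.major == old.major and new >= old


# A minor-version step (2023 -> 2024) keeps compatibility ...
print(is_backward_compatible(RELEASES["2023"], RELEASES["2024"]))
# ... while a major-version step (draft-07 -> 2023) does not promise it.
print(is_backward_compatible(RELEASES["draft-07"], RELEASES["2023"]))
```

Under this reading, option 1 corresponds to publishing one more release with a new `major` value and only bumping `minor` and `patch` afterwards.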
-
Do you mean "let's not discuss this here" or "this is not in question and therefore true"?
-
Are the stability guarantees expected to disallow the kind of changes that were made in previous draft releases, or merely to formalize the kind of guarantees that were already implicitly being followed? I.e. does 1) mean "the next publication will contain a comparable amount of breakage to 2020-12 and previous drafts, and we commit to doing less breakage in subsequent publications", or does it mean "the next publication will contain unusually large breaking changes compared to 2020-12 and previous drafts; subsequent publications will revert to the normal level of backward and forward compatibility, and augment this with an explicit formal definition of what that means"?
-
Speaking as a user, please just take whatever you currently have right now and stick with it. The value of JSON Schema is the ability to use tools and libraries that have been designed around it. Every time a new version with breaking changes is released, this causes significant inconvenience for people who have built production systems around the spec and third-party libraries. At this point there are no further changes that could be made to the spec whose benefits would outweigh the costs to those who are already using the spec and the ecosystem of software built around it. Declaring JSON Schema to be "complete" and promising not to make any future backwards-incompatible changes is the single most valuable thing that could be done right now.
-
Thank you for asking this question. As I see it, any breaking changes at this point will push adoption of any new standard years and years into the future. If you make breaking changes now, I would reconsider how I write parsers and validators: should it be one code base or multiple code bases? One major game-changer would be to have official schema parsers and validators in enough languages to cover the majority of popular programming languages. Performance should of course be considered, but they wouldn't need to be the absolutely most performant libraries out there. Personal note: being forgiving where needed would probably be necessary. I think that looking at how the Protocol Buffers and Docker Compose specifications work is a good place to start. Besides that, clear semver adherence makes you trust that things "just work". I am certain that you could get a lot of contributors to such a project. Sorry for the rant. I guess the conclusion is that I hope you go with one of 2 options:
I personally hope you go with the last option, as I think this would help the adoption of JSON Schema in general as well.
-
I'm not going to be able to write a reply to each of the posts for the next week at least, but I will quickly try to inspire an appreciation for how nuanced this problem is.

The purpose of proposing a media type through the "Internet-Draft" system is so that implementations can be written, experimented with, and used to produce iterations of the specifications. Although it is common to formalize protocols that are de-facto standards in production, it's not necessarily the case that every protocol (or format/media type) is safe to use in production. While a feature being used in production "in the wild" generally shouldn't be changed, the only guarantee it will remain stable is the publication of a media type that defines it to work that way. In the absence of a published media type, a specification can include all the explicit stability guarantees it wants, but if a subsequent publication can change these, then these guarantees aren't meaningful. You have to have a published media type definition.

A published media type can still evolve, though. Evolution is made possible by incorporating "interoperability requirements", in the form of BCP 14 language that says what implementations "SHOULD" or "MUST" do. When specifications are published that describe de-facto standards in the wild, it's usually made possible by having sufficient interoperability requirements from the start, so the specification can still evolve in the draft process while still being adopted in production. There are many mechanisms to enable this: CSS has vendor prefixes, HTTP/2 had ALPN identifiers, and HTML and ECMAScript are each powerful enough that you can support all sorts of "graceful degradation" techniques. JSON Schema does not yet define requirements like this. So, the current goal for the specifications is to add interoperability requirements that permit evolution without disrupting older implementations.
The essence of this is to prohibit unknown keywords, since unknown keywords may represent an intent by the schema author to reject some set of JSON documents from validation. It is correct that the addition of this requirement could break some implementations, but:

(1) It doesn't have to be this way. Not all applications actually rely on this guarantee, so it doesn't have to be a requirement as such. What can happen instead is that applications that require "strict" handling can specify a different default for unknown keywords: instead of unknown keywords being accepted by default, they can be rejected by default, or produce an error by default ("indeterminate result"). I wrote a whitepaper describing this technique, and I encourage everyone here to read it and incorporate the ideas there into your comments.

(2) When a schema author adds arbitrary keywords, this implies an assumption that the validator they are using will not change. But if you are expecting to upgrade your validator and are writing custom keyword names, then unfortunately you've already committed yourself to breakage at some time in the future, and there's nothing that can be done about that. What if we define a new keyword that matches what you've written? We can minimize breakage, but schemas that use nonstandard keywords are flawed from the start, and there are no changes to the spec that can fix that.

Now, what do those interoperability requirements look like? Here are the top priorities, and then I have several more edits to circulate after these... but first things come first:
I would appreciate comments on these issues.
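The three defaults described in point (1) above (accept, reject, or report an indeterminate result) can be sketched as a policy switch on a toy validator. This is only an illustration under stated assumptions: `UnknownKeywordPolicy`, `KNOWN_KEYWORDS`, and `validate` are hypothetical names, the validator understands just two keywords, and nothing here is taken from the whitepaper or from a real implementation.

```python
from enum import Enum
from typing import Any, Optional


class UnknownKeywordPolicy(Enum):
    IGNORE = "ignore"                # accept by default
    REJECT = "reject"                # reject by default
    INDETERMINATE = "indeterminate"  # report that no verdict can be given


# Deliberately tiny keyword set for illustration only.
KNOWN_KEYWORDS = {"type", "minimum"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, but a JSON boolean is not a number.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate(schema: dict, instance: Any,
             policy: UnknownKeywordPolicy) -> Optional[bool]:
    """Return True/False for a definite result, or None for indeterminate."""
    # Known keywords first: a failure here is definite whatever else is present.
    # Only "type": "number" is handled in this sketch.
    if schema.get("type") == "number" and not _is_number(instance):
        return False
    if "minimum" in schema and _is_number(instance) and instance < schema["minimum"]:
        return False

    # An unknown keyword might have been meant to reject this instance.
    unknown = set(schema) - KNOWN_KEYWORDS
    if unknown and policy is UnknownKeywordPolicy.REJECT:
        return False
    if unknown and policy is UnknownKeywordPolicy.INDETERMINATE:
        return None
    return True


# "maximum" is unknown to this toy validator, so the policy decides the outcome.
schema = {"type": "number", "maximum": 10}
for policy in UnknownKeywordPolicy:
    print(policy.value, validate(schema, 50, policy))
```

Checking known keywords before applying the policy reflects the reasoning in the comment: an unknown keyword can only narrow the set of accepted documents, so a definite failure stays a failure.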
-
This comment on HN (and some others) suggests that people are less concerned with breaking changes if there is tooling to easily upgrade their schemas. @jviotti created and @gregsdennis collaborated on AlterSchema, which does just that. If we break things one last time, providing an upgrade path for schema authors will ease that pain.
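As a rough idea of what rule-based upgrade tooling does, here is a minimal Python sketch of a single rule: the rename of `definitions` to `$defs` that came with 2019-09. It does not use or imitate AlterSchema's actual API; `upgrade_draft07_to_2019_09` and `_rewrite_refs` are hypothetical names, and a real upgrader would need many more rules.

```python
DRAFT_07 = "http://json-schema.org/draft-07/schema#"
DRAFT_2019_09 = "https://json-schema.org/draft/2019-09/schema"


def _rewrite_refs(node):
    """Return a copy of `node` with local "#/definitions/..." references
    pointing at "#/$defs/..." instead.

    Limitation of this sketch: it rewrites every "$ref" string it finds and
    does not tell schema positions apart from data positions (for example
    values inside "const" or "enum").
    """
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if (key == "$ref" and isinstance(value, str)
                    and value.startswith("#/definitions/")):
                result[key] = "#/$defs/" + value[len("#/definitions/"):]
            else:
                result[key] = _rewrite_refs(value)
        return result
    if isinstance(node, list):
        return [_rewrite_refs(item) for item in node]
    return node


def upgrade_draft07_to_2019_09(schema: dict) -> dict:
    """Apply one upgrade rule; the input schema is left unmodified."""
    upgraded = _rewrite_refs(schema)
    # Only the root-level container is renamed, so a property that happens
    # to be called "definitions" elsewhere in the schema is left alone.
    if "definitions" in upgraded:
        upgraded["$defs"] = upgraded.pop("definitions")
    if upgraded.get("$schema") == DRAFT_07:
        upgraded["$schema"] = DRAFT_2019_09
    return upgraded


old = {
    "$schema": DRAFT_07,
    "definitions": {"id": {"type": "integer"}},
    "properties": {"id": {"$ref": "#/definitions/id"}},
}
print(upgrade_draft07_to_2019_09(old))
```

Returning a new object instead of editing in place lets a schema author diff the old and new versions before committing to the upgrade.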
-
On Tue, Jan 31, 2023, 3:32 PM Greg Dennis ***@***.***> wrote:

> > Updating non-trivial JSON Schema tooling to support another version of the specification is often not an easy ride.
>
> This strikes at the heart of the issue. By promising that there will be no more breaking changes, we make the process to support newer versions exponentially simpler.
>
> The question for this discussion, then, is *can* we (is it possible to) make that promise without first including one last set of breaking changes?

No, that's a pipe dream. Specs tend to change over time. When you have breaking changes, there's a deprecation schedule for the unfortunately-namespace-versioned API, and you increment the MAJOR field of, e.g., a SemVer MAJOR.MINOR.PATCH version so that people can tell *from the version number* that there could be breaking changes that require fixing their implementation.

For example, SPARQL 1.2 and related specifications are being revised to support RDF-star and SPARQL-star years after they were thought to be frozen: w3c/rdf-star-wg#4 (comment)

"Call for breaking changes before this funded major revision" would be more realistic IMHO.

Because e.g. https://github.com/lexiq-legal/pydantic_schemaorg isn't sufficient to describe all of the preconditions and postconditions, it's probably safe to say that developers will need unspecified custom keywords, or yet another schema document with the same data shapes.
-
Something relevant but not covered here: How frequently are new drafts going to be published? If I were a spec author, I would be concerned about any future-compatible promise which I am not, as an author, ever allowed to break. Especially because it can be hard to define what a compatible change is. What about making the future-compatibility guarantee only ever one or two steps into the future?
-
Unless I missed it, "breaking change" hasn't been clearly defined. Is it a breaking change if Is it a breaking change if draft-next's base metaschema includes a new vocabulary that defines a keyword that might overlap with a keyword that appears in someone's schema where they were assuming that it's purely annotative and otherwise ignored? After all, if they're requesting evaluation under We're already making breaking changes in draft-next: for example, the introduction of Extending that thought some more: I don't think we can possibly consider the addition of new keywords as "breaking", even if/when we declare that unknown keywords are prohibited by default. The way to handle unknown keywords in a future draft would be to declare a vocabulary that defines them (with a schema of

Therefore, we need to be clear about what constitutes a "breaking change". After that, I think I am in favour of declaring that we'll (try to) avoid making such changes, but I want to know exactly what I'm agreeing to first.
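The overlap scenario raised above, where a newly defined keyword collides with a keyword an author assumed was ignored, can be checked mechanically. The following Python sketch is an illustration only: `keyword_collisions` is a hypothetical helper, the keyword names are invented, and only the top-level schema object is inspected.

```python
def keyword_collisions(schema: dict, known: set, newly_defined: set) -> list:
    """List the keywords in `schema` that are unknown to the current release
    but would gain a defined meaning in the next one.

    Only the top-level schema object is examined in this sketch; subschemas
    are not walked.
    """
    custom = set(schema) - known
    return sorted(custom & newly_defined)


# Hypothetical data: "units" is a custom annotation the author expected
# validators to ignore, and an imagined future vocabulary also defines it.
current_keywords = {"type", "properties"}
next_release_adds = {"units"}
schema = {"type": "number", "units": "kg", "x-note": "internal"}

print(keyword_collisions(schema, current_keywords, next_release_adds))
```

A non-empty result marks the schemas for which adding a keyword changes behaviour, which is one candidate meaning of "breaking change" that the thread would need to accept or rule out.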
-
DECISION: After getting feedback from the community (thanks to everyone!), we have decided to move forward with option 1: including minimal breaking changes that allow us to promise that there will be none in subsequent versions, with the caveat that we provide as easy an upgrade path as possible, including instructions and possibly tooling.
Further discussion about dropping support for unknown keywords will continue in this discussion, which already contains several proposed solutions.
Recently I asked the core members to provide their thoughts on the stability status of the features we have in the spec. The collated results focused on keywords and revealed two primary opinions:
Until this conversation, I was firmly of opinion 1. However, now I'm on the fence. This discussion aims to resolve this difference.
Not in question: