Status of value semantics? #387
Hi @AprilArcus, I'm hoping something more complete can be written up on the status of this soon. In lieu of that, a short summary is that we have received feedback from multiple major JS engine implementation teams that their investigations into implementing `===` value equality raised significant complexity and performance concerns. This was not the only feedback, but it is maybe the most significant in terms of the proposal now looking at other potential designs for how the equality could work. I'll be sure to post an update here when there is a link to more information.
@acutmore thanks for the update! does this preclude different R&T instances with equivalent values being `===` to one another?
What about using these types as keys in `Map` and `Set`?
@acusti most likely it does preclude that. That said, there could be a new API for comparing two records or tuples by their contents.

@demurgos, similar to above: it's likely that by default R&T would behave the same as any other object in `Map` and `Set`.
Thank you for your reply. It feels like R&T will end up being syntax sugar for deeply frozen objects.
@demurgos the R&T champions and other delegates are still very much interested in supporting the use cases that R&T would solve. For example, composite keys, as you mention, are definitely something I would like to see better supported natively, as they are extremely painful to build in userland. However, for various compatibility and performance reasons, it likely will require some kind of explicit opt-in.
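To make the userland pain concrete, here is a minimal sketch (all names illustrative) of the stringify-based workaround that native composite keys would replace:

```js
const cache = new Map();

// Common userland workaround: serialise the compound key to a string.
function keyFor(obj) {
  // Sorting keys papers over property-order differences at the top
  // level, but this still breaks on nested objects, throws on BigInt,
  // and conflates 0 with -0.
  return JSON.stringify(Object.keys(obj).sort().map(k => [k, obj[k]]));
}

cache.set(keyFor({ x: 1, y: 2 }), "found");
console.log(cache.get(keyFor({ y: 2, x: 1 }))); // "found"
console.log(cache.get(keyFor({ x: 1, y: 3 }))); // undefined
```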
If `===` can't be made to work for records and tuples, could a new operator be introduced for value comparison?
Hi @AprilArcus, are there particular situations where you would find having an operator useful over having to use a function call? Adding a new overload to `===` is the part that implementations have pushed back on. Adding a new operator, like any new syntax, tends to need a stronger justification compared to adding new API. This is likely because the remaining design space is somewhat limited (only so many non-letter ASCII characters to choose from), and documentation for syntax can be harder to look up (IDEs currently tend to expose docs for APIs but not syntax). A new operator could also be researched as a separate follow-on proposal.
If the problem (interning) already exists with strings (does it?), why is it not acceptable for R&T? Shouldn't the problem be addressed in the same way, i.e. instead of introducing, e.g., a separate equality API? I'm sure there's something(s) I'm missing though, because I didn't follow the bit about why that would "add an extra load instruction to existing usages of `===`".
I'd rather not have R&T at all than have them without value comparison semantics.
Hi @gregmartyn, I'll try and respond as best I can!
You are correct that comparing strings with `===` already involves a content comparison rather than a pure pointer comparison, so in that sense a form of this cost exists in the language today.
This does not apply to all JS engines; it depends on the information that comes with their 'tagged pointers'. In some engines the pointer itself encodes that it points to a JS-object or a JS-string, so `===` can decide whether a content comparison is needed without reading memory. A record or tuple, however, would look like an ordinary heap object from its tag alone, so `===` would have to load type information from the heap to decide whether to compare contents - adding an extra load instruction to existing usages of `===`.
Myself, I'd still prefer this to go forward, even without them. fwiw, even though I agree it'd be ideal to have strict equality, a dedicated equality API could cover many of the use cases. None of these options are ideal, but moving forward with R&Ts without strict equality doesn't close these doors, and decoupling it from this requirement seems to be an acceptable path forward. A future, separate proposal could address `===` support. Also, I'm not sure changing the behaviour of `===` later would even be feasible.
A future proposal would never ever be capable of changing how any operator behaves with a specific kind of object. Once R&T ships as a non-primitive, that is forever an unavailable path for them.
If we take a step back, what is even the goal that this proposal is trying to achieve? Is it a new feature or new syntax? In a reply above I was disappointed in the scope reduction but felt the sugar may still be useful. Since then I had some time to think more about it, and I don't want to give up the value semantics. Please keep them, they are the core of this proposal. The first paragraphs of the README are (emphasis mine):

> This proposal introduces two new **deeply immutable** data structures to JavaScript:
>
> - `Record`, a deeply immutable Object-like structure `#{ x: 1, y: 2 }`
> - `Tuple`, a deeply immutable Array-like structure `#[1, 2, 3, 4]`
The expectation from the start was that the goal was to get compound primitives, aka value semantics. These are an incredibly powerful feature because they fully integrate with regular operators and methods. I don't want to be too harsh on all the people working on this who want to drop value semantics, but this is also one of the most important ongoing proposals for JS. Syntax sugar can be great, but it is not the core of this proposal.

I understand that compound primitives are harder to support, but they may lead to a way better language in the long run. The JS module system is one of the best of any language I ever used; it required huge amounts of effort, but the result was well worth it. Native async support was also great and deeply transformed JS for the better. R&T with value semantics is one of those landmark proposals that should be fully realized IMO.

JS does not have stuff like operator overloading, so R&T is the best shot to get first-class language support for composite values without stringifying the world. Compound primitives are also a solid foundation to expand on in the future, so it's better to take a bit more time and have the best version of them that we can.
The most common case I run into where I feel like records and tuples would be a godsend is when I am trying to use structured data as a map key, without having to encode it as a string or resort to any other hacks. However, this doesn't work at all if records / tuples are just frozen objects instead of primitives. Although I appreciate the technical difficulty of adding them, I feel like it isn't worth it without this key feature.
That use case - being able to control the "key" of Maps and Sets - is valuable to solve, and doesn't need R&T to do so.
Whilst you are right that R&T is not needed for this use case, it feels natural to use an immutable, compared-by-value data structure for a Map key, doesn't it? It is simply my opinion that the real value of R&T isn't so much the immutability as the possibilities that are opened by a complex data primitive.
Immutability is a requirement for any stable keying (it's also the reason the value must not be key-comparable before being stable). What is not automatically needed is for the composite key value to expose the values composing it. The original R&T proposal solved this by making values born deeply immutable, which made the value itself usable as a composite key. What the champions are now exploring is the problem space if the immutable R/T and the composite key are not the same thing.
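A minimal sketch of that separation, assuming a hypothetical `compositeKey`-style helper (similar in spirit to the separate "Richer Keys" exploration): the returned key is stable but reveals nothing about its components.

```js
// Illustrative only: derive a canonical key from a sequence of values
// via a trie of Maps. Two calls with SameValueZero-equal components
// return the same frozen, empty object.
const root = new Map();
const LEAF = Symbol("leaf");

function compositeKey(...values) {
  let node = root;
  for (const v of values) {
    if (!node.has(v)) node.set(v, new Map());
    node = node.get(v);
  }
  if (!node.has(LEAF)) node.set(LEAF, Object.freeze(Object.create(null)));
  return node.get(LEAF); // exposes nothing about `values`
}

const positions = new Map();
positions.set(compositeKey(1, 2), "treasure");
console.log(positions.get(compositeKey(1, 2))); // "treasure"
// Caveat: this sketch retains components forever; a real design would
// hold object components weakly.
```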
Having key semantics different from equality leads to many inconsistencies. It's possible to build many examples, but they all boil down to expecting that a Map/Set having a key implies that there exists an item equal to the key. In particular, I expect all three functions below to be equivalent*.

```ts
function setHas1(s: Set<unknown>, x: unknown): boolean {
  return s.has(x);
}

function setHas2(s: Set<unknown>, x: unknown): boolean {
  for (const v of s) {
    if (v === x) {
      return true;
    }
  }
  return false;
}

function setHas3(s: Set<unknown>, x: unknown): boolean {
  return [...s].indexOf(x) >= 0;
}
```

The strength of value semantics is that it's not a special case: `has` agrees with iteration and with `===`.

*: Yes, I'm aware of `NaN` and the `SameValueZero` algorithm used by Set; treat that as the asterisk on "equivalent".
@demurgos that's not an assumption that holds for subclasses, so it's a false assumption. Only the first is valid.
I don't think anyone has suggested changing the default key semantics of Map/Set. However one line being explored is that Map/Set could be configured at construction with a keying function to derive the keying value from the provided "key" (This is roughly equivalent to an alternative Map/Set implementation). The keying value would have normal equality / SameValue semantics.
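As a rough illustration of that configuration idea (the constructor signature here is hypothetical, not a proposed API), a userland equivalent might look like:

```js
// Hypothetical shape of a Map configured with a keying function.
class KeyedMap {
  #inner = new Map(); // keying value -> [originalKey, value]
  #keyBy;
  constructor(keyBy) { this.#keyBy = keyBy; }
  set(key, value) {
    this.#inner.set(this.#keyBy(key), [key, value]);
    return this;
  }
  get(key) { return this.#inner.get(this.#keyBy(key))?.[1]; }
  has(key) { return this.#inner.has(this.#keyBy(key)); }
  *keys() { for (const [k] of this.#inner.values()) yield k; }
}

const byId = new KeyedMap(user => user.id);
byId.set({ id: 7, name: "Ada" }, "admin");
console.log(byId.get({ id: 7, name: "(anything)" })); // "admin"
```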
I don't believe a program can assume the behavior of dynamically invoked methods on parameters. Anyone today can already provide a custom map/set/array instance with overridden methods. The program can only trust the behavior of instances it itself creates (modulo prototype pollution, which is an orthogonal concern). Since no one is proposing to change the default behavior of existing Map/Set, I don't believe there is a problem.
I understand that JS is dynamic by nature, everything can be patched, etc. But I would still argue that any subclass or method override where iteration disagrees with membership no longer provides a consistent `Set`. JS already has multiple ways to check for equality (`===`, `Object.is`, and the `SameValueZero` algorithm used by Map and Set).
The keying function idea seems a bit awkward to me, because it's possible that the keying function will return different values for the same original key over time (eg, if the key object is mutated). This means that either the mapped key needs to be stored indefinitely in the map, or lookups become inconsistent:

```js
const map = new Map(k => JSON.stringify(k));
const k = { foo: 42 };
map.set(k, "hello");
k.foo++;
map.set(k, "world");
```

Presumably the map now holds two entries for the one key object. I can imagine there might be some use for it, but I think if it becomes the main way to implement "compound keys", it could be a big footgun. Having a separate constructor for compound keys just seems more sound, but it does indeed just sound like a less useful version of tuples (since they're basically just tuples that can't be read).
It's the responsibility of the creator of the Map/Set to use a keying function that's stable. Relying on the data in the key itself while the key is mutable is obviously not compatible with stability, which is why I mentioned that immutability is a requirement for any stable keying.
Yes, the Map implementation would not rely on the keying function for existing entries, not just because of stability, but because otherwise it would expose the internal implementation of the Map to the keying function.
Right, a composite key is a useful building block for custom collection keying: a `keyBy` option can just return a composite key. I personally believe it actually should be the only accepted return value of `keyBy`, to avoid potential footguns like you describe.
How does that avoid the footgun? Presumably the `keyBy` function could still derive a different composite key for the same object after a mutation?
My expectation is that a keying function would only ever be called once for any given value, and the result memoized. |
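A sketch of the memoization being described, assuming object keys are cached by identity in a WeakMap (primitive keys pass through uncached):

```js
// The keying function runs at most once per object key; later calls
// return the cached derivation even if the object was mutated.
function memoizeKeyBy(keyBy) {
  const cache = new WeakMap(); // object key -> derived keying value
  return k => {
    if (typeof k !== "object" || k === null) return keyBy(k);
    if (!cache.has(k)) cache.set(k, keyBy(k));
    return cache.get(k);
  };
}

const keyFor = memoizeKeyBy(k => JSON.stringify(k));
const k0 = { foo: 42 };
keyFor(k0); // keyBy runs
k0.foo++;
keyFor(k0); // cached: still '{"foo":42}'
```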
While a dedicated key-creating API does not prevent mutated objects being unstable sources of key generation, it does still have benefits compared to a free-form keying function.

While the primitive, value-semantics R&T design guarantees stability, the same indirection issues can arise because R&T could only store other primitives. So indirection (via symbols in WeakMaps) was required to store non-primitive values (e.g. functions or dates).

So to me the question is less "is it possible to create an odd scenario with the API?" and more "do the APIs encourage good patterns by default, and do less safe usages of the API clearly stand out when reading the code?"
If that's taken literally, it means it would need to remember multiple original key values, eg:

```js
const map = new Map(k => {
  console.log("called");
  return JSON.stringify(k);
});
const k0 = { foo: 42 };
const k1 = { foo: 42 };
map.set(k0, "hello"); // "called"
map.set(k1, "world"); // "called"
map.get(k0); // ??
map.get(k1); // ??
```

Either both distinct key objects (`k0` and `k1`) are remembered as originals, or only one of them is. The former sounds very weird to me, since repeated sets of the "same" key will grow the map. Perhaps the implementation can mitigate this by having an implicit weak map associated (so temporary objects at least are not retained), but this seems like a lot of undesirable overhead.
yes, i'd expect both of those `get` calls to return "world", since both keys derive the same keying value.
I'm just a JS developer, and not part of the standards body, so this is just my two cents, but my preference would be to either retain the value semantics or abandon the entire proposal. Records and tuples with value semantics would be transformative. As I'm sure everyone posting here already knows, many popular libraries and frameworks in the JavaScript ecosystem (React, Redux, RxJS, etc) are "reactive" in the sense that they involve computing the difference between a new state and a prior state and taking some action based on the changes. Any programmer who works professionally with these libraries always needs to be aware of the reference semantics of objects and arrays; carelessly replacing one object or array with another, identical object or array can trigger a large amount of unnecessary downstream work or even an infinite loop. Records and tuples with value semantics would make it much easier to write correct, performant code in these situations. Records and tuples without value semantics, by contrast, don't really help with this problem at all. It doesn't seem to me that they would offer much value in terms of solving my day-to-day problems programming in the JS ecosystem. Since there's a limited amount of nice syntax available in the language, if value semantics are not on offer, my preference would be to save the syntax for something more impactful.
That wouldn't be a breaking change, at least not the way those words are usually used. Do you have a link to the discussion where browsers refused a "canBeHeldWeakly" predicate? I'd hope that if we went this route, such a predicate would be palatable.
The reason they rejected it, as I understand, is because they didn't want the answer such a predicate gives, for a given value, to change over time. The counterargument I and others offered was that that was the entire benefit of such a predicate - that you didn't have to look at the object, you would just branch on the predicate, and so it would be safe to migrate all current checks to use it, since it would always do what's expected even if the set of things changed. I can't locate the specific point in the notes, but I believe @syg was one of the more vocal folks on the other side of that discussion?
Ah, sure. But now that we have symbols as WeakMap keys, it wouldn't change over time - there are no more values for which the answer to that question could plausibly go from "false" to "true", and of course you can't change the answer from "true" to "false" for anything without that actually being a breaking change (because people could have been using those values in a WeakMap). So if that was the only objection, I don't see a reason not to add the predicate now, especially if we take the "tuples are objects but can't be held weakly" route.
Because current code will already exist that says, for example, `if (typeof x === "object" && x !== null) weakMap.set(x, data);` and relies on that not throwing for any object.
That code will not break unless someone passes one of the new values in. That's not a breaking change any more than adding a new kind of primitive would be - in a very similar vein, lots of code assumes that it has an exhaustive switch over possible `typeof` results. "Old code will break if you give it a new kind of value" isn't something we regard as a breaking change.
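For instance, the "exhaustive" dispatch pattern in question looks something like the sketch below, and it already "broke" in this narrow sense when `symbol` and then `bigint` were added:

```js
function describe(x) {
  switch (typeof x) {
    case "undefined": case "boolean": case "number":
    case "string": case "object": case "function":
      return typeof x;
    default:
      // Unreachable in ES5-era code; reachable once symbol/bigint arrived.
      throw new TypeError(`unexpected type: ${typeof x}`);
  }
}
```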
Right, the compat requirement is that the predicate does not change its return value for any particular value over time. Adding new kinds of values means you have to decide on what the new return value ought to be for the predicate, which isn't the same thing as changing the result for existing values.
Fair enough. I still think that making the description of "what can be weakly held" even more complex, absent a predicate, is not a positive change in the language.
WeakSets and WeakMaps do guarantee that values are held as long as the key is reachable, so no, it can't just do nothing. But throwing is fine.
If we wanted to avoid impacting performance of existing applications, couldn't the performance penalty of equality be paid when the record or tuple is created, rather than on every comparison?
Yes, this approach is referred to as "interning" and is how the polyfill works. There are downsides to this approach.
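Roughly, interning means paying a table lookup at construction so that comparison can be plain pointer identity. A minimal sketch, using the same trie idea as the composite-key sketch above (illustrative, not the polyfill's actual code):

```js
const table = new Map(); // trie: component value -> next node
const CANONICAL = Symbol("canonical");

function tuple(...values) {
  let node = table;
  for (const v of values) {
    if (!node.has(v)) node.set(v, new Map());
    node = node.get(v);
  }
  // First construction wins; later ones return the same frozen array.
  if (!node.has(CANONICAL)) node.set(CANONICAL, Object.freeze(values));
  return node.get(CANONICAL);
}

console.log(tuple(1, 2) === tuple(1, 2)); // true: identity acts as value equality
// Downsides: every construction pays the lookup, and the table strongly
// retains every tuple ever made unless weak references are used.
```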
Isn't that just a limitation of having to use a userland polyfill, rather than something inherent to a native implementation?
Are we certain on this? If the lookup is not performed, then new objects have to be created that may not be needed, causing both unnecessary memory allocations and garbage collection.
these seem like reasonable and easily documented trade-offs to me.
I agree about interning being generally undesirable performance-wise (otherwise engines would be interning strings and bigints, and they demonstrably don't: #292 (comment)), but it would be nice if someone can explain the actual performance reasons as to why another type of content-compared value can't be added. Was this not a concern when adding bigint (another type of arbitrary-sized value that in general isn't interned)? Does this mean that there will never be another similar type (eg, decimal)? Is the performance issue contingent on supporting `===`?
@Maxdamantus yes, my understanding is that the lack of adoption of bigint has made browsers feel that the effort isn't worth it, which means that (unless that changes) there will never again be another new primitive type, which precludes both R&T and Decimal as primitives.
@ljharb That doesn't really answer the question though. Was there a significant performance impact from adding bigint to the language? All major browsers currently support it. And if there was an impact, will it be increased by R/T?

As for usefulness, I suspect the addition of R/T values is more important than bigint values was, since emulation of compound comparisons is fairly common (cf. the many deep-equality utilities in the ecosystem).

I suspect another issue with bigint usability is that it was harder to polyfill (maybe this doesn't apply nowadays, but I feel like that probably limited adoption when people should have been excited to use it), whereas R/T can already be polyfilled (with the minor caveat of normalising `-0`).
I'm not clear on how much of the large impact was performance and how much was implementation difficulty, but the impact was large, and for R&T it would again be, and they aren't willing to repeat it. I 100% agree that R&T (and Decimal) are each much more useful than BigInt, and would have gotten way more adoption, and that it's turned out to be a shame that BigInt went first - but this is where we are.
I don't really have a use-case that needs BigInts, but R&T helps with the ubiquitous problem that is state management. I also don't feel that implementation difficulty should be a driving factor in steering the language design. My primary concern is always what's best for the language itself.
@Maxdamantus I'm just chiming in to share my sense of alignment (that might change if I learn more, I'll admit..) that non-object-based implementations could provide significant benefits. To respond nitpickily to one paragraph of yours:
My understanding from the current spec (refs:
Right, but it was pointed out that that could cause security issues in user implementations of membranes that assume, eg, that anything `typeof "object"` can be held weakly and wrapped.

Hopefully I'm not creating too many levels of indirection, but here's a post with links to earlier discussions about this: (#390 (comment)), where the outer post has some follow-up discussion about this topic. #292 probably has more of it, but there's more in general to read through.

I'll also copy this thought I've expressed somewhere else:
I think of it the other way around. I don't think the value of
Ok, yep - those are valid concerns. As I understand it currently, So, I think that the safety level provided by

FWIW: for the use case that gathered my interest in this proposal, the ability to reliably (deterministically) round-trip a datastructure to a language-agnostic representation, that has neither a prototype nor mutability when the contents are loaded, would be a benefit.
This is one of the challenges. It would be beneficial for engines to re-use the existing machinery they have for objects, e.g. hidden classes and inline caches that make property access fast. However the current specification for R&T makes them different enough from objects that it is a non-trivial task to implement them in terms of objects.

One of the other concerns is that a common stated use case for R&T from the community is to increase performance but, similar to strings, as R&T grow in size (imagine a very long tuple), comparing two of them is a linear operation. I.e. it might be the case that switching an entire app over to R&T would actually make it slower. Instead they would need to be used only when appropriate, and it's not entirely clear how to state the guidance for appropriate use, as it's very much an "it depends" situation.

One of the alternative designs that reduces the largest amount of concern is dropping the `===` value-equality semantics, so that R&T could be implemented in terms of (frozen) objects.
That makes sense. I would probably suggest not pursuing the possibilities to optimize those O(n) situations (such as requiring hash-equality or interning) initially -- but ensuring that they're possible in future and/or on a per-implementation basis. Reasoning: sufficiently-motivated folks will benchmark/profile their code, and that'll help to inform whether R&T provide a benefit despite the naive comparisons -- and also in cases where people question why an anticipated benefit hasn't materialized, they can trace that to their engine and request/contribute improvements there.
From reading a number of requests for something that could replace `Object.freeze`, immutability seems to be the recurring draw rather than raw performance.
Would you be able to describe your use case? It would be much appreciated.
The counter to that though is: if developers benchmark, find they are too slow, and so don't adopt R&T, then the engines would have done a lot of work for little gain. And if developers benchmark, find they are slow, and respond by raising a bug, this doesn't change the fact that implementations have already warned that these will be hard to optimise and that it may not be possible. In other words, if a majority of developers expect them to be fast, then we need to ensure the design is one that we are confident can be fast without significant re-writing of the existing engines.

This is the general catch-22 of language design. It's hard to predict how a new feature will be used at scale across large projects, because large projects are unlikely to adopt an unstable language feature. So we have to make educated guesses based on smaller samples, or by looking at other languages that already have similar features. Other languages tend to do equality by an `equals` method rather than by overloading an operator.
Certainly; I've made a few contributions to the client-side HTML/JS search implementation in the Sphinx documentation generator, and that's where this originates. When Sphinx builds a documentation set to HTML, the entire output is a static, self-contained website, and when using client-side search, the search index is represented as a JavaScript file loaded when the search page is opened. Most of the content of that index is a JSON blob, but there is a small JS function call around it.

Because the use case involves user-controlled input (the search query), and also because the search index contents itself should not change between or during page viewing, I'd like to sanitise the data that the code operates on. Sanitise is a term that I've made up there (vaguely derived from home-brewing experience) - it involves freezing the objects in the index and removing their prototype references.

To some extent I'd be glad for any runtime/interpreter optimizations based on an immutable search index - but that's not my main objective; primarily I'd like to improve the assurance that the search index content retains its integrity and does not provide access to other runtime code. R&T would, I think, allow much of this in an elegant and effective way, with less ad-hoc code than the sanitise approach.
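A rough sketch of that sanitise step, under the assumption that the index data is acyclic JSON-shaped data (note that prototype-stripping arrays this way also removes their methods; a fuller version would special-case them):

```js
function sanitise(value) {
  if (typeof value !== "object" || value === null) return value;
  Object.setPrototypeOf(value, null); // no reachable Object.prototype
  for (const key of Object.getOwnPropertyNames(value)) {
    sanitise(value[key]);
  }
  return Object.freeze(value);
}

const index = sanitise({ docs: [{ title: "Intro", terms: ["api"] }] });
// index.docs[0].title = "x"; // throws in strict mode, silently fails otherwise
```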
A self-correction:
From reading the
I wanted to note that there can be a performance improvement even if R&T's equality remains a linear, content-based comparison in the worst case.
To add to this point, I think it's worth thinking about how much worse (in terms of both performance and DX) the language would be if we didn't have string values with content-based equality. String equality is also a linear operation in the worst case.
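To illustrate why linear worst-case equality need not dominate in practice, here is a sketch of the fast paths an implementation (or userland helper) can take before any element-wise walk; the cached `hash` field is an assumption for illustration, not part of any spec:

```js
function tuplesEqual(a, b) {
  if (a === b) return true;                 // same canonical reference
  if (a.length !== b.length) return false;  // O(1) rejection
  if (a.hash !== undefined && b.hash !== undefined && a.hash !== b.hash) {
    return false;                           // cached-hash rejection
  }
  for (let i = 0; i < a.length; i++) {      // full walk only when needed
    if (a[i] !== b[i]) return false;
  }
  return true;
}
```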
Tying this to my point above, this is already more-or-less how JS works for strings, where comparison involves an initial pointer check before any content comparison.

I think the main difference between JS and other languages is that in JS the non-identity equality is built into the language rather than being a method convention. In the past in Java, there hasn't been a very clear distinction between mutable and immutable things, and people would tend to make their IDE generate the "obvious" implementation of `equals`/`hashCode`.

Just for clarity, when you say "interning", do you mean eager deduplication of allocations containing the same content, or do you mean where the language doesn't provide a way to distinguish between allocations of the same content? Usually "interning" means the former, but JS only does the latter. I've pointed out in other posts (eg, #292 (comment)) that string implementations in JS are not based on interning, and I wouldn't expect R/T values to be either.
@Maxdamantus - lots of great points here
I completely agree that strings being immutable and compared by content is a good thing. It also comes back to the 'ability to re-use existing object machinery'. String optimisations such as ropes could also apply to tuples, making it more efficient to keep concatenating new values to the end. However this means that tuples would have their own new representation - instead of being able to leverage the existing optimisations that exist for arrays.
Yes, eager deduplication so equality becomes pointer equality.
Yep, the way this proposal was designed and was being implemented in SpiderMonkey was by not using interning. However this revealed the level of implementation complexity this approach has. While equality is defined in only a few places in the spec, equality in implementations is spread out and inlined into different places as they have evolved and added multi-tier JITs. It's not technically impossible, but it is a large undertaking, and the feedback we received is that it is considered too large an undertaking. Interning would simplify things, but brings downsides such as not being lazy.
According to @littledan on JS Party Episode 305, value semantics for records and tuples are being reconsidered by the committee. Could you expand a little more on the concerns here?