Improve documentation of experimental status #6905
Comments
For the schema docs, I propose something like:
For the data guidelines, I propose something like:
This feels rather definitive (in fact, we could probably automate this, as written). But it leaves out non-standard features. I can imagine scenarios where features are non-standard and experimental (e.g., some vendor decides to go it alone on something), or where features are non-standard and non-experimental (e.g., a non-standard feature has been explicitly deprecated or is in the process of removal). I'm not sure how exactly to phrase that and have run out of time to think about it this afternoon. |
Also, #5392 is important reading for this issue. |
The most important effect of the Unfortunately, I don't think the number of implementations is a great proxy for that. What should matter for browser vendors' willingness to make breaking changes is how widely used something is. Once something is widely used, that really locks the behavior in: if you change it, you break the web. I went looking and could pick out two single-engine features where calling them experimental feels weird:
I guess the crux is that "experimental" sounds like "new" but if things remain single-engine for a long time they're really not experimental in that sense. To make a concrete suggestion, I'd say that anything that's been shipping for 2+ years in any browser by definition can't be considered experimental. In some cases it might change in the future (being removed) but in others it's just there for the long term, unchanging. |
Agreed that the icon is the most important effect. But IMHO the “Expect behavior to change in the future” text expansion of what that icon signifies is just one possible way to interpret what it’s marking. Another way to interpret it is that it simply flags that feature as something that’s not actually a mature, agreed-upon part of the web platform. I think that sense is what’s more important to developers, and I think we’d be way better off if instead of “experimental” we had some other term to indicate “not a mature part of the web platform” or “not an agreed-upon part” or some such.
Remembering what problem this flag should actually be trying to help solve
A well-known problem we have is that when developers see a feature documented in MDN, they tend to assume it’s mature and agreed-upon and “standard” and that if any browser doesn’t implement that documented-in-MDN feature, the vendor of that browser is bad for not following the standards and not implementing that feature. So I think we have an obligation to clearly flag such “not-agreed-upon part of the web platform” features in MDN so that developers don’t make the mistake of assuming that there’s a vendor agreement about the feature. The wider community of people involved in feature development for the web have had discussions about the broader issue around this in a number of different places; WICG/admin#102 for example. In practice I guess what it comes down to is what have commonly been labeled “single-vendor specs” or “single-vendor features”. But I think the more accurate term would be “single-engine”. I guess one obvious alternative to using the label/key/flag “experimental” is instead just using the word “single-engine”. (When considering that, there’s a natural question that comes up; I’ll post a separate comment about that.)
But if we don’t like “experimental” or “single-engine” as terms, then let’s please find some other word with the sense “not an agreed-upon part of the web platform” that we can use as a flag in BCD in place of “experimental”. I don’t feel strongly about what particular word we end up using for that flag. But I do feel strongly that we need such a flag. |
In considering the idea of a “single-engine” flag (aka “single-vendor”), a thought naturally comes to mind: Well, we wouldn’t need any flag at all for that, because we can just compute “this feature is single-engine” based on the existing support data. But I believe there’d be problems in practice with following that line of thinking; I believe we’d still want a static flag in BCD. Here’s why: What “we can just compute ‘this feature is single-engine’” would actually mean in practice is that “we” wouldn’t be computing it at all, but that instead that work would need to be done by others; in particular, it would need to be computed on the MDN side, in the code that consumes the BCD data and generates the Browser Compatibility tables, because within BCD itself, we have no processes that generate the data we’re storing; it’s all static data. And since we have other downstream consumers of BCD data, “we can just compute ‘this feature is single-engine’” would also mean that N different downstream consumers would each need to write their own custom code for computing it. That might not sound like such a big deal, because after all, the data is there and it would seem relatively simple to compute “this feature is single-engine” from the data. But speaking as a writer of downstream consuming code, having written code to do that computation, in several places and multiple languages, it’s not as simple and clear-cut as you might imagine. For the curious, here are links to some of the code I’m using to do it:
I am not enthusiastic about imposing the burden on all other downstream consumers of (re)writing that same kind of code. So to keep things as clear as possible and to avoid the need for all downstream consumers to write such code, it’s better for everyone if we have a static flag in BCD for this. And as I said in my other comment, I think if we could find a good word that conveys “not an agreed-upon part of the web platform”, that would be best. But if we can’t find such a term, then I personally would be fine with just using “single-engine” as the name for the flag. |
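To make the "it's not as simple as you might imagine" point concrete, here is a minimal Python sketch of what a downstream consumer might write. It is not the code linked above. The `ENGINE_OF` map and the function names are hypothetical, and the static browser-to-engine map is a deliberate simplification: a browser's engine can change between releases, which a fixed map cannot express. The sketch only assumes the support-statement fields `version_added`, `version_removed`, and `flags`.

```python
from __future__ import annotations

from typing import Any

# Hypothetical, simplified browser -> current engine map (an assumption for
# illustration only; a browser's engine can differ between releases).
ENGINE_OF = {
    "chrome": "Blink",
    "chrome_android": "Blink",
    "edge": "Blink",
    "opera": "Blink",
    "firefox": "Gecko",
    "firefox_android": "Gecko",
    "safari": "WebKit",
    "safari_ios": "WebKit",
}


def is_supported(statement: dict[str, Any]) -> bool:
    """True if one support statement describes current, unflagged support."""
    if not statement.get("version_added"):  # False, None, or missing
        return False
    if statement.get("version_removed"):  # shipped once, since removed
        return False
    if statement.get("flags"):  # only available behind a flag or preference
        return False
    return True


def supporting_engines(support: dict[str, Any]) -> set[str]:
    """Collect the engines with at least one qualifying support statement."""
    engines: set[str] = set()
    for browser, statements in support.items():
        engine = ENGINE_OF.get(browser)
        if engine is None:  # browser not covered by the simplified map
            continue
        # A browser entry may be a single statement or a list of statements.
        if isinstance(statements, dict):
            statements = [statements]
        if any(is_supported(s) for s in statements):
            engines.add(engine)
    return engines


def is_single_engine(support: dict[str, Any]) -> bool:
    return len(supporting_engines(support)) == 1
```

Even this small version has to decide how to treat flagged support, removed support, and multiple statements per browser. Each downstream consumer would have to make those choices again, possibly differently.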
I agree. But the number of implementations is better as a proxy for measuring the level of browser vendor/engine agreement about ever making the feature an actual standard part of the web platform, so that developers can use it cross-browser. Browser vendors vote on potential web-platform features by implementing them. So the lack of an implementation for a certain feature in a particular browser engine is an indication that the vendor of that engine has not yet agreed in practice to make that feature part of the web platform.
I agree that labeling those as "experimental" is a mismatch. But it seems like labeling them as “single-engine” would not be.
Agreed. Those features have instead simply just pretty clearly become “single-engine”.
Agreed. But if something’s been shipping in one browser engine for a long time (2+ years or whatever) and not in any other browser engines, then I think it’s even more important that we make it clear to developers that’s a “single-vendor” feature — or “not an agreed-upon part of the web platform” feature. |
https://github.com/mdn/browser-compat-data/wiki/Features‐in‐less‐than‐two‐engines is a dump of output from a script I ran to identify all BCD features that are either only implemented in a single engine, or not even in a single engine — and that aren’t yet flagged |
That seems too broad to me, as it would also cover things that are in the process of being removed from the platform, even including some things that are still in two engines, like
If we decide that any status should be strictly derived from other parts of the data, that doesn't mean we have to push that burden on data consumers. A script to update the status plus a lint to ensure it doesn't produce any changes would do the trick. A "single-engine" status would avoid a bunch of interpretation, but in light of #6738 we're not going to remove the experimental status, so we still have to decide when to set it to true and false in the data we have. I think calling something that just shipped in only one engine experimental is always appropriate, at least I can't think of any exceptions. But when enough time has passed, it no longer seems compatible with the typical use of the word to call it "experimental", and "Expect behavior to change in the future" will be increasingly inaccurate. Here's a straw suggestion:
|
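The "script to update the status plus a lint" idea from the comment above could look roughly like the following Python sketch. Everything here is an assumption for illustration: the `single_engine` flag name, the precomputed `engines` list on each feature, and the function names are hypothetical and not part of BCD.

```python
from __future__ import annotations

import copy


def derived_single_engine(feature: dict) -> bool:
    """Derive the hypothetical flag from the feature's precomputed engine list."""
    return len(set(feature["engines"])) == 1


def update_status(features: dict[str, dict]) -> dict[str, dict]:
    """The 'script': return a copy of the data with the flag recomputed."""
    updated = copy.deepcopy(features)
    for feature in updated.values():
        status = feature.setdefault("status", {})
        status["single_engine"] = derived_single_engine(feature)
    return updated


def lint(features: dict[str, dict]) -> list[str]:
    """The 'lint': names of features whose stored flag differs from the derived one."""
    updated = update_status(features)
    return sorted(name for name in features if features[name] != updated[name])
```

With this shape, contributors run the update script and the lint fails a pull request whenever running it again would change anything. Data consumers read a static flag and never recompute it.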
This has been a really helpful discussion already. Thank you, @sideshowbarker and @foolip. It really clarified a lot of things for me. I apologize, but I wrote a bunch more words in reaction to this conversation: There are multiple plausible interpretations of
|
Yes, it does to me.
Thanks much for coming up with this. I have no suggestions; I think the articulation you’ve come up with could solve a lot of the problems we’ve been discussing. I’ll be interested to see what @foolip thinks of it. |
On further reflection, I do actually have a suggestion to make: rather than framing “experimental” in the context of “future browser releases”, I think it would be better to just frame it within the context of the overall web platform, like this:
|
I don’t agree. I think something can be quite old and still be experimental; it’s just that it’s an experiment that never ended up producing what it was hoped or expected it would. I think in some contexts such experiments would be called “failed experiments” or “dead-end experiments” or some such. But in our context, there’s not the connotation of something being a failure or dead-end; instead, if we consider those experiments as having the goal of producing a feature that becomes a cross-browser-implemented part of the web platform, they’re experiments that haven’t ended up achieving that goal. Or for younger features, they just haven’t yet.
I agree that the wording "Expect behavior to change in the future" would be inaccurate for the cases we’ve discussed. But I think the solution to fixing that inaccuracy is to drop that wording, rather than continuing to try to constrain “experimental” to what that over-limiting wording constrains it to. And I think @ddbeck’s proposed new wording achieves the goal of providing us with a useful project-specific framing of “experimental” that avoids over-constraining it in any undesirable way, especially with my refinement applied:
Another nice property of that refined form of the statement is that it doesn’t imply that the feature is necessarily at risk of being removed from any browser which has already implemented it, nor that there’s necessarily any risk that the behavior of the feature in the browsers which have implemented it will change. Instead, we’re simply stating it’s experimental with regard to its status as a standard, cross-browser-supported feature of the web platform. |
Apologies in advance for pointing out something that risks further muddying the waters, but in regard to the above, I realize we also need to consider features which BCD lists as having zero implementations, and never having shipped anywhere. https://github.com/mdn/browser-compat-data/wiki/Features‐in‐zero‐engines is a dump of output from a script I ran to identify all BCD features that are not implemented even in a single engine — and that aren’t yet flagged It’s 159 features. (And incidentally, I notice that 64 out of that 159 — so 40% — are SVG features. That would seem to indicate either that the SVG spec defines a whole lot of vaporware features, or else that the BCD data for those SVG features may not be accurate; those features may in fact be implemented in some engine.) |
It occurs to me that if we frame it in that way, a question which is going to naturally arise is: What becomes the difference between the I think the answer to that question is, we shouldn’t have a flag named So a solution to any perceived redundancy in the purposes of the I assert that we ideally shouldn’t be trying to use But something that is measurable and that’s not subjective and arguable is: Is this feature currently specified, somewhere? I think “Is this feature currently specified, somewhere?” is what we mostly use the Certainly I can say that when I become aware of a feature having been dropped from a specification, I raise a PR to change the BCD That leads me to believe that we could probably just globally replace the Those changes would resolve any ambiguity around the difference between the |
It further occurs to me that after we were to go ahead with #6765 (adding It seems like it’s become clear the problem that’s common to both the And so that’s why it seems like we’re going to be better off in the end if, in place of the subjective notions of “experimental” and “nonstandard” we’ve been surfacing to developers through MDN, we instead surface to them the related characteristics of the feature that are measurable — specifically: the measurable property “has less than two implementations” in place of “experimental”, and the measurable property “not currently part of any specification” in place of “nonstandard”. |
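As a rough illustration of the two measurable properties named in the comment above, here is a minimal Python sketch. It assumes a `spec_url`-style field on the feature's compat data (as in the spec_url work mentioned elsewhere in this thread) and a precomputed set of implementing engines. The function name and the output keys are hypothetical.

```python
from __future__ import annotations


def measurable_properties(compat: dict, engines: set[str]) -> dict[str, bool]:
    """Report the two measurable properties instead of subjective flags."""
    return {
        # Proposed stand-in for "experimental".
        "fewer_than_two_implementations": len(engines) < 2,
        # Proposed stand-in for "nonstandard": no specification link recorded.
        "not_in_any_specification": not compat.get("spec_url"),
    }
```

Both values follow mechanically from data that is already recorded, so neither requires a per-feature judgement call.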
Thanks @ddbeck for coming up with a concrete proposal here!
Will this go together with an update of the tooltip wording on MDN? That is the wording that I think ultimately matters, and it won't be possible to put all the nuance in there.
The signals you listed look good to me. I have thoughts on two of them though:
Just going by the issue trackers, virtually all specs have unresolved issues, and opinions about how serious they are will diverge. It will be hard for us in BCD to judge how serious the disagreement is, or what the risk of future incompatible changes is.
I think this is good, and it's something could apply to To exaggerate the situation, would BCD maintainers be happy removing the experimental flag for every feature where the Chrome team made a public comment like "we don't have plans to remove this and are committed to backwards compatibility", at least for features above a certain age? |
OK, I think these are the questions that still feel a little under-addressed. The first two seem to have answers, the last one does not. (Also, I've started a task list in the issue description).
Should BCD even have experimental or standards flags? #6905 (comment)
Raised by @sideshowbarker in #6905 (comment). I agree in principle it'd be much better if we didn't have to make a judgement call about each feature individually. But we've got these flags now and it's going to take time to come up with alternatives and get consumers to switch to them. I think it's worth making "experimental" less bad now, even if we're going to make it substantially less bad later. The same goes for the standard track flag, though I actually think that one is in better shape and farther along to being dropped, with the discussions already happening on the spec_url work.
Can we update the "experimental" tooltip on MDN? #6905 (comment)
I believe so, yes. It may be easier to do after Yari ships. We'll probably want to propose a terser version of our definition (maybe: "Experimental. Use this feature to provide feedback to browser vendors and specification authors. Risk of behavior change or removal.") but we should be able to get the gist of things into the tooltip.
Can single-implementation, non-standard features age out of "experimental" status?
I hope this is a fair summation of @foolip's question. This felt like the thorniest question to me once I started to outline a data guideline. What if a feature only ever ships with one implementation, but the browser vendor commits to keeping the feature, unchanged as-shipped? Should circumstances allow for the feature to be marked as not experimental or is it experimental for all time? So @sideshowbarker suggests this is a "dead-end" experiment but an experiment nonetheless; @foolip's question suggests that experiments should one day be allowed to conclude. I see the appeal of both of these choices.
That said, I do wonder if it would be difficult to write the guideline for expired experiments. Right now, we have a really good idea of when to mark things as experimental: when they ship with only one implementation. If we have an escape hatch, it becomes a bit of a complex sequence. How do we know the old experiment won't change in behavior? How official does it need to be? Do we require a wontfix on an issue suggesting removing the experimental feature? Is a comment from the browser vendor on the BCD repo enough? How much time must have elapsed since the feature shipped? I'm not sure I want to choose something that requires a flow chart to decide. Is this something anyone feels very strongly about? If not, I'm inclined to follow the path of least resistance in terms of maintaining the data in the hopes that derived data (e.g., a generated single-implementation flag) will someday supplant this. |
I would like to submit https://developer.mozilla.org/en-US/docs/Web/API/Selection as a case that definitely shouldn't be considered experimental; this is an ancient API. |
Yeah, we should get a move on for this. Were there any strong opinions on the question of whether single-implementation, non-standard features can age out of "experimental" status? If not, I'll proceed with writing a PR with the answer being no. |
I do feel somewhat strongly about it, that having a single implementation is not by itself enough to imply experimental status. Saying nothing about the relationship between single-implementation and experimental status would avoid the issue, but clarity on this is in my mind the most important part of this issue. As a conservative rule which applies to fewer things than I would perhaps think is ideal, I'd suggest "anything shipped in its current form more than two years ago". |
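The conservative "shipped in its current form more than two years ago" rule suggested above is easy to state as code, as in the following Python sketch. The function names are hypothetical. The sketch also assumes someone has already determined the date on which the feature shipped in its current form. That determination is the hard, judgement-laden part and is not automated here.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # Feb 29 in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def has_aged_out(shipped_in_current_form: date, today: date, years: int = 2) -> bool:
    """True if the feature shipped in its current form more than `years` years ago."""
    return today > add_years(shipped_in_current_form, years)
```

Under this rule, a feature that first shipped on 2018-01-01 would no longer count as experimental on 2020-06-01, regardless of how many engines implement it.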
@chrisdavidmills, @Elchi3, @foolip, and I talked about this yesterday. To recap, we agreed that:
Also, Chris alerted us to the existence of a definition of "experimental" on MDN. Speaking for myself here: by my reading, the MDN definition is compatible with the things we've talked about on this issue. It's worded in such a way that we can't adopt it directly (it's specification oriented, which doesn't help us for stuff that doesn't or won't have a specification and it doesn't have anything to tell us about dead-end experiments). I suspect we'll learn a lot writing our own definition and implementing it. We may want to offer our definition upstream, after we've gained some experience with it. |
Prompted by #6873.
#1528 proposes removing status.experimental. I'm not ready to head down that path yet, but, in the meantime, I am prepared to improve the way we set the experimental status.
The schema docs describe status.experimental as:
This text doesn't quite capture the meaning of the data as we've come to update it. For consumers, we should revise the wording to more clearly articulate the meaning of the true and false states; for contributors, we should write contribution guidelines for choosing between the two values.
Once these docs and the guideline are written, then we should start follow-up issues to clean up any data accordingly.
Tasks: