-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potentially standardize window.find() #3539
Comments
Good news. Is the scope of this issue simply to standardize The answer to all of these probably will be: “What do current browsers do? Let’s stick with that,” but it’d be good to be explicit about the goal. And there may be some platform inconsistencies, especially in fuzzy matching. See also Charmod Norm, w3c/selection-api#37, whatwg/dom#431, tc39/proposal-intl-segmenter#17, #2424, the inactive String Search API the inactive FindText API, and the inactive RangeFinder API. |
For the record some browsers also implement a document.execCommand("FindString", ...) command. |
I always find it distressing that counting the usage of public-facing websites is used to make decisions. In my experience, there are far more complex web applications with big companies and big governments that would not (should not) be included within these statistics. window.find() is very much needed in editing style applications and there is a need for this feature (or a better alternative). Support for case folding, regular expressions, and other things that would help with a fuzzy search are really needed. It seems that one possible effort to standardize this capability, the FindText API, (http://www.w3.org/TR/findtext/) has been discontinued :-( |
One way to accommodate some of the goals of FindText without requiring standardization to take a stance on algorithm would be to specify how |
I'm not sure if this should be a separate issue, but my proposal to start the process of standardizing window.find by standardizing some of the aspects of find-in-page commonly used first. For instance,
By starting with definitions, I think we can start thinking about how to define the algorithm. However for some features, it might already be useful to reference definitions of find-in-page (e.g. https://drafts.csswg.org/css-scroll-anchoring/#anchor-priority-candidates 2nd candidate is "an element containing the current active selected match of the find-in-page user-agent algorithm" which could reference this) As an aside, I put together a brief overview of behaviors of find-in-page in different browsers (Chrome, Firefox, Safari) to see the commonalities and differences in behaviors. The doc uses find-in-page dialog, not window.find though. Does this seem like a good approach? |
Interesting. This falls into a gray area of web specs, of specifying UI. Generally we try to shy away from that, and only specify things which are observable from JavaScript. I believe nothing about find-in-page is observable, so we normally wouldn't specify it. However, sometimes we bend this rule, when it's especially beneficial, and all the browsers are interested. I guess I would ask what is the goal here, and for who. Are you trying to make things more predictable for web page authors? In what way, since find-in-page is not observable? Are you trying to make things easier for implementers? If the goal is purely to work on a better spec for |
Good question. The immediate benefit from having the definitions is for spec writers and implementers so that they can agree what is meant by terms like 'active match' (e.g. the scroll anchor spec I linked, and beforematch proposal; the latter would benefit from the algorithm specified as well since the timing of the event and timing of find-in-page scroll are dependent on each other). I think the ultimate benefit of at least partially speccing the algorithm is for users to have a consistent experience across browsers (although I'm not sure how valuable it is, since I imagine users don't typically switch browsers very often). That is, you can see in the compat doc I linked that browsers tend to do different things in a number of situations. In some cases, none of the browser seem to do "the right thing". For instance, content clipped by overflow hidden can be found on the three browsers I tested. It is conceivable that the spec here would dictate what should and should not be found, if that makes sense. As an aside, I assume that window.find essentially hooks into the find-in-page algorithm (maybe this is a wrong assumption), so any kind of specification for it is likely to be very similar. To put it differently, I think if window.find is specified and browsers update their implementations to match the spec, I suspect that they will also have to change the find-in-page behavior to simplify the code. |
I definitely see the benefit there. That could probably be accomplished with a fairly minimal spec, that just hand-waves at how the feature works but builds around a skeleton of some
I think you're right that this would be valuable for users, in that it would guide browsers toward doing "the right thing", where "the right thing" is what domain experts (HTML spec editors, CSS WG, i18n folks, and browser engineers) can collectively get together and agree upon. Maybe we wouldn't get total agreement, e.g. maybe one browser representative has a very different philosophical stance on what a "word" means, but that's fine. Any discussion at all would likely be an opportunity to improve things in this way. In other words, since this isn't JS-developer-observable, the goal isn't to get total interop, but instead to get the other values that the standards process brings. And I suspect that even if not all browser engines want to spend to spend time on this, you'd be able to get good discussion from the rest of the web standards community, and from any interested web developers and users. So, I'm sold that this is worth trying to specify.
Well, but as long as the result of |
Text search is a complex topic for reasons such as those called out in @js-choi's comment. Past attempts to write a spec at W3C failed to consider I18N basics early on and have foundered on that. The I18N WG (perhaps wisely?) shelved any attempt to work on it directly as part of Charmod-Norm by creating a separate document. Any group starting to work on this might want to have a look at string-search and to the issues we filed against FindText. I think this is worth taking a stab at--it is possible to overcomplicate the problem and at long as judicious choices are made (and well-documented) I think it is possible to have a successful result in a finite amount of time. |
@domenic it's pretty observable, no? console.log(window.getSelection())
window.find("test");
console.log(window.getSelection()) |
Hmm, that appears to be a Firefox quirk where |
For the record, I was testing something wrong; |
For folks watching this thread, @vmpstr has put together an initial pull request describing find-in-page in #5770 (direct preview link). It's pretty basic and, I think, should be uncontroversial. But it might provide a good place to collect some of the notes or open issues here, e.g. we could expand it to link to https://w3c.github.io/string-search/#searching, and eventually try to define |
This serves as helpful structure for work such as #3539, or potentially integrating with https://github.com/WICG/scroll-to-text-fragment or https://github.com/WICG/display-locking.
FWIW, there's a CSS issue about controlling whether an element is findable/searchable: w3c/csswg-drafts#3460 |
This serves as helpful structure for work such as whatwg#3539, or potentially integrating with https://github.com/WICG/scroll-to-text-fragment or https://github.com/WICG/display-locking.
My gut instinct on this is that "find" is too generic and meaningless. Adding eg openFindWindow() or findTextOnPage() or highlight/selectTextOnPage() would be intuitively more distinct from querySelector() and friends, which "find" just isn't. |
We don't get to choose the name; it's already in all browsers. This issue is just about writing a spec for it. |
|
It's rather unfortunate that what Specifically, Firefox operates on the Unicode Database level (in a language-independent way) and WebKit&Blink use collator-based search (with primary-level matching only) such that the collation data that is used is the CLDR search collation for the browser UI language. As a collator implementor, I'm very skeptical of the technical merit of collator-based search compared to search implemented directly over the Unicode Database layer (possibly with hard-coded exceptions to try to reproduce the main effects of collator-based search). (When operating on the Unicode Database, you transform characters to other characters and match on the transformed stream of characters. When operating on collations, you perform a complex mapping from characters to collation units and then ignore everything but the primary weight in the collation unit and match on the primary weights. Even with fast computers of today, you can experience a performance difference by using cmd/ctrl-f on the HTML spec in Firefox and Chrome.) I also don't want to bring collator-based search into scope Gecko or ICU4X. See a URL text fragment issue. |
Given that — along with the core “highlight the active match and scroll into view” behavior — browser UIs also expose a count of the total matches for the current query, it’s imaginable that it might be useful to developers (and for testing scenarios too) to have an API which programmatically exposes that total match count to JavaScript code. |
See:
Related #2858.
The text was updated successfully, but these errors were encountered: