Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Range#expand() #431

Open
foolip opened this issue Mar 23, 2017 · 30 comments
Open

Range#expand() #431

foolip opened this issue Mar 23, 2017 · 30 comments

Comments

@foolip
Copy link
Member

foolip commented Mar 23, 2017

https://jsbin.com/valihix/edit?html,output

This test shows existence of the API in Blink, EdgeHTML and WebKit, but not Gecko.

It might make sense to standardize alongside w3c/selection-api#37, possibly using some shared logic.

It has been deprecated in Blink for a very long time, tracked by https://crbug.com/445218, but given the level of support I now think we should first investigate standardizing.

@foolip
Copy link
Member Author

foolip commented Mar 23, 2017

The API surface would be:

partial interface Range {
  void expand(DOMString unit);
};

The unit values are "word", "sentence", "block" and "document". Any other values are no-ops. If we really wanted to maybe we could use an enum which would result in an exception instead.

@annevk
Copy link
Member

annevk commented Mar 23, 2017

@littledan hey, it seems https://tc39.github.io/ecma402/ doesn't expose anything around word/sentence/etc. boundaries. Is that planned? I think ideally we make sure we use shared primitives to define these aspects, even if below they end up depending on platform conventions and libraries.

@littledan
Copy link

littledan commented Mar 23, 2017

Yes, it is proposed, in https://github.com/tc39/proposal-intl-segmenter , currently at Stage 2. I'm not really worried about a mismatch in semantics, as probably browsers will implement both of them using the same underlying internationalization libraries (usually ICU).

That draft spec isn't really well-factored in terms of exposing something HTML can call into--should it be?

The granularities that Intl.Segmenter exposes are grapheme word sentence and line. It's nice to see that two of those line up with the same name and the same semantics. What does block mean--is this like selecting until the next hard line break?

FWIW Intl.Segmenter's constructor throws an exception if you pass in a value that's not one of those four.

@annevk
Copy link
Member

annevk commented Mar 23, 2017

@rniwa ^^

I think it would be nice if all three API entry points could call into the same underlying primitive. Obviously Selection and Range have to do some processing around that for it to make sense for their contexts.

@rniwa
Copy link
Collaborator

rniwa commented Apr 5, 2017

No, what's exposed to ICU is completely different from what we use to modify selection at least in WebKit on Mac and iOS, and they're intentional. For example, ICU on iOS would treat each character in CJK as a separate word for line breaking purposes but for the purpose of selecting text, it would treat multiple characters as a single word. There are many other edge cases and special cases for selection, line breaking, word wrapping, etc...

@annevk
Copy link
Member

annevk commented Apr 5, 2017

Sure, but instead of ICU you could use your own implementation. Nothing here is requiring ICU. The idea is just that they share the same implementation.

@rniwa
Copy link
Collaborator

rniwa commented Apr 5, 2017

Well, on Mac, for example, we emulate clicking on a word (using AppKit) in order to determine the word boundary, and that's nothing to do with ICU. In fact, there is no way for us to expose that to Intl.Segmenter. Also, we do use ICU in other places. Modifying selection is just very special.

@domenic
Copy link
Member

domenic commented Apr 5, 2017

In fact, there is no way for us to expose that to Intl.Segmenter

Does that mean you're unable to implement the "word" part of the Intl.Segmenter proposal? If so it would be good to raise this at TC39, as the proposal is pretty far along in the process.

@rniwa
Copy link
Collaborator

rniwa commented Apr 5, 2017

In fact, there is no way for us to expose that to Intl.Segmenter

Does that mean you're unable to implement the "word" part of the Intl.Segmenter proposal?

No, we can totally implement that for Intl.Segmenter because we have ICU, and we use that for word breaking, etc... what I'm saying is that the word boundary behavior for selection is completely different from what we use for line breaking and Intl.Segmenter. It requires a hit testing in AppKit.

As a general rule, I don't think it's possible to spec almost anything about how selection behaves in WebKit on Mac & iOS based on a lower level primitive due to the way our stuff is integrated with platform APIs, and they're only remotely related to ICU's word boundary definitions on either OS.

@littledan
Copy link

This seems entirely reasonable to me. Seems like we can't expose a text-only primitive for Range#expand, even if we might be able to for line breaking (given the right tweaks).

@rniwa
Copy link
Collaborator

rniwa commented Apr 5, 2017

For your entertainment, TextBoundaries.mm#L183 is the code in WebKit that uses doubleClickAtIndex in AppKit. It's absolutely horrifying but nonetheless, that's what we use for word breaking in our editing / selection code.

@annevk
Copy link
Member

annevk commented Apr 5, 2017

I'm really curious why hit testing cannot be avoided. I'm also somewhat interested in what other implementations end up doing. If they all end up doing the same as Intl.Segmenter, having the abstraction could still be useful, even though we could allow difference for selection. (It's a little sad that Range ends up being impacted as well, as it's sorta supposed to be a non-UX API.)

@rniwa
Copy link
Collaborator

rniwa commented Apr 5, 2017

Sorry, hit testing was a red-herring to say. I don't think AppKit does actual hit-testing. It's likely doing some magic with ICU. I said hit testing because the method is called doubleClickAtIndex. It doesn't go through any sort of hit testing in CSS boxes or any DOM nodes for that matter.

My point was mainly that we get a string and pass it to AppKit, and AppKit does something with it, which is completely opaque to us. So we can't really describe it in terms of ICU operations.

@annevk
Copy link
Member

annevk commented Apr 5, 2017

Sure, but we shouldn't describe anything in terms of ICU operations, since ICU is just an implementation. The actual description needs to be flexible enough either way.

@domenic
Copy link
Member

domenic commented Apr 5, 2017

I guess the question in terms of implementations is whether WebKit/JSC would plan to implement Intl.Segmenter's word mode using ICU or AppKit.

@rniwa
Copy link
Collaborator

rniwa commented Apr 5, 2017

@annevk : well, the fundamental problem here is that we use ICU for all other word breaking purposes except editing & selection. I was using the term "ICU" as a way of referring to the word boundary definition for Intl.Segmenter. To put it another way, the word boundary behavior for selection can't be same as the one used for line breaking and Intl.Segmenter etc...

I don't think Web authors want our word boundary behavior for selection to be the word boundary behavior of Intl.Segmenter either. For starters, they have specific quirks that are only desirable and useful in the context of selecting a word. And it's going to be extremely slow.

@rniwa
Copy link
Collaborator

rniwa commented Apr 5, 2017

I guess the question in terms of implementations is whether WebKit/JSC would plan to implement Intl.Segmenter's word mode using ICU or AppKit.

We definitely don't want to be implementing it using AppKit because it doesn't expose enough information about other granularities ICU does: https://developer.apple.com/reference/foundation/nsattributedstring?language=objc

@domenic
Copy link
Member

domenic commented Apr 5, 2017

For starters, they have specific quirks that are only desirable and useful in the context of selecting a word.

That's what I was missing, thanks.

@yosinch
Copy link

yosinch commented Apr 6, 2017

I would like to oppose standardize Range#expand(). I prefer to get rid of Range#expand() from browsers.
Reason of my objection of standardize Range#expand() are:

  1. DOM base Range#expand() isn't useful since I'm not sure about use case and DOM base can't handle collapsed whitespaces,, or it can be done by Intl.Segmenter+Range.toString()
  2. Visible character base Range#expand(), Blink and WebKit do so, is hard to spec'd, e.g. Element#innerText. I think visible character in DOM spec is layering violation.
  3. Implementation of Range#expand() and Selection#modify() should be different == it is hard to share code, e.g. Range#expand() may not support line and page unit. Result of word unit on Selection#modify() are different among platforms, e.g. Windows includes trailing whitespace but others don't.

I think it is better to provide "visible character iterator" as API, in editing spec, and using Intl.Segmenter is better than providing high-level API Range#expand().

@annevk
Copy link
Member

annevk commented Apr 6, 2017

@yosinch are you in the position to get it removed from one of the three browsers mentioned in OP?

@rniwa
Copy link
Collaborator

rniwa commented Apr 6, 2017

FWIW, we will not attempt or consider to remove this API at this moment due to backwards compatibility concerns.

@yosinch
Copy link

yosinch commented Apr 6, 2017

@annevk, I'm working for Blink at Google, and owner of Blink editing. I'm also a member of editing-tf@
For Blink, usage of Range#expand() is low and we would like to reduce maintenance cost, I would like to
remove Range#expand().

@annevk
Copy link
Member

annevk commented Apr 6, 2017

@yosinch ah okay, if you're successful in removal we can file a bug against Edge to see if they can consider it too. Maybe then down the line WebKit can reconsider. I'm happy to delay standardizing contentious APIs.

@foolip
Copy link
Member Author

foolip commented Apr 7, 2017

@yosinch, do you know any editing people working on Edge? Getting their thoughts on this and Selection#modify() would be good.

Compat risk wise, Range#expand() has lower usage than many things we have successfully removed, but it would still require httparchive analysis to get an idea about what the breakage would look like. Unfortunately this API is not very grep friendly, my experiments with BigQuery are quite noisy. Happy to help dig in that if you want to pursue removal though.

@yosinch
Copy link

yosinch commented Apr 7, 2017

@yosinch, do you know any editing people working on Edge? Getting their thoughts on this and Selection#modify() would be good.

I don't know Edge people working on editing. Last time editing meeting in TPAC , Travis joined for discussion.

@foolip
Copy link
Member Author

foolip commented Apr 7, 2017

@travisleithead

@gked
Copy link

gked commented Apr 11, 2017

I am of the same opinion as rniwa. We should still be backwards compatible with the web. EdgeHtml will keep the implementation as is. We could, of course, deprecate it in the spec so authors would be discouraged from using it.

@annevk
Copy link
Member

annevk commented Apr 12, 2017

@gked it's already maximally deprecated by not being in the DOM Standard to begin with.

@foolip
Copy link
Member Author

foolip commented Apr 12, 2017

@gked, do you have any sense of how interoperable the EdgeHTML implementation is with Blink and WebKit? Does it operate on the DOM, the layout tree, or is it all tied up with editing as it is in Blink/WebKit?

@annevk
Copy link
Member

annevk commented Mar 13, 2018

@yosinch have you made any progress here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

7 participants