The _code editor_ use case #8

marijnh · 2018-11-14T12:46:33Z

(Opening this to join the discussion—I unfortunately found out about this work only now, via the Chrome Dev Summit.)

So, another type of software that has been implementing something like this is code editors. All of CodeMirror (which I maintain), Ace, and Monaco do some kind of virtual rendering, where in order to support huge documents without bringing the browser to its knees, they only render the lines of code around the viewport.
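As a rough illustration of the "only render the lines around the viewport" idea, here is a minimal sketch. It assumes a fixed line height, which real editors do not have, and `visibleLineRange` and its `overscan` parameter are hypothetical names, not part of any of the editors mentioned.

```javascript
// Hypothetical helper: which lines should be in the DOM for a given scroll position?
// Assumes every line has the same height; real editors track per-line heights.
function visibleLineRange(scrollTop, viewportHeight, lineHeight, lineCount, overscan = 5) {
  const first = Math.floor(scrollTop / lineHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / lineHeight); // exclusive
  return {
    from: Math.max(0, first - overscan),      // inclusive, clamped to document start
    to: Math.min(lineCount, last + overscan), // exclusive, clamped to document end
  };
}

// A 100,000-line document scrolled to line 1000, with a 600px viewport and 18px lines.
const range = visibleLineRange(18000, 600, 18, 100000);
console.log(range); // → { from: 995, to: 1039 }
```

Only about 44 of the 100,000 lines would need DOM nodes here; the rest is represented by empty space of the right height.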

This is different from classical infinite scrolling in that the content isn't open-ended—there's a clear start and end to it, and the vertical scrollbar should provide a reasonable approximation of the height of the entire document right away. But other than that, I think there's a lot of overlap. We've also been having problems getting browser search to work (or, more accurately, have just given up on that for the time being, falling back to custom search interfaces, which is problematic for various reasons).

One thing that concerns me about the direction that uses hidden DOM nodes is that creating those DOM nodes might already be too expensive. In CodeMirror, we strive to support any document size that reasonably fits in memory, trying to keep our memory consumption for non-visible parts of the document under 2x the size of the strings needed to represent the document text. In this context, given the memory consumed by browsers' DOM implementations, creating the entire DOM structure for the document might just not be viable for memory pressure reasons.

As such, when I heard about this work I was initially hoping that it'd include some programmatic way for JavaScript code to react to search events. That'd be a much less convenient API to plug in to, but for a library like this, knowing in advance what the user is searching for would allow the library to selectively render those parts of the DOM that match the search string, at which point the browser's native code could 'see' them and scroll them into view as appropriate. (There'd be corner cases like searches that match too many lines that we could still not handle perfectly, but the user experience would still go from awful to pretty good, which'd be a big win.)
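To make the idea concrete: if the library were told the search string, it could work out which lines to render before the browser looks for them. The sketch below is purely hypothetical, since no such search event exists in browsers; `linesMatchingSearch` and the `maxLines` cap are made-up names. It is case-insensitive and ignores matches that span line breaks.

```javascript
// Hypothetical: given the user's search string, find the lines that would
// need to be rendered so the browser's native find can "see" them.
function linesMatchingSearch(lines, query, maxLines = 500) {
  if (query === "") return { lineNumbers: [], truncated: false };
  const needle = query.toLowerCase();
  const lineNumbers = [];
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].toLowerCase().includes(needle)) {
      // The "too many matches" corner case: stop and report truncation.
      if (lineNumbers.length === maxLines) return { lineNumbers, truncated: true };
      lineNumbers.push(i);
    }
  }
  return { lineNumbers, truncated: false };
}

const sampleDoc = ["let a = 1", "foo()", "Let b", "bar"];
console.log(linesMatchingSearch(sampleDoc, "let")); // → { lineNumbers: [ 0, 2 ], truncated: false }
```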

One thing that's not quite clear to me yet is how you're addressing 'flickering' during quick scrolling, where empty space becomes visible as you scroll quickly because scroll events are delivered only after the scrolling took place. Will these components be able to hook into the browser's native scroll behavior in a way that allows them to render or un-hide their content before it becomes visible, or will they have to fake their own scrolling? (The latter is problematic because you'll never exactly duplicate the native behavior or performance.)


domenic · 2019-02-01T22:03:40Z

Hey @marijnh, thanks for weighing in, and sorry for us taking so long to get back to you! The team has gone through a bit of churn, but we'd love to pick up this discussion if you're willing to forgive our lengthy absence. /cc @rakina, @bicknellr, @stubbornella, @chrishtr, and @vmpstr, who are involved in various capacities these days.

We're hugely interested in supporting the code editor use case. If we're going to put something into the web platform, it should be general enough to support all types of "infinite list" cases, from Twitter timelines to code editors to news articles. Indeed, one of the main reasons we've renamed our work to "virtual scroller" is to try to be more inclusive about the use cases. CodeMirror is one of the potential users that we are all particularly excited about, mainly because of how it powers GitHub.com, which we all use.

> One thing that concerns me about the direction that uses hidden DOM nodes is that creating those DOM nodes might already be too expensive. In CodeMirror, we strive to support any document size that reasonably fits in memory, trying to keep our memory consumption for non-visible parts of the document under 2x the size of the strings needed to represent the document text. In this context, given the memory consumed by browsers' DOM implementations, creating the entire DOM structure for the document might just not be viable for memory pressure reasons.

It would be great to dive into this more. Actually having the text be present in the DOM has so many advantages, not just for find-in-page, but also for accessibility, indexability, etc. We'd very much prefer to make that viable. And your benchmark, of 2x the size of the strings, gives a lot of hope! I think we can definitely ensure that "searchable invisible DOM" of this sort, since it won't have associated layout objects and similar, can come in under that.

Probably you'd need to hold off on fully "decorating" the DOM nodes until they come into view, e.g. adding wrapper nodes for syntax highlighting, or IntelliSense, or so on. But just keeping the text in memory, perhaps with a <div> wrapper per line or per 10 lines or so, should stay under the 2x limit. At least, that's my intuition; does it match with yours from building CodeMirror?

> As such, when I heard about this work I was initially hoping that it'd include some programmatic way for JavaScript code to react to search events.

We did start down this path, but we then realized we'd need a separate solution for each of accessibility, indexability, focus management, and find-in-page. Some people also expressed privacy concerns about "watching" what the user searched for; although I'm not sure those made a ton of sense, having to overcome those adds more friction.

In the end we realized we weren't thinking big enough. Virtualization is not actually something to be preserved; it's a consequence of the fact that the browser is not efficient enough at dealing with out-of-viewport elements. It's better if we can treat the problem at its source, and make virtualization unnecessary. So that's the latest approach. We'd be thrilled if you were able to take the time to test out that work and let us know if it's workable yet, or needs more elbow grease.

> One thing that's not quite clear to me yet is how you're addressing 'flickering' during quick scrolling, where empty space becomes visible as you scroll quickly because scroll events are delivered only after the scrolling took place.

I'm a little unclear as to whether you're referring to normal scrolling, or find-in-page scrolling. Both cases seem solvable, though. From what I understand scroll events are currently dispatched at requestAnimationFrame timing, i.e. before the actual painting occurs, so the tools should already be present. (If not, that's probably a bug!) Similarly, the activateinvisible event for "something is found" gets dispatched before the browser scrolls to the found content, giving you the opportunity to un-hide or otherwise prepare your content.

So, where do we go from here? If you're willing to do some experiments with early-stage, behind-a-flag technology, we'd love to get your feedback. If you enable "experimental web platform" features in Chrome Canary, you can use the searchable invisible DOM proposal's invisible="" attribute and activateinvisible event. The API surface may be changing soon as part of a merger with another set of APIs (which will help make invisible DOM nodes even cheaper), but the basic concepts are there.
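A minimal sketch of how the two pieces named above might be used together. The `invisible` attribute and `activateinvisible` event are the experimental names from this thread, and the API surface was expected to change. `revealOnActivate` is a hypothetical handler, and it assumes the listener is attached to the invisible wrapper itself.

```javascript
// Sketch only: un-hide a wrapper when the browser reports that something
// inside it was found, before the browser scrolls to the match.
function revealOnActivate(event) {
  // Assumption: the listener was added on the invisible wrapper,
  // so `currentTarget` is the element to reveal.
  const wrapper = event.currentTarget;
  if (wrapper.hasAttribute("invisible")) {
    wrapper.removeAttribute("invisible");
  }
  return wrapper;
}

// In a browser with the experimental flag enabled, wiring might look like:
//   wrapper.setAttribute("invisible", "");
//   wrapper.addEventListener("activateinvisible", revealOnActivate);
```

A real editor would probably also decorate the revealed lines (syntax highlighting, widgets) in the same handler.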

You're also welcome to experiment with our higher-level "virtual scroller" APIs, but they're in a transitional state now, from the master branch which still uses virtualization, to a new WIP PR that is actually based on searchable invisible DOM. It feels a bit early to test those, although as I said in the opening, we do very much hope that eventually they can be usable by code editors.

Anyway, let us know if there's anything we can do to work further with you and CodeMirror on this. It'd be amazing to have a working demo of native find-in-page working with a CodeMirror editor.

marijnh · 2019-02-04T10:35:32Z

Thanks for getting back to me!

> And your benchmark, of 2x the size of the strings, gives a lot of hope!

Do you know whether DOM nodes can share string data with the JS heap nowadays? I did some benchmarks on both Firefox and Chrome that suggest they can, which would be a big win. Still, you'd need at least two DOM nodes per line (a wrapping block element and a text node), with many lines being tiny (zero to 5 characters), but yeah, if we can get guaranteed non-rendered DOM nodes, staying in the 2× range might be realistic.

I still do have some questions. If these aren't laid out, how does the browser decide where to scroll to when it finds a search match in the middle of a big chunk of this content? Due to highlighting and other styling considerations (CodeMirror allows user code to insert 'widgets' in the editable content, which can have any height), the plain text height of these nodes might be quite different from their actual rendered height, and even that plain text height might be expensive to compute when jumping to a match below a ton of other content (CodeMirror internally keeps a data structure with estimated or measured heights for each line, but since the browser does the scrolling into view here, that won't help much).
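For readers unfamiliar with the kind of structure mentioned here, a minimal sketch of a per-line height map follows. It is not CodeMirror's actual implementation: `HeightMap` and its methods are hypothetical names, and the linear scan stands in for whatever tree or prefix-sum structure a real editor would likely use.

```javascript
// Tracks an estimated height for every line, replaced by a measured
// height once a line has actually been rendered.
class HeightMap {
  constructor(lineCount, estimatedHeight) {
    this.heights = new Array(lineCount).fill(estimatedHeight);
  }
  setMeasured(line, height) {
    this.heights[line] = height;
  }
  // Vertical offset of the top of `line`: the sum of all heights before it.
  offsetOf(line) {
    let top = 0;
    for (let i = 0; i < line; i++) top += this.heights[i];
    return top;
  }
  totalHeight() {
    return this.offsetOf(this.heights.length);
  }
}

const heightMap = new HeightMap(50, 18);
heightMap.setMeasured(3, 120); // e.g. a line that holds a tall widget
console.log(heightMap.offsetOf(35));  // → 732
console.log(heightMap.totalHeight()); // → 1002
```

The concern above is that the browser, not the editor, picks the scroll target, so it cannot consult a structure like this.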

> From what I understand scroll events are currently dispatched at requestAnimationFrame timing, i.e. before the actual painting occurs,

I don't think so. Remember those old pre-position:fixed solutions to keeping some box at the top of the page? They lagged. Because browsers do paint before they fire scroll events (this is very easy to test, see for example this). If they didn't, a lot of my problems would be gone, but even then, you have things like mobile browsers only firing a scroll event every once in a while during smooth scrolling. So anything that relies on fixing stuff up in response to the scroll event is going to work poorly.

> So, where do we go from here?

I have a lot on my plate right now, but yes, I will try to make time to do some experiments in the near future.

Finally, I notice this is all pretty much entirely Google-driven. Are any Mozilla people involved, or is there at least a positive response from Mozilla to this initiative to be found somewhere?

domenic · 2019-02-04T20:08:44Z

> Do you know whether DOM nodes can share string data with the JS heap nowadays?

I'm not sure, but I'll ask around!

> Still, you'd need at least two DOM nodes per line (a wrapping block element and a text node)

Although I can imagine some plausible reasons, could you expand on why you need one wrapper per line, instead of e.g. a wrapper per 10 lines or so? It'd be good to tune my intuition in this space.

> If these aren't laid out, how does the browser decide where to scroll to when it finds a search match in the middle of a big chunk of this content?

Right, the web developer still needs to do the traditional virtualization-style tricks to estimate this (like your data structure with estimated/measured heights per line), and leverage that.

Concretely, an example sequence would be:

1. Lines 1-10 of 50 are visible; lines 11-50 are invisible. The control estimates that the not-yet-rendered content (lines 11-50) will be about 4x the currently laid-out size, and creates a spacer div to represent that, thus getting the scrollbar height to look correct.
2. User searches for something that is on line 35.
3. Browser sends activateinvisible event to the line 35 wrapper node.
4. The control reacts as follows:
   1. Makes lines 31-40 visible.
   2. Rejiggers to add spacer divs representing lines 11-30 and 41-50.
5. The browser now scrolls down to line 35.
   - At this point the visible blocks are lines 1-10, a spacer div, and lines 31-40.
   - Thus the user experience of scrolling is to see the spacer div flash by between the two visible portions, before ending up with lines 31-40 visible (assuming the browser scrolls the search result at line 35 to the center of the viewport).
6. When the scroll finishes, the control reacts as follows:
   1. Makes lines 1-10 invisible.
   2. Changes the spacer div that currently represents lines 11-30 to instead represent lines 1-30. (Probably using the previously-known layout information about those lines to get a more precise estimate.)
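The sequence above can be simulated without any DOM. This is a sketch under simplifying assumptions: lines are grouped into fixed blocks of 10, and `layout`, `blockStart`, and `describe` are hypothetical helpers. It only tracks which ranges are rendered and which are covered by spacers.

```javascript
const TOTAL_LINES = 50;
const BLOCK = 10;

// Build the sequence of rendered blocks and spacers, given the set of
// first-line numbers of the blocks that are currently visible.
function layout(visibleStarts) {
  const segments = [];
  for (let from = 1; from <= TOTAL_LINES; from += BLOCK) {
    const to = Math.min(from + BLOCK - 1, TOTAL_LINES);
    const kind = visibleStarts.has(from) ? "lines" : "spacer";
    const prev = segments[segments.length - 1];
    if (prev && prev.kind === "spacer" && kind === "spacer") prev.to = to; // merge adjacent spacers
    else segments.push({ kind, from, to });
  }
  return segments;
}

const blockStart = (line) => Math.floor((line - 1) / BLOCK) * BLOCK + 1;
const describe = (segments) => segments.map((s) => `${s.kind} ${s.from}-${s.to}`).join(", ");

const visible = new Set([1]);                // lines 1-10 are rendered
const initial = describe(layout(visible));
visible.add(blockStart(35));                 // activateinvisible for line 35
const afterMatch = describe(layout(visible));
visible.delete(1);                           // scroll finished, hide lines 1-10
const afterScroll = describe(layout(visible));

console.log(initial);     // → lines 1-10, spacer 11-50
console.log(afterMatch);  // → lines 1-10, spacer 11-30, lines 31-40, spacer 41-50
console.log(afterScroll); // → spacer 1-30, lines 31-40, spacer 41-50
```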

This sequence is off the top of my head, and may not be 100% right, but hopefully you get the idea. @bicknellr may be able to provide a better description, since he's implemented roughly this in the WIP virtual scroller PR I mentioned above. There things are a bit more abstracted, so that you can handle normal scrolling and find-in-page scrolling the same way.

> > From what I understand scroll events are currently dispatched at requestAnimationFrame timing, i.e. before the actual painting occurs,
>
> I don't think so

Hmm, that's unfortunate. Do you know if things are any different with IntersectionObserver? This feels like @chrishtr's area of expertise; perhaps he could comment.

> Are any Mozilla people involved, or is there at least a positive response from Mozilla to this initiative to be found somewhere?

Both Mozilla and Apple were intrigued and fairly engaged with this idea at TPAC 2018 (minutes). I'd guess that we'll need to prove it out with some experiments though before it becomes something either would really be willing to commit to. I can ask around for more info.

marijnh · 2019-02-05T09:50:41Z

> Although I can imagine some plausible reasons, could you expand on why you need one wrapper per line, instead of e.g. a wrapper per 10 lines or so? It'd be good to tune my intuition in this space.

That's a good point—if these aren't presentational at all, I guess we could render big chunks of text to a single text node.

> Concretely, an example sequence would be:

That actually sounds like it would work really well with our approach.

> Do you know if things are any different with IntersectionObserver?

Nope, same problem. I tried opening an issue on that, but was brushed off with 'out of scope'.

Great to hear there was at least some discussion, though that does look relatively minimal. (Note that I would be very hesitant to integrate this if it ends up a Chrome-only thing. I'm happy to experiment with it and try to help convince people with useful use cases, but CodeMirror targets the Web, not Chrome.)

(Edit: initially linked the wrong IntersectionObserver issue)

chrishtr · 2019-02-06T05:27:14Z

@marijnh Thanks for the interest and detailed comments. Just to +1 what Domenic said, we would love to collaborate with you in prototyping for your use case.

Now to responses to earlier comments:

- Scroll events do not come in before paint; they are not actually guaranteed to come in at any particular time. The reason for this is that otherwise threaded scrolling would not be possible, if the site was listening to scroll events. It's possible to force the browser out of threaded scrolling, but this is definitely not a good idea, since it makes scrolling slower and less reliable, and entails complete loss of the built-in scroll gestures and smooth scrolling animations of the browser.
- However, you can listen to scroll events or add an IntersectionObserver, as a way of asynchronously listening to scroll changes. In response to such events, you can add or remove decorations to the DOM to add additional overlay elements and widgets.
- Re: how to get the scrollbar approximately right: another approach is to break the code into a sequence of (say) viewport-sized chunks and wrap each in an element that is invisible but has a minimum intrinsic size that is an estimation of how large its fully laid-out size would be. When a chunk gets close to the screen, it would be made visible.
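The chunking approach could be planned roughly as follows. This is a sketch, not a description of any existing implementation: `chunkPlan`, `chunksToReveal`, and the `margin` parameter are hypothetical names, line heights are assumed uniform for the estimate, and how the estimated size is applied to each wrapper element is left open.

```javascript
// Split a document into roughly viewport-sized chunks, each carrying an
// estimated height to use as its placeholder size while it is invisible.
function chunkPlan(lineCount, viewportHeight, estimatedLineHeight) {
  const linesPerChunk = Math.max(1, Math.floor(viewportHeight / estimatedLineHeight));
  const chunks = [];
  for (let from = 0; from < lineCount; from += linesPerChunk) {
    const to = Math.min(from + linesPerChunk, lineCount); // exclusive
    chunks.push({ from, to, estimatedHeight: (to - from) * estimatedLineHeight });
  }
  return chunks;
}

// Indices of chunks within `margin` pixels of the viewport, i.e. the ones
// that should be made visible now.
function chunksToReveal(chunks, scrollTop, viewportHeight, margin) {
  const result = [];
  let top = 0;
  chunks.forEach((chunk, index) => {
    const bottom = top + chunk.estimatedHeight;
    if (bottom >= scrollTop - margin && top <= scrollTop + viewportHeight + margin) {
      result.push(index);
    }
    top = bottom;
  });
  return result;
}

const plan = chunkPlan(100, 600, 18);
console.log(plan.length);                         // → 4
console.log(chunksToReveal(plan, 0, 600, 100));   // → [ 0, 1 ]
```

Summing `estimatedHeight` over all chunks gives the scrollbar its approximate total height before anything beyond the first chunks is laid out.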

fergald · 2019-02-07T02:48:56Z

@marijnh Would it be easy to get stats on memory usage for a large document for each of these cases?

Doc data in JS, plus:

1. No DOM
2. Everything has been inflated to minimal searchable DOM
3. Everything has been inflated to fully styled DOM
4. Everything has been inflated to fully styled DOM but hidden

If cases 3 and 4 are significantly bigger than case 2, that implies that on-demand upgrade to fully presentable DOM is going to continue to be a requirement for some authors.

domenic · 2019-04-25T21:19:07Z

Heyhey, thanks so much for engaging with us here! I'm really looking forward to the platform providing better support for code editors like this.

We're rolling this repository into WICG/virtual-scroller, so let's track this mainly in codemirror/dev#80, but also on any issues on the display-locking and virtual-scroller repositories for any specific questions that come up.

chrishtr mentioned this issue Feb 6, 2019

Support the use-case of maximal memory efficiency WICG/display-locking#42

Open

marijnh mentioned this issue Mar 8, 2019

Experiment with invisible DOM for off-screen content codemirror/dev#80

Open

domenic closed this as completed Apr 25, 2019
