Add support for custom parsing of APC, SOS and PM sequences. #115

boazy · 2024-08-04T14:36:51Z

Fixes #109.

This attempts the same goal at John-Toohey's PR #110, but since this PR was was not accepted, I wanted to try another approach and see whether it is acceptable.

Rationale for support

APC sequences are rarely supported by general-purpose terminals (if we put aside Kermit clients, tmux and screen), but there is one exception: The Kitty Terminal Graphics Protocol. While the Kitty Image Protocol is not as ubiquitous as Sixel, it is more robust, accurate and powerful and it's implemented by several prominent terminal emulators:

Kitty (obviously)
Konsole
Wezterm
Wayst
Ghostty

It's also supported by a large number of programs and libraries, some of them are mentioned here.

Design

Design Goals

Support SOS, PM and APC sequences (just like the original PR)
Support streaming parsing of SOS, PM and APC sequence payloads.
Avoid adding new conditional branches in existing code paths. This should minimize performance impact on existing code.
Avoid re-purposing OSC state variables (in order to reduce complexity)
Avoid adding any new stored state variable
No embedded parsers - use existing state change tables for parsing

Design Choices

Reorder actions into packable and non-packable actions

Currently, resulting state and actions are packed into an 8-bit value in the state change table (with a 4-bit nibble allocated for each). All values from 0 to 15 for both state and actions are used, so I could not add new states and action that would be packable. Unfortunately, I needed to support a state change with the OpaquePut action. The best way I've found is to separate the integer values of actions that do not need to be packed into the state change table (such as Hook, UnhookandClear`) into a value higher than 15 and reserve lower values for actions that need to be packed

I'm not sure if my current solution is acceptable or not but I have no other idea on how to resolve this situation without repurposing the OscString state and actions and re-introducing an embedded parser.

Action matching order

This should have minimal impact, but in order to avoid any performance impact for programs which do not use APC sequences, I made sure that when matching APC/SOS/PM-specific actions, they are always matched last.

Streaming

Unlike #110, this PR does not collect the sequence payload inside an array. Instead, the sequence payload is streamed directly to the Perform trait, one byte at a time. I believe this approach is more flexible and would lead to better performance overall. The key gains here are:

Avoiding extra vector allocation for the payload (I don't want to re-use osc_raw and in any case we would have to increase its default size to 4096 to be compatible with the max payload size).
Avoiding unnecessary copying when. The payload often just needs to be parsed directly. This is especially useful when sending images directly, since the image data would often just be forwarded to an image format parser anyway.

Names

I've changed SosPmApcString to OpaqueString, since I wanted to have a general name for all these types of sequences where the VTE parser has no understanding of the internal structure of the sequence payload. This is difference from OSC and CSI sequences where the VTE parser is aware of the high-level structure of the sequence (if not the semantics).

chrisduerr

I'll give a detailed review later today/tomorrow, but assuming that this was benchmarked it seems like a reasonable approach.

Generally I think the VTE crate needs a little rewrite, but I don't want to hold up other changes for that since it will likely take a long time before I get around to that.

boazy · 2024-08-05T04:16:38Z

I'll give a detailed review later today/tomorrow, but assuming that this was benchmarked it seems like a reasonable approach.

I ran cargo bench but I'm a bit unsure on how to run the full vtebench suite. Is there a guide for that?

I don't expect any performance impact since I've added no new state and I've only added tailed branches to an already-long match expression, but you never know until you test...

kchibisov · 2024-08-05T09:29:33Z

You can open a PR against alacritty with your changes, so we can use our bot for it. But generally you just build alacritty with your changes and run the vtebench.

boazy · 2024-08-05T10:54:14Z

You can open a PR against alacritty with your changes, so we can use our bot for it. But generally you just build alacritty with your changes and run the vtebench.

Thanks! Here are the results:

Master Branch:

  dense_cells (625 samples @ 1 MiB):
    15.47ms avg (90% < 16ms) +-1.93ms

  scrolling (199 samples @ 1 MiB):
    37.19ms avg (90% < 42ms) +-4.62ms

  scrolling_bottom_region (361 samples @ 1 MiB):
    27.25ms avg (90% < 28ms) +-2.28ms

  scrolling_bottom_small_region (369 samples @ 1 MiB):
    26.69ms avg (90% < 27ms) +-2.3ms

  scrolling_fullscreen (185 samples @ 1 MiB):
    41.15ms avg (90% < 45ms) +-2.28ms

  scrolling_top_region (247 samples @ 1 MiB):
    40.09ms avg (90% < 42ms) +-2.92ms

  scrolling_top_small_region (364 samples @ 1 MiB):
    27.06ms avg (90% < 28ms) +-2.27ms

  unicode (611 samples @ 1.06 MiB):
    15.86ms avg (90% < 17ms) +-1.55ms

This PR:

  dense_cells (606 samples @ 1 MiB):
    16ms avg (90% < 17ms) +-1.26ms

  scrolling (198 samples @ 1 MiB):
    37.39ms avg (90% < 42ms) +-5.06ms

  scrolling_bottom_region (361 samples @ 1 MiB):
    27.25ms avg (90% < 28ms) +-2.12ms

  scrolling_bottom_small_region (362 samples @ 1 MiB):
    27.19ms avg (90% < 28ms) +-2.25ms

  scrolling_fullscreen (182 samples @ 1 MiB):
    41.63ms avg (90% < 45ms) +-3.56ms

  scrolling_top_region (250 samples @ 1 MiB):
    39.6ms avg (90% < 41ms) +-2.85ms

  scrolling_top_small_region (362 samples @ 1 MiB):
    27.16ms avg (90% < 28ms) +-1.28ms

  unicode (654 samples @ 1.06 MiB):
    14.79ms avg (90% < 16ms) +-2.14ms

The scrolling_* benchmarks are unstable and give me different lead every time. Unicode seems to be consistently slightly faster on this PR while dense_cells is slightly faster on master branch, but the difference is quite small and the variance is higher.

kchibisov · 2024-08-05T11:07:29Z

@boazy just open a PR against alacritty's repo, it should better for us to review that way, since we can run the bot.

boazy · 2024-08-07T06:20:42Z

@kchibisov I've added a pull request. I'm not sure how to trigger the benchmarks bot.

chrisduerr · 2024-08-07T07:16:44Z

I'm not sure how to trigger the benchmarks bot.

You can't. I did.

src/definitions.rs

src/lib.rs

Fix typo (case) Co-authored-by: Christian Duerr <[email protected]>

Fix typo (period) Co-authored-by: Christian Duerr <[email protected]>

chrisduerr · 2024-09-21T22:48:52Z

@boazy This has stalled out for a while now, are you still interested in working on this? I'm slightly concerned because the longer we wait, the higher the chance that a vte rewrite will make all your work worthless.

boazy · 2024-10-01T02:27:42Z

I would expect three sets of functions. How that is done is an implementation detail irrelevant to the consumer (as long as it's fast).

@boazy This has stalled out for a while now, are you still interested in working on this? I'm slightly concerned because the longer we wait, the higher the chance that a vte rewrite will make all your work worthless.

I didn't have time for a while, but I've had some time again today and I've checked your suggestion. Unfortunately, I don't think it would be possible without completely re-architecting the way state works. As I've mentioned before, we only have 4 bits for actions that change the state, but we also only have 4 bits for the state itself, and here all possible values (0-15) are used.

If we want to support three sets of functions, we would need to track three separate states for each type of string (SOS, PM, APC). This is not possible without either:

Adding a new state variable for tracking the type of opaque string. (with possible performance impact)
Modifying the state variable to a u16 or better yet something like struct ResultingState(State, Action).
Would this impact performance? I don't know, but it's also a big change I'm a little bit uneasy with...

The only thing I can do without any major change is to have 3 different functions instead of opaque_start, but opaue_end and opaque_put would have to remain a single function in this case.

chrisduerr · 2024-10-11T23:28:09Z

Just to recap because this is quickly fading into obscurity and I don't want to forget about it entirely:

Splitting opaque_start up would be no problem, since we have access to the three states anyway, but splitting opaque_put and opaque_end is a problem because it means we'd have to keep track of the state we're currently in?

But couldn't you just store that on Parser (the size of which is irrelevant), then branch inside Action::OpaquePut => and Action::OpaqueEnd => since doing so should leave performance of the other branches unaffected?

I believe that's kinda your first suggestion? Why do you think this would impact performance?

boazy · 2024-10-16T11:00:33Z

But couldn't you just store that on Parser (the size of which is irrelevant), then branch inside Action::OpaquePut => and Action::OpaqueEnd => since doing so should leave performance of the other branches unaffected?

I believe that's kinda your first suggestion? Why do you think this would impact performance?

I honestly don't know. I assumed the size is important for performance since we want to be able to have all state transitions in one table (which only has 256 entries?). If we go back to the start, I created this PR trying to answer the comments to the original PR that tried to add APC support (#110), particularly this comment:

This basically feels like an embedded parser. Most of our parser is structured in a way to stream through bytes and process them as they're coming in, however rather than creating state transitions you're storing a state transition to later then match on it instead. VTE is already too slow and I'm afraid this will just amplify that problem.

#110 (comment)

I realize I can add additional state that doesn't affect OSC sequences, but seeing this comment I was a bit worried. Do you believe adding an additional field to Parser (let's say opaque_sequence_kind) should be ok?

chrisduerr · 2024-10-16T18:12:51Z

I honestly don't know. I assumed the size is important for performance since we want to be able to have all state transitions in one table (which only has 256 entries?). If we go back to the start, I created this PR trying to answer the comments to the original PR that tried to add APC support (#110), particularly this comment:

Yeah that's fair. This would make this a slow sloppy implementation and it might not be worth adding it for that reason. But that applies to your current solution too and the only alternative is a bigger rework that integrates this as a core part of our parser.

I realize I can add additional state that doesn't affect OSC sequences, but seeing this comment I was a bit worried. Do you believe adding an additional field to Parser (let's say opaque_sequence_kind) should be ok?

I think we should either do this, or go for something that properly integrates everything into our standard processing pipeline, which of course is difficult without affecting performance. I'm very hesitant to accept this sort of "second rate escape sequence", but I would prefer that over the current implementation.

boazy · 2024-10-17T05:47:09Z

I think we should either do this, or go for something that properly integrates everything into our standard processing pipeline, which of course is difficult without affecting performance. I'm very hesitant to accept this sort of "second rate escape sequence", but I would prefer that over the current implementation.

Ok, I think I've understood you now. I'll modify the pull request along these lines.

Split the `opaque_[start|put|end]()` set of functions into 3 different sets of functions for SOS, PM and APC sequences.

Nightly cargo fmt seems to have added new rules since the last commit, so the code (including code that I didn't touch in this PR) doesn't past `cargo fmt +nightly -- --check` anymore.

src/lib.rs

chrisduerr · 2024-10-21T13:49:57Z

Going it tenatively approve the current revision, I don't think there's any other changes necessary from your end.

While I'm not entirely convinced I'll give this a closer look and if there are no performance regressions it should be fine.

boazy force-pushed the parserless-apc-support branch from 46e444a to 04fbc6e Compare August 4, 2024 14:39

Add support for custom parsing of APC, SOS and PM sequences.

255c2fe

boazy force-pushed the parserless-apc-support branch from 04fbc6e to 255c2fe Compare August 4, 2024 14:41

chrisduerr reviewed Aug 4, 2024

View reviewed changes

boazy mentioned this pull request Aug 7, 2024

Try new VTE branch for benchmarks alacritty/alacritty#8118

Closed

chrisduerr requested changes Aug 11, 2024

View reviewed changes

src/definitions.rs Outdated Show resolved Hide resolved

src/lib.rs Outdated Show resolved Hide resolved

src/lib.rs Outdated Show resolved Hide resolved

boazy and others added 2 commits August 14, 2024 12:44

Update src/definitions.rs

8777a53

Fix typo (case) Co-authored-by: Christian Duerr <[email protected]>

Update src/lib.rs

790e34e

Fix typo (period) Co-authored-by: Christian Duerr <[email protected]>

This was referenced Sep 21, 2024

Added APC support #110

Closed

Use git dependency of vte as John-Toohey:vte alacritty/alacritty#7826

Closed

boazy force-pushed the parserless-apc-support branch from 9038add to cbbdde3 Compare October 17, 2024 06:18

Split opauqe_*() functions into sos_*(), pm_*(), apc_*().

45061fd

Split the `opaque_[start|put|end]()` set of functions into 3 different sets of functions for SOS, PM and APC sequences.

boazy force-pushed the parserless-apc-support branch from cbbdde3 to 45061fd Compare October 17, 2024 06:20

Added trailing commas required by nightly cargo fmt

ada43b8

Nightly cargo fmt seems to have added new rules since the last commit, so the code (including code that I didn't touch in this PR) doesn't past `cargo fmt +nightly -- --check` anymore.

boazy requested a review from chrisduerr October 17, 2024 13:11

chrisduerr requested changes Oct 20, 2024

View reviewed changes

src/lib.rs Outdated Show resolved Hide resolved

boazy requested a review from chrisduerr October 21, 2024 08:33

Combine two match clauses

4839d4d

boazy force-pushed the parserless-apc-support branch from d511cdd to 4839d4d Compare October 21, 2024 08:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for custom parsing of APC, SOS and PM sequences. #115

Add support for custom parsing of APC, SOS and PM sequences. #115

boazy commented Aug 4, 2024

chrisduerr left a comment

boazy commented Aug 5, 2024

kchibisov commented Aug 5, 2024

boazy commented Aug 5, 2024

kchibisov commented Aug 5, 2024

boazy commented Aug 7, 2024

chrisduerr commented Aug 7, 2024

chrisduerr commented Sep 21, 2024

boazy commented Oct 1, 2024

chrisduerr commented Oct 11, 2024

boazy commented Oct 16, 2024

chrisduerr commented Oct 16, 2024

boazy commented Oct 17, 2024

chrisduerr commented Oct 21, 2024

Add support for custom parsing of APC, SOS and PM sequences. #115

Are you sure you want to change the base?

Add support for custom parsing of APC, SOS and PM sequences. #115

Conversation

boazy commented Aug 4, 2024

Rationale for support

Design

Design Goals

Design Choices

Reorder actions into packable and non-packable actions

Action matching order

Streaming

Names

chrisduerr left a comment

Choose a reason for hiding this comment

boazy commented Aug 5, 2024

kchibisov commented Aug 5, 2024

boazy commented Aug 5, 2024

kchibisov commented Aug 5, 2024

boazy commented Aug 7, 2024

chrisduerr commented Aug 7, 2024

chrisduerr commented Sep 21, 2024

boazy commented Oct 1, 2024

chrisduerr commented Oct 11, 2024

boazy commented Oct 16, 2024

chrisduerr commented Oct 16, 2024

boazy commented Oct 17, 2024

chrisduerr commented Oct 21, 2024