Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for the proposed Semantic Highlighting Protocol #954

Merged
merged 12 commits into from
Feb 14, 2020

Conversation

jackguo380
Copy link
Contributor

@jackguo380 jackguo380 commented Jan 16, 2020

Mostly done the initial implementation for supporting for the semantic highlighting protocol proposed by people from the vscode language server. As far as I know currently clangd version 9 and eclipse.jdt.ls both have working implementations of this protocol.

This PR adds a new setting to configure semantic highlight g:LanguageClient_semanticHighlightMaps. Which is a per language map like serverCommands of mappings from semantic/textmate scopes to vim highlight groups.

Here are a couple of sample configurations and screenshots to go along.

Using clangd 9 for C++

let g:LanguageClient_semanticHighlightMaps['cpp'] = [
            \ {'Function': ['entity.name.function.cpp']},
            \ {'Function': ['entity.name.function.method.cpp']},
            \ {'CppNamespace': ['entity.name.namespace.cpp']},
            \ {'CppEnumConstant': ['variable.other.enummember.cpp']},
            \ {'CppMemberVariable': ['variable.other.field.cpp']},
            \ {'Type': ['entity.name.type.class.cpp']},
            \ {'Type': ['entity.name.type.enum.cpp']},
            \ {'Type': ['entity.name.type.template.cpp']},
            \ ]

hi! CppEnumConstant ctermfg=Magenta guifg=#AD7FA8 cterm=none gui=none
hi! CppNamespace ctermfg=Yellow guifg=#BBBB00 cterm=none gui=none
hi! CppMemberVariable ctermfg=White guifg=White


Using eclipse.jdt.ls for Java

let g:LanguageClient_semanticHighlightMaps['java'] = [
            \ {"JavaStaticMemberFunction": ['storage.modifier.static.java', 'entity.name.function.java', '**']},
            \ {"JavaMemberVariable": ['meta.definition.variable.java', 'meta.class.body.java', 'meta.class.java', '**']},
            \ {"Function": ['entity.name.function.java', '**']},
            \ {"Function": ['*', 'entity.name.function.java', '**']},
            \ {"Type": ['entity.name.type.class.java', '**']},
            \ {"Type": ['*', 'entity.name.type.class.java', '**']},
            \ ]

hi! JavaStaticMemberFunction ctermfg=Green cterm=none guifg=Green gui=none
hi! JavaMemberVariable ctermfg=White cterm=italic guifg=White gui=italic


TODO:

  • Support for vim 8.1 with +textprop
  • Disable feature when the version of vim or neovim does not support the needed APIs
  • More tests?

The PR is broken down into 2 commits since I had to upgrade the lsp-types crate to get the types needed for semantic highlighting.

Let me know what you think and if there are any improvements/fixes needed!

Related PRs/Issues:
#383
microsoft/vscode-languageserver-node#367

Also see this document for exactly how the protocol works:
https://github.com/microsoft/vscode-languageserver-node/blob/5a9b33c23de84c3341e011e79221795a8059375b/protocol/src/protocol.semanticHighlighting.proposed.md

@YaLTeR
Copy link
Contributor

YaLTeR commented Jan 19, 2020

Oh, this is so cool! Can't wait for this to get merged.

A number of these scopes like entity.name.function.java seem to follow the pattern of entity.name.function.something, maybe add a couple of default rules mapping all of these to a reasonable Function? And the same for any other common scopes.

@jackguo380
Copy link
Contributor Author

Thanks @YaLTeR

I was considering adding default values but I don't see any good way to do it since the semantic scopes are kind of language server specific. I'm kind of worried that language servers are going to use different scope names or map the scopes differently, so the defaults may look good on one server but look awful on a another one.

I'll probably just add them into the wiki page for each server once this gets merged.

Copy link
Owner

@autozimu autozimu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work!

Just left some comments here.


The maps associate the highlight group with a semantic scope matching pattern.
Any symbols that have a scope that matches the pattern will be highlighted
wtih that highlight group.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wtih => with

@@ -1375,6 +1385,67 @@ function! LanguageClient_contextMenu() abort
return LanguageClient_handleContextMenuItem(l:options[l:selection - 1])
endfunction

function! LanguageClient_semanticScopes(...) abort
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename this a bit to ..._show/print to make the intension more clear.

return LanguageClient#Call('languageClient/showSemanticHighlightSymbols', l:params, l:Callback)
endfunction

function! LanguageClient_showCursorSemanticHighlightSymbols() abort
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest ..._showSemanticHighlightUnderCursor

In order for a pattern to match a semantic scope the lists must fully match
in length and list items.
>
['x', 'y', 'z'] == ['x', 'y', 'z']
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's replace the == with matches to make it more clear.

src/types.rs Outdated
assert_eq!(do_matches(&t3), (false, false, true, true));
assert_eq!(do_matches(&t4), (false, true, false, true));
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add unit tests covering case of **?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a ** gets used it gets converted into the corresponding SemanticHighlightMatcher so the string ** doesn't ever appear in the matcher itself. See buildSemanticHighlightMatchers in language_server_protocol.rs for the construction.

E.g. an array ['a', 'b', 'c', '**'] would construct the ArrayStart matcher.

This is kind of limiting right now since the ** can only be used at the beginning/end of the array (although it should be sufficient for most use cases). I don't see any good way to evaluate arbitrary patterns of ** without an overly complex implemenation or hacks like concatentating the items and regexing. Any suggestions?

src/vimext.rs Show resolved Hide resolved
>
let g:LanguageClient_semanticHighlightMaps = {
\ 'java': [
\ {"Function": ['entity.name.function.java', '**']},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think if it's more intuitive if the map is mapping semantic groups to highlight groups?

Another related question, what is the motivation of supporting two styles of this setting? Should one suffice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main issue/reason for the reversed map is that vimscript doesn't support arrays as a key into a dictionary.

Originally I just had a single map but I realized afterward that theres often a need to map multiple scopes to the same highlight group. I guess I could simplify the documentation and only mention the second style but still allow the first if it ever gets used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea is to represent semantic scopes as a string with some separator, e.g. storage.modifier.static.java/entity.name.function.java/**. That would allow using them as keys. Although idk if there are any banned characters from the semantic scope names that would be usable as separators.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's reduce the number of supported styles here. More styles supported only leads to complicated implementation to maintainer and more confusion to end user.

@YaLTeR cleaver idea! Agreed the tricky part is the choice of separator. As different language have different rules, I don't know there are possible separators working for all languages.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing the same approach presented by @YaLTeR. To solve the issue of ambiguity of separators, a buffer local variable with default can be introduced, say b:LanguageClient_semanticScopeSeparator with default of /.

e.g., user might have config like

let g:LanguageClient_semanticHighlightMaps['java'] = {
    \ 'storage.modifier.static.java/entity.name.function.java/.*': 'Function',
}

In a different filetype where / might be part of the scope themselves,

autocmd *.elisp setlocal LanguageClient_semanticScopeSeparator='|'
let g:LanguageClient_semanticHighlightMaps['elisp'] = {
    \ 'storage.modifier.static|name.function|.*': 'Function',
}

We would construct scope by concatenate the scopes with this separator. The actual highlight group are then being looked up through this map.

IMO, this way is more intuitive and solves the original issue that ** can only appear in beginning or end.

\ {"JavaStaticMemberFunction": ['storage.modifier.static.java', 'entity.name.function.java', '**']},
\ {"JavaMemberVariable": ['meta.definition.variable.java', 'meta.class.body.java', 'meta.class.java', '**']},
\ {"Function": ['entity.name.function.java', '**']},
\ {"Function": ['*', 'entity.name.function.java', '**']},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these two lines be replaced with one {"Function": ['**', 'entity.name.function.java', '**']}?

>
let g:LanguageClient_semanticHighlightMaps = {}
let g:LanguageClient_semanticHighlightMaps['java'] = [
\ {"JavaStaticMemberFunction": ['storage.modifier.static.java', 'entity.name.function.java', '**']},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this list is parsed in order top to bottom until a matching semantic scope is found? What about the first case, in which order is that parsed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a note about the order. It is from top to bottom.

>
let g:LanguageClient_semanticHighlightMaps = {
\ 'java': [
\ {"Function": ['entity.name.function.java', '**']},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea is to represent semantic scopes as a string with some separator, e.g. storage.modifier.static.java/entity.name.function.java/**. That would allow using them as keys. Although idk if there are any banned characters from the semantic scope names that would be usable as separators.

mapping_pairs
.extend(mapping.into_iter().map(|(hl_group, val)| (val, hl_group)));
}
Value::Array(values) => {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's start with supporting one style first and begin from there.

With either two styles or one style, serde would be able to deserialize json string to the variable, without manually unwrapping and mapping. The added benefits is serde would be able to tell where and why deserialize failed if there is any error.

>
let g:LanguageClient_semanticHighlightMaps = {
\ 'java': [
\ {"Function": ['entity.name.function.java', '**']},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's reduce the number of supported styles here. More styles supported only leads to complicated implementation to maintainer and more confusion to end user.

@YaLTeR cleaver idea! Agreed the tricky part is the choice of separator. As different language have different rules, I don't know there are possible separators working for all languages.

src/vimext.rs Show resolved Hide resolved
Ok(())
}

fn buildSemanticHighlightMatchers(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably be extracted to a utility function.

InitializeParams {
client_info: Some(ClientInfo {
name: "LanguageClient_neovim".into(),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use "LanguageClient-neovim" instead.

@@ -1577,7 +1803,11 @@ impl LanguageClient {
tab_size,
insert_spaces,
properties: HashMap::new(),
trim_trailing_whitespace: None,
insert_final_newline: None,
trim_final_newlines: None,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use default to fill properties we don't care.

use std::cmp::Ordering;

match existing_hl.line.cmp(&new_hl.line) {
Ordering::Less => {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of comparing each highlight group, can we just delete all highlights on screen and add all newer visible highlights? Similarly to how diagnostics are highlighted. In my opinion, the implementation would be much simplier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main issue is that the protocol supports incremental highlighting. So if that case is not handled then it may only apply the changed highlighting.

Currently eclipse.jdt.ls sends a single line of highlighting if the line itself was changed, if a line was added/removed the entire document's worth of highlighting is sent.

From this comment (prabirshrestha/vim-lsp#633 (comment)) it looks like clangd 10 has similar behavior except it can also send half the document's worth of highlighting upon add/remove lines.

I'll continue looking at simplifying this logic a bit. Might try a region based approach rather than the current line by line one.

@jackguo380 jackguo380 force-pushed the semantic_highlight branch 2 times, most recently from 9dc369d to dc627f2 Compare January 21, 2020 21:53
@jackguo380
Copy link
Contributor Author

jackguo380 commented Feb 5, 2020

@autozimu @YaLTeR Thanks for all the review comments, I just pushed a fairly major change to use regexs rather than '*' and '**', please have a look.

In summary, the configuration of LanguageClient_semanticHighlightMaps now looks like this:

let g:LanguageClient_semanticHighlightMaps['java'] = {
        \ '^storage.modifier.static.java:entity.name.function.java': 'JavaStaticMemberFunction',
        \ '^meta.definition.variable.java:meta.class.body.java:meta.class.java': 'JavaMemberVariable',
        \ '^entity.name.function.java': 'Function',
        \ '^[^:]*entity.name.function.java': 'Function',
        \ '^[^:]*entity.name.type.class.java': 'Type',
        \ }

The keys of the map are vim regexs which are then matched against the semantic scopes concatentated together using the string LanguageClient_semanticScopeSeparator which is : by default.

E.g.

['invalid.deprecated.java', 'meta.class.java', 'source.java']
# becomes
'invalid.deprecated.java:meta.class.java:source.java'

I figured this is the simplest to implement and most vim users are competent enough in regex to be able to easily use this. A minor implementation defail is that we have to evaluate the regex's in vim due to differences in syntax between vim and rust's regexes.

@autozimu
Copy link
Owner

autozimu commented Feb 6, 2020

Awesome! Will take another look when I got a chance.

@autozimu autozimu merged commit 7c741d0 into autozimu:next Feb 14, 2020
@autozimu
Copy link
Owner

Thank you very much for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants