Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardise the ability for client/DA to use URIs in place of file paths (enabling debugging of non-file:/ sources) #444

Open
DanTup opened this issue Nov 16, 2023 · 18 comments

Comments

@DanTup
Copy link
Contributor

DanTup commented Nov 16, 2023

Edit 2024-02-15: It turns out VS Code is already handling (and sending) URIs but without any standard way to negotiate this with the client. Rather than adding uri fields, it would probably make more sense to standardise this with capabilities. My current suggestion is detailed below at #444 (comment)


In VS Code and LSP, Uris are used everywhere instead of paths. This allows workspaces that contain projects that are not backed by the file system (for example virtual workspaces or virtual documents).

While doing some testing of registerTextDocumentContentProvider in VS Code to provide virtual documents, I discovered that in the debugger we cannot return sources with URIs that would be handled via the TextDocumentContentProvider because the Source interface has only file paths and not uris.

While it's possible to provide a sourceReference and download the source from the debugger, this results in a poor user experience because each debug session gets its own versions of these files, and they do not match the one the user may have used Go-to-Definition on (when there was no debug session). This is a very confusing experience (especially as each "copy" of the document has its own breakpoints) and something we already had to solve for file paths (that is, we try to never download sources from the debugger and always map them to disk-based sources that match where the language server would navigate to).

My suggestion would be to add a uri field to Source, and have a capability for the client to tell the server that it will read a uri field (if provided) instead of path (and has the capability to read these documents) - either as a boolean, or a list of schemes that it can load content for.

@roblourens
Copy link
Member

@puremourning
Copy link
Contributor

What should a client do with a random URI? We have this problem in LSP where server spams out 'jdt://' and client has no clue what to do with that. The source request is an elegant solution.

@DanTup
Copy link
Contributor Author

DanTup commented Nov 16, 2023

@roblourens

In VS Code, we do try to parse the string field as a URI. That happens here

Ah, that's interesting - I'll do some testing of that. It should really be part of the spec before debug adapters rely on it though. I can add a flag to only send URIs for VS Code in the meantime, but I think I'd only want to do it if there's a path to it being standard.

@puremourning

What should a client do with a random URI?

VS Code has an API (registerTextDocumentContentProvider) that lets an extension supply content for a custom scheme. The idea is that these work together - an editor might not be restricted only to file:///-based content, but might be able to handle virtual documents. That's why part of my suggestion was for the client to tell the server which URI schemes it can read documents for (it would be entirely opt-in and not require clients to handle URIs).

Eg., if I call registerTextDocumentContentProvider in my VS Code extension and say I can provide content for danny:/ URIs, the VS Code debug client could pass ['file', 'danny'] in the capabilities as schemes it can handle documents for, and the debug adapter would then know that if it has sources that are danny: URIs it could use those URIs in Source responses.

The source request is an elegant solution.

It has the issues I mentioned above - the client can't relate these sources to sources that the user discovered when there was no debug session (via language server), or across debug sessions (because they have no stable identifiers, just random sourceReferences). It's very confusing for a user to Go-to-Definition on something, add a breakpoint, and then while debugging be shown another copy of the same file (which the editor tracks a completely separate set of breakpoints for, even if the debugger is treating them the same).

@DanTup
Copy link
Contributor Author

DanTup commented Nov 16, 2023

@roblourens I just tested it, and it works as you described - this is great (but I think still needs to become standard). Thank you for highlighting it! :-)

@mfussenegger
Copy link
Contributor

VS Code has an API (registerTextDocumentContentProvider) that lets an extension supply content for a custom scheme. The idea is that these work together - an editor might not be restricted only to file:///-based content, but might be able to handle virtual documents. That's why part of my suggestion was for the client to tell the server which URI schemes it can read documents for (it would be entirely opt-in and not require clients to handle URIs).

Eg., if I call registerTextDocumentContentProvider in my VS Code extension and say I can provide content for danny:/ URIs, the VS Code debug client could pass ['file', 'danny'] in the capabilities as schemes it can handle documents for, and the debug adapter would then know that if it has sources that are danny: URIs it could use those URIs in Source responses.

This is all very well, but would require a client VS code extension, and a custom extension for any other editor per debug adapter that uses that functionality. That goes against the main selling point of the protocol:

The idea behind the Debug Adapter Protocol (DAP) is to abstract the way how the debugging support of development tools communicates with debuggers or runtimes into a protocol. [...]
Debug Adapters can be re-used across multiple development tools which significantly reduces the effort to support a new debugger in different tools.

There would need to be a mechanism in the protocol to retrieve contents for custom scheme's, but the source already kinda is that.

@DanTup
Copy link
Contributor Author

DanTup commented Nov 16, 2023

This is all very well, but would require a client VS code extension, and a custom extension for any other editor per debug adapter that uses that functionality.

All it requires is that the client knows how to get resources from a scheme. VS Code already has this support, and it seems likely LSP will get it.

That goes against the main selling point of the protocol:

I'm not sure I agree. The protocol supports many opt-in capabilities that add functionality to clients that support it. The protocol is not intended to restrict all debuggers and clients to only functionality that all other debuggers and clients can support.

There would need to be a mechanism in the protocol to retrieve contents for custom scheme's, but the source already kinda is that.

Source does it in a way that is debug-session specific and cannot interact with sources provided by any other source in the editor. This has many drawbacks that I have mentioned above. Why should debugging only work properly if your source files are on the local physical disk of a machine? Why shouldn't debugging be able to work with custom remote filesystems or when running in sandboxes or browsers with only virtual resources?

All of VS Code and LSP has been built using Uris so that the file:/// scheme doesn't have to be special. VS Code is already handling this and it supports real use cases. I don't really understand the resistance to this - it adds zero work to clients that do not want to support it.

@mfussenegger
Copy link
Contributor

mfussenegger commented Nov 16, 2023

I'm not sure I agree. The protocol supports many opt-in capabilities that add functionality to clients that support it. The protocol is not intended to restrict all debuggers and clients to only functionality that all other debuggers and clients can support.

So far no client capabilities require any language-awareness and a client can implement them reasonably in a generic way that's usable with any debug adapter. That's not possible for custom scheme's without some standard retrieval mechanism.

All of VS Code and LSP has been built using Uris so that the file:/// scheme doesn't have to be special. VS Code is already handling this and it supports real use cases. I don't really understand the resistance to this - it adds zero work to clients that do not want to support it.

Because as @puremourning pointed out, that language servers use URIs with custom scheme's without providing a standardized way to look up the document is already causing friction in other clients. See also microsoft/language-server-protocol#336

@connor4312
Copy link
Member

connor4312 commented Nov 16, 2023

I'm not sure that support URIs with custom schemes introduces any editor-specific restrictions. Considering:

  • In the case that a client requests the debugger, through its launch configuration, to debug content that includes or lives on a foreign scheme, it will be aware of that scheme.
  • In the case that a client is unaware of the foreign scheme, this implies it cannot edit files that live on that scheme. Therefore, if a DA returns sources on a foreign scheme, showing read-only contents of that file via the source request should be sufficient.

If we support this, I would be happy to adopt it in the JavaScript debugger. For example, a euser can debug JavaScript on remote URIs (e.g. https://example.com) and sourcemaps can generate URIs of virtual files (webpack-internal://... URIs for example.) Today the debugger munges these into a path-like structure with a non-zero sourceReference, but preserving the original URI would be clearer to users and provide better navigation of loaded sources.

@DanTup
Copy link
Contributor Author

DanTup commented Nov 16, 2023

So far no client capabilities require any language-awareness and a client can implement them reasonably in a generic way that's usable with any debug adapter.

There's no language-awareness required here. Just that a client is able to tell a DA what URI schemes (besides file:/) it is able to fetch content for and that the spec allows for returning those URIs instead of paths everywhere. Yes, it's only useful when a client has some other part to this, but I don't see why that's a reason not to do it - it does not need any language-specific work on part of the client/editor.

That's not possible for custom scheme's without some standard retrieval mechanism.

Sure it is. An editor would not say that it knows how to get "danny:/" documents unless is has some way of getting content. How that happens is entirely up to the editor through its own plugins or configuration. If an editor doesn't have any resource provider registered for the danny:/ scheme, it would never tell the DA that it does, and the DA should never return danny:/ URIs.

Because as @puremourning pointed out, that language servers use URIs with custom scheme's without providing a standardized way to look up the document is already causing friction in other clients.

This doesn't make any sense to me. This is an implementation issue. Servers should not return URIs that it they don't know a client can handle. The fix is to have a mechanism for servers to know what a client supports (and in the meantime, for servers not to return non-file URIs), not to prevent the ability to have these custom schemes. LSP servers doing bad things is not a reason not to support new features that provide better debugging experiences.

The LSP request above is exactly what I want to implement (an LSP equiv of VS Code's API). But there's no reason that supporting arbitrary URIs here in DAP requires any specific implementation (in LSP or otherwise). That's an implementation detail the debugger should not need to know about. If a debug adapter and the corresponding LSP server (or VS Code extension, or some other functionality of an editor) both agree they know how to use a custom URI scheme (which has been made up by a language), why shouldn't they be able to use it?

This already all exists and works in VS Code today, I really don't understand what drawbacks there are to making it standard. "Debug adapters might return garbage URIs the client cannot handle" does not seem like an argument against this functionality - they can also return invalid file paths and invalid DAP responses. The spec can't protect you from misbehaving implementations.

@mfussenegger
Copy link
Contributor

mfussenegger commented Nov 16, 2023

The fix is to have a mechanism for servers to know what a client supports (and in the meantime, for servers not to return non-file URIs), not to prevent the ability to have these custom schemes

This is already the case for the most part. They're still problematic because users wonder why something doesn't work in <editor-x> when it works in vscode.

Debug adapters might return garbage URIs the client cannot handle" does not seem like an argument against this functionality

That's not the concern or the argument I made. My concern is that there is no way for a client to implement this generically. It would need scheme specific implementations for each debug adapter that starts using this functionality to reach feature parity with vscode. That's getting back to dedicated implementations per editor per debug adapter - the problem DAP and LSP claimed to solve.

This has already been a major pain point with LSP and its undefined client commands and it was nice that DAP didn't suffer from this problem. It would be really disappointing to see this change

@DanTup
Copy link
Contributor Author

DanTup commented Nov 16, 2023

@connor4312

In the case that a client requests the debugger, through its launch configuration, to debug content that includes or lives on a foreign scheme, it will be aware of that scheme.

In my case, the launch program is not on the custom scheme, it's for additional sources that don't exist on disk. I think it would be better as part of capabilities (since the editor already knows which schemes it can handle). I actually just found this in InitializeRequestArguments:

  /**
   * Determines in what format paths are specified. The default is `path`, which
   * is the native format.
   * Values: 'path', 'uri', etc.
   */
  pathFormat?: 'path' | 'uri' | string;

This is odd, because it's a single value (not an array) so can't indicate support for both paths and URIs, and it allows for an arbitrary string with no explanation of what it is.

My suggestion would be that specific URI schemes are provided instead (and perhaps pathFormat is deprecated?):

  /**
   * Determines the schemes the client supports in URIs used for paths.
   *
   * If defined, it means the client accepts URI strings in any `path` fields such as `Source.path`.
   */
  pathUriSchemes?: string[]; // ["file", "danny"]

@mfussenegger

This is already the case for the most part. They're still problematic because users wonder why something doesn't work in <editor-x> when it works in vscode.

As an author of LSP and DAP servers used in other editors, I completely sympathise with this - but nothing in my request here requires that other editors get broken. I want this in the spec specifically to make things standard and not VS Code-only. I care very much about the LSP and DAP servers I work on being usable (and as functional as possible) with other editors.

My concern is that there is no way for a client to implement this generically. It would need scheme specific implementations for each debug adapter that starts using this functionality to reach feature parity with vscode.

I don't agree with this. If/when LSP gets support for TextDocumentContentProvider (something I'd like to contribute, but it's no good without debugger support) any editor could have a completely standard LSP server and completely standard debug adapter and no language-specific functionality/plugin and get all of this behaviour. The LSP server would register schemes with the editor, the editor would tell the debug adapter it can read content for those schemes. There's nothing VS Code specific and nothing that requires language-specific plugins.

@DanTup
Copy link
Contributor Author

DanTup commented Feb 7, 2024

I'm working on some debugger functionality that needs to know whether the client supports URIs in path fields (as VS Code does).

There's an existing pathFormat field, but IMO it's not fit for purpose (because it's not properly specified and it doesn't allow a client to express what it supports, it can only provide a single value). VS Code sends "pathFormat":"path" even though it supports URIs.

As a simple way forward, could we add something like supportsUris to the client and server capabilities to indicate that URIs are supported in all places that normally take paths? That way a debug adapter can check this field before trying to send any URIs, and a client can check it before it sends any URIs too (they need to go both ways if we want a client to be able to add breakpoints in a non-file document, for example).

@dbaeumer @roblourens any thoughts?

@roblourens
Copy link
Member

If this functionality is going to exist, I guess it makes sense for a DA to know whether or not it's available- but the current state of things is a little suspicous and I wonder whether vscode is doing the wrong thing by saying pathFormat=path and then parsing URIs out of paths. Not exactly sure about the history of this.

@puremourning
Copy link
Contributor

The argument that 'if LSP gets a feature that doesn't exist yet' is hard to justify today in isolation. But besides that, the idea that 'DAP clients can just interact with whatever LSP client happens to exist on the client' seems to bake in the assumption that DAP and LSP have any relationship whatsoever which today they don't. Indeed DAP clients and LSP clients (of which there are many) for many editors are completely independent.

I would strongly discourage any DAP notion that requires a LSP implementation.

@DanTup
Copy link
Contributor Author

DanTup commented Feb 14, 2024

@roblourens

If this functionality is going to exist, I guess it makes sense for a DA to know whether or not it's available- but the current state of things is a little suspicous and I wonder whether vscode is doing the wrong thing by saying pathFormat=path and then parsing URIs out of paths. Not exactly sure about the history of this.

Honestly I don't know how pathFormat is usable. If a client says pathFormat=uri but the DA knows nothing about this, it will send paths and the client won't get what it expects. It seems like there should be some negotiation here, or at least the client should tell the DA it supports URIs (or the set of formats it supports).

FWIW as a workaround, I'm sending a new flag (supportsDartUris) in the client capabilities from my VS Code extension and having the DA accept URIs (in breakpoint etc. requests) and return URIs (in source.path) only if set. This is working fine, but I'd rather supportsDartUris was replaced by something standard (advertising support for URIs and the specific supported schemes) because VS Code isn't the only client of this DA.

@puremourning

But besides that, the idea that 'DAP clients can just interact with whatever LSP client happens to exist on the client' seems to bake in the assumption that DAP and LSP have any relationship whatsoever which today they don't.

Using URIs instead of file paths makes no assumptions about, nor has any dependencies on LSP. The way in which I'd like to use this would involve implementations in both of those, but that's an implementation detail. A client can happily only support file:/// URIs, or could support other schemes through any mechanism it wishes (LSP or not). DAs should not return URIs that a client can't handle just like LSP servers should not.

Not adding this to the spec isn't likely to stop this from happening, it just means that clients and servers will implement it in non-standard ways (as VS Code already has, and Dart is likely to take advantage of). This will make things worse for other clients, not better.

I care very much about supporting clients other than VS Code and that's exactly why I'm here asking for a standard way to do something that is currently very VS Code-specific. I would like my DA to work with clients that do not handle URIs, but I also would like clients that are using the Dart LSP server and Dart DA to be able to support these virtual files without needing Dart-specific coding, and that's only possible if this becomes standard.

@puremourning
Copy link
Contributor

@DanTup apologies perhaps I misunderstood what was being proposed. So long as everything is opt in and people who don't opt in aren't second class, I have no objection to that!

@DanTup
Copy link
Contributor Author

DanTup commented Feb 15, 2024

@puremourning np, sorry if I wasn't very clear. My intention is absolutely that everything here should be opt-in and both the client and DA should completely understand the capabilities of the other.

Today, VS Code supports URIs but does not tell the DA a) that it supports them or b) which schemes it supports. This means DAs are either making assumptions they probably should not, or VS Code extensions are having to pass non-standard flags in to tell the DA that they support this.

IMO the existing pathFormat field is not fit for purpose. Even if VS Code started using it, it doesn't handle a) DAs not supporting URIs or b) DAs that need to know if the client is able to load content for the specific schemes that they want to use.

My proposal would be something like:

Client -> Server capabilities

interface InitializeRequestArguments {
  /**
    * Client supports receiving URIs with the following schemes in place of file paths for (such as in `stackTrace` responses).
    * 'file' should be listed explicitly if the client supports file URIs instead of paths.
    * If not provided, the DA must not send URIs to the client.
    */
  supportsUriSchemes?: string[];
}

Server -> Client capabilities (InitializeResponse.capabilities)

interface Capabilities {
  /**
    * DA supports receiving URIs with the following schemes in place of file paths (such as in setBreakpoints() requests).
    * 'file' should be listed explicitly if the DA supports file URIs instead of paths.
    * If not provided, the client must not send URIs to the DA.
    */
  supportsUriSchemes?: string[];
}

This allows both parties to indicate whether they support URIs at all, and if so which schemes. Unless the client passes supportsUriSchemes then the DA should never return URIs. If it does, the DA should only return URIs of that scheme. A client can include just file if it supports URIs, but only file:///s.

Similarly, the client should send any URIs unless the DA includes supportsUriSchemes and then, only for those URIs. This means if someone has a virtual vscode-vfs:// file open in VS Code, the breakpoints would not be sent to the DA unless the DA said it understands vscode-vfs://.

Nothing would change at all for a client that didn't provide supportsUriSchemes. However, it enables debugging for non-file schemes if both the server and client agree that they understand that scheme (the implementation of how they support that scheme is an implementation detail not important to the spec).

@roblourens any thoughts on the above?

@DanTup DanTup changed the title Support uri on Source as an alternative to path Standardise the ability for client/DA to use URIs in place of file paths (enabling debugging of non-file:/ sources) Feb 15, 2024
@DanTup
Copy link
Contributor Author

DanTup commented Sep 26, 2024

I'm still keen to make the above standard.. VS Code is already supporting it and Dart is going to use it. If nobody has any objections, I'll open a PR with roughly what's in my comment above and we can discuss further there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants