Introduce proc_macro::Span::source_text #55780
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
So this code seems fine, but I'm not sure from a procedural and stability point of view what is the best way to handle this. |
cc @dtolnay @petrochenkov @alexcrichton -- thoughts? |
One doubt I had was whether we should return None, instead of the text of the macro call, for a span belonging to the call site. (
This seems like a reasonable API addition to me, and one that we'll want in the long haul. If any procedural macro has whitespace-sensitive parsing associated with it, then accessing the source text via means like this is intended to be the main way to actually do the parsing. I don't think we're on track to stabilize this in the near term, but as a long-term addition I think we'll want this, which to me means it's fine to land unstable for now in
We might want to strip comments. What do others think? I can get on board with whitespace-sensitive macro DSLs such as languages that differentiate between |
I could go either way on comments personally, but one aspect of omitting comments that may be a bit odd is that the difference between the byte positions of a span could be very different from the length of the source text, due to comment removal.
Good call. We could sub out the comment with spaces. |
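To make the "sub out the comment with spaces" idea concrete, here is a minimal sketch. The function name `blank_line_comments` is hypothetical and not part of the PR. It handles only `//` line comments and deliberately ignores string literals and block comments, so it illustrates the length-preserving property and is not a real lexer.

```rust
/// Replace every `//` line comment with ASCII spaces, one space per byte,
/// so the result has the same byte length as the input.
/// Naive on purpose: it does not understand string literals or `/* */`.
fn blank_line_comments(src: &str) -> String {
    let mut out = String::with_capacity(src.len());
    for line in src.split_inclusive('\n') {
        match line.find("//") {
            Some(idx) => {
                // Keep the code before the comment untouched.
                out.push_str(&line[..idx]);
                // Separate the comment body from the line terminator, if any.
                let body = line[idx..].trim_end_matches(|c: char| c == '\n' || c == '\r');
                let terminator = &line[idx + body.len()..];
                // One space per byte keeps all later byte offsets valid.
                out.extend(std::iter::repeat(' ').take(body.len()));
                out.push_str(terminator);
            }
            None => out.push_str(line),
        }
    }
    out
}
```

Because each comment byte becomes exactly one space, byte offsets of the code after a comment are unchanged, which addresses the mismatch between span positions and text length raised above.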
Seems plausible to me! |
I ... I don't know. If we're going to give the source text, I'm inclined to just give the source text, and let macros do weird things with comments. Let the market decide. =) e.g., sometimes people add "pre and post conditions" in the form of specially formatted comments. That seems not terrible to me. |
I think we should preserve the comments. As a use case: the main reason I'm making this change is for the cpp crate, which extracts C++ code, and people use comments in C++ to annotate things for static analyzers. (For example, gcc's Another use case would be to print snippets of the code while compiling, for better diagnostics. We want the comments in this case.
@ogoffart interesting. Makes sense to me. |
What should I do now? |
@nikomatsakis ping? |
I'm worried about giving guarantees to users about whitespace and comments, because that forces alternative Rust compiler implementations into preserving such things rather than just throwing them away permanently during lexing. In other words, should we give a guarantee, this effectively forces all Rust compilers to use a certain compilation model and makes that part of the specification. If this was not a guarantee but rather "at the compiler's option, you may get whitespace and comments..." then I'd be less worried.
That's why it returns an Option. If the compiler does not have access to the actual source code, it can return None.
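A caller therefore has to treat `None` as a normal outcome. The sketch below shows one way to do that. The helper `diagnostic_snippet` is hypothetical. In a real procedural macro, the `Option<String>` would presumably come from `Span::source_text()` (nightly only, behind the `proc_macro_span` feature), and the fallback could be the token stream rendered with `to_string()`.

```rust
/// Hypothetical helper: choose the text to show in a diagnostic.
/// `source_text` stands in for the result of `Span::source_text()`;
/// `fallback` stands in for a rendering that is always available.
fn diagnostic_snippet(source_text: Option<String>, fallback: &str) -> String {
    match source_text {
        // The compiler had the original source: keep its whitespace and comments.
        Some(text) => text,
        // No source available: degrade gracefully instead of failing.
        None => fallback.to_string(),
    }
}
```

The two branches can produce different strings for the same tokens, so nothing beyond diagnostics should depend on which one is taken.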
@ogoffart Ah; I thought
referred only to getting Can we clarify this in the documentation somehow that compilers are not required to give you the actual source code even in cases where it's not produced by macros? |
It would be good to somehow document this as unstable, "best effort" and restricted to "for diagnostics only". |
@petrochenkov Yeah; "best effort" / "for diagnostics only" sounds like appropriate wording; thank you <3. |
My specific use-case is a power_assert macro. I want an assertion macro that has the following output:
In order to do this, I get the span of the full expression ( For this to work, Span::start() and the string I print out need to match.
If we don't get whitespace and comments, then we run the risk of having Span::start() become out of sync with the raw text, breaking the above functionality if a comment was put inside the assert macro. |
@roblabla: do you take into account the fact that the column is in UTF-8 bytes?
In order to do that, you indeed need to know exactly what is in the comments (how many bytes correspond to how many code points) (I guess this should be computed with
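The byte versus code point distinction can be sketched as follows. The function `byte_col_to_char_col` is a hypothetical helper, not something from the PR, and it makes no claim about which unit `Span::start()` actually reports. It only shows why the contents of a comment matter when converting between the two.

```rust
/// Convert a byte offset within `line` into a count of `char`s
/// (Unicode scalar values) preceding that offset.
/// Returns None if the offset is out of range or splits a character.
fn byte_col_to_char_col(line: &str, byte_col: usize) -> Option<usize> {
    // `is_char_boundary` is false for offsets past the end of the string
    // and for offsets that fall inside a multi-byte character.
    if !line.is_char_boundary(byte_col) {
        return None;
    }
    Some(line[..byte_col].chars().count())
}
```

For the line `/* é */ x`, the `x` sits at byte offset 9 but character offset 8, because `é` occupies two bytes in UTF-8. If the comment were stripped or altered, that conversion would give a different answer.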
Can you not have some fallback such that the |
I added a note that this should not be relied upon, and is only there for diagnostics. |
Or maybe because it's just an unstable addition, we can "just do it"? If so, is there some place that needs to be updated (a tracking issue, etc?) |
The code seems fine to me, I just want to ensure that we don't lose track of this random thing. |
I guess that would be #54725 |
Thinking about this a bit more, I feel like there is quite a number of considerations and unknowns that came up on this thread (e.g., "comments or not?" etc), and I'm a bit reluctant to just r+ this without at least recording those. So I guess I would say, could someone produce a brief summary of the conversation and in particular the unknowns? Then we can put that in the tracking issue and I would feel pretty good about an r+. (I may have time to do that on monday, gotta run right now) |
@nikomatsakis I believe most changes to the proc macro APIs are shared between T-Libs and T-Lang so both of those teams. :) |
@Centril seems sensible. Regardless, I think what we need most at this juncture is a kind of capsule summary of the conversation and in particular highlighting the alternative designs that were visited and the reasons for the current one. |
@nikomatsakis Fair; I've nominated to discuss this a bit on Thursday. :) |
One thing we might want to note in any summary: There are a variety of possible strings you might return here. For example, if you had |
Another small unknown is line endings. Ideally, the meaning of a Rust program should be independent of the line endings used (because, for example, gitconfig might change the line endings). I think currently this is more or less the case: for example, line endings in string literals seem to be normalized to
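To illustrate the concern: the same program checked out with CRLF and with LF line endings yields different raw bytes, so a macro that consumes raw source text could behave differently unless it normalizes first. The helper below is a hypothetical sketch of such normalization. Whether `source_text` itself normalizes anything is not settled in this thread.

```rust
/// Hypothetical helper: map Windows-style CRLF line endings to LF so that
/// later processing does not depend on how the file was checked out.
/// A lone `\r` that is not followed by `\n` is left untouched.
fn normalize_line_endings(text: &str) -> String {
    text.replace("\r\n", "\n")
}
```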
ping from triage @nikomatsakis what's the update on this? |
@nikomatsakis Could you post a summary of the current state of this pull request? (triage) |
ping from triage anyone from @rust-lang/lang @rust-lang/libs can review this? |
I'm going to r? @petrochenkov for now (please make a fresh issue number and tracking issue for it) so that this PR can be dealt with. We don't have to figure out everything just now. |
I'm ok with this as long as this is unstable and documented as best effort.
@bors r+ |
📌 Commit e88b0d9 has been approved by |
Introduce proc_macro::Span::source_text

A function to extract the actual source behind a Span.

Background: I would like to use `syn` in a `build.rs` script to parse the Rust code and extract part of the source code. However, `syn` only gives access to proc_macro2::Span, and I would like to get the source code behind that. I opened an issue on the proc_macro2 bug tracker for this feature, dtolnay/proc-macro2#110, and @alexcrichton said the feature should first go upstream in proc_macro. So there it is!

Since most of the Span API is unstable anyway, this is guarded by the same `proc_macro_span` feature as everything else.
☀️ Test successful - checks-travis, status-appveyor |