-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Render Markdown in search results #77686
Conversation
Some changes occurred in HTML/CSS/JS. |
r? @jyn514 (rust_highfive has picked a reviewer for you, use r? to override) |
Also the summary for |
cc @euclio , you know this code better than I do. |
@camelid have you considered the issues ollie mentioned in #68699 (comment) ? |
Ah, I was trying to figure out where the alt-text was used. I'll add back the plain summary function in addition, and use that for alt-text. |
cc @GuillaumeGomez , this replaces #68699. I'm not yet sure how it differs from that PR, although I think it does go through pulldown instead of re-parsing markdown in javascript. |
Yes, this approach is to render the HTML using pulldown ahead of time. |
e766c01
to
aa88d52
Compare
Hmm, it's still rendering alt-text as HTML... |
This comment has been minimized.
This comment has been minimized.
b2fe8de
to
e1e172f
Compare
Rebased. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was trying to figure out where the alt-text was used. I'll add back the plain summary function in addition, and use that for alt-text.
Did you fix this? If so, can you add a test?
We discussed in PMs that you're still working on this so I'm not doing a full review for now - let me know when you've got this working and I'll take another look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I'm the best reviewer for this.
src/librustdoc/html/static/main.js
Outdated
return html.replace('<code>', '`').replace('</code>', '`') | ||
.replace(/<[^>]+>/, ''); | ||
var dom = new DOMParser().parseFromString( | ||
html.replace('<code>', '`').replace('</code>', '`'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit is no longer necessary I think.
html.replace('<code>', '`').replace('</code>', '`'), | |
html, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I take it back - it is necessary if you want to replace code blocks with `. Not sure why rustdoc is doing that, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not strictly necessary, but I think it's better to do it this way, for one since it used to use ` here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it kinda makes sense. However, we also keep other HTML tags like <b>
and <i>
. Should we also "revert" them into markdown too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not strictly necessary, but I think it's better to do it this way, for one since it used to use ` here.
My comment still stands. :p
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have the same issue as in #68699 (comment) that we don't know what was originally html and what was markdown
I do not think that is correct. The example given in #68699 (comment) was:
/// foo `this is a code span` bar \`this isn't\` baz
pub struct Foo;
That is stored in the search index as:
foo <code>this is a code span</code> bar `this isn't` baz
And then convertHTMLToPlaintext
converts that into:
foo `this is a code span` bar `this isn't` baz
Which is exactly the same as the current hover text (that you get with rustdoc nightly).
That said, I am open to just stripping every HTML tag if we decide that would be better. I just don't think #68699 (comment) is relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@camelid the test case is the other way:
/// foo <code>this is a code span</code> in HTML
///
/// `this is markdown` baz
pub struct Foo;
There's no way to get back that info after the fact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just copy-pasted the test case from the comment you linked to. The comment doesn't seem to be taking issue with the example you're giving. This is the full comment text:
No. The summaries stored in the search index don't contain that information. For:
/// foo `this is a code span` bar \`this isn't\` baz pub struct Foo;the search index contains:
foo `this is a code span` bar `this isn't` baz
There is no way in JavaScript to know which backticks meant codeblocks and which didn't.
As I've said, to do this right the summaries need to be rendered and stored in the search index as HTML.
It is saying that converting backticks into HTML is a problem because they both mean inline code and literal backticks. Converting in the opposite direction is loss-less.
And, either way, I don't see how converting both `code`
and <code>code</code>
into `code`
when this is just for alt-text is an issue. If for some reason this is a blocker, I am absolutely fine to just strip all HTML tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure about this but I don't think it should block this change. So if @GuillaumeGomez thinks this is good to go I do too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I said I didn't. 😛
@Mark-Simulacrum Wait, but this will make the squashed commit show as authored by bors. @bors squash- |
@camelid: 🔑 Insufficient privileges: not in try users |
@bors squash- |
I am confused -- my understanding is that the functionality in bors does not produce that result? It'll set the committer to bors most likely but that seems fine? |
I'm not sure, but I'd rather squash by hand if you want me to squash. |
I would personally like to at least get rid of the "Fix typo" commit, I don't care strongly about others though. |
Previously Markdown documentation was not rendered to HTML for search results, which led to the output not being very readable, particularly for inline code. This PR fixes that by rendering Markdown to HTML with the help of pulldown-cmark (the library rustdoc uses to parse Markdown for the main text of documentation). However, the text for the title attribute (the text shown when you hover over an element) still uses the plain-text rendering since it is displayed in browsers as plain-text. Only these styles will be rendered; everything else is stripped away: * *italics* * **bold** * `inline code`
@GuillaumeGomez was concerned about browser compatibility.
Finally!
Accidentally removed in rebase.
4cb30b2
to
376507f
Compare
I squashed in the "Fix typo" commit. @Mark-Simulacrum could you |
📌 Commit 376507f has been approved by |
✌️ @camelid can now approve this pull request |
If it's a big deal, I can squash more away, but I would really like to get this merged. |
Rollup of 11 pull requests Successful merges: - rust-lang#77686 (Render Markdown in search results) - rust-lang#79541 (Doc keyword lint pass) - rust-lang#79602 (Fix SGX CI) - rust-lang#79611 (Use more std:: instead of core:: in docs for consistency) - rust-lang#79623 (Pass around Symbols instead of Idents in doctree) - rust-lang#79627 (Update cargo) - rust-lang#79631 (disable a ptr equality test on Miri) - rust-lang#79638 (Use `item_name` instead of pretty printing for resolving `Self` on intra-doc links) - rust-lang#79646 (rustc_metadata: Remove some dead code) - rust-lang#79664 (move interpret::MemoryKind::Heap to const eval) - rust-lang#79678 (Fix some clippy lints) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
Don't worry it wasn't a big issue. Thanks a lot for working on this! |
Yay! Excited to look at the docs for the next nightly :) |
I added `summary_opts()` before I cut the branch for rust-lang#77686 (2 months ago!), so this "slipped through the cracks".
…Gomez Use `summary_opts()` in another spot I added `summary_opts()` before I cut the branch for rust-lang#77686 (2 months ago!), so this "slipped through the cracks".
Fixes #32040.
Previously Markdown documentation was not rendered to HTML for search results,
which led to the output not being very readable, particularly for inline code.
This PR fixes that by rendering Markdown to HTML with the help of pulldown-cmark
(the library rustdoc uses to parse Markdown for the main text of documentation).
However, the text for the title attribute (the text shown when you hover over an
element) still uses the plain-text rendering since it is displayed in browsers
as plain-text.
Only these styles will be rendered; everything else is stripped away:
inline code