-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix broken spans in diffs #14678
Fix broken spans in diffs #14678
Conversation
09ab1be
to
e54cf99
Compare
Gitea runs diff on highlighted code fragment for each line in order to provide code highlight diffs. Unfortunately this diff algorithm is not aware that span tags and entities are atomic and cannot be split. The current fixup code makes some attempt to fix these broken tags however, it cannot handle situations where a tag is split over multiple blocks. This PR provides a more algorithmic fixup mechanism whereby spans and entities are completely coalesced into their respective blocks. This may result in a incompletely reduced diff but - it will definitely prevent the broken entities and spans that are currently possible. As a result of this fixup several inconsistencies were discovered in our testcases and these were also fixed. Fix go-gitea#14231 Signed-off-by: Andrew Thornton <[email protected]>
e54cf99
to
ae9dc35
Compare
@@ -61,22 +62,22 @@ func TestDiffToHTML(t *testing.T) { | |||
{Type: dmp.DiffEqual, Text: "</span><span class=\"p\">)</span>"}, | |||
}, DiffLineDel)) | |||
|
|||
assertEqual(t, "<span class=\"nx\">r</span><span class=\"p\">.</span><span class=\"nf\">WrapperRenderer</span><span class=\"p\">(</span><span class=\"nx\">w</span><span class=\"p\">,</span> <span class=\"removed-code\"><span class=\"nx\">language</span></span><span class=\"removed-code\"><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{ | |||
assertEqual(t, "<span class=\"nx\">r</span><span class=\"p\">.</span><span class=\"nf\">WrapperRenderer</span><span class=\"p\">(</span><span class=\"nx\">w</span><span class=\"p\">,</span> <span class=\"removed-code\"><span class=\"nx\">language</span><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This removes an unnecessary close and reopen of a removed-code span
{Type: dmp.DiffEqual, Text: "<span class=\"nx\">r</span><span class=\"p\">.</span><span class=\"nf\">WrapperRenderer</span><span class=\"p\">(</span><span class=\"nx\">w</span><span class=\"p\">,</span> <span class=\"nx\">"}, | ||
{Type: dmp.DiffDelete, Text: "language</span><span "}, | ||
{Type: dmp.DiffEqual, Text: "c"}, | ||
{Type: dmp.DiffDelete, Text: "lass=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs"}, | ||
{Type: dmp.DiffEqual, Text: "</span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>"}, | ||
}, DiffLineDel)) | ||
|
||
assertEqual(t, "<span class=\"added-code\">language</span></span><span class=\"added-code\"><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{ | ||
assertEqual(t, "<span class=\"added-code\">language</span><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This removes an unnecessary close and reopen of a added-code span
{Type: dmp.DiffInsert, Text: "language</span><span "}, | ||
{Type: dmp.DiffEqual, Text: "c"}, | ||
{Type: dmp.DiffInsert, Text: "lass=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs"}, | ||
{Type: dmp.DiffEqual, Text: "</span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>"}, | ||
}, DiffLineAdd)) | ||
|
||
assertEqual(t, "<span class=\"k\">print</span><span class=\"added-code\"></span><span class=\"added-code\"><span class=\"p\">(</span></span><span class=\"sa\"></span><span class=\"s2\">"</span><span class=\"s2\">// </span><span class=\"s2\">"</span><span class=\"p\">,</span> <span class=\"n\">sys</span><span class=\"o\">.</span><span class=\"n\">argv</span><span class=\"added-code\"><span class=\"p\">)</span></span>", diffToHTML("", []dmp.Diff{ | ||
assertEqual(t, "<span class=\"k\">print</span><span class=\"added-code\"><span class=\"p\">(</span></span><span class=\"sa\"></span><span class=\"s2\">"</span><span class=\"s2\">// </span><span class=\"s2\">"</span><span class=\"p\">,</span> <span class=\"n\">sys</span><span class=\"o\">.</span><span class=\"n\">argv</span><span class=\"added-code\"><span class=\"p\">)</span></span>", diffToHTML("", []dmp.Diff{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This removes an unnecessary close and reopen of a removed-code span
@@ -85,14 +86,14 @@ func TestDiffToHTML(t *testing.T) { | |||
{Type: dmp.DiffInsert, Text: "<span class=\"p\">)</span>"}, | |||
}, DiffLineAdd)) | |||
|
|||
assertEqual(t, "sh <span class=\"added-code\">'useradd -u $(stat -c "%u" .gitignore) jenkins</span>'", diffToHTML("", []dmp.Diff{ | |||
assertEqual(t, "sh <span class=\"added-code\">'useradd -u $(stat -c "%u" .gitignore) jenkins'</span>", diffToHTML("", []dmp.Diff{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the provided diff array and what was previously listed as expected you'll see that we actually expected something incorrect!
{Type: dmp.DiffEqual, Text: "sh "}, | ||
{Type: dmp.DiffDelete, Text: "4;useradd -u 111 jenkins""}, | ||
{Type: dmp.DiffInsert, Text: "9;useradd -u $(stat -c "%u" .gitignore) jenkins'"}, | ||
{Type: dmp.DiffEqual, Text: ";"}, | ||
}, DiffLineAdd)) | ||
|
||
assertEqual(t, "<span class=\"x\"> <h<span class=\"added-code\">4 class=</span><span class=\"added-code\">"release-list-title df ac"</span>></span>", diffToHTML("", []dmp.Diff{ | ||
assertEqual(t, "<span class=\"x\"> <h<span class=\"added-code\">4 class="release-list-title df ac"</span>></span>", diffToHTML("", []dmp.Diff{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This removes an unnecessary close and reopen of a addded-code span
func TestDiffToHTML_14231(t *testing.T) { | ||
setting.Cfg = ini.Empty() | ||
diffRecord := diffMatchPatch.DiffMain(highlight.Code("main.v", " run()\n"), highlight.Code("main.v", " run(db)\n"), true) | ||
diffRecord = diffMatchPatch.DiffCleanupEfficiency(diffRecord) | ||
|
||
expected := ` <span class="n">run</span><span class="added-code"><span class="o">(</span><span class="n">db</span></span><span class="o">)</span>` | ||
output := diffToHTML("main.v", diffRecord, DiffLineAdd) | ||
|
||
assertEqual(t, expected, output) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we add a test case specific for the issue that this PR fixes.
Codecov Report
@@ Coverage Diff @@
## master #14678 +/- ##
==========================================
+ Coverage 42.21% 42.29% +0.07%
==========================================
Files 767 767
Lines 81624 81739 +115
==========================================
+ Hits 34458 34569 +111
- Misses 41531 41534 +3
- Partials 5635 5636 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless we want to rewrite whole diff code ... this is the way to go 🚀
🚀 |
Backport go-gitea#14678 Gitea runs diff on highlighted code fragment for each line in order to provide code highlight diffs. Unfortunately this diff algorithm is not aware that span tags and entities are atomic and cannot be split. The current fixup code makes some attempt to fix these broken tags however, it cannot handle situations where a tag is split over multiple blocks. This PR provides a more algorithmic fixup mechanism whereby spans and entities are completely coalesced into their respective blocks. This may result in a incompletely reduced diff but - it will definitely prevent the broken entities and spans that are currently possible. As a result of this fixup several inconsistencies were discovered in our testcases and these were also fixed. Fix go-gitea#14231 Signed-off-by: Andrew Thornton <[email protected]>
Backport #14678 Gitea runs diff on highlighted code fragment for each line in order to provide code highlight diffs. Unfortunately this diff algorithm is not aware that span tags and entities are atomic and cannot be split. The current fixup code makes some attempt to fix these broken tags however, it cannot handle situations where a tag is split over multiple blocks. This PR provides a more algorithmic fixup mechanism whereby spans and entities are completely coalesced into their respective blocks. This may result in a incompletely reduced diff but - it will definitely prevent the broken entities and spans that are currently possible. As a result of this fixup several inconsistencies were discovered in our testcases and these were also fixed. Fix #14231 Signed-off-by: Andrew Thornton <[email protected]> Co-authored-by: 6543 <[email protected]>
Gitea runs diff on highlighted code fragment for each line in order to provide
code highlight diffs. Unfortunately this diff algorithm is not aware that span tags
and entities are atomic and cannot be split.
The current fixup code makes some attempt to fix these broken tags however, it cannot
handle situations where a tag is split over multiple blocks.
This PR provides a more algorithmic fixup mechanism whereby spans and entities are
completely coalesced into their respective blocks.
This may result in a incompletely reduced diff but - it will definitely prevent the
broken entities and spans that are currently possible.
As a result of this fixup several inconsistencies were discovered in our testcases
and these were also fixed.
Fix #14231
Signed-off-by: Andrew Thornton [email protected]