-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update spec to 0.30 #266
Update spec to 0.30 #266
Conversation
- Remove duplicated logic, the parser combinator version now uses the stateful version under the hood. This also fixes the problem with invalid characters still being added to the buffer. - Update length limits of escapes to match the spec.
Hi! Thanks for this work :) It will take me some time to carve out time for a review, but I have it on my queue! Hopefully over the weekend. |
I'm afraid I've caused a merge conflict as I'm cleaning up the backlog of PRs. I think I should be able to resolve this without causing any extra work on your side, @SquidDev. |
Resolved merge conflict due to formatting rules update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
tests/extract_tests.ml
Outdated
@@ -9,7 +9,7 @@ let protect ~finally f = | |||
r | |||
|
|||
let disabled = | |||
[ 175; 184; 185; 410; 411; 414; 415; 416; 428; 468; 469; 516; 536 ] | |||
[ 028; 171; 206; 215; 216; 410; 411; 414; 415; 416; 428; 468; 469; 519; 539 ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment in passing: Sad to see this list getting longer, tho #269 should knock id down some. And hopefully we can reduce the others in followups. Tho I think we should keep in mind whether the difficulty of squashing these bugs indicates a need to do a deeper refactor of the parser.
@@ -270,6 +270,16 @@ of representing the structural distinctions we need to make, and the | |||
choice of HTML for the tests makes it possible to run the tests against | |||
an implementation without writing an abstract syntax tree renderer. | |||
|
|||
Note that not every feature of the HTML samples is mandated by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment in passing: It would be good to split the spec exceptions into two different categories: the ones that are differences not mandated by the spec, and those which are genuinely out of line with the spec. FYI @sonologico
| Some _ | ||
| None -> | ||
Buffer.add_string buf (range st p (pos st - p)) | ||
junk st; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a wonderful cleanup! 😍
@@ -810,7 +810,7 @@ let known_tags = | |||
; "ul" | |||
] | |||
|
|||
let special_tags = [ "script"; "pre"; "style" ] | |||
let special_tags = [ "pre"; "script"; "style"; "textarea" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate the alphabetizing.
Sorry for the slow review cycle, but thank you very much for bringing this up to date, and for the cleanup on the parser :) |
CHANGES: - Expose the HTML escape function `htmlentities` (ocaml-community/omd#295 @cuihtlauac) - Support generation of identifiers in headers (ocaml-community/omd#294, @tatchi) - Support GitHub-Flavoured Markdown tables (ocaml-community/omd#292, @bobatkey) - Update parser to support CommonMark Spec 0.30 (ocaml-community/omd#266, @SquidDev) - Preserve the order of input files in the HTML output to stdout (ocaml-community/omd#258, @patricoferris) - Fix all deviations from CommonMark Spec 0.30 (ocaml-community/omd#284, ocaml-community/omd#283, ocaml-community/omd#278, ocaml-community/omd#277, ocaml-community/omd#269, @tatchi)
This also refactors how entities are parsed, meaning there's only one function to do it (rather than two). I'm not 100% with the changes I've made to the parser combinator and
Sub
modules - happy to discuss those.