-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial cleanup and refactoring of the parser #286
Conversation
f5b759c
to
aa8fbb0
Compare
cfef1b1
to
2dd3255
Compare
The `last` and `tail` functions where each two functions combined into one, differentiated by a `~rev` flag. This imposes unnecessary overhead when trying to read the code (IMO). This change replaces the flag-based usage with two declaritively named functions. We also rename the `ws` function to the more accurate `trim_ws`, since it trims white space, and make the flag a boolean, rather than an optional unit.
Afaik, `drop` is the usual name for this function. Also adds validation logic to ensure the slice cannot exceed the bounds of the underlying string.
Helps reduce cognitive overload in trying to graple with `parser.ml` and gives a more accurate name to the module.
The aim is to enable the reader to see what we are testing for at a glance, to reduce the cognitive load of reading each line.
Replace logic with drop_while and separate into three distinct functions.
- Use uncons to simplify logic - Separate helper function def from case analysis - Simplify failure logic
2dd3255
to
75805d8
Compare
It's a long weekend in Canada, so I'll keep doing the cleanup here thru tomorrow, then I'll offer up whatever I have done by EOD for reviw :) |
67204db
to
03cb3fb
Compare
Allows us to use more recent stdlib methods while maintaining compat with older versions of the compiler.
db0e092
to
28b38e5
Compare
These will enable cleaner logic in the parsing
Using a fold we can further simplify the logic, and ditch the explicit loop.
That's what I'll be getting done this weekend! Merging will be blocked until I sort out thierry-martinez/stdcompat#26, but this should be ready for review (mainly just to keep it from getting too long). I'll resume this superficial pass of parsing later in the week, or on the weekend at the latest. I think I'm forming some ideas for more substantial refactoring, and I'll open issues to plan those out as they become more formed. |
src/block.ml
Outdated
@@ -1,5 +1,5 @@ | |||
open Ast | |||
module Sub = Parser.Sub | |||
module Sub = StrSlice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure what I should do about this naming. Is Sub
a good name for the module? I find it cryptic and inaccurate, under the view that it's not really about substrings, per se, but about slices over a base. But maybe that's more of an implementation detail? Really, it's more like an alternative representation of strings based on the view of a slice...
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find StrSlice
a better name than Parser.Sub
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I think I do too. I dropped the Sub
alias.
Interesting! It looks like some of our custom |
src/parser.ml
Outdated
@@ -139,24 +100,23 @@ end = struct | |||
|
|||
type 'a t = state -> 'a | |||
|
|||
let ensure_chars_remain st = if st.pos >= String.length st.str then raise Fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, but ensure_chars_remain
to me sounds like it prevents chars from being removed whereas it is more like "ensure there are still chars". Can't think of an actual good name. Admittedly not great suggestions:
ensure_remaining_chars
ensure_not_at_end
ensure_end_not_reached
fail_if_no_more_chars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see the potential for confusion. Since it's only being used in 2 places, maybe best just to remove this? It also lets us make avoid conditional statements and use conditional expression. WDYT? 6611e25
src/parser.ml
Outdated
| Lsetext_heading of int * int | ||
(** the level of the heading and how long the underline marker is *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider inline records here if they don't make later code more verbose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
One of our compat functions is necessary for backwards compatible functionality.
Another option could be to use the |
That's perfect. Thanks for the fix, @tatchi! |
d7dd919
to
b3cb00c
Compare
@@ -24,7 +24,8 @@ Additionally, OMD implements a few Github markdown features, an | |||
extension mechanism, and some other features. Note that the opam | |||
package installs both the OMD library and the command line tool `omd`.") | |||
(tags (org:ocamllabs org:mirage)) | |||
(depends (ocaml (>= 4.05)) | |||
(depends (ocaml (>= 4.08)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I'm suggesting raising the minimum ocaml version here. I think this library is high-level enough, and has few enough dependencies that we don't need to weigh ourselves down trying to maintain backwards compatibility with versions that much more central libraries have already lost support for.
Thanks to @tatchi's suggestion and my suggestion to increase our minimum OCaml version to 4.08 (for the time being at least) I think this should be passing all CI now and ready for another review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactor 👏 Thanks for doing that!
b3cb00c
to
612f35a
Compare
Context
To help clear the way for #223 we've agreed that some substantial refactoring of the parer passes should be undertaken to make it less unwieldy to extend and reason about (following the principle of "first make the change easy, then make the change").
This work is largely just preparatory, and the result of "active reading", which is adding comments, cleanup, and simplification as I study the existing parser. I'll resume this work later in the week.
Reviewing
There was some churn in the changes, but review still may benefit from skimming each commit, before a final review, to see the evolution of the changes.
I'd particularly like the review to ensure that the changes I'm proposing actually make the code more readable and easier to reason about. I don't want to be introducing changes for their own sake, so if you think any of my changes end up just as complicated or hard to reason about as what they are replacing (including the naming suggestions), please raise alarm!
Thanks in advance for the review!