jsoniq parser #3

Open
mosheduminer opened this issue Aug 3, 2020 · 11 comments

Comments

@mosheduminer

Hi, your package is really cool!
I'm trying to implement the JSONiq core grammar. This grammar uses the EBNF except symbol, to exclude otherwise valid matches (the specification itself uses it once, but it references the XML specification in places where it uses the except operator). However, it appears that this package does not support excluding otherwise valid matches.
Would you be willing to add support for this? Or accept a PR (I don't know if I would manage it though 🙂, I'm new to Julia as well as parsing)?

@gkappler
Owner

gkappler commented Aug 4, 2020

Glad you enjoy the package!
It will be great to provide further parsers like JSONiq. You might be able to exclude otherwise valid matches with Never(), NegativeLookbehind() or NegativeLookahead()?
A more general approach might support the EBNF grammar on the page, similarly to the regex parser used in CombinedParsers.Regexp. What do you think?
A PR will be perfect. I will have a look and check if/when I can fit adding this into my schedule. :-)
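
For illustration, the EBNF except operator `A - B` can often be emulated by a negative lookahead for `B` placed in front of `A`. The following is only a minimal sketch using Julia's built-in (PCRE) regular expressions, not the CombinedParsers API. It takes the XML rule `PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))` as the example; the ASCII-only character classes standing in for `Name` and the names `PI_TARGET` and `is_pi_target` are assumptions made for brevity.

```julia
# Simplified stand-in for the XML `Name` production (ASCII only; an assumption for brevity).
# The negative lookahead `(?![Xx][Mm][Ll]$)` rejects exactly the strings that the
# `- (('X' | 'x') ('M' | 'm') ('L' | 'l'))` part of the rule excludes.
const PI_TARGET = r"^(?![Xx][Mm][Ll]$)[A-Za-z_][A-Za-z0-9_.-]*$"

# Hypothetical helper: true if `s` is a valid (simplified) PITarget.
is_pi_target(s::AbstractString) = occursin(PI_TARGET, s)

is_pi_target("xml-stylesheet")  # true: starts with "xml" but is not exactly "xml"
is_pi_target("XML")             # false: excluded by the lookahead
```

The same shape should carry over to a combinator library: a negative lookahead for the excluded pattern, followed by the parser for the allowed one.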

@mosheduminer
Author

NegativeLookbehind() looks like it will do the trick!
As for supporting EBNF grammar - sounds great, but I'm afraid I'll be in over my head 😄, I'd have to get familiar with how the library works, etc. But it is a very enticing idea, I'll see if I can figure it out.
On another note, I just managed to cause several LLVM errors while testing out NegativeLookbehind(). The issue I filed is here. Perhaps you'd be able to add some context for the bug?

@gkappler
Owner

gkappler commented Aug 4, 2020

I checked your example, and the issue seems Julia 1.5.0 related. Your example works in Julia 1.4.1 on my side.

@mosheduminer
Author

mosheduminer commented Aug 4, 2020

In that case, I guess I'll have to downgrade Julia.

Regarding EBNF, defining the grammar in Julia might be problematic due to recursive (and interdependent) definitions. Currently, I would solve those with an Either. The other solution would be to write the EBNF grammar as a long string and to parse that string using EBNF rules, an option I think might be more complex. What are your thoughts?

EDIT: actually, Either wouldn't seem to handle recursive definitions... I was completely confused 😄.

@gkappler gkappler changed the title Exception Operator Julia 1.5 support Aug 4, 2020
@gkappler
Owner

gkappler commented Aug 4, 2020

I worked around the compiler issue during sequence construction in order to support Julia 1.5 and be independent of the upstream fix - the performance dent will be negligible. Version 0.1.1 is pushed and registered :-).

Recursive parsing is supported in CombinedParsers, and interdependent definitions can be added too with "push to Either": r=Either{R}(Any[]) for a parser that you can push!(r,o) to, if result_type(o)<:R.
I will improve documentation on this, I hope for now the arithmetics example helps (@syntax for o in r is calling push! for you):

using CombinedParsers
using TextParse

# NOTE: `evaluate` is not defined in this snippet. It is assumed to be a function that folds the
# `(first, [(operator, value), ...])` result of `join(...; infix=:prefix)` into one `Rational{Int}`.
# Term expressions are sequences of subterms interleaved with operators.
# Subterms are [`Either`](@ref) fast `TextParse.Numeric(Int)` integer numbers, converted to `Rational{Int}`,
@syntax subterm = Either{Rational{Int}}(Any[TextParse.Numeric(Int)]);
# or a nested term in parentheses, which the `@syntax for` block pushes to `subterm`:
@syntax for parenthesis in subterm
    mult = evaluate |> join(subterm, CharIn("*/"), infix=:prefix )
    adds = evaluate |> join(mult,    CharIn("+-"), infix=:prefix )
    Sequence(2,'(',adds,')')
end;
# This `CombinedParser` definition in 5.5 lines is sufficient for doing arithmetics:
# [`Base.join`](@ref)`(x,infix; infix=:prefix)` is shorthand for `x `[`*`](@ref)` `[`Repeat`](@ref)`( infix * x  )`,
# and `f |> parser` is shorthand for [`map`](@ref)`(f,parser)`.
@syntax term = adds;
# registers a `@term_string` macro for parsing and transforming.
term"(1+2)/5"

I think it would be nice to have a string macro ebnf"..."....
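
To make the "push to Either" idea concrete without relying on the library, here is a toy sketch in plain Julia. It is not how CombinedParsers is implemented; `Alternatives`, `number`, `char`, `sum_expr` and `parenthesis` are hypothetical names, and the input is assumed to be ASCII. The point is only that a mutable list of alternatives can be created first, referenced by other rules, and extended afterwards, which closes the recursive loop.

```julia
# A toy parser is a callable `(input, position) -> (value, next_position)` or `nothing`.
struct Alternatives
    options::Vector{Any}   # alternatives can be pushed after construction
end
Alternatives() = Alternatives(Any[])
Base.push!(a::Alternatives, p) = (push!(a.options, p); a)

# Try each alternative in order and return the first success.
function (a::Alternatives)(s, i)
    for p in a.options
        r = p(s, i)
        r === nothing || return r
    end
    return nothing
end

# One or more ASCII digits, parsed into an Int.
function number(s, i)
    j = i
    while j <= lastindex(s) && isdigit(s[j])
        j += 1
    end
    return j == i ? nothing : (parse(Int, s[i:j-1]), j)
end

# A single literal character.
char(c) = (s, i) -> (i <= lastindex(s) && s[i] == c) ? (c, i + 1) : nothing

# Created before `parenthesis` exists, so `sum_expr` can already refer to it.
const subterm = Alternatives()
push!(subterm, number)

# sum ::= subterm ('+' subterm)*
function sum_expr(s, i)
    r = subterm(s, i)
    r === nothing && return nothing
    total, i = r
    while true
        plus = char('+')(s, i)
        plus === nothing && break
        r = subterm(s, plus[2])
        r === nothing && break
        total += r[1]
        i = r[2]
    end
    return (total, i)
end

# parenthesis ::= '(' sum ')'
function parenthesis(s, i)
    lp = char('(')(s, i)
    lp === nothing && return nothing
    r = sum_expr(s, lp[2])
    r === nothing && return nothing
    rp = char(')')(s, r[2])
    return rp === nothing ? nothing : (r[1], rp[2])
end

push!(subterm, parenthesis)   # closes the recursion: subterm -> parenthesis -> sum -> subterm

sum_expr("1+(2+3)", 1)   # (6, 8)
```

Interdependent rules work the same way: any rule may mention `subterm` before all of its alternatives have been pushed.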

@gkappler gkappler changed the title Julia 1.5 support Exception Operator. Julia 1.5 support Aug 4, 2020
@mosheduminer
Author

I've been poking around the library more, as well as learning more about julia macros. I hope that in a week maybe I'll have an EBNF parser and macro 🙂.

@gkappler
Owner

gkappler commented Aug 7, 2020

I spent some time with it today and have added a few syntactic changes, so the EBNF parser syntax can be adapted very straightforwardly. I need to test a bit more, and will push the changes later tonight.
This will include the bare jsoniq parser with default results.

I wonder: You probably do not just want to parse jsoniq but transform into a query object with some business logic, right?
You can then take the bare parser and add transformation functions.
🙂

@mosheduminer
Author

mosheduminer commented Aug 7, 2020

> I wonder: You probably do not just want to parse jsoniq but transform into a query object with some business logic, right?
> You can then take the bare parser and add transformation functions.

Yes, sure. I am planning to attempt writing a jsoniq engine, and of course parsing is the first step 🙂.

> This will include the bare jsoniq parser with default results.

You mean you are already integrating a jsoniq parser?

@gkappler
Owner

gkappler commented Aug 7, 2020

> > I wonder: You probably do not just want to parse jsoniq but transform into a query object with some business logic, right?

> Yes, sure. I am planning to attempt writing a jsoniq engine, and of course parsing is the first step 🙂.

cool!

> > This will include the bare jsoniq parser with default results.

> You mean you are already integrating a jsoniq parser?

Looking at the EBNF syntax, I found it convenient to define a whitespace-separated sequence as @seq "some" "number" re"[0-9]", and implemented that. It will make parser definitions briefer and be familiar to those who know EBNF. (For writing an EBNF parser I created #5.)

Then I thought it would be most effective to provide a syntax scaffold to work from. The scaffold is a parser that results in nested tuples of the parsed parts. All you need to do then is map those to your engine's constructors.
Yet the syntax needs rearranging and recursion is done differently.
I thought of creating a doc example demonstrating @seq with this scaffold.

@gkappler gkappler changed the title Exception Operator. Julia 1.5 support jsoniq parser Aug 7, 2020
@gkappler
Owner

Hey, I updated CombinedParsers and wrote a gist for you with the scaffold. Note that I did not include parsers for NCName, D, _Char, etc., so it is not yet tested.

The gist is using @seq and the recursion with "push! to either".

Hope this helps!

@mosheduminer
Author

Thank you! I will study that code, and likely use it for my upcoming project 🙂.
