Allow chaining parsers similar to p + q + r but without getting nested tuple result #25

Open
JoshMcguigan opened this issue Dec 21, 2018 · 5 comments

Comments

@JoshMcguigan
Contributor

Following up on the discussion in #24, I created one possible implementation which allows chaining parsers without getting a nested tuple result. You can see the result of this at the link below. Note that in this example, I could have removed the call to map entirely, but I wanted to leave it to demonstrate that the (hours, minutes, seconds) tuple is no longer nested.

master...JoshMcguigan:experimental-combinator

Unfortunately, at the moment I'm not sure how this could be extended to tuples of any size without creating all4, all5, ..., allN for some reasonable value of N. Another downside of this approach is that when the user adds a parser to the chain, they'd have to switch which version of the all function they are using, from allN to allN+1.

The upside to this is the result is a (not nested) tuple of the results of each of the parsers, which means this could nicely replace the use of the +, -, and * combinators, allowing users to write these types of combinations in what I'd consider to be more idiomatic Rust.

Thanks again for your work on this crate, and feel free to let me know if this isn't something you are interested in.
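
The allN idea from this comment can be sketched without pom at all. The following is a minimal illustration under stated assumptions: `ToyParser`, `sym`, and `all3` are hypothetical stand-ins written for this example. They are not pom's types and not the code on the linked branch.

```rust
// Toy parser type for illustration only; this is NOT pom's `Parser`.
// A parser takes the input and a start position and returns the parsed
// value together with the position just past it, or `None` on failure.
type ToyParser<'a, O> = Box<dyn Fn(&'a [u8], usize) -> Option<(O, usize)> + 'a>;

// Hypothetical helper: matches exactly one expected byte.
fn sym<'a>(expected: u8) -> ToyParser<'a, u8> {
    Box::new(move |input: &'a [u8], pos: usize| match input.get(pos) {
        Some(&b) if b == expected => Some((b, pos + 1)),
        _ => None,
    })
}

// Hypothetical `all3`: runs three parsers in sequence and returns their
// results as one flat tuple `(A, B, C)` rather than `((A, B), C)`.
fn all3<'a, A: 'a, B: 'a, C: 'a>(
    p: ToyParser<'a, A>,
    q: ToyParser<'a, B>,
    r: ToyParser<'a, C>,
) -> ToyParser<'a, (A, B, C)> {
    Box::new(move |input: &'a [u8], start: usize| {
        // Each step starts where the previous one stopped.
        let (a, pos) = p(input, start)?;
        let (b, pos) = q(input, pos)?;
        let (c, pos) = r(input, pos)?;
        Some(((a, b, c), pos))
    })
}
```

The arity is fixed by the function signature, which is the drawback raised above: a fourth parser needs a separate `all4` with its own signature.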

@J-F-Liu
Owner

J-F-Liu commented Dec 21, 2018

I think it's OK to define and use allN in user's code, but not good to include in pom, for the sake of consistent operator style. I also modified duration example a bit.

@J-F-Liu
Owner

J-F-Liu commented Dec 21, 2018

    (
        two_digits(),
        char(':'),
        two_digits(),
        char(':'),
        two_digits(),
        time_zone(),
    )
        .map(|(hour, _, minute, _, second, time_zone)| {
            // It's ok to just unwrap since we only parsed digits
            Time {
                hour: hour,
                minute: minute,
                second: second,
                time_zone: time_zone,
            }
        })

This approach, by contrast, looks good.

@JoshMcguigan
Contributor Author

> I think it's OK to define and use allN in user's code, but not good to include in pom, for the sake of consistent operator style. I also modified duration example a bit.

From my perspective, the reason I like using pom over the alternatives is the simplicity. The operator style for the combinators (p + q rather than something like all2(p, q) or p.and(q)) is the primary reason I have to keep the pom documentation open while developing. I think it would be more idiomatic, and friendlier to newcomers to pom, to use methods/functions rather than operator overloading.

That said, I do agree the code in your second post is nice. But that example is from combine, and it's not clear to me how an API like that could be developed within pom.
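
One general way to let a plain tuple of parsers act as a parser is to implement a parsing trait for tuples, one arity at a time, with a helper macro. The sketch below only illustrates that technique. The `Parse` trait, the `Byte` type, and `impl_parse_for_tuple!` are names invented for this example, and nothing here is taken from pom or combine.

```rust
// Toy parsing trait for illustration; not pom's or combine's real API.
trait Parse {
    type Output;
    // Returns the parsed value and the position just past it, or `None`.
    fn parse_at(&self, input: &[u8], pos: usize) -> Option<(Self::Output, usize)>;
}

// Hypothetical leaf parser: matches exactly one expected byte.
struct Byte(u8);

impl Parse for Byte {
    type Output = u8;
    fn parse_at(&self, input: &[u8], pos: usize) -> Option<(u8, usize)> {
        match input.get(pos) {
            Some(&b) if b == self.0 => Some((b, pos + 1)),
            _ => None,
        }
    }
}

// Implements `Parse` for a tuple of parsers. The output is a flat tuple
// of the element outputs, produced by running the elements in order.
macro_rules! impl_parse_for_tuple {
    ($($name:ident),+) => {
        impl<$($name: Parse),+> Parse for ($($name,)+) {
            type Output = ($($name::Output,)+);
            #[allow(non_snake_case)]
            fn parse_at(&self, input: &[u8], pos: usize) -> Option<(Self::Output, usize)> {
                // Bind each element parser to a variable named after its type parameter.
                let ($($name,)+) = self;
                // Run each parser in turn, shadowing the parser with its result
                // and `pos` with the position it stopped at.
                $( let ($name, pos) = $name.parse_at(input, pos)?; )+
                Some((($($name,)+), pos))
            }
        }
    };
}

// One invocation per supported tuple size.
impl_parse_for_tuple!(A, B);
impl_parse_for_tuple!(A, B, C);
impl_parse_for_tuple!(A, B, C, D);
```

With these impls in scope, `(Byte(b'a'), Byte(b'b'), Byte(b'c')).parse_at(b"abc", 0)` yields a flat three-element tuple. The supported arities are still finite, but the user never has to name them.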

@J-F-Liu
Owner

J-F-Liu commented Dec 21, 2018

Yes, it would be fine to rename all3 to terms, and all4 to terms4. Or a macro terms! to handle any number of terms.
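
A rough idea of what a variadic terms! macro could look like, written against toy closure parsers rather than pom's `Parser`. Both `byte` and this `terms!` are hypothetical and exist only for this sketch; a real version for pom would have to build on pom's own parser type.

```rust
// Toy parser for illustration only (not pom's `Parser`): a closure from
// (input, position) to an optional (value, next position).
fn byte(expected: u8) -> impl Fn(&[u8], usize) -> Option<(u8, usize)> {
    move |input: &[u8], pos: usize| match input.get(pos) {
        Some(&b) if b == expected => Some((b, pos + 1)),
        _ => None,
    }
}

// Hypothetical `terms!`: sequences any number of parser expressions into a
// single closure whose result is a flat tuple of their values.
macro_rules! terms {
    ($($parser:expr),+ $(,)?) => {
        move |input: &[u8], start: usize| {
            let mut pos = start;
            // Tuple operands are evaluated left to right, so each block
            // sees the position left behind by the previous one.
            let values = ($({
                let (value, next) = ($parser)(input, pos)?;
                pos = next;
                value
            },)+);
            Some((values, pos))
        }
    };
}
```

For example, `terms!(byte(b'a'), byte(b'b'), byte(b'c'))` builds a closure that returns a three-element tuple, and adding a fourth argument gives a four-element tuple without switching to a differently named function. One simplification to be aware of: each parser expression is re-evaluated every time the resulting closure is called.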

@glasspangolin

Hi guys, not sure whether this is helpful but I solved this problem in my code using a new 'vector' combinator and an enum. I copied and adapted your code for the 'list' combinator to use an ordered Vec of parsers.

I have this in parsers.rs:

pub fn vector<'a, I, O>(
	parsers: Vec<Parser<'a, I, O>>
) -> Parser<'a, I, Vec<O>>
	where
		O: 'a
{
	Parser::new(move |input: &'a [I], start: usize| {
		let mut items = Vec::with_capacity(parsers.len());
		let mut pos = start;
		// Run the parsers in order; each one starts where the previous one stopped.
		for parser in &parsers {
			// Stop at the first failure and propagate that parser's error.
			let (item, next_pos) = (parser.method)(input, pos)?;
			items.push(item);
			pos = next_pos;
		}
		Ok((items, pos))
	})
}

Then a minimal working example:

use pom::parser::*;

#[derive(Copy, Clone, PartialEq, Debug)]
pub struct Object {
    a : Field,
    b : Field,
    c : Field
}

#[derive(Copy, Clone, PartialEq, Debug)]
enum Field {
    A,
    B,
    C,
    NONE
}

fn take_space<'a>() -> Parser<'a, u8, u8> {
    one_of(b" \t")
}

fn get_a<'a>() -> Parser<'a, u8, Field> {
    // sym(b'a') only succeeds on b'a', so the else branch is purely defensive.
    sym(b'a').map(|maybe_a| {
        if maybe_a == b'a' {
            Field::A
        } else {
            Field::NONE
        }
    })
}

fn get_b<'a>() -> Parser<'a, u8, Field> {
    sym(b'b').map(|maybe_b| {
        if maybe_b == b'b' {
            Field::B
        } else {
            Field::NONE
        }
    })
}

fn get_c<'a>() -> Parser<'a, u8, Field> {
    sym(b'c').map(|maybe_c| {
        if maybe_c == b'c' {
            Field::C
        } else {
            Field::NONE
        }
    })
}

pub fn parse_line<'a>() -> Parser<'a, u8, Object> {
    vector(vec![
        call(get_a) - call(take_space).repeat(0..),
        call(get_b) - call(take_space).repeat(0..),
        call(get_c) - call(take_space).repeat(0..),
    ])
    .map(|v_vector| Object {
        a: v_vector[0],
        b: v_vector[1],
        c: v_vector[2],
    })
}

#[cfg(test)]
mod tests {
    use super::*;


    #[test]
    fn it_works() {
        assert_eq!(parse_line().parse(b"a b c").expect("couldn't parse."), Object {
            a:Field::A,
            b:Field::B,
            c:Field::C
        });
    }
}

You can see the limitation is that all the members of the vector have to return the same type, but I think it's quite neat when you combine the vector combinator with an enum :)
