How to parse until a range of tags #1712
Comments
Hello @frenetisch-applaudierend. |
Hi @coalooball, thanks for the link, I haven't seen that one before! However, I don't think it applies to my use case, since the mentioned parsers only allow a predicate on single characters. I would need a predicate on different parsers (i.e. |
Hello again!

    use nom::{
        branch::alt,
        bytes::complete::{tag, take_till, take_while1},
        character::{is_alphanumeric, is_space},
        sequence::{delimited, terminated},
        IResult,
    };

    // True for the inner delimiter bytes '*' and '#'.
    fn is_delimiter(s: u8) -> bool {
        s == b'*' || s == b'#'
    }

    // Parses one embedded sequence such as `<#...#>` or `(*...*)` and
    // returns its body without the delimiters.
    fn embedded_sequence(s: &[u8]) -> IResult<&[u8], &[u8]> {
        delimited(
            alt((tag(b"<"), tag(b"("))),
            delimited(
                alt((tag(b"#"), tag(b"*"))),
                take_till(is_delimiter),
                alt((tag(b"#"), tag(b"*"))),
            ),
            alt((tag(b">"), tag(b")"))),
        )(s)
    }

    // Parses leading text (alphanumerics and spaces only) that must be
    // followed by an embedded sequence; returns the text and discards the
    // embedded sequence.
    fn parse(s: &[u8]) -> IResult<&[u8], &[u8]> {
        terminated(
            take_while1(|x| is_alphanumeric(x) || is_space(x)),
            embedded_sequence,
        )(s)
    }

    fn main() {}

    #[test]
    fn test_embedded_sequence() {
        assert_eq!(
            embedded_sequence(b"<#embedded sequence 1#>111").unwrap(),
            (b"111".as_ref(), b"embedded sequence 1".as_ref())
        );
        assert_eq!(
            parse(b"Test <#embedded sequence 1#> and (*embedded sequence 2*)").unwrap(),
            (b" and (*embedded sequence 2*)".as_ref(), b"Test ".as_ref())
        )
    }
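The text part above is matched with a character-class predicate, so it stops at any punctuation. As a rough illustration of the alternative, here is a minimal sketch in plain Rust (no nom) that splits the input at the first position where any of several opening tags begins. The helper name `take_until_any` and the tag list are hypothetical, not part of nom.

```rust
/// Splits `input` at the first position where any of `tags` begins.
/// Returns `(text, rest)`; both slices borrow from `input`.
/// If no tag occurs, all of `input` is returned as text and `rest` is empty.
/// (`take_until_any` is a hypothetical helper, not a nom function.)
fn take_until_any<'a>(input: &'a [u8], tags: &[&[u8]]) -> (&'a [u8], &'a [u8]) {
    let split = (0..input.len())
        // Check every offset for a tag starting there.
        .find(|&i| tags.iter().any(|tag| input[i..].starts_with(tag)))
        // No tag found: everything is text.
        .unwrap_or(input.len());
    input.split_at(split)
}
```

Called with the tags `<#` and `(*`, this returns the text before the first embedded sequence and leaves the remainder starting at the opening tag, which could then be handed to a parser like `embedded_sequence`. The scan is quadratic in the worst case, which is probably fine for short inputs.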
I would like to parse arbitrary text with embedded sequences which are delimited by different tags into their parts. E.g.
should be parsed to
Text("Test ")
Embedded1("embedded sequence 1")
Embedded2("embedded sequence 2")
Ideally, all strings in the tokens should be borrowed from the input string.
The embedded sequences are straightforward, but I fail to specify the parser for the Text tokens. Is it possible to take_until any one of a range of tags is encountered?
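To make the desired behaviour concrete, here is a minimal sketch in plain Rust (no nom) of a tokenizer that emits borrowed `Text`, `Embedded1` and `Embedded2` tokens. The delimiter pairs `<#`/`#>` and `(*`/`*)` are assumed from the test input used in the comments, and the names `DELIMS` and `tokenize` are hypothetical. It is only one possible approach, not something nom provides.

```rust
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Text(&'a str),
    Embedded1(&'a str),
    Embedded2(&'a str),
}

// Assumed (open, close) pairs; index 0 maps to Embedded1, index 1 to Embedded2.
const DELIMS: [(&str, &str); 2] = [("<#", "#>"), ("(*", "*)")];

/// Splits `input` into tokens whose strings all borrow from `input`.
fn tokenize(input: &str) -> Result<Vec<Token<'_>>, String> {
    let mut tokens = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        // Find the earliest opening tag among all delimiter pairs.
        let next = DELIMS
            .iter()
            .enumerate()
            .filter_map(|(kind, (open, _))| rest.find(*open).map(|pos| (pos, kind)))
            .min();
        let Some((pos, kind)) = next else {
            // No opening tag left: the remainder is plain text.
            tokens.push(Token::Text(rest));
            break;
        };
        // Everything before the opening tag is text (empty text is skipped).
        if pos > 0 {
            tokens.push(Token::Text(&rest[..pos]));
        }
        let (open, close) = DELIMS[kind];
        let body = &rest[pos + open.len()..];
        let end = body
            .find(close)
            .ok_or_else(|| format!("unterminated {open} sequence"))?;
        tokens.push(if kind == 0 {
            Token::Embedded1(&body[..end])
        } else {
            Token::Embedded2(&body[..end])
        });
        rest = &body[end + close.len()..];
    }
    Ok(tokens)
}
```

Note that this sketch also emits a `Text` token for any text between two embedded sequences, and reports an error when a closing tag is missing. Since nom parsers are ordinary functions from input to `IResult`, the same "earliest opening tag" search could presumably be wrapped in a custom parser for the `Text` token.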