one_of, none_of, etc. only work with strings #1510
Comments
Please provide a minimal reproducer of your problem when reporting an issue. There's a reason this is a part of the issue template, which you seem to have ignored entirely. This works as expected for me on nom 7.1.0:
use nom::{character::complete::one_of, IResult};
fn abc(i: &[u8]) -> IResult<&[u8], char> {
    one_of(b"abc".as_slice())(i)
}
fn main() {
    assert_eq!(abc(b"axe"), Ok((b"xe".as_slice(), 'a')));
    assert_eq!(abc(b"baba"), Ok((b"aba".as_slice(), 'b')));
    assert!(abc(b"foo").is_err());
}
Sure buddy
use nom::{Err, IResult};
use nom::branch::alt;
use nom::bytes::complete::{is_not, tag, take_while_m_n};
use nom::character::complete::{alpha1, alphanumeric1, one_of};
use nom::character::is_digit;
use nom::combinator::{map, map_opt, opt, recognize, success, value, verify};
use nom::error::{ErrorKind, ParseError};
use nom::multi::{count, many0, many0_count};
use nom::sequence::{delimited, pair, preceded, terminated};
pub fn short_string_escape<'a, E: ParseError<&'a [u8]>>(input: &'a [u8]) -> IResult<&'a [u8], Option<u8>, E> {
    preceded(tag(b"\\"),
        alt((
            //map(short_string_ordinal, Some),
            value(Some(b'\x07'), tag(b"a")),
            // Lua's \b is backspace (0x08), not DEL (0x7F)
            value(Some(b'\x08'), tag(b"b")),
            value(Some(b'\x0C'), tag(b"f")),
            value(Some(b'\n'), tag(b"n")),
            value(Some(b'\r'), tag(b"r")),
            value(Some(b'\t'), tag(b"t")),
            value(Some(b'\x0b'), tag(b"v")),
            map(one_of(b"\\'\"\n".as_slice()), Some),
            success(None)
        ))
    )(input)
}
doesn't compile
use nom::{Err, IResult};
use nom::branch::alt;
use nom::bytes::complete::{is_not, tag, take_while_m_n};
use nom::character::complete::{alpha1, alphanumeric1};
use nom::character::is_digit;
use nom::combinator::{map, map_opt, opt, recognize, success, value, verify};
use nom::error::{ErrorKind, ParseError};
use nom::multi::{count, many0, many0_count};
use nom::sequence::{delimited, pair, preceded, terminated};
pub fn one_of_bytes<'a, E: ParseError<&'a [u8]>>(bytes: &'a [u8]) -> impl Fn(&'a [u8]) -> IResult<&'a [u8], u8, E> {
    move |input: &'a [u8]| {
        if let Some(byte) = input.first() {
            if bytes.contains(byte) {
                return Ok((&input[1..], *byte))
            }
        }
        Err(Err::Error(E::from_error_kind(input, ErrorKind::OneOf)))
    }
}
pub fn short_string_escape<'a, E: ParseError<&'a [u8]>>(input: &'a [u8]) -> IResult<&'a [u8], Option<u8>, E> {
    preceded(tag(b"\\"),
        alt((
            //map(short_string_ordinal, Some),
            value(Some(b'\x07'), tag(b"a")),
            // Lua's \b is backspace (0x08), not DEL (0x7F)
            value(Some(b'\x08'), tag(b"b")),
            value(Some(b'\x0C'), tag(b"f")),
            value(Some(b'\n'), tag(b"n")),
            value(Some(b'\r'), tag(b"r")),
            value(Some(b'\t'), tag(b"t")),
            value(Some(b'\x0b'), tag(b"v")),
            map(one_of_bytes(b"\\'\"\n"), Some),
            success(None)
        ))
    )(input)
}
compiles
I'm trying to parse Lua source code, which is not UTF-8 (strings are just bags of bytes). nom seems to be mostly compatible with &[u8], but I keep finding that random parts of nom fail to typecheck on it.
For example, one_of(b"\\'\"") compiles fine on its own, but you can't use the returned parser on a &[u8] because it expects the element type to be char. This is especially confusing if you're using it as part of an alt, which doesn't point to the one_of(...) as the source of the problem.
nom is quite an arcane library. I appreciate the clearly huge amount of work that has been put into the documentation and implementation, but along with what was noted in #1506 by @Firstyear, a lot of things (like recognize) are still hard to find, and it's not obvious that certain helpers, while they could be generalized, only work on strings, with no alternative for bytes. This is detrimental and makes it very challenging to implement the parser that I want.
Almost everything that has been implemented for strs and chars could be implemented for bytes instead, but there is no separation between the modules and therefore no effort to bring them up to parity.
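To make the requested byte-level parity concrete, here is a minimal, dependency-free sketch of what byte-returning one_of and none_of combinators could look like. It deliberately does not use nom: the names ByteResult, one_of_bytes, and none_of_bytes are hypothetical, and the plain Result type stands in for nom's IResult purely for illustration.

```rust
/// Hypothetical result type for this sketch (not a nom type):
/// on success, the remaining input and the matched byte;
/// on failure, the unconsumed input.
pub type ByteResult<'a> = Result<(&'a [u8], u8), &'a [u8]>;

/// Builds a parser that accepts the first input byte only if it is in `set`.
/// The matched value is returned as a `u8`, never converted to `char`.
pub fn one_of_bytes<'a>(set: &'a [u8]) -> impl Fn(&'a [u8]) -> ByteResult<'a> {
    move |input: &'a [u8]| match input.split_first() {
        Some((&byte, rest)) if set.contains(&byte) => Ok((rest, byte)),
        // Empty input or a byte outside the set: fail without consuming.
        _ => Err(input),
    }
}

/// Builds a parser that accepts the first input byte only if it is NOT in `set`.
pub fn none_of_bytes<'a>(set: &'a [u8]) -> impl Fn(&'a [u8]) -> ByteResult<'a> {
    move |input: &'a [u8]| match input.split_first() {
        Some((&byte, rest)) if !set.contains(&byte) => Ok((rest, byte)),
        // Empty input or a byte inside the set: fail without consuming.
        _ => Err(input),
    }
}
```

Because the output type is u8, a parser built this way can be wrapped in Some and combined with other branches that produce Option<u8>, and non-UTF-8 bytes such as 0xFF pass through unchanged.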