Skip to content

Commit

Permalink
syntax: fix handling of (?flags) in parser
Browse files Browse the repository at this point in the history
This commit fixes a bug with the handling of `(?flags)` sub-expressions
in the parser. Previously, the parser read `(?flags)`, added it to the
current concatenation, and then treat that as a valid sub-expression for
repetition operators, as in `(?i)*`. This in turn caused the translator
to panic on a failed assumption: that witnessing a repetition operator
necessarily implies a preceding sub-expression. But `(?i)` has no
explicit represents in the HIR, so there is no sub-expression.

There are two legitimate ways to fix this:

1. Ban such constructions in the parser.
2. Remove the assumption in the translator, and/or always translate a
   `(?i)` into an empty sub-expression, which should generally be a
   no-op.

This commit chooses (1) because it is more conservative. That is, it
turns a panic into an error, which gives us flexibility in the future to
choose (2) if necessary.

Fixes #465
  • Loading branch information
BurntSushi committed Apr 28, 2018
1 parent d0ab70f commit 17764ff
Show file tree
Hide file tree
Showing 3 changed files with 19 additions and 4 deletions.
6 changes: 3 additions & 3 deletions regex-syntax/src/ast/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -156,9 +156,9 @@ pub enum ErrorKind {
/// An opening `{` was found with no corresponding closing `}`.
RepetitionCountUnclosed,
/// A repetition operator was applied to a missing sub-expression. This
/// occurs, for example, in the regex consisting of just a `*`. It is,
/// however, possible to create a repetition operating on an empty
/// sub-expression. For example, `()*` is still considered valid.
/// occurs, for example, in the regex consisting of just a `*` or even
/// `(?i)*`. It is, however, possible to create a repetition operating on
/// an empty sub-expression. For example, `()*` is still considered valid.
RepetitionMissing,
/// When octal support is disabled, this error is produced when an octal
/// escape is used. The octal escape is assumed to be an invocation of
Expand Down
13 changes: 13 additions & 0 deletions regex-syntax/src/ast/parse.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1051,6 +1051,13 @@ impl<'s, P: Borrow<Parser>> ParserI<'s, P> {
ast::ErrorKind::RepetitionMissing,
)),
};
match ast {
Ast::Empty(_) | Ast::Flags(_) => return Err(self.error(
self.span(),
ast::ErrorKind::RepetitionMissing,
)),
_ => {}
}
let mut greedy = true;
if self.bump() && self.char() == '?' {
greedy = false;
Expand Down Expand Up @@ -2942,6 +2949,12 @@ bar
span: span(0..0),
kind: ast::ErrorKind::RepetitionMissing,
});
assert_eq!(
parser(r"(?i)*").parse().unwrap_err(),
TestError {
span: span(4..4),
kind: ast::ErrorKind::RepetitionMissing,
});
assert_eq!(
parser(r"(*)").parse().unwrap_err(),
TestError {
Expand Down
4 changes: 3 additions & 1 deletion regex-syntax/src/hir/translate.rs
Original file line number Diff line number Diff line change
Expand Up @@ -349,7 +349,9 @@ impl<'t, 'p> Visitor for TranslatorI<'t, 'p> {
Ast::Concat(_) => {
let mut exprs = vec![];
while let Some(HirFrame::Expr(expr)) = self.pop() {
exprs.push(expr);
if !expr.kind().is_empty() {
exprs.push(expr);
}
}
exprs.reverse();
self.push(HirFrame::Expr(Hir::concat(exprs)));
Expand Down

0 comments on commit 17764ff

Please sign in to comment.