Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repl: Handle unicode escape sequence #5189

Closed
wants to merge 1 commit into from

Conversation

princejwesley
Copy link
Contributor

Fix for #3971

@princejwesley
Copy link
Contributor Author

/cc @bnoordhuis

@mscdex mscdex added the repl Issues and PRs related to the REPL subsystem. label Feb 11, 2016
@jasnell
Copy link
Member

jasnell commented Feb 11, 2016

CI: https://ci.nodejs.org/job/node-test-pull-request/1636/
(the CI run won't be publicly visible for the time being, one of the collaborators will have to post the results after it runs)

@jasnell
Copy link
Member

jasnell commented Feb 11, 2016

LGTM if CI is green and @bnoordhuis is happy.

@princejwesley
Copy link
Contributor Author

@jasnell CI turned green ?

@silverwind
Copy link
Contributor

@princejwesley it's green on all machines, the overall run is in 'cancelled' status, but I guess that's something about the CI, not this PR.

@princejwesley
Copy link
Contributor Author

@silverwind thanks 😃
/cc @bnoordhuis

@jasnell
Copy link
Member

jasnell commented Feb 19, 2016

@bnoordhuis ...ping

@bnoordhuis
Copy link
Member

Sorry for the delay. Is there a reason you're using regular expressions? They're not super-readable, an literal.indexOf('\\') followed by a simple pattern scanner is arguably a little easier to follow.

@princejwesley
Copy link
Contributor Author

@bnoordhuis if we use pattern scanner instead of regular expression, we have to create two scanner pattern functions and have to maintain the lastIndex(for multiple match) state like RegExp object.

If it improves readability, I'll build indexOf based RegExp like simple scanner object and update the PR.

@princejwesley
Copy link
Contributor Author

@bnoordhuis , Here is the comparison code with and without regular expression. I'll update PR if literal version looks good to you.

// ------------------------------------------------------------------------
// With regex
// ------------------------------------------------------------------------
const unicodePointLikePattern = /\\u{([^}]+)}/mg;
const unicodeLikePattern = /\\u(.{0,4})/mg;
const hexLikePattern = /\\x(.{0,2})/mg;
// ------------------------------------------------------------------------

// ------------------------------------------------------------------------
// with simple scanner
// ------------------------------------------------------------------------
class LiteralPattern {

  constructor(prefix, suffixOrMaxLength) {
    this._prefix = prefix;
    this._prefixLength = prefix.length;
    if (typeof suffixOrMaxLength === 'string') {
      this._suffix = suffixOrMaxLength;
      this._suffixLength = suffixOrMaxLength.length;
    } else {
      this._maxLength = suffixOrMaxLength;
    }
    this.lastIndex = 0;
  }

  exec(input) {
    input = input.substring(this.lastIndex);
    let idx = input.indexOf(this._prefix);
    if (idx !== -1) {
      let start = this._prefixLength + idx;
      if (this._suffix) {
        input = input.substring(start);
        let end = input.indexOf(this._suffix);
        if (end !== -1) {
          this.lastIndex += start + end + this._suffixLength;
          return [null, input.substring(0, end)];
        }
      } else {
        this.lastIndex += start + this._maxLength;
        return [null, input.substring(this._prefixLength + idx, this._maxLength)];
      }
    }
    this.lastIndex = 0;
    return null;
  }
}

const unicodePointLikePattern = new LiteralPattern('\\u{', '}');
const unicodeLikePattern = new LiteralPattern('\\u', 4);
const hexLikePattern = new LiteralPattern('\\x', 2);
// ------------------------------------------------------------------------

@princejwesley
Copy link
Contributor Author

@bnoordhuis shall I update PR with pattern scanner?

@princejwesley
Copy link
Contributor Author

@bnoordhuis Updated PR with simple pattern scanner.

@jasnell
Copy link
Member

jasnell commented Mar 2, 2016

@bnoordhuis ... ping

@princejwesley
Copy link
Contributor Author

@bnoordhuis Fixed merge conflicts!

@princejwesley
Copy link
Contributor Author

Ping @bnoordhuis

@princejwesley
Copy link
Contributor Author

Working fine in v6.3.0

~ 🙈  node
> '\u{110000}'
SyntaxError: Unexpected token ILLEGAL
    at Object.exports.createScript (vm.js:47:10)
    at REPLServer.defaultEval (repl.js:255:25)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.<anonymous> (repl.js:489:10)
    at emitOne (events.js:101:20)
    at REPLServer.emit (events.js:188:7)
    at REPLServer.Interface._onLine (readline.js:232:10)
    at REPLServer.Interface._line (readline.js:574:8)
    at REPLServer.Interface._ttyWrite (readline.js:851:14)
> process.version
'v6.3.0'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
repl Issues and PRs related to the REPL subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants