Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bind expressions #6201

Merged
merged 28 commits into from
Dec 2, 2016
Merged

Conversation

dreamofabear
Copy link

Partial for #6199.

Implements Bind's JS-like expression grammar with Jison (also used by amp-access) and unit tests. Contains several security features, including:

  • Disallow access of prototype properties
  • Only allow invocation of whitelisted String and Array functions
  • Disallow access to globals, function declaration and non-whitelisted keywords

Other things to note:

  • Undefined vars, properties, array-index-out-of-bounds return null instead of throwing errors (similar to Angular)
  • bind-expr-impl.js is just a compilation of bind-expr-impl.jison and can be ignored

/to @dvoytenko @aghassemi
/cc @cramforce

@aghassemi
Copy link
Contributor

Not forgotten, will be 👀 at this today.

Copy link
Contributor

@jridgewell jridgewell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man, I hope bind expressions aren't used too much, this code's gonna be slow.

var prop = Object.prototype.toString.call($2);

if (obj === '[object Array]') {
if (prop === '[object Number]' && Number.isInteger($2)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Number.isInteger is not available everywhere. An imperfect but practical equivalent is $2 === ($2 | 0), or you can go for the full polyfill.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this code block per Ali's comment below.

}
}

if (prop === '[object String]') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Numeric props are valid for objects, too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@cramforce
Copy link
Member

Regarding @jridgewell's speed comment: How long does it take to execute 100 typical expressions on a Nexus 5X type device?


import {parser} from './bind-expr-impl';

export function evaluateBindExpr(expr, data) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of evaling each time, can we have a two phase approach that parses the expression and produces a function that evaluates the AST?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe jison has an option to build ast

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jison doesn't support it natively, but I could build a custom one in the grammar. Not sure how much of a speed-up we'd get but I'll try it out.

});

it('should NOT allow access to prototype properties', () => {
expect(evaluateBindExpr('{}.constructor')).to.be.null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also as bare expression: expect(evaluateBindExpr('constructor')).to.be.null;?

Copy link
Author

@dreamofabear dreamofabear Nov 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

}

if (prop === '[object String]') {
if (Object.prototype.hasOwnProperty.call($1, $2)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these three lines alone should be able handle all the cases, regardless of whether obj is an array or not. For array access, both arr[0] and arr['0'] are valid. Alsoarr.hasOwnProperty(0) and arr.hasOwnProperty('0') return true.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might, however, be much faster to keep the special cases since we are paying the price for the toString already. That way the VM can generated specialized code for each if.

expect(evaluateBindExpr('2 - 3.5')).to.equal(-1.5);
expect(evaluateBindExpr('3 * 4')).to.equal(12);
expect(evaluateBindExpr('4 / 5')).to.equal(0.8);
expect(evaluateBindExpr('5 % 4')).to.equal(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add test for divide by 0.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also 0/0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@dreamofabear
Copy link
Author

How long does it take to execute 100 typical expressions on a Nexus 5X type device?

~100ms to evaluate 100 complex-ish expressions (8 operations, 1 function call, 2 member access) on Nexus 5X. About half that on iPhone 6S.

@dreamofabear dreamofabear force-pushed the bind-expressions branch 2 times, most recently from 0af6628 to 4cd66a8 Compare November 29, 2016 17:26
@cramforce
Copy link
Member

cramforce commented Nov 29, 2016 via email

@jridgewell
Copy link
Contributor

How much of that 100ms is just the lexer? Ie, if we take away all the security code you wrote, how long does it take?

expr
{$$ = [$1];}
| array ',' expr
{$$ = $1; Array.prototype.push.call($1, $3);}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$1.push($3)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a safety measure to make sure we call the intended method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We create the array though? I don't think there's a way for them to set a property on it.


literal:
STRING
{$$ = yytext.substr(1, yyleng - 2);}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the substring?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignores the wrapping ' or ".


variable:
NAME
{$$ = Object.prototype.hasOwnProperty.call(yy, $1) ? yy[$1] : null;}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since yy can be null, we'll have to guard that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasOwnProperty appears to return null in that case, which is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should throw an error.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we want to return null instead of throwing undefined symbol error to be more forgiving (see PR summary).

function typeCheckArgs(args) {
for (var i = 0; i < args.length; i++) {
var arg = args[i];
if (Object.prototype.toString.call(arg) === '[object Object]') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typeof arg === "object" && !Array.isArray(arg);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't equivalent to the existing line. This will also catch null, Date and other objects.


var obj = Object.prototype.toString.call($1);

var whitelist = functionWhitelist[obj];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of toString, use an if:

var whitelist;
if (typeof obj === "string") {
} else if (Array.isArray(obj)) {
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll probably need to whitelist functions other than Array and String in the future though, so the flexibility is nice to keep. Also the performance difference seems negligible at N < 1000, so I'd rather do this fine-tuning later if necessary.

Fun fact: underscore.js also thoroughly debated the use of toString for type checking a few years ago.

var whitelist = functionWhitelist[obj];
if (whitelist) {
var fn = $1[$3];
if (whitelist.indexOf(fn) >= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since they're method on the prototype, names can't conflict. That makes it perfect for an prototype-less object map instead of an array.

whitelist[$3] === $1[$3];

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this will avoid scanning the whitelist array. Done.

if (typeCheckArgs($4)) {
$$ = fn.apply($1, $4);
} else {
throw new Error(`Unexpected argument type in {$3}()`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This throw should be in typeCheckArgs()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? I think this is more readable since control flow remains within the parser's control. Note there's another error that can be thrown a few lines below this.

}

var prop = Object.prototype.toString.call($2);
if (prop === '[object String]' || prop === '[object Number]') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These type checks are likely unnecessary. We definitely need to enforce that it is not a Symbol, but we can do that much quicker at a higher level (forbid the key in the yy context object).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, done.

Copy link
Author

@dreamofabear dreamofabear Nov 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had to revert this change since hasOwnProperty throws a "cannot convert to primitive" error in some cases. Changed to use typeof instead of toString though, which should be faster.


var prop = Object.prototype.toString.call($2);
if (prop === '[object String]' || prop === '[object Number]') {
if (Object.prototype.hasOwnProperty.call($1, $2)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hoist Object#hasOwnProperty into a variable to avoid repeated lookups.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -0,0 +1,288 @@
/** shared vars and functions */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we publish human grammar in a md file right away?

Also, this will need to be reviewed by security very soon.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I'll add the documentation in a follow-up PR and ask @bpaduch for a review.

@molnarg reviewed the design and we're planning to have a formal security review once the feature is ready.

expect(evaluateBindExpr('2 - 3.5')).to.equal(-1.5);
expect(evaluateBindExpr('3 * 4')).to.equal(12);
expect(evaluateBindExpr('4 / 5')).to.equal(0.8);
expect(evaluateBindExpr('5 % 4')).to.equal(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also 0/0

it('should support strings', () => {
expect(evaluateBindExpr('"a"')).to.equal('a');
expect(evaluateBindExpr('"a".length')).to.equal(1);
expect(evaluateBindExpr('"a" + "b"')).to.equal('ab');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we allow interpolation. I still feel like interpolation is the most obvious way to build a string (e.g. URL) within the markup. We can follow JS here as well by using reverse quotes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that interpolation would be nice. Current plan is to add it after MVP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok deferring. But we need to see the most prevalent cases while we are implementing it toward MVP. If it's URL construction, I believe, we need to do it for MVP.


/** @see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators */
it('should NOT allow access to non-whitelisted operators', () => {
expect(evaluateBindExpr('this')).to.be.null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's also add self and global

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done.

{$$ = $1 != $3;}
| expr '==' expr
{$$ = $1 == $3;}
| expr '?' expr ':' expr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there elvis too?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No since || offers the same functionality, but I don't have a strong opinion. Think we should add it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. || is good enough.

expect(() => { evaluateBindExpr('delete foo', {foo: 0}); }).to.throw();
});

it('should NOT allow control flow or loops', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not seeing tests for ?:, but they seem to be allowed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The grammar currently doesn't support ?: since an expr can't be empty.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant tests for expr ? expr : expr.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Added.

@dvoytenko
Copy link
Contributor

@choumx Wow! This grammar is way more powerful than I expected. But, I guess, why hide from it :)

@dreamofabear
Copy link
Author

How much of that 100ms is just the lexer? Ie, if we take away all the security code you wrote, how long does it take?

The security code seems to be only a small fraction (< 5%) of the total runtime. I'm going to experiment with building an AST first -- hopefully that will boost performance substantially.

Array.prototype.lastIndexOf,
Array.prototype.slice,
],
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These must be prototype-less objects. As is, you've just allowed __defineGetter__, __defineSetter__, hasOwnProperty, isPrototypeOf, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good for us we have Justin on our side :) Lets also test them.

types.js#map can help with creating such an object from a literal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good catch! Added a unit test for this.

I changed the whitelist to be created by a function. This is more readable than using types.j#map IMO and also prevents typos in the function name.

@dreamofabear
Copy link
Author

Re: performance, I looked around and there are other parser generator options. This comparison suggests that we may get several-X speed up with another generator or by handwriting a parser. @jridgewell also suggested precompiling the AST into a Function (underscore and Angular do this).

I'll continue to investigate expression performance as I work on an end-to-end MVP for Bind.

@cramforce
Copy link
Member

Lets get this in and then optimize. We can chat offline about details, but I'm pretty bullish about moving all of this into a web worker.

const functions = whitelist[type];
for (let i = 0; i < functions.length; i++) {
const f = functions[i];
out[type][f.name] = f;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name property has spotty support in iOS and Android Browser (non chrome). Instead of that, we could use strings and grab the method off the prototype:

const whitelist = [
  {
    class: '[object Array]',
    proto: Array.prototype,
    methods: ['concat', 'includes', ...]
  },
];

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this simple usage of Function.name, it appears to be supported.

@@ -2,44 +2,57 @@

%{
// Shortcuts for common functions.
var toString = Object.prototype.toString;
var hasOwnProperty = Object.prototype.hasOwnProperty;
const toString = Object.prototype.toString;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this compiled into es5?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

@dvoytenko
Copy link
Contributor

LGTM with one request for an additional test.

@dreamofabear
Copy link
Author

Thanks for the reviews everyone!

@dreamofabear dreamofabear merged commit 81d7f96 into ampproject:master Dec 2, 2016
@dreamofabear dreamofabear deleted the bind-expressions branch December 2, 2016 00:34
Lith pushed a commit to Lith/amphtml that referenced this pull request Dec 22, 2016
* initial bind expressions commit

* fix unary op bug, started unit tests

* bug fixes, unit tests for arrays

* more array tests

* add unary plus, fix !=, add a lot more tests

* remove useless comments

* initial bind expressions commit

* fix unary op bug, started unit tests

* bug fixes, unit tests for arrays

* more array tests

* add unary plus, fix !=, add a lot more tests

* remove useless comments

* throw error for non-whitelisted functions, return null for nested undefined member access

* more unit tests

* minor formatting fixes

* ignore compiled bind-expr-impl.js during lint

* fix lint errors in test-bind-expr.js

* fix presubmit: remove ES6 startsWith/endsWith and whitelist test file

* add type check for function args and unit test

* expect specific error for unsupported function tests

* address PR comments

* more PR comments

* add test for prototype fns, fix other tests

* fix broken string templating in parse errors

* use const/let instead of var

* use a function to create whitelist map

* fix lint errors

* add test for ternary op
Lith pushed a commit to Lith/amphtml that referenced this pull request Dec 22, 2016
* initial bind expressions commit

* fix unary op bug, started unit tests

* bug fixes, unit tests for arrays

* more array tests

* add unary plus, fix !=, add a lot more tests

* remove useless comments

* initial bind expressions commit

* fix unary op bug, started unit tests

* bug fixes, unit tests for arrays

* more array tests

* add unary plus, fix !=, add a lot more tests

* remove useless comments

* throw error for non-whitelisted functions, return null for nested undefined member access

* more unit tests

* minor formatting fixes

* ignore compiled bind-expr-impl.js during lint

* fix lint errors in test-bind-expr.js

* fix presubmit: remove ES6 startsWith/endsWith and whitelist test file

* add type check for function args and unit test

* expect specific error for unsupported function tests

* address PR comments

* more PR comments

* add test for prototype fns, fix other tests

* fix broken string templating in parse errors

* use const/let instead of var

* use a function to create whitelist map

* fix lint errors

* add test for ternary op
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants