MongoDB performance issues with pointer permissions on large collections. #7065

Closed
4 tasks done
pdiaz opened this issue Dec 12, 2020 · 6 comments
Labels
type:feature New feature or improvement of existing feature

Comments

@pdiaz
Contributor

pdiaz commented Dec 12, 2020

New Issue Checklist

Issue Description

After upgrading from parse-server 2.7.4 to parse-server 4.4.0 we found a huge increase in CPU load on a MongoDB 4.0.11 cluster. The primary node originally peaked at about 20% usage and was maxed out after the upgrade.

Looking at the MongoDB performance logs, we noticed that some queries on collections with tens of millions of documents that use pointer permissions contained logic that could be simplified considerably.

Asking MongoDB to explain these find operations showed that indexes were used and that each request took about 600ms on the MongoDB cluster. Looking at these queries, you can see that some boolean reductions can be made. After making them by hand, an explain of the simplified query on the same collection showed it executing in just a few milliseconds.

Example of filter statement on a find command found on the MongoDB logs taking 600ms to process:

{ $or: [ { $and: [ { _p_owner: "_User$xxxxxxxxxx" }, { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true } ] }, { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true, _p_buyer: "_User$xxxxxxxxxx" } ], _rperm: { $in: [ null, "*", "*", "xxxxxxxxxx" ] } }

This can be simplified by hand in several steps:

  • { $or: [ { $and: [ { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true } ] }, { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true, _p_buyer: "_User$xxxxxxxxxx" } ], _rperm: { $in: [ null, "*", "*", "xxxxxxxxxx" ] } }
  • { $or: [ { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true }, { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true, _p_buyer: "_User$xxxxxxxxxx" } ], _rperm: { $in: [ null, "*", "*", "xxxxxxxxxx" ] } }
  • { $or: [ { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true } ], _rperm: { $in: [ null, "*", "*", "xxxxxxxxxx" ] } }
  • { _p_owner: "_User$xxxxxxxxxx", status: "pending", buyerOffered: true, _rperm: { $in: [ null, "*", "*", "xxxxxxxxxx" ] } }

This last simplified query is executed in less than 5ms. So it seems that MongoDB 4.0 doesn't optimize these queries internally!
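The hand reduction above can be sketched in code. This is a minimal illustration, not the code from the linked PR: the helper names (`isSubset`, `dropRedundant`, `simplify`) are hypothetical, and it only compares top-level key/value pairs of each clause, treating operator values such as `$in` as opaque.

```javascript
// Hypothetical helpers for illustration; not taken from Parse Server.

// True when every top-level key/value pair of `a` also appears in `b`.
// Values are compared via JSON.stringify, so nested key order matters.
function isSubset(a, b) {
  return Object.keys(a).every(
    (k) => k in b && JSON.stringify(a[k]) === JSON.stringify(b[k])
  );
}

// Drops clauses made redundant by another clause. When two clauses are
// equivalent, only the first one is kept.
function dropRedundant(clauses, isRedundant) {
  return clauses.filter(
    (c, i) =>
      !clauses.some(
        (other, j) =>
          j !== i && isRedundant(c, other) && !(isRedundant(other, c) && j > i)
      )
  );
}

function simplify(filter) {
  let out = { ...filter };
  for (const op of ["$and", "$or"]) {
    if (!Array.isArray(out[op])) continue;
    let clauses = out[op].map(simplify);
    clauses = dropRedundant(
      clauses,
      op === "$and"
        ? (c, other) => isSubset(c, other) // A AND (A AND x) = A AND x: drop A
        : (c, other) => isSubset(other, c) // A OR (A AND x) = A: drop (A AND x)
    );
    if (clauses.length === 1) {
      // A single remaining clause can be merged into the parent object,
      // provided none of its keys collide with the parent's own keys.
      const { [op]: _, ...rest } = out;
      const only = clauses[0];
      if (Object.keys(only).every((k) => !(k in rest))) {
        out = { ...only, ...rest };
        continue;
      }
    }
    out[op] = clauses;
  }
  return out;
}

// Usage with the filter from the MongoDB logs.
const user = "_User$xxxxxxxxxx";
const slowFilter = {
  $or: [
    { $and: [{ _p_owner: user }, { _p_owner: user, status: "pending", buyerOffered: true }] },
    { _p_owner: user, status: "pending", buyerOffered: true, _p_buyer: user },
  ],
  _rperm: { $in: [null, "*", "*", "xxxxxxxxxx"] },
};
const fastFilter = simplify(slowFilter);
console.log(JSON.stringify(fastFilter));
```

For this input the result has the same shape as the last hand-simplified filter in the list above. A general optimizer would need to handle more cases (key collisions, nested operators, differing key order) than this sketch does.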

Steps to reproduce

In a large collection (16M documents) that uses at least two pointer permissions on MongoDB 4.0, perform queries as one of the users that are allowed to read and write the documents.

Schema used:

{
  "className": "Offer",
  "fields": {
    "objectId": { "type": "String" },
    "createdAt": { "type": "Date" },
    "updatedAt": { "type": "Date" },
    "ACL": { "type": "ACL" },
    "status": { "type": "String" },
    "owner": { "type": "Pointer", "targetClass": "_User" },
    "buyer": { "type": "Pointer", "targetClass": "_User" },
    "offerAmount": { "type": "Number" },
    "originalPrice": { "type": "Number" },
    "responded": { "type": "Boolean" }
  },
  "classLevelPermissions": {
    "find": { "owner": true, "buyer": true },
    "count": { "owner": true, "buyer": true },
    "get": { "owner": true, "buyer": true },
    "create": { "*": true, "requiresAuthentication": true, "owner": true, "buyer": true },
    "update": { "owner": true, "buyer": true },
    "delete": { "owner": true, "buyer": true },
    "addField": {},
    "protectedFields": {},
    "readUserFields": [ "owner", "buyer" ],
    "writeUserFields": [ "owner", "buyer" ]
  },
  "indexes": { "_id_": { "_id": 1 }, ... }
}

Actual Outcome

Slow queries to MongoDB, on the order of 600-800ms, on large collections (16M+ documents)

Expected Outcome

Much faster queries

Environment

parse-server: 4.4.0
parse: 2.18.0

Server

  • Parse Server version: 4.4.0
  • Operating system: Debian hosting k8s
  • Local or remote host (AWS, Azure, Google Cloud, Heroku, Digital Ocean, etc): DOKS

Database

  • System (MongoDB or Postgres): MongoDB
  • Database version: 4.0.11 replica set
  • Local or remote host (MongoDB Atlas, mLab, AWS, Azure, Google Cloud, etc): in cluster

Client

  • SDK (iOS, Android, JavaScript, PHP, Unity, etc): iOS and Node.JS
  • SDK version: iOS SDK 1.19.0, JS SDK 2.18

Logs

@pdiaz
Contributor Author

pdiaz commented Dec 12, 2020

Naive fix for the issue that gives us very good performance:

#7061

@mtrezza
Member

mtrezza commented Dec 12, 2020

Thanks for reporting and thanks for adding this well-researched issue explanation to the PR.

@mtrezza
Member

mtrezza commented Dec 13, 2020

Looking at the optimization algorithm, I would expect this to improve the performance of queries regardless of whether they use pointer permissions. Did you do a performance test without the pointer permissions in the query for comparison?

@pdiaz
Contributor Author

pdiaz commented Dec 13, 2020

I haven't done that many performance tests beyond this specific case but I believe it could be a good idea.

The current code in master does complicate queries with pointer permissions, which confuses MongoDB into choosing an index that is not optimal.

The changes in the merge request mostly aim to simplify this specific case. There could be a more general logic optimizer, but that would require much more unit testing.

A first step toward a more complete general optimizer would be to move the common key-value pairs to the parent object instead of just discarding a branch. In general there would be more

For example, this case is not covered by the current merge request: { $or: [{ a: 1, b: 2}, {b: 2, c: 3}] } could be simplified to { b: 2, $or: [{a: 1}, {c: 3}] }. This kind of reduction would also help any admin define indexes by hand in a much clearer way.
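As a rough sketch of that idea (not part of the merge request), the function below hoists key-value pairs that are identical in every `$or` branch up to the parent object. The name `hoistCommonPairs` is hypothetical, and the sketch assumes plain field keys compared via `JSON.stringify`; operator keys starting with `$` are left in place.

```javascript
// Hypothetical helper for illustration; not taken from Parse Server.
function hoistCommonPairs(filter) {
  const branches = filter.$or;
  if (!Array.isArray(branches) || branches.length < 2) return filter;

  const same = (x, y) => JSON.stringify(x) === JSON.stringify(y);

  // Plain field keys whose value is identical in every branch, and which the
  // parent either does not constrain or constrains with the same value.
  const hoistable = Object.keys(branches[0]).filter(
    (k) =>
      !k.startsWith("$") &&
      branches.every((b) => k in b && same(b[k], branches[0][k])) &&
      (!(k in filter) || same(filter[k], branches[0][k]))
  );
  if (hoistable.length === 0) return filter;

  const hoisted = Object.fromEntries(hoistable.map((k) => [k, branches[0][k]]));
  const stripped = branches.map((b) =>
    Object.fromEntries(Object.entries(b).filter(([k]) => !hoistable.includes(k)))
  );
  const { $or: _, ...rest } = filter;

  // A branch left empty matches every document, so the $or is always true
  // and can be dropped once the common pairs are on the parent.
  if (stripped.some((b) => Object.keys(b).length === 0)) {
    return { ...rest, ...hoisted };
  }
  return { ...rest, ...hoisted, $or: stripped };
}

// Usage: `b: 2` appears in both branches, so it moves to the parent.
const input = { $or: [{ a: 1, b: 2 }, { b: 2, c: 3 }] };
const result = hoistCommonPairs(input);
console.log(JSON.stringify(result));
```

The hoist is only valid when the pair has the same value in every branch. If the branches disagree (for example `b: 2` in one and `b: 1` in the other), the filter is returned unchanged.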

@pdiaz
Contributor Author

pdiaz commented Dec 14, 2020

Would you need something more to approve this merge request?

@mtrezza
Member

mtrezza commented Dec 16, 2020

Closed with #7061.

@mtrezza mtrezza closed this as completed Dec 16, 2020
@mtrezza mtrezza added type:feature New feature or improvement of existing feature and removed type:improvement labels Dec 6, 2021