Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple Parse query translated to a mongodb "or" query #6708

Closed
mess-lelouch opened this issue May 26, 2020 · 25 comments
Closed

Simple Parse query translated to a mongodb "or" query #6708

mess-lelouch opened this issue May 26, 2020 · 25 comments
Labels
type:feature New feature or improvement of existing feature

Comments

@mess-lelouch
Copy link
Contributor

mess-lelouch commented May 26, 2020

Issue Description

I am currently using Parse Server 3.6.0. Last week I attempted to migrate to the latest version of Parse Server and I had to revert back due to performance issues related to specific queries. I think I know how to solve the performance issues, however, I found an interesting behavior during my investigation.

It seems that in the new version of Parse Server, whenever I query a table which has a CLP entry that gives access to the user in a specific column of the table, the resultant mongodb query ends up as an $or query. Also, one of the branches in the $or query has a clause to match the user pointer object (which seems useless because pointers are not stored that way in the mongodb table).

Steps to reproduce

I am close to a 100% sure (but not 100% sure) that the following steps reproduce the problem.

  1. Create a class FooClass with two columns: name | String and user | Pointer<User>.

  2. Add a CLP giving read/write access to the user column.

image

  1. Run the following Parse query in explain mode and look at the mongodb query. In my environment, I reproduced it using a user session token. Not sure if it can be reproduced with a master key.
    const query = await new Parse.Query('FooClass')
        .explain(true)
        .equalTo('name', 'some-name')
        .find({sessionToken: token});

Expected Results

A simpler mongodb query

Actual Outcome

    "parsedQuery": {
      "$and": [
        {
          "$or": [
            {
              "$and": [
                {
                  "_p_user": {
                    "$eq": "_User$5YImRRMWLs"
                  }
                },
                {
                  "name": {
                    "$eq": "some-name"
                  }
                }
              ]
            },
            {
              "$and": [
                {
                  "_p_user": {
                    "$eq": {
                      "__type": "Pointer",
                      "className": "_User",
                      "objectId": "5YImRRMWLs"
                    }
                  }
                },
                {
                  "name": {
                    "$eq": "some-name"
                  }
                }
              ]
            }
          ]
        },
        {
          "_rperm": {
            "$in": [
              null,
              "*",
              "5YImRRMWLs",
              "role:role1"
            ]
          }
        }
      ]
    },

Environment Setup

  • Server

    • parse-server version (Be specific! Don't say 'latest'.) : [4.2.0] (In 3.6.0 the problem does not happen)
    • Operating System: [Linux]
    • Hardware: [AWS EC2]
    • Localhost or remote server? (AWS, Heroku, Azure, Digital Ocean, etc): [AWS]
  • Database

    • MongoDB version: [3.6.9] (But I have reproduced it with 4.2 as well)
    • Storage engine: [WiredTiger]
    • Hardware: [EC2]
    • Localhost or remote server? (AWS, mLab, ObjectRocket, Digital Ocean, etc): [AWS]

Logs/Trace

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

I doubt that the query in your post matches the explain output. The query does not contain a condition on user and the ACL has nothing to do with a condition on the user field.

@mess-lelouch
Copy link
Contributor Author

Hello @mtrezza ! While I am 100% sure about the query and the output, it could be environment related. I reproduced this on my existing dev environment.

I will try to create an environment from scratch and reproduce it.

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

Can you post the output of query.toJSON() (before you execute the query) here?

Sent with GitHawk

@mess-lelouch
Copy link
Contributor Author

@mtrezza { where: { name: 'some-name' }, explain: true }

This is definitively environment related because I was not able to reproduce it from scratch. Give me some time to attempt to reproduce it and I will get back to you.

@mess-lelouch
Copy link
Contributor Author

mess-lelouch commented May 27, 2020

@mtrezza I think I got it... I still have some things I need to figure out but I am getting closer to it.

For some reason, my class schema has the following CLP attributes:

  classLevelPermissions: {
    find: { user: true },
    count: { user: true },
    get: { user: true },
    create: { '*': true },
    update: { '*': true },
    delete: { '*': true },
    addField: { '*': true },
    protectedFields: {},
    readUserFields: [ 'user' ],
    writeUserFields: [ 'user' ]
  },

Not sure how readUserFields and writeUserFields got there, but those seem to trigger the reshaping of the query as I described in this issue. And it looks like that the code in DatabaseController.addPointerPermissions is doing that.

When I attempted to reproduce the issue from scratch, the table that I created did not have those in the CLP, which is why I was not able to reproduce it.

@mess-lelouch
Copy link
Contributor Author

@mtrezza I realized that

    readUserFields: [ 'user' ],
    writeUserFields: [ 'user' ]

are set when we set the user column CLPs using an older version of the Parse Dashboard (2.0.4). In production we used that.

However, in my local environment I had 2.1.0, which behaves a bit differently and doesn't set those, which is why I was not able to reproduce it easily.

So as a workaround, I think we can upgrade the dashboard in our production env to 2.1.0 and then reset those CLPs so that they are set without readUserFields and writeUserFields. This way we avoid the issue described in this thread.

Still, if my understanding is correct, this is a legitimate bug that probably needs to be fixed or perhaps documented properly.

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

I also haven't been able to add readUserFields or writeUserFields to the schema. The test cases set the fields manually, but I couldn't find out how to do that with Parse Dashboard.

await updateCLP({
find: {
pointerFields: ['owner'],
},
});

Also the docs mention the topic but don't even mention these parameters.

I can see that @BufferUnderflower has been working on this, maybe they can shed light on this?

@BufferUnderflower
Copy link
Contributor

BufferUnderflower commented May 27, 2020

I also haven't been able to add readUserFields or writeUserFields to the schema. The test cases set the fields manually, but I couldn't find out how to do that with Parse Dashboard.

await updateCLP({
find: {
pointerFields: ['owner'],
},
});

Also the docs mention the topic but don't even mention these parameters.

I can see that @BufferUnderflower has been working on this, maybe they can shed light on this?

parse-community/parse-dashboard#1478

Allows to set Pointer Permissions (PP) per-operation in advanced mode. Previously it was only possible to set write/read groups ( readUserFields / writeUserFields in schema). Upon saving the field name will go under corresponding operation into pointerFields array. If all (read or write) operations are checked - saves into corresponding readUserFields/writeUserFields array instead.

@mess-lelouch
Copy link
Contributor Author

Thanks for the info @BufferUnderflower... @mtrezza you could also use version 2.0.4 of the dashboard and easily set them as well: https://recordit.co/mFsbZb8sjm

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

If all (read or write) operations are checked - saves into corresponding readUserFields/writeUserFields array instead.

If I understand this correctly, if I check all read operations then readUserFields:['user'] should be set instead of the entries under each operation. That I was not able to achieve with dashboard 2.1.0.

This permission:
image

Saves this schema:
image

However, when I set readUserFields manually in the DB then I get the same query result as described in #6708 (comment).


However, back to the original issue:

Expected Results
A simpler mongodb query

I understand that the performance issue you mention is not related to the query you point out here. In other words, the issue at question here is whether the query you found is unnecessarily verbose, thus realistically having a potential performance and data traffic implication. That's a valid point.

I assume what you identified as "complex" is the additional $or'ed

{
    "__type": "Pointer",
    "className": "_User",
    "objectId": "5YImRRMWLs"
}

It seems that this actually allows to reference User objects by "literal" Parse pointer, although I wouldn't know why anyone would store a pointer like this in the DB considering that it just bloats the MongoDB document, other than maybe by mistake. From the comments this has apparently been added to reference an array of users.

// constraint for single pointer setup
const q = {
[key]: userPointer,
};
// constraint for users-array setup
const qa = {
[key]: { $all: [userPointer] },
};

Since Parse Server does not have a field type for "Array of Pointers" (≠ Relations), maybe it was assumed that an array of pointers would be stored in a field of type Array with the literal pointer objects as elements. That's my guess.

I would conclude, that the query is that complex intentionally, even if it imposes costs (data traffic, DB node resources) on users who don't use this type of pointer reference, so there is room for improvement.

@mess-lelouch
Copy link
Contributor Author

mess-lelouch commented May 27, 2020

@mtrezza Thanks for the feedback!

I understand that the performance issue you mention is not related to the query you point out here.

Well... sometime ago I had an issue with a mongodb $or query that was sending my db instance's CPU utilization to 100% and I was pointed out to https://jira.mongodb.org/browse/SERVER-36393. The moment I deployed my PS application with 4.2.0, CPU util quickly went to 100%. Sadly, I cannot confirm at the moment that the issue in the link is what occurred to me when I deployed the PS application. That said, I think the issue in the link can be avoided with proper maintenance of the table's indexes (e.g. removing unused).

In other words, the issue at question here is whether the query you found is unnecessarily verbose, thus realistically having a potential performance and data traffic implication.

Right, Since I am not able to confirm what I stated above, this is the primary reason why I opened the issue. Under normal conditions the queries work well. However, I speculate that it will add unnecessary stress that will make the system less scalable under heavy loads.

I assume what you identified as "complex" is the additional $or'ed

{
    "__type": "Pointer",
    "className": "_User",
    "objectId": "5YImRRMWLs"
}

Well, regardless of the _User pointer clause, if I create a simple Parse.Query with a single equalTo clause, I would expect the resultant query to be simple as well. I know there are things that are added under the hood (e.g. the _rperm clause), but I don't expect it to be translated to a more complex query.

What is more interesting is that in the current PS version that I am running (3.6.0), this behavior does not happen and the query gets translated to the following mongodb query, which I consider simpler and more acceptable:

	"command" : {
		"find" : "FooClass",
		"filter" : {
			"name" : "some-name",
			"_p_user" : "_User$5YImRRMWLs",
			"_rperm" : {
				"$in" : [
					null,
					"*",
					"*",
					"role:role1",
					"role:role2",
					"5YImRRMWLs"
				]
			}
		},
		"sort" : {
			
		},
		"projection" : {
			
		},
		"limit" : 100,
		"returnKey" : false,
		"showRecordId" : false,
		"$db" : "test"
	},

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

I have updated my previous comment.

if I create a simple Parse.Query with a single equalTo clause, I would expect the resultant query to be simple as well

It seems this issue is only related to pointer permissions. The query is that complex to allow pointer permissions for an array of user pointers, not only a single user pointer. If my previous comment analyzed this correctly, then this is actually a legacy issue that only occurs when a schema contains readUserFields / writeUserFields. The issue should not occur with pointer permissions that are set under each schema operator (find, etc).

in the current PS version that I am running (3.6.0), this behavior does not happen

That is likely because you don't have readUserFields / writeUserFields in your schema, so the query stays simple. As I pointed out in my previous comment, I was not able to add these fields to the schema by using dashboard 2.1.0. I haven't been involved with the schema ACLs, so I couldn't tell if it's intended to deprecate these fields anyway.

@mess-lelouch
Copy link
Contributor Author

mess-lelouch commented May 27, 2020

That is likely because you don't have readUserFields / writeUserFields in your schema, so the query stays simple. As I pointed out in my previous comment, I was not able to add these fields to the schema by using dashboard 2.1.0. I haven't been involved with the schema ACLs, so I couldn't tell if it's intended to deprecate these fields anyway.

  • Yes, I made sure I had those. Since Parse Server version < 4 does not expose explain, I had to enable logs in my mongo instance and observe them. The output I got was the one I pasted my my previous response.

  • As I mentioned above, if you install dashboard version 2.0.4 (npm install -g parse-dashboard@2.0.4) you will get this behavior. I even included a video recording a couple of comments above: https://recordit.co/mFsbZb8sjm

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

Yes, so apart from the query complexity, is there another issue?

@BufferUnderflower
Copy link
Contributor

I haven't been involved with the schema ACLs, so I couldn't tell if it's intended to deprecate these fields anyway.

That was exactly my idea behind the PR, since these fields don't allow any granularity. I tried not to break backwards compatibility with '[read/write]UserFields'. However chances are I've messed smth. Will be able to play with the code next week only.

I believe this is the code in question

@mess-lelouch
Copy link
Contributor Author

mess-lelouch commented May 27, 2020

Yes, so apart from the query complexity, is there another issue?

Nope, that is the main reason of this issue. I think I had identified a workaround that we could use to do the upgrade (i.e. using the new dashboard and resetting some of the CLPs), but it would be great if this issue gets fixed to avoid any unexpected performance issues in the future.

@BufferUnderflower
Copy link
Contributor

BufferUnderflower commented May 27, 2020

more interesting is that in the current PS version that I am running (3.6.0), this behavior does not happen and the query gets translated to the following mongodb query, which I consider simpler and more acceptable:

	"command" : {
		"find" : "FooClass",
		"filter" : {
			"name" : "some-name",
			"_p_user" : "_User$5YImRRMWLs",
...```

It is definitely simpler, but it does not consider the columns of type 'Array', where you can hold an array of users who are granted some permission by pointer.

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

OK, so there are 2 separate issues:

  1. "complex" query

@mess-lelouch
The query exists for 4 years now, since the introduction of pointer permissions (PP). I would not consider it a bug and it's probably just a slight improvement when using PP. Do you have a suggestion for a PR?

  1. legacy fields not set

@BufferUnderflower
Maybe we can just deprecate the readUserFields / writeUserFields fields and add a migration script that runs on start of parse server to translate all readUserFields / writeUserFields into operator entries?

this behavior does not happen

Do you have the readUserFields / writeUserFields fields in the schema? It's required to replicate the query.

@BufferUnderflower
Copy link
Contributor

As for the addPointerPermission query builder - we have schema instance there. So it is possible to only add "Array $or" when there are Array columns present in class schema. And same for pointers. If there are both, I can' think of any way how to combine 2 different queries without $or, do you?

@mtrezza
Copy link
Member

mtrezza commented May 27, 2020

As for the addPointerPermission query builder - we have schema instance there. So it is possible to only add "Array $or" when there are Array columns present in class schema. And same for pointers.

@mess-lelouch Would you want to look into a PR for this?

@mess-lelouch
Copy link
Contributor Author

It is definitely simpler, but it does not consider the columns of type 'Array', where you can hold an array of users who are granted some permission by pointer.

Ahh I see... I didn't knew this was possible. I always assumed it only worked on a single _User pointer column. That said, not sure if I would encourage using Array of user pointers as CLP over roles, but if this behavior exists in previous versions then I agree it should be preserved.

I am interested to see how this works in 3.6.0 so I will give it a try.

Do you have the readUserFields / writeUserFields fields in the schema? It's required to replicate the query.

@BufferUnderflower Yeah, this is the interesting bit. And this is the main reason I started investigating everything. Had the behavior remained the same in 4.x.x I would not have noticed it. I have been testing for the past day switching between versions on my Parse Server application.

@mess-lelouch Would you want to look into a PR for this?

@mtrezza not sure, but I will try to make some time for it.

@mess-lelouch
Copy link
Contributor Author

@BufferUnderflower I think I found a very related issue (to this one) but I am not sure if it is an issue or if the behavior was replaced by something else. I haven't created a github issue yet since I haven't confirmed if it is a bug or not.

https://community.parseplatform.org/t/permission-denied-for-action-find-on-class/365

@mtrezza
Copy link
Member

mtrezza commented Jul 27, 2020

I am closing this as it seems to be resolved by #6747. Feel free to comment if you have any questions and we can re-open this issue.

However, there is still the issue of legacy fields not set that was revealed in the process.
@BufferUnderflower Does it makes sense to open a separate issue for this or is this even already in the works?

@mtrezza mtrezza closed this as completed Jul 27, 2020
@mess-lelouch
Copy link
Contributor Author

I am closing this as it seems to be resolved by #6747. Feel free to comment if you have any questions and we can re-open this issue.

However, there is still the issue of legacy fields not set that was revealed in the process.
@BufferUnderflower Does it makes sense to open a separate issue for this or is this even already in the works?

@mtrezza I already submitted a fix for the pointer CLPs issue here parse-community/parse-dashboard#1556. It is already merged, but please let me know if there are any questions or concerns and I will go ahead and address them.

@mtrezza
Copy link
Member

mtrezza commented Jul 27, 2020

Amazing! Thanks for the fix.

@mtrezza mtrezza added type:feature New feature or improvement of existing feature and removed type:improvement labels Dec 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:feature New feature or improvement of existing feature
Projects
None yet
Development

No branches or pull requests

3 participants