Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename users clean or prune -- use aggregated user statistics #901

Merged
merged 10 commits into from
Dec 15, 2022

Conversation

DaedalusG
Copy link
Contributor

This PR refactors use of the usageStatistics lastActiveUsage table which caused the command to timeout during execution on instances with a sufficiently large number of users. See #848

Instead the aggregated_user_statistics table is used via the site graphQL endpoint

query getInactiveUsers {
  site {
    users {
      nodes {
        id
        username
        email
        siteAdmin
        lastActiveAt
      }
    }
  }
}

This has two benefits

  1. The backend to query lastActiveAt from the site graphQL endpoint is much more efficient than usageStatistics, and eliminates timeout concerns on many users instances
  2. aggregated_user_statistics persists the date lastActiveAt eliminating the concern of null lastActiveUsage values which are misleading. lastActiveUsage is computed by reference to the event_logs table which only persists event entries for 93 days, when 93 days pass if a user had no entries in event_logs lastActiveUsage was set to null -- previously src users clean interpreted a null value as a user never having used the instance. A flag -remove-null-users is provided to remove null lastActiveAt` users

This PR also changes the name of the src users clean command renaming it to src users prune which follows the docker and kubectl cleanup command convention

Closes #848

Test plan

Tested on local machine

@DaedalusG DaedalusG marked this pull request as ready for review December 15, 2022 10:52
Copy link
Contributor

@erzhtor erzhtor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, a comment regarding command (and flag) backward compatibility, PR looks good!

@@ -19,21 +19,21 @@ This command removes users from a Sourcegraph instance who have been inactive fo

Examples:

$ src users clean -days 182
$ src users prune -days 182
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are such command/API changes usually made in src cli? Should this be a major version increase or should we keep backward compatibility leaving both clean and prune commands and marking one deprecated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost all the functionality changes in terms of the command run locally, the old clean command had some big timeout issues see #848. The name change from clean to prune was mostly just a style change. I don't think this command is being widely used so I think the change is alright. To my knowledge no one has put this on a cronjob or anything like that yet.

@DaedalusG DaedalusG merged commit 0476543 into main Dec 15, 2022
@DaedalusG DaedalusG deleted the users_clean-use-aggregated_user_statistics branch December 15, 2022 19:11
Copy link
Contributor

@mrnugget mrnugget left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different than before in its performance characteristics? You're still fetching all users in the table, without doing any filtering. I don't see any improvement here?

Comment on lines +73 to +82
site {
users {
nodes {
username
email
siteAdmin
lastActiveAt
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intendation is off

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hitting the aggregated_user_statistics table, and the value of lastActiveAt doesn't need to access the event_logs table the same way usageStatistics used to. Try it out here, its much more performant.

aggregated_user_statistics table is updating in the background, so this isn't accessing the event logs table at the time of the request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrnugget
Copy link
Contributor

Okay, so we're removing the N+1 error, got it.

That's only 1 of 2 possible performance issues, the other is that we transfer all users over the wire. (I don't think we want that on an instance with 10k+ users).

But I just checked and it turns out that the code here doesn't work: site { users } is paginated and only returns the first 100 users by default. You need to paginate through the results.

Here, run the query (note that I added totalCount), copy the output into result.json and then check with jq:

$ cat result.json | jq '.data.site.users.nodes | length'
100
$ cat result.json | jq '.data.site.users.totalCount'
208

that means you're only cleaning up the first 100 users.

@DaedalusG
Copy link
Contributor Author

Ah, good catch @mrnugget will open a follow up on this to paginate and maybe implement a filter similar to what we were doing here: https://github.com/sourcegraph/sourcegraph/pull/43370

scjohns pushed a commit that referenced this pull request Apr 24, 2023
* added new type to process siteUser grahQL type -- still needing to change in utility functions

* code compiling but not registering users to remove correctly

* correctly structured json response

* remove debugging log

* rename command src users clean to src users prune

* rename null flag

* changelog entry

* correct json formating

* correct command name in users general help

* style correct users.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve performance of src users clean in instances with many users
3 participants