-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rename users clean or prune -- use aggregated user statistics #901
Conversation
…ange in utility functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides, a comment regarding command (and flag) backward compatibility, PR looks good!
@@ -19,21 +19,21 @@ This command removes users from a Sourcegraph instance who have been inactive fo | |||
|
|||
Examples: | |||
|
|||
$ src users clean -days 182 | |||
$ src users prune -days 182 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are such command/API changes usually made in src
cli? Should this be a major version increase or should we keep backward compatibility leaving both clean
and prune
commands and marking one deprecated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost all the functionality changes in terms of the command run locally, the old clean
command had some big timeout issues see #848. The name change from clean
to prune
was mostly just a style change. I don't think this command is being widely used so I think the change is alright. To my knowledge no one has put this on a cronjob or anything like that yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this different than before in its performance characteristics? You're still fetching all users in the table, without doing any filtering. I don't see any improvement here?
site { | ||
users { | ||
nodes { | ||
username | ||
siteAdmin | ||
lastActiveAt | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intendation is off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is hitting the aggregated_user_statistics
table, and the value of lastActiveAt
doesn't need to access the event_logs
table the same way usageStatistics
used to. Try it out here, its much more performant.
aggregated_user_statistics
table is updating in the background, so this isn't accessing the event logs table at the time of the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/sourcegraph/src-cli/pull/903/files Indentation fix
Okay, so we're removing the N+1 error, got it. That's only 1 of 2 possible performance issues, the other is that we transfer all users over the wire. (I don't think we want that on an instance with 10k+ users). But I just checked and it turns out that the code here doesn't work: Here, run the query (note that I added
that means you're only cleaning up the first 100 users. |
Ah, good catch @mrnugget will open a follow up on this to paginate and maybe implement a filter similar to what we were doing here: https://github.com/sourcegraph/sourcegraph/pull/43370 |
* added new type to process siteUser grahQL type -- still needing to change in utility functions * code compiling but not registering users to remove correctly * correctly structured json response * remove debugging log * rename command src users clean to src users prune * rename null flag * changelog entry * correct json formating * correct command name in users general help * style correct users.go
This PR refactors use of the
usageStatistics
lastActiveUsage
table which caused the command to timeout during execution on instances with a sufficiently large number of users. See #848Instead the
aggregated_user_statistics
table is used via thesite
graphQL endpointThis has two benefits
lastActiveAt
from thesite
graphQL endpoint is much more efficient thanusageStatistics
, and eliminates timeout concerns on many users instancesaggregated_user_statistics
persists the datelastActiveAt
eliminating the concern ofnull
lastActiveUsage
values which are misleading.lastActiveUsage
is computed by reference to theevent_logs
table which only persists event entries for 93 days, when 93 days pass if a user had no entries inevent_logs
lastActiveUsage
was set tonull
-- previouslysrc users clean
interpreted anull
value as a user never having used the instance. A flag-remove-null-users
is provided to removenull
lastActiveAt` usersThis PR also changes the name of the
src users clean
command renaming it tosrc users prune
which follows thedocker
andkubectl
cleanup command conventionCloses #848
Test plan
Tested on local machine