Process query rows one at a time to reduce memory footprint #15268
Just left one comment, otherwise LGTM 👍
def fetchall(self):
    return self.__data
Do we need to keep this one now that we use fetchone?
Fair point. I think I was keeping it around "just in case", but that's not a good justification, so I'll drop it.
- raw_version = self.execute_query_raw("select current_version();")
- version = raw_version[0][0]
+ raw_version = next(self.execute_query_raw("select current_version();"))
+ version = raw_version[0]
Why is the second [0] not required here?
Because of the next right above. When using fetchall(), raw_version was a list of tuples, and thus we were accessing the first element of the first item in the list. With fetchone(), execute_query_raw is returning an iterator, from which we're already taking the first element with next. Thus, in the new version of the code, raw_version is a tuple from which we're only interested in the first element.
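To illustrate the indexing difference explained above, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for the Snowflake connection. The helper names execute_query_fetchall and execute_query_iter are hypothetical and are not the check's actual methods; only the shape of the return values is the point.

```python
import sqlite3


def execute_query_fetchall(conn, query):
    """Old style (hypothetical helper): materialize every row as a list of tuples."""
    cursor = conn.cursor()
    cursor.execute(query)
    return cursor.fetchall()


def execute_query_iter(conn, query):
    """New style (hypothetical helper): a generator yielding one row tuple at a time."""
    cursor = conn.cursor()
    cursor.execute(query)
    for row in cursor:
        yield row


# sqlite3 stands in for Snowflake here; sqlite_version() plays the role
# of current_version() in the diff.
conn = sqlite3.connect(":memory:")
query = "select sqlite_version();"

# fetchall() returns a list of tuples: index the row, then the column.
rows = execute_query_fetchall(conn, query)
version_old = rows[0][0]

# next() already pulls the first row out of the iterator,
# so a single index (the column) is enough.
first_row = next(execute_query_iter(conn, query))
version_new = first_row[0]
```

Both paths end up with the same version string; the only change is which layer (the list or the iterator) is unwrapped before the column is indexed.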
What does this PR do?
Replaces fetchall() with iteration over the cursor, which is equivalent to fetching the results one by one.

Motivation
A support case reported increased memory usage when queries return a large number of rows. The Snowflake docs indicate that this is the way to go when fetchall() results in memory usage issues.

Additional Notes
fetchall() is using fetchone() in a loop.
[...] fetchone.

Review checklist (to be filled by reviewers)
[...] changelog/ and integration/ labels attached
[...] qa/skip-qa label.
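
As a rough illustration of the change this PR describes (processing rows one at a time instead of calling fetchall()), the sketch below again uses the standard-library sqlite3 module rather than the Snowflake connector. The iter_rows helper and the metrics table are assumptions made for the example; the equivalence between cursor iteration and repeated fetchone() is demonstrated here only for sqlite3.

```python
import sqlite3


def iter_rows(cursor):
    """Yield rows one by one via fetchone() until the result set is exhausted.

    Hypothetical helper: only the current row is held by this function,
    instead of the full list that fetchall() would build.
    """
    while True:
        row = cursor.fetchone()
        if row is None:
            break
        yield row


# Illustrative data set; table and column names are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("create table metrics (name text, value integer)")
conn.executemany(
    "insert into metrics values (?, ?)",
    [(f"m{i}", i) for i in range(1000)],
)

# Process each row as it arrives, without keeping the rows around.
cur = conn.execute("select name, value from metrics order by value")
total = 0
count = 0
for name, value in iter_rows(cur):
    total += value
    count += 1

# Iterating on the cursor directly yields the same rows as the fetchone() loop.
cur = conn.execute("select name, value from metrics order by value")
via_iteration = list(cur)
cur = conn.execute("select name, value from metrics order by value")
via_fetchone = list(iter_rows(cur))
```

The lists at the end exist only to compare the two approaches; in the streaming loop itself nothing accumulates beyond the running total and count.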