-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Chore: Benchmarking #1028
base: main
Are you sure you want to change the base?
Chore: Benchmarking #1028
Conversation
Committer: Alka Trivedi <alkatrivedi12dec@gmail.com> On branch benchmarking Changes to be committed: new file: samples/samples/benchmarking.py
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
def query_data(thread_id): | ||
print("running thread ", thread_id) | ||
start_time = time.time() | ||
time.sleep(10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this sleep time, instead you can increase the # of transaction per run if you want the benchmark to execute over a long period of time.
|
||
# Create a Spanner database instance | ||
instance = spanner_client.instance(instance_id) | ||
pool = pool.FixedSizePool(size = 10, logging_enabled=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- What's the default size size of
FixedSizePool
? Can we avoid passingsize = 10
and use the default size? That will take us closer to what a general customer will use. - @surbhigarg92 Are we using
logging_enabled=True
as a global option? I thought we had refactored this to be a pool option similar toclose_inactive_transactions
p90Index = math.floor(0.9*len(latencies)) | ||
p90Latency = latencies[p90Index] | ||
|
||
return [p50Latency, p90Latency] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also evaluate p99.
pool = pool.FixedSizePool(size = 10, logging_enabled=True) | ||
database = instance.database(pool=pool, database_id=database_id, close_inactive_transactions=True) | ||
|
||
transaction_time = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A global variable for transaction_time
is mixing up the results of different kinds of transactions. I see you have 3 types here - executeSql, batch transaction and DML (insert). It would be good to profile what is the latency for each type of transaction so that we understand if any of them is regressing or not.
"SELECT 1 FROM Singers" | ||
) | ||
|
||
# for row in results: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: could you remove the commented out code?
# [END spanner_query_data] | ||
|
||
# [START spanner_batch_transaction] | ||
def batch_transaction(thread_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this function batch_transaction
invoked? Similarly where is insert_with_dml
invoked?
# [END insert_with_dml] | ||
|
||
# Define the number of threads | ||
num_threads = 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- At max only 10 threads will be used up since there are only 10 sessions. Post that 10 threads will always wait.
- Currently you are running 1 query per thread, it will be good to increase the number of transactions per thread. For example, 2000 transactions in total. 20 threads will make it 200 transactions per thread. This will ensure the run executes for more time and does not finish in < 1 second. This will also allow you to remove sleep() of 10s.
Benchmarking long running session removals