Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chore: Benchmarking #1028

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

alkatrivedi
Copy link
Contributor

Benchmarking long running session removals

Committer: Alka Trivedi <alkatrivedi12dec@gmail.com>
On branch benchmarking
Changes to be committed:
	new file:   samples/samples/benchmarking.py
@google-cla
Copy link

google-cla bot commented Oct 25, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: spanner Issues related to the googleapis/python-spanner API. labels Oct 25, 2023
def query_data(thread_id):
print("running thread ", thread_id)
start_time = time.time()
time.sleep(10)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this sleep time, instead you can increase the # of transaction per run if you want the benchmark to execute over a long period of time.


# Create a Spanner database instance
instance = spanner_client.instance(instance_id)
pool = pool.FixedSizePool(size = 10, logging_enabled=True)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • What's the default size size of FixedSizePool ? Can we avoid passing size = 10 and use the default size? That will take us closer to what a general customer will use.
  • @surbhigarg92 Are we using logging_enabled=True as a global option? I thought we had refactored this to be a pool option similar to close_inactive_transactions

p90Index = math.floor(0.9*len(latencies))
p90Latency = latencies[p90Index]

return [p50Latency, p90Latency]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also evaluate p99.

pool = pool.FixedSizePool(size = 10, logging_enabled=True)
database = instance.database(pool=pool, database_id=database_id, close_inactive_transactions=True)

transaction_time = []
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A global variable for transaction_time is mixing up the results of different kinds of transactions. I see you have 3 types here - executeSql, batch transaction and DML (insert). It would be good to profile what is the latency for each type of transaction so that we understand if any of them is regressing or not.

"SELECT 1 FROM Singers"
)

# for row in results:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: could you remove the commented out code?

# [END spanner_query_data]

# [START spanner_batch_transaction]
def batch_transaction(thread_id):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this function batch_transaction invoked? Similarly where is insert_with_dml invoked?

# [END insert_with_dml]

# Define the number of threads
num_threads = 20
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • At max only 10 threads will be used up since there are only 10 sessions. Post that 10 threads will always wait.
  • Currently you are running 1 query per thread, it will be good to increase the number of transactions per thread. For example, 2000 transactions in total. 20 threads will make it 200 transactions per thread. This will ensure the run executes for more time and does not finish in < 1 second. This will also allow you to remove sleep() of 10s.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the googleapis/python-spanner API. size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants