Stream simulation results in their own thread #1630

Merged
joshhaug merged 6 commits into develop from feature/streaming-thread
Feb 12, 2025

Conversation

joshhaug
Contributor

@joshhaug joshhaug commented Jan 28, 2025

Description

This should not be a breaking change.

This PR introduces threaded uploads of simulation results to the database. It is a follow-up to the earlier PR that introduced streaming of simulation results.

Implementation Details

  • Creates a new PostgresProfileQueryHandler class which abstracts out database query logic that was previously in PostgresProfileStreamer.
  • Due to some database design decisions, sim result segments must be inserted in order.
    • Note that PostgresProfileStreamer.accept() calls arrive in order.
      • However, if each upload ran in its own thread, the uploads could complete out of order.
    • To get around this, this PR creates a new PostgresQueryQueue, which enqueues uploads on an ExecutorService so that they are still inserted in order.
  • PostgresProfileStreamer creates a single instance of the query queue, which handles all uploads and is automatically closed.
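
For readers unfamiliar with the pattern, here is a minimal sketch of how a queue backed by an ExecutorService can keep uploads in order while not blocking the caller. It is not the code in this PR: the class name InOrderUploadQueue is hypothetical, and the use of a single-threaded executor is an assumption made here because it is the simplest way to get FIFO execution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a query queue; not the actual PostgresQueryQueue. */
final class InOrderUploadQueue implements AutoCloseable {
  // Assumption: one worker thread, so tasks run strictly in submission order.
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  // Only touched by the submitting thread; this sketch assumes a single submitter.
  private final List<Future<?>> pending = new ArrayList<>();

  /** Enqueue an upload and return immediately, so the caller is not blocked. */
  void submit(Runnable upload) {
    pending.add(executor.submit(upload));
  }

  /** Wait for every queued upload, rethrow the first failure, then shut down. */
  @Override
  public void close() throws Exception {
    try {
      for (Future<?> f : pending) {
        f.get();
      }
    } finally {
      executor.shutdown();
      executor.awaitTermination(1, TimeUnit.MINUTES);
    }
  }

  public static void main(String[] args) throws Exception {
    // try-with-resources closes the queue automatically after the last submit.
    try (InOrderUploadQueue queue = new InOrderUploadQueue()) {
      for (int i = 0; i < 3; i++) {
        final int segment = i;
        queue.submit(() -> System.out.println("uploaded segment " + segment));
      }
    }
  }
}
```

Because the executor has exactly one worker, a slow upload delays the ones behind it but can never be overtaken by them, which is the property the in-order insertion constraint needs.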

Verification

Initially I had hoped to profile against the state of the code prior to the original streaming PR, but some version mismatches in clipper made this difficult.

I instead used the feature--aerie-3.2.0 branch of clipper-aerie. I then extended Clipper's gnc-all-activities plan by repeating its content every day for five days. I'm happy to make this plan available to anyone who would like to see it or use it for future reference, but I doubt I can upload it here in this public venue. I ran profiling against this extended gnc-all-activities twice, once with high-overhead memory instrumentation and once with async sampling. I encountered errors when extracting mission model parameters, but this apparently did not impact simulation results, and the error message in the console was not helpful.

Here are the results from using async sampling:

| Branch Name | Sim Dur (m:ss) | Heap Mem | Non-Heap Mem |
| --- | --- | --- | --- |
| develop | 2:48 | 1.68 GB | 62.42 MB |
| feature/streaming-thread | 2:20 | 1.62 GB | 62.78 MB |

Statistics for these results:

  • 4% decrease in heap memory usage
  • 1% increase in non-heap memory usage
  • 16.7% decrease in sim duration (2:48 to 2:20, i.e. 168 s to 140 s)
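
As a quick check of the arithmetic, the following sketch recomputes the percent changes from the table values. The class and method names are made up for illustration. With these inputs it gives roughly -16.7% for sim duration, -3.6% for heap memory, and +0.6% for non-heap memory.

```java
/** Hypothetical helper for recomputing the statistics from the results table. */
final class ProfilingStats {
  /** Parse a duration written as "m:ss" into seconds. */
  static int toSeconds(String minSec) {
    String[] parts = minSec.split(":");
    return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
  }

  /** Signed percent change from baseline to candidate; negative means a decrease. */
  static double percentChange(double baseline, double candidate) {
    return (candidate - baseline) / baseline * 100.0;
  }

  public static void main(String[] args) {
    // Values copied from the develop and feature/streaming-thread rows.
    double duration = percentChange(toSeconds("2:48"), toSeconds("2:20"));
    double heap = percentChange(1.68, 1.62);
    double nonHeap = percentChange(62.42, 62.78);
    System.out.printf("sim duration: %.1f%%%n", duration);
    System.out.printf("heap memory: %.1f%%%n", heap);
    System.out.printf("non-heap memory: %.1f%%%n", nonHeap);
  }
}
```

Note that a 16.7% shorter run time is the same thing as a 1.2x throughput ratio (168 s / 140 s), so the figure depends on which of the two is being quoted.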

As expected, the primary gain here is in simulation duration, as we are no longer blocking simulation to upload results to the database. I'll attach some further sim results for anyone interested.

ProfilingResults.zip

Documentation

No documentation needs updating.

Future work

@joshhaug joshhaug requested a review from a team as a code owner January 28, 2025 19:47
@joshhaug joshhaug requested a review from Mythicaeda January 28, 2025 19:48
@Mythicaeda Mythicaeda removed the request for review from srschaffJPL January 30, 2025 19:57
@joshhaug joshhaug requested a review from mattdailis January 30, 2025 22:27
@joshhaug joshhaug changed the title from "Feature/streaming thread" to "Stream simulation results in their own thread" Feb 6, 2025
@joshhaug joshhaug merged commit a9dff7f into develop Feb 12, 2025
11 checks passed
@joshhaug joshhaug deleted the feature/streaming-thread branch February 12, 2025 22:23
Development

Successfully merging this pull request may close these issues.

Stream Simulation Results in their own thread
3 participants