impl(pubsub): fix scopes #13027

Merged: 4 commits merged into googleapis:main from the stuff branch on Nov 3, 2023
Conversation

@alevenberg (Member) commented Nov 2, 2023

@alevenberg temporarily deployed to internal on November 2, 2023 20:07 with GitHub Actions.
The product-auto-label bot added the api: pubsub label (Issues related to the Pub/Sub API.) on Nov 2, 2023.
The codecov bot commented Nov 2, 2023:

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (3c6ad6f) 93.61% compared to head (f5dd4cf) 93.61%.
Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #13027   +/-   ##
=======================================
  Coverage   93.61%   93.61%           
=======================================
  Files        2082     2082           
  Lines      181972   181972           
=======================================
+ Hits       170360   170362    +2     
+ Misses      11612    11610    -2     
Files Coverage Δ
...gle/cloud/pubsub/internal/tracing_message_batch.cc 98.66% <100.00%> (ø)

... and 6 files with indirect coverage changes


@alevenberg marked this pull request as ready for review November 2, 2023 21:42
@alevenberg requested a review from a team as a code owner November 2, 2023 21:42
Three resolved review threads on google/cloud/pubsub/internal/tracing_message_batch.cc (outdated).
@dbolduc (Member) left a comment:

I still think it is worth exploring whether we can just wrap the BatchSink in a thing that traces.

We can add BatchSink::SaveMessage(pubsub::Message m) = 0, and we could do away with pubsub_internal::MessageBatch.

Then we can just compose the BatchSink::AsyncPublish() APIs. That is essentially what you are doing anyway with the Flush() callbacks.

This all seems more straightforward to me. WDYT?
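
For illustration, a minimal sketch of that wrapping idea against a simplified, hypothetical interface; the real pubsub_internal::BatchSink signatures differ (AsyncPublish is asynchronous, and SaveMessage does not exist yet):

    #include <memory>
    #include <string>
    #include <utility>

    #include "opentelemetry/trace/provider.h"
    #include "opentelemetry/trace/scope.h"

    // Hypothetical, simplified interface for illustration only; the real
    // pubsub_internal::BatchSink in google-cloud-cpp differs.
    class BatchSink {
     public:
      virtual ~BatchSink() = default;
      virtual void SaveMessage(std::string message) = 0;  // stand-in for pubsub::Message
      virtual void AsyncPublish() = 0;
    };

    // A decorator that wraps any BatchSink "in a thing that traces".
    class TracingBatchSink : public BatchSink {
     public:
      explicit TracingBatchSink(std::shared_ptr<BatchSink> child)
          : child_(std::move(child)) {}

      void SaveMessage(std::string message) override {
        // A real implementation would record a per-message span (or link) here.
        child_->SaveMessage(std::move(message));
      }

      void AsyncPublish() override {
        auto tracer = opentelemetry::trace::Provider::GetTracerProvider()
                          ->GetTracer("pubsub");
        auto span = tracer->StartSpan("BatchSink::AsyncPublish");
        // While the span is active, the RPC made by the wrapped sink
        // becomes its child.
        opentelemetry::trace::Scope scope(span);
        child_->AsyncPublish();
        span->End();
      }

     private:
      std::shared_ptr<BatchSink> child_;
    };

Composing sinks this way keeps the tracing logic in one decorator instead of spreading it across the batching code.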

Comment on lines 59 to 60:

    internal::DetachOTelContext(
        opentelemetry::context::RuntimeContext::GetCurrent());
@dbolduc (Member) commented:

Detaching here looks strange to me. It is either clever or a hack.

Can you say which context (i.e. which span name) we are detaching in the comment? I do not remember which one is supposed to be current.


So we have N messages, and on the Nth one the batch is full so we flush it. We hit this code with the "publisher send" span active. Is that right?

And the logic is that even though the code goes from message N to flushing the batch, we just don't think a parent-child relationship makes sense. It was all of the messages from 1-N that led to the batch flushing.


Alternatively, we might want to detach all of these "publisher send" spans earlier if no work is ever done on their behalf. (If work is done on behalf of "publisher send" spans after the flush... then we probably can't detach the "publisher send" context here.)

@alevenberg (Member, Author) commented Nov 3, 2023:

> Detaching here looks strange to me. It is either clever or a hack.

On second look, this is not needed. I removed it, and it generated the same spans.

> Can you say which context (i.e. which span name) we are detaching in the comment? I do not remember which one is supposed to be current.

It's detaching the "x publish" span created in the PublisherTracingConnection.

> So we have N messages, and on the Nth one the batch is full so we flush it. We hit this code with the "publisher send" span active. Is that right?

Yes.

> And the logic is that even though the code goes from message N to flushing the batch, we just don't think a parent-child relationship makes sense. It was all of the messages from 1-N that led to the batch flushing.

Yes, correct, since we are using links to associate them (see the sketch below).

> Alternatively, we might want to detach all of these "publisher send" spans earlier if no work is ever done on their behalf. (If work is done on behalf of "publisher send" spans after the flush... then we probably can't detach the "publisher send" context here.)

I think this is no longer relevant; see my comment below.
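
For illustration, a minimal sketch of the links-based association, using the opentelemetry-cpp API. The StartFlushSpan helper, the "pubsub" tracer name, and the span name are assumptions for this example, not the PR's actual code:

    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    #include "opentelemetry/nostd/shared_ptr.h"
    #include "opentelemetry/trace/provider.h"
    #include "opentelemetry/trace/span_context.h"

    // Hypothetical helper: start the batch-flush span with *links* to the
    // per-message "publisher send" spans, instead of a parent-child edge to
    // whichever span happens to be active when the batch fills up.
    opentelemetry::nostd::shared_ptr<opentelemetry::trace::Span> StartFlushSpan(
        std::vector<opentelemetry::trace::SpanContext> const& message_spans) {
      auto tracer = opentelemetry::trace::Provider::GetTracerProvider()
                        ->GetTracer("pubsub");

      // One link per buffered message; the links carry no attributes here.
      std::vector<std::pair<opentelemetry::trace::SpanContext,
                            std::map<std::string, std::string>>>
          links;
      links.reserve(message_spans.size());
      for (auto const& ctx : message_spans) {
        links.emplace_back(ctx, std::map<std::string, std::string>{});
      }

      // Empty attributes; the association comes entirely from the links.
      return tracer->StartSpan("BatchSink::AsyncPublish",
                               std::map<std::string, std::string>{}, links);
    }

Each message's span context would be captured when the message is added to the batch; the flush span then references all of them without needing the ambient context, which is consistent with the DetachOTelContext() call turning out to be unnecessary.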

@alevenberg (Member, Author) commented Nov 3, 2023:

Now, with the hack part removed, does this change sound good?

I think the alternative you described is possible:

Add a SaveMessage API to BatchSink, move the TracingMessageBatch logic into a TracingBatchSink, and wrap it. But I think that is 2-3 weeks of work for the same effect. I made an issue to track it (#13035).

The only thing I'm not sure about: the BatchSink holds the CompletionQueue, so we try to create just one. I think it might be a little ugly, but we want to do something like

    if (!key.empty()) {
      // Only wrap the sink if there is an ordering key.
      used_sink = pubsub_internal::TracingBatchSink::Create(
          pubsub_internal::SequentialBatchSink::Create(std::move(used_sink)));
    } else {
      used_sink =
          pubsub_internal::TracingBatchSink::Create(std::move(used_sink));
    }

in the publisher connection, since we only want to add the sequential batch sink when there is an ordering key.

@alevenberg (Member, Author) commented:

Also, this might be a change we want to make after there is an end-to-end integration test that verifies the spans (#13034).

@dbolduc (Member) commented Nov 3, 2023:

> I think it will be 2-3 weeks of work for the same effect.

1. I am glad you have learned the ways of conservative estimates. I don't think it would be too bad, though.
2. Sure, the effect would be the same, but the code to achieve that effect would be less complicated, and thus easier to maintain.

> I think it might be a little ugly

It can be less ugly:

        auto used_sink = sink;
        if (!key.empty()) {
          // Only wrap the sink if there is an ordering key.
          used_sink = pubsub_internal::SequentialBatchSink::Create(
              std::move(used_sink));
        }
        if (TracingEnabled(opts)) {
          used_sink = MakeTracingBatchSink(std::move(used_sink));
        }

@alevenberg (Member, Author) commented:

> 1. I am glad you have learned the ways of conservative estimates. I don't think it would be too bad, though.
> 2. Sure, the effect would be the same, but the code to achieve that effect would be less complicated, and thus easier to maintain.

I promise I will revisit this, but it will definitely take a while. All of these interfaces are internal.

I want to get this in so that the December release has the up-to-date Pub/Sub OTel changes.

> It can be less ugly.

Duh, that is better. Thanks.

@alevenberg (Member, Author) commented:

I messed this up somehow. I'm going to fix the branch by force-pushing, sorry!

@alevenberg (Member, Author) commented:

Unrelated failure: Step #0: error building image: error building stage: failed to save layer: read tcp 192.168.10.2:42200->173.194.216.207:443: read: connection reset by peer

@alevenberg merged commit dbe26c9 into googleapis:main on Nov 3, 2023
@alevenberg deleted the stuff branch November 3, 2023 19:51
@dbolduc (Member) left a comment:

Yeah, it would be nice if we could verify that BatchSink::AsyncPublish is in fact the parent of the RPC.

If the batch sinks were layered, we could just pass in a MockBatchSink and verify that there is an active span when we call AsyncPublish on the mock.

Oh and even the span name "BatchSink::AsyncPublish" is telling us that the thing making the span should be a BatchSink 😄
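
For illustration, a minimal sketch of such a test; the simplified BatchSink interface, the TracingBatchSink wrapper, and MockBatchSink are hypothetical stand-ins for the real internal types:

    #include <gmock/gmock.h>
    #include <gtest/gtest.h>

    #include <memory>
    #include <utility>

    #include "opentelemetry/exporters/memory/in_memory_span_exporter.h"
    #include "opentelemetry/sdk/trace/simple_processor.h"
    #include "opentelemetry/sdk/trace/tracer_provider.h"
    #include "opentelemetry/trace/provider.h"
    #include "opentelemetry/trace/scope.h"
    #include "opentelemetry/trace/tracer.h"

    namespace {

    // Hypothetical, simplified interface; the real pubsub_internal::BatchSink
    // signatures differ.
    class BatchSink {
     public:
      virtual ~BatchSink() = default;
      virtual void AsyncPublish() = 0;
    };

    class MockBatchSink : public BatchSink {
     public:
      MOCK_METHOD(void, AsyncPublish, (), (override));
    };

    // Minimal tracing wrapper, mirroring the decorator idea from the review.
    class TracingBatchSink : public BatchSink {
     public:
      explicit TracingBatchSink(std::shared_ptr<BatchSink> child)
          : child_(std::move(child)) {}

      void AsyncPublish() override {
        auto tracer = opentelemetry::trace::Provider::GetTracerProvider()
                          ->GetTracer("pubsub");
        auto span = tracer->StartSpan("BatchSink::AsyncPublish");
        opentelemetry::trace::Scope scope(span);  // make the span active
        child_->AsyncPublish();
        span->End();
      }

     private:
      std::shared_ptr<BatchSink> child_;
    };

    TEST(TracingBatchSinkTest, SpanIsActiveWhenRpcStarts) {
      // Install an SDK tracer provider; with only the no-op API the span
      // context checked below would never be valid.
      auto exporter = std::make_unique<
          opentelemetry::exporter::memory::InMemorySpanExporter>();
      auto processor =
          std::make_unique<opentelemetry::sdk::trace::SimpleSpanProcessor>(
              std::move(exporter));
      opentelemetry::nostd::shared_ptr<opentelemetry::trace::TracerProvider>
          provider(new opentelemetry::sdk::trace::TracerProvider(
              std::move(processor)));
      opentelemetry::trace::Provider::SetTracerProvider(provider);

      auto mock = std::make_shared<MockBatchSink>();
      EXPECT_CALL(*mock, AsyncPublish()).WillOnce([] {
        // With layered sinks, the wrapper's span is active here, so it will
        // be the parent of the RPC issued by the wrapped sink.
        auto span = opentelemetry::trace::Tracer::GetCurrentSpan();
        EXPECT_TRUE(span->GetContext().IsValid());
      });

      TracingBatchSink sink(mock);
      sink.AsyncPublish();
    }

    }  // namespace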

Labels: api: pubsub (Issues related to the Pub/Sub API.)

Development: successfully merging this pull request may close the issue "Group rpc with publish span".

3 participants