Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding proper OpenTelemetry integration via. registration helpers and better context propagation #1528

Merged
merged 14 commits into from
Jun 4, 2024

Conversation

stebet
Copy link
Contributor

@stebet stebet commented Apr 2, 2024

Proposed Changes

  • Making it possible to override the default context propagation in RabbitMQActivitySource to make it possible to have proper OpenTelemetry Baggage propagation
  • Adding a separate OpenTelemetry integration package to make it easier to integrate with OTel and configure the RabbitMQActivitySource
  • Adding new tests for OTel integration package and context propagation

Types of Changes

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)

Checklist

  • I have read the CONTRIBUTING.md document
  • I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • All tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in related repositories

Further Comments

Would love to get some input from @lmolkova and @martinjt as well :)

…abbitMQActivitySource to make it possible to add OpenTelemetry Baggage propagation

* Adding a separate OpenTelemetry integration package to make it easier to integrate with OTel
* Adding OTel tests to test trace context and baggage propagation
@stebet
Copy link
Contributor Author

stebet commented Apr 2, 2024

@lukebakken and @michaelklishin This PR is not quite ready yet, want to do a little cleanup first to make sure everything works as intended :)

@lukebakken lukebakken marked this pull request as draft April 2, 2024 15:14
@lukebakken
Copy link
Contributor

@stebet we can leave this PR as a Draft until it's ready.

@stebet
Copy link
Contributor Author

stebet commented Apr 2, 2024

@lukebakken I'm thinking I might just move the OTel integration tests to the SequentialIntegration test project. They are pretty much doing a very similar thing to the ActivitySource integration tests and just referencing an extra project (not drastically different like the OAuth2 tests). Does that make sense to not convolute the test projects and pipelines?

public static TracerProviderBuilder AddRabbitMQ(this TracerProviderBuilder builder,
RabbitMQOpenTelemetryConfiguration configuration)
{
if (configuration.PropagateBaggage)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason not to allow user to provide a preconfigured propagator instance?
E.g. users could want to use B3 propagation or XRay AWS propagator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason bu my lack of knowledge :). Are you then referring to these as alternatives to the tracestate propagator? Are there examples for me to look at?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean you're forcing a specific propagator - a CompositeTextMapPropagator or TraceContextPropagator which is not necessary.

I believe you should just use the DefaultTextMapPropagator.
Users can configure it however they want to, check out https://github.com/open-telemetry/opentelemetry-dotnet/blob/9b246750ab76e94986fc433d5c7af046a137e477/src/OpenTelemetry.Extensions.Propagators/README.md?plain=1#L29

If I understand the intent correctly you want to users to be able to disable baggage propagation - any specific reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean you're forcing a specific propagator - a CompositeTextMapPropagator or TraceContextPropagator which is not necessary.

I believe you should just use the DefaultTextMapPropagator.

Users can configure it however they want to, check out https://github.com/open-telemetry/opentelemetry-dotnet/blob/9b246750ab76e94986fc433d5c7af046a137e477/src/OpenTelemetry.Extensions.Propagators/README.md?plain=1#L29

Ahh gotcha! Saw that the DefaultTextMapPropagator was a Noop so I thought this was the way. I'll change it :)

If I understand the intent correctly you want to users to be able to disable baggage propagation - any specific reason?

Nope. I'll remove it. Does it make sense to then have a composite propagator with the default textmap propagator and the baggage propagator or do users normally configure the baggage propagator themselves?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The otel sdk should set the default one to the composite.

Not completely sure what sets OpenTelemetry.Context.Propagation.Propagators.DefaultTextMapPropagator, but I see it's set after OTel SDK is configured:)

s_propagator = new TraceContextPropagator();
}

RabbitMQActivitySource.UseRoutingKeyAsOperationName = configuration.UseRoutingKeyAsOperationName;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you feel it's something we should document better /provide more clear guidance on in the messaging semconv?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good idea if people are running into high-cardinality issues with their messaging span names.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Operation the right name here? Since this is focused on OpenTelemetry, I'm wondering if just "SpanName" is a better wording.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTel config: UseRoutingKeyInSpanName
RabbitMQActivitySource property: UseRoutingKeyInActivityName
Make more sense?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good idea if people are running into high-cardinality issues with their messaging span names.

the span name cardinality requirements are soft (undefined) - open-telemetry/opentelemetry-specification#3534. Things like HTTP route we put there are like middlish-cardinality and I think the same is true about queue names/routing key.

so if cardinality is the only problem, I'd not make span name configurable (worst case users can rename spans with a processor). So again, no strong opinions

RabbitMQActivitySource.ContextExtractor = OpenTelemetryContextExtractor;
RabbitMQActivitySource.ContextInjector = OpenTelemetryContextInjector;

if (configuration.IncludeSubscribers)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any strong opinion here, but want to challenge the need for this flag (and IncludePublishers)
if users want more control, they can always call AddSource themselves.

I guess my broader question is when would users want to enable rabbitmq via an instrumentation library if all they really need to do is call
builder.AddSource("RabbitMQ.Client.*")

Copy link
Contributor Author

@stebet stebet Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any strong opinion here, but want to challenge the need for this flag (and IncludePublishers) if users want more control, they can always call AddSource themselves.

I guess my broader question is when would users want to enable rabbitmq via an instrumentation library if all they really need to do is call builder.AddSource("RabbitMQ.Client.*")

The main reason for the library was to simplify the OTel configuration and setup for users (with the extension method) and to work around the current issue that since OTel and Activity Baggage propagation is different we do not have to make the OpenTelemetry.Api package a dependency on the RabbitMQ library itself.

Also it takes care of configuring the custom propagators to use the OTel propagators instead of the default Activity propagators.

I also just changed the defaults in the latest commits to have both publishers and subscriber event sources enabled, so users can just call .AddRabbitMQInstrumentation() and get both publishers, subscribers and OTel propagation without the need configure anything else by themselves.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly not sure whether the separation of allowing people to turn on/off subscriber or publisher. Is that actually a usecase?

I can understand adding it here if that's a usecase seeing as this is for simplification. My question is, should you allow people to just do one or the other?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly not sure whether the separation of allowing people to turn on/off subscriber or publisher. Is that actually a usecase?

I can understand adding it here if that's a usecase seeing as this is for simplification. My question is, should you allow people to just do one or the other?

No idea, but I think it was you and @lmolkova that suggested I separated the ActivitySources in the earlier OTel pr: https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1261/files#r1392469382

I have no preference either way really

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having different names is more future-proof, but also it costs nothing since you can enable sources with wildcard.

I didn't try to push you to remove the instrumentation library, was just trying to understand what'd be the better case for it.

The Activity.Baggage problem is an interesting one. @martinjt do you know if there are any plans to make baggages work better together? I can create issues/ping people/etc - it'd be great if we could solve it for everyone without instrumentations trying to patch it in each case.

Copy link
Contributor Author

@stebet stebet Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't try to push you to remove the instrumentation library, was just trying to understand what'd be the better case for it.

No worries. I didn't take it as such :)

The Activity.Baggage problem is an interesting one. @martinjt do you know if there are any plans to make baggages work better together? I can create issues/ping people/etc - it'd be great if we could solve it for everyone without instrumentations trying to patch it in each case.

Here's an old issue that relates to it (CC @cijothomas): open-telemetry/opentelemetry-dotnet#1842

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Activity baggage vs OTel Baggage is still an unresolved problem! It'll eventually be fixed, though we don't have specifics or timeline. (its not likely coming in this year, there is nothing tracking it in .NET 9 release timeframe.)

@danielmarbach
Copy link
Collaborator

@stebet thanks for the ping. Currently on vacation and after that pretty occupied with some other things. I'm pinging @lailabougria for a review after her vacation. She has lots of knowledge in this area and it quite likely the better reviewer for this type of work anyway

stebet added 3 commits April 3, 2024 09:02
* Use the DefaultTextMapPropagator and make it possible for uses to configure their own.
* Updated tests to configure the propagator.
* Updating README.md with basic instructions for use
@stebet stebet requested a review from lmolkova April 3, 2024 09:13
@stebet
Copy link
Contributor Author

stebet commented Apr 3, 2024

@lukebakken I think this PR is ready for a proper review now.

Thanks for the feedback @lmolkova and @martinjt, feel free to provide additional comments and feedback if you have time to spare :)

CCing @davidfowl and @JamesNK in case they wants to take a look as well from an Aspire perspective.

@stebet stebet marked this pull request as ready for review April 3, 2024 09:17
@stebet
Copy link
Contributor Author

stebet commented Apr 3, 2024

This PR should also help with #1478

@stebet
Copy link
Contributor Author

stebet commented Apr 4, 2024

Think I've resolved most of the comments @lukebakken :)

@lailabougria
Copy link

Thanks, @danielmarbach, for the ping!

Regarding @martinjt's comment:

I'm honestly not sure whether the separation of allowing people to turn on/off subscriber or publisher. Is that actually a usecase?

I believe splitting the ActivitySources is ok even though it may not really be necessary eventually; having them split does provide more flexibility and possibly avoids breaking changes down the road.

However, what stood out to me is that the user can't easily benefit from those separate ActivitySources with the proposed instrumentation API, as there's no way to configure them separately and the API adds both sources by default. Assuming users will use the instrumentation API AddRabbitMQInstrumentation, instead of AddSource, they can't opt-in to individual activity sources that way, so they'd have to purposefully use AddSource instead of the instrumentation API, which also means the other code in that method won't be invoked (meaning the baggage propagation issue is still at play, and the propagators aren't set automatically, as @stebet explained in this comment).

That may be something to consider in the design of the instrumentation APIs if the decision is to keep the ActivitySources split.

Comment on lines +16 to +17
RabbitMQActivitySource.ContextExtractor = OpenTelemetryContextExtractor;
RabbitMQActivitySource.ContextInjector = OpenTelemetryContextInjector;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need this part? Can it be set by default in main package?

I would expect that this method will only call builder.AddSource("RabbitMQ.Client.*") as a convenient way to register it in TracerProviderBuilder. Other parts should be done in the instrumentation library.

Alternative options is to have here the second method:

public static TracerProviderBuilder AddRabbitMQInstrumentation(this TracerProviderBuilder builder, Action<RabbitMQSpecificOptions>? configure);

where you can configure other methods externally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenTelemetryContextInjector is OTel specific for Baggage handling, that's pretty much the only reason for the package, because we don't want to pull OpenTelemetry.API dependencies into the main library.

@stebet
Copy link
Contributor Author

stebet commented Apr 18, 2024

Thanks, @danielmarbach, for the ping!

Regarding @martinjt's comment:

I'm honestly not sure whether the separation of allowing people to turn on/off subscriber or publisher. Is that actually a usecase?

I believe splitting the ActivitySources is ok even though it may not really be necessary eventually; having them split does provide more flexibility and possibly avoids breaking changes down the road.

However, what stood out to me is that the user can't easily benefit from those separate ActivitySources with the proposed instrumentation API, as there's no way to configure them separately and the API adds both sources by default. Assuming users will use the instrumentation API AddRabbitMQInstrumentation, instead of AddSource, they can't opt-in to individual activity sources that way, so they'd have to purposefully use AddSource instead of the instrumentation API, which also means the other code in that method won't be invoked (meaning the baggage propagation issue is still at play, and the propagators aren't set automatically, as @stebet explained in this comment).

That may be something to consider in the design of the instrumentation APIs if the decision is to keep the ActivitySources split.

Thanks for the input @lailabougria! :)

We discussed this exact thing here: #1528 (comment)

This PR actually had that configuration at the start but was removed after the discussion.

The consensus seemed to be that it didn't really make sense from the end-user perspective. I'd love more knowledgeable people (from an end-users view) to chime in so we can finish this. Back to you @lmolkova, @cijothomas and @martinjt :)

@lukebakken
Copy link
Contributor

@stebet I merged main into your branch to resolve conflicts, but I had to make this change, which I'm guessing may not be correct:

https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1528/files#diff-f6d8753c982a8409794d13d5a9263aea70541146d5f4255c86e4e5150f6b9a1dR1072-R1075

Merging #1233 will probably address the above, however.

@lukebakken
Copy link
Contributor

Just FYI for everyone who has commented and contributed here, I would like to release version 7 this month (May 2024).

@lukebakken lukebakken requested a review from WeihanLi June 3, 2024 17:01
@lukebakken
Copy link
Contributor

Hi everyone -

In the interest of shipping v7 of this library this month, I am going to merge this PR as-is today or tomorrow. I am going to release RC1 of version 7 by the end-of-day tomorrow, June 4th, 2024.

I am assuming the work here is complete. As I've said in other OpenTelemetry-related PRs, I'm not very familiar with OTel and rely on others' expertise.

@lukebakken lukebakken self-assigned this Jun 3, 2024
@lukebakken lukebakken added this to the 7.0.0 milestone Jun 3, 2024
Copy link
Contributor

@lukebakken lukebakken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @stebet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants