Avoid redundant shutdown in TracerProvider::drop when already shut down #2197
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@         Coverage Diff          @@
##          main   #2197   +/-   ##
=====================================
  Coverage  79.3%   79.4%
=====================================
  Files       121     121
  Lines     20944   21047   +103
=====================================
+ Hits      16612   16712   +100
- Misses     4332    4335     +3
Let's make it similar to #2195.
…TracerProviderInner::Shutdown
…tb/opentelemetry-rust into tracer-provider-drop-shutdown-check
Done.
drop(provider3);

// Verify shutdown was called exactly once
assert!(assert_handle.0.is_shutdown.load(Ordering::SeqCst));
How does this verify that shutdown was called only once? It looks like it only verifies that shutdown was called (it could have been called once or multiple times).
Good point. I think I should be using CountingShutdownProcessor which was added in #2195.
I have changed this to use CountingShutdownProcessor now.
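To illustrate the point raised in this thread, here is a minimal, std-only sketch of why a boolean flag cannot prove "exactly once" while a counter can. `RecordingProcessor` and `observe_after_shutdowns` are hypothetical names made up for this example; they are not the `CountingShutdownProcessor` from #2195, whose actual definition may differ.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

/// Hypothetical stand-in for a span processor that records shutdown calls.
#[derive(Debug, Default)]
struct RecordingProcessor {
    // A flag only says that shutdown happened at least once.
    is_shutdown: AtomicBool,
    // A counter says exactly how many times it happened.
    shutdown_count: AtomicU32,
}

impl RecordingProcessor {
    fn shutdown(&self) {
        self.is_shutdown.store(true, Ordering::SeqCst);
        self.shutdown_count.fetch_add(1, Ordering::SeqCst);
    }
}

/// Calls `shutdown` the given number of times and returns the
/// (flag, count) pair observed afterwards.
fn observe_after_shutdowns(times: u32) -> (bool, u32) {
    let processor = RecordingProcessor::default();
    for _ in 0..times {
        processor.shutdown();
    }
    (
        processor.is_shutdown.load(Ordering::SeqCst),
        processor.shutdown_count.load(Ordering::SeqCst),
    )
}
```

After one call and after three calls the flag reads `true` in both cases, so only the counter distinguishes a redundant shutdown from a single one.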
opentelemetry/src/trace/mod.rs
Outdated
@@ -200,6 +200,10 @@ pub enum TraceError {
    #[error("Exporting timed out after {} seconds", .0.as_secs())]
    ExportTimedOut(time::Duration),

    /// already shutdown error
    #[error("{0} already shutdown")]
    AlreadyShutdown(String),
Do we expect to use this variant for anything other than TracerProvider?
Yes, I had thought of using it for the processors and exporters too, but I believe we can customize it later if required. For now, I have made it static for TracerProvider.
Co-authored-by: Utkarsh Umesan Pillai <[email protected]>
///
/// ## Cloning and Shutdown
///
/// The `TracerProvider` is designed to be lightweight and clonable. Cloning a `TracerProvider`
I don't think TracerProvider is lightweight. It is pretty heavy, and we expect users to create it only once. It is correct to say that cloning is cheap, since it just creates a new reference.
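The distinction drawn here (an expensive object, but cheap clones) can be sketched with a plain `Arc`. This is only an illustration under assumed names: `Provider`, `ProviderInner`, and the helper methods are hypothetical and do not reflect the SDK's real fields.

```rust
use std::sync::Arc;

/// Hypothetical heavy state: processors, config, resource, and so on.
#[derive(Debug)]
struct ProviderInner {
    processors: Vec<String>,
}

/// Hypothetical provider handle. Cloning copies the `Arc` pointer and
/// bumps a reference count; the inner state is never duplicated.
#[derive(Debug, Clone)]
struct Provider {
    inner: Arc<ProviderInner>,
}

impl Provider {
    /// Building the inner state is the expensive step, done once.
    fn new(processors: Vec<String>) -> Self {
        Provider {
            inner: Arc::new(ProviderInner { processors }),
        }
    }

    /// Number of handles currently sharing the same inner state.
    fn handle_count(&self) -> usize {
        Arc::strong_count(&self.inner)
    }

    /// True when both handles point at the very same allocation.
    fn shares_state_with(&self, other: &Provider) -> bool {
        Arc::ptr_eq(&self.inner, &other.inner)
    }

    /// Reads through the shared pointer to the single inner state.
    fn processor_count(&self) -> usize {
        self.inner.processors.len()
    }
}
```

Under this model, calling `new` repeatedly would rebuild the heavy state each time, whereas `clone` only adds another handle to the existing one.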
///
/// The `TracerProvider` manages the lifecycle of span processors, which are responsible for
/// collecting, processing, and exporting spans. To ensure all spans are processed before shutdown,
/// users can call the [`force_flush`](TracerProvider::force_flush) method at any time to trigger
Let's remove the force_flush mention here. I have seen many users calling force_flush in their code (and blocking their threads). Not sure why, but let's make sure the official docs don't recommend it.
I have reworded it so that it doesn't read as a recommendation. I think it's better to at least document it, since we provide it.
/// ## Span Processing and Force Flush
///
/// The `TracerProvider` manages the lifecycle of span processors, which are responsible for
/// collecting, processing, and exporting spans. The [`force_flush`](TracerProvider::force_flush) method
/// invoked at any time will trigger an immediate flush of all pending spans (if any) to the exporters.
/// This will block the user thread till all the spans are passed to exporters
/// `TracerProvider` is lightweight container holding pointers to `SpanProcessor` and other components.
/// Cloning and dropping them will not stop the span processing. To stop span processing, users
/// must either call `shutdown` method explicitly, or drop every clone of `TracerProvider`.
/// `TracerProvider` is a lightweight container holding pointers to `SpanProcessor` and other components.
Not introduced in this PR, but advertising TracerProvider as lightweight is incorrect, and can lead to users repeatedly creating them instead of doing it once.
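The behavior this PR is after (stop processing either through an explicit `shutdown` or by dropping every clone, without shutting processors down twice) can be sketched roughly as follows. This is a simplified, std-only model under stated assumptions, not the SDK's implementation: `Provider`, `ProviderInner`, `CountingProcessor`, and the `&'static str` error are all hypothetical stand-ins.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;

/// Hypothetical processor that counts how often it is shut down.
#[derive(Debug)]
struct CountingProcessor {
    shutdown_count: Arc<AtomicU32>,
}

impl CountingProcessor {
    fn shutdown(&self) {
        self.shutdown_count.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical shared state behind every clone of the provider.
#[derive(Debug)]
struct ProviderInner {
    processor: CountingProcessor,
    is_shutdown: AtomicBool,
}

impl Drop for ProviderInner {
    fn drop(&mut self) {
        // Runs once, when the last clone goes away. Skip the processor
        // shutdown if an explicit `shutdown` call already did the work.
        if !self.is_shutdown.load(Ordering::SeqCst) {
            self.processor.shutdown();
        }
    }
}

#[derive(Debug, Clone)]
struct Provider {
    inner: Arc<ProviderInner>,
}

impl Provider {
    fn new(shutdown_count: Arc<AtomicU32>) -> Self {
        Provider {
            inner: Arc::new(ProviderInner {
                processor: CountingProcessor { shutdown_count },
                is_shutdown: AtomicBool::new(false),
            }),
        }
    }

    /// The first caller flips the flag and shuts the processor down;
    /// later callers (on any clone) get an error and do nothing.
    fn shutdown(&self) -> Result<(), &'static str> {
        let first_caller = self
            .inner
            .is_shutdown
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok();
        if first_caller {
            self.inner.processor.shutdown();
            Ok(())
        } else {
            Err("TracerProvider already shutdown")
        }
    }
}

/// Explicit shutdown, a second shutdown on a clone, then dropping every
/// handle: returns how many times the processor was shut down in total.
fn count_after_explicit_shutdown_and_drop() -> u32 {
    let count = Arc::new(AtomicU32::new(0));
    let provider = Provider::new(Arc::clone(&count));
    let clone = provider.clone();
    assert!(provider.shutdown().is_ok());
    assert!(clone.shutdown().is_err());
    drop(provider);
    drop(clone);
    count.load(Ordering::SeqCst)
}
```

In this sketch, dropping a single clone while others are alive does nothing, because `Drop` is implemented on the shared inner state rather than on the handle.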
#[derive(Debug)]
struct CountingShutdownProcessor {
    shutdown_count: Arc<Mutex<i32>>,
nit: Atomics may be easier here.
Done
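For context on the nit above, here is a small side-by-side sketch of the two counter styles. Both functions are hypothetical helpers written for this comparison; they spawn a few threads that each increment a shared counter once.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Counter behind a mutex: every access has to lock and handle poisoning.
fn count_with_mutex(threads: u32) -> i32 {
    let counter = Arc::new(Mutex::new(0_i32));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().expect("counter mutex poisoned") += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    // Bind the value first so the lock guard is released before returning.
    let total = *counter.lock().expect("counter mutex poisoned");
    total
}

/// Atomic counter: a single `fetch_add`, no lock guard and no poisoning.
fn count_with_atomic(threads: u32) -> u32 {
    let counter = Arc::new(AtomicU32::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                counter.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.load(Ordering::SeqCst)
}
```

For a plain integer counter in a test helper, the atomic version needs less ceremony at each call site, which is presumably what the suggestion was getting at.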
…tb/opentelemetry-rust into tracer-provider-drop-shutdown-check
@@ -200,6 +200,10 @@ pub enum TraceError {
    #[error("Exporting timed out after {} seconds", .0.as_secs())]
    ExportTimedOut(time::Duration),

    /// already shutdown error
    #[error("TracerProvider already shutdown")]
    AlreadyShutdown,
Should we rename this to be more specific?
-    AlreadyShutdown,
+    TracerProviderAlreadyShutdown,
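As a rough picture of what the renamed variant would look like to callers, the sketch below hand-writes the `Display` output using only the standard library. The diff above relies on an `#[error(...)]` derive attribute instead; `SketchTraceError` is a hypothetical enum for illustration, and only the two message strings are taken from the diff.

```rust
use std::error::Error;
use std::fmt;
use std::time::Duration;

/// Hypothetical, trimmed-down error enum mirroring two variants from the diff.
#[derive(Debug, PartialEq)]
enum SketchTraceError {
    ExportTimedOut(Duration),
    // The more specific name suggested in the review.
    TracerProviderAlreadyShutdown,
}

impl fmt::Display for SketchTraceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchTraceError::ExportTimedOut(timeout) => {
                write!(f, "Exporting timed out after {} seconds", timeout.as_secs())
            }
            SketchTraceError::TracerProviderAlreadyShutdown => {
                write!(f, "TracerProvider already shutdown")
            }
        }
    }
}

impl Error for SketchTraceError {}
```

With a unit variant, callers can match on the error directly instead of inspecting a string payload, and the variant name itself says which component was already shut down.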
// TracerProvider2 should observe the shutdown state but not trigger another shutdown
let shutdown_result2 = tracer_provider2.shutdown();
assert!(shutdown_result2.is_err());
Add assert_eq!(shutdown_count.load(Ordering::SeqCst), 1); here as well to ensure that explicitly calling shutdown on an already shutdown TracerProvider doesn't call shutdown again.
#[derive(Debug)]
struct CountingShutdownProcessor {
    shutdown_count: Arc<AtomicU32>,
    flush_called: Arc<AtomicBool>,
Would you be adding tests to check flush_called later?
}

// Verify that shutdown was only called once, even after drop
assert_eq!(shutdown_count.load(Ordering::SeqCst), 1);
Also verify that force_flush was not called, similar to the previous test.
@@ -36,36 +90,60 @@ static NOOP_TRACER_PROVIDER: Lazy<TracerProvider> = Lazy::new(|| TracerProvider
        span_limits: SpanLimits::default(),
        resource: Cow::Owned(Resource::empty()),
    },
    is_shutdown: AtomicBool::new(true),
Not added in this PR, but why have we initialized the no-op tracer provider as an already shut down provider?
I know this wouldn't make much difference in functionality, but semantically it would be odd to call shutdown on the global provider and get an error saying it has already been shut down.
Changes
Changes similar to #2195, applied to TracerProvider.
Merge requirement checklist
CHANGELOG.md files updated for non-trivial, user-facing changes