Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUMM-1609 Use NTP server offset #607

Merged
merged 3 commits into from
Sep 27, 2021
Merged

Conversation

maxep
Copy link
Member

@maxep maxep commented Sep 24, 2021

What and why?

The Kronos Clock.now property is not thread safe and should always be invoked from the main thread, this change would have a large impact on our codebase. As an alternative, we can rely on the server time offset we get from Clock.sync: The Clock.now is in fact used by DateCorrector to actually compute the server offset.

Below a quick benchmark of the computed offset previously used and the server offset applied by this PR:

Previous Computed Offset Server Offset
0.053711891174316406 0.05375981330871582
0.04763531684875488 0.04767334461212158
0.04572319984436035 0.045766234397888184
0.08127260208129883 0.08131551742553711
0.05130136013031006 0.051299452781677246
0.03885924816131592 0.038874030113220215

Note that Kronos.Clock.now has a sub-second precision which is reflected in the above tests.

Fix #588

How?

The ServerDateProvider will now provide an offset time interval and its concrete implementation NTPServerDateProvider will retrieve that value from Kronos.Clock.sync.

Clock.sync callback is invoked on the main thread, so we are now using a ValuePublisher to safely access the server time offset.

⚠️ Clock.sync only notifies the first and last samples, so we are losing the intermediate offsets!

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference

@maxep maxep requested a review from a team as a code owner September 24, 2021 08:35
@maxep maxep self-assigned this Sep 24, 2021
@maxep maxep requested a review from ncreated September 24, 2021 08:35
Copy link
Member

@ncreated ncreated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clock.sync only notifies the first and last samples, so we are losing the intermediate offsets!

Is it true that previously (with using Clock.now) we were receiving some initial approximation even before Clock.sync's first {} callback?

I'm wondering if we can cover this gap by using Clock.now right after Clock.sync:

func synchronize(with pool: ...) {
   Clock.sync(
      // ...
      first: { publisher.publishAsync(...) } // first approximation
      completion: { publisher.publishAsync(...) } // stable result
   )

   publisher.publishAsync(Clock.now - currentDeviceTime()) // ← this
}

Kronos seems to integrate with UserDefaults and stores the last-known NTP time in there. With this change we should read it back as soon as possible. This will be thread safe and will be as precise as before, no?

Sources/Datadog/Core/System/Time/ServerDateProvider.swift Outdated Show resolved Hide resolved
Sources/Datadog/Core/System/Time/ServerDateProvider.swift Outdated Show resolved Hide resolved
if let offset = offset {
self.publisher.publishAsync(offset)
} else if let now = now {
self.publisher.publishAsync(now.timeIntervalSinceNow)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to get now but not offset, or this is just a sanity fallback? In any case, it would be good to have inline comment on this, otherwise it seems unclear.

Copy link
Member Author

@maxep maxep Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my comment above is misplaced. This is indeed a fallback in case the last sample returns a nil offset.
Th now parameter is Clock.now, so it is possible to have now but not offset.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated my comment, LMKWYT!

@maxep
Copy link
Member Author

maxep commented Sep 27, 2021

Yes, Kronos first loads the previous offset from the UserDefaults. Their interface is not public, but we can indeed do this workaround 👍 It is thread-safe since they load from the user defaults on the current thread, but they could change that behavior any time.

@maxep maxep requested a review from ncreated September 27, 2021 12:55
@maxep maxep force-pushed the maxep/RUMM-1609/use-ntp-offset branch from d3d737a to 5eae9a8 Compare September 27, 2021 13:05
@ncreated
Copy link
Member

Yes, Kronos first loads the previous offset from the UserDefaults. Their interface is not public, but we can indeed do this workaround 👍 It is thread-safe since they load from the user defaults on the current thread, but they could change that behavior any time.

It's IMO fine 👍. Kronos now requires using Clock.sync and Clock.now on the same thread (which we do now 🚀), and it's hard to to imagine this changing to "must be used on different threads".

@maxep
Copy link
Member Author

maxep commented Sep 27, 2021

Kronos now requires using Clock.sync and Clock.now on the same thread

Actually, their requirement should be to 'invoke Clock.now on the main thread'. If, at some point, they decide to dispatch loadFromDefaults on the main thread, then we would still risk a data race.

@maxep maxep merged commit b3ff7ca into master Sep 27, 2021
@maxep maxep deleted the maxep/RUMM-1609/use-ntp-offset branch September 27, 2021 15:33
@ncreated ncreated mentioned this pull request Sep 28, 2021
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ThreadSanitizer data race
2 participants