
AV sync observation #95

Open
yanj-github opened this issue Nov 25, 2024 · 10 comments
Assignees
Labels
Released: Released and ready for check

Comments

@yanj-github
Collaborator

yanj-github commented Nov 25, 2024

Only 2 AV sync tests passed in the last run; all the others FAILED.
The AV sync tests require investigation.
Here is a summary of what I have found:

AV sync result from Berlin Plugfest:

[image]

I’ve enhanced the current OF algorithm for comparing matched audio and video pairs and updated the result message to display the percentage of pairs that PASS, along with the minimum and maximum audio-to-video differences.
Below are the updated results with this improvement; we now have 3 PASSES.

Improved OF algorithm:

[image]

Here is the summary including the percentage of matching audio and video pairs that PASS and the audio-to-video min/max differences:

[image]

I’m assuming that Fraunhofer used the same camera for all test recordings.
The observed difference between audio and video indicates a synchronization failure in both directions on some of the devices.
Upon further debugging, I discovered that for some of the devices the AV sync issue only occurs very briefly, after which the playback remains in-sync for the rest of the duration.
Currently, the AV sync is checked for each audio segment throughout the entire playback. The test fails if even one segment is out of sync, which makes it extremely challenging for devices to pass.
Therefore, I propose adding a tolerance threshold to the pass/fail criteria. For instance, a brief out-of-sync duration (e.g., a couple of seconds) may not be perceptible to viewers and could be considered acceptable.

Q1: Can we add a pass rate to determine PASS/FAIL? If we allow a threshold of 80%(?) or above to pass, this adjustment will make the pass/fail criteria more practical and achievable.
This accounts for minor discrepancies that are not perceptible to users while maintaining an acceptable standard for performance.
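As a sketch, the pass-rate criterion from Q1 could look like this (the function name, data shape, and structure are illustrative, not the actual OF code; the 80% default is the threshold proposed above):

```python
def av_sync_verdict(segment_results, pass_rate_threshold=0.8):
    """Return "PASS" when the fraction of in-sync A/V pairs meets the threshold.

    segment_results: list of booleans, one per matched audio/video pair,
    True when that pair's offset is within tolerance.
    """
    if not segment_results:
        # No matched pairs at all cannot be considered a pass.
        return "FAIL"
    pass_rate = sum(segment_results) / len(segment_results)
    return "PASS" if pass_rate >= pass_rate_threshold else "FAIL"
```

With this rule, a recording where 8 of 10 matched pairs are in sync would pass at the 80% threshold, while 7 of 10 would fail.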

If we add an 80% pass rate, the results now look like this:

[image]

Additionally, we can discuss the AV sync tolerance, which is currently set at [40ms, -120ms] compared to the ITU standard [45ms, -125ms].
The current setting aligns with a measurement resolution of 20ms.
Audio is being captured with millisecond-level accuracy. However, due to limitations in the capturing process, video frames might not achieve the same level of precision in millisecond accuracy.
With a 120fps camera, there will be an offset of approximately 8.33ms at either end of the capture. We need to consider this limitation.
Q2: Can we increase the tolerance, e.g. to [60ms, -140ms]?
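A minimal sketch of the asymmetric tolerance check, assuming positive offsets mean audio leads video (the sign convention and names are my own, not OF's; the defaults reflect the current [40ms, -120ms] window):

```python
def within_tolerance(av_offset_ms, early_limit_ms=40, late_limit_ms=-120):
    """True when the audio-to-video offset falls inside the asymmetric window.

    Positive offsets mean audio leads video; negative means audio lags.
    Widening the window to [60ms, -140ms] would absorb the ~8.33ms
    per-frame capture jitter of a 120fps camera at both ends.
    """
    return late_limit_ms <= av_offset_ms <= early_limit_ms
```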

Note, the measured offsets assume that the camera can record audio and video perfectly in-sync, with no camera offset factored into these measurements.
Q3: Should we allow users to configure a camera recording offset (up to a threshold) when running the OF process?
We must carefully consider the risks, such as:
• Potential misuse: users might misuse this feature to obtain a pass.
• Measurement inaccuracies: the recording offset is hard to measure accurately; incorrect offset values could lead to false positives or negatives.
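One way to mitigate the misuse risk would be to clamp the user-supplied offset to a cap before applying it. This is a hypothetical sketch; the cap value and names are invented for illustration and are not part of OF:

```python
MAX_CAMERA_OFFSET_MS = 50  # illustrative cap to limit how far results can be shifted

def apply_camera_offset(av_offset_ms, camera_offset_ms):
    """Subtract a user-measured camera recording offset, clamped to a cap.

    Clamping limits how much a user can shift results toward a pass;
    it cannot, of course, catch an inaccurately measured offset on its own.
    """
    clamped = max(-MAX_CAMERA_OFFSET_MS, min(MAX_CAMERA_OFFSET_MS, camera_offset_ms))
    return av_offset_ms - clamped
```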

@yanj-github yanj-github self-assigned this Nov 25, 2024
@jpiesing

In principle I support including all 3 of these assuming we are confident they would make a difference.
How does point 3 relate to the calibration process for cameras defined in https://github.com/cta-wave/device-observation-framework/wiki/How-to-test-your-camera-records-in%E2%80%90sync?

@yanj-github
Collaborator Author

> In principle I support including all 3 of these assuming we are confident they would make a difference. How does point 3 relate to the calibration process for cameras defined in https://github.com/cta-wave/device-observation-framework/wiki/How-to-test-your-camera-records-in%E2%80%90sync?

Point 1:
We can apply this immediately by setting the default tolerance configuration to 80% in OF. If needed, we can later define the tolerance in TR.

Point 2:
This is a simple configuration change that doesn't require code modifications. However, it does require an update to the specifications. I suggest delaying this until after we implement the changes from Point 3 to see if it’s still necessary.

Point 3:
This is more complex, as we need a reliable way to accurately measure the recording offset. You are correct about the relevance of the calibration document; this will require more thought, and I need to think it through further.

@yanj-github yanj-github added the In progress label Nov 27, 2024
@yanj-github
Collaborator Author

Updates on Point 3:
I am going to automate the existing manual calibration process; when a calibration recording is provided for an OF run, the offset will be applied to the observations automatically.

@wschidol

wschidol commented Dec 9, 2024

I agree that we should factor the imperfections of the measurement device (i.e. the camera) out of the measurement as far as possible. Therefore,

  • I agree with point 3 if it can be made foolproof against accidental mistuning. It sounds like @yanj-github has found a good way to do that.

  • In the same vein, since the 120 Hz sampling of the camera adds jitter of +/-8.3 ms to the result, one could think about increasing the tolerance by that amount. That said, it may not actually do much to the pass/fail rate so is likely not worth the churn in specs. Plus, this tolerance would have to adapt to the actual camera and actual framerate used. This would start to get tricky, and so I'd leave that as a last resort.

  • I am somewhat concerned about a global threshold of 20% of frames that can be out of sync. This feels like a much too high number.
    As discussed, it is quite likely that at the beginning and the end of the test clip, imperfections in start-up and shutdown of streams could lead to failing measurements. These imperfections would hopefully already be flagged up by other tests and don't need to lead to failing A/V sync. My suggestion would be to disregard A/V sync failures at start and end of stream (like, maybe the first and last 5 frames or so) and to then ask for a much higher pass rate in the main body of content (say, 95%).

yanj-github added a commit to yanj-github/device-observation-framework that referenced this issue Dec 9, 2024
rcottingham added a commit that referenced this issue Dec 10, 2024
@yanj-github
Collaborator Author

As agreed at Meeting 2024-12-10 on @wschidol's comment:
I will add tolerances at the beginning and the end, measured in milliseconds, default 100ms. The pass rate percentage, default 95%, will be calculated for the middle range, excluding the tolerances at both ends.
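The agreed behaviour could be sketched like this, assuming per-sample sync results tagged with their media time (names and the data shape are illustrative; the 100ms and 95% defaults mirror the values agreed above):

```python
def middle_range_verdict(samples, total_ms, edge_ms=100, pass_rate=0.95):
    """PASS/FAIL over samples of (media_time_ms, in_sync_bool) tuples.

    Samples within edge_ms of either end of the track are exempt from
    checking; the pass rate is computed only over the middle range.
    """
    body = [ok for t, ok in samples if edge_ms <= t <= total_ms - edge_ms]
    if not body:
        return "FAIL"
    return "PASS" if sum(body) / len(body) >= pass_rate else "FAIL"
```

For a 30-second track sampled once per second, one out-of-sync sample in the middle range still passes at 95%, while two fail.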

@yanj-github
Collaborator Author

yanj-github commented Dec 19, 2024

@wschidol I have made the default pass rate 95% and the tolerances at the beginning and end, measured in milliseconds, default 1000ms (i.e., 1 second); judging from the captured data, 100ms was too small.
Based on these values, OF will ignore AV sync within 1 second at both ends. Please see the results comparison below.
[image]

@wschidol

@yanj-github Glad to see that we can tighten the pass rate to 95% for the bulk of the signal.

Just to clarify... do you mean that

  1. the first and last 1000 ms of the signal are exempt from checking, or
  2. that A/V sync is allowed to be off by one second at the start and end of the signal?

And, if it is 2., for how long at either end can A/V sync be off by up to 1 sec?

@yanj-github
Collaborator Author

I meant that the first and last 1000 ms of the signal are exempt from checking, so we ignore the first and last second of the audio track. Please note that the 1000ms exemption is based on the expected duration, not the presented duration.
For example, for a 30-second audio track we only check AV sync from 1 second to 29 seconds.

@jpiesing

> I meant the first and last 1000 ms of the signal are exempt from checking. So we ignore 1st and last 1 seconds audio track. Please note that 1000ms off is based on expected duration not presented duration. For example for 30 seconds audio track we only check AV sync from 1 seconds to 29 seconds.

Just to be clear, if the first 40ms and last 80ms of the 'real' audio data are missing, what would be exempt from a/v sync testing? Would it be 960ms of 'real' data exempt at the start and 920ms exempt at the end?

Whatever it is, it should be documented somewhere, if only in the config file where 1000ms is defined.

@yanj-github
Collaborator Author

@jpiesing Sure, it will be documented on the OF end. The 1000ms is completely independent of the missing data; AV sync is only checked where the audio media time is greater than the 1000ms tolerance.
So yes, if the first 40ms of audio data are missing, 960ms of 'real' data are exempt at the start.
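In other words, the exemption is keyed purely to the expected media time, not to where recorded data actually starts. A hypothetical sketch of that rule (names are mine, not the OF config's; the 1000ms default matches the value discussed above):

```python
EDGE_TOLERANCE_MS = 1000  # default edge exemption discussed in this thread

def is_checked(audio_media_time_ms, expected_duration_ms):
    """A sample is checked only when its media time lies more than
    EDGE_TOLERANCE_MS from either end of the *expected* duration,
    regardless of any missing 'real' data at the edges."""
    return (EDGE_TOLERANCE_MS
            < audio_media_time_ms
            < expected_duration_ms - EDGE_TOLERANCE_MS)
```

So for a 30-second track, samples at media time 500ms or 29500ms are exempt even if the recorded data starts late or ends early.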

yanj-github added a commit to yanj-github/device-observation-framework that referenced this issue Jan 15, 2025
yanj-github added a commit that referenced this issue Jan 15, 2025
yanj-github added a commit that referenced this issue Jan 15, 2025
@yanj-github yanj-github added the Released label and removed the In progress label Jan 15, 2025