-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ASS Subtitle Fixes #400
ASS Subtitle Fixes #400
Conversation
Well done, thanks @softworkz! Only we need to be very careful when changing the font sizing. We had different algorithms already, and some turned out to have weird effects on some files. We'd need to do lots of testing with different files, to make sure that sizing is consistent and in line with how most other players render the subs. |
The problem was that after fixing the named style parsing (shadow), in many cases it started to really pick up the those styles (which it didn't before) and the result was that fonts were too big. Otherwise I wouldn't have even bothered touching it. A little more background: You can rely on the PlayResX and PlayResY values to be present in all ASS headers (all ass versions) as those are mandatory and crucial. These are frequently differing from the resolution of the video, because (naturally) subtitles are usually created just once and then used with different versions of a video. Since that is too much for a series of small patches, I only used the heights to determine a scale factor for the font sizes. It's not perfect, but on the side of the most frequent cases at least (where the height factor is determining the scale). So far so good, but there's one more caveat: There may be differences regarding the meaning of a font size of n pixels between Windows and ASS. And this can even differ by font. Getting that part right is really tough. Everything else can be calculated.. |
Another note: The style parsing is not quite correct, because neither order nor count of fields is guaranteed, i.e. there can be more or less than 23 columns. You need to read the line with the field names first and then parse the values in each line accordingly. |
There is an official standard document for the SSA/ASS styles, for V4 and V4+. The version number defines which fields there are and their order. It is nowhere said that the order may be changed. Doing dynamic parsing would totally blow up complexity here, just to catch some very few broken files, produced by broken non-standard compliant tools. I don't believe it's worth the hassle. And it's not like they won't work, it's just that their styles are not applied. |
There are many bits which are missing - many of them cannot even be replicated (as discussed recently), but what's important and doable is collission detection, which means that when two or more cues are visible at the same time (and not absolutely positioned), they need to be stacked in order not to overlap like in this example: Correct it looks like this: Another issue: There's a discrepancy in color. It's slight but noticeable and in case you wonder how I can say that the second is right and the first one is wrong, it's because I've watched that video a thousand times with different ASS implementations and in several subtitle authoring tools and it's always like in the second image. |
Huh, usually MediaPlayer does the stacking, I have seen it happening quite often. Actually, I have never seen subtitles overlap like that. Are you sure they are not positioned absolutely? They seem to be on slightly different Y positions. Or maybe this is one more WinUI thing? I also noticed the slight color deviation when I started on SSA/ASS styles support. Not sure where it comes from. But the difference was low enough for me to not bother 😄 If you want to dig into this, you could enter the parsed RGB value into a graphics tool, to see how the color is supposed to look like. And yeah, the outline rendering is a mess. Better not look too closely ^^ |
Regarding ASS specidfication, I had some chat with the core developers of libass, because it is not complete in several means, and they said that the full spec they consider to be valid is the combination of this document together with the source code of AegiSub... 😉 |
I read it more in the sense of "you can do it dynamically to support new standards without having to rework". But there has never been a new standard after 4+. And 4+ defines exactly which new fields are added at which exact position (the ones in red). And yes, Aegisub is the de-facto standard. It shows how the files are supposed to be. I'd see any other files as broken. |
I also read somewhere that Aegisub is considered the more precise definition of the standard. The document just tries to make it more readable. |
The different Y is probably due to being Asian letters. There's no absolute positioning. Just different styles: |
The text lines are probably better to look at:
|
They are working on 5.0, but who knows whether and when it will happen 😆
Which means that current implementation should not rely on a fixed order and count so that they won't get broken by later versions of the spec. But as said, there's one case where existing ASS files differ and that made me change the whole parsing in the ffmpeg implementation I did to use the field names for parsing. |
I don't know how the WindowsMedia layouting is working - could the overlap happen because both styles have margins (the same MarginV) and the auto-stacking does perhaps not work when you use the margins to set the region? |
That could be the case.
If they ever release a new standard, I'd support it like I did for V4 and V4+. Sure, dynamic parsing would be the better implementation. But it is way more complicated, and for me it's not worth all the effort, just to support a few broken files. If it is very important for you, and you want to implement this once again, feel free to add it. As I said, I'd consider it the better implementation, it's just that I won't do it 😄. |
Agreed and I understand that very well 😆 BTW, I just looked up what they consider to be the size of a font when it's set to 100% and in the source it is stated that a font size of 100% corresponds to 5% of the video's height... |
Damn - and I just wanted to suggest that I could send you a bunch of videos with ASS subs for testing, LOL... |
OMG - do you know how they're drawing the outlines? They are drawing the text 8 times!!! (N, NE, E, SE, S, SW, W, NW) |
That's how age of empires 2 text used to work. |
Thanks. I will check back on how much time we want to invest here. The collission issue is obviously more important than the style parsing. |
And they even to it wrong: // Upper left
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, -offset, -offset));
// Upper center
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, 0, -offset));
// Upper right
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, offset, -offset));
// Left
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, -offset, 0));
// Right
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, offset, 0));
// Lower left
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, -offset, offset));
// Lower center
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, 0, offset));
// Lower right
IFC(CreateOutlineHelper(pLine, pStyle, pRegion, pCueElement, offset, offset)); Do you see what I mean? |
I have submitted a PR to improve the outline rendering: at https://github.com/microsoft/microsoft-ui-xaml/ PR nr 9224 |
It would be great when you could test a bit if you find the time. I have created that draft PR which includes all the fixes I have submitted: #405 Now I can of course send you some of my files for testing if you wish, but as I know that these are working (except animations), it would be much more interesting to have it tested with more and different files.. |
Wow you are really serious about this @softworkz. Good job. I never had the ambition to fully support ASS. For me, it was enough to have some of the basic styling working. But you are really taking things to a different level. I will try to do reviews on the partial PRs. But we should do testing on the combined PR, and merge that when we think it is all good. The bad thing is that we are limited by the poor rendering done by the framework controls. Nice catch with the broken outline rendering. I's no surprise that it looks so horrible. Not sure how libass does it, but I'd guess it's more a combination of a single render plus a clever combination of effects to get the nice outline. But of course, increasing the render count (combined with knowing how a circle works 😄) is a major improvement to the MS solution. Are you actually able to compile WinUI 3 from source? Or how have you done the comparison screens in the PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to test if the font scaling is good for various files. But the implementation looks good.
From the files I've had in my usual testing battery this looks fine. Good work! I will try to create some brand new subtitles and mix them around to see how it goes. |
@lukasf I've read somewhere that libass uses direct2D/directtext for this outline thing. It's how age of empires 2 Definitive Edition does it 😂 |
I've been shaken back and forth because I know our users and how they would have reacted, so we had discussed internally for how we want to proceed. My initial suggestion was to disable ASS subttitles and use your implementation only for all other text subtitles for which it is doing very well. The big plus which made me stick to your implementation was that you had already implemented support for custom (embedded) fonts - which is always the first thing about which Anime fans are complaining. 👍 |
No I can't compile - I didn't even retry with the 1.5 branch. Instead I have created a UWP project which replicates their current method and made sure that it gives identical results to their current implementation. Then I implemented my proposed solution, did the comparisons and finally translated it back to C++ code. |
They use DirectWrite on Windows, but I don't know how they do the outlines. |
Based on the output, I am pretty sure they use some kind of gaussian blur on the text for the outline, plus maybe some additional effects for fine tuning the look. See how it fades out smoothly to the edges. You cannot get such a smooth fade-out by just rendering the text multiple times. Same is probably done for the shadow, just with some offset and different color/alpha. At least, I used blur effect to create text shadow in an app, very long ago. It had kind of a similar look. |
I made an extreme test in Aegisub with a real large outline: This reveals that they are converting the font to a path and drawing it using a circle-shaped pen. |
Oh, no - not quite: It's a bit more sophisticated. It's a blur which uses the sorrounding as input but output is only inside the actual region, so it cannot grow larger than it is. (the image is done with a high blur value but no opacity applied to any color) |
Interesting, thanks for the analysis. |
Yeah I forgot that outline and blur are separate parameters in ASS. It's just that UWP/WinUI does not support blur. |
SubtitleProviderSsaAss: Add line break
SubtitleProviderSsaAss: Calc font size
SubtitleProviderSsaAss: Print offending tag in case of parsing errors
SubtitleProviderSsaAss: Prevent parsing errors due to ambiguous tags
SubtitleProvider: Fix for color strings like H&H2A4F5D&
SubtitleProviderSsaAss: Shadow is float
Fixes #397