C/J vertical metrics requirements #8911

Open
aaronbell opened this issue Jan 14, 2025 · 3 comments

@aaronbell
Collaborator

aaronbell commented Jan 14, 2025

Over the years, Google has established clear guidance for setting font vertical metrics to ensure consistent rendering across a variety of environments (with web as a particular focus).

As a short summary, this guidance makes use of the two core sets of vertical metrics: sTypo (sTypoAscender / sTypoDescender / sTypoLineGap) establishes the typographic metrics (set for ideal line height for reading), while usWin (usWinAscent / usWinDescent) establishes the clipping metrics (MS Word will not display any glyph paint that extends beyond them). The USE_TYPO_METRICS flag is set to ensure that the sTypo values are used for line spacing, and the hhea metrics follow the sTypo values for consistency on Mac. If you need further explanation of these metrics and how to set them, I suggest reading the spec at the link above.

Interestingly, whereas these metrics requirements are normally responsive to the design of the typeface, for CJK they are locked to specific values based on the metrics established by Ken Lunde for the Source Han Sans / Noto CJK project:

| Attrib | Value | Example using 1000upm font |
|---|---|---|
| OS/2.sTypoAscender | 0.88 * font upm | 880 |
| OS/2.sTypoDescender | -0.12 * font upm | -120 |
| OS/2.sTypoLineGap | 0 | 0 |
| hhea.ascender | Set to look comfortable (~1.16 * upm) | 1160 |
| hhea.descender | Set to look comfortable (~0.288 * upm) | -288 |
| hhea.lineGap | 0 | 0 |
| OS/2.usWinAscent | Same as hhea.ascender | 1160 |
| OS/2.usWinDescent | abs(value) of hhea.descender | 288 |
| OS/2.fsSelection bit 7 (Use_Typo_Metrics) | Do not set / disabled | 0 |

In this case, since the USE_TYPO_METRICS flag is disabled, the usWin metrics act as both typographic vertical metrics and also clipping metrics.
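To make the table concrete, here is a minimal Python sketch that derives the currently required values from a font's UPM. The `VerticalMetrics` class and the `current_cjk_metrics` function are hypothetical names used only for illustration, and rounding to the nearest unit is my assumption; the ratios are the ones in the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerticalMetrics:
    """Hypothetical container for the values listed in the table."""
    typo_ascender: int
    typo_descender: int
    typo_line_gap: int
    hhea_ascender: int
    hhea_descender: int
    hhea_line_gap: int
    win_ascent: int
    win_descent: int
    use_typo_metrics: bool


def current_cjk_metrics(upm: int) -> VerticalMetrics:
    """Derive the fixed CJK values from the UPM (rounding is an assumption)."""
    hhea_ascender = round(1.16 * upm)
    hhea_descender = -round(0.288 * upm)
    return VerticalMetrics(
        typo_ascender=round(0.88 * upm),
        typo_descender=-round(0.12 * upm),
        typo_line_gap=0,
        hhea_ascender=hhea_ascender,
        hhea_descender=hhea_descender,
        hhea_line_gap=0,
        win_ascent=hhea_ascender,         # same as hhea.ascender
        win_descent=abs(hhea_descender),  # abs(value) of hhea.descender
        use_typo_metrics=False,           # fsSelection bit 7 not set
    )


metrics = current_cjk_metrics(1000)
print(metrics.typo_ascender, metrics.typo_descender)  # 880 -120
print(metrics.win_ascent, metrics.win_descent)        # 1160 288
```

Nothing in this calculation looks at the glyphs themselves, which is the root of the problems listed below.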

The setting for the sTypo values is of particular note, and comes from the MS OpenType Spec requirements for the sTypo values of CJK fonts.

For CJK (Chinese, Japanese, and Korean) fonts that are intended to be used for vertical (as well as horizontal) layout, the required value for sTypoAscender is that which describes the top of the ideographic em-box. For example, if the ideographic em-box of the font extends from coordinates 0,-120 to 1000,880 (that is, a 1000 × 1000 box set 120 design units below the Latin baseline), then the value of sTypoAscender must be set to 880. Failing to adhere to these requirements will result in incorrect vertical layout.

For CJK (Chinese, Japanese, and Korean) fonts that are intended to be used for vertical (as well as horizontal) layout, the required value for sTypoDescender is that which describes the bottom of the ideographic em-box. For example, if the ideographic em-box of the font extends from coordinates 0,-120 to 1000,880 (that is, a 1000 × 1000 box set 120 design units below the Latin baseline), then the value of sTypoDescender must be set to -120. Failing to adhere to these requirements will result in incorrect vertical layout.

There are several problems with this approach.

  1. It is specific to Source Han Sans. As NightFurySL2001 points out, the positioning and ratios of the ideographic-em-box in relation to Latin glyphs can differ from font to font and foundry to foundry. So the ratio used for establishing the position of the box should have flexibility to adjust: 0.88:0.12, 0.85:0.15, 0.80:0.20. As long as it still sums to the ideographic-em-box, then any specific ratio should be fine.
  2. usWin metrics are not reflective of what is in the font. Like (1), the design of a given font can differ quite significantly, so by setting the usWin metrics to specific values instead of being responsive to the design, the metric might be too small to avoid clipping, or excessively large for the design.
  3. As USE_TYPO_METRICS is disabled, usWin metrics are used for default line spacing when displaying the font in the browser and in applications like MS Word. And since usWin metrics are primarily a clipping metric, in most cases the line gap will likely be too big for a given design. There are even complaints about this for Source Han Sans / Noto Sans CJK.
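As a rough illustration of problem 3, the sketch below compares the default line height a renderer would pick with the flag off and on, using the 1000upm example values from the table. The `default_line_height` function is a hypothetical simplification of the behaviour described above; real renderers differ in detail.

```python
def default_line_height(typo_ascender, typo_descender, typo_line_gap,
                        win_ascent, win_descent, use_typo_metrics):
    """Simplified model: sTypo values if the flag is set, usWin otherwise."""
    if use_typo_metrics:
        return typo_ascender - typo_descender + typo_line_gap
    return win_ascent + win_descent


upm = 1000
flag_off = default_line_height(880, -120, 0, 1160, 288, use_typo_metrics=False)
flag_on = default_line_height(880, -120, 0, 1160, 288, use_typo_metrics=True)

print(flag_off / upm)  # 1.448 em, driven by the clipping metrics
print(flag_on / upm)   # 1.0 em, the bare ideographic em-box
```

Neither result is tuned to the design: one is as tall as the clipping zone, the other leaves no room between lines at all.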

The BASE table
Stepping back from the guidance in the MS OT Spec, Ken Lunde, and Google’s specification, these requirements raise some key questions:

  • If USE_TYPO_METRICS is disabled, why does setting sTypo metrics matter?
  • Why can’t we use sTypo/USE_TYPO_METRICS in the same way as in non-CJK fonts to establish ideal default line spacing?

On this subject the OT Spec directs readers to Baseline tags and to the BASE table. The BASE table is part of the OpenType code of a font that provides information regarding the ideographic-em-box. Below is a sample of the ideographic-em-box and the corresponding BASE table.

Image
table BASE {
  HorizAxis.BaseTagList     icfb  icft  ideo  romn;
  HorizAxis.BaseScriptList  
                DFLT 	ideo  	-75   835  -120     0,
                hani  	ideo   	-75   835  -120     0,
                kana  	ideo   	-75   835  -120     0,
                latn  	romn   	-75   835  -120     0,
                cyrl  	romn   	-75   835  -120     0,
                grek  	romn   	-75   835  -120     0;

  VertAxis.BaseTagList      icfb  icft  ideo  romn;
  VertAxis.BaseScriptList 
                DFLT  	ideo    	45   955     0   120,
                hani  	ideo    	45   955     0   120,
                kana  	ideo    	45   955     0   120,
                latn  	romn    	45   955     0   120,
                cyrl  	romn    	45   955     0   120,
                grek  	romn    	45   955     0   120;
} BASE;

The BASE tags above are:

  • icfb: Ideographic character face bottom edge (in HorizAxis) / left edge (in VertAxis)
  • icft: Ideographic character face top edge (in HorizAxis) / right edge (in VertAxis)
  • ideo: ideographic em-box bottom edge (in HorizAxis) / left edge (in VertAxis)
  • romn: Latin baseline. Usually 0 in HorizAxis, inverse of ideo in VertAxis
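The relationship between these four values can be sketched in Python. This is only an illustration: `base_coordinates` is a hypothetical helper, and it assumes the character face is inset by the same amount on every side of the em-box, which holds for the sample above (inset of 45) but is not guaranteed for every design.

```python
def base_coordinates(embox_bottom: int, icf_inset: int, upm: int = 1000) -> dict:
    """Derive BASE coordinates from the em-box bottom and a uniform ICF inset.

    Assumes a square em-box of `upm` units and a symmetric character face.
    """
    embox_top = embox_bottom + upm
    return {
        "HorizAxis": {
            "icfb": embox_bottom + icf_inset,  # face bottom edge
            "icft": embox_top - icf_inset,     # face top edge
            "ideo": embox_bottom,              # em-box bottom edge
            "romn": 0,                         # Latin baseline
        },
        "VertAxis": {
            "icfb": icf_inset,                 # face left edge
            "icft": upm - icf_inset,           # face right edge
            "ideo": 0,                         # em-box left edge
            "romn": -embox_bottom,             # inverse of HorizAxis ideo
        },
    }


coords = base_coordinates(embox_bottom=-120, icf_inset=45)
print(coords["HorizAxis"])  # {'icfb': -75, 'icft': 835, 'ideo': -120, 'romn': 0}
print(coords["VertAxis"])   # {'icfb': 45, 'icft': 955, 'ideo': 0, 'romn': 120}
```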

These BASE tags provide language-specific metrics data that may be used for typesetting purposes, such as to enable better cross-script alignment. However, in cases where a font does not include a BASE table and an application needs to define the ideographic-em-box for rendering purposes, there is specific logic laid out in the OT spec wherein the typoMetrics are used as a fallback:

ideoEmboxLeft = 0
If HorizAxis.ideo is defined:
	ideoEmboxBottom = HorizAxis.ideo
	If HorizAxis.idtp is defined:
		ideoEmboxTop = HorizAxis.idtp
	Else:
		ideoEmboxTop = HorizAxis.ideo + head.unitsPerEm
	If VertAxis.idtp is defined:
		ideoEmboxRight = VertAxis.idtp
	Else:
		ideoEmboxRight = head.unitsPerEm
	If VertAxis.ideo is defined and is non-zero:
		Warning: Bad VertAxis.ideo value
Else If this is a CJK font:
	ideoEmboxBottom = OS/2.sTypoDescender
	ideoEmboxTop = OS/2.sTypoAscender
	ideoEmboxRight = head.unitsPerEm
Else:
	ideoEmbox cannot be determined for this font
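For anyone who wants to experiment with this logic, here is a small Python sketch of it. It reads the pseudocode as a top-level if / else-if / else chain (BASE data first, then the sTypo fallback for CJK fonts, then giving up). The `EmBox` type and the `ideo_embox` function and its parameters are hypothetical names of mine, not part of any library.

```python
import warnings
from typing import NamedTuple, Optional


class EmBox(NamedTuple):
    left: int
    bottom: int
    right: int
    top: int


def ideo_embox(upm: int, typo_ascender: int, typo_descender: int, is_cjk: bool,
               horiz_ideo: Optional[int] = None, horiz_idtp: Optional[int] = None,
               vert_ideo: Optional[int] = None,
               vert_idtp: Optional[int] = None) -> Optional[EmBox]:
    """Return the ideographic em-box, or None if it cannot be determined."""
    left = 0
    if horiz_ideo is not None:
        # BASE table data is available and takes precedence.
        bottom = horiz_ideo
        top = horiz_idtp if horiz_idtp is not None else horiz_ideo + upm
        right = vert_idtp if vert_idtp is not None else upm
        if vert_ideo:  # defined and non-zero
            warnings.warn("Bad VertAxis.ideo value")
    elif is_cjk:
        # No BASE data: fall back to the sTypo metrics.
        bottom = typo_descender
        top = typo_ascender
        right = upm
    else:
        return None
    return EmBox(left, bottom, right, top)


# With BASE data, the sTypo values are ignored.
print(ideo_embox(1000, 942, -238, True, horiz_ideo=-148))
# Without BASE data, whatever is in sTypo becomes the em-box.
print(ideo_embox(1000, 942, -238, True))
```

The second call shows why the fallback matters: a CJK font without a BASE table has its em-box read directly from the sTypo values, whatever they are set to.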

Because of this fallback logic, the OT Spec recommends that the sTypo and hhea metrics align with the BASE table to ensure consistency:

CJK fonts generally should have the same descender value recorded in hhea.descender, OS/2.sTypoDescender, and HorizAxis.ideo (if present) fields, and the same ascender value recorded in hhea.ascender, OS/2.sTypoAscender, and HorizAxis.idtp (if present) fields.

A new direction forward for C/J vertical metrics
This appropriation of the sTypo leaves us in a quandary. In order to ensure compatibility with “some” applications and legacy environments, it is required to keep the sTypo aligned with the ideographic em-box. However, doing so removes our capability to set the vertical metrics apart from the clipping metrics.

Interestingly, the OT spec seems to predict this predicament:

The OS/2.sTypoDescender and OS/2.sTypoAscender fields in a CJK font may specify metrics different from the HorizAxis.ideo and HorizAxis.idtp values in the BASE table.

To that end, I would like to propose a new approach for C/J fonts. For the sTypo, we use the ideographic em-box as a reference guide, and scale it up proportionally, similar to how Latin treats the ascender / descender values. hhea will follow the sTypo, and usWin will continue to align with yMin and yMax. Finally, an accurately-set BASE table will be required to enable applications that need ideographic information to position the glyphs correctly.

Taking LXGW WenKai TC as an example:

| Attrib | Value | Example using WenKai |
|---|---|---|
| OS/2.sTypoAscender | ideoEmboxTop + (15–20% * emBox)/2 | 852+(0.18*1000)/2 = 942 |
| OS/2.sTypoDescender | ideoEmboxBottom - (15–20% * emBox)/2 | -148-(0.18*1000)/2 = -238 |
| OS/2.sTypoLineGap | 0 | 0 |
| hhea.ascender | OS/2.sTypoAscender | 942 |
| hhea.descender | OS/2.sTypoDescender | -238 |
| hhea.lineGap | 0 | 0 |
| OS/2.usWinAscent | abs(value) of yMax | 1102 |
| OS/2.usWinDescent | abs(value) of yMin | 285 |
| OS/2.fsSelection bit 7 (Use_Typo_Metrics) | Enabled | |
| OT BASE table | Required | |

(note – in this case, I used an 18% increase on the sTypo metrics to follow the suggestion of the original designer)
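The sTypo arithmetic in the table can be sketched as follows. `proposed_typo_metrics` is a hypothetical helper, and rounding to the nearest unit is my assumption; the formula itself is the one in the table, with the expansion split evenly above and below the em-box.

```python
def proposed_typo_metrics(embox_top: int, embox_bottom: int,
                          expansion: float = 0.18) -> tuple:
    """Scale the ideographic em-box up by `expansion`, half on each side."""
    em = embox_top - embox_bottom
    half = round(expansion * em / 2)
    return embox_top + half, embox_bottom - half


# LXGW WenKai TC: em-box from -148 to 852, expanded by 18%.
ascender, descender = proposed_typo_metrics(852, -148, expansion=0.18)
print(ascender, descender)   # 942 -238
print(ascender - descender)  # 1180, the default line height with lineGap = 0
```

Because the expansion is a parameter, a designer could pick any value in the suggested 15–20% range without moving the em-box itself.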

Image
table BASE {
  HorizAxis.BaseTagList     icfb  icft  ideo  romn;
  HorizAxis.BaseScriptList  
                DFLT 	ideo  	-96   800  -148     0,
                hani  	ideo   	-96   800  -148     0,
                kana  	ideo   	-96   800  -148     0,
                latn  	romn   	-96   800  -148     0,
                cyrl  	romn   	-96   800  -148     0,
                grek  	romn   	-96   800  -148     0;

  VertAxis.BaseTagList      icfb  icft  ideo  romn;
  VertAxis.BaseScriptList 
                DFLT  	ideo    	28   974     0   148,
                hani  	ideo    	28   974     0   148,
                kana  	ideo    	28   974     0   148,
                latn  	romn   	28   974     0   148,
                cyrl  	romn    	28   974     0   148,
                grek  	romn    	28   974     0   148;
} BASE;

The result of this approach, in MS Word for Mac:
Image

As you can see, this change produces significantly less space between the lines, as the overall line height (ascender + descender) drops from 1317 to 1180. It helps the text hold together better and read more comfortably than before.

Open Questions

  • It has been reported by NightFury that, in cases where the vhea/vmtx tables are not present, the sTypo values may be used as a fallback when setting text vertically. It would be worth investigating whether this use is widespread; if so, the addition of these vertical-specific tables should be recommended for C/J fonts, and required for any that are intended for vertical use.
  • This document specifically discusses C/J fonts, but not Korean as Hangeul can / should be treated differently than ideographic-heavy scripts. It will be covered by a separate document.

Risks

  1. Applications that use the sTypo metrics for ideographic-based positioning will find that any font employing the above method is no longer positioned as expected. I believe this is primarily a legacy issue, but worth noting.
  2. If the proposal is applied to all C/J fonts across the library, then there will be backwards compatibility problems. One mechanism to mitigate this risk is to only apply it moving forwards.

Priority
This is a high priority issue that needs resolution as it is currently preventing immediate onboarding for many Traditional Chinese fonts:

And will prevent onboarding of these upcoming Chinese projects as well

Additionally, we have two upcoming Japanese and three Korean projects which will also be impacted.

@davelab6
Member

@tiroj you might be interested in this :)

@celestialphineas

celestialphineas commented Jan 15, 2025

Regarding Risk 1, I would like to mention that:

The majority of SC/TC vendors never write a BASE table. Taking advantage of OS/2.sTypoDescender as the bottom line of the ideographic em-box is a very recent practice in the industry, and is now well-received by SC/TC vendors. I believe this risk is a real threat, and could break everything, impacting new solutions that handle typefaces from early-developed to more recent SC/TC products, as well as legacy code handling the proposed solution.

To my knowledge, the current common practice among foundries in mainland China is as follows:

  • No BASE table appears.
  • sTypo metrics are set for the ideographic em-box. Usually 850/-150, as this is the default value inferred by Adobe applications when a CJK typeface contains no BASE table. Fonts without a BASE table whose sTypo values are not set to 850/-150 will not fit the layout grid.
  • The Latin baseline does not necessarily lie on the y=0 baseline. Instead, Latin letters may float around the baseline to match the Han characters visually, rather than being positioned as technicians usually expect. In other words, the real Latin baseline information is usually lost.

@rutopio

rutopio commented Jan 15, 2025

Our usual approach:

  • No BASE table
  • Use sTypo and hhea.ascender/descender to define the metrics.
  • Use 880/120 ratio (not retroactively applied to older designs or fonts modified from other open-source fonts).
  • Set sTypoLinegap = hheaLinegap = 0.
  • Set usWinDescent = 150.
    • Based on some tests we conducted before, if usWinDescent exceeds 150, some applications on the Windows platform (such as Office 365) will default to 200% line height.
    • Since Windows and Office are frequently updated, I’m not sure if this issue has been resolved. Perhaps more user feedback is needed.
  • The baseline of Latin letters does not necessarily align with y = 0. Instead, to match the visual balance of CJK and Latin characters, and ensure compatibility with other fonts from our foundry for mixed typesetting, it is usually determined by the designers.
