Font files are classified as Scala or HolyC #4870

Open
abarichello opened this issue May 26, 2020 · 18 comments

@abarichello

Font source files included in repositories are being misidentified as Objective-J/HolyC/Scala.

URL of the affected repository:

https://github.com/ValveSoftware/Proton/tree/proton_5.0
https://github.com/adobe-fonts/source-han-mono

Expected language:

none

Detected language:

Scala/HolyC/SuperCollider/etc

@lildude
Member

lildude commented May 26, 2020

As you've already found, this is because the .sc extension is associated only with those three languages; it isn't mapped to any font "language" or otherwise explicitly ignored. As a result, detection falls through to the heuristics and then on to the classifier, but since Linguist doesn't know anything about the font "language" you're expecting it to find, it will never find it.

As Linguist relies upon community contributions to address such things, we'd welcome a PR that either adds the language or ignores the files.
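
To make the fall-through described above concrete, here is a minimal sketch in Python. It is not Linguist's actual code: the extension table, `toy_heuristic`, `toy_classifier`, and `detect` are all hypothetical names invented for illustration. It only shows why a file with a claimed extension always ends up labelled as one of the known candidates and never as "none".

```python
import os
from typing import Callable, List, Optional, Sequence

# Illustrative extension table only; NOT Linguist's real data.
EXTENSION_CANDIDATES = {
    ".sc": ["Scala", "SuperCollider"],
    ".hc": ["HolyC"],
}

Heuristic = Callable[[str, List[str]], Optional[str]]


def toy_heuristic(content: str, candidates: List[str]) -> Optional[str]:
    """Hypothetical rule: treat an `object ` declaration as Scala."""
    if "Scala" in candidates and "object " in content:
        return "Scala"
    return None


def toy_classifier(content: str, candidates: List[str]) -> str:
    """Hypothetical stand-in for a statistical classifier.

    It must pick one of the candidates, however poor the fit.
    """
    return candidates[0]


def detect(
    filename: str,
    content: str,
    heuristics: Sequence[Heuristic] = (toy_heuristic,),
    classifier: Callable[[str, List[str]], str] = toy_classifier,
) -> Optional[str]:
    # Step 1: the extension narrows the search to a fixed candidate list.
    ext = os.path.splitext(filename)[1].lower()
    candidates = EXTENSION_CANDIDATES.get(ext, [])
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Step 2: heuristics try to choose among the candidates.
    for rule in heuristics:
        match = rule(content, candidates)
        if match is not None:
            return match
    # Step 3: the classifier still chooses from the same candidates only.
    return classifier(content, candidates)
```

In this sketch, a font source named `example.SC` can only come back as one of the `.sc` candidates, because nothing font-related is in the table. Adding a language or ignoring the files changes the table, which is why either kind of PR would fix the misclassification.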

@lildude
Member

lildude commented May 26, 2020

Oh, this applies to the other extensions too.

@stale

stale bot commented Jun 26, 2020

This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.

@stale stale bot added the Stale label Jun 26, 2020
@aeikum

aeikum commented Jun 26, 2020

A month is a long time? Yes of course it's still an issue. Come on, bot. Am I going to have to do this every month?

@stale stale bot removed the Stale label Jun 26, 2020
@lildude
Member

lildude commented Jun 26, 2020

> A month is a long time? Yes of course it's still an issue. Come on, bot. Am I going to have to do this every month?

Yes, or you can submit a PR to add support 😉. After all, Linguist relies almost exclusively on community contributions.

@stale

stale bot commented Jul 26, 2020

This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.

@stale stale bot added the Stale label Jul 26, 2020
@aeikum

aeikum commented Jul 27, 2020

#4870 (comment)

@stale stale bot removed the Stale label Jul 27, 2020
@stale

stale bot commented Aug 29, 2020

This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.

@stale stale bot added the Stale label Aug 29, 2020
@aeikum

aeikum commented Aug 31, 2020

Asdf

@stale stale bot removed the Stale label Aug 31, 2020
@Alhadis
Collaborator

Alhadis commented Aug 31, 2020

@aeikum Could you enlighten us on what .SC files are? I googled "SC font file" but it (naturally) brings up results for small-cap variants of actual typefaces.

@aeikum

aeikum commented Aug 31, 2020

I believe they are variants of the font files for different languages. So they're different file types, despite the same final extension. See here: https://github.com/adobe-fonts/source-han-mono/tree/master/Heavy/OTC

J - Japanese, K - Korean, SC - Simplified Chinese, TC - Traditional Chinese. Not sure about HC.

@Alhadis
Collaborator

Alhadis commented Aug 31, 2020

> SC - Simplified Chinese, TC - Traditional Chinese. Not sure about HC

Hong Kong and Taiwan have different variations of traditional Chinese, so TC and HC probably stand for "Taiwan Chinese" and "Hong Kong Chinese", respectively.

Anyway. From what I see in source-han-mono/Heavy/OTC, the .HC files are a mix of PostScript (Type 1 fonts, essentially specialised PostScript programs), OpenType feature definitions, and some ad hoc-looking format for CID font metadata. Since most of these files are too large to be indexed or displayed on GitHub, it's gonna be difficult to accurately determine how many repositories use .HC and .TC as file extensions (see CONTRIBUTING.md if you're unsure why that's relevant).
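
Since the extension alone doesn't say which of these formats a given file holds, a content check would have to look at the text itself. The Python sketch below is one hedged way to do that. It assumes PostScript sources begin with the usual `%!` comment and that OpenType feature files contain a `languagesystem` statement or a `feature xxxx {` block. The function name `sniff_font_source` and the returned labels are hypothetical, and the ad hoc CID metadata format is deliberately left unrecognised.

```python
import re
from typing import Optional

# Assumed markers of an OpenType feature file: a `languagesystem` statement
# or the opening of a `feature <tag> {` block at the start of a line.
_FEATURE_FILE = re.compile(
    r"^\s*(languagesystem\s+\w+\s+\w+\s*;|feature\s+\w{1,4}\s*\{)",
    re.MULTILINE,
)


def sniff_font_source(text: str) -> Optional[str]:
    """Guess the format of a font source file from its content.

    Returns an illustrative label, or None when nothing matches.
    """
    # PostScript programs, Type 1 and CID fonts included, conventionally
    # start with a "%!" header comment.
    if text.lstrip().startswith("%!"):
        return "PostScript"
    if _FEATURE_FILE.search(text):
        return "OpenType Feature File"
    return None
```

A check like this says nothing about how widely the extensions are used, so it would complement the usage survey mentioned above rather than replace it.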

@abarichello Aside from the two repositories you linked to, how many others have you encountered where this is obviously an issue?

@smola
Contributor

smola commented Aug 31, 2020

Note that there was quite some discussion about font files: #2516

@Alhadis
Collaborator

Alhadis commented Aug 31, 2020

That issue's from early/mid 2015. Barely any of it is relevant anymore — several missing font formats were added by yours truly, and being the resident font-nerd, I'm always eager to add support for a font format. 😉

@abarichello
Author

abarichello commented Aug 31, 2020

> @abarichello Aside from the two repositories you linked to, how many others have you encountered where this is obviously an issue?

Advanced search returned hundreds of results for .SC files being identified as Scala. .HC is harder to find.

Alhadis added a commit to Alhadis/Silos that referenced this issue Sep 16, 2020
@Alhadis
Collaborator

Alhadis commented Sep 16, 2020

Well, if you or anybody else is interested, here's an unsorted stash of .sc files harvested from search results. They don't appear to have much in common, however... 😕

@stale

stale bot commented Dec 25, 2020

This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.

@stale stale bot added the Stale label Dec 25, 2020
@aeikum

aeikum commented Dec 28, 2020

Yup

@stale stale bot removed the Stale label Dec 28, 2020