Suggestion: Move all grammar-related metadata to grammars.yml #4990

Open
Alhadis opened this issue Sep 2, 2020 · 10 comments

@Alhadis
Collaborator

Alhadis commented Sep 2, 2020

Currently, metadata pertaining to grammars and case-by-case exceptions are handled in four different places:

Moreover, the difficulties related by @pastra98 made me realise there's more we could be doing with regard to locating grammar and license files. Specifically, we should be able to provide a manual path if need be; the currently hardcoded search locations can remain the default for grammars without a src: field defined, or whatever.

Here's how it *might* look.
# Each entry correlates to a directory in "vendor/grammars/#{key}"
---
abl-tmlanguage:
    license: MIT
    source: chriscamicas/abl-tmlanguage
    scopes:
        - source.abl
    
    # Location of files inside submodule repository. Usually
    # calculated automatically, though maybe it makes sense
    # to always provide this list, as opposed to only those
    # grammars with non-standard file locations?
    files:
        grammar: abl.tmLanguage.json
        license: LICENSE

actionscript3-tmbundle:
    license: MIT
    source: simongregory/actionscript3-tmbundle
    scopes:
        - source.actionscript.3
        - text.html.asdoc
        - text.xml.flex-config

c.tmbundle:
    license: MIT
    source: textmate/c.tmbundle
    scopes:
        - source.c
        - source.c++
        - source.c.platform
    aliases:
        source.c++: source.cpp

hy.tmLanguage:
    license: MIT
    source: Slowki/hy.tmLanguage
    scopes:
        - source.hy
    paths:
        - hy.json
        - LICENSE.md

language-roff:
    license: ISC
    source: Alhadis/language-roff
    scopes:
        - hidden.manref
        - source.ditroff
        - source.ditroff.desc
        - source.gremlin
        - source.ideal
        - source.pic
        - text.roff
        - text.runoff

sublimesystemverilog:
    license: MIT
    source: https://bitbucket.org/Clams/sublimesystemverilog/get/default.tar.gz
    scopes:
        - source.systemverilog
        - source.ucfconstraints

Genshi.tmbundle:
    license: MIT
    source: https://svn.edgewall.org/repos/genshi/contrib/textmate/Genshi.tmbundle/Syntaxes/Markup%20Template%20%28XML%29.tmLanguage
    scopes:
        - text.xml.genshi

Thoughts?

EDIT: Oh yeah, it'd also be nice if we refined our terminology a little, because it's confusing to refer to both a TextMate-compatible grammar file and the submodule containing it as a "grammar" (maybe "grammar-source" for the latter?). Given most of the grammars I write nowadays are almost exclusively added to language-etc (a super-bundle of whatever I can't be fucked publishing separately anymore, but nothing specific), it'd be a helpful distinction.

@Alhadis Alhadis self-assigned this Sep 2, 2020
@pastra98
Contributor

pastra98 commented Sep 2, 2020

That would be pretty cool, actually! Maybe as a flag with the add-grammar script to specify the path?

@Alhadis
Collaborator Author

Alhadis commented Sep 2, 2020

No need. Most of the time, the files are located somewhere predictable. Atom actually forces you to place them in the grammars directory, and limits you to JSON or CSON, a cleaner alternative to JSON (it also imposes a bunch of other myopic, insane restrictions which I won't go into here).

@pastra98
Contributor

pastra98 commented Sep 2, 2020

Ah okay... So it's not that many repos that are structured differently. But I could add a source: field to the grammars.yml manually, if need be?

@Alhadis
Collaborator Author

Alhadis commented Sep 2, 2020

> But I could add a source: field to the grammars.yml manually, if need be?

This is more of an RFC to discuss a potential enhancement to Linguist. The changes involved are many, and they touch lots of different components that behave similarly, but operate independently.

In other words, there's nothing you need to worry about WRT your PR. 😉

@pastra98
Contributor

pastra98 commented Sep 2, 2020

I get that, I was trying to understand how you envision this working from a user's perspective haha

@Alhadis
Collaborator Author

Alhadis commented Sep 2, 2020

The add-grammar script would support an optional -f/--file switch to specify the location of a grammar file, but that's about it. Usage would be something like --file hy.json or --file ./hy.json or even

--file https://github.com/Slowki/hy.tmLanguage/blob/master/hy.json

since users will try all sorts of things, and it's better for a program to be maximally permissive about what input it accepts, rather than barking at users "to provide a path relative to the upstream repository's root directory".
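
A rough Python sketch of that kind of permissive handling follows. It is not the add-grammar script's code: the function name is made up, and it assumes GitHub blob URLs of the form owner/repo/blob/ref/path where the ref is a single path segment (a branch name containing a slash would be misread).

```python
import posixpath
from urllib.parse import unquote, urlparse


def normalize_file_arg(arg):
    """Reduce a --file value to a path relative to the repository root.

    Hypothetical helper: accepts "hy.json", "./hy.json", or a GitHub
    blob URL, and returns "hy.json" in each case.
    """
    parsed = urlparse(arg)
    if parsed.scheme in ("http", "https"):
        parts = [p for p in parsed.path.split("/") if p]
        # Expected shape: owner / repo / "blob" / ref / path...
        if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
            raise ValueError("unrecognised URL: " + arg)
        path = unquote("/".join(parts[4:]))
    else:
        path = arg
    # Collapse "./" and redundant separators, then drop any leading slash.
    path = posixpath.normpath(path).lstrip("/")
    if path in ("", ".", "..") or path.startswith("../"):
        raise ValueError("path escapes the repository: " + arg)
    return path


for value in ("hy.json", "./hy.json",
              "https://github.com/Slowki/hy.tmLanguage/blob/master/hy.json"):
    print(normalize_file_arg(value))
```

All three spellings from the comment above reduce to the same repository-relative path, so whatever the user types ends up as one canonical value.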

@lildude I don't think I've overdone that script enough, should I add a man page? 😁

@lildude
Member

lildude commented Oct 7, 2020

This sounds like an interesting idea, though it is likely to be a lot of work. Is all that work really worth the effort for the few corner cases not correctly handled now?

Without thinking too hard about it, a few quick points come to mind:

  • we won't be able to remove the dependency on submodules unless you implement a script to do the downloading and updating... which would be reinventing what git already does, and could be more brittle and harder to debug as it would require more specific knowledge.
  • I'm not sure you'll be able to get rid of vendor/licenses/config.yml whilst retaining the required Licensed/Licensee integration, without implementing some more custom code, which I'd prefer we didn't do as we've only just got back to using licensed as it is intended to be used.

So if anything, this change would really be consolidating some of the current grammars.yml and tools/grammars/compiler/data.go info.

Feel free to start a PoC PR and we can track and discuss things as it progresses.

@johnmays
Contributor

Why are grammars.yml and languages.yml in two completely different spots? Wouldn't it make sense to move grammars.yml out of root and into /lib/linguist?

@lildude
Member

lildude commented Oct 22, 2023

@johnmays Not really. grammars.yml isn’t actually used directly by Linguist. It’s used to track the external grammars which are used by the syntax highlighting engine, which is a completely independent application. The same applies to vendor/licenses/config.yml: it’s not directly used by Linguist either, and is instead used to configure licensed, the external library we use for license management.

If we were to move it, it could probably go into vendor/grammars, but this would just be making a change for the sake of it and could possibly be a breaking change for other users of this repo who expect the file to be where it is.

@johnmays
Contributor

I see. That makes sense.
