-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Docstring indentation is too aggressive when semantic indentation is enabled. #18
Comments
I tried out your example and I think I see what you're getting at. A random idea is to use another grammar for text or docstrings - somewhat like what was discussed here. No idea if there is a "plain text" grammar nor whether performance could be an issue though (^^; Not sure if the following will be at all useful, but FWIW... I've wondered about how to handle things in docstrings because sometimes there are snippets of code which inevitably one may feel would be nice to not have to manipulate as plain text. Sort of related to that, in janet-ts-mode, to get rainbow-delimiters to work well, I ended up tweaking syntax properties of strings (there are two types in janet) like this. Those kinds of tweaks cause a variety of behaviors in Emacs to be different within strings. I think I learned how to do this from |
A markdown grammar would probably be the most appropriate. Some lsp clients will parse docstrings as markdown. While markdown is not standard, it could at least be used for things like `code snippets`
I thought those things were exclusively for font-locking, but I'm honestly not quite sure what I'm even looking at without spending some time studying that code. I will give it a more in depth look as I find the time. |
This uses the "inline" version of the markdown grammar. https://github.com/MDeiml/tree-sitter-markdown see issue #18
It's true that font-locking can be affected, however:
via: https://www.gnu.org/software/emacs/manual/html_node/elisp/Syntax-Tables.html So these lines (specifically the ;; tweaking syntax class of some chars for better behavior
(let ((pos (1+ bt1-start))
(stop (1- n-end)))
(while (< pos stop)
(when (not (memq (char-after pos)
;; XXX: might need to tweak this further
(list ?\s ?\n ?\r ?\t ?\f ?\v
?\( ?\) ?\[ ?\] ?{ ?})))
(put-text-property pos (1+ pos)
'syntax-table
(string-to-syntax "w"))) fiddle with stuff within strings in such a way that a variety of other commands (including motion ones) behave differently than otherwise.
Yeah, sorry for the lack of explanation -- it took me quite some time to wrap my head around this stuff (well, to the degree that I succeeded anyway). Anyway, may be this approach isn't needed. |
This uses the "inline" version of the markdown grammar. https://github.com/MDeiml/tree-sitter-markdown see issue #18
It may be. The first path I'm going to pursue is using a nested markdown grammar. That solves another outstanding todo I had for font-locking in docstrings. I might actually push that to main because it is working quite nicely. I haven't touched indentation with it yet. |
This uses the "inline" version of the markdown grammar. https://github.com/MDeiml/tree-sitter-markdown see issue #18
The auto-installation stuff makes me nervous. I'm glad it's been working so far. I've merged the nested markdown grammar to main. I'm leaving my bug branch for this commit open though. I still need to tackle indentation with it. Haven't worked with nested parser indentation yet. |
Yeah, that along with depending on multiple grammars that might change...perhaps motivation to consider alternative approaches? Regarding change in grammars, I wonder about the semver-ish idea mentioned here:
I wonder how much of the change in tree-sitter-clojure could be described in terms of 1 and 2. My impression (quite possibly mistaken) is that most of the change was somewhat in the neighborhood of 3. May be it depends a lot on "what stage" a grammar's development is at. IIUC, the author of the idea has extensive experience looking after the integration of a very large number of grammars in the context of an editor so it seems reasonable to suppose their observations informed the 3-point idea above. |
sogaiu ***@***.***> writes:
Yeah, that along with depending on multiple grammars that might change...perhaps motivation to consider alternative approaches?
There is some discussion in the mailing list now on how to handle this,
that's actually what prompted me to open the issue you linked below. I'm
not quite sure what the solution is other than to always install the
exact grammar version we want in a unique location, and always load
that. I'm not so sure people will like that though. Personally I enjoy
having all the grammars pre-installed from my OS packages.
Regarding change in grammars, I wonder about the semver-ish idea mentioned [here](tree-sitter/tree-sitter#2611 (comment)):
> 1. point updates are grammar fixes that don't affect queries at all;
> 2. minor updates add grammar features (nodes) that need query updates to make use of, but queries work fine with otherwise;
> 3. major updates remove/change grammar features (nodes) that break current queries.
I wonder how much of the change in tree-sitter-clojure could be described in terms of 1 and 2. My
impression (quite possibly mistaken) is that most of the change was somewhat in the neighborhood of
3. May be it depends a lot on "what stage" a grammar's development is at.
In my experience it is very hard to make a significant change that falls
into categories 1 and 2. Even adding nodes can be breaking in their own
way. They may not break queries (I think) but they can break things like
"node under point" and "parent of current node". That is kind of
something we saw with the transpose-sexp issue #17 that was opened
earlier. While that was caused by the transpose function changes, I
think it's easy to see how adding a new node could cause the it to
break.
IIUC, the author has extensive experience looking after the integration of a very large number of
grammars in the context of an editor so it seems reasonable to suppose their observations informed
the idea stated above.
Of course, I would defer to someone with more experience with
tree-sitter grammars in this regard. My own is fairly limited. I hope
that issue results in something a little more concrete than "every
grammar should use git tags with semvar".
…--
Danny Freeman
|
I think I read some of it [1] -- thanks for mentioning it.
I like relying on my OS packages for things that don't change much, but for things that end up as dependencies of things I'm working on, I find using OS packages to sometimes be a hindrance. (That's when not using NixOS or Guix though.) As tree-sitter itself hasn't reached a 1.0 status, as I understand it, there is some feeling among the maintainers that certain things can be changed. Since the grammars depend on tree-sitter in a way, I view them as more in flux. So yeah, if I were not doing tree-sitter-related work, surely I'd prefer to not be fiddling with different versions of grammars, but I wonder whether at this time how stable of an experience one can expect. Perhaps this also depends on the language in question. When I use Neovim, I notice rebuilding of grammars from time to time and I do occasionally run into issues where I need to intervene manually. I have not found the error messages to be enough on their own to figure out what to do, but often enough someone else has already asked so a relevant query can lead to a solution (^^; Not great for sure, but this may have something to do with my lack of experience and skill with things vim-ish...
I wonder if there are any issues / PRs at nvim-treesitter that demonstrate the feasibility of the 3-point idea...My imagination is not succeeding in seeing the idea working well. [1] Thread subject -> "Re: Update on tree-sitter structure navigation" |
sogaiu ***@***.***> writes:
As tree-sitter itself hasn't reached a [1.0
status](tree-sitter/tree-sitter#930), as I understand it, there is some
feeling among the maintainers that certain things can be changed. Since the grammars depend on
tree-sitter in a way, I view them as more in flux.
So yeah, if I were not doing tree-sitter-related work, surely I'd prefer to not be fiddling with
different versions of grammars, but I wonder whether at this time how stable of an experience one
can expect. Perhaps this also depends on the language in question.
I don't think people in our positions, writing grammars and tooling
based on them, can expect much stability. Users may not have the same
expectations though. I think with Emacs at least there is a general
understanding that the various *-ts-modes are still in an "experimental"
stage.
When I use Neovim, I notice rebuilding of grammars from time to time and I do occasionally run into
issues where I need to intervene manually. I have not found the error messages to be enough on their
own to figure out what to do, but often enough someone else has already asked so a relevant query
can lead to a solution (^^; Not great for sure, but this may have something to do with my lack of
experience and skill with things vim-ish...
I think it's going to be pretty general advice when people report
issues, to say "make sure you can reproduce your issue with this version
of the grammar first". In fact, that is problem something to include in
the repo's github issue template.
I also have this new ticket open
#19
to write some kind of interactive command to help diagnose grammar
issues.
> In my experience it is very hard to make a significant change that falls into categories 1 and 2.
> Even adding nodes can be breaking in their own way. They may not break queries (I think) but they
> can break things like "node under point" and "parent of current node". That is kind of something
> we saw with the transpose-sexp issue
> [#17](#17) that was opened earlier. While
> that was caused by the transpose function changes, I think it's easy to see how adding a new node
> could cause the it to break.
I wonder if there are any issues / PRs at nvim-treesitter that demonstrate the feasibility of the 3-point idea...My imagination is not succeeding in seeing the idea working well.
Demonstrating the feasibility of the semantic versioning guidelines? I
think my imagination is at a loss too. What might we expect to see? A
grammar successfully releasing a 0.0.X patch update and not breaking
anything? Then a 0.X.0 minor version update, adding nodes, and
everything keeps working? I'm afraid I'm at a loss as well.
…--
Danny Freeman
|
Something along those lines would be neat evidence, but I was thinking more along the lines of just seeing that changes in particular grammars that happened to fall into categories 1 and 2 (without the explicit version labelling). I have been wondering how often 1 or 2 has happened (slightly edited):
I suppose without some effort to arrange for those kinds of changes they are not likely to occur by chance. Was curious whether there was already some kind of "track record". I have my doubts about any scheme like this for at least the following reasons:
I found just getting the grammars I've worked on [1] to behave appropriately to be challenging enough... [1] ...and these were probably some of the simplest grammars even. |
Sure. What was tricky about Neovim was that I was seeing error messages in contexts where I wasn't expecting to see them at all. I would start Neovim and then suddenly see errors which looked vaguely like they would have something to do with tree-sitter but I wasn't sure (again, that may have a lot to do with my relative vim-ish ignorance). I haven't hit anything like that with Emacs + tree-sitter yet, but then my ignorance seems to be much less here.
Sounds like a plan :)
Do you know if any other *-ts-modes are doing anything similar? |
sogaiu ***@***.***> writes:
Sure. What was tricky about Neovim was that I was seeing error messages in contexts where I wasn't
expecting to see them at all. I would start Neovim and then suddenly see errors which looked vaguely
like they would have something to do with tree-sitter but I wasn't sure (again, that may have a lot
to do with my relative vim-ish ignorance).
I haven't hit anything like that with Emacs + tree-sitter yet, but then my ignorance seems to be much less here.
I have with other grammars (javascript for sure).
Most of the time indentation just doesn't work, or the entire buffer
does not get font-locked (because a query is now invalid).
> I also have this new ticket open [#19](#19) to write some kind of interactive command to help diagnose grammar
issues.
Do you know if any other *-ts-modes are doing anything similar?
I have not seen anything like it yet. I've also not seen any bug reports
about grammar issues yet so it's not super high priority right now. As
far as I know, any distros packaging tree-sitter-clojure currently use a
compatible version with clojure-ts-mode. Great for now, but we will
probably end up pushing a breaking change to the grammar at some point.
When that happens and the distro grammars inevitably lag behind I think
compatibility issues are likely to occur. I think the
"clojure-ts-grammar-doctor" will come in handy for users.
…--
Danny Freeman
|
Oh I meant that in Emacs it was easier to investigate and potentially fix things (as compared with Neovim) -- possibly because of the existence of Once I see an error message in Neovim, I might get an idea of what file and line to look at, but from that point, I'm in code I have no idea about with no practically economic and reliable way of observing execution like I can with edebug. Anyway, this is perhaps kind of tangential. Sorry about that.
Yeah, it might be a good way to collect some relevant info for reproducing an issue too? May be that's along the lines of what you already stated here:
|
See commit 177ac05
Bascically, as you type and hit treesit indent functionality tries to unindent everything back to 2 spaces, so it's hard to add nicer indentation inside docstrings.
The text was updated successfully, but these errors were encountered: