tree-sitter-clojure has been tested using a variety of methods.
Note: Current serious testing is done via the code and instructions in the ts-clojure repository. The description below is left for historical purposes.
This document will touch on some of those methods and why they were attempted:
- Using corpus data from other tree-sitter-clojure attempts
- Using Clojure source from Clojars
- Generative testing via Hypothesis
Other employed methods that won't be covered (in much, if any, detail) here:
- Sporadic manual invocations
- Using tonsky's sublime-clojure test data
- Generative testing via test.check
- Manual inspection of the grammar
There were at least two previous attempts at implementing tree-sitter-clojure, one by oakmac and another by Tavistock. Important things were learned by trying to make these attempts work, but for reasons not covered here, a separate attempt was started.
Both earlier attempts had
corpus
data
that could be adapted for testing. Consequently,
tsclj-tests-parser was
created to extract the relevant data as plain
files.
These were in turn fed to tree-sitter's parse
command using the
tree-sitter-clojure grammar to check for parsing errors.
If changes are made to tree-sitter-clojure's grammar, this method can be used to quickly check for some forms of undesirable breakage. (This could be taken a bit further by adapting the content as corpus data for tree-sitter-clojure.)
One issue with this approach is that it relies on manually identifying and spelling out appropriate test cases, which in the case of Clojure, is complicated by the lack of a language specification.
Apart from detailed research, this was partially addressed by testing against a large sample of Clojure source code written by the community.
The most fruitful method of testing was working with Clojure source written by humans for purposes other than for testing tree-sitter-clojure.
Initially, repositories were cloned from a variety of locations, but before long a decision was made to switch to using "release" jars from Clojars.
The latter decision was motivated by wanting source that was less likely to be "broken" in various ways. Compared to "release" jar content from Clojars, the default branch of a repository seemed to have a higher probability of "not quite working". Although the Clojars "release" idea was an improvement, weeding out inappropriate Clojure source was still necessary.
A variety of approaches were used to come up with a specific list of jars from Clojars, but the most recent attempt is gen-clru-list. This is basically a babashka script that fetches Clojars' feed.clj, does some processing, and writes out a list of urls. For reference, this approach currently yields a number of urls in the neighborhood of 19,000.
The retrieved content was initially checked using
a-tsclj-checker (an
adaptation of
analyze-reify) which uses
Rust bindings for
tree-sitter
and tree-sitter-clojure to parse Clojure source code. Notably, it can
traverse directories and also operate on .jar
files.
Once an error is detected, it is easier to investigate if one has
direct access to the Clojure source file in question (as compared with
rummaging around .jar
files). Thus, it was decided to create a
single directory tree containing extracted data from all retrieved
jars. On a side note, the single directory tree took less than 2 GB
of disk space.
A less fancy, but easier to maintain (i.e. not written in Rust) tool --
ts-grammar-checker -- was
developed as an alternative to a-tsclj-checker
. Strictly speaking,
ts-grammar-checker
may not be necessary as one can probably employ
tree-sitter's parse
command in combination with find
, xargs
and the like
if on some kind of *nix. An example of a comparable invocation is:
find ~/src/clojars-cljish -type f -regex '.*\.clj[cs]?$' -print0 | xargs -0 tree-sitter parse --quiet > my-results.txt
a-tsclj-checker
is the fastest tool but it has not been updated to
the most recent version of tree-sitter-clojure. ts-grammar-checker
is not quite as fast, but it can be easily adapted to work with other
tree-sitter grammars (e.g. it's
used
for
tree-sitter-janet-simple
as well). However, it does not support accessing content within
.jar
files.
Across somewhat less than 150,000 files (.clj, .cljc, .cljs),
a-tsclj-checker
typically takes a little less than 30 seconds, while
ts-grammar-checker
typically takes a bit more than 100 seconds (at
least on the author's machine). In subjective terms, it hasn't felt
terribly different because knowing there is at least a 30 second wait,
one typically doesn't sit waiting at a prompt for execution
completion.
For any files that parse with errors, it can be handy to apply
clj-kondo. The specific
details that clj-kondo
reported were often helpful when examining
individual files, but that diagnostic information also provided a way
to partition the files into groups. Subjectively it can feel more
manageable to deal with 5 groups of files compared with 100 separate
files (though it's true that the grouping does not always turn out to
be that meaningful).
An individual "suspect" file is typically viewed manually in an editor
(usually one that has clj-kondo
support enabled) and examined for
"issues".
In practice, testing the grammar against appropriate Clojure source from Clojars has been the most useful in finding issues with the grammar. The lack of a specification for Clojure increased the difficulty of creating an appropriate grammar, but having a large sample of code to test against helped to mitigate this a bit. On more than one occasion some version of the grammar failed to parse some legitimate Clojure source and subsequent investigation revealed that the grammar had not accounted for an uncommom and/or unanticipated usage.
This method has a significant weakness as there could be cases where tree-sitter would parse successfully but the result could be inappropriate. For example, if the grammar definition was faulty, something which should be parsed as a symbol might end up parsed as a number with no error reported.
To partially address this issue, generative / property-based testing was attempted.
Initially, some effort was made to use
test.check.
However, an outstanding issue with
test.check
(aka TCHECK-112) seemed very likely to be relevant for the types of
tests being considered. Also, the approach used
libpython-clj to call
tree-sitter via Python bindings for
tree-sitter. Although
invoking tree-sitter via Python worked, it was awkward to connect this
with test.check
. For the above reasons, the test.check
+
libpython-clj
approach (neat as it was) was abandoned.
Interestingly, Python's Hypothesis doesn't suffer from test.check's "long-standing Hard Problem" so that was given a try. prop-test-ts-clj and hypothesis-grammar-clojure are the resulting bits.
At least one issue was discovered and it also turned out that parcera was affected.
The code was also adapted a bit to test Calva. Some issues were discovered and reported upstream.
A drawback of this approach is that details of the tree-sitter-clojure grammar became embedded in the tests. One consequence is that if tree-sitter-clojure's grammar changes, then the tests may need to be updated to reflect changes in the grammar (if there is an intent to continue to use them).
tree-sitter-clojure has been tested in a variety ways attempting to address various real-world constraints (e.g. lack of a language specification, limitations of tree-sitter's approach for a language with extensible syntax, etc.). AFAICT, for what it sets out to do, it seems to work pretty well so far.