Testing Suites
Currently, flowR contains two testing suites: one for functionality and one for performance. We explain each of them in the following.

In addition to running those tests, you can use the more general `npm run checkup`. This also includes the construction of the Docker image, the generation of the wiki pages, and the linter.
Functionality Tests
The functionality tests represent conventional unit (and, depending on your terminology, component/API) tests. We use vitest as our testing framework. You can run the tests by issuing:
npm run test-full
However, depending on your local R version, your network connection, and potentially other factors, some tests may be skipped automatically as they don't apply to your current system setup (or can't be tested with the current prerequisites). Each test can specify such requirements as part of its `TestConfiguration`, which is then used in the `test.skipIf` function of vitest. It is up to the CI to run the tests on different systems and thereby ensure that every test is actually executed.
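To illustrate the idea, the following sketch shows how such a requirement check could feed a `test.skipIf` condition. The field names, the helper functions, and the environment shape are illustrative assumptions for this example, not flowR's actual API.

```typescript
// Hypothetical sketch of a TestConfiguration-style requirement check
// (all names here are illustrative, not flowR's actual definitions).
interface TestConfiguration {
	minRVersion?:            string;  // e.g. '4.0.0'
	needsNetworkConnection?: boolean;
}

// compare dotted version strings numerically (so '4.10.0' >= '4.2.1')
function versionAtLeast(actual: string, required: string): boolean {
	const a = actual.split('.').map(Number);
	const r = required.split('.').map(Number);
	for(let i = 0; i < Math.max(a.length, r.length); i++) {
		const diff = (a[i] ?? 0) - (r[i] ?? 0);
		if(diff !== 0) {
			return diff > 0;
		}
	}
	return true;
}

// decide whether a test must be skipped in the current environment;
// the result of such a predicate could be passed to vitest's `test.skipIf(...)`
function shouldSkip(cfg: TestConfiguration, env: { rVersion: string, online: boolean }): boolean {
	if(cfg.minRVersion !== undefined && !versionAtLeast(env.rVersion, cfg.minRVersion)) {
		return true;
	}
	if(cfg.needsNetworkConnection && !env.online) {
		return true;
	}
	return false;
}
```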
Test Structure
All functionality tests are to be located under test/functionality.
Writing a Test
Currently, this heavily depends on what you want to test (normalization, dataflow, quad-export, ...), and it is probably best to have a look at existing tests in that area to get an idea of which convenience functionality is available.
Generally, tests should be labeled according to the flowR capabilities they test. The set of currently supported capabilities and their IDs can be found in ./src/r-bridge/data/data.ts
. The resulting labels are used in the test report that is generated as part of the test output. They group tests by the capabilities they test and allow the report to display how many tests ensure that any given capability is properly supported.
Various helper functions are available to ease writing tests with common behaviors, such as testing for dataflow, slicing, or query results. These can be found in the `_helper` subdirectory.
For example, an existing test that tests the dataflow graph of a simple variable looks like this:
```ts
assertDataflow(label('simple variable', ['name-normal']),
	shell, 'x',
	emptyGraph().use('0', 'x')
);
```
When writing dataflow tests, additional settings can be used to reduce the amount of graph data that needs to be pre-written. Notably:

- `expectIsSubgraph` indicates that the expected graph is a subgraph, rather than the full graph that the test should generate. The test will then only check whether the supplied graph is contained in the result graph, rather than requiring an exact match.
- `resolveIdsAsCriterion` indicates that the ids given in the expected (sub)graph should be resolved as slicing criteria rather than actual ids. For example, passing `12@a` as an id in the expected (sub)graph will cause it to be resolved as the corresponding id.
The following example shows both in use.
```ts
assertDataflow(label('without distractors', [...OperatorDatabase['<-'].capabilities, 'numbers', 'name-normal', 'newlines', 'name-escaped']),
	shell, '`a` <- 2\na',
	emptyGraph()
		.use('2@a')
		.reads('2@a', '1@`a`'),
	{ expectIsSubgraph: true, resolveIdsAsCriterion: true }
);
```
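Slicing criteria of the shape `line@name` (like the `2@a` above) identify a node by its line number and variable name. As a rough illustration of that shape, here is a tiny parser for it; this is a simplified sketch for this page, not flowR's actual criterion resolver (which supports further forms as well).

```typescript
// Illustrative parser for slicing criteria of the form `<line>@<name>`,
// e.g. '12@a'; this sketch is not flowR's actual implementation.
interface Criterion {
	line: number;
	name: string;
}

function parseCriterion(criterion: string): Criterion {
	const at = criterion.indexOf('@');
	if(at <= 0) {
		throw new Error(`invalid criterion: ${criterion}`);
	}
	const line = Number(criterion.slice(0, at));
	if(!Number.isInteger(line) || line < 1) {
		throw new Error(`invalid line number in criterion: ${criterion}`);
	}
	// everything after the '@' is the (possibly escaped) name
	return { line, name: criterion.slice(at + 1) };
}
```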
Running Only Some Tests
To run only some tests, vitest allows you to filter tests. In addition, you can use the watch mode (with `npm run test`) to only run tests that are affected by your changes.
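For example, assuming the test script forwards its extra arguments to vitest, you could filter by file path or by test name (the patterns below are illustrative):

```shell
# run only test files whose path matches a pattern
# (the extra '--' forwards the argument through npm to vitest)
npm run test -- dataflow

# run only tests whose *name* matches a pattern, using vitest's -t flag
npm run test -- -t 'simple variable'
```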
"artificial" Benchmark Suite (commit c50b1d3)

All values are given as mean (standard deviation).

| Benchmark | Current | Previous | Ratio |
| --- | --- | --- | --- |
| Retrieve AST from R code | 245.4433267727273 ms (100.97557255775912) | 237.85305927272728 ms (97.36861369002281) | 1.03 |
| Normalize R AST | 17.92380840909091 ms (31.526905348222503) | 16.982624772727274 ms (30.42886266900597) | 1.06 |
| Produce dataflow information | 60.91719522727273 ms (126.41920544061759) | 60.41169277272727 ms (128.7371176899317) | 1.01 |
| Total per-file | 855.0886790454545 ms (1545.3067215845101) | 833.961438 ms (1514.7315556086162) | 1.03 |
| Static slicing | 2.1224975199953495 ms (1.1586635761483512) | 2.0461436166648226 ms (1.2405957027340997) | 1.04 |
| Reconstruct code | 0.24969024095374226 ms (0.1957585127565638) | 0.23572579664556767 ms (0.19160803373208626) | 1.06 |
| Total per-slice | 2.3865562701014373 ms (1.2375596561897722) | 2.2952539344461735 ms (1.3064191460121453) | 1.04 |
| failed to reconstruct/re-parse | 0 | 0 | 1 |
| times hit threshold | 0 | 0 | 1 |
| reduction (characters) | 0.7869360165281424 | 0.7869360165281424 | 1 |
| reduction (normalized tokens) | 0.7639690077689504 | 0.7639690077689504 | 1 |
| memory (df-graph) | 95.46617542613636 KiB (244.77619956879823) | 95.46617542613636 KiB (244.77619956879823) | 1 |
These reports are automatically generated by a workflow using github-action-benchmark.
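The Ratio column in these reports is simply the quotient of the two reported mean values (first value divided by second), rounded to two decimals. As a small sanity check, recomputing it for the "Retrieve AST from R code" numbers above:

```typescript
// recompute the ratio from the two reported means of the
// "Retrieve AST from R code" row (values copied from the report above)
const firstMeanMs  = 245.4433267727273;
const secondMeanMs = 237.85305927272728;

// quotient of the means, rounded to two decimal places
const ratio = Math.round((firstMeanMs / secondMeanMs) * 100) / 100;
```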
"social-science" Benchmark Suite (commit c50b1d3)

All values are given as mean (standard deviation).

| Benchmark | Current | Previous | Ratio |
| --- | --- | --- | --- |
| Retrieve AST from R code | 245.26885322 ms (46.042311577377724) | 254.70445752 ms (48.56635699718653) | 0.96 |
| Normalize R AST | 19.275219 ms (15.519069244596878) | 19.45440952 ms (14.953138748943163) | 0.99 |
| Produce dataflow information | 74.63655406000001 ms (72.32703486511011) | 75.30514048 ms (71.35653069164984) | 0.99 |
| Total per-file | 7768.1200948000005 ms (29057.697911117084) | 7850.45238692 ms (28841.253371136383) | 0.99 |
| Static slicing | 16.04143741063529 ms (44.39267620069801) | 16.172304981916042 ms (44.135225929438114) | 0.99 |
| Reconstruct code | 0.2803405855208846 ms (0.1573590448964055) | 0.34529443588845116 ms (0.1709415154465775) | 0.81 |
| Total per-slice | 16.330065101053936 ms (44.430783054404905) | 16.52688720676753 ms (44.1588873153231) | 0.99 |
| failed to reconstruct/re-parse | 0 | 0 | 1 |
| times hit threshold | 0 | 0 | 1 |
| reduction (characters) | 0.8712997340230448 | 0.8712997340230448 | 1 |
| reduction (normalized tokens) | 0.8102441553774778 | 0.8102441553774778 | 1 |
| memory (df-graph) | 99.4425 KiB (113.62933451202426) | 99.4425 KiB (113.62933451202426) | 1 |