StoryCheck for Web3 apps based on Ethereum. Experimental app testing playground as well as an API served via Gradio on port 7860.
It takes as input markdown-formatted user stories with steps written in natural language. It then parses the text and executes the steps in a virtual web browser (via Playwright), closely emulating the actions of a real user. It uses RefExp GPT to predict UI element coordinates given a referring expression.
Note: StoryCheck is currently most reliable for testing the UI of smartphones and tablets.
# Creating a new DAO LLC via SporosDAO.xyz on Goerli Test Net
## Prerequisites
- Chain
- Id 5
- Block 8856964
## User Steps
1. Browse to https://app.sporosdao.xyz/
1. Click on Create a new company button
1. Click on Go! right arrow button left of Available
1. Select On-chain name text field
1. Type Test DAO
1. Select Token Symbol text field
1. Type TDO
1. Click on Continue button in the bottom right corner
1. Select Address text field above Enter the wallet address
1. Type 0x5389199D5168174FA177908685FbD52A7138Ed1a
1. Select text field below Initial Tokens
1. Type 1200
1. Select text field under Email
1. Type [email protected]
1. Scroll down
1. Click on Continue button
1. Click on Continue button at the top
1. Scroll up
1. Click on the checkbox left of Agree
1. Scroll down
1. Click on Continue button
1. Scroll up
1. Click on Deploy Now button
1. Press Tab
1. Press Tab
1. Press Enter
1. Press Home
## Expected Results
- Wallet transactions match snapshot
The prerequisites section sets conditions that let the test execute from a deterministic blockchain state, which in turn allows for predictable results. The only prerequisite currently supported is `Chain` at the top level, with `Id` as a required parameter and `Block` and `RPC` as optional parameters. These parameters are passed to `anvil` to create a local EVM fork for the test run.
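The mapping from prerequisites to an anvil invocation can be sketched roughly as follows. This is an illustration only, not StoryCheck's actual code: the function name `build_anvil_command` and the `default_rpc` argument are hypothetical, and the sketch assumes anvil's `--chain-id`, `--fork-url`, and `--fork-block-number` options. The caller supplies the fallback RPC URL because the defaults StoryCheck uses per chain are not listed here.

```python
from typing import List, Optional


def build_anvil_command(chain_id: int,
                        default_rpc: str,
                        block: Optional[int] = None,
                        rpc: Optional[str] = None) -> List[str]:
    """Translate story prerequisites into an anvil argument list (hypothetical helper)."""
    # Fork from the RPC given in the story, or fall back to the caller's default.
    cmd = ["anvil", "--chain-id", str(chain_id), "--fork-url", rpc or default_rpc]
    if block is not None:
        # Pin the fork to a block so every run starts from the same state.
        cmd += ["--fork-block-number", str(block)]
    return cmd
```

For instance, `build_anvil_command(5, "https://rpc.example", block=8856964)` would produce an argument list that forks chain 5 at block 8856964; `https://rpc.example` is a placeholder, not a real endpoint.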
By default each test starts with 10,000 ETH in the mock user wallet (the same as anvil's default test accounts).
To fund the mock wallet with other tokens (e.g. USDC, DAI, NFTs), the User Steps section of the story file should begin with prompts that initiate the funding via front end interactions (e.g. Uniswap flow for ETH/USDC swap).
Often Web3 Apps use front end libraries such as wagmi.sh to access current chain state. When that is the case, the user story should include the exact RPC URL used by the front end as a prerequisite. That allows StoryCheck to intercept all calls directed to the RPC and reroute towards the local blockchain fork. This is important to ensure that the app reads and writes from/to the local chain fork.
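One way such rerouting could work with Playwright's request interception is sketched below. How StoryCheck actually intercepts RPC calls is not described here, so treat this as an assumption-laden illustration: `make_rpc_handler` and `forward_json_rpc` are hypothetical names, and the handler answers each intercepted JSON-RPC POST with the response from the local fork instead of letting it reach the remote RPC.

```python
import urllib.request
from typing import Callable


def forward_json_rpc(local_url: str, payload: str) -> str:
    """POST a JSON-RPC payload to the local fork and return the raw response body."""
    request = urllib.request.Request(
        local_url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def make_rpc_handler(local_url: str,
                     forward: Callable[[str, str], str] = forward_json_rpc):
    """Build a Playwright route handler that answers RPC calls from the local fork."""
    def handler(route) -> None:
        if route.request.method != "POST":
            # Let non-RPC traffic (e.g. CORS preflight) pass through untouched.
            route.continue_()
            return
        body = forward(local_url, route.request.post_data or "")
        route.fulfill(status=200, content_type="application/json", body=body)
    return handler


# Assumed registration on a Playwright page, with anvil on its default address:
# page.route(story_rpc_url, make_rpc_handler("http://127.0.0.1:8545"))
```

The sketch fulfills the request itself rather than rewriting the URL, because the front end typically calls an `https` endpoint while a local fork is usually served over plain `http`.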
The following example sets up a local fork of ETH Mainnet starting from the latest block using a default RPC.
## Prerequisites
- Chain
- Id 1
The following example sets up a local fork of Goerli Testnet starting from the given block number and using a given RPC URL.
## Prerequisites
- Chain
- Id 5
- Block 8856964
- RPC https://eth-goerli.g.alchemy.com/v2/3HpUm27w8PfGlJzZa4jxnxSYs9vQNMMM
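To make the format concrete, the following sketch reads the parameters out of a Prerequisites section like the ones above. It is a simplified stand-in, not the project's markdown parser: `parse_prerequisites` is a hypothetical name, and the sketch assumes each parameter is a list item of the form `- Key value` under a `## Prerequisites` heading.

```python
from typing import Dict


def parse_prerequisites(markdown: str) -> Dict[str, str]:
    """Collect `Key value` list items found under the Prerequisites heading."""
    params: Dict[str, str] = {}
    in_section = False
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # Any heading starts a new section; track whether it is Prerequisites.
            in_section = line.lstrip("#").strip().lower() == "prerequisites"
            continue
        if not in_section or not line.startswith("- "):
            continue
        key, _, value = line[2:].strip().partition(" ")
        if value:
            # "- Chain" carries no value; it only groups the parameters below it.
            params[key] = value.strip()
    return params
```

Applied to the Goerli example, this would yield the `Id`, `Block`, and `RPC` values as strings, leaving type conversion to the caller.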
The format of user steps in this section resembles the HOWTO documentation of a web3 app. Teams may use the same markdown in their documentation (e.g. GitBook, Notion, Docusaurus) and execute it with StoryCheck to make sure that the latest web app behavior is in sync with the docs.
Each step in a user story is classified as an action prompt from the following set:

- `Browse` - prompts that start with `browse` and include a URL link to a web page are interpreted as browser navigation actions. For example, `browse to https://app.uniswap.org`. For implementation details, see Playwright goto.
- `Click` - prompts that start with `click`, `tap`, or `select` followed by a natural language referring expression of a UI element are interpreted as click actions with the corresponding UI element target. For example, `click on Submit button at the bottom` or `select logo next to ETH option`. For implementation details, see Playwright mouse click and RefExp GPT.
- `Type` - prompts that start with the keyword `type`, `input`, or `enter` (case insensitive) followed by a string are interpreted as a keyboard input action. For example, `Type 1000` or `Type MyNewDAO`. For implementation details, see Playwright type.
- `Scroll` - prompts that start with `scroll` followed by `up` or `down` are interpreted respectively as `Press PageUp` and `Press PageDown`.
- `Press` - prompts that start with `press` followed by a keyboard key code (`F1` - `F12`, `Digit0` - `Digit9`, `KeyA` - `KeyZ`, `Backquote`, `Minus`, `Equal`, `Backslash`, `Backspace`, `Tab`, `Delete`, `Escape`, `ArrowDown`, `End`, `Enter`, `Home`, `Insert`, `PageDown`, `PageUp`, `ArrowRight`, `ArrowUp`) are interpreted as a single key press action. For further details, see Playwright press.
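The classification rules above can be approximated with a small keyword dispatcher. This is a sketch rather than the project's interpreter: `classify_step` is a hypothetical name, it matches on the first word only, and it assumes that `scroll up` maps to `PageUp` and `scroll down` to `PageDown`.

```python
import re
from typing import Tuple

_URL = re.compile(r"https?://\S+")
_SCROLL_KEYS = {"up": "PageUp", "down": "PageDown"}


def classify_step(prompt: str) -> Tuple[str, str]:
    """Return (action, argument) for one user step, or raise ValueError."""
    keyword, _, rest = prompt.strip().partition(" ")
    keyword = keyword.lower()  # keywords are matched case-insensitively
    rest = rest.strip()
    if keyword == "browse":
        match = _URL.search(rest)
        if match:
            return ("browse", match.group(0))
    elif keyword in ("click", "tap", "select") and rest:
        # The remainder is the referring expression handed to the AI model.
        return ("click", rest)
    elif keyword in ("type", "input", "enter") and rest:
        return ("type", rest)
    elif keyword == "scroll" and rest.lower() in _SCROLL_KEYS:
        # Scrolling is expressed as a page key press.
        return ("press", _SCROLL_KEYS[rest.lower()])
    elif keyword == "press" and rest:
        return ("press", rest)
    raise ValueError(f"unrecognized step: {prompt!r}")
```

A step such as `Hover over logo` would raise `ValueError` in this sketch, since it matches none of the supported action keywords.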
The Expected Results section currently implements a default transaction snapshot check similar to Jest snapshot matching.
The first time a test is run, all write transactions going through `window.ethereum` are recorded and saved. Subsequent runs must match these write transactions. If there is a mismatch, then one of three changes took place in the UI under test:
- Developers changed the frontend code in a significant way. This warrants a careful code review and update of the user stories.
- There is malicious injected code that changes the behavior of the app. A big red alert is in order! App infrastructure is compromised: hosting providers, third party libraries, or build tools.
- There is a bug in some of the third party dependencies that affects UI behavior. Developer attention required to track down and fix the root cause.
Wallet transaction snapshots are saved to a file with the `.snapshot.json` extension in the same directory as the story markdown file.
├─ astory.md
├─ astory.snapshot.json
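The record-then-compare behavior can be sketched as follows. The exact matching rules StoryCheck applies are not described here, so this sketch assumes plain equality on the parsed JSON; `snapshot_path` and `check_snapshot` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Any, List


def snapshot_path(story_path: Path) -> Path:
    """Place the snapshot next to the story, e.g. astory.md -> astory.snapshot.json."""
    return story_path.with_suffix(".snapshot.json")


def check_snapshot(story_path: Path, transactions: List[Any]) -> bool:
    """Record transactions on the first run; compare against the record afterwards."""
    path = snapshot_path(story_path)
    if not path.exists():
        # First run: there is nothing to compare against, so save and pass.
        path.write_text(json.dumps(transactions, indent=2, sort_keys=True))
        return True
    # Later runs: any difference from the saved transactions is a mismatch.
    return json.loads(path.read_text()) == transactions
```

As with Jest snapshots, deleting the snapshot file would cause the next run to record a fresh baseline.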
flowchart TD
A[User Story] -->|check| B(StoryCheck)
B --> |parse| C[Markdown Parser]
B -->|play| D[Browser Driver / playwright]
D -->|locate UI element| E[AI Model]
D -->|sign tx| F[Mock Wallet / EIP1193Bridge]
F -->|blockchain tx| G[Local EVM Fork / anvil]
├─ .\ — "Main StoryCheck python app."
│ │
│ ├─ markdown — "Markdown parser. Outputs abstract syntax tree (AST) to interpreter."
│ │
│ ├──┬─ interpreter — "Runtime engine which takes AST as input and executes it."
│ │ │
│ ├──┼──┬─ browser — "Playwright browser driver."
│ │ │ │
│ │ │ └─ mock_wallet — "JavaScript mock wallet provider injected in playwright page context as Metamask."
│ │ │
│ │ ├─ ai — "RefExp GPT AI model that predicts UI element location based on natural language referring expressions."
│ │ │
│ │ └─ blockchain — "Local EVM fork runtime via Foundry Anvil."
│ │
│ └─ examples — "Example user stories."
This project is pre-configured to build and run via Gitpod.
To run locally or in another dev environment, copy the steps from `.gitpod.yml`.
StoryCheck can be run as a shell command or as a web service.
$>./storycheck.sh --help
usage: StoryCheck by GuardianUI [-h] [-o OUTPUT_DIR] [--serve] storypath
Parses and executes user stories written in markdown format.
positional arguments:
storypath Path to the user story input markdown file (e.g. mystory.md).
options:
-h, --help show this help message and exit
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
Directory where all results from the storycheck run will be stored. Defaults to "results"
--serve Run as a web service. Defaults to "False".
Copyright(c) guardianui.com 2023
For example to run a check of mystory.md, use:
./storycheck.sh mystory.md
If all story checks pass, the command returns exit code `0`. Otherwise, if any test fails or another error occurs, the exit code is non-zero.
This makes it possible to use StoryCheck in shell scripts or CI scripts.
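As an illustration of relying on the exit code, the sketch below runs several stories from Python and collects the ones that fail. The helper name `run_stories` is hypothetical; the only behavior taken from this document is that `./storycheck.sh <story>` exits with `0` on success and non-zero otherwise.

```python
import subprocess
import sys
from typing import List, Sequence


def run_stories(story_paths: Sequence[str],
                runner: Sequence[str] = ("./storycheck.sh",)) -> List[str]:
    """Run each story through the runner and return the paths that failed."""
    failed: List[str] = []
    for path in story_paths:
        result = subprocess.run([*runner, path])
        if result.returncode != 0:
            # Non-zero exit code: a check failed or another error occurred.
            print(f"FAILED: {path}", file=sys.stderr)
            failed.append(path)
    return failed
```

A CI wrapper could then call `sys.exit(1 if run_stories(["mystory.md"]) else 0)` to propagate the failure.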
StoryCheck can be used as a test step in CI scripts. Here is an example GitHub Action which sets up a StoryCheck environment and runs checks. If the StoryCheck step fails, the CI script fails as well.
The output directory of a test run is either specified via the `--output-dir` command line argument or defaults to `./results`. It contains a number of helpful artifacts for debugging:
├─ ./results — "Main output directory for an input story file."
│ │
│ ├─ storycheck.log — "Consolidated log file between test runner, browser and EVM."
│ │
│ ├─ tx_log_snapshot.json — "Snapshot of all blockchain write transactions."
│ │
│ ├─ videos/ — "Video recordings of browser interactions."
│ │
│ ├─ screenshots/ — "Browser screenshot for every prompt in the User Steps section."
│ │
│ ├─ anvil-out.json — "Configuration for the anvil EVM fork."
│ │
│ ├─ trace.zip — "Session trace for Playwright Trace Viewer."
│ │
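When collecting these artifacts in a script (for example, to upload them from CI), a quick presence check can help. The sketch below only encodes the file and directory names listed above; `missing_artifacts` is a hypothetical helper, and whether every artifact is produced on every run is an assumption.

```python
from pathlib import Path
from typing import List

# Names taken from the output directory listing above.
EXPECTED_ARTIFACTS = [
    "storycheck.log",
    "tx_log_snapshot.json",
    "videos",
    "screenshots",
    "anvil-out.json",
    "trace.zip",
]


def missing_artifacts(results_dir: str = "results") -> List[str]:
    """Return the expected artifact names that are absent from results_dir."""
    root = Path(results_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).exists()]
```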
Thanks for your interest in contributing!
Please start with a new discussion before opening an Issue or Pull Request.