This project is a collection of notes and examples to help me understand WASI and WASM. WASI is an API which provides access to operating-system-like functionality such as files and the filesystem, Berkeley sockets, clocks, and random numbers. At the same time it has a capability-based security model for this I/O (it extends WebAssembly's sandbox model).
$ curl https://wasmtime.dev/install.sh -sSf | bash
$ source ~/.bashrc
$ wasmtime --version
wasmtime-cli 0.40.0
The interface of WASI is defined in .witx files. The .witx files for wasi_snapshot_preview1 can be used by a tool like wasmtime's wiggle tool to generate traits for Rust, which can then be implemented. Other tools can generate similar constructs for other languages.
The example fd_write.wat shows the usage of the fd_write system call. The input to fd_write is shown below:
__wasi_fd_write(__wasi_fd_t fd, const __wasi_ciovec_t *iovs, size_t iovs_len)
__wasi_fd_t is just defined as:
typedef uint32_t uvwasi_fd_t;
And __wasi_ciovec_t as:
typedef struct uvwasi_ciovec_s {
const void* buf;
size_t buf_len;
} uvwasi_ciovec_t;
So we can see that we have a pointer to a buffer and a length.
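To make this concrete, here is a minimal C sketch of what a direct fd_write call looks like. It assumes the wasi-libc generated header wasi/api.h and its declaration of __wasi_fd_write, which also takes an out-pointer for the number of bytes written (not shown in the snippet above); treat the exact identifiers as assumptions based on wasi-libc.
#include <wasi/api.h>
#include <string.h>

int main(void) {
  const char *msg = "hello from fd_write\n";

  /* one ciovec entry: a pointer to the buffer and its length */
  __wasi_ciovec_t iov = {
    .buf = (const uint8_t *)msg,
    .buf_len = strlen(msg),
  };

  __wasi_size_t nwritten = 0;
  /* fd 1 is stdout; the byte count comes back through the out-pointer
     (assumed signature, see the note above) */
  __wasi_errno_t err = __wasi_fd_write(1, &iov, 1, &nwritten);
  return err == __WASI_ERRNO_SUCCESS ? 0 : 1;
}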
We make a .wasm module for this using:
$ make out/fd_write.wasm
wat2wasm -v -o out/fd_write.wasm src/fd_write.wat --debug-names
0000000: 0061 736d ; WASM_BINARY_MAGIC
0000004: 0100 0000 ; WASM_BINARY_VERSION
; section "Type" (1)
...
And then we can run this using a wasm runtime like wasmtime:
$ ~/work/wasm/wasmtime/target/debug/wasmtime out/fd_write.wasm
something
This section is an attempt to answer a question I asked myself while working on a uvwasi issue. The fix in this case was correcting a return value from uvwasi_fd_readdir. I read the docs and figured out that something was calling this function, and depending on the value returned would try to read more entries (or not if it was done). But which function was calling this uvwasi function was not clear to me.
Let's take a look at compiling a C program that uses readdir (man 3 readdir).
The program is very simple and prints out all the files
(directory entries) from the src
directory:
$ make out/readdir
$ ./out/readdir
readdir example
d_name: multivalue.wat
...
This is a dynamically linked executable:
$ file out/readdir
out/readdir: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e5ad67a7cb11843781c8b42bd281898dcef7a8b1, for GNU/Linux 3.2.0, stripped
And we can inspect the dynamic symbols using:
$ nm -D out/readdir
U closedir@GLIBC_2.2.5
U fwrite@GLIBC_2.2.5
w __gmon_start__
U __libc_start_main@GLIBC_2.34
U opendir@GLIBC_2.2.5
U puts@GLIBC_2.2.5
U readdir@GLIBC_2.2.5
0000000000404060 B stderr@@GLIBC_2.2.5
$ ldd out/readdir
linux-vdso.so.1 (0x00007ffc02fef000)
libc.so.6 => /usr/lib64/libc.so.6 (0x00007fbfeb9ed000)
/lib64/ld-linux-x86-64.so.2 (0x00007fbfebc16000)
A wasm module is more like a statically linked executable:
$ make out/readdir_s
cc -O0 -g -s -static -o out/readdir_s src/readdir.c
$ file out/readdir_s
out/readdir_s: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=50e9b76eea75c197d1a417c3946b462108ead1b6, for GNU/Linux 3.2.0, stripped, too many notes (256)
$ ls -hgG out/readdir_s out/readdir
-rwxrwxr-x. 1 21K Sep 7 11:03 out/readdir
-rwxrwxr-x. 1 1.5M Sep 7 11:10 out/readdir_s
If we use objdump on readdir_s we won't see anything named readdir. But we can take a look at the linker map which is generated for this target, named readdir_s.map:
.text 0x0000000000444110 0xf0 /usr/lib/gcc/x86_64-redhat-linux/11/../../../../lib64/libc.a(readdir64.o)
0x0000000000444110 readdir
0x0000000000444110 __readdir64
0x0000000000444110 __readdir
0x0000000000444110 readdir64
And if we disassemble readdir_s:
$ objdump -d out/readdir_s
...
4016c9: e8 42 2a 04 00 call 0x444110
Now, let's compile the same C program into wasm:
$ make out/readdir.wasm
This target uses wasi-sdk to compile readdir.c into a wasm module. So let's take a closer look at the generated module:
$ wasm-objdump -x out/readdir.wasm
readdir.wasm: file format wasm 0x1
Section Details:
Type[16]:
...
- type[4] (i32, i32, i32, i64, i32) -> i32
Import[11]:
- func[0] sig=2 <wasi_snapshot_preview1.args_get> <- wasi_snapshot_preview1.args_get
- func[1] sig=2 <wasi_snapshot_preview1.args_sizes_get> <- wasi_snapshot_preview1.args_sizes_get
- func[2] sig=3 <wasi_snapshot_preview1.fd_close> <- wasi_snapshot_preview1.fd_close
- func[3] sig=2 <wasi_snapshot_preview1.fd_fdstat_get> <- wasi_snapshot_preview1.fd_fdstat_get
- func[4] sig=2 <wasi_snapshot_preview1.fd_prestat_get> <- wasi_snapshot_preview1.fd_prestat_get
- func[5] sig=0 <wasi_snapshot_preview1.fd_prestat_dir_name> <- wasi_snapshot_preview1.fd_prestat_dir_name
- func[6] sig=4 <wasi_snapshot_preview1.fd_readdir> <- wasi_snapshot_preview1.fd_readdir
- func[7] sig=5 <wasi_snapshot_preview1.fd_seek> <- wasi_snapshot_preview1.fd_seek
- func[8] sig=6 <wasi_snapshot_preview1.fd_write> <- wasi_snapshot_preview1.fd_write
- func[9] sig=7 <wasi_snapshot_preview1.path_open> <- wasi_snapshot_preview1.path_open
- func[10] sig=8 <wasi_snapshot_preview1.proc_exit> <- wasi_snapshot_preview1.proc_exit
By looking at the above we can see that there are a number of imports that need to be provided by the host environment, via the importObject. We can also see that function 6, which has signature 4, is wasi_snapshot_preview1.fd_readdir; this is the uvwasi function.
In Node.js we can see that fd_readdir is mapped to the wasiImport object and this function will call uvwasi_fd_readdir.
WASI {
wasiImport: WASI {
args_get: [Function: bound args_get],
...
fd_read: [Function: bound fd_read],
fd_readdir: [Function: bound fd_readdir],
...
When we compile the C program into a wasm module we do so with a compiler that supports WASI; in our case we are using clang from wasi-sdk, which uses wasi-libc. If we think about this it makes sense that we have different libc libraries for different environments. The make target that compiles readdir.c into an executable uses the GNU libc library, and the target that compiles into a wasm module will use wasi-libc.
So, the call to readdir, which we know is declared in dirent.h, is not defined in our code but is instead resolved by the linker during the compilation process.
So, let's start by finding out where wasi_snapshot_preview1.fd_readdir is declared. It can be found in libc-bottom-half/sources/__wasilibc_real.c:
int32_t __imported_wasi_snapshot_preview1_fd_readdir(int32_t arg0, int32_t arg1, int32_t arg2, int64_t arg3, int32_t arg4) __attribute__((
__import_module__("wasi_snapshot_preview1"),
__import_name__("fd_readdir")
));
This specifies that this function will be defined as the symbol named fd_readdir inside of the WebAssembly linking environment, using import-name.
Let's take another look at the wasm module:
$ wasm-objdump -x out/readdir.wasm
Type[17]:
- type[0] (i32, i32, i32) -> i32
- type[1] (i32, i64, i32) -> i64
- type[2] (i32, i32) -> i32
- type[3] (i32) -> i32
- type[4] (i32, i32, i32, i64, i32) -> i32
...
Import[11]:
...
- func[6] sig=4 <wasi_snapshot_preview1.fd_readdir> <- wasi_snapshot_preview1.fd_readdir
So, the mistake I'd been making was expecting to find a call to readdir somewhere in the disassembled wasm, but there is no guarantee that the name of the function will be preserved, so we have to figure out how this works. This can be done using:
$ wasm-objdump -d out/readdir.wasm > readdir.dis
If we search for where function 6 is called we find that happens in function 32:
003481 func[32]:
003482: 20 00 | local.get 0
003484: 20 01 | local.get 1
003486: 20 02 | local.get 2
003488: 20 03 | local.get 3
00348a: 20 04 | local.get 4
00348c: 10 86 80 80 80 00 | call 6 <wasi_snapshot_preview1.fd_readdir>
003492: 41 ff ff 03 | i32.const 65535
003496: 71 | i32.and
003497: 0b | end
Function 32 uses signature index 4, so it takes 5 parameters and returns an i32. This sounds familiar; it looks like func 32 is fd_readdir:
fd_readdir(fd: fd, buf: Pointer<u8>, buf_len: size, cookie: dircookie) -> Result<size, errno>
In our case the Result size is actually returned through one of the parameters passed in.
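The declaration in wasi-libc's generated api.h illustrates this (treat the exact parameter names as assumptions): the errno becomes the C return value and the size result is written through the final out-pointer, which matches signature 4 above, (i32, i32, i32, i64, i32) -> i32.
__wasi_errno_t __wasi_fd_readdir(__wasi_fd_t fd,
                                 uint8_t *buf,
                                 __wasi_size_t buf_len,
                                 __wasi_dircookie_t cookie,
                                 __wasi_size_t *retptr0);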
If we search for where function 32 is called (search for 'call 32') we find that it gets called in functions 25 and 46. Let's start with function 25:
(func (;25;) (type 3) (param i32) (result i32) ;; one parameter (DIR* dirp)
(local i32 i32 i32 i32 i32 i32 i32 i64 i64) ;; 9 local variables in this function
local.get 0 ;; local 0 is the first parameter so this is DIR* dirp.
i32.const 28 ;; this is placing 28 onto the stack
i32.add ;; this adds 28 to the DIR pointer, which gives &dirp->buffer_used (see struct _DIR further down)
local.set 1 ;; store this value in local variable 1
local.get 0 ;; load local 0 again (it was popped off by i32.add)
i32.load offset=20 ;; loads a 32-bit integer from memory, starting from the topmost value on the stack, which is DIR* dirp, and offset 20
;; and if we again look at struct _DIR this looks like it will be the buffer_processed member of DIR.
local.set 2 ;; The above value (buffer_processed is stored in local variable 2
;; local 0 = DIR* dirp
;; local 1 = buffer_used
;; local 2 = buffer_processed
loop (result i32) ;; label = @1, this is the for (;;) loop.
block ;; label = @2
block ;; label = @3
block ;; label = @4
block ;; label = @5
block ;; label = @6
block ;; label = @7
local.get 1 ;; push buffer_used onto the stack
i32.load ;; load a 32-bit integer starting from the topmost value on the stack
local.tee 3 ;; store this value in local variable 3 (buffer_used value), and also leave a copy on the stack
local.get 2 ;; push buffer_processed onto the stack
i32.sub ;; buffer_used - buffer_processed
local.tee 4 ;; store the result in local variable 4 (buffer_left)
i32.const 23 ;; 23 is sizeof(__wasi_dirent_t) - 1 (the members add up to 21 bytes but the struct is padded to 24, see the note further down)
i32.gt_u ;; unsigned greater-than: pushes 1 if buffer_left > 23 (i.e. buffer_left >= sizeof(__wasi_dirent_t)), 0 otherwise
br_if 0 (;@7;) ;; if greater, branch to label @7 and skip the end-of-buffer handling below
local.get 3 ;; push local variable 3 (buffer_used value) onto the stack
local.get 0 ;; push local variable 0 (DIR* dirp)
i32.load offset=24 ;; load from memory starting at the topmost value on the stack, using offset 24 (buffer_size)
local.tee 4 ;; store the result in local variable 4 and also leave a copy on the stack
i32.ge_u ;; checks if buffer_used >= buffer_size
br_if 1 (;@6;) ;; if so, branch to label @6 (more entries can be read)
i32.const 0 ;; load const 0 onto the stack (NULL)
return ;; return topmost value (NULL);
end
Function 25's signature can be found at index 3, which takes a single i32 and returns an i32. I'm thinking that these i32 values are pointers, which would match up with the signature of readdir:
struct dirent* readdir(DIR* dirp);
If we look at wasi-sdk it contains wasi-libc as a git submodule in src/wasi-libc. There is a top half of this libc implementation which is based on musl, and in wasi-sdk/src/wasi-libc/libc-bottom-half/cloudlibc/src/libc/dirent/readdir.c we can find the implementation of readdir:
struct dirent *readdir(DIR *dirp) {
for (;;) {
size_t buffer_left = dirp->buffer_used - dirp->buffer_processed;
if (buffer_left < sizeof(__wasi_dirent_t)) {
// End-of-file.
if (dirp->buffer_used < dirp->buffer_size)
return NULL;
goto read_entries;
}
...
read_entries:
// Discard data currently stored in the input buffer.
dirp->buffer_used = dirp->buffer_processed = dirp->buffer_size;
// Load more directory entries and continue.
__wasi_errno_t error = __wasi_fd_readdir(dirp->fd,
(uint8_t *)dirp->buffer,
dirp->buffer_size,
dirp->cookie,
&dirp->buffer_used);
if (error != 0) {
errno = error;
return NULL;
}
dirp->buffer_processed = 0;
}
}
If we compare this to the wat (WebAssembly Text format) above we can see that they match up. Notice the first if statement in the for loop and the second if statement, which checks if buffer_used is less than buffer_size. If buffer_used is the same as buffer_size this indicates that there are more items to be read and the code will jump/goto read_entries.
So, to answer my original question, it is the above function, which is part of the wasi-libc library, that is compiled into the wasm module. This is much like a "normal" C program that is statically linked with a libc implementation, which would include the readdir function as well.
The following are the structs and typedefs (with size/offset comments) referred to in the wat comments above:
#define DIRENT_DEFAULT_BUFFER_SIZE 4096
struct _DIR {
// Directory file descriptor and cookie.
int fd; // 0 - 4
__wasi_dircookie_t cookie; // 4 ->12
// Read buffer.
char *buffer; // 12 ->20
size_t buffer_processed; // 20 ->24
size_t buffer_size; // 24 ->28
size_t buffer_used; // 28
// Object returned by readdir().
struct dirent *dirent;
size_t dirent_size;
};
typedef struct __wasi_dirent_t {
__wasi_dircookie_t d_next;
__wasi_inode_t d_ino;
__wasi_dirnamlen_t d_namlen;
__wasi_filetype_t d_type;
} __wasi_dirent_t;
typedef uint64_t __wasi_dircookie_t; // size: 8 bytes
typedef uint64_t __wasi_inode_t; // size: 8 bytes
typedef uint32_t __wasi_dirnamlen_t; // size: 4 bytes
typedef uint8_t __wasi_filetype_t; // size: 1 bytes
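A small, hedged check of the size confusion mentioned in the wat comments above: the members add up to 21 bytes, but the 64-bit fields force 8-byte alignment, so sizeof(__wasi_dirent_t) is 24, and the compiler turns the negation of buffer_left < sizeof(__wasi_dirent_t) into the equivalent buffer_left > 23 test. The sketch below just mirrors the typedefs listed above to print the padded size.
#include <stdint.h>
#include <stdio.h>

typedef uint64_t __wasi_dircookie_t;
typedef uint64_t __wasi_inode_t;
typedef uint32_t __wasi_dirnamlen_t;
typedef uint8_t  __wasi_filetype_t;

typedef struct __wasi_dirent_t {
  __wasi_dircookie_t d_next;    /* offset 0,  8 bytes */
  __wasi_inode_t d_ino;         /* offset 8,  8 bytes */
  __wasi_dirnamlen_t d_namlen;  /* offset 16, 4 bytes */
  __wasi_filetype_t d_type;     /* offset 20, 1 byte, then 3 bytes padding */
} __wasi_dirent_t;

int main(void) {
  /* prints 24: 21 bytes of members padded up to 8-byte alignment */
  printf("sizeof(__wasi_dirent_t) = %zu\n", sizeof(__wasi_dirent_t));
  return 0;
}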
We can run readdir.wasm using wasmtime:
$ wasmtime --dir=. out/readdir.wasm
The example args_sizes_get.wat contains an example of calling __wasi_args_sizes_get.
This shows an important point that I totally missed when first looking at calling it. Looking at the documentation we can see that this function outputs:
size_t argc The number of arguments
size_t argv_buf_size The size of the argument string data.
What I did not understand was that these are pointers that are passed into the function. So we have to specify the memory locations that it should use to populate these values.
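As a sketch of what this looks like from C (assuming the wasi-libc header wasi/api.h and its generated declaration of __wasi_args_sizes_get), both values are written through pointers that we pass in, and the errno is the return value:
#include <wasi/api.h>
#include <stdio.h>

int main(void) {
  __wasi_size_t argc = 0;
  __wasi_size_t argv_buf_size = 0;

  /* both results come back through the pointers we pass in */
  __wasi_errno_t err = __wasi_args_sizes_get(&argc, &argv_buf_size);
  if (err != __WASI_ERRNO_SUCCESS)
    return 1;

  printf("argc=%u argv_buf_size=%u\n", (unsigned)argc, (unsigned)argv_buf_size);
  return 0;
}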
The test can be run manually:
$ wasmtime src/args_sizes_get.wat one two three four five six
$ echo $?
7
Just note that the name of the program also counts as an argument.
For this example we need to set up a char** to store the information in. We need to pass this into the args_get function and it will populate it.
The documentation for args_get states that the sizes of the buffers should match that returned by __wasi_args_sizes_get(). In this case we are going to hard code these sizes for simplicity.
The example args_get.wat currently hard codes everything and should be invoked like this:
$ wasmtime src/args_get.wat one two
args_get.wat
one
two
The reason for doing this is to keep the example as simple as possible so that I understand how the memory management works.
Take the following C main function:
int main(int argc, char** argv) {
The standard library will set up argv for us:
argv char*
+-------+ +--------+ +--------+
|address| ---->|address | ---->|progname|
+-------+ +--------+ +--------+
The following is using a simple C example:
$ lldb -- ptp one two three
(lldb) expr argv
(char **) $1 = 0x00007ffeefbfefd8
(lldb) memory read -f x -s 8 -c 8 0x00007ffeefbfefb8
0x7ffeefbfefb8: 0x00007ffeefbff288 0x00007ffeefbff2d3
0x7ffeefbfefc8: 0x00007ffeefbff2d7 0x00007ffeefbff2db
(lldb) expr *argv
(char *) $1 = 0x00007ffeefbff288 "/Users/danielbevenius/work/c++/learningc++11/src/fundamentals/pointers/ptp"
We can visualize this:
char** char*
+------------------+ +------------------+ +--------+
|0x00007ffeefbfefb8| ----> |0x00007ffeefbff288| ----> |progname|
+------------------+ +------------------+ +--------+
+------------------+ +--------+
|0x00007ffeefbff2d3| ----> | "one" |
+------------------+ +--------+
+------------------+ +--------+
|0x00007ffeefbff2d7| ----> | "two" |
+------------------+ +--------+
+------------------+ +--------+
|0x00007ffeefbff2db| ----> | "three"|
+------------------+ +--------+
In our case, with args_get, we need to set this memory up manually. First, we need argv, which is a pointer (size 4 if we are using 32-bit addressing).
argv:
i32.const 0 ;; offset for argv pointer
i32.const 0 ;; value
i32.store align=2
Remember that the pointers should all come one after another in memory, so we should be able to dereference argv to get the first pointer, then add 4 to get to the second pointer. So:
+------------------+ +------------------+ +--------------+
| 4 | ----> | | ----> |"args_get.wat"|
+------------------+ +------------------+ +--------------+
0 3 4 8 64 76
+------------------+ +--------+
| | ----> | "one" |
+------------------+ +--------+
9 13 77 80
+------------------+ +--------+
| | ----> | "two" |
+------------------+ +--------+
14 18 81 84
Just note that the program name will not include the directory, only the name of the executable:
$ wasmtime out/first.wasm
args[0]=first.wasm
Just keep this in mind when inspecting memory as it took me a while to realise that the directory was not expected.
Inputs:
char **argv
A pointer to a buffer to write the argument pointers.
char *argv_buf
A pointer to a buffer to write the argument string data.
Memory:
8-bit 16-bit 32-bit 64-bit
↓ ↓ ↓ ↓
[00][00][00][00][00][00][00][00][00][00][00][00]...
8-bit [ 0][ 1][ 2][ 3][ 4][ 5][ 6][ 7][ 8][ 9][10][11]...
16-bit[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ]...
32-bit[ 0 ][ 1 ][ 2 ]...
64-bit[ 0 ][ 1 ]...
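Putting the two calls together, here is a hedged C sketch of how the same buffers would be set up and filled from C (again assuming wasi-libc's wasi/api.h declarations of __wasi_args_sizes_get and __wasi_args_get): argv is an array of pointers into argv_buf, which holds the NUL-terminated argument strings back to back.
#include <wasi/api.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  __wasi_size_t argc = 0, argv_buf_size = 0;
  if (__wasi_args_sizes_get(&argc, &argv_buf_size) != __WASI_ERRNO_SUCCESS)
    return 1;

  /* sizes of both buffers must match what args_sizes_get reported */
  uint8_t **argv = malloc(argc * sizeof(uint8_t *));
  uint8_t *argv_buf = malloc(argv_buf_size);
  if (argv == NULL || argv_buf == NULL)
    return 1;

  if (__wasi_args_get(argv, argv_buf) != __WASI_ERRNO_SUCCESS)
    return 1;

  for (__wasi_size_t i = 0; i < argc; i++)
    printf("args[%u]=%s\n", (unsigned)i, (char *)argv[i]);

  free(argv);
  free(argv_buf);
  return 0;
}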
The example environ_sizes_get.wat contains an example of calling __wasi_environ_sizes_get.
$ wasmtime --env="ONE=1" --env="TWO=2" src/environ_sizes_get.wat
$ echo $?
2
The example environ_get.wat contains an example of calling environ_get.
$ wasmtime --env="ONE=1" --env="TWO=2" src/environ_get.wat
$ xcode-select --install
$ brew install llvm
==> Caveats
To use the bundled libc++ please add the following LDFLAGS:
LDFLAGS="-L/usr/local/opt/llvm/lib -Wl,-rpath,/usr/local/opt/llvm/lib"
llvm is keg-only, which means it was not symlinked into /usr/local,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.
If you need to have llvm first in your PATH run:
echo 'export PATH="/usr/local/opt/llvm/bin:$PATH"' >> ~/.bash_profile
For compilers to find llvm you may need to set:
export LDFLAGS="-L/usr/local/opt/llvm/lib"
export CPPFLAGS="-I/usr/local/opt/llvm/include"
Find the targets that llc supports:
$ /usr/local/opt/llvm/bin/llc --version
LLVM (http://llvm.org/):
LLVM version 8.0.0
Optimized build.
Default target: x86_64-apple-darwin18.0.0
Host CPU: haswell
Registered Targets:
...
wasm32 - WebAssembly 32-bit
wasm64 - WebAssembly 64-bit
...
So this looks pretty good to me, we have wasm32 and wasm64.
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
$ rustup install nightly-x86_64-apple-darwin
Add the wasm32-wasi target:
$ rustup target add wasm32-wasi --toolchain nightly
Troubleshooting:
--- stderr
error[E0463]: can't find crate for `std`
|
= note: the `wasm32-unknown-unknown` target may not be installed
error: aborting due to previous error
$ rustup target add wasm32-unknown-unknown --toolchain nightly
Use the following command to build wasmtime:
$ RUSTFLAGS=-g CC="clang" CXX="clang++" cargo +nightly build --release
You might need to update rust using:
$ rustup update nightly
After this update add $WASMTIME_LOCAL_REPO/target/release/ to your PATH.
After building clang with wasm/wasi support this project can be built using make or cmake. The paths to the llvm directory and the wasi-sysroot can be updated in the Makefile/CMakeLists.txt.
To compile src/first.c and then run using wasmtime using make:
$ make
$ make run
To compile src/first.c using cmake:
$ mkdir build
$ cd build
$ cmake ..
$ make
$ git clone https://github.com/CraneStation/wasi-sdk.git
$ git submodule init
$ git submodule update
I've also downloaded and unpacked these locally, as this is quicker than having to build llvm-project, which also takes a lot of disk space (an issue for me). This is the path I specify in the Makefile as LLVM_HOME.
$ git clone https://github.com/CraneStation/wasi-libc.git
$ git submodule init
$ git submodule update
$ make WASM_CC=/usr/local/opt/llvm/bin/clang WASM_AR=/usr/local/opt/llvm/bin/llvm-ar WASM_NM=/usr/local/opt/llvm/bin/llvm-nm
$ make WASM_CC=/home/danielbevenius/opt/bin/clang WASM_AR=/home/danielbevenius/opt/bin/llvm-ar WASM_NM=/home/danielbevenius/opt/bin/llvm-nm
...
#
# The build succeeded! The generated sysroot is in /Users/danielbevenius/work/wasi/wasi-libc/sysroot.
#
Specifying --sysroot=/somedir when building will make the compiler look for headers and libraries in /somedir/include and /somedir/lib. So we will want to specify the sysroot that was created above when compiling:
$ clang --sysroot=/opt/wasi-sdk/ --target=wasm32-unknown-wasi -o module.wasm
Download libclang_rt.builtins-wasm32.a and copy it to the wasi lib:
$ mkdir /usr/local/Cellar/llvm/8.0.0_1/lib/clang/8.0.0/lib/wasi
$ cp ~/Downloads/libclang_rt.builtins-wasm32.a /usr/local/Cellar/llvm/8.0.0_1/lib/clang/8.0.0/lib/wasi/
$ make
$ wasmtime/target/debug/wasmtime out/first.wasm
First build without --release and then use rust-lldb:
$ rust-lldb target/debug/wasmtime out/first.wasm
Install beta channel:
$ rustup install beta
Add wasi target to beta:
$ rustup target add wasm32-wasi --toolchain beta
Build using beta:
$ cargo +beta build --target=wasm32-wasi
Install:
$ curl https://get.wasmer.io -sSfL | sh
$ source /Users/danielbevenius/.wasmer/wasmer.sh
Run:
$ wasmer run out/first.wasm
First wasi...
The text format for wasm uses S-expressions, where the first label inside the parentheses tells what kind of node it is:
(module (memory 1) (func))
The above has a root node named module and two child nodes, memory and func.
All code is grouped into functions:
(func <signature> <locals> <body>)
The signature declares the function's parameters and its return type. The locals are local variables of the function. The body is a list of instructions for the function.
(module
(func $add (param $first i32) (param $second i32) (result i32)
get_local $first
get_local $second
(i32.add)
)
(export "add" (func $add))
)
Notice that a wasm "program" is simply called a module, as the intention is to have it included and run by another program. The body is stack based, so get_local will push $first onto the stack. i32.add will take two values from the stack, add them and push the result onto the stack.
Notice the $add in the function. Much like parameters, functions are index based but can be named to make the code clearer. So we could just as well have written:
(export "add" (func 0))
export makes the function available to the host, using the name add in our case.
You can compile the above .wat file to wasm using wabt:
$ out/clang/Debug/wat2wasm ~/work/nodejs/scripts/wasm-helloworld.wat -o helloworld.wasm
And then use the wasm in Node.js:
const fs = require('fs');
const buffer = fs.readFileSync('helloworld.wasm');
const promise = WebAssembly.instantiate(buffer, {});
promise.then((result) => {
const instance = result.instance;
const module = result.module;
console.log('instance.exports:', instance.exports);
const addTwo = instance.exports.addTwo;
console.log(addTwo(1, 2));
});
Let's take a closer look at the WebAssembly API. WebAssembly is the namespace through which the API is exposed. WebAssembly.instantiate compiles and instantiates wasm code and returns an object with two members named module and instance.
To inspect the .wasm you can use wasm-objdump:
$ wasm-objdump -x src/add.wasm
add.wasm: file format wasm 0x1
Section Details:
Type:
- type[0] (i32, i32) -> i32
Function:
- func[0] sig=0 <add>
- func[1] sig=0 <addTwo>
Export:
- func[0] <add> -> "add"
- func[1] <addTwo> -> "addTwo"
The following repo, wasm-c-api, provides a C API that allows you to use functions defined in wasm from C/C++.
Make sure you configure V8 to have the following configuration options:
$ gn args out.gn/x64.release/
is_debug = false
target_cpu = "x64"
is_component_build = false
v8_static_library = true
V8 is quite large and it looks like wasm-c-api expects v8 to be cloned in the same directory. I just updated the Makefile to allow the V8 dir to be configured, which allows building using:
$ make V8_DIR="/Users/danielbevenius/work/google/javascript" CFLAGS="-g"
WebAssembly.Memory is used to deal with more complex objects like strings. It is just a large array of bytes which can grow. You can read/write using i32.load and i32.store.
Memory is specified using WebAssembly.Memory{}:
const memory = new WebAssembly.Memory({initial:10, maximum:100});
10 and 100 are specified in pages, which are fixed at 64KiB. So here we are saying that we want an initial size of 640KiB.
Musl is a libc implementation just like glibc. It is pronounced "muscle". Musl uses less space compared to glibc and is written with security in mind. Applications can be compiled against either glibc or musl. So how do we compile a program against musl?
Clone llvm-project.
$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build && cd build
Default build:
$ cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=/home/dbeveniu/opt -DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_PROJECTS="clang;lld;clang-tools-extra;libcxx;libcxxabi;lldb" -DLLVM_TARGETS_TO_BUILD=WebAssembly -DLLVM_DEFAULT_TARGET_TRIPLE=wasm32-wasi -DDEFAULT_SYSROOT=/home/dbeveniu/opt/share/wasi-sysroot ../llvm
$ make -j8
$ make install
The installation will be placed in /home/dbeveniu/opt. With this installed we should be able to compile and specify the output target as wasm.
Generate llvm intermediate representation (IR):
$ ~/opt/bin/clang --target=wasm32 -emit-llvm -c -S src/add.c
Compile the IR:
$ ~/opt/bin/llc -march=wasm32 -filetype=obj add.ll
The above will produce an object file named add.o. This needs to be linked into a wasm module using wasm-ld:
$ wasm-ld --no-entry --export-all -o add.wasm add.o
We can run this using node:
$ node src/add.js
10 + 20 = 30
So the above allowed us to compile from C and output a wasm module. But that does not allow us to use WASI. For this we need wasi-libc. This is a C library like glibc but with a limited number of functions available.
$ git clone https://github.com/CraneStation/wasi-libc.git
$ git submodule init
$ git submodule update
$ make WASM_CC=~/opt/bin/clang WASM_AR=~/opt/bin/llvm-ar WASM_NM=~/opt/bin/llvm-nm SYSROOT=~/opt/share/wasi-sysroot
...
#
# The build succeeded! The generated sysroot is in /home/dbeveniu/opt/share/wasi-sysroot.
#
Specifying --sysroot=/somedir when building will make the compiler look for headers and libraries in /somedir/include and /somedir/lib. So we will want to specify the sysroot that was created above when compiling:
$ clang --sysroot=/opt/wasi-sdk/ --target=wasm32-unknown-wasi -o module.wasm
The wasi header, the api.h file, is generated by tools/wasi-headers:
cargo run -- WASI/phases/snapshot/witx/typenames.witx \
WASI/phases/snapshot/witx/wasi_snapshot_preview1.witx \
--output ../../libc-bottom-half/headers/public/wasi/api.h
But we also need a compiler-rt:
builtins:
a simple library that provides an implementation of the low-level target-specific hooks required by code generation and other runtime components. For example, when compiling for a 32-bit target, converting a double to a 64-bit unsigned integer is compiled into a runtime call to the "__fixunsdfdi" function. The builtins library provides optimized implementations of this and other low-level routines, either in target-independent C form, or as heavily-optimized assembly.
So let's say you have a division of a long long; on a 32-bit system there might not be an instruction for this. In cases like this the instruction will instead be replaced with a call to a function from the compiler-rt library. TODO: Add an example of this.
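In the meantime, a rough sketch of the kind of case meant (not verified against the actual generated code):
/* on a 32-bit target without a native 64-bit divide instruction, the
   compiler typically lowers this division to a call into compiler-rt,
   e.g. __divdi3 (signed) or __udivdi3 (unsigned) */
long long divide(long long a, long long b) {
  return a / b;  /* may become: call __divdi3(a, b) */
}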
$ mkdir compiler-rt && cd compiler-rt
$ cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_TOOLCHAIN_FILE=../../wasi-sdk.cmake -DCOMPILER_RT_BAREMETAL_BUILD=On -DCOMPILER_RT_BUILD_XRAY=OFF -DCOMPILER_RT_INCLUDE_TESTS=OFF -DCOMPILER_RT_HAS_FPIC_FLAG=OFF -DCOMPILER_RT_ENABLE_IOS=OFF -DCOMPILER_RT_DEFAULT_TARGET_ONLY=On -DWASI_SDK_PREFIX=/home/danielbevenius/opt -DCMAKE_C_FLAGS="-O1" -DLLVM_CONFIG_PATH=../bin/llvm-config -DCOMPILER_RT_OS_DIR=wasi -DCMAKE_INSTALL_PREFIX=/home/danielbevenius/opt/lib/clang/11.0.0/ -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON ../../compiler-rt/lib/builtins
$ make -j8
$ make install
/usr/bin/cmake -P cmake_install.cmake
-- Install configuration: "RelWithDebInfo"
-- Installing: /home/dbeveniu/opt/lib/clang/11.0.0/lib/wasi/libclang_rt.builtins-wasm32.a
With this in place we should be able to compile C source code into a wasi-targeting wasm module:
$ clang -### --target=wasm32-unknown-wasi --sysroot=/home/dbeveniu/opt/share/wasi-sysroot -nostdlib -Wl,--no-entry -Wl,--export-all -o add.wasm src/add.c
clang version 11.0.0 (https://github.com/llvm/llvm-project.git 89a66474b6c1e5843c3dbc96bde52e5a7076c6cc)
Target: wasm32-unknown-wasi
Thread model: posix
InstalledDir: /home/dbeveniu/opt/bin
(in-process)
"/home/dbeveniu/opt/bin/clang-11" "-cc1" "-triple" "wasm32-unknown-wasi" "-emit-obj" "-mrelax-all" "-disable-free" "-main-file-name" "add.c" "-mrelocation-model" "static" "-mthread-model" "posix" "-mframe-pointer=none" "-fno-rounding-math" "-masm-verbose" "-mconstructor-aliases" "-target-cpu" "generic" "-fvisibility" "hidden" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/dbeveniu/opt/lib/clang/11.0.0" "-isysroot" "/home/dbeveniu/opt/share/wasi-sysroot" "-internal-isystem" "/home/dbeveniu/opt/lib/clang/11.0.0/include" "-internal-isystem" "/home/dbeveniu/opt/share/wasi-sysroot/include/wasm32-wasi" "-internal-isystem" "/home/dbeveniu/opt/share/wasi-sysroot/include" "-fdebug-compilation-dir" "/home/dbeveniu/work/wasm/learning-wasi" "-ferror-limit" "19" "-fmessage-length" "0" "-fgnuc-version=4.2.1" "-fobjc-runtime=gnustep" "-fno-common" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/tmp/add-593917.o" "-x" "c" "src/add.c"
"/home/dbeveniu/opt/bin/wasm-ld" "-L/home/dbeveniu/opt/share/wasi-sysroot/lib/wasm32-wasi" "--no-entry" "--export-all" "/tmp/add-593917.o" "-o" "add.wasm"
And we can try it out with the same node.js code as above:
$ node src/add.js
10 + 20 = 30
We can inspect add.wasm using wasm-objdump:
$ wasm-objdump -x add.wasm
add.wasm: file format wasm 0x1
Section Details:
Type[2]:
- type[0] () -> nil
- type[1] (i32, i32) -> i32
Function[2]:
- func[0] sig=0 <__wasm_call_ctors>
- func[1] sig=1 <add>
Table[1]:
- table[0] type=funcref initial=1 max=1
Memory[1]:
- memory[0] pages: initial=2
Global[7]:
- global[0] i32 mutable=1 - init i32=66560
- global[1] i32 mutable=0 <__dso_handle> - init i32=1024
- global[2] i32 mutable=0 <__data_end> - init i32=1024
- global[3] i32 mutable=0 <__global_base> - init i32=1024
- global[4] i32 mutable=0 <__heap_base> - init i32=66560
- global[5] i32 mutable=0 <__memory_base> - init i32=0
- global[6] i32 mutable=0 <__table_base> - init i32=1
Export[9]:
- memory[0] -> "memory"
- func[0] <__wasm_call_ctors> -> "__wasm_call_ctors"
- func[1] <add> -> "add"
- global[1] -> "__dso_handle"
- global[2] -> "__data_end"
- global[3] -> "__global_base"
- global[4] -> "__heap_base"
- global[5] -> "__memory_base"
- global[6] -> "__table_base"
Code[2]:
- func[0] size=2 <__wasm_call_ctors>
- func[1] size=61 <add>
Custom:
- name: "name"
- func[0] <__wasm_call_ctors>
- func[1] <add>
Custom:
- name: "producers"
Notice that there are a number of functions and symbols that we did not add; in fact we only defined one function, named add. When are these added?
To find out, let's run the compile and link steps separately. First, generate only the object file add.o:
/home/danielbevenius/opt/bin/clang-11 "-cc1" "-triple" "wasm32-unknown-wasi" "-emit-obj" "-mrelax-all" "-disable-free" "-main-file-name" "add.c" "-mrelocation-model" "static" "-mthread-model" "posix" "-mframe-pointer=none" "-fno-rounding-math" "-masm-verbose" "-mconstructor-aliases" "-target-cpu" "generic" "-fvisibility" "hidden" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/danielbevenius/opt/lib/clang/11.0.0" "-isysroot" "/home/danielbevenius/opt/share/wasi-sysroot" "-internal-isystem" "/home/danielbevenius/opt/lib/clang/11.0.0/include" "-internal-isystem" "/home/danielbevenius/opt/share/wasi-sysroot/include/wasm32-wasi" "-internal-isystem" "/home/danielbevenius/opt/share/wasi-sysroot/include" "-fdebug-compilation-dir" "/home/danielbevenius/work/wasm/learning-wasi" "-ferror-limit" "19" "-fmessage-length" "0" "-fgnuc-version=4.2.1" "-fobjc-runtime=gnustep" "-fno-common" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "add.o" "-x" "c" "src/add.c"
Then we can launch the wasm linker and debug it:
gdb --args /home/danielbevenius/opt/bin/wasm-ld "-L/home/dbeveniu/opt/share/wasi-sysroot/lib/wasm32-wasi" "--no-entry" "--export-all" "add.o" "-o" "add.wasm"
(gdb) b main
r
This will break in lld/tools/lld/lld.cpp, which is an executable that contains four linkers: an ELF, COFF, WebAssembly, and Mach-O linker. The name of the executable determines which one is run; in our case this is wasm-ld:
(gdb) p argv[0]
$3 = 0x7fffffffd877 "/home/danielbevenius/opt/bin/wasm-ld"
Notice that wasm-ld is a link to lld:
lrwxrwxrwx. 1 danielbevenius danielbevenius 3 Feb 11 15:39 /home/danielbevenius/opt/bin/wasm-ld -> lld
This will be matched against the flavor Wasm using the getFlavor function.
And this is what the switch case will match:
case Wasm:
return !wasm::link(args, canExitEarly(), llvm::outs(), llvm::errs());
wasm::link can be found in lld/wasm/Driver.cpp:
(gdb) b Driver.cpp:83
This will call LinkerDriver().link(args), and LinkerDriver::link can be found in the same file:
...
createSyntheticSymbols();
This is what will create the additional functions. For example, it will create a synthetic function named __wasm_call_ctors. There are others, but I'm going to focus on this one. So a function named __wasm_call_ctors will be added, and it will be called by _start:
extern void __wasm_call_ctors(void);
void _start(void) {
// The linker synthesizes this to call constructors.
__wasm_call_ctors();
...
}
Now, _start is defined in wasi-libc, but recall that we linked with --no-entry. In LinkerDriver::link we find the following function call:
readConfigs(args);
...
config->entry = getEntry(args);
In our case, we passed in --no-entry so this will be "":
if (arg->getOption().getID() == OPT_no_entry)
return "";
Note about llvm targets: when you build llvm you specify the targets it should build. This will generate a file named build/include/llvm/Config/Targets.def:
LLVM_TARGET(WebAssembly)
The wasm driver will call initLLVM:
static void initLLVM() {
InitializeAllTargets();
...
}
inline void InitializeAllTargets() {
// FIXME: Remove this, clients should do it.
InitializeAllTargetInfos();
inline void InitializeAllTargetInfos() {
#define LLVM_TARGET(TargetName) LLVMInitialize##TargetName##TargetInfo();
#include "llvm/Config/Targets.def"
}
So this will be expanded by the preprocessor to:
LLVMInitializeWebAssemblyTargetInfo();
When InitializeAllTargetInfos is called this file will be included.
$ clang -S -emit-llvm --target=wasm32-unknown-wasi --sysroot=/home/dbeveniu/opt/share/wasi-sysroot -nostdlib src/add.c
This will produce a file named add.ll:
$ cat add.ll
; ModuleID = 'src/add.c'
source_filename = "src/add.c"
target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
target triple = "wasm32-unknown-wasi"
; Function Attrs: noinline nounwind optnone
define hidden i32 @add(i32 %x, i32 %y) #0 {
entry:
%x.addr = alloca i32, align 4
%y.addr = alloca i32, align 4
store i32 %x, i32* %x.addr, align 4
store i32 %y, i32* %y.addr, align 4
%0 = load i32, i32* %x.addr, align 4
%1 = load i32, i32* %y.addr, align 4
%add = add nsw i32 %0, %1
ret i32 %add
}
attributes #0 = { noinline nounwind optnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 11.0.0 (https://github.com/llvm/llvm-project.git 89a66474b6c1e5843c3dbc96bde52e5a7076c6cc)"}
This would then be passed to the linker wasm-ld, which must be adding these additional symbols. It looks like they are added by a function named createSyntheticSymbols in lld/wasm/Driver.cpp:
static WasmSignature nullSignature = {{}, {}};
...
WasmSym::callCtors = symtab->addSyntheticFunction(
"__wasm_call_ctors", WASM_SYMBOL_VISIBILITY_HIDDEN,
make<SyntheticFunction>(nullSignature, "__wasm_call_ctors"));
Alright, we can see that this is just an empty function in our module:
$ wasm-objdump -d add.wasm
add.wasm: file format wasm 0x1
Code Disassembly:
0000d5 func[0] <__wasm_call_ctors>:
0000d6: 0b | end
So where is this function called? And can we add something to it ourselves? First, __wasm_call_ctors is called from crt1.c:
extern void __wasm_call_ctors(void);
...
void _start(void) {
__wasm_call_ctors();
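To answer the second question, here is a hedged sketch of what I believe would end up being run from __wasm_call_ctors (based on the flow above, not verified): a function marked with the constructor attribute is placed in the module's init array, and the linker-synthesized __wasm_call_ctors, called from _start, runs it before main.
#include <stdio.h>

/* assumption: this ends up being invoked via __wasm_call_ctors */
__attribute__((constructor))
static void before_main(void) {
  printf("called before main\n");
}

int main(void) {
  printf("main\n");
  return 0;
}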
So above we specified that our target should be `wasm32-unknown-wasi`. Is this specifying the backend to be used?
### Installing ninja
$ git clone git://github.com/martine/ninja.git
$ configure.py --bootstrap
The ninja executable will be in the same directory.
wasmtime compilation error:
cargo:warning=In file included from signalhandlers/SignalHandlers.cpp:8:
cargo:warning=In file included from signalhandlers/SignalHandlers.hpp:5:
cargo:warning=/usr/local/opt/llvm/bin/../include/c++/v1/setjmp.h:35:15: fatal error: 'setjmp.h' file not found
cargo:warning=#include_next <setjmp.h>
cargo:warning= ^~~~~~~~~~
cargo:warning=1 error generated.
exit code: 1
--- stderr
It seems that after upgrading to Mojave some headers were no longer in the /include directory. These can be installed using the following command:
open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg
signalhandlers/SignalHandlers.hpp:5:10: fatal error: 'setjmp.h' file not found
signalhandlers/SignalHandlers.hpp:5:10: fatal error: 'setjmp.h' file not found, err: true
thread 'main' panicked at 'Unable to generate bindings: ()', /home/danielbevenius/.cargo/git/checkouts/wasmtime-5c699c1a3ee5d368/b7d86af/wasmtime-runtime/build.rs:32:5
You can troubleshoot this by creating a directory and generating the cmake build files:
$ mkdir tmp && cd tmp
$ cmake "/home/danielbevenius/.cargo/git/checkouts/wasmtime-5c699c1a3ee5d368/b7d86af/wasmtime-runtime/signalhandlers" "-DCMAKE_INSTALL_PREFIX=/home/danielbevenius/work/openshift/faas-wasi-runtime-image/target/debug/build/wasmtime-runtime-a5693ecd59b8474f/out" "-DCMAKE_C_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64 -I/usr/include" "-DCMAKE_C_COMPILER=/usr/bin/cc" "-DCMAKE_CXX_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64 -I/usr/include" "-DCMAKE_CXX_COMPILER=/usr/bin/c++" "-DCMAKE_BUILD_TYPE=Debug"
Then we can run the command that failed:
$ "cmake" "--build" "." "--target" "install" "--config" "Debug" "--"
This actually succeeded.
Looking closer at the error message you can see that it is saying that it failed when running rustc on wasm-runtime/build.rs. I've built rustc manually and it is installed in a non-default location. This can be worked around by specifying:
$ export BINDGEN_EXTRA_CLANG_ARGS="-I/usr/include"
c/cant_dotdot.c:1:10: fatal error: 'assert.h' file not found
#include <assert.h>
^~~~~~~~~~
1 error generated.
or
/usr/include/sys/cdefs.h:784:2: error: Unsupported architecture
#error Unsupported architecture
Make sure you have specified the correct --sysroot.
Sections in a module
* Types
* Import
* Function
* Tables
* Memory
* Global
* Export
* Start
* Element
* Code
* Data
These are function signatures that can be reused, for example by imports and function definitions.
This maps values like JavaScript objects or OS file handles. It's a way to allow the wasm module to access something outside its own memory. For example, say you have a function pointer which you want to call from a wasm module. If you had direct access to this memory pointer you might be able to learn about the memory layout and exploit it; for example, updating the pointer to something different might execute some other piece of code. A table is an array that lives outside of wasm's memory, and the values stored in the array are references to functions.
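As a rough C sketch of where the table comes into play (my reading, not from the original notes): calling through a C function pointer compiles to a call_indirect that goes through the table, so the wasm code only ever handles a table index, never a raw code address.
typedef int (*binop)(int, int);

static int add_op(int a, int b) { return a + b; }

int apply(binop f, int a, int b) {
  return f(a, b);  /* compiles to call_indirect through the table */
}

int run(void) {
  /* taking add_op's address places a reference to it in the table */
  return apply(add_op, 1, 2);
}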
This section allows for the initialization of the contents of a table, either imported or defined in the table section. What would this be like? Perhaps like passing in an empty table entry and populating it with a function pointer to a function in the wasm module. But would we not just export the function in that case (I'm probably not understanding the usage here fully)?
Defines the optional memory of the module by defining its initial and max size. This memory can be initialized by the data section.
This section contains any globals (think static C variables).
These are functions, tables, memory segments and global variables that are made available to the host.
Stack operations take their operands from, and put their result onto, the stack. There is no way to inspect the stack apart from using opcodes that push and pop values from the stack.
Locals and globals are like variables, they are named and can hold any of the same basic types as the stack can hold (i32, et al.)
Memory is linear, so all memory addresses are expressed in terms of byte offsets from the beginning of a memory segment.
i32.load offset= align=
Now, operators can have immediate arguments which are considered part of the instructions themselves. The alignment immediate is a hint about the alignment of the address:
0 = 8 bit
1 = 16 bit
2 = 32 bit
3 = 64 bit
8-bit 16-bit 32-bit 64-bit
↓ ↓ ↓ ↓
[00][00][00][00][00][00][00][00][00][00][00][00]...
8-bit [ 0][ 1][ 2][ 3][ 4][ 5][ 6][ 7][ 8][ 9][10][11]...
16-bit[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ]...
32-bit[ 0 ][ 1 ][ 2 ]...
64-bit[ 0 ][ 1 ]...
The second immediate for load is the address offset. The effective address is the sum of the address operand and the offset:
(stack) (immediate)
effective-address = address-operand + offset
The reason for the offset is that when using dynamic memory a compiler may add a constant offset to all memory operations in order to relocate one area of memory to another.
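Another common source of constant offsets (my own illustration, separate from the relocation case above) is struct field access: the field's offset becomes the load's offset immediate while the pointer stays the address operand, just like the i32.load offset=20 access of dirp->buffer_processed earlier in these notes.
struct example {
  int fd;           /* offset 0  */
  long long cookie; /* offset 8 (4 bytes of padding before it) */
  int used;         /* offset 16 */
};

int get_used(struct example *p) {
  /* address operand: p, offset immediate: 16 -> i32.load offset=16 */
  return p->used;
}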
So when we want to store a value in memory we need to specify an address operand:
i32.const 0 ;; address operand
i32.const 12 ;; value to store
i32.store offset=0 align=2 ;; store the value at effective address 0
The offset defaults to 0 and the alignment to the storage size of the respective memory access, which is its natural alignment.
This would store the value 12 at address 0:
32-bit
↓
[12][00][00][00][00][00][00][00][00][00][00][00]...
[ 0 ][ 1 ][ 2 ]...
Notice that everything is stored in little-endian byte order.
Elements are “handles” for opaque foreign values (like OS file handles.)
Unlike with other index spaces, indexing of labels is relative by nesting depth, that is, label 0 refers to the innermost structured control instruction enclosing the referring branch instruction, while increasing indices refer to those farther out.
typedef struct uvwasi_s {
struct uvwasi_fd_table_t fds;
size_t argc;
char** argv;
char* argv_buf;
size_t argv_buf_size;
size_t envc;
char** env;
char* env_buf;
size_t env_buf_size;
} uvwasi_t;
So we have a file descriptor table first, followed by argc and argv. These are the arguments passed to the module. How are they passed?
In the uvwasi example these are configured programmatically:
uvwasi_options_t init_options;
...
init_options.argc = 3;
init_options.argv = calloc(3, sizeof(char*));
init_options.argv[0] = "--foo=bar";
init_options.argv[1] = "-baz";
init_options.argv[2] = "100";
init_options.envp = (const char**) environ;
init_options.preopenc = 1;
init_options.preopens = calloc(1, sizeof(uvwasi_preopen_t));
init_options.preopens[0].mapped_path = "/var";
init_options.preopens[0].real_path = ".";
r = uvwasi_init(uvw, &init_options);
$ otool -L out/app
out/app:
/usr/local/lib/libuv.1.dylib (compatibility version 2.0.0, current version 2.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
You can also use this environment variable:
$ DYLD_PRINT_LIBRARIES=1 out/app
dyld: loaded: /Users/danielbevenius/work/wasm/uvwasi/out/app
dyld: loaded: /usr/local/lib/libuv.1.dylib
dyld: loaded: /usr/lib/libSystem.B.dylib
dyld: loaded: /usr/lib/system/libcache.dylib
dyld: loaded: /usr/lib/system/libcommonCrypto.dylib
dyld: loaded: /usr/lib/system/libcompiler_rt.dylib
dyld: loaded: /usr/lib/system/libcopyfile.dylib
dyld: loaded: /usr/lib/system/libcorecrypto.dylib
dyld: loaded: /usr/lib/system/libdispatch.dylib
dyld: loaded: /usr/lib/system/libdyld.dylib
dyld: loaded: /usr/lib/system/libkeymgr.dylib
dyld: loaded: /usr/lib/system/liblaunch.dylib
dyld: loaded: /usr/lib/system/libmacho.dylib
dyld: loaded: /usr/lib/system/libquarantine.dylib
dyld: loaded: /usr/lib/system/libremovefile.dylib
dyld: loaded: /usr/lib/system/libsystem_asl.dylib
dyld: loaded: /usr/lib/system/libsystem_blocks.dylib
dyld: loaded: /usr/lib/system/libsystem_c.dylib
dyld: loaded: /usr/lib/system/libsystem_configuration.dylib
dyld: loaded: /usr/lib/system/libsystem_coreservices.dylib
dyld: loaded: /usr/lib/system/libsystem_darwin.dylib
dyld: loaded: /usr/lib/system/libsystem_dnssd.dylib
dyld: loaded: /usr/lib/system/libsystem_info.dylib
dyld: loaded: /usr/lib/system/libsystem_m.dylib
dyld: loaded: /usr/lib/system/libsystem_malloc.dylib
dyld: loaded: /usr/lib/system/libsystem_networkextension.dylib
dyld: loaded: /usr/lib/system/libsystem_notify.dylib
dyld: loaded: /usr/lib/system/libsystem_sandbox.dylib
dyld: loaded: /usr/lib/system/libsystem_secinit.dylib
dyld: loaded: /usr/lib/system/libsystem_kernel.dylib
dyld: loaded: /usr/lib/system/libsystem_platform.dylib
dyld: loaded: /usr/lib/system/libsystem_pthread.dylib
dyld: loaded: /usr/lib/system/libsystem_symptoms.dylib
dyld: loaded: /usr/lib/system/libsystem_trace.dylib
dyld: loaded: /usr/lib/system/libunwind.dylib
dyld: loaded: /usr/lib/system/libxpc.dylib
dyld: loaded: /usr/lib/libobjc.A.dylib
dyld: loaded: /usr/lib/libc++abi.dylib
dyld: loaded: /usr/lib/libc++.1.dylib
uvwasi_fd_fdstat_get()
stats.fs_rights_base = 6291603
uvwasi_fd_fdstat_get()
stats.fs_rights_base = 6291603
In my case I don't want to use the system libuv but instead one that I've built with debug symbols.
Show contents of archive:
$ ar -t ~/work/nodejs/libuv/out/Debug/libuv.a
__.SYMDEF SORTED
fs-poll.o
idna.o
inet.o
threadpool.o
timer.o
uv-data-getter-setters.o
uv-common.o
version.o
async.o
core.o
dl.o
fs.o
getaddrinfo.o
getnameinfo.o
loop.o
loop-watcher.o
pipe.o
poll.o
process.o
signal.o
stream.o
tcp.o
thread.o
tty.o
udp.o
proctitle.o
darwin.o
fsevents.o
darwin-proctitle.o
bsd-ifaddrs.o
kqueue.o
See all the symbols:
$ nm ~/work/nodejs/libuv/out/Debug/libuv.a
gcc -o ./out/app out/obj/uvwasi.o out/obj/fd_table.o out/obj/uv_mapping.o app.c -g -L/Users/danielbevenius/work/nodejs/libuv/out/Debug/ -luv -Wall -Wsign-compare -I./include -luv
Undefined symbols for architecture x86_64:
"_uv_gettimeofday", referenced from:
_uvwasi_clock_time_get in uvwasi.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
$ ar -t ~/work/nodejs/libuv/out/Debug/libuv.a | grep _uv_gettimeofday
Notice that this symbol does not exist.
$ man dyld
Now run with DYLD_LIBRARY_PATH:
$ DYLD_LIBRARY_PATH=/Users/danielbevenius/work/nodejs/libuv-build/lib ./out/app
$ lldb -- wasmtime --env="ONE=1" --env="TWO=2" src/fd_write.wat
(lldb) target create "wasmtime"
Current executable set to 'wasmtime' (x86_64).
(lldb) settings set -- target.run-args "--env=ONE=1" "--env=TWO=2" "src/fd_write.wat"
Set a breakpoint in $wasmtime_home/src/bin/wasmtime.rs
:
(lldb) br s -f wasmtime.rs -l 203
$ rustfmt --check somefile
$ echo $?
If there were any issues the exit value will be 1.
Logging can be enabled by setting RUST_LOG:
RUST_LOG=wasi_common=trace cargo test socket --features wasm_tests
Compiler code generator backend.
cranelift-codegen takes as input the intermediate language of a function, the target (the arch, like x86_64 for example), and compiler settings. The output is machine code.
$ lldb -- target/debug/wasmtime-basic
(lldb) br s -f main.rs -l 31
(lldb) r
pub fn wasm_add_full() -> Result<ActionOutcome, ActionError> {
let mut binary_file = File::open(concat!(env!("OUT_DIR"), "/add.wasm")).unwrap();
let mut binary: Vec<u8> = Vec::new();
binary_file.read_to_end(&mut binary).unwrap();
So this first section is reading add.wasm into a Vector. Next we have:
// First, we need a Wasmtime context. To build one, we need to get an
// Instruction Set Architectures (ISA) from `cranelift_native.
let isa_builder = cranelift_native::builder().unwrap();
This is an Instruction Set Architecture (ISA) builder that we are creating. If we look in the cranelift-native lib.rs we find the builder function. This function will call isa::lookup, which can be found in ~/work/wasm/cranelift/cranelift-codegen/src/isa/mod.rs:
pub fn builder() -> Result<isa::Builder, &'static str> {
let mut isa_builder = isa::lookup(Triple::host()).map_err(|err| match err {
isa::LookupError::SupportDisabled => "support for architecture disabled at compile time",
isa::LookupError::Unsupported => "unsupported architecture",
})?;
if cfg!(any(target_arch = "x86", target_arch = "x86_64")) {
parse_x86_cpuid(&mut isa_builder)?;
}
Ok(isa_builder)
}
The call to Triple::host() will return the Triple which contains the host architecture, vendor, and OS. TODO: show the host-specific code for host.
The host function is found in the target-lexicon crate (~/.cargo/registry/src/github.com-1ecc6299db9ec823/target-lexicon-0.9.0/src/host.rs):
include!(concat!(env!("OUT_DIR"), "/host.rs"));
Notice that it is reading a file from the current project's output directory:
$ find target/ -name 'host.rs'
target//debug/build/target-lexicon-6301a8d7cd05e389/out/host.rs
pub const HOST: Triple = Triple {
architecture: Architecture::X86_64,
vendor: Vendor::Apple,
operating_system: OperatingSystem::Darwin,
environment: Environment::Unknown,
binary_format: BinaryFormat::Macho,
};
With this information isa::lookup will be called (~/work/wasm/cranelift/cranelift-codegen/src/isa/mod.rs):
pub fn lookup(triple: Triple) -> Result<Builder, LookupError> {
match triple.architecture {
Architecture::Riscv32 | Architecture::Riscv64 => isa_builder!(riscv, "riscv", triple),
Architecture::I386 | Architecture::I586 | Architecture::I686 | Architecture::X86_64 => {
isa_builder!(x86, "x86", triple)
}
Architecture::Arm { .. } => isa_builder!(arm32, "arm32", triple),
Architecture::Aarch64 { .. } => isa_builder!(arm64, "arm64", triple),
_ => Err(LookupError::Unsupported),
}
}
Now, isa_builder is actually a macro:
macro_rules! isa_builder {
($name: ident, $feature: tt, $triple: ident) => {{
#[cfg(feature = $feature)]
{
Ok($name::isa_builder($triple))
Notice that we are going to use the name x86 in our case, which is a module in the isa crate (~/work/wasm/cranelift/cranelift-codegen/src/isa/x86/mod.rs):
pub fn isa_builder(triple: Triple) -> IsaBuilder {
IsaBuilder {
triple,
setup: settings::builder(),
constructor: isa_constructor,
}
}
Let's take a closer look at IsaBuilder (~/work/wasm/cranelift/cranelift-codegen/src/isa/mod.rs).
pub struct Builder {
triple: Triple,
setup: settings::Builder,
constructor: fn(Triple, settings::Flags, settings::Builder) -> Box<dyn TargetIsa>,
}
So back to the demo code: we have settings::builder() but there is nothing being set.
let flag_builder = settings::builder();
let isa = isa_builder.finish(settings::Flags::new(flag_builder));
Let's take a closer look at isa_builder.finish, which will construct the target ISA:
impl Builder {
/// Combine the ISA-specific settings with the provided ISA-independent settings and allocate a
/// fully configured `TargetIsa` trait object.
pub fn finish(self, shared_flags: settings::Flags) -> Box<dyn TargetIsa> {
(self.constructor)(self.triple, shared_flags, self.setup)
}
}
So at this point we have:
// Then, we use the ISA to build the context.
let mut context = Context::with_isa(isa);
In ~/work/wasm/wasmtime/crates/jit/src/lib.rs we can find:
mod context;
So we can look in ~/work/wasm/wasmtime/crates/jit/src/context.rs for the with_isa function:
pub struct Context {
namespace: Namespace,
compiler: Box<Compiler>,
global_exports: Rc<RefCell<HashMap<String, Option<wasmtime_runtime::Export>>>>,
debug_info: bool,
features: Features,
}
pub fn with_isa(isa: Box<TargetIsa>) -> Self {
Self::new(Box::new(Compiler::new(isa)))
}
And the compiler crate:
pub struct Compiler {
isa: Box<dyn TargetIsa>,
code_memory: CodeMemory,
trap_registration_guards: Vec<TrapRegistrationGuard>,
trampoline_park: HashMap<*const VMFunctionBody, *const VMFunctionBody>,
signatures: SignatureRegistry,
strategy: CompilationStrategy,
/// The `FunctionBuilderContext`, shared between trampline function compilations.
fn_builder_ctx: FunctionBuilderContext,
}
// Now, we instantiate the WASM module loaded into memory.
let mut instance = context.instantiate_module(None, &binary).unwrap();
// And, finally, invoke our function and print the results.
// For this demo, all we're doing is adding 5 and 7 together.
let args = [RuntimeValue::I32(5), RuntimeValue::I32(7)];
context.invoke(&mut instance, "add", &args)
}
WebWorkers rely on message passing for communication, and in V8 each worker is a separate Isolate and they don't share compiled code or JavaScript objects with each other. WASM threads can share the same wasm memory using an underlying SharedArrayBuffer. Each wasm thread runs in a WebWorker but they share their memory which allows them to work like a normal pthreads application would.
This proposal provides the building blocks for writing a threading library and does not provide the actual library itself.
Atomic operations have been added.
TODO: Add an example...
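In the meantime, a rough C sketch of the kind of thing the threads proposal enables (the exact clang flags and generated instructions are assumptions on my part, e.g. building with -matomics): C11 atomics lower to wasm atomic instructions that operate on shared memory.
#include <stdatomic.h>

_Atomic int counter = 0;

void increment(void) {
  /* expected to lower to an atomic read-modify-write such as
     i32.atomic.rmw.add when atomics are enabled */
  atomic_fetch_add(&counter, 1);
}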
$ wat2wasm --enable-reference-types src/anyref.wat -o anyref.wasm
$ node --experimental-wasm-anyref src/anyref.js
This JS example can be found in anyref.js and the wat in anyref.wat.
Currently a wasm function can only return zero or one value.
The following example demonstrates how a function that returns multiple values can be called:
$ wat2wasm --enable-multi-value src/multivalue.wat -o multivalue.wasm
$ node --experimental-wasm-mv src/multivalue.js
This JS example can be found in multivalue.js and the wat in multivalue.wat.
Some languages, like C++ or C#, use exceptions. These can be polyfilled, but doing so impacts performance. Also, JavaScript has exceptions, so if a wasm function calls a JS function which throws an error, the wasm module will not be able to handle the error.
$ wat2wasm --enable-exceptions src/exception.wat -o exception.wasm
$ node --experimental-wasm-anyref --experimental-wasm-eh src/exception.js
This JS example can be found in exception.js and the wat in exception.wat.