CRAN release 0.8.3
shikokuchuo committed Apr 17, 2023
1 parent ea1fba9 commit 364247a
Showing 4 changed files with 60 additions and 66 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: mirai
Type: Package
Title: Minimalist Async Evaluation Framework for R
Version: 0.8.2.9039
Version: 0.8.3
Description: Lightweight parallel code execution and distributed computing.
Designed for simplicity, a 'mirai' evaluates an R expression asynchronously,
on local or network resources, resolving automatically upon completion.
2 changes: 1 addition & 1 deletion NEWS.md
@@ -1,4 +1,4 @@
# mirai 0.8.2.9039 (development)
# mirai 0.8.3

* `mirai()` gains the following enhancements (thanks @HenrikBengtsson):
+ accepts a language or expression object being passed to '.expr' for evaluation.
28 changes: 13 additions & 15 deletions README.Rmd
@@ -32,8 +32,6 @@ Features efficient task scheduling, scalability beyond R connection limits, and

`mirai()` returns a 'mirai' object immediately. 'mirai' (未来 みらい) is Japanese for 'future'.

The asynchronous 'mirai' task runs in an ephemeral or persistent process, spawned locally or distributed across the network.

{mirai} has a tiny pure R code base, relying solely on {nanonext}, a high-performance binding for the 'NNG' (Nanomsg Next Gen) C library with zero package dependencies.

### Table of Contents
@@ -86,7 +84,7 @@ m <- mirai({
m
```

Above, all named objects are passed through to the mirai.
Above, all specified `name = value` pairs are passed through to the 'mirai'.

The 'mirai' yields an 'unresolved' logical NA value whilst the async operation is ongoing.

@@ -109,7 +107,7 @@ Alternatively, explicitly call and wait for the result using `call_mirai()`.
call_mirai(m)$data |> str()
```

For easy programmatic use of `mirai`, '.expr' accepts a pre-constructed language object to evaluate, and also a list of named arguments passed via '.args'. So, the following would produce the same results as the above:
For easy programmatic use of `mirai()`, '.expr' accepts a pre-constructed language object, and also a list of named arguments passed via '.args'. So, the following would be equivalent to the above:

```{r equiv}
expr <- quote({
@@ -134,9 +132,7 @@ High-frequency real-time data cannot be written to file/database synchronously w

Cache data in memory and use `mirai()` to perform periodic write operations concurrently in a separate process.

A 'mirai' object is returned immediately.

Below, '.args' is used to pass a list of objects already present in the calling environment to the mirai by name (see `base::name`). This is an alternative use of '.args', and can be combined with `...` to also pass in `name = value` pairs.
Below, '.args' is used to pass a list of objects already present in the calling environment to the mirai by name. This is an alternative use of '.args', and may be combined with `...` to also pass in `name = value` pairs.

```{r exec2}
library(mirai)
@@ -147,6 +143,8 @@ file <- tempfile()
m <- mirai(write.csv(x, file = file), .args = list(x, file))
```

A 'mirai' object is returned immediately.

`unresolved()` may be used in control flow statements to perform actions which depend on resolution of the 'mirai', both before and after.

This means there is no need to actually wait (block) for a 'mirai' to resolve, as the example below demonstrates.
@@ -171,9 +169,9 @@ Now actions which depend on the resolution may be processed, for example the nex

Use case: isolating code that can potentially fail in a separate process to ensure continued uptime.

As part of a data science or machine learning pipeline, iterations of model training may periodically fail for stochastic and uncontrollable reasons (e.g. buggy memory management on graphics cards).
As part of a data science / machine learning pipeline, iterations of model training may periodically fail for stochastic and uncontrollable reasons (e.g. buggy memory management on graphics cards).

Running each iteration in a 'mirai' process isolates this potentially-problematic code such that if it does fail, it does not bring down the entire pipeline.
Running each iteration in a 'mirai' isolates this potentially-problematic code such that even if it does fail, it does not bring down the entire pipeline.

```{r exec3r}
library(mirai)
@@ -228,7 +226,7 @@ To view the current status, call `daemons()` with no arguments. This provides th
daemons()
```

The default `dispatcher = TRUE` launches a `dispatcher()` background process that connects to individual background `server()` processes on the local machine. This ensures that tasks are dispatched efficiently on a first-in first-out (FIFO) basis to servers for processing. Tasks are queued at the dispatcher and sent to a server as soon as it can accept the task for immediate execution.
The default `dispatcher = TRUE` creates a `dispatcher()` background process that connects to individual daemon processes on the local machine on behalf of the client. This ensures that tasks are dispatched efficiently on a first-in first-out (FIFO) basis to servers for processing. Tasks are queued at the dispatcher and sent to a daemon as soon as it can accept the task for immediate execution.

```{r daemons4}
daemons(0)
@@ -277,7 +275,7 @@ A port on the client also needs to be open and available for inbound connections

#### Connecting to Remote Servers Through Dispatcher

The default `dispatcher = TRUE` creates a background `dispatcher()` process on the local client machine, which listens to a vector of URLs that remote servers dial in to, with each server having its unique URL.
The default `dispatcher = TRUE` creates a background `dispatcher()` process on the local client machine, which listens to a vector of URLs that remote `server()` processes dial in to, with each server having its unique URL.

It is recommended to use a websocket URL starting `ws://` instead of TCP in this scenario (used interchangeably with `tcp://`). A websocket URL supports a path after the port number, which can be made unique for each server. In this way a dispatcher can connect to an arbitrary number of servers over a single port.

@@ -342,7 +340,7 @@ Closing the connection causes the dispatcher to exit automatically, and in turn

#### Connecting to Remote Servers Directly

By specifying `dispatcher = FALSE`, remote servers connect directly to the client. The client listens at the below address, and distributes tasks to all connected server processes.
By specifying `dispatcher = FALSE`, remote servers connect directly to the client. The client listens at a single URL address, and distributes tasks to all connected server processes.

```{r remote, eval=FALSE}
daemons(url = "tcp://10.111.5.13:0", dispatcher = FALSE)
@@ -358,7 +356,7 @@ Alternatively, simply supply a colon followed by the port number to listen on al
daemons(url = "tcp://:0", dispatcher = FALSE)
```

Note that above, the port number is specified as zero. This is a wildcard value that will automatically cause a free ephemeral port to be assigned. The actual assigned port is provided as the return value of the call, or it may be queried at any time by requesting the status using `daemons()`.
Note that above, the port number is specified as zero. This is a wildcard value that will automatically cause a free ephemeral port to be assigned. The actual assigned port is provided as the return value of the call, or it may be queried at any time by requesting the status via `daemons()`.

--

@@ -370,7 +368,7 @@ Rscript -e 'mirai::server("tcp://10.111.5.13:0")'

--

On the client, requesting the status will return the client URL for `daemons`. The number of daemons connecting to this URL is not limited and network resources may be added and removed at any time, with tasks automatically distributed to all server processes.
The number of daemons connecting to the client URL is not limited and network resources may be added or removed at any time, with tasks automatically distributed to all server processes.

`$connections` will show the actual number of connected server instances.

@@ -393,7 +391,7 @@ This causes all connected server instances to exit automatically.

### Compute Profiles

The `daemons` interface allows the easy specification of compute profiles. This is for managing tasks with heterogeneous compute requirements:
The `daemons()` interface also allows the specification of compute profiles for managing tasks with heterogeneous compute requirements:

- send tasks to different servers or server clusters with the appropriate specifications (in terms of CPUs / memory / GPU / accelerators etc.)
- split tasks between local and remote computation
94 changes: 45 additions & 49 deletions README.md
@@ -28,9 +28,6 @@ communications, courtesy of ‘nanonext’ and ‘NNG’ (Nanomsg Next Gen).
`mirai()` returns a ‘mirai’ object immediately. ‘mirai’ (未来 みらい) is
Japanese for ‘future’.

The asynchronous ‘mirai’ task runs in an ephemeral or persistent
process, spawned locally or distributed across the network.

{mirai} has a tiny pure R code base, relying solely on {nanonext}, a
high-performance binding for the ‘NNG’ (Nanomsg Next Gen) C library with
zero package dependencies.
@@ -93,7 +90,8 @@ m
#> - $data for evaluated result
```

Above, all named objects are passed through to the mirai.
Above, all specified `name = value` pairs are passed through to the
‘mirai’.

The ‘mirai’ yields an ‘unresolved’ logical NA value whilst the async
operation is ongoing.
@@ -108,21 +106,20 @@ result.

``` r
m$data |> str()
#> num [1:100000000] 0.718 -4.001 1.026 0.736 -0.16 ...
#> num [1:100000000] 0.97 5.532 -2.559 -0.254 0.303 ...
```

Alternatively, explicitly call and wait for the result using
`call_mirai()`.

``` r
call_mirai(m)$data |> str()
#> num [1:100000000] 0.718 -4.001 1.026 0.736 -0.16 ...
#> num [1:100000000] 0.97 5.532 -2.559 -0.254 0.303 ...
```

For easy programmatic use of `mirai`, ‘.expr’ accepts a pre-constructed
language object to evaluate, and also a list of named arguments passed
via ‘.args’. So, the following would produce the same results as the
above:
For easy programmatic use of `mirai()`, ‘.expr’ accepts a
pre-constructed language object, and also a list of named arguments
passed via ‘.args’. So, the following would be equivalent to the above:

``` r
expr <- quote({
@@ -135,7 +132,7 @@ args <- list(m = runif(1), n = 1e8)
m <- mirai(.expr = expr, .args = args)

call_mirai(m)$data |> str()
#> num [1:100000000] 1.455 0.691 0.835 1.765 0.352 ...
#> num [1:100000000] 0.221 -0.603 -2.056 -0.347 -1.014 ...
```

[« Back to ToC](#table-of-contents)
@@ -150,12 +147,10 @@ synchronously without disrupting the execution flow.
Cache data in memory and use `mirai()` to perform periodic write
operations concurrently in a separate process.

A ‘mirai’ object is returned immediately.

Below, ‘.args’ is used to pass a list of objects already present in the
calling environment to the mirai by name (see `base::name`). This is an
alternative use of ‘.args’, and can be combined with `...` to also pass
in `name = value` pairs.
calling environment to the mirai by name. This is an alternative use of
‘.args’, and may be combined with `...` to also pass in `name = value`
pairs.

``` r
library(mirai)
@@ -166,6 +161,8 @@ file <- tempfile()
m <- mirai(write.csv(x, file = file), .args = list(x, file))
```

A ‘mirai’ object is returned immediately.

`unresolved()` may be used in control flow statements to perform actions
which depend on resolution of the ‘mirai’, both before and after.

@@ -196,12 +193,12 @@ the next write.
Use case: isolating code that can potentially fail in a separate process
to ensure continued uptime.

As part of a data science or machine learning pipeline, iterations of
As part of a data science / machine learning pipeline, iterations of
model training may periodically fail for stochastic and uncontrollable
reasons (e.g. buggy memory management on graphics cards).

Running each iteration in a ‘mirai’ process isolates this
potentially-problematic code such that if it does fail, it does not
Running each iteration in a ‘mirai’ isolates this
potentially-problematic code such that even if it does fail, it does not
bring down the entire pipeline.

``` r
@@ -230,8 +227,8 @@ for (i in 1:10) {
#> iteration 4 successful
#> iteration 5 successful
#> iteration 6 successful
#> Error: random error
#> iteration 7 successful
#> Error: random error
#> iteration 8 successful
#> iteration 9 successful
#> iteration 10 successful
@@ -276,20 +273,20 @@ daemons()
#>
#> $daemons
#> online instance assigned complete
#> abstract://985d3a9681b606e1a46a203e81f3f49ca68f5af4 1 1 0 0
#> abstract://556a2d6a54bab94eb8ff09df26e522222c201291 1 1 0 0
#> abstract://6fae9639af17ca0ce157c8f2511c18005c28c7f3 1 1 0 0
#> abstract://f8997dfac7acfcc2eb2bdcb35e9b71f1c36c6a6a 1 1 0 0
#> abstract://68dc04328f4d5f2d678e6d65064c7c5269ea945e 1 1 0 0
#> abstract://331f99c0deb7317ae706c8ce37f5921878ea16b5 1 1 0 0
#> abstract://6910477e0d73f6158ceb229482f08568443e2cd1 1 1 0 0
#> abstract://a4a1cbd88330c8b6f2f312e037e865496ffe34a1 1 1 0 0
#> abstract://7e754a74b5b31f58a94427ecf7c8b8dd1a4290c5 1 1 0 0
#> abstract://91290cbcfedec37ddacf5ff1d101cdc3ac4d4a29 1 1 0 0
#> abstract://038e1d4c08ce14ab61ddc273db85b68fa8b5bfa6 1 1 0 0
#> abstract://cccec64e89d21aba0b9a7093fdf724379cf88921 1 1 0 0
```

The default `dispatcher = TRUE` launches a `dispatcher()` background
process that connects to individual background `server()` processes on
the local machine. This ensures that tasks are dispatched efficiently on
a first-in first-out (FIFO) basis to servers for processing. Tasks are
queued at the dispatcher and sent to a server as soon as it can accept
the task for immediate execution.
The default `dispatcher = TRUE` creates a `dispatcher()` background
process that connects to individual daemon processes on the local
machine on behalf of the client. This ensures that tasks are dispatched
efficiently on a first-in first-out (FIFO) basis to servers for
processing. Tasks are queued at the dispatcher and sent to a daemon as
soon as it can accept the task for immediate execution.

``` r
daemons(0)
@@ -359,7 +356,8 @@ examples below.

The default `dispatcher = TRUE` creates a background `dispatcher()`
process on the local client machine, which listens to a vector of URLs
that remote servers dial in to, with each server having its unique URL.
that remote `server()` processes dial in to, with each server having its
unique URL.

It is recommended to use a websocket URL starting `ws://` instead of TCP
in this scenario (used interchangeably with `tcp://`). A websocket URL
@@ -411,10 +409,10 @@ daemons()
#>
#> $daemons
#> online instance assigned complete
#> ws://:5555/1 0 0 0 0
#> ws://:5555/2 0 0 0 0
#> ws://:5555/3 0 0 0 0
#> ws://:5555/4 0 0 0 0
#> ws://:5555/1 1 1 0 0
#> ws://:5555/2 1 1 0 0
#> ws://:5555/3 1 1 0 0
#> ws://:5555/4 1 1 0 0
```

As per the local case, `$connections` will show the single connection to
@@ -454,7 +452,7 @@ dispatcher are terminated.
#### Connecting to Remote Servers Directly

By specifying `dispatcher = FALSE`, remote servers connect directly to
the client. The client listens at the below address, and distributes
the client. The client listens at a single URL address, and distributes
tasks to all connected server processes.

``` r
@@ -466,28 +464,27 @@ listen on all interfaces on the local host, for example:

``` r
daemons(url = "tcp://:0", dispatcher = FALSE)
#> [1] "tcp://:46017"
#> [1] "tcp://:38185"
```

Note that above, the port number is specified as zero. This is a
wildcard value that will automatically cause a free ephemeral port to be
assigned. The actual assigned port is provided as the return value of
the call, or it may be queried at any time by requesting the status
using `daemons()`.
the call, or it may be queried at any time by requesting the status via
`daemons()`.


On the server, `server()` may be called from an R session, or an Rscript
invocation from a shell. This sets up a remote daemon process that
connects to the client URL and receives tasks:

Rscript -e 'mirai::server("tcp://10.111.5.13:46017")'
Rscript -e 'mirai::server("tcp://10.111.5.13:38185")'


On the client, requesting the status will return the client URL for
`daemons`. The number of daemons connecting to this URL is not limited
and network resources may be added and removed at any time, with tasks
The number of daemons connecting to the client URL is not limited and
network resources may be added or removed at any time, with tasks
automatically distributed to all server processes.

`$connections` will show the actual number of connected server
@@ -499,7 +496,7 @@ daemons()
#> [1] 0
#>
#> $daemons
#> [1] "tcp://:46017"
#> [1] "tcp://:38185"
```

To reset all connections and revert to default behaviour:
@@ -515,9 +512,8 @@ This causes all connected server instances to exit automatically.

### Compute Profiles

The `daemons` interface allows the easy specification of compute
profiles. This is for managing tasks with heterogeneous compute
requirements:
The `daemons()` interface also allows the specification of compute
profiles for managing tasks with heterogeneous compute requirements:

- send tasks to different servers or server clusters with the
appropriate specifications (in terms of CPUs / memory / GPU /