Skip to content

Latest commit

 

History

History
688 lines (445 loc) · 22 KB

agere2015.md

File metadata and controls

688 lines (445 loc) · 22 KB

Deny Capabilities for Safe, Fast Actors

Sylvan Clebsch, Sophia Drossopoulou

Sebastian Blessing, Andy McNeil

Imperial College London, Causality Ltd


Motivation

  • Combining the actor-model with shared memory eliminates the need to copy messages between actors.
  • This improves performance, but results in the possiblity of data-races.
  • Dynamic data-race prevention (via locks or lock-free algorithms) is error-prone and carries a run-time cost.
  • A static approach could eliminate all data-races with no run-time cost.

Goals

  • Statically checked data-race freedom.
  • Ability to pass both mutable and immutable data in messages safely.
  • No restriction on message shape (e.g. trees).
  • Apply our system to a real-world actor-model programming language: Pony.
http://ponylang.org

What can be safely sent in a message by reference?

  • References to actors that allow sending messages, but not reading from or writing to the actor's state.
  • Opaque references that allow neither reading from nor writing to the object.
  • Globally immutable references.
  • Mutable references, as long as the sending actor is guaranteed not to retain an alias to the object that allows either reading from it or writing to it.
    • That is, _read and write unique_ mutable references.

What permissions have to be expressed?

  • Writeability.
  • Readability.
  • Aliasing.
    • Track writeable aliases to determine global immutability.
    • Track readable and writeable aliases to determine read/write-uniqueness.

Expressing writeability and readability on a reference is easy.

Statically tracking all possible aliases is trickier. For example, in C++ a `const` reference is _locally_ immutable, but has no guarantee of _global_ immutability.

Let's concentrate on those aliases.


First, a terminology note:

  • A _local_ alias is another reference to the same object, held by the _same_ actor.
  • A _global_ alias is another reference to the same object, held by a _different_ actor.

Isolation

Let's call a read/write-unique mutable reference to an object iso, to indicate an isolated type.

An __iso__ reference to an object must guarantee that there are no readable or writeable aliases to the object, anywhere in the program. If such aliases existed, and we sent the object to another actor, we couldn't guarantee the __iso__ object and those aliases were held by the _same_ actor. So we couldn't guarantee data-race freedom.

This is ok.

digraph {
  Data [shape=box]
  Alice -> Data [label=iso]
  Bob -> Data [label="no permissions"]
}

But this is bad!

digraph {
  Data [shape=box]
  Alice -> Data [label=iso]
  Bob -> Data [label=readable]
}
Bob can read the Data while Alice is writing to it, which can cause data-races.

And this is bad too!

digraph {
  Data [shape=box]
  Alice -> Data [label=iso]
  Alice -> Data [label=readable]
  Bob
}
Alice could _send_ the __iso__ Data to Bob, but retain a readable reference. So in the future, Bob could write to the Data while Alice reads it.

The existence of an iso reference to an object must deny the existence of any local or global readable or writeable alias to the same object.


Global Immutability

Let's call a globally immutable reference to an object val, to indicate a value type.

A __val__ reference to an object must guarantee that there are no writeable aliases to the object, anywhere in the program. It's ok for readable references to exist, since we won't write to the object.

This is ok.

digraph {
  Data [shape=box]
  Alice -> Data [label=val]
  Alice -> Data [label=readable]
  Bob -> Data [label=readable]
  Bob
}

But this is bad!

digraph {
  Data [shape=box]
  Alice -> Data [label=val]
  Bob -> Data [label=writeable]
  Bob
}
Alice can read while Bob writes.

And this is bad too!

digraph {
  Data [shape=box]
  Alice -> Data [label=val]
  Alice -> Data [label=writeable]
  Bob
}
Alice could _send_ the __val__ Data to Bob, but retain a writeable reference.

The existence of a val reference to an object must deny the existence of any local or global writeable alias to the same object.

But readable aliases are fine.

Opaqueness

Let's call an opaque reference to an object that allows neither reading nor writing tag, to indicate a tag type.

A __tag__ reference to an object doesn't have to make any guarantees about aliases. It's ok for both writeable and readable aliases to exist, since we won't read from the object.

This is ok.

digraph {
  Data [shape=box]
  Alice -> Data [label=tag]
  Bob -> Data [label=writeable]
  Bob
}

And this is ok, too.

digraph {
  Data [shape=box]
  Alice -> Data [label=tag]
  Alice -> Data [label=writeable]
  Bob
}

The existence of a tag reference to an object doesn't deny the existence of any other aliases to the same object.


A reference to an actor doesn't have to make any guarantees about aliases.

Since we interact with the actor asynchronously, a reference to an actor doesn't have to make any more guarantees than an opaque reference. So we can type actors as __tag__ references.

Deny Properties


Let's build a matrix of these deny properties.


An iso reference denies local and global read and write aliases.

| Deny global | aliases | | -----------------------|--------------|-----------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | | None | | |


A val reference denies local and global write aliases.

| Deny global | aliases | | -----------------------|--------------|-----------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | | |


A tag reference denies nothing.

| Deny global | aliases | | -----------------------|--------------|-----------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | | | tag


That's interesting! A type is sendable when it guarantees locally what it guarantees globally.

| Deny global | aliases | | -----------------------|--------------|-----------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | | | tag


What other properties can we derive from this matrix?


Any reference that denies global read/write aliases is mutable. You can use that reference to read from or write to the object, because no other actor can read from or write to the object.

| Deny global | aliases | | -----------------------|--------------|-----------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | | | tag | mutable | |


Any reference that denies global write aliases is immutable. You can use that reference to read from the object, because no other actor can read from or write to the object. You can't use that reference to write to the object, because some other actor might read from it.

| Deny global | aliases | | -----------------------|--------------|-------------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | | | tag | mutable | immutable |


Any reference that denies nothing is opaque. You can't use that reference to read from or write to the object, because some other actor might read from or write to the object.

| Deny global | aliases | | -----------------------|--------------|-------------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | | | tag | mutable | immutable | opaque


What goes in the empty cells of the matrix?


If we deny global read/write aliases, but allow any local alias, we have a reference that is mutable but not sendable.

It behaves like an object in most object oriented languages: you can have lots of aliases to it, and you can mutate it. But you have a guarantee: you know no other actor will read from or write to the object.

Let's call a non-unique mutable reference to an object ref, to indicate a reference type.

| Deny global | aliases | | -----------------------|--------------|-------------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | ref | | tag | mutable | immutable | opaque


If we deny global write aliases, but allow any local alias, we have a reference that is immutable but not sendable.

In other words, it is _locally_ immutable as opposed to _globally_ immutable. This is analogous to a C++ `const` type, except we also know it is data-race free: no _other_ actor will write to it.

Let's call a locally immutable reference to an object box, to indicate a black box type.

| Deny global | aliases | | -----------------------|--------------|-------------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | | val | None | ref | box | tag | mutable | immutable | opaque


If we deny global read/write aliases, but allow local read aliases (but not write aliases), we have a reference that is write-unique, but not read-unique.

We can read from it and write to it through this reference, because we know only the same actor can read from it. We can also _freeze_ it as a __val__ (globally immutable) type in the future, because we know the object is only writeable through this reference. Or we can give up our write-uniqueness and allow local writeable aliases, and become a __ref__ (non-unique mutable) type.

Let's call a write-unique mutable reference to an object trn, to indicate a transition type.

| Deny global | aliases | | -----------------------|--------------|-------------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | trn | val | None | ref | box | tag | mutable | immutable | opaque


That's it!


The other spaces in the matrix stay empty, because we can't locally deny aliases that we don't globally deny.

| Deny global | aliases | | -----------------------|--------------|-------------|----- Deny local aliases | Read/Write | Write | None Read/Write | iso | | Write | trn | val | None | ref | box | tag | mutable | immutable | opaque


(If you have better names for these annotations, please let me know.)


Reference Capabilities


We call this system of annotations reference capabilities (rcap).

It is inspired by _object capabilities_ (ocap), a system for building secure, principle of least authority code in object-oriented languages. Existing ocap actor-model languages, such as __E__ and __AmbientTalk__, are the inspiration for Pony and for the reference capability type system.

Reference capabilities have a subtype relationship.

digraph {
  rankdir = LR
  iso -> trn
  trn -> ref
  trn -> val
  ref -> box
  val -> box
  box -> tag
}
In other words, an rcap is a subtype of another rcap if it denies _more_.

Is this too complicated for a programmer to use?


Theoretically: it's fine!

When programming in languages that aren't provably data-race free, programmers already keep a model of isolated, globally immutable and opaque data in their heads. These annotations _reify_ for the compiler a concept that is already pervasive in real world code.

Empirically: it's fine?

When programmers learn Pony, it appears to take them around two weeks to get the hang of how to use reference capabilities to express their _existing_ ideas about ownership and sharing. But we don't have anything more than anecdotal evidence for this.

Good defaults reduce the annotation load.

In Pony, each type can have its own default rcap. As a result, roughly 90% of types in the standard library have no rcap annotation.

How do reference capabilities compose?


Alice has a bicycle, which has brakes, which might be engaged.

digraph {
  rankdir=LR
  "bike: Bicycle" [shape=box]
  "brakes: Brakes" [shape=box]
  "engaged: Bool" [shape=box]
  Alice -> "bike: Bicycle" [label=iso]
  "bike: Bicycle" -> "brakes: Brakes" [label=ref]
  "brakes: Brakes" -> "engaged: Bool" [label=val]
}
How does Alice see these different objects? We call this _viewpoint adaptation_. We write viewpoint adaptation as: object ▷ field = result.

The brakes see their engaged status as val.

digraph {
  rankdir=LR
  "bike: Bicycle" [shape=box]
  "brakes: Brakes" [shape=box]
  "engaged: Bool" [shape=box]
  Alice -> "bike: Bicycle" [label=iso]
  "bike: Bicycle" -> "brakes: Brakes" [label=ref]
  "brakes: Brakes" -> "engaged: Bool" [label=val]
}
How does the bicycle see the engaged status? Global immutability is _deep_: __ref__ ▷ __val__ = __val__ Alice sees the engaged status as __val__, because viewpoint adaptation is right-to-left: __iso__ ▷ (__ref__ ▷ __val__) = __val__

The bicycle sees its brakes as ref.

digraph {
  rankdir=LR
  "bike: Bicycle" [shape=box]
  "brakes: Brakes" [shape=box]
  "engaged: Bool" [shape=box]
  Alice -> "bike: Bicycle" [label=iso]
  "bike: Bicycle" -> "brakes: Brakes" [label=ref]
  "brakes: Brakes" -> "engaged: Bool" [label=val]
}
How does Alice see the brakes? The uniqueness guarantees of __iso__ can't be violated, so __iso__ ▷ __ref__ = __tag__. In other words, an alias of the brakes from _outside_ the bicycle has to be a __tag__ to prevent the bicycle's _isolation_ from being violated. If Alice _sends_ her bicycle to Bob, we have to be sure she doesn't keep any readable or writeable references to the bicycle _or to any mutable state the bicycle can reach_.

There are rules for how each rcap sees each other rcap.

Each rule is simply the application of the deny properties of the object's rcap to the field's rcap.

(This table is simpler than it looks.)

Field
Object iso trn ref val box tag
iso iso tag tag val tag tag
trn iso trn box val box tag
ref iso trn ref val box tag
val val val val val val tag
box tag box box val box tag
tag

Similarly, there are rules as to what rcaps can be written into the fields of an object without violating its rcap.

This means not violating read/write uniqueness for an __iso__ and not violating write uniqueness for a __trn__. We call this _safe-to-write_. We write safe-to-write as: object ◁ input.
Input
Object iso trn ref val box tag
iso
trn
ref
val
box
tag
Sendable and read/write-unique types are ok to write into an __iso__. Sendable and write-unique types are ok to write into a __trn__. Sendable and non-unique types (i.e. all types) are ok to write into a __ref__. The other types aren't mutable, so it's not ok to write any type into them.

Non-reflexive aliasing and ephemerality


Is this legal?

let x: Something iso = ...
let y: Something iso = x
No! We would have two __iso__ references to the same object. That's bad.

The alias of an rcap is the rcap that has the most deny properties, but which is compatible with the original rcap.

In other words, the original rcap doesn't deny the alias.
Original Alias
iso tag
trn box
ref ref
val val
box box
tag tag

We have to destroy an iso reference in order to assign the underlying object to another iso reference.

let x: Something iso = ...
let y: Something iso = consume x
We use `consume` to indicate a destructive read. The object isn't destroyed - but the alias (in this case `x`) is removed from the lexical scope.

What's the type of consume x where x: Something iso?

`Something iso^` The `^` means an _ephemeral_ (or _unaliased_ ) type.

An ephemeral type has had one alias removed.

Since an __iso__ denies local and global read/write aliases, an __iso^__ is not reachable by any readable or writeable aliases. Which means it is safe to assign it to an __iso__, thereby establishing _one_ reachable read/write alias.

Destructive read and non-reflexive aliasing give us static alias tracking with no runtime cost.


Future Work


Pony uses reference capabilities in the presence of generic types, generic methods, algebraic data types, and a non-null type system.

We hope to extend the formalisation of reference capabilities to cover these cases.

Pony uses reference capabilities to help optimise garbage collection.

We hope to formalise this too.

Come get involved!

Join our community of students, hobbyists, industry developers, tooling developers, library writers, compiler hackers, and type system theorists.

Why not you?

http://ponylang.org