-
Notifications
You must be signed in to change notification settings - Fork 112
TM018 Tuples
Updated 6-21-2016 to use {a;b}
syntax
Updated 11-29-2015 to use {a,b}
syntax.
Updated 5-28-2015 to use (a;b)
syntax.
The currently-proposed t;0
syntax for access is ambiguous, need a better alternative. We don't want to use t.0
because that's too close to conflicting with likely syntax errors concerning number literals like .5
, which need a special syntax error.
Require spaces after semi-colons?
Change tuple access to not need a dot?
Tuples are useful for a whole bunch of things, and until now we've tried to see if we can get by without explicit tupling in the language. We've used anonymous objects for returning small collections of things, and several of us have ended up just adding a "Pair" datatype that we've used in isolated cases. Also, constructor syntax for key-value constructors (like string-dict) just don't work out in a typed context as well without some way to group keys and values.
Good tuple syntax would help us clean up a lot of these use cases. This is (the start of) a tuple proposal for Pyret.
Parenthesized, semicolon-separated sequences are for tuples:
Braced, comma-separated sequences are for tuples:
t = {"fst", "snd", 3}
The type syntax for tuples uses the same notation:
{Number, String, (Boolean -> Boolean)}
Destructuring can be done by assignment:
check:
{a; b; c} = {1; 2; 3}
a is 1
b is 2
c is 3
end
Or by explicit access:
check:
tup = {"a"; "b"; "c"}
tup{0} is "a"
tup{1} is "b"
tup{2} is "c"
end
That's about it; lots more discussion is below.
This would allow us to change the string-dict constructor to:
sd = [string-dict: {"a"; 5}, {"b"; 6}]
If we want special-er key-value syntax in constructors, it could desugar to tuples, which is still easy to type-check. So
[string-dict: "a" => 5, "b" => 6]
Could desugar to:
[string-dict: {"a"; 5}, {"b"; 6}]
Destructuring assignment is special syntax just for tuples:
{a; b; c} = {1; 2; 3}
A nice thing about this is that it doesn't require any special integration with cases to start, but it can be added later. So if you have a list of pairs, you'd write:
fun unzip<A; B>(l :: List<{A; B}>) -> {List<A>; List<B>}:
cases(List<(A; B)>) l:
| empty => {empty; empty}
| link(f, r) =>
{v1; v2} = f
{rest1; rest2} = unzip(r)
{link(v1, rest1); link(v2, rest2)}
end
where:
unzip([list: {1; 2}, {3; 4}]) is {[list: 1; 3], [list: 2; 4]}
end
Down the road, if pattern matching in cases gets more complete, the matches can just merge up into the cases patterns. To start, I think extending let-binding to handle this case will add enough for us to get a feel for the feature.
This kind of assignment does not nest (e.g. no multiple tuple levels).
We will have to write, e.g. unzip3, unzip4, for larger tuple cases.
For arbitrary element access, we could also add special syntax that requires a number:
check:
tup = {1; 2; 3}
tup.{0} is 1
tup.{1} is 2
tup.{2} is 3
tup.{3} raises "lookup-large-index"
end
It's a little ugly, but not terrible. Chaining could get odd:
check:
tup = {{"skip"; {"skip"; "skip"; {"skip"; "skip"; "skip"; "this one!"}}}}
tup.{0}.{1}.{2}.{3} is "this one!"
end
I think we do want (whether we expose concrete syntax for it or not), a binding form that does this, so we can express tuple deconstruction as a desugaring rather than as its own operation. That is,
{a; b; c} = {x; y; z}
means...
t = {x; y; z}
a = t.{0}
b = t.{1}
c = t.{2}
This avoids tying up the semantics of tuple access with binding forms.
Tuples recursively check their members for both equal-always
and equal-now
.
Tuples have identity, so identical
will report separate instantiations of
tuples containing the same values as non-identical.
check:
{1; 2; 3} is {1; 2; 3}
{1; 2; 3} is%(equal-now) {1; 2; 3}
{1; 2; 3} is-not%(identical) {1; 2; 3}
t = {1; 2; 3}
t is%(identical) t
end
One point of this proposal is to add enough syntax to make type-checking tuples as easy as possible:
{e1; e2; ...} :: {t1; t2; ...}
Single-element access is easy to type-check by projecting out the correct index of the tuple type (and checking for overflow). By the desugaring above, checking destructuring assignment is also simple.
One thing we could do to aid inference is allow annotations in tuple destructuring:
{a :: Number; b :: String; c :: (Number -> Number)} = {x; y; z}
This is similar to how we allow annotations in cases and for bindings.
raw-arrays already steal JavaScript array as a datatype (and they probably deserve to keep it). I think we'd want a few representations for the simple cases:
function Tuple1(v0) {
this.v0 = v0;
}
function Tuple2(v0, v0) {
this.v0 = this.v0;
this.v1 = this.v1;
}
function Tuple3(v0, v1, v2) {
this.v0 = this.v0;
this.v1 = this.v1;
this.v2 = this.v2;
}
And then
function TupleN(v0, v1, v2, vs) {
this.v0 = v0;
this.v1 = v1;
this.v2 = v2;
this.vs = vs;
}
For all further cases.
The compiler ought to be able to generate really fast code for the small cases (which will look like normal JS class member lookup to please the JIT), and will have an extra level of indirection (through an array) for the larger cases.
Concretely:
tup{0} == compiles to ==> tup.v0
tup{1} == compiles to ==> tup.v1 ? tup.v1 : tuple_index_error(...)
tup{2} == compiles to ==> tup.v2 ? tup.v2 : tuple_index_error(...)
tup{3} == compiles to ==> tup.vs && tup.vs[3] ? tup.vs[3] : tuple_index_error(...)
The checks are obvious to omit when code is fully type-checked.
Since these are for exploration, they may have some typos or thinkos in them; please feel free to correct on the mailing list or via an issue.
Several are pretty obvious functional patterns, which are used to feel out the syntactic choices.
fun zip<A; B>(l1 :: List<A>, l2 :: List<B>) -> List<{A; B}>:
doc: "Assumes same length for lists"
cases(List<A>) l1:
| empty => empty
| link(f1, r1) =>
link({f1; l2.first}, zip(r1, l2.rest))
end
end
With some extensions to cases
, we could write zip
as:
fun zip<A, B>(l1 :: List<A>, l2 :: List<B>) -> List<{A; B}>:
doc: "Assumes same length for lists"
cases({List<A>, List<B>}) (l1; l2):
| {empty; empty} => empty
| {link(f1, r1); link(f2, r2)} =>
link({f1; f2}, zip(r1, r2))
end
end
fun unzip<A, B>(l :: List<{A; B}>) -> {List<A>; List<B>}:
cases(List<(A; B)>) l:
| empty => {empty, empty}
| link(f, r) =>
{v1; v2} = f
{rest1; rest2} = unzip(r)
{link(v1, rest1); link(v2, rest2)}
end
where:
unzip([list: {1; 2}, {3; 4}]) is {[list: 1, 3]; [list: 2, 4]}
end
import string-dict as SD
users = [SD.string-dict
{"[email protected]"; make-user()};
{"[email protected]"; make-user()}]
for each(u from users.items()):
{email; user-obj} = u
when user-obj.is-new-account():
send-welcome-email(email)
end
end
This kind of pattern comes up all the time both in our compiler and in CS173 assignments.
fun eval-with-stores(exprs :: List<Expr>, store :: Store) -> {List<Val>; Store}:
for fold(acc from {empty, store}, expr from exprs):
{vals; current-store} = acc
{val; new-store} = eval(expr, current-store)
{link(val, vals); new-store}
end
end
Currently, things like this are common:
fun eval(expr :: Expr, s :: Store) -> { v: Val, s: Store }:
# an interpreter
end
fun eval-with-stores(exprs :: List<Expr>, store :: Store) -> {vs : List<Val>, s: Store}
for fold(acc from {vs: empty, s: store}, expr from exprs):
result = eval(expr, acc.s)
{
v: link(result.v, acc.vs),
s: result.s
}
end
end
Which requires remembering names like "v"
and "s"
if it wants to be
concise, or writing out long names like "value"
and "store"
to be more
descriptive. Also note that it would be weird to name the values field in the
return of eval-with-stores
just "v"
rather than "vs"
, since it needs to
be descriptive. With pairs it's more natural to be anonymous.