
Fix Haskell's arrow notation #1

Open
tomjaguarpaw opened this issue Dec 1, 2019 · 33 comments

@tomjaguarpaw
Owner

No description provided.

@tomjaguarpaw tomjaguarpaw self-assigned this Dec 1, 2019
@lexi-lambda

The trouble with making commands less second class is that they fundamentally involve a complex notion of lexical closure. Given a command f -< e, e may refer to “arrow-local” variables. This command may be passed to a control operator like handle, but it does not close over its free variables, as they must be threaded through the operator explicitly. This is where the environment-passing of proc’s desugaring comes in.

For a command to stand on its own, it needs to include information about which variables are free in its body within its type. This is complicated, as it requires an extra dimension of information beyond what types normally contain.

For a way to encode this, one could imagine an entirely new kind of type, an Id. These are not ordinary Types, nor are they Symbols—you could think of them kind of like hygienically-scoped identifiers. In that sense, they would be related to Template Haskell’s 'Name and ''Name quotations (though unfortunately that syntax is already stolen at the type level for data kinds). An Id would behave kind of like a skolem, but instead of being scoped by a forall, it would be implicitly scoped by the placement of the term-level binder.

Using this new Id kind, a command could be given the type

Command :: IdBag -> [Type] -> Type -> Type

where the IdBag contains the command’s free variables, the [Type] is the argument stack, and the Type is the return type. However, implementing this Id kind would obviously be a pretty radical change to the type system for arrows, which would be a pretty tough sell!
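
For concreteness, here is a rough sketch of that shape, approximating the hypothetical Id kind with Symbol (which loses exactly the hygiene property discussed further down, so this is only an illustration, and all names are made up):

{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Type)
import GHC.TypeLits (Symbol)

-- Approximation only: a "bag" of free variables as name/type pairs.
type IdBag = [(Symbol, Type)]

-- The shape of the proposed command type: free variables, argument stack,
-- and return type (no constructors, just the kind-level structure).
data Command (env :: IdBag) (stack :: [Type]) (result :: Type)

-- For instance, the command  \e -> h -< (e, baz)  discussed later in this
-- thread might be indexed roughly as (with baz :: b):
type ExampleH' e b c = Command '[ '("baz", b) ] '[ e ] c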

@tomjaguarpaw
Owner Author

I think your Command type describes the command type from Definition 3 (page 5) of http://www.staff.city.ac.uk/~ross/papers/notation.html (although the arrow type itself seems to be missing from your version).

@lexi-lambda

Yes, they are related, though I think the command type given by Definition 3 lacks the information about the free variables, which is the biggest obstacle to a command standing on its own. Take a look at Type and Translation Rules for Arrow Notation in GHC, and note how the context of arrow-local variables (named Δ) is interwoven with both typechecking and desugaring rules.

The ArrowEnv type family given in my proposal, and the corresponding Σ form in my typechecking judgements, are closer to the Command type described here. However, there is still a key difference: in the current typechecking rules and my proposal, the environment threading only appears explicitly in the typechecking rule for (| e c ... |); in other rules the environment threading is implicit. This is precisely because the notation is so aggressively first order: the (| e c ... |) rule is the only place where some second-order behavior sneaks in, and even there, it’s highly constrained.

If you want commands to be real values, it becomes possible to move them out of their current context into some other context entirely, and the assumptions made by the current rules for proc no longer apply. Therefore, you have to move a lot more information into the type system to ensure that the eventual use site where the command ends up still includes the necessary context, and captures the necessary information to wire up the command’s inputs to the corresponding values in its context during desugaring.

@tomjaguarpaw
Owner Author

Isn't your IdBag the thing on the LHS of Paterson's ~>?

@lexi-lambda

Oh, I suppose you are right. But Paterson’s ~> is really just using an extra argument on the argument stack to thread the environment through a control operator using a compiler-defined representation; it is guaranteed to work because the thing on the LHS (the b in b ~> θ) is universally quantified. Paterson punts on the idea of the sort of “relocatable” command I describe above. Things only work because the skolem in the type assigned to control operators ensures the environment is threaded to the operator’s arguments, so the first-order nature of the rest of the system can be maintained.
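
To make the environment threading concrete, here is a small, self-contained example of what a control operator might look like for Kleisli IO, written with ordinary arrow types rather than ~> (handleKleisli is a hypothetical name, and this is only a sketch of the idea, not GHC’s actual calling convention):

import Control.Arrow (Kleisli (..))
import Control.Exception (SomeException, try)

-- A hypothetical 'handle'-style control operator for Kleisli IO.  The
-- environment type env is threaded, unchanged, to both argument arrows;
-- the desugaring of proc decides what env actually is at each use site.
handleKleisli :: Kleisli IO env a                   -- the body command
              -> Kleisli IO (env, SomeException) a  -- the handler command
              -> Kleisli IO env a
handleKleisli (Kleisli body) (Kleisli handler) = Kleisli $ \env -> do
  result <- try (body env)
  case result of
    Right a -> pure a
    Left e  -> handler (env, e)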

@tomjaguarpaw
Owner Author

tomjaguarpaw commented Dec 1, 2019

I would love an arrow syntax that allowed the types to look like the curried function types that we're used to in Haskell. For example, instead of

keyed :: Rule m (e, k, a) b -> Rule m (e, Map k a) (Map k b)

to have something like (following Paterson's notation)

keyed :: (k ⇀ a ⇀ Rule m \ b) -> (Map k a ⇀ Rule m \ Map k b)

This looks a lot like the monadic version would

keyed :: (k -> a -> Rule m b) -> (Map k a -> Rule m (Map k b))

But even if we could get ⇀ to mean something sensible we still need to handle the environment, as you mentioned above.

keyed :: (k ⇀ a ⇀ e ⇀ Rule m \ b) -> (Map k a ⇀ e ⇀ Rule m \ Map k b)

This looks less pleasant, but it's still workable.

@tomjaguarpaw
Owner Author

I think we probably have to put the environment into the terminal type constructor. Can we have something like

data Command arr env =
    A (arr env Type)
  | (⇀) Type (Command arr env)

@lexi-lambda

The issue I was alluding to above is that it isn’t clear what env ought to be instantiated with. What do you put there? It can’t be a tuple, since that just describes the types of things, but you can’t connect the types to particular variables in scope without more information. It can’t be a mapping from variable names to types, as variables can be shadowed. So you need something more—and that brings in all the complexity of the hypothetical Id kind.

@tomjaguarpaw
Owner Author

Oh, I assumed that keeping it polymorphic would always be fine.

@lexi-lambda

If you can take the command out of its context and pass it around, where does the quantifier go?

@tomjaguarpaw
Owner Author

Hmm, it feels like keeping it polymorphic a la ST should be sufficient, but I may not be imagining the same sorts of things that you are.

@lexi-lambda

To be a little less cryptic, let me put it another way. When GHC checks a control operator against a polymorphic type, that’s okay, because a control operator isn’t a command, but a command maker. All that polymorphic type is saying is that the command maker needs to work on commands with any environment, and it needs to treat them all equivalently. That’s fine.

But GHC never assigns a polymorphic type to the environment of an actual command itself. That would be a big problem! If a command is given the type ∀ env. Command env args a, that implies it has to work with any environment… which would mean the term would have to be closed. No good. Dually, if a command is given the type ∃ env. Command env args a, that would mean the caller has to be able to provide any environment, which is also no good.

So after GHC checks a control operator against the polymorphic type, it actually immediately instantiates it to a concrete type before continuing with typechecking. It only uses the polymorphism to ensure the control operator can’t observe the concrete type, since it’s an implementation detail.

Put yet another way, the polymorphism in a control operator is simply a way of ensuring two environments are exactly the same (specifically, the input environment to the resulting command and the input environment to its arguments). It pins a relationship between two commands. If you want a command to be able to stand on its own, there’s no relationship you can take advantage of, there’s just an environment, so you can’t get away with the polymorphism trick.
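
As a rough Haskell analogue of that distinction (illustrative only, using a plain arrow out of env as a stand-in for a command):

{-# LANGUAGE RankNTypes #-}

import Control.Arrow (Arrow, arr, (>>>))

-- A plain arrow out of env, standing in for "a command with environment env
-- and result a" (an approximation for the sake of analogy).
type Cmd arr env a = arr env a

-- Fine: a command *maker* may be polymorphic in env, because the
-- polymorphism only pins the environment of its argument command to the
-- environment of its result command; it never inspects env itself.
mapResult :: Arrow arr => (a -> b) -> (forall env. Cmd arr env a -> Cmd arr env b)
mapResult f c = c >>> arr f

-- Not fine as the type of an actual command: polymorphism in env means the
-- command can never look at its environment, i.e. it must be closed.
closedCmd :: Arrow arr => (forall env. Cmd arr env Int)
closedCmd = arr (const 42)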

@tomjaguarpaw
Owner Author

tomjaguarpaw commented Dec 1, 2019

This is an "embedded" version of the typing rules that I'm imagining. I've used GADTs rather than TypeFamilies as I prefer to keep things explicit. That implies a certain amount of syntactic noise, though it wouldn't leak through to the user.

{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE TypeOperators  #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds      #-}

import GHC.Types
import Control.Arrow
import Control.Category

data CG type_ = A type_ | (:->) type_ (CG type_)

infixr :->
infixr :->.

type (:->.) (a :: k) (b :: k) = a :-> A b

data Command arr env c where
  I :: arr env b -> Command arr env (A b)
  K :: Command arr (env, a) b
    -> Command arr env (a :-> b)

data CO f c where
  J :: f b -> CO f (A b)
  L :: (f b -> CO f c) -> CO f (b :-> c)

type ControlOperator arr env = CO (Command arr env)

originalKeyed :: (Eq k, Applicative m)
              => Rule m (e, k, a) b
              -> Rule m (e, Map k a) (Map k b)
originalKeyed = undefined

keyed :: (Eq k, Applicative m)
      => ControlOperator (Rule m) env
         ((k :-> a :->. b) :->. (Map k a :->. Map k b))
-- It's the type that's important.  The implementation is too messy to be
-- worth looking at.  (A case rather than a lazy let pattern, since GADT
-- constructors can't be matched in a pattern binding.)
keyed = L (\x -> case x of
  K (K (I x')) ->
    let mungeArguments = (arr (\(e, k, a) -> ((e, k), a)) >>>)
    in  J (K (I (originalKeyed (mungeArguments x')))))

-- Irrelevant cruft:

data Map k v
-- Only doing this proxy business to avoid turning on KindSignatures
data Rule m a b = Rule (Proxy (m a))
data Proxy a = Proxy

instance Category (Rule m)
instance Arrow (Rule m)

[EDIT: Improved names] [EDIT2: Generalised to a jaw-dropping degree]
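
As a usage sketch of the embedding above (applyCO and applyKeyed are illustrative additions, not part of the code itself), one could apply the embedded operator to a command like this:

-- Run one layer of an embedded control operator, and use it to apply keyed
-- to a command.  The pattern matches are exhaustive because the type
-- indices rule out the other constructors.
applyCO :: CO f (b :->. c) -> f b -> f c
applyCO (L g) x = case g x of J r -> r

applyKeyed :: (Eq k, Applicative m)
           => Command (Rule m) env (k :-> a :->. b)
           -> Command (Rule m) env (Map k a :->. Map k b)
applyKeyed = applyCO keyed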

@tomjaguarpaw
Owner Author

Naively, I expect the env to play a similar role to the polymorphic return type in the holey monoid and various Cont tricks*. I expect it to either end up looking like a -> b -> c -> ... -> env or (a, (b, (c, ... env))) depending on which way it happens to work out in practice!

  • NB: not, as I said before, the same role as the fully polymorphic s in ST.
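
Purely as an illustration of those two shapes (the names are made up), for a command whose free variables have types a, b and c:

type EnvAsFunctions   r a b c = a -> b -> c -> r     -- "holey"/Cont-style
type EnvAsNestedPairs r a b c = (a, (b, (c, r)))     -- nested-pair style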

@lexi-lambda

I think you might be focusing too much on control operators (e.g. keyed) and not enough on commands themselves. The core issue at play here is what type we give to commands, since that’s where pinning down the meaning of env actually comes up.

Consider a simple use of proc:

proc (foo, bar) -> do
  baz <- f -< foo
  (g -< bar) `handle` \e -> h -< (e, baz)

Now, imagine we’d like to pull the \e -> h -< (e, baz) command out into a local variable, a perfectly reasonable thing to want to do in real programs. That means we’d like to be able to write something like this:

proc (foo, bar) -> do
  baz <- f -< foo
  let h' = \e -> h -< (e, baz)
  (g -< bar) `handle` h'

The key question is: what is the type of h'?

Given we’ve already established a command has three components—environment, argument stack, and return type—two thirds of the question is easy to answer. If we assume h has type arr (a, b) c, then h' must have type env ~> a ⇀ arr \ c. But what is env? It must be something, as the user presumably ought to be able to write the type down. And even if they can’t, GHC needs to know what env is in order to do the proper desugaring, since it relies on the type of env to know which variables to thread through handle.

To reiterate:

  1. env cannot just be b, since that is insufficient information for GHC to know where the value of type b comes from during desugaring.

  2. A closer solution would be to make env a row type, like (baz :: b | r), to use PureScript syntax. That would work, but it could be a little confusing, since if baz were shadowed between the definition of h' and its use site, it would use the shadowing definition, breaking lexical scope. (I have thought about problems like that at great length, since it’s very reminiscent of the hygiene problem in macro system research.)

  3. A perfect solution would be a row type where baz is not a symbol, but a lexically-/hygienically-scoped name, aka my hypothetical Id kind. Pinning down the precise semantics of such a type would be tricky, though—you need to know how to check to ensure an identifier does not “escape” its scope, for example, and that involves an interaction with term-level scoping that skolem escape doesn’t have to deal with.

@tomjaguarpaw
Owner Author

Let's see if I can catch up with you a bit. It seems to me that we would like to be able to go one step further and write

proc (foo, bar) -> do
  baz <- f -< foo
  let h' = \e -> h -< (e, baz)
  let g' = (g -< bar)
  g' `handle` h'

Assume h has type arr (e, baz) c and g has type arr bar c. Then h' must have type env ~> e ⇀ arr \ c and g' must have type env ~> bar ⇀ arr \ c. In particular the envs must be the same type, right? (The cs are also the same type but I don't think that's very interesting).

@lexi-lambda

In that particular case, yes… but depending on whether or not you believe Let Should Not Be Generalized, I think it’s not necessarily clear that their inferred types should be identical. Consider this slightly modified example:

proc (foo, bar) -> do
  let h' = \e -> h -< (e, foo)
  let g' = g -< bar
  baz <- g' `handle` h'
  let i' = i -< baz
  i' `handle` h'

In this example, h'’s environment type must fundamentally be more than an (unconstrained) rigid type variable for the program to typecheck. The environment of i' includes baz, but obviously the environment of g' cannot. It ought to still be legal to use h' in both of them, assuming the type of h' is generalized.

@tomjaguarpaw
Owner Author

tomjaguarpaw commented Dec 2, 2019

Intuition question: is the env basically capturing the type of the "closure" formed by an application abstraction?

@lexi-lambda

Yes, exactly. Sorry if I hadn’t made that clear already; that’s sort of essential.

The thing that makes it complicated is you don’t actually have arrow closures, since arrows only have inputs and outputs. So when you desugar something like

proc (x, y) -> do
  z <- f -< x
  (g -< z) `handle` (h -< y)

you have to turn it into something like

first f >>> handle (first g) (second h)

and for more elaborate kinds of composition it gets ugly really quick.
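
For concreteness, here is a type-checkable rendering of that pipeline, with the projections written explicitly (a sketch only; the shape given to handle here is a simplification, not GHC’s exact convention for control operators):

import Control.Arrow (Arrow, first, (^>>), (>>>))

-- handle is taken as an argument, already instantiated at the local
-- environment (z, y); the projections the desugaring would insert are
-- written explicitly with ^>>.
desugared :: Arrow arr
          => (arr (z, y) r -> arr (z, y) r -> arr (z, y) r)  -- handle
          -> arr x z                                          -- f
          -> arr z r                                          -- g
          -> arr y r                                          -- h
          -> arr (x, y) r
desugared handle f g h = first f >>> handle (fst ^>> g) (snd ^>> h)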

One thing I guess you could maybe consider, which I haven’t really thought about, is whether it would be possible to extend the notion of closure to arrows. That is, could you somehow introduce an operation that encodes closure more directly? I’m not sure what that would mean, since closure implies a lexical scope “outside” the arrow that it can close over, but you have to be “trapped inside” the arrow for it to work; otherwise you have to thread arrows through arrows, and that needs ArrowApply.

@tomjaguarpaw
Owner Author

Ah great. What confused me is that the type of a control operator is supposed to be

forall b. (b ~> t1) -> ... -> (b ~> tn) -> (b ~> t)

but now I realise (I think) that this will effectively only be ever used at type

forall a1 ... an. (a1 ~> t1) -> ... -> (an ~> tn) -> ((a1, ..., an) ~> t)

by choosing b = (a1, ..., an) and then projecting what we need for each argument. The latter seems encouraging. Doesn't it give us a way of more clearly threading what we need to the right place?

@tomjaguarpaw
Owner Author

Specifically, where Paterson has

proc p -> form e c1 ... cn = e (proc p -> c1 ) ... (proc p -> cn )

couldn't we replace it with

proc (p1, ..., pn) -> form e c1 ... cn = e (proc p1 -> c1) ... (proc pn -> cn)

@lexi-lambda

but now I realise (I think) that this will effectively only be ever used at type

Essentially yes, but the type you’ve written isn’t quite right: all the bs have to be the same, since the control operator is just passing the same value to each of its arguments. So it’s really

forall a1 ... an. ((a1, ..., an) ~> t1) -> ... -> ((a1, ..., an) ~> tn) -> ((a1, ..., an) ~> t)

and then each argument projects what it needs out of that tuple.
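
As a small illustration of that instantiation in plain Arrow combinators (ignoring the argument stack; useAtPair is a hypothetical name):

{-# LANGUAGE RankNTypes #-}

import Control.Arrow (Arrow, (^>>))

-- Instantiate an environment-polymorphic control operator (two arguments)
-- at env ~ (a1, a2), and let each argument arrow project out the piece it
-- actually needs.
useAtPair :: Arrow arr
          => (forall b. arr b t1 -> arr b t2 -> arr b t)  -- control operator
          -> arr a1 t1                                     -- really wants only a1
          -> arr a2 t2                                     -- really wants only a2
          -> arr (a1, a2) t
useAtPair op c1 c2 = op (fst ^>> c1) (snd ^>> c2)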

Specifically, where Paterson has

proc p -> form e c1 ... cn = e (proc p -> c1 ) ... (proc p -> cn )

couldn't we replace it with

proc (p1, ..., pn) -> form e c1 ... cn = e (proc p1 -> c1) ... (proc pn -> cn)

No, because all the arguments have to have the same type. Remember: you don’t control e, it’s just an expression that produces a function on arrows. The arrow it produces accepts an argument and threads that argument to its argument arrows. It wouldn’t “know” how to project the right tuple out for each argument individually.

@tomjaguarpaw
Owner Author

OK, I'll have to mull that over a bit more.

Another question: is this basically just doing closure conversion?

@lexi-lambda

Another question: is this basically just doing closure conversion?

Yes, it is pretty much precisely closure conversion. The problem is that the implementation details of our conversion algorithm leak out into the type system.

@tomjaguarpaw
Owner Author

Great. Until future notice I will think of this as performing closure conversion with only a restricted set of operations to hand.

@tomjaguarpaw
Owner Author

I observed two things that interested me. Firstly, if we can make abstractions of the form we've been considering then we can package up things that only exist in the "arrow argument scope" as commands and then later extract them, e.g.

    ...
    let bAsACommand = returnA -< b
    ...
    b <- bAsACommand -< ()

There's nothing new in this except that this particular way of looking at things was new to me.

Secondly, any "env as scope" approach has to be able to deal with shadowing, so if we write

proc (foo, bar) -> do
  baz <- f -< foo
  let h' = \e -> h -< (e, baz)
  baz <- somethingElse -< quux
  (g -< bar) `handle` h'

then if the env for h' refers to baz it has to be able to deal with baz becoming shadowed later (unless we want dynamic scope :D).

@lexi-lambda

Secondly, any "env as scope" approach has to be able to deal with shadowing

Yes, that’s what I was talking about in #1 (comment) (see points number 2 and 3). It’s tricky! I can’t think of a solution other than my complicated Id kind idea, but maybe I’m not thinking sufficiently outside the box.

@tomjaguarpaw
Owner Author

Ah yes, I see. I'm glad I'm finally catching up :)

@tomjaguarpaw
Owner Author

I'm pleased to report that in the intervening four weeks I've managed to improve my understanding to the point where all your comments above make sense to me. Previously I didn't really have a solid grasp of the issues under discussion. This summarises my current beliefs:

  1. It should be possible to express any "higher-order" arrow notation expression using arrow combinators

  2. It should be possible to come up with a desugaring that works in the general case

  3. This desugaring might have to be rather complicated and might interact in interesting ways with Haskell's type checker, because it requires explicitly keeping track of the context of every statement (i.e. all the variables in scope) and plumbing them correctly to their point of use (simulating closures).

@tomjaguarpaw
Owner Author

  4. We're trying to do something a little bit like converting lambda-calculus to SKI.

@ivanperez-keera

As a regular user of arrows, I believe I could contribute to advancements in this area in GHC, but I am behind on my understanding of the current state of arrows, and the problem that this issue is trying to solve.

As time passes, the papers become outdated and it becomes harder and harder to catch up. This was originally my problem: syntax that I thought was supported wasn't, so I learned arrows by using them, but saw no clear path from the papers to the code.

I do not understand all the syntax that is available. For example, I have never used banana brackets. I also normally limit my use of arrows to Arrow, ArrowLoop, and ArrowChoice.

What is the smallest set of up-to-date references that one would have to read to understand arrows as they are implemented today?

@lexi-lambda

What is the smallest set of up-to-date references that one would have to read to understand arrows as they are implemented today?

As far as I can tell, if you demand they be both complete and fully up to date, the minimal set currently includes the GHC source code. Alas.

Type and Translation Rules for Arrow Notation in GHC is by far the closest thing we have, though it includes no explanatory prose, so it’s best viewed as a revision of the information in Paterson’s original paper. However, even those two do not form a full picture, as the changes made in GHC 7.10 are not represented in either of them. The relevant section of the GHC User’s Guide gets most of the rest of the way there, but it is imprecise and misses a few subtleties.

That said, I think understanding the above three resources will basically have you up to speed, and it’s certainly enough to be able to have an informed discussion. Some of the added context explained in my proposal can also help to understand a few of the subtleties absent from the GHC User’s Guide.

If that proposal is accepted and implemented, I’ll try to update the Type and Translation Rules document to reflect the status quo as part of those changes, but for now, that’s the situation.

@tomjaguarpaw
Owner Author

it is pretty much precisely closure conversion

This answer expands a little bit on this idea

https://stackoverflow.com/questions/2300151/closure-conversion-and-separate-compilation-of-higher-order-function-calls/2300601#2300601
