-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A step towards dependent types #362
base: main
Are you sure you want to change the base?
Conversation
lib/hobbes/boot/dependent.hob
Outdated
@@ -0,0 +1,3 @@ | |||
|
|||
type Type = ^x.|Prim:([char] * (() + x)), Record:[[char]*x], Variant:[[char]*x*int], OpaquePtr:([char]*int*bool), Exists:([char]*x), Recursive:([char]*x), TAbs:([[char]]*x), TApp:(x*[x]), Func:([x]*x), Array:x, FixedArray:(x*x)| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome, nicely simplifies the crazy ad-hoc C++ encoding of the same data (and embeds in hobbes perfectly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort-of, note we are missing TExpr...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we hide the TExpr's Expr behind an existential?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed out TVar because I didn't want them as input, but it might actually be useful to be able to construct them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that in order to go fully into dependent types, we just need Expr
(with a few more constant forms for e.g. the unit type, the universe type hierarchy, ...) since we would no longer differentiate between expressions and types (or we'd say an expression belongs to a "type" which is another expression which belongs to a "type" etc until we terminate in a universe, where it'll always be the case that a universe belongs to the type of its successor universe).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can add this as TExpr:<hobbes. ExprPtr>, then give another unqualified to reify that pointer.
Edit, no that's a terrible idea. I should just make a similar recursive variant for Expr.
Then we could reimplement switchOf and use the hobbes types in c++ too...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making the internal representations of types and expressions into these bindable recursive variants seems like the right way to go. Then all of that boilerplate code to go between the two equivalent representations can be eliminated.
After that, reducing the type/expr types to just expr will get us to dependent types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Removing boilerplate is always the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now if we also constructed an Expr type, allowed that to be returned and added a SymDef that reifies that, we could write a generic apply.
I am sorry if the I think that moving hobbes to a dependent type system is a really great idea. There are a lot of things that we could do with such a type system. It doesn't even have to be homotopy type theory, but we can always benefit from more descriptive types. One thing that I struggle with a little bit is how to reconcile qualified types (like we have here where constraints in types get eliminated with |
I'm not being serious. Unqualifiers are great, I use one as a type provider for sql queries for instance. But it does do my head in a bit moving back between hobbes and c++, and since quote type existed already, I thought I sh^H^Hcould abuse them. I like the staged evaluation, I just want to do it one place. With this approach, I can in the 1st stage create the type sig from the TString or whatever, then in the second use type classes on it to create the value I need. In the file example, if we add a function that takes the deserialized type rep and shove it through the MonoTypePtrToExpr monstrosity, we could do all that fun file things entirely in staged hobbes |
This gets at something that I think is really important and subtle about PLs like this, which is this phase distinction between types and terms (and I guess it goes all the way up the universe hierarchy, so types are prior to terms but "kinds" are prior to types, etc). The reason that "recordHeadLabel" can be very efficient in its |
The recordHeadValue_ I provide is still executed at compile time (I meant lag in the repl), it is just compile takes longer, as we call into the llvm as part of the compile. The problem is the lack of fundeps. This could be solved if I wrote TypeApply a This approach does not abandon the multi-stage approach, it just allows writing stages in hobbes Specifically, once compiled, latency is unaffected. Compile takes longer, noticeable in the repl Compiling one line with an unqualifier, the unqualifier is call multiple times, each call involves here a call to the llvm. Caching on MonoTypePtr inputs would remove a lot of this and speed up compile, but not affect either way run-time speed. recordHeadValue_ at runtime is just cast and a Proj, no worse than the standard one. |
Oh I see, that's awesome! Yeah it should only need to compile and run once then, and maybe we turn down back-end optimizations a little (if the LLVM step is very expensive). |
Yeah, llvm is not fast. Maybe your standalone jit is what we need. Sadly looking at :x we see the llvm doesn't inline my version, but the timings aren't that different
|
BTW, I'm sorry but it is 9pm here on a Friday, I'm not going to fix the build now. I hope you can try out whatever the nix version is, which does build, to play around with this |
…pe to contain TVars, maybe returning them is useful
No problem, it was a small issue on some platforms/compilers and I fixed it. We use hobbes in a variety of different projects with different levels of strictness, so I've tried to capture as many of those as I can to avoid finding out about them when those projects can't use a new version of hobbes. I still need to look at this more closely, but I think that it's a very good idea and probably a step in the right direction. |
Just for fun, here's an example using this that isn't just a re-implementation of existing c++ code. This is a very simple printf type thing, where we use a quotedd string to generate a Type that then processes the arguments correctly.
Then
Feeding in arguments of the wrong type into the tuple is a compile-time error |
That printf example is excellent. I see how it stages evaluation so that format string processing happens at compile-time (and the format string type is equivalent to the unit type in run-time representation, so it has no cost at run-time). I wonder if the quote brackets are worthwhile or if it'd be better to avoid them (I'm just wondering, not saying that quote brackets aren't worthwhile). I added that quote syntax to do remote code execution, but have heard differing opinions about how clear it is to use (and have had a few people tell me that they have a hard time understanding "what happens when"). One good thing about the quote brackets is that it makes staging obvious (and people reading the code can have something to point to). Another way we could do it would be to make something like a pi-type where named arguments could be implicitly quoted. It's kind of tangential, but I'm curious to know what you think of working with quoted expressions versus alternatives. |
;) I don't think that's going to help matters. I see that quotes were introduced for rpc, but I think they are underappreciated and could be more widely useful. (inputFile could take an arg of a quoted string for instance, and remove the need for the specific unqualifier constraint boilerplate). As for your question, in afraid the answer is, as a pragmatic end-user, I don't really mind. I made this pr as functionality I wanted was annoyingly close, all the pieces were there, just needed joining. I don't have a well enough developed PLT sense of aesthetics to really be able to take an opinion. Quotes have the advantage of at least being very clear that something different is happening. |
That's good to hear, maybe the quote syntax is helpful as a visual cue that this part happens at an earlier stage. This is a really good start and we will probably want to iterate on it for a while. I don't know if it's better to keep it open or merge it and iterate with future PRs (generally I tend to merge and iterate with new PRs). There are cases for dependent types where you don't want the "quoted part" to go away at run-time. For example, you could have a dependent record type like There are other cases like that where we can use dependent types to relate multiple values. Another useful one is for decidable equality, where a check I hope that isn't discouraging to point out, but it's just to say that there's still some additional behavior we'd have to capture before we really said that we "have dependent types". This is a good step though. |
I think that we should add some tests around this, given that it's kind of a subtle feature and we want to capture the valuable things you've highlighted here. Maybe your examples could be turned into tests? That printf one especially is a great test, the type-level arithmetic also. |
} | ||
|
||
ExprPtr with(const TExpr *) const { | ||
// We can't put an arbitrary value in a Type, can we do something else? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although we could just return the ExprPtr out of this TExpr, to make this reflection total we'll have to do the same thing for expr (and/or just bring the two types into one).
lib/hobbes/lang/preds/dependent.C
Outdated
}; | ||
|
||
template <> | ||
struct lift<Type*, false> : public liftType { }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this go through this type_
global variable? Were you avoiding explicitly constructing the Type
structure here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extreme DRY adherence, sorry. The hobbes représentation for this much easier to read and refer back to, so I want it there, but getting that instance here is tricky
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll see if I can find a good way to do that without requiring you to repeat yourself (which would be very bad, I agree), but also without requiring this reacharound with a global variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put in a change to avoid the global variable in this case, hope you don't mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's how it's done, I was sure I'd tried that, but obviously not.
return true; | ||
} | ||
// This doesn't seem to work how I'd hoped, closures don't get eval'ed | ||
// properly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens that you aren't expecting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't get this sort of thing
t1 :: (TypeApply a `\x.\y.x+y` 3) => () -> a
t1 = \().unsafeCast(())
t2 :: (TypeApply b `\x y.apply` a 2) => a -> ((Z b))
t2 = unsafeCast
to work, but actually it seems to be dying on the def of t1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an interesting example, because in t1
your result for a
should actually have a qualified polymorphic type like (Add int a b) => exists E.(((E,a)->b) * E)
. However, type constraints only allow monomorphic types (or type variables) in their arguments, so it's kind of a problem of the result being "too big to fit".
That will be a problem for TypeApply
with any such type (not just closures).
The fact that LLVM gets involved and we do some compilation to determine a result for one of these constraints is another way to get to the same conclusion. We can only produce machine code (so only involve LLVM) at monomorphic types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, should have been more exact with signatures, replacing (+)
with ladd
, doesn't change matters, and the ladd closure is certainly compilable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I guess you have an arity mismatch in t1
(since TypeApply
expects 4 arguments). But it does seem like monomorphic closure types can't get through either:
> :t newPrim() :: (TypeApply a `\x y.\z.iadd(x,iadd(y,z))` 21 1) => a
stdin:1,28-52: Expected existential type in closure application
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TypeApply
takes a variable number of arguments, t1
should be fine there, since there is only one argument to the first function. One problem is that I've managed to delete some code, t1
as stands can only be evaluated to a long
, [char]
or Type
, the fallback code that shoves it into a quote has gone walkabout. If only I had tests ;)
Your attempt won't work, because you are using iadd
and feeding in long
s, (this is slightly annoying that type level longs are written without the L
, and type level 0L
must be written as false
, as 0
gets turned to void
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Facepalm, of course, my mistake re: type-level longs ... that should be changed. I was too happy with the product/sum deconstruction bottoming out in unit (for products) and void (for sums) so I could write a*b*1
or a+b+0
. In retrospect, it wasn't worth it. I think we can change 0
to a type-level number instead of void
.
Also the syntax difference for term-level long versus type-level long, that's annoying (and just tripped me up). If we do go all the way down the rabbit hole with dependent types, then we won't be able to distinguish syntax between levels, so will have to resolve this. :T
ExprPtr e = validateType(tenv, ex, ds); | ||
if (!e) return false; | ||
if (*e->type()->monoType() == *primty("long")) { | ||
r = MonoTypePtr(TLong::make(ctx->compileFn<long()>(e)())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like falling down the rabbit hole, very nice. :)
Oh yes, this is far from full dependent types, (see PR title ;) ), but it allows us to write lots of things that dependent types would. The "quoted part" is available at run-time, through typeValueLower. Here is an array that knows its length at compile time, the size is taken from the type
This seems close to the functionality (but not syntax) you mention above. (And we can add static lookup constraints)
Then
My plan here is to work from what can be easily added to allow a dependently typed program to be mechanically translated into something that compiles. Once every thing can be done like that, the final step is just syntatic sugar. |
I do get what you say though, for instance loooking over an array and pushing elements onto such a static array does not work, let alone if we did pushed to two arrays and then demanded proof they were equal. But by adding this limited type level application here, I think it might be become a bit clearer how to solve that in the current framework. The other option is to rewrite an just do it, but that seems more work |
Trying this example with an LLVM 3.5 debug mode build on macOS, for some reason the process hangs and eventually gets killed. But when I run it under ETA: Tried it with LLVM 6 debug mode, ran totally fine (much faster to compile actually). This might be an issue with 3.5, kind of annoying. I'll see if I can put a test in at least. |
I hope you don't mind me poking around your PR, but I added a unit test based on your printf example (changed to sprintf, so I get a value back I can validate). For new features like this, I think it's helpful to have some tests to show how it's supposed to work. |
I'm afraid I haven't seen anything like that, somewhat to my surprise this whole thing went very smoothly. |
Please feel free to do make whatever modifications make sense. I've got no great attachment to the code other than I like what can be done with it. Totally agree that tests for this sort of thing is necessary. If the overall approach makes sense, I'll try to add a couple more when I get time. |
static MonoTypePtr build(_OpaquePtr::type& op) { | ||
return OpaquePtr::make(makeStdString(std::get<0>(op)), std::get<1>(op), std::get<2>(op)); | ||
return OpaquePtr::make(makeStdString(op.at<0>()), op.at<1>(), op.at<2>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, the reason for this difference has to do with memory layout. An std::tuple<A,B>
doesn't have the same memory layout as an std::pair<A,B>
, nor the same memory layout as struct { A a; B b; }
. That's the only reason I made the hobbes::tuple<...>
type, to get a tuple with standard memory layout -- which is necessary so that tuples in hobbes can be mapped into and out of C++ without cost.
Are you happy with this change? I think that it's pretty small and isolated, basically just adding these two functions and a type definition. If you'd like to use it, I'm ok with merging it. |
Re: caching, I did something like this on an internal change, might be pushing it to github fairly soon. You can rely on the fact that type structural equality implies pointer equality and vice versa -- which makes it easy to cache by type. |
Thinking about this a little bit more, I think that this could actually be very useful for us to squeeze even more efficiency out of the case where we want to convert our structured logs back to text (maybe not the most exciting use-case, but a valuable use-case anyway). For structured logs (as produced by the header-only |
This is sweet, the ability for hobbes to support dependent types is fantastic. @adam-antonik, will you mind adding a section about this in our documentation as well ? 👍 for choosing |
Sorry for disappearing from this recently. I hope to have time to get back to this soonish. This isn't abandoned by me at all, but boring work has taken priority this month. |
No rush! |
I just noticed that it appears that the builds/tests are stuck for this PR although if you click through you'll see that they all ran and passed. I think that the reason is that you sent the PR from the master branch of your fork. I have hit this same problem before, and have found that I can't get travis-ci to indicate status correctly unless I submit from a non-master branch off of my fork. |
I'm upset that I have to write Unqualifiers in C++ and not in hobbes. So upset I made this mess.
This adds two Unqualifiers, TypeValueLower, and TypeApply.
Let's start with TypeValueLower. It lowers a TLong to a long, TString to a [char], and (quote a) to a. This allows us to write recordHeadLabel as
Great, but we still can't write recordHeadValue. Hence TypeApply.
will lower any types after the second to ExprPtr's, apply them to f, and raise the result to a MonoTypePtr stored in a. For the arguments: TLongs to longs, TStrings to [char], (quote a) to a, and an arbitrary non-type variable to an encoding as a recursive variant named Type that stores that type information. For the result: long to TLong, [char] to TString, Type to its MonoType counterpart, and anything else to a quoted expression.
So we can do this
Then
But also
Using the quote conversion to do
Let's now do something with the ExprPtr representation of a type, and write recordHeadLabel
This is a fairly large lump of code, but it would allow moving a fair amount of the logic currently in c++ in unqualifiers into the language itself.
It hasn't been tested thoroughly, and needs tidying, but I wanted to get your view on if this is worth pursuing.
And we need some cacheing in here, that recordHeadLabel example above takes a noticeable amount of time to run each time.