Closures are bad? #33
From my perspective, your first paragraph lacks clarity. I am not inside your head. I need you to explain in sufficient detail so as to convey holistically what you are thinking. From my perspective, a closure includes a reference to its bound variables (aka environment) and thus the bound variables can only be garbage collected or freed with reference counting. I do not understand what you propose otherwise. You provided insufficient explanation (examples) of how “object properties” would improve this. I am thinking that only by keeping allocations ordered on the stack can we free them deterministically. The callbacks in a …
The problem is the creation of garbage, consider:
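Something along these lines (a sketch; the original snippet is elided above, and `doit` is a stand-in name for whatever API receives the callback):

```typescript
declare function doit(callback: () => void): void  // hypothetical stand-in API

function frame(x: number, y: number) {
  // every call of frame evaluates the arrow function expression,
  // allocating a fresh closure (capturing x and y) on the heap
  doit(() => console.log(x, y))
}
```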
Every time doit is called a new closure is created, storing the captured variables on the heap. If this happens in a game or animation, or on any frequent event like a mouse-event, this creates new garbage which eventually triggers a collection that pauses the interactivity. So the less garbage we create, the less "jank" there will be in the UI. Most times the above can be replaced with a new method (TypeScript):
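A sketch of that refactoring (names like `Renderer` are illustrative): the captured variables become object properties, and the callback becomes a member created once.

```typescript
declare function doit(callback: () => void): void  // hypothetical stand-in API

class Renderer {
  private x = 0
  private y = 0
  // allocated once with the object, not once per event
  private readonly callback = () => console.log(this.x, this.y)

  frame(x: number, y: number) {
    this.x = x
    this.y = y
    doit(this.callback)
  }
}
```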
and called like this:
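(Continuing the same sketch:)

```typescript
const r = new Renderer()
r.frame(10, 20) // reuses the same callback instance; no new heap allocation
r.frame(11, 21)
```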
This does not heap allocate on the callback, and so generates less garbage. To be clear we are talking about the rate of garbage generation, and the pauses caused by the GC dealing with the garbage.
JavaScript should be reusing (and not creating new) instances of the (at least copy-by-reference) variables in the closure environment on each closure when calling …. Thus afaics, you do not need object properties to solve the problem. You need to create one instance of the closure (and callback) and reuse it when you call …. And note that making the variables object properties does not guarantee they are unboxed in JavaScript. So I think your point is we should discourage closures in inline functions? Or discourage closures in functions defined in loops?
My point above is that closures make it too easy to do this.
Does it have anything to do with closures? Even if the variables are local, a new instance of the callback function is going to be created on the heap each time …
It's creating the closures on the heap that is creating the garbage.
I have stated that even without closures:
Have you refuted that claim? How can we say this nicely, so as to encourage both of us to put effort into making these threads have the least amount of unnecessary back-and-forth, and into responding cogently to what each other writes? Otherwise the communication is not only arduous to participate in, but even more arduous to re-read, or for an outside observer to read for the first time. When trying to go back and re-read these threads, it becomes very clear that we could do better. I share some of the blame. I remember you wrote once that you do not consider these threads formal exposition and thus see no need to put too much effort into them. But I find that careless discussion is lost production. When I re-read some of our threads, we lost a lot of time due to confusing communication. You have important ideas to share. So it behooves us to communicate well. Edit: someone else noticed this. Also mentioned here.

Edit#2: I think perhaps closures play some role in that they encourage the programmer to inline a function to capture the external environment without needing to pass it as an argument down the call stack. And thus this encourages creating a new function instance every time instead of creating one function instance and calling it:

```js
var thing
list.forEach(() => thing++)
for(var o in list) doit(() => thing++)
```

It would be much more efficient to write:

```js
var thing,
    f = () => thing++
list.forEach(() => thing++) // this one only creates one instance of the inline function
for(var o in list) doit(f)
```

It is impossible in general to prevent function definitions and/or closures in loops, because the loop can be external to the current block visible to the compiler, although localized cases could be lint warnings or disallowed by the compiler.
@shelby3 wrote:
It is probably true in unoptimized cases (i.e. in general) that when the referenced variable would otherwise be stored on the stack (i.e. copy-by-value), referencing it in a closure will force it to be placed on the heap instead. However, these copy-by-value variables could be unboxed and grouped together in one data structure, perhaps even with the body of the function itself in memory. I suspect that JavaScript only uses copy-by-value semantics on some primitive types such as …. For those that would otherwise be placed on the stack, it might be wise to lint warn or disallow these in closures. The programmer could still explicitly work around it by creating a …
Object properties may or may not improve upon the above problem, depending on whether JavaScript always unboxes, in object properties, those primitive types that are copy-by-value on the stack (so they become unified with the one object data structure instead of separate heap allocations). Yet even then a heap object will still be created for the object. And JavaScript may already be doing essentially the same optimization as explained above. My claim and presumption is that most of the improvement you were presumably getting with your suggested fix was due to eliminating the amplification issue discussed below.
But in any case, it is the multiple instances of the function (especially inside a loop) that amplify the problem the most, because regardless of copy-by-value or copy-by-reference, multiple instances of the references to the heap instances have to be created in multiple instances of the closure environment data structure. I apologize if that all was not clear from what I wrote before as quoted above.
I am reporting a real problem in a real application, so there is no question it is a real problem. I have had to refactor the code to get acceptable performance. Perhaps I have not sufficiently explained, but creating the closure creates an object on the heap; referring to member variables and methods of the object hosting the callback does not (because we create no new objects on the heap). Note: I am not creating a new object for each callback; it's not necessary because in JavaScript only one event handler can be called at a time, since it is single threaded. Copy-by-value or copy-by-reference makes little difference, because the closure itself requires a heap object, even if it captures no variables. It is eliminating the closure that results in the performance improvement, due to reduced pressure on the garbage collector. To replace the closure (a dynamic function that needs heap space) with a static function (that has no heap space), any captured variables need to be stored somewhere else, and in JavaScript they might be global variables, or member variables of some component.
My thought is to not allow variable capture by a nested function, so for the inner function, variables declared in the outer scope are not "in-scope". So a function can only access local variables, function arguments, and, for methods, properties of the object. If you read this paper: https://cseweb.ucsd.edu/~goguen/pps/utyop.ps it makes a good argument for not allowing first-class functions at all, and instead provides modules with sorts and subsorts. Apparently there is some conflict trying to provide higher order functions and sub-sorts in the same system, and the author of this paper believes that sub-sorts are more useful for programming than higher order functions. I am not saying I agree, but I need to think about it.
Instancing a function also creates an object on the heap. Unless I see your code, I can not trust that you know what is causing your problem, or that you have analyzed it correctly.
I am sorry, but it seems like you have entirely misunderstood what I wrote. The object hosting the callback is on the heap. So you instanced the callback only once (and it is on the heap) and that explains the reduction in GC load.
This is the first time you have mentioned an event handler. You leave out so many details, then expect me to trust some nebulous bug report. That is not the correct way to report a bug. You must provide a reproducible code sample that can be independently analyzed. I can not unequivocally analyze a nebulous description without the code. How in the heck could the GC load be impacted at all if you are not instancing the callback for the event handler many times? Meaning that it was instancing the callback only once that was likely responsible for the reduction in GC load, and very little if nothing to do with closures.
That is analogous to killing the dog to kill the fleas. There are valid use cases of nested functions with closures, e.g. partial application.
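For example, a minimal sketch of partial application via a closure:

```typescript
// the returned function closes over `a`; removing closures entirely
// would rule this idiom out
const add = (a: number) => (b: number) => a + b
const add5 = add(5)   // one closure, created once and reusable
console.log(add5(3))  // prints 8
```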
I found this: https://futhark-lang.org/blog/2017-01-25-futhark-module-system.html#parametric-modules

Module composition is all done at compile-time; thus it is a total order and can not be dynamic open modularity.
Back near the beginning I said:
This says that the function is called from an animation loop or event handler.
It would be much more efficient to write:
Yes you seem to have understood the problem, except instead of a loop we are talking about a callback or an event handler. Consider:
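Presumably something like this (a sketch; the original code is elided, and `update` is an illustrative name):

```typescript
declare function update(x: number, y: number): void  // hypothetical

function onMouseMove(e: MouseEvent) {
  const x = e.clientX, y = e.clientY
  // a fresh closure is allocated for every single mouse event
  requestAnimationFrame(() => update(x, y))
}
```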
Now consider that this is called as part of the 'mousemove' event handler; it will create hundreds of new closures per second. Instead we want to write:
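(Again a sketch under the same assumptions:)

```typescript
declare function update(x: number, y: number): void  // hypothetical

let px = 0, py = 0
// one function instance, defined once
function frame() { update(px, py) }

function onMouseMove(e: MouseEvent) {
  px = e.clientX
  py = e.clientY
  requestAnimationFrame(frame)  // no new function object per event
}
```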
This form creates much less garbage.
And precisely because you were creating a new instance of the function each time. It had nothing to do with the closure. Yes I entirely understood the problem. But for some unknown reason, you apparently do not read (carefully) what I write.
I don't understand why you are disagreeing with me; the first version clearly creates a closure on the heap every time the event handler is called. It is the accumulation of these closures on the heap that is causing the problem. I know this is the case, because when I stop the closure being created every time the event handler is called, the problem goes away.

I think your point is that you can have a closure elsewhere (outside of the loop as you put it), so it is not the closure that is the problem, but the placement of the closure. But that misses my point that having closures makes it easy to code in this way. Remove closures from the language and now the programmer is forced to do it a different way which does not create so much garbage. So closures are an enabler of an anti-pattern; they don't force you to use the anti-pattern, and there is also a good way to use them.

So the whole point of my comment was to consider the merit of choosing language features that enable good coding only, rather than having features that enable both good and bad coding styles. So given the choice between a language feature that can be used in a good way and a bad way, and another that can only be used in a good way, we should choose the latter, and not enable people to write bad code.
My point is that your example is creating a new instance of the function on each event callback, which is causing an object or data structure (storing the function and optionally its closure environment) to be allocated on the heap, thus placing more workload on the GC. I had explained in great detail why the addition of a closure environment is not the significant additional heap allocation overhead. The significance of the heap allocation is caused by the creation of multiple instances of the inline function (as the callback) in your example. I hope that is clear by now, as I do not know how else to write that in plain English. I feel this is about the 3rd or 4th time I have written that in this thread. Do I have some error in my conceptualization which is causing your (unless I am mistaken, what appears to me to be) cognitive dissonance?

I do not wish to be derogatory in any way. I appreciate your willingness to interact with me. It is a bit frustrating that we talk past each other so often, but I guess that it is good that our brains do not think exactly the same, so that we can catch each others’ errors and blind spots. I am under extreme time pressure right now, so it is impacting my patience for repeating myself, but as @skaller wrote in the Typeclass objects thread, humans need redundancy of communication. @keean wrote:
Referring presumably to this example:
Depending on what is inside the body of that inline function ….

Btw, I thought about the nature of this bug and I realized a generalized improvement which can help prevent the bad programming without removing the generality we need for good programming. The bad cases, where we create many instances of a function needlessly, can only occur for instances that would not otherwise be automatically deallocated when the stack frame is unwound, given the RAII model you proposed and I agreed with (see also and downthread). In other words, if our language distinguished instances that can not be deterministically deallocated when the function call (and/or block-level scoping) stack unwinds, these would be the allocations which are candidates for unnecessarily overloading the GC.

In fact, the candidates for creating the problem are not only function constructions, but also include any object creations that can not be deallocated when the stack frame is. The language could force these allocations to have a special annotation and/or the IDE could syntax color them conspicuously, so the programmer is aware these must be studied more closely. If we feel inline functions are the most frequent candidates, we could make those more conspicuous. But again, I do not think the problem is specific to closures, or even to inline functions without closures, but in general to allocating things on the heap too frequently (which could not be deterministically deallocated with the unwinding of the stack frame, as opposed to heap allocations which could be). GC makes construction on the heap so trivially easy and lacking in boilerplate that we forget the impact of these, which was not the case when we had to manually …
You are right, it is the rate of object creation that is the problem. However here a 'functional' idiom is responsible for the high rate of object creation (closures are implicit objects). The inline function code takes no space; the code is only there once in the source, and once in the machine code. A closure is a pointer to the function code and a pointer to the environment containing the state. A closure object is created whether the environment is empty or not, so the environment pointer can be null. For every call to the inline function a new closure object is created, pointing to the same machine code for the function. So the garbage created consists of closure objects.
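Conceptually, the representation being described is something like this (a sketch, not any particular engine's actual layout):

```typescript
// each evaluation of a function expression allocates one of these
interface Closure {
  code: Function       // shared pointer to the single compiled function body
  env: object | null   // captured environment; null when nothing is captured
}
```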
Actually you are correct that if there is no closure environment, then afaics in theory there is no reason to create a new instance of the function code on the heap for each call of your event handler. I guess I was just assuming JavaScript is dumb and creates a new instance of a function every time the code is run where a function is defined (because tests apparently show it is doing that even without any closure environment), but really it should not, at least if there are no closure environment variables. And even if there is a closure environment, if all the variables in that environment are pass-by-reference (as opposed to pass-by-value), then all the instances (all of the calls to your event handler) can in theory share the same closure environment. So in theory it should only be the closure on pass-by-value variables (assuming closures are not supposed to share the same instances of these variables?) that would cause the excessive heap allocations. And you are correct that is an implicit and insidious special case problem that the programmer would not normally be aware of by looking at the code.

So if what I have written is correct, then it would probably behoove our compiler to warn about, or require a conspicuous annotation for, closures on pass-by-value variables. But are closures really supposed to have unique instances of the pass-by-value variables in the closure environment? I do not think so! Seems maybe you have discovered some bad semantics in JavaScript’s closure specification?

P.S. re-read that last paragraph of my prior post.

Edit: I am perplexed why you are even seeing the problem, because according to MDN, there should only be one closure per lexical environment, and they even provide an example of how to force more closures with a function factory.

Edit#2: ah, I figured it out. A new closure must be created for each function invocation because the lexical environment can be in a different function call. So loops within a function should not cause this problem, because the closure on the lexical environment is in the same function call, thus they can all share the same closure. Did you have a closure on any variables in your function, or is JavaScript creating an empty environment closure anyway? If the latter, then I am perplexed why it is doing that, and apparently it is. Perhaps the reason JavaScript does that is so it only needs one data format for functions, in that they always take a reference to their closure environment (even if it is empty), and perhaps this improves efficiency somehow.

Edit#3: And then I found this. Ah maybe the answer is here:
Ah, so does JavaScript do no static analysis on which variables in the lexical scope are statically referenced? And can you dynamically access the variables in the closure scope using …?

Edit#4: even though all function local variables have to be on the heap, they could still be unboxed in the closure environment object.
Okay so JavaScript’s lack of static checks and its liberal runtime dynamism with ….

If we have static analysis with our proposed language compiler, then we can fix that problem automatically by defining such functions in an external (global) scope when transpiling to JavaScript, so the programmer does not have to worry about unwittingly creating this bug. But it also means in our language any concept analogous to ….

Yet the problem remains for closures (i.e. that are not empty as statically compiled): they are allocated on the heap and the GC is tasked with freeing them, which is a burden. In my mind, GC should be more for long-lived objects and not for function local variables, which is why I like the RAII concept and deterministically deallocated function (and/or block) local variables. Even though a generational GC helps, it still is (afaik) inferior in efficiency to deterministic stack frame deallocation. My recently proposed “stored (transitive)” annotation will help alleviate this issue. This is finer grained control than pure functions. When a function argument’s reference to a closured callback is not annotated as stored, then the caller knows no references are held on the callback outside the called function, so the closure can be deallocated deterministically in the function (or block) stack frame scope of the caller. It will not help in a naive transpilation to JavaScript, but the source code would be ready for future improvements in compilation. We could even eventually use a stack, perhaps even within JavaScript as Emscripten does employing ….

Perhaps closure cases that are not resolved by the above improvements need to be conspicuously annotated (to remove a compiler warning), so the programmer is aware of the danger of the inefficiency and the potential to thrash the GC. Closures are not all bad, but they apparently have extra pitfalls in JavaScript. Kudos to @keean for raising an important issue.
@shelby3 wrote:
…
I am contemplating that it is key to note that passing as an argument (to …).

This is important because I presume that optimized JavaScript engines such as V8 are normally proficient at rapid generational collection of those lexical scope ….

So really the problem you found is related to the nature of ….

It is very important to know if my presumptions in this post are correct, because it impacts the design of the language we create if we are targeting JavaScript. Specifically, I am contemplating a ….

Thus I am presuming that closures are not as bad in JavaScript as you were thinking, and JavaScript is not as derelict and inefficient as I was thinking. Instead it seems the only issue is storing a reference (to an object created in a frequently called function or loop) for longer than the current function, thus bypassing the generational GC scheme. Feedback?
Yes, JavaScript does okay when objects get created and destroyed quickly. However the problem is not specific to setTimeout, and simple assignments of closures to member variables can cause the same effect. It is always better to create a named function like this:
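(A sketch; `model` and `el` are illustrative names:)

```typescript
const model = { x: 0 }         // illustrative state
declare const el: HTMLElement  // illustrative element

function handleMove(e: MouseEvent) { model.x = e.clientX }
// assigning the same named function allocates nothing new
el.onmousemove = handleMove
```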
Than use an anonymous one:
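(Same sketch, same assumed names:)

```typescript
// a fresh closure object is created every time this statement executes
el.onmousemove = (e: MouseEvent) => { model.x = e.clientX }
```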
But it's not as bad as with setTimeout, which has the added problem of creating lots of timer objects as well as function objects.
But again that problem is not specific to closures, but rather that creating new objects in a loop (or frequently, such as in a mouse event callback) and storing a reference to them in a longer-lived instance will thus bypass the generational GC. I stumbled onto this resource and do not feel well enough today to read it all and assimilate it, so I do not know if it is a good read: https://github.com/getify/You-Dont-Know-JS/tree/master/scope%20%26%20closures

EDIT: “one-shot” listeners might alleviate the need for event callbacks.
I wrote:
Note this may not be true for the case of objects deterministically allocated on the heap and/or non-deterministic reference counting. A generational GC may be able to group (and thus amortize some costs for) deallocation into larger economies-of-scale. For example, a callback function that gets called in a loop would do a separate deallocation on every invocation, rather than grouped in a periodic generational GC. However, deterministic deallocation of objects stored on the stack would likely be more performant, given nothing is done to deallocate them other than adjusting the stack pointer, which is done anyway on return from each function.
Closure Typing For Imperative Languages

Closures over (even immutable, except in some cases) state cause a complication for the type system, so as to avoid breaking type safety. ML settled on the value restriction, which restricts parametric polymorphism even in cases which would be type safe. Taking the perspective that the typing problem occurs from closure over state (although not all closures-over-state cause the problem, as noted below for covariance and OCaml’s relaxation for immutability), Closure Typing, which can type all Hindley-Milner programs, deals with state in closures. IOW, for example the partial application of any input of a curried function¹ creates a closure over parametrised state, causing that type parameter to be concrete from that point forward in the evaluation.

Closure Typing was dropped when the ML functors were added for modularity, because it was deemed to not only overly complicate the typing system implementation and the modular interaction of type signatures in modules, but it also could necessitate sharing typing information between modules that might need to be encapsulated (encapsulation being one of the necessary features of modularity).

Annotating whether the type parameter may only be allowed in covariant assignments, and then combining with the value restriction, reduces/relaxes the restrictions compared to the value restriction alone. OCaml has covariant annotations in the context of a default value restriction² to overcome some restrictions, for example when using abstract types.

Note eta expansion, as a means of ameliorating the restrictions of the value restriction, causes redundant evaluation of potentially imperative code, which can potentially cause unintended side-effects.

Being pure functional, Haskell avoids this typing and restriction problem with closures, but has the dual of the analogous problems, named the monomorphism restriction, because it can still model imperativity (otherwise it wouldn’t be a useful PL). And: https://jozefg.bitbucket.io/posts/2015-03-27-unsafe.html

EDIT: Polymorphism by name for references and continuations resolves the issue (and there’s even a call-by-type idea to retain some of the efficiency without the type unsafety).

¹ …

² Thus OCaml can’t generalize a partially applied (curried) function, because it doesn’t have Closure Typing or Weak Typing as a default.
For a while I have considered not allowing closures at all, but using function objects. This also emphasises the difference between values and objects (values are immutable and must be copied; objects can be mutable and can be referenced). These things all influence each other, so it is difficult to talk about one in isolation. We can lessen the boilerplate by deriving some methods automatically, like a default constructor. Consider a counter; in JS you can write:

```js
function Counter(start) {
  var count = start
  return {
    inc: () => { count++ },
    get: () => count
  }
}
```

We can write this as a function object, or a module like this:
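(The snippet here is elided; a TypeScript approximation of the function-object form might be:)

```typescript
class Counter {
  private count: number
  // all state is passed to the constructor; nothing is captured
  // from the ambient scope
  constructor(start: number) { this.count = start }
  inc(): void { this.count++ }
  get(): number { return this.count }
}
```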
The main difference is the function object / module does not capture any ambient scoped variables; you have to pass anything you want to use to the constructor. This means the state and size of the object is clearly visible to the programmer. Also, when you invoke one, it is clear what is being passed in. Capture of scoped variables is anti-modular.
See the article on Midori you linked to: ambient authority is considered bad, and is equivalent to capturing outer scope variables. In Midori objects are capabilities, and mutable static object properties are not allowed. An example of ambient authority would be a "user-id" defined in an outer scope that is referenced by closures defined in that scope.
Now this discussion of modularity is becoming very relevant and interesting to the studying I’ve been doing over the past days, but I really wanted to discuss a module design in the thread where we were originally discussing ML modules/functors (will try to move the discussion there eventually).
In some respects that is what modules do when they export an interface then internally encapsulate a non-exported lexical environment. Closures are modules that export a function interface. One issue is that closures are not applicative, at least w.r.t. memory allocation side-effects. They are also not applicative w.r.t. the captured lexical environment if any of the environment is mutable (even if not mutated by the closure). An applicative closure is not prevented from being used in other contexts. Generative modules also have this side-effects dependencies restriction.
Agreed but there’s a tradeoff.
In essence I presume what @rossberg’s 1ML accomplishes is the unification of module functors with function closures. Yet this requires predicativity, which thus forces a restriction due to the requirement to wrap the impredicative module recursion. Whereas, his MixML unifies signatures and (even partial) implementations (aka structures), thus allowing recursion; and the more complex MixML type system apparently allows HKT on implementations instead of only via functors. I’m leaning towards MixML as the better tradeoff, but I require more study, and @rossberg seems to prefer the more compositional 1ML. I expect to eventually figure out that wrapping is as limiting as the non-unification of functors with functions, but I first require to more fully/deeply understand these comparative restrictions and the type systems they require/allow (and these details are very intricate and somewhat beyond the level of effort I have available to invest right now).
I agree with the implication that lexical closures should never be exported outside of a module, because they require human understanding of the aforementioned contextual issues of the lexical environment. They increase encapsulation and composition, and reduce boilerplate locally, when employed within the lexical environment in which they were created. Maybe we should even restrict the referencing of a closure where the reference is not within the same top-down lexical environment of the closure, i.e. not allow implicitly exporting the closure outside its lexical environment? That would be even more restrictive than not allowing closures to be exported outside the module/package. Seems like we might be making progress on a design for limiting closures to where they are effective and less likely to cause errors.

This btw is afaics another reason that abstract types are modular and type families are not, because type families are injected from outer context and thus limit the degrees-of-freedom on encapsulated, compositional, explicitly dependent joins. The big problem for me (from what I know thus far) with abstract types was that in ML they force HKT via functors, yet MixML seems to remove this restriction with a more complex type system with linear types for tracking the enforced kinds on abstract types (apparently that leaks the kind out of the encapsulation though?). There will always be a tradeoff between the trilogy of encapsulation, composition, and explicit signature dependencies within a given order (level) of the model (perhaps at higher orders of expression, the lower orders can be more complete in the 3 respects, so the tripartite theorem tradeoffs move into the higher order of the model). These “pick 2-of-3 tripartite theorems” seem to arise often in understanding partial orders (and we don’t exist if a single total order does).
There are definitely reasons to prefer modules over type classes. Maybe I am starting to prefer the idea of modules + implicits. Type families provide a kind of associated types for type-classes. The problem is modular associated types require dependent types. Consider a heterogeneous list containing modules with an associated type. If we create a value of the associated type, this type will depend on the module value. Now the p…
Now the problem is we need to emit different code for floats and ints (if we want performance there are different registers and op-codes), but we don't know what type there is until runtime. I think unions can help here, because if we know the union of all modules that this program can pass, then we can monomorphise two versions of the function, and we choose which one based on the module. This would still need to be dynamic/runtime polymorphism. So this leads back to what I was saying about type propagation: we want to propagate concrete types and unions forward, and type-classes and constraints backwards.
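A rough TypeScript illustration of the idea (a sketch only; a real compiler would emit genuinely different machine code for each specialised body):

```typescript
// two monomorphised bodies of the same function
function sumInt(xs: number[]): number {
  return xs.reduce((a, b) => (a + b) | 0, 0)  // would use integer op-codes
}
function sumFloat(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0)        // would use float op-codes
}

// the module value carries a tag; dispatch happens at runtime
type NumMod = { kind: 'int' } | { kind: 'float' }
function sum(m: NumMod, xs: number[]): number {
  return m.kind === 'int' ? sumInt(xs) : sumFloat(xs)
}
```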
Doesn’t this entirely resolve the selection of specific functors implementing the interfaces at compile-time, except for any higher-rank polymorphism (polymorphic recursion)? The compiler propagates the quantified types down the function call hierarchy and the interface constraints backwards up the hierarchy.
@keean wrote:
Are modular type classes an option?
@sighoya this is exactly what I was referring to when I said modules+implicits. The problem is that keeping things functional requires applicative modules, and that can lead to problems with associated types. I think you need some kind of dependent types to do this. I would be okay with this if I can find a dependent type system that maintains type erasure.
Modular type classes support associated types inside (without dependent types), but I assume you mean something different.
Modules/functors can be generative or applicative; an applicative module/functor is like a function, and the associated types from an application are always the same. For a generative module/functor, each 'new' instantiation generates fresh unique types.
@sighoya not soundly. A module signature is a type, a module itself is a value. If you pass a module to a function and the type of the function depends on the module, then you have a dependent type.
@keean wrote:
Can you clarify why monomorphisation can’t be used instead of type erasure? We’re talking typeclass bounds, so the instanced type of the abstract type is known at compile-time at the call site.
@keean wrote:
Yes, this makes sense. A module can be seen as an instance of a struct, since struct instances, like modules, allow field access. And a struct can be seen as a signature. @keean wrote:
Okay, but the function signature can also depend on an integer value instead of a module value. So why not disallow it?
It can be a problem because you cannot always type check the function if you don't know the module value. An integer value is different because it has an integer kind (where a kind is one level up from a type). For example, we have a module signature that has an associated type, say an iterator over a collection where the associated type is the 'value-type' of the collection. If we compile some code that uses this module signature, we do not know at compile time the type of the 'value-type', because that is not known until runtime, when you know the module value that is passed.
@sighoya wrote:
If I’m understanding the issue correctly, the function takes the signature of a module, and the abstract types in the signature have an unknown instance type (because they’re abstract), because module signatures can be implemented by multiple structures, via for example ML functors. Therefore the function is polymorphic, but I’m failing to understand why that prevents type erasure, because we know the struct instance type at the call site, don’t we? That is why I asked @keean about the case where the input is typeclass bounded and can be monomorphised.

There’s an additional complication because of optional sealing: the abstract type can be made encapsulated, so the function is not allowed to know the type. So this prevents us from using that type as an associated type, because if we could, then we would be able to request a typeclass bound that reveals the type.

EDIT: I just saw @keean was answering at the same time I did:
So the issue is a separate complication? I thought you agreed we monomorphise the functions at the call-site, except where polymorphic recursion disallows it. So is it the exceptional cases that can’t be monomorphised that require the dependent-typing? IOW, runtime dispatch, which is not the same as the undesirable usercode-level reflection. What’s the problem with runtime dispatch? I do not understand your objection. Typeclass objects (in Rust; aka existentials in Haskell) are also runtime dispatch. Is it an objection due to desire for maximum performance?

EDIT: I see your concern is about runtime dispatch:
Why are we discussing modules in the Closures are Bad thread? Soon we won’t be able to find discussion based on any taxonomy when we are so disorganized about not moving offtopic discussion to an appropriately titled thread. The search facilities of Github are very rudimentary. And Google search won’t take you directly to the post in the middle of a long issues thread (and even Ctrl+F won’t, because Github has a “Load more…” when an issues thread is long).
We can turn the associated type into a type parameter, but this is a problem because a type parameter is an input to a module, and an associated type is an output. By making it an input we lose the encapsulation (the module may represent an array of integers, for example; this is a particular issue with iterators that are constructed by collections).
Okay, let me check if my understanding is correct. So you’re saying that if the value (aka element) type of a container (aka collection) is implemented with an abstract type of a module, then an associated typeclass, such as for an iterator which needs to also know that value type, will refer to it as input from the module signature for the container. And your point is this breaks encapsulation because the module is outputting the type (as an input to the iterator’s module signature). Your point is that since the type family for an associated type is bounded, then the type can not be “input” by any typeclass not in the bounded family? But the iterator typeclass bound is also another module signature, so the reference from the iterator module signature to the abstract type in the container module signature retains the encapsulation.

Afaics, even without it being an input we can break encapsulation as follows though? I am positing that we can request a typeclass bounds (not on the associated type but…), and that typeclass’s interface can have a method which reveals something about the associated type (exposing a factory or something), assuming we implement associated types with abstract types instead of as bounded type families. IOW, typeclasses break encapsulation of associated types when implemented with abstract types (which they don’t do in Haskell because confined within the type family)? For example, let’s say you want to know if the encapsulated associated type (implemented as an abstract type) possesses certain properties; the function could require a typeclass bound which uses that associated type in a way such that it exposes properties on that associated type. If it compiles, then the associated type has those properties. Whereas, with a bounded type family as in Haskell, there’s control over which typeclasses can associate the type. Yet I should note this capability is inherent in the ML module system. Seems we can violate encapsulation via the backdoor of having instances cooperate with exposing properties via matching of instance selection. IOW, we lift the revealing to a metalanguage convention above the base language, which is essentially what typeclasses are, as they can be implemented on top of modules.

The generative essence implication would be that unbounded (open) modularity is incompatible with encapsulation. I realize that what I’m describing isn’t revealing the type as known to the base language type system; it is a metalanguage on top which reveals properties. But isn’t that enough to break some security invariants? Hmm. And @rossberg (co-creator of 1ML) explained that cordoned type families are not open modularity. I had posted this link upthread about typeclasses being anti-modular: http://blog.ezyang.com/2014/09/open-type-families-are-not-modular/
As far as I know, associated types strongly correlate with existential types in Haskell.
They’re required for implementing multi-parameter typeclasses. See also: https://people.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf#page=4 (page 4). And see also §5 Related Work on pg. 12, where they talk again about multiparameter typeclasses. So if implementing typeclasses on top of modules, we’d still need the abstract types in modules.
May I suggest? Could we please kindly move the modularity discussion to the new Modularity issues thread:
I’m going to try to incorporate the following in an ongoing discussion about effects systems, and it’s also related to the need for the value restriction (find a link to it in the above linked discussion). Erik Meijer wrote:
A summary of the main issues with closures:
Referring to my prior post above, noting a potentially important use case for non-local closures: perhaps we will want to allow implicit local closures, while non-local closures must be explicitly created. @keean raised this issue in a different thread.
Regarding the escaping problem, is this a viable option?:
I have been writing some application code in JavaScript and TypeScript recently, and one of the things that I have started to realise is that using closures results in lots of garbage for the collector to clean up. It is becoming apparent to me that the convenience of closures costs you in performance. The easy solution is to refactor the captured variables as object properties, and the closure as an object property. A lot of the time we do not need the 'queued state' that is provided by closures, and when we do, creating a new object to encapsulate that state is not a big problem, especially when you have anonymous objects. This reminds me that 'runnable' objects in Java or 'function objects' in C++ serve the same purpose as closures.
I guess my thoughts are: should the syntax make closures harder to use (by forcing you to create a function object), which then makes the programming effort align with the execution efficiency? To put this another way, closures make it easy for developers to write inefficient code; by removing them from the language syntax, developers would fall back to creating object properties and methods for the callbacks, which is more efficient, and would only create a new object to hold the values where it was really necessary. This approach could be described as "make the easy way the correct/efficient way". I don't know if this is something I really want to do yet; it's here more as an observation, and an idea about how syntax can support the programmer doing the right thing.