[reconstructed] WD-40 (for reducing Rust with the next mainstream language) #35 #50
Comments
Original Author: @shelby3
We need optional GC and reference counting. The latter excels at deterministic, prompt finalization (which is required for non-main memory finalization) but has the trade-offs quoted below:
Has it ever been tried to combine reference counting with GC? Apparently so: Edaqa Mortoray wrote:
Given our single-threaded model, we do not need atomics on the reference counting and given the proposal of the OP to emphasize borrowing and RAII then we would not get the benefits of a generational GC scheme. So reference counting would give us determinism except in the case of cyclical references and then GC would kick in to break cycles. However, when we know the cyclical references are probable (e.g. any kind of non-acyclic graph such as the DOM or a doubly-linked list), then it is wasted overhead to employ reference counting, although that overhead could be very tiny (i.e. infrequent tracing) if we only want to catch unexpected cycles (i.e. as an insurance policy and we’d want to log these and consider them bugs). I suppose we could rate-limit reference counting finalization, to prevent the rare aberrant high latency stall. @fauigerzigerk wrote:
@david-given wrote:
@iofj wrote:
For this last quoted issue, I am contemplating whether employing the unused bits (c.f. also) of 64-bit pointers (64-bit also provides a huge virtual memory space, c.f. also) would enable storing an index into a contiguous array whose elements are the reference counts, so the counts would reside in contiguous pages (see the sketch after this post). If the number of available unused bits is insufficient to index all allocated objects, then perhaps each compiled object type could employ a separate array (i.e. a zero-runtime-cost, compile-time array selection), presuming the cache-thrashing issue is due to intra-page-size fragmentation rather than a lack of inter-page contiguity. Also, the array element then need not be created until the second reference. This is applicable when objects are mostly passed around and their data is not accessed as frequently. Weighted reference counting can be employed to halve the accesses to the reference count (since the source pointer has to be accessed anyway), but it requires some unused bits in the pointer to store the weight. David Ireland wrote:
Edit Nov. 18, 2017 Distinguish between non-deterministic reference counting and deterministic RAII. |
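To make the pointer-tagging idea in the post above concrete, here is a minimal sketch, not a settled design: the high bits of a 64-bit pointer hold an index into a contiguous side array of reference counts, so the counts stay in a few hot pages. The 16-bit index width, the 48-bit user-space address assumption, and the single global count array are all illustrative assumptions (the post suggests one array per compiled object type if the bits run out).

```cpp
// Sketch: steal unused high bits of a 64-bit pointer to index a contiguous
// array of reference counts, keeping the counts in contiguous pages instead
// of scattered across every object's cache line. Layout is an assumption.
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int      kIndexBits = 16;                      // assumed: x86-64 style 48-bit user-space addresses
constexpr uint64_t kAddrMask  = (uint64_t{1} << 48) - 1;

struct CountedPtr {
    uint64_t bits;                                       // [63:48] = count index, [47:0] = address

    static CountedPtr make(void* p, uint32_t index) {
        assert(index < (1u << kIndexBits));
        return { (uint64_t(index) << 48) | (reinterpret_cast<uint64_t>(p) & kAddrMask) };
    }
    void*    addr()  const { return reinterpret_cast<void*>(bits & kAddrMask); }  // assumes non-sign-extended addresses
    uint32_t index() const { return uint32_t(bits >> 48); }
};

// One contiguous count array; growing it when an index is handed out is omitted here.
std::vector<uint32_t> g_counts;

void retain(CountedPtr p)  { ++g_counts[p.index()]; }
bool release(CountedPtr p) { return --g_counts[p.index()] == 0; }  // true => caller finalizes addr()
```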
Original Author: @shelby3 @anonymous wrote in private:
I need progress on code, not language design, but I am trying to have a paradigm in mind when structuring my code in TypeScript + C, so that the transition will be easy if we ever complete a new PL, and also to drive my insights on that potential language. Concretely, I think it means I will write wrapper classes with getters and setters in TypeScript that take an ArrayBuffer for data and do my |
Original Author: @shelby3 I wrote:
Re-reading the prior linked comments… First realization is that reference counting is non-deterministic and is inferior to GC in every facet (for the applicable use cases of non-deterministic resource management and thus the inherent lack of deterministic finalization) except for interop with non-GC objects as explained below (but potentially at the cost of loss of compaction and loss of automatic cyclical-references memory-leak detection).

Afaics, a useful facility to add in addition to GC is the zero-cost1 abstraction of deterministic RAII block scope allocated/deallocated on the stack, because it (probably improves performance and) adds determinism of finalization (aka destructors) and interacts with the optimization of low-level performance as explained below. To achieve such zero-cost resource management abstractions requires typing/tracking the lifetimes of borrows of references so the compiler can prove a resource has no live borrow when it goes out of block scope. I had proposed the compile-time enforced “is stored” annotation on function inputs when an input will be (even transitively) borrowed beyond the lifetime of the function application. Tracking borrows for the purpose of preventing concurrent race-condition access (a la Rust’s total ordering which assumes multithreaded preemption everywhere, or my proposed partial ordering in conjunction with the single-threaded event model) is a separate concern, which afair we discussed in depth in the Concurrency thread. This granular typing of the context of side-effects seems more straightforward than some grand concept of typing everything as a (cordoned context of imperative) side-effect??2

One of the key advantages/requirements of GC is the ability to move objects so that memory can be periodically compacted (with the compaction being amortized over many implicit/delayed deallocation events which would otherwise be explicit/immediate/non-deterministic-latency with reference counting), which enables allocation to be as low-cost as incrementing a pointer to the end of allocated objects in a contiguous memory address space.

As an aside, for highly optimized low-level code to run at maximum performance, it must be possible to freeze the position of objects in the virtual memory address space, else the objects must be copied. The former has a concurrency live-lock and dead-lock issue w.r.t. the said compaction.

For RAII-managed data structures that contain pointers to data (i.e. unboxed) which is not RAII-managed at the current or containing block scope (i.e. for GC pointers, since pointers to data RAII-managed in a contained block scope would exhibit use-after-free errors), these pointers have to be mutable by the compaction engine and have to be recorded in the list of references which the GC must trace. To have full interoperability between GC instances and RAII (stack-allocated data structure) instances without copying data incurs either a performance penalty of reloading pointers on each access (in case they’ve been moved by the compactor) or else the aforementioned concurrency live-lock and dead-lock issue w.r.t. the said compaction (which is an onerous, undesirable can of worms). Thus, I’m leaning towards stack-allocated data structures not interoperating with GC instances except via copying, because (other than deterministic finalization) performance is their primary raison d'être. And interop with GC would (as GC does) also encourage boxed data structures, which are known to thrash caches and degrade performance.
Additionally, I think it should be allowed to have GC instances which are unboxed data structures (e.g. transpiled to

However, is memory compaction even necessary on 64-bit virtual address spaces, wherein the hardware can map the virtual address pages to compacted, available physical pages? Well I guess yes, unless most objects are much larger than the ~4KB page size. However, afaics perhaps the compaction issue could be significantly mitigated by allocating same-size objects from a contiguous virtual address space significantly larger than ~4KB in a reference counting algorithm. Thus allocations would first be applied to deallocated slots in that address space, if any, per the standard reference counting allocation and deallocation algorithm, which employs a linked-list of LIFO free slots.3 (See the sketch after this post.)

If compaction is required, then I contemplate whether to allow unsafe low-level code which manually employs

1 Zero runtime cost because, due to the compile-time determinism, we know at compile-time when the deallocation will occur.

2 How does Oleg’s insight conflict with @keean’s prior stance (in our discussions) against AST macros? I vaguely remember that he was against macros because they obfuscated debugging and the type system. Is this the paper? http://okmij.org/ftp/Computation/having-effect.html#HOPE I’m unable to readily grok Oleg’s work because he communicates in type theory and Haskell (in neither of which am I highly fluent). Afaics he explains with details/examples instead of conceptually and generatively from ground-level concepts. And I do not have enough time to apply arduous effort. So he limits his audience to those who are fluent in every facet of the prior art he builds on and/or who can justify the time and effort. Essentially all side-effects distill to the generative essence of shared state. If state is not shared, then there are no interactions of effects between those processes. Afaics, a Monad controls the threading of access to an orthogonal context across function application. That orthogonal context can be state in the State monad. Correct? That is why Monads can’t generally compose, because they cordon orthogonal contexts. But what is the goal to achieve? If we model higher-level concepts as imperative building blocks with contexts, we can do unlimited paradigms, but not only will our code become impossible to read because of the overhead of the necessary syntax and type system to allow for such generality (which was essentially my complaint against trying to model everything, e.g. even subtyping, as typeclasses as you were proposing before), but we also have an explosion of issues with the interaction of too many paradigms. Afaics, this isn’t headed towards a simple language. There will not be a perfect language. Type theory will never be complete and consistent. Design is driven by goals. Thoughts?

3 Richard Jones. Garbage Collection Algorithms For Automatic Dynamic Memory Management, §2.1 The Reference Counting Algorithm |
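A minimal sketch of the mitigation described in the post above, under illustrative assumptions (a fixed slot size, and a malloc'd region standing in for a large reserved range of 64-bit virtual address space): same-sized objects come from one contiguous region, and deallocated slots are reused via a LIFO free list before any fresh space is bump-allocated, so compaction is not needed.

```cpp
// Sketch: a fixed-size pool with a LIFO free list, as described above.
#include <cstddef>
#include <cstdlib>
#include <new>

class FixedSizePool {
    struct FreeSlot { FreeSlot* next; };
    std::size_t slot_;             // size of every object in this pool
    char*       base_;             // contiguous region (a reserved virtual range in a real design)
    char*       bump_;             // end of allocated objects
    char*       limit_;
    FreeSlot*   free_ = nullptr;   // LIFO list of deallocated slots
public:
    FixedSizePool(std::size_t slot, std::size_t capacity)
        : slot_(slot < sizeof(FreeSlot) ? sizeof(FreeSlot) : slot),
          base_(static_cast<char*>(std::malloc(slot_ * capacity))),
          bump_(base_), limit_(base_ + slot_ * capacity) {}
    ~FixedSizePool() { std::free(base_); }

    void* allocate() {
        if (free_) {                       // first reuse a previously freed slot…
            void* p = free_;
            free_ = free_->next;
            return p;
        }
        if (bump_ + slot_ > limit_) throw std::bad_alloc{};
        void* p = bump_;                   // …otherwise bump-allocate fresh space
        bump_ += slot_;
        return p;
    }
    void deallocate(void* p) {             // push the slot onto the LIFO free list
        auto* slot = static_cast<FreeSlot*>(p);
        slot->next = free_;
        free_ = slot;
    }
};
```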
Original Author: @NodixBlockchain If I may :) For me the main problem with garbage collectors is that they are harder to debug and test. The nightmare is that you test the functionality on a simple program and it works, then you keep adding functions and race conditions, and two months later you realize there is a memory leak or memory corruption somewhere, and you're in for days or weeks of debugging to trace the problem.

With reference counting it's very easy to trace a problem, because you know where the freeing must happen; the point where the last reference to an object is released can be tracked easily, which makes it easy to track problems, whether they are memory corruption due to memory being released prematurely, or memory leaks. With a garbage collector that parses the whole tree every now and then, traversing thousands of pointers allocated over a few minutes of running, it can become much harder to track where the problem is if it frees memory that it shouldn't, or doesn't free memory that it should. Especially with multi-layered scope, closures, and asynchronous code execution, it can become very hard to make sure the GC will catch everything right. And bugs generally appear when the program starts to become more complex, rather than in easy vanilla cases when the program is small and simple.

And if you need a plugin system, i.e. code loaded at runtime that can manipulate references itself, it can also become impossible to track the reference tree deterministically at compile time, and the host program might have no clue at all what the plugin is doing with the references passed to it. It's the same problem with distributed applications.

For me now, when I want to add a new feature, especially for memory management, my first concern is not only 'how to make it work' but also how easy it will be to detect problems and to track their cause. With reference counting it's easy to set a breakpoint when a reference gets freed and have the debugger pause everything and inspect all the context at the moment it happens. With GC the analysis is delayed, and all the execution context that changed the state of the object is lost, so it is very hard to debug.

For me, to make GC programming easy it needs something like the Android SDK, where the whole application is structured at the macro level with activities, intents, and special classes for asynchronous/background execution, making the lifetime of all objects predictable through the structuring of the application, declaration of resources in XML, etc. https://developer.android.com/guide/components/activities/process-lifecycle.html They have a whole lot of classes to build the application on, to make resource management and object lifecycles more deterministic. It imposes a certain structure on all application classes and components so that memory management fits cases the GC can easily deal with. Making a general-purpose GC deal with a complex language with multi-layered scope and asynchronous execution, without any hint of the pattern an object is likely to follow (by inheriting some class, and structuring the application with built-in classes with predictable behavior), seems very hard. Reference counting is much easier to deal with for the general case. |
Original Author: @keean My preference is to manage as much as possible from the stack. There are some simple tricks that can make this work. If all allocation is RAII and you have procedures not functions, then to return some data you simply pass a collection (set/map/array etc.) into the procedure.

Edit: where this gets complicated is that you then end up with two kinds of pointers: "owning pointers", which must form a Directed Acyclic Graph, and non-owning (or reference) pointers (the links between data structures). The simple solution is to make all 'references' weak references, so that you have to null-test them when dereferencing to check whether the resource they point to has gone away. Then we simply delete things when they go out of scope, removing all resources pointed to by owning pointers in the DAG. No GC, no reference counting, and a simple non-zero check on dereference. Of course 'nullable' pointers are not considered a good thing, so we can then look for static techniques where we do not allow references to outlive the object they reference. This is where Rust's lifetimes come from. |
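A small sketch of the scheme described above, using std::shared_ptr/std::weak_ptr as an approximation (the standard weak pointer is itself implemented with counts, whereas the post envisions scope-based deletion with no counting at all): owning pointers form a DAG destroyed at end of scope, every cross-link is a weak reference that must be null-tested on dereference, and results are returned by passing a collection into the procedure. All names are illustrative.

```cpp
#include <iostream>
#include <memory>
#include <vector>

struct Node {
    int value;
    std::weak_ptr<Node> link;               // non-owning cross-link: may go away, so it is weak
    explicit Node(int v) : value(v) {}
};

// A "procedure" rather than a function: the caller passes in the collection
// that receives the results, so no hidden allocation escapes the callee.
void collect_values(const std::vector<std::shared_ptr<Node>>& owners,
                    std::vector<int>& out) {
    for (const auto& n : owners)
        if (auto linked = n->link.lock())    // null-test the weak reference before use
            out.push_back(linked->value);
}

int main() {
    std::vector<std::shared_ptr<Node>> owners;   // owning pointers: a trivial DAG (here, a list)
    owners.push_back(std::make_shared<Node>(1));
    owners.push_back(std::make_shared<Node>(2));
    owners[0]->link = owners[1];                 // weak cross-link between data structures

    std::vector<int> results;
    collect_values(owners, results);
    for (int v : results) std::cout << v << '\n';   // prints 2
}   // owners goes out of scope: everything it owns is deleted; weak links simply go null
```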
Original Author: @shelby3 @NodixBlockchain (aka IadixDev @ BCT) wrote:
I can’t speak for @keean (and this is his project area but this is my thread), but personally I’d prefer “please don’t” (or post once for every 10 posts by others in this thread, so as to keep the S/N ratio of the thread higher), i.e. that you would comment in your thread “Saluations! :)” instead (or at least not in this/my thread), unless you have something new+important+accurate+well-researched to point out which you have not already written in the past. If you quote from this thread and post in your thread, I will probably read it (presuming I have time) and perhaps even reply there. I’m trying to avoid this thread becoming another 300+ posts that I can’t wade through when I want to refresh my own memory of what my (and @keean’s) analyses was/is. @keean and I have been doing backtalking on LinkedIn to reduce clutter in these issues threads. You’re welcome to add me there.
I’m wondering whether that is a superstitious, anecdotal belief (i.e. aliasing error) or an assertion you actually studied, measured, and verified to be the case with in depth effort for comparing reference counting and tracing garbage collection. Because inherently reference counting doesn’t track the directed graph of memory allocation (i.e. all we have are reference counts and no back pointers to the references that comprise those counts); thus, there’s no feasible way to see which data structures have leaked because they’re no longer reachable from the untraced roots of the said graph in a reference counting GC system— for example due to cyclical references. Whereas, in theory a tracing garbage collector could provide such a graph for debugging.
You can’t see which/where are all the references to the shared object that had its reference count decremented, because it’s the implicit nature of non-deterministic memory allocation not to know which pointers would be the candidates, and the reference counting scheme provides no back pointers from the said object to the said sharing pointers (as aforementioned). I agree it wouldn’t normally be possible to set a runtime breakpoint on the change in number of references to a specific object with a tracing GC, because the assignment of pointers doesn’t touch the referenced object (although write-guards for generational GC might get you as close as the virtual page, but that might not be helpful). Thus, there’s no way to know which assignment to set a breakpoint on for testing the address of the said shared object. Yet comparably, the more optimized variants of reference counting also have the same debugging problem because they don’t even update the said common object (as explained below), since they keep the reference count in the pointers (and additionally there is Weighted Reference Counting). In both cases, a sophisticated debugger and language could in theory be designed to set such conditional breakpoints on all pointer assignments (conditionally breaking only if the said shared object address is encountered), if such a facility is helpful for debugging memory leaks. And the tracing GC would have more information about the memory allocation graph.
It seems you’re presuming that reference counting models deterministic memory allocation, but that is not generally the case and (per my reply to @keean below), if that were generally the case then in theory it could be replaced by zero-runtime-cost compile-time deterministic allocation mgmt (albeit with tsuris of a complex compile-time typing system). Afaics, you continue to repeat this belief system of yours (presumably confirmation biased by your presumed preference for your low-level framework you discussed in your thread “Saluations! :)” and per my comments to @keean about C/C++ below) without refuting the points that have been made to you numerous times previously. I’ll recap and augment… (and hoping you’ll keep the not-so-thoroughly-contemplated-studied noise to a minimum in this/my thread please).
I don’t see how the process lifecycle has anything to do with the generalized problem that in general memory allocation is non-deterministic.
Programming paradigms for avoiding memory leaks are useful regardless of the memory allocation scheme chosen. I previously mentioned the following disadvantages of reference counting to you (either in the Concurrency thread and/or the other thread you created). And note the first one below is a form of memory leak inherent in reference counting.

Reference counting (aka ARC, which technically is a form of “direct”1 garbage collection) can’t garbage collect cyclical references. And according to an expert book I’m reading, there’s no known means of weak pointers or other paradigm which can reliably solve the problem of cyclical references in all scenarios.2 This makes sense because reference counting (RC) doesn’t contemplate the entire directed graph of objects holistically. Thus only “indirect, tracing”1 garbage collection, which deals with the entire directed graph, will reliably detect and garbage collect cyclical references. However, note that probabilistic, localized tracing combined with RC decrements is comparable to mark-sweep (MS)3 for (even apparently asymptotic) throughput computational complexity costs, yet with an upper bound (“6 ms”) on pause times due to locality of search. Yet presumably pause times are eliminated with a concurrent MS (and as cited below, RC has less throughput than BG-MS).

According to this expert book4, reference counting has significantly worse throughput (except compared to the pathological asymptotic huge-heap case for tracing GC, which can be avoided15, and perhaps multi-threaded/multi-core/parallelism, but I think I need to study that more) because of the significant additional overhead on every pointer assignment. This overhead as a percentage is worse for highly optimized, low-level languages (e.g. C/C++) compared to interpreted languages, which have high overhead anyway. Note that a claimed throughput parity of RC with MS5 (and it’s just culling young-generation overhead, similar to prior art16) is not parity with generational GC combined with MS (BG-MS).6

Additionally the adjustment of the reference counts on each pointer assignment thrashes the cache7, further reducing performance, except for optimizations which generally have to be combined with a tracing GC anyway (and they diminish the afaics mostly irrelevant claimed deterministic finality benefit).8

Additionally for multithreaded code there is additional overhead due to contention over the race condition when updating reference counts, although there are potential amortization optimizations which are also potentially superior for parallelized and/or distributed paradigms9 (but they diminish the afaics mostly irrelevant claimed deterministic finality benefit).

Additionally memory allocation management on the heap is non-deterministic, so the determinism of the immediacy of deallocation for reference counting is afaics more or less irrelevant.

Additionally reference counting (especially when executing destructors) can cause a dominoes cascade of latency (aka “stop the world”), and the amortization optimizations10 again diminish the afaics mostly irrelevant claimed deterministic finality benefit.

However, in my next post in this thread, I will propose a variant of deferred reference counting11 in conjunction with a clever generational GC scheme (apparently similar to prior art16), for avoiding the pathological asymptotic case of tracing GC. See my further comments about this in my reply to @keean below.
In general, more complexity in the code will increase the probability of semantic memory leaks, even semantic leaks of the form that automatic garbage collection can’t fix because the error is semantic. Reference counting (a form of automated GC) will also suffer these semantic memory leaks. The issue of complexity of semantics is more or less not peculiar to the type of automatic GC scheme chosen. Thus I don’t view this as a valid complaint against tracing GC in favor of reference counting.

1 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §1 Introduction, pg. 2.

2 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §3.5 Cyclic reference counting, pg. 56 – 62, §3.6 Issues to consider: Cyclic data structures, pg. 71.

3 D. F. Bacon and V. T. Rajan (2001). Concurrent cycle collection in reference counted systems. In J. L. Knudsen, editor, Proceedings of the 15th European Conference on Object-Oriented Programming, ECOOP 2001, Budapest, Hungary, June 18–22, volume 2072 of Lecture Notes in Computer Science, pp. 207–235. Springer-Verlag, 2001.

4 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §2.1 The Reference Counting Algorithm, pg. 19 – 25, §3 Reference Counting, pp. 43 – 74.

5 R. Shahriyar, S. M. Blackburn, X. Yang, and K. S. McKinley (2012), "Taking Off the Gloves with Reference Counting Immix," in OOPSLA ‘13: Proceedings of the 24th ACM SIGPLAN Conference on Object-Oriented Programming Systems Languages and Applications, 2013. Originally found on Steve Blackburn’s website.

6 Stephen Blackburn; Kathryn McKinley (2003). "Ulterior Reference Counting: Fast Garbage Collection without a Long Wait". Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2003, Table 3 on pg. 9.

7 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §2.1 The Reference Counting Algorithm: The strengths and weakness of reference counting, pg. 22.

8 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §3.3 Limited-field reference counts, pp. 50 – 55, §3.3 Limited-field reference counts: One-bit reference counts, pg. 52, §3.6 Issues to consider: Space for reference counts & Locality of references, pg. 71.

9 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §8.4 Concurrent Reference Counting, pp. 200 – 201, §3.7 Notes, pg. 74, §4.1 Comparisons with reference counting, pg. 77.

10 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §3.1 Non-recursive freeing, pg. 44.

11 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §3.2 Deferred reference counting: The Deutsch-Bobrow algorithm, pg. 46.
This is conceptually broadly analogous/related to deferred reference counting12 schemes which rely on compile-time deterministic tracing of the deallocation, such as via linear (aka uniqueness) types. Although I was obviously also contemplating similarly to you in my prior posts in this thread (after you had reminded me about RAII in our previous discussions and my recollection of programming in C++ in the late 1990s), after reading a book on garbage collection and thinking more about the issues, I’m going to further argue against the compile-time deterministic memory mgmt language design strategy in my upcoming detailed post, because I believe it’s not as generally flexible, generally sound, nor as highly comprehensible a solution.

These stack memory mgmt paradigms you’re referring to have narrow applicability because they don’t handle the general case of non-deterministic heap management. Afaics, it’s a special-case methodology that leads to a very complex set of corner cases, such as for C/C++, which tries to pile on a bunch of different sorts of compile-time optimizations and keep as much memory mgmt on the stack, but leads to a brittle clusterfuck of a language with a 1500-page manual that virtually no one understands entirely (not even the C++ creator Bjarne Stroustrup nor the STL creator Alexander Stepanov, as they both admitted). And those stack paradigms don’t marry well holistically with a bolt-on tracing garbage collection appendage13 (i.e. one not designed holistically with the language), although I have not yet read a more recent research paper on combining tracing GC with Rust.14

To even attempt to apply this compile-time deterministic memory management holistically requires a complex typing tsuris akin to Rust’s total ordering of borrow lifetimes with lifetime annotation parameters for transitivity of borrows from inputs to outputs, and of course the programmer must violate the invariants and create unsafe code in order to handle non-deterministic (i.e. runtime randomized) memory mgmt, which defeats the entire point, as then the total order is no longer checked and the unsafety can leak back into the code the compiler thinks is safe.

I apply the 80/20 or 90/10 Pareto principle to prioritization. Although all that special-case tsuris might be able to obtain an extra 10% of performance, it has the cost of 90% more effort on debugging, readability of the code, maintenance of the code, complexity of the code, etc. When you really need that 10% boost (or in the case where the more generalized GC paradigm is much slower for some reason), then go write a module in C++ or Rust. I see no need to invest my effort in trying to recreate (or improve upon) those languages, because there are millions of man-hours already in those ecosystems. Our only prayer of creating something useful in this decade with our resources is to 90/10% prioritize on clean abstractions that are elegant and generalized and 90% performant with 10% of the effort and complexity. EDIT: on the coding-for-readability point, “Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.”

Additionally, perhaps my GC design ideas (to be outlined in my next post) might possibly be compelling (if I haven’t made some errors in my thought process, which is entirely possible given I need to eat dark chocolate to even ephemerally coax my brain out of the 105 IQ brain-fog gutter and function non-lethargically, presumably due to an ongoing TB infection).
Note however, that the prior art for GC is apparently vast and seemingly exhaustive in terms of the low-hanging fruit of obvious, simple variants16 (i.e. fertile ground presumably only in the realm of the esoteric, high-complexity, and/or abstruse). Thus it is unlikely I discovered something (after a day of reading and contemplation) that hadn’t been considered previously in the prior art, although there appears to be potential prior art similar to my ideas16 that appeared later than the book I read. Note I independently arrived at my ideas (after about a day of contemplating the 1996 book) and have not yet read that 2003 research paper16 to compare.

Total-ordering compile-time paradigms (including attempting to type check everything) are afaics brittle paradigms that break because the programmer must necessarily break out of them, because the entropy of the universe is not bounded. I commented on this previously numerous times in these issues threads. The non-determinism (i.e. unbounded randomness and entropy) of the universe (i.e. the real world in I/O) can’t be statically modeled.

My thought is we need to be circumspect about the benefits versus trade-offs of every paradigm and feature we choose in a programming language. And the target use cases (and acumen/preferences of the target programmer demographics) of the programming language will also factor into the decision process. For example, Haskell programmers favor brevity and mathematical expression, with trade-offs such as hidden (2nd, 3rd, and 4th order) details in the form of implicit typing (which the mathematically minded model in their mind without reading them explicitly in the code), which drastically reduces the demographics of programmers who can read the code and participate. Some snobbishly claim this is a good filter to keep the less intelligent programmers away. Haskell has other trade-offs (necessary to enable the mathematical modelling) which we’ve discussed elsewhere and are outside the scope of this concise summary, including but not limited to (and my descriptions may not be entirely accurate given my limited Haskell fluency) the debugging non-determinism tsuris of lazy evaluation (although the requirement for lazy evaluation is disputed), the bottom type populating all types, a total ordering (or algebra) on the implementation of each data type for each typeclass interface, a total ordering on the sequential (monadic) threading of I/O processing in the code (requiring unsafe code to fully accommodate the non-determinism of I/O), etc.
Human-managed weak references are an incomplete and human-error-prone strategy. Also, as you noted, you then leak the non-determinism of deallocation into how to handle the error conditions of null pointers, leaking the non-determinism into program logic! That is horrible. Whereas a provably sound weak pointer algorithm has exponential asymptotics and thus is no better or worse than tracing GC mark-sweep.2

12 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §3.2 Deferred reference counting, pg. 45.

13 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, §9 Garbage Collection for C, pg. 227.

14 Y. Lin, S. M. Blackburn, A. L. Hosking, and M. Norrish (2016), "Rust as a Language for High Performance GC Implementation," in Proceedings of the Sixteenth ACM SIGPLAN International Symposium on Memory Management, ISMM ‘16, Santa Barbara, CA, June 13, 2016.

15 Richard Jones (1996). Garbage Collection Algorithms For Automatic Dynamic Memory Management, Preface: The audience, pg. xxiv, Preface: The Bibliography and the World-Wide Web, pg. xxv.

16 Stephen Blackburn; Kathryn McKinley (2003). "Ulterior Reference Counting: Fast Garbage Collection without a Long Wait". Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2003. I originally found this on Wikipedia’s page for Reference Counting.

Kindle version of the book is available on Amazon for less than $20 if you don’t want to turn your monitor sideways to read the “free” (copyright violation theft) pdf linked above. EDIT: a more concise overview resource:
|
Original Author: @keean The claims made about GC efficiency in those kinds of books never turn out to be true in reality. Look at how JavaScript is slow yet emscripten can get within a factor of 2 of native 'C', yet both run on the same VM. The difference is that asm.js and WebASM are strongly typed and manually manage the memory. In theory the JIT can make well-typed JS code as fast as strongly typed code, so well-typed TypeScript should be as fast as the compiled asm.js/WebASM, but it's not. The remaining difference is mostly down to memory management in my opinion. |
Original Author: @NodixBlockchain
I wonder if this solution would work if the object instance referenced can change during the lifetime of the reference, because it would mean you might need to delete the referenced object before the reference goes out of scope, in case a new object is assigned to it during its lifetime. |
Original Author: @barrkel You can see the efficiency of most GCs really easily: just look at % time
I don't think there's any theory that JS would ever be as fast as webasm
And that's without getting into the costs of semantic fidelity. You can't
Original Author: @keean TypeScript can force you to stick to classes for objects, allowing the JIT to efficiently optimise these. Really for JS where there are no allocations going on it is as fast as asm.js. GC hides all sorts of costs, like the extra level of pointer indirection. Probably a major factor is that GC hides the cost of allocation and deallocation, so the programmer does not realise they are thrashing object creation. What is the fastest garbage collected language? |
Original Author: @keean GC is even worse with multi threads, because it has to stop all the threads to do the collection. This leads to GC pauses, and bad user interface Jank, plus programs crashing due to fragmentation... |
Original Author: @keean
The reference count needs to be in the object referenced, not the pointer, so it won't be in the same cache line as the pointer. When you have multiple pointers to the same object they all need to increment/decrement the same counter. |
Original Author: @keean The key fact for me is that managed memory usage is always significantly worse than unmanaged. Yes, GC can be fast providing you have twice the RAM available. Typically on a multi-tasking machine this means less RAM available for other processes. This results in real slowdown of things like the disk-caching layer because it is losing pages to the application. The end result is that the user experience of using a system with GC is significantly worse than using a system with manual allocation - providing the software does not crash due to bad memory management :-) |
Original Author: @keean It cannot be with the pointer, because when you create a new pointer to the array you need to increment the reference count. |
Original Author: @barrkel The argument in favour of GC isn't speed of execution, it's speed of
Reference counting is GC. Arguably, arena allocation, or any other scheme
Yes, GC scales because it trades space for time. If you're running on a
I mostly just butted in because accusing GC for JS / Typescript being
I'm gonna unsubscribe from this thread now. I got linked in by an
Original Author: @keean
Actually the difference in performance between JS and WebASM is only about 10-20% anyway, so that is in the 2%-10% region you claimed for the GC, so for some algorithms it could account for 100% of the difference... |
Original Author: @keean Interestingly I am about to port a bunch of asm.js to TypeScript, as the memory allocation for asm.js / webasm does not work well with the main JS garbage collector, and allocating the heap for the webasm is causing fragmentation of the JS heap. In the end it turns out that the better memory usage of keeping all the memory managed by the GC is better for application stability than the small performance gain of having the code in asm.js. Following this argument to its logical conclusion, the better memory usage of manually managed memory for the whole application would be better overall.

Edit: it's probably worth pointing out that this application uses a lot of memory, and has some periods when it is performing rapid allocations. This leads to memory usage for a single tab of about 1GB (even though memory buffers are 16MB, so with manual allocation it would get nowhere near that level). Frequently when usage gets above 1GB the web browser crashes, even though most of that memory is probably free and just waiting to be collected. |
Original Author: @shelby3 @NodixBlockchain wrote:
This is off-topic. Explicit, unchecked (i.e. “unsafe”) manual mgmt is not what we are interested in here. The topic was about automatic (or compiler checked deterministic) mgmt. As you know, both @keean and I are significantly into compiler-assisted checking. I understand you want to participate. But you really have not wrapped your mind around all the concepts, and so you should be more circumspect and lurk more. I said if you want to post to your thread, then I will try to participate occasionally. Please be respectful and recognize your limitations.
After briefly learning about it, I’m no longer interested in your low-level framework. Your monotheistic intent on pushing your low-level preferences diverges the focus of my thread, because you don’t have an open-mind attempting to be objective about research results. Your low-level framework focus is irrelevant to the topic of this thread. Please talk about that in your thread which you entitled “Saluations! :)”. This/my thread is not for trying to convince yourself and/or the world that your low-level framework paradigm is great. Your thread is for that. I wouldn’t ban you even if this were my repository, but I believe curation (not absolute censorship) is valuable. Trust me, if you write anything in your thread which I happen to read, that I think is important enough to be in this thread, I will quote it over in this thread. Also I presented a guideline given that I find your posts have so many errors and low informational value, that you could post after every 10 posts by others in the thread (so as to force you to focus yourself more and keep the S/N ratio here reasonable). I considered the low-level implications as even more evident by the post which follows this one. And realized that your preferences are a dinosaur. See my next post. I am discussing here about general purpose, high-level programming language design that employs compiler checking and/or automated runtime GC to facilitate lower costs of development and guard against human error. With some thought given to performance optimization and some perhaps C-like low-level capabilities as long as the result is clean, simple, and elegant and not explicit, manual, bugs-prone human mgmt of memory. |
Original Author: @shelby3
TypeScript transpiles to JavaScript (ES6), not ASM.js nor WASM. (As you know, TypeScript aims to be round-trip compatible with, and a typed superset of, JavaScript, meaning it should pass through ES6 code more or less unchanged.) Also C code which is transpiled to ASM.js/WASM employs different abstractions, such as not boxing every damn thing, e.g. every element of every array. The VM can’t always statically type objects, so methods and type coercions can require dynamic overhead.

Also I believe JS may be passing all function arguments on the heap (the prototype chain of objects which form the closures we previously discussed), and thus although the throughput and locality (i.e. contiguity for minimizing cache and virtual-paging thrashing) is similar (i.e. allocation contiguously on the stack versus the generational heap, with only an extra end-of-area pointer increment for each generational area allocation as compared to one for the SP on each function call), the volume of objects before a collection (i.e. reduction of the end-of-area pointer, although presumably all function arguments could be allocated as one data structure, thus equivalent cost) is much greater than for stack-passed arguments and thus perhaps causes additional thrashing.

C code rarely employs hashmap lookup, yet JS literally encourages it since all object properties and non-numeric array indices employ it. There are many more details and we would need to study all the reasons: https://news.ycombinator.com/item?id=7943303

This thread is in part my attempt to get the best of low-level ASM.js/WASM married to a better GC scheme that is more compatible with (some of) those lower-level abstractions!
Agreed, e.g. they do not realize they’re boxing everything, forcing an extra pointer indirection for every read/write of an array element.
Huh? JS is several times slower than C, and ASM.js is afair purportedly only about 40% slower than native C speed: https://julialang.org/benchmarks/ Perhaps you’re referring to JS highly optimized to avoid features which impair the aforelinked V8 optimizations?
Generational GC is very efficient, nearly as much so as the stack (as I explained), and I am going to propose an algorithm that I think might make it even more efficient.
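A minimal sketch of why generational (nursery) allocation can approach stack cost, under the assumption that collection and evacuation of survivors are handled elsewhere: allocating is just bumping a pointer through a contiguous young area, and after a collection the whole area is reset. The sizes are illustrative.

```cpp
#include <cstddef>
#include <cstdlib>

class Nursery {
    char* base_;
    char* top_;      // next free byte: allocation only increments this pointer
    char* limit_;
public:
    explicit Nursery(std::size_t bytes)
        : base_(static_cast<char*>(std::malloc(bytes))),
          top_(base_), limit_(base_ + bytes) {}
    ~Nursery() { std::free(base_); }

    void* allocate(std::size_t n) {
        n = (n + 7) & ~std::size_t{7};          // 8-byte alignment
        if (top_ + n > limit_) return nullptr;  // nursery full: the collector would run here
        void* p = top_;
        top_ += n;                              // the whole cost of an allocation
        return p;
    }
    void reset() { top_ = base_; }              // after survivors are evacuated to the old generation
};
```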
It is true that the asymptotic computational complexity (i.e. performance) of tracing GC is pathological both as live memory usage approaches/exceeds the bound of physical memory and as total memory consumption increases for computers with more GBs (or TBs) of DRAM. That is why I cited that research for footnote16 in my prior post—which btw my independently derived idea seems to be roughly analogous to—for removing the pathological asymptotics by leveraging an optimized form of RC for the older generations. I will write my detailed post about this soon. And that is why I allude in my response to @barrkel to the time vs. space trade-off approaching parity between automated GC and manual/stack/compile-time deterministic paradigms for memory management. The reason that the deterministic paradigms you espouse will not exceed the space vs. time trade-offs for the throughput (nor maximum latency pauses) of state-of-the-art automated GC in general is because I repeat that memory allocation mgmt is inherently non-deterministic (thus requires runtime management)! Also there is research1 (which I haven’t read yet) which seems to indicate that as the number of (perhaps also non-deterministic) factors (e.g. power and energy management issues) that need to be managed increases, then automatic algorithms may become the most plausible (reasonable cost, time-to-market, stability, maintenance) for most applications with hand-tuning being further, further relegated to the esoteric that demand extreme tuning (e.g. the NASA space shuttle).
Even manual memory mgmt will have virtual memory paging out to disk as the physical memory is exhausted. That half-memory deal is afaik due to the “semi-space” of a copy collector that is not generational. Afaics, there are better alternatives now. See footnote16 in my prior post. I need to study this more thoroughly though. Of course manual (both unchecked and compiler-checked) mgmt can be a bit faster or correct some pathological cases (as well as potentially create other ones ;), but at great cost as I wrote in my prior post as follows:
And you alluded to the great cost in terms of bugs and/or tsuris:
In my prior post, I had alluded to how attempting to have the compiler check memory mgmt forces it into a deterministic, total order, which will necessarily cause the programmer to violate the compiler checks and lead to bugs (unless perhaps that compile-time paradigm is somehow limited to only deterministic instances, but from my study thus far it appears that the cross-pollution/interop between young and older generation allocation makes a non-deterministic, runtime automated algorithm interoperating with a deterministic young generation have worse characteristics than a runtime, automated generational non-deterministic collector for the younger generation):
1 T. Cao, S. M. Blackburn, T. Gao, and K. S. McKinley, "The Yin and Yang of Power and Performance for Asymmetric Hardware and Managed Software," in ISCA ‘12: The 39th International Symposium on Computer Architecture, 2012. Originally found on Steve Blackburn’s website. Also T. Cao’s master’s thesis prior art.
+1.
Also it is about consistency of memory mgmt, i.e. avoiding the pitfalls of the average programmer or rushed development team.
Well yes, but as I mentioned above in my reply to @keean, and I presume you’re also alluding to, that the differences in space vs. time tradeoffs between automatic and manual memory mgmt are getting closer to low double digit (or even single digit) percentages (i.e. “extremes” of optimization) and thus heavily favoring automatic GC for most use cases. Note I’m not conflating this metric with the 2 - 10% figure for throughput differences that you asserted.
Note though that the metric you propose (although significantly representative given all other factors equivalent) doesn’t account for all impacts of the GC scheme chosen, such as cache and virtual paging locality (contiguity) to minimize thrashing which can significantly impact performance. Note however, that deterministic compile-time (manual explicit or for stack argument conventions and such thus assisted partially implicit) memory mgmt doesn’t necessarily achieve a panacea for this ancillary factor and could actually degrade it because generational and copy GC increase locality. I appreciate your assistance in helping to explain. And I agree that is an important clarification to emphasize. Please feel free to participate here as often or infrequently as you want. Btw, @keean is expert in other areas such as C/C++, Haskell, Prolog, type theory, and library algebras. Afaik, he has only been developing seriously with JS/TypeScript for a relatively much shorter period of time. So that might explain it. Or it could also be rushing and trying to answer late in the evening after a long day of managing his company.
+1. |
Original Author: @NodixBlockchain
If you consider references as objects with a delete operator rather than only a pointer with a free, the destructor can release references to children before releasing the reference itself.

It's not anecdotal; I have a fully working application server with its own memory management and an object-oriented script language, which deals with thousands of objects, and I never spent more than 1h tracking a memory leak in the whole development cycle, because I have a clear methodology to track them. The script system is actually close to JavaScript, with trees of objects on several layers, so I could make a generational GC very easily, but there are too many corner cases with multiple threads and plugins for me to really consider it over simple reference counting, which works for every single case possible with a very small code base. I've been doing tons of experiments with different models of memory management.

And I also have a lot of experience with GC languages such as AS3, the Android SDK and JavaScript, and there are always many instances where I wish there were manual memory management, because the garbage collector won't really release the memory when it 'should', and it's very often macro-managed, with the GC only releasing things once in a while, which is far from optimal if there is a need for a lot of dynamic object creation inside a component that needs to run for a long time allocating lots of new objects. And that's not even multi-threaded, and these kinds of problems are already recurrent in every GC language out there. |
Original Author: @shelby3 I’m trying to preempt another 1000+ post thread. We already had extensive discussions of your ideas, framework, perspective in the Concurrency and your “Saluations! :)” threads. We don’t need to repeat that again. I’m not trying to be dictatorial or unappreciative of discussion. I’m very interested right now in automated GC, for the reasons I’m explaining. I want to see if we can make it work optimally with the lower-level abstractions and resolve the pathological asymptotics issue.
I already told you that explicitly asking the programmer to remember to break cycles is off-topic (i.e. the destructor will not be automatically fired if there is a cyclic reference preventing the parent most object's reference count from going to 0):
Although you’ve deleted (or moved to your thread?) those numerous prior off-topic posts I asked you to (thank you), you want to repeat the same points of what I percieve to be off-topic noise w.r.t. to the focus I want in this thread. I understand you think you have something important to say, but you apparently did not even pay attention or grok that I had already stated that it (i.e. the same point you’re repeating again) was/is off-topic.
We already discussed in the prior threads that (your and Rust’s paradigm of allowing/presuming) willy-nilly multithreading everywhere is not the paradigm I’m interested in. I’m looking at the single-threaded event model. I will be discussing GC and multithreading in this thread.
How many times am I going to have to repeat myself that I am not interested in discussing further the anecdotal claims of your low-level framework. You’re apparently overlooking the more holistic point that:
If you’re interested in your paradigm in spite of what I perceive to be the convincing exposition in my prior posts, then please feel free of course to discuss it in a different thread. I may occasionally read there. If you make your best arguments there (not in this thread) and if I see anything I agree with, then I will quote it over here (and invite/entice you to discuss here). Please do not pollute my thread (at least respect the one post for every 10 others request, or 6 if all the intervening posts are lengthy and infrequent) with what I currently perceive to be off-topic discussion.

I do not want an interactive chat-style discussion in this thread. We can chat back-and-forth in realtime in private or another thread. I wanted focused, carefully thought out discussion in this thread that is easy to re-read and condensed enough for others to follow. Because this is my thread wherein I am trying to sort out what language paradigms I need, and this is a very serious issue I am in a rush to complete, and I do not want the distraction of what I perceive to be a boisterous (repetitively arguing the same points over and over) dinosaur that is entirely incorrect. Essentially you’re trying to route me away from extensive sources to your pet project. That is very myopic.

You’re apparently a prolific and very productive programmer (unlike my pitiful self, reduced to rubble because still apparently suffering from a perhaps resistant strain of Tuberculosis). You and I aren’t able to collaborate much because we are ostensibly headed in different directions w.r.t. programming paradigms, while maintaining very low overhead on throughput.
It “should” only do so in a balance of throughput, asymptotics, pause latency, and (time & space) locality. If you prefer to prioritize time locality (immediacy), then not only are you attempting the impossible of defying the non-determinism of memory mgmt (i.e. immediacy is irrelevant when the period of allocation is non-deterministic), you’re also not optimizing the entire equation for memory mgmt. This sort of myopia might not manifest as measured problems in your handcrafted project due to the confirmation bias of handcrafted design decisions, but it may in a more diverse and general ecosystem of use cases and developers. Your exposition has the hallmarks of an amateur lacking academic study. You’re referring to automated GC implementations from a decade or more ago. Even Google’s JavaScript V8 GC is only incremental and generational, with a standard pathologically asymptotic mark-sweep for the older generation. I wrote already in this thread that new research has drastically improved this issue:
You make me repeat myself because you don’t assimilate all the details I write. Presumably you’re so glued to your mental model of programming that you don’t notice subtleties that paradigm-shift the presumptions.
Agreed, but note in my prior post with the many footnotes about GC, I cited some research (Weighted Reference Counting) that places the count in the pointers for allocation and only has to decrement the shared count on deallocation. I also cited research about employing the reference counting for only a single bit stored in the pointer in a reference-counting generation scheme. Also in an earlier post I suggested pulling all the counts closer together in contiguous locality to minimize cache-line and virtual-page faults. There’s also the concept of buffering the increments and decrements to amortize. |
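A minimal sketch of the weighted reference counting idea referenced above, with the weight shown as a separate field for clarity (the cited scheme stores it in unused pointer bits) and the weight-exhausted case omitted: copying a pointer splits its weight without touching the shared object, and only dropping a pointer decrements the object's total.

```cpp
#include <cassert>
#include <cstdint>

struct Counted {
    uint32_t total_weight;   // sum of the weights of all live pointers to this object
    int      payload = 0;
};

struct WeightedRef {
    Counted* obj;
    uint32_t weight;
};

WeightedRef make_ref(Counted* obj) {
    obj->total_weight = 1u << 16;                // initial weight, an assumed constant
    return { obj, obj->total_weight };
}

WeightedRef copy_ref(WeightedRef& r) {           // no access to *r.obj at all
    assert(r.weight > 1 && "weight exhausted: real schemes add indirection here");
    uint32_t half = r.weight / 2;
    r.weight -= half;
    return { r.obj, half };
}

bool drop_ref(WeightedRef& r) {                  // the only operation that touches the shared object
    r.obj->total_weight -= r.weight;
    return r.obj->total_weight == 0;             // true => caller finalizes r.obj
}
```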
Original Author: @NodixBlockchain The thing is, my paradigm is the basis for a high-level language; it fully works with JS/JSON and can generate binary data for WebGL apps or other kinds of dynamic applications. Even Node.js or a JavaScript VM uses this kind of code internally. You can't have any kind of high-level language without this kind of code underlying it. And for the moment it gives more features than JavaScript, as it works in a multi-threaded environment and memory can be freed safely when it's not used directly.

The script language can already process HTML data, including form variables and query variables, and interacts 100% with JS apps, and I think it's still cleaner than PHP. For example: https://github.com/NodixBlockchain/nodix/blob/master/export/web/nodix.site#L555 It can parse POST variables, including files, and generate JS variables and code with live variables from the node, etc.

The low-level mechanism is only to be seen as a minimalistic VM to run the script language, which either handles network events or generates HTML pages with dynamic data. Any kind of advanced high-level language will need this sort of code to manage memory, references, multiple threads, etc. The lexer/VM can be very simple because the low-level code already has advanced features to deal with objects, memory, scope and evaluation of object data. |
Original Author: @shelby3 @NodixBlockchain, you’re repeating the same points I already discussed with you in the Concurrency and your “Saluations! :)” threads. I refuse to repeat my explanations again to you as to why what you think is applicable, is not applicable to what I am doing. It’s not my job to repeat myself over and over, just because you think I didn’t already refute you before in those other threads. Please stop. Please.
Incorrect. And I do not want to explain why again.
Your paradigm for multithreading is not applicable to the direction I’m headed, as I will repeat what I wrote before quoted as follows:
I’m not trying to be a jerk. I have clear reasons for the focus that I’m pursuing. I hope you also understand my available time is very limited (and no that is not a valid reason that I should employ your framework). Sometimes I ponder if the trolls at BCT paid you to come and disrupt me. I’m having enough difficulty as it is making forward progress, and I need to be taken off on repetitive tangents like I need another hole in my head in addition to the one I already have from a hammer. |
Original Author: @shelby3
Agreed afaik you appear to be making a valid and important point, which dovetails with my initial peek into analyses about GC for proper integration with the low-level coding. The problem you’re experiencing perhaps has something to do with the fact that most browsers are limited to presuming a least common denominator of a 32-bit virtual address space and this gets very complicated as to how they split this up between processes. They also have to protect against writing outside of bounds efficiently. I dug a little bit into the issues (c.f. the last 2 links in quoted text below) but did not completely climb down the rabbit hole.
I had mentioned to you in private discussion my idea about storing older-generation objects in same-sized-object memory pages in order to minimize fragmentation with a 64-bit virtual address space. This would rely on less frequent compaction pauses for old-generation objects, and would reuse old-generation slots instead of waiting to compact them (i.e. not primarily a copy collector). Also note that very large or very small objects are typically handled differently by GC algorithms, so there may be an issue there w.r.t. the 16MB objects you’re allocating. I agree that the memory allocation seems to have corner cases around the interaction of the GC in JS and the allocation of the

However, as I pointed out in prior posts, just because automated GC has a corner issue with a particular scenario doesn’t enable the presumption that manual memory mgmt is a panacea in terms of the holistic equation of memory-mgmt optimization. You gain some control in one area and lose something in another axis of the optimization space (e.g. more human-error-prone, less throughput, or worse asymptotics, etc).
I’m contemplating whether I can design my own programming language which models what I want, will transpile tolerably to JS/TypeScript (or something else) easily enough to allow me to use it now, and then rewrite the VM and compiler later to the optimum. This is what I’m analysing now and the reason I’m doing research on GC.
How do you arrive at this conclusion? Have the posts I have made since you wrote that, changed your mind? Afaics, the optimum solution is to better design the GC and VM with the low-level coding in mind. JavaScript was designed by Brendan Eich in 10 days. I presume that since then backward compatibility demands have prevented any holistic overhaul. Also I’m prepared to deprecate 32-bit OSes as a viable target. I’m future directed because anyway, we’re not going to get something popularized within a couple of years at best. Afaik, even mobile is moving towards 64-bit rapidly. Microsoft Windoze may be a bit resistant, but 64-bit Home is available for those who care when they buy a machine. And anyone still running that spyware without dual booting or running Linux in a VirtualBox is really behind the curve. |
Original Author: @NodixBlockchain
Just have to comment on this, you're paranoiac dude lol I don't have anything to do with anyone on BCT lol |
Original Author: @keean
Because I am good enough to get the manual memory management right, and it will use a lot less memory, which will result in less fragmentation and fewer crashes, and therefore a better user experience. I am still convinced that static analysis and region-based memory management is a sweet spot for high-performance languages. I think GC is useful for simple coding, but there are some big problems:
If you have a system with all three of the above problems, there is nothing you can do as a programmer within the language to solve the problem. You can however solve problems (2) and (3) if you can avoid (1) by limiting object creation. If you can force pre-allocation of resources, and then use effects to restrict new object creation, so that the compiler will prevent you from using idioms that thrash the GC, then it could be acceptable. The problem is that you need all the library APIs to respect this too, and to provide a way for the user to pass the result buffer into the function/procedure. As we know, if you give people multiple ways to solve a problem they will probably pick the wrong way. This is why I am interested in a language that forces people to do things the right way, so that library APIs will always be written the correct way. Of course this might not make the language very popular, but I think it is an interesting approach to explore. |
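A minimal Rust sketch of the buffer-passing style described above: the caller pre-allocates and the callee writes into the borrowed buffer, so no new objects are created in the hot path. The function and variable names are made up for illustration.

```rust
/// Scales `input` by `factor`, writing into a caller-provided buffer
/// instead of allocating and returning a fresh Vec.
fn scale_into(input: &[f32], factor: f32, out: &mut [f32]) {
    assert_eq!(input.len(), out.len());
    for (o, i) in out.iter_mut().zip(input) {
        *o = *i * factor;
    }
}

fn main() {
    let samples = vec![1.0_f32, 2.0, 3.0];
    // Pre-allocate once and reuse across calls, so the allocator/GC is never thrashed.
    let mut scratch = vec![0.0_f32; samples.len()];
    for frame in 0..3 {
        scale_into(&samples, frame as f32, &mut scratch);
        println!("frame {}: {:?}", frame, scratch);
    }
}
```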
Original Author: @shelby3 @NodixBlockchain wrote:
It’s half-joke and half observing that the trolling of BCT seems to follow me over here. However, it must be stated that you’re not insincere. You truly believe (or believed) your work is applicable to what I want (or to the balanced presentation for other readers/participants here). I just don’t know how much verbiage I have to put out in order to get you to understand my side. Seems like perhaps you understand now. Again I am very much willing to read anything you write in the thread you created on @keean’s repository, if I have time. But I hope you will understand my point in this thread. I suggest re-reading the thread, as I made so many edits to my posts over the past 8+ hours, including my prior responses to you in this thread. Again thank you for cooperating on my request. That was very honorable. I’ll try to be mutually respectful in your thread.
That appears to ignore the arguments @barrkel and I have made about the 80/20 rule of nature. Those who ignore that rule perpetually fail in business and life. Of course you can do anything with infinite programming man-hour resources (assuming it doesn’t end up in a maintenance clusterfucked hairball), but that is not a valid argument. That appears to be the hubris of anecdotal experience not backed by a sound analysis of the factors involved: which I will revisit below since apparently my prior writings in this thread are being ignored. No one should dictate to you which programming language you should use. Likewise, you can’t dictate to the rest of the world, what it prefers in terms of trade-offs in costs and outcomes for choice of paradigms in the programming languages. Java, JavaScript, Python, PHP, Haskell are very popular programming languages which all use tracing GC (and apparently not even the state-of-the-art GC):
C/C++ are also popular, but afaik they do not match the aggregate market share of the others. As well, C/C++ these days are typically limited to programs which require low-level control. My quest in @keean’s repository has been to get the advantage of GC, and perhaps some other aspects of FP such as first-class functions and closures, in a simple language that also can do some low-level things such as unboxed data structures. The aim is to further erode the use cases where C/C++ would be chosen, and to relegate that clusterfucked C++ language tsuris (and perhaps Rust also, but the jury is still out on that) more and more to the extreme fringe of use cases. For example, it is ridiculous that I must employ a cumbersome Node.js Buffer library to build an unboxed data structure. Also I think you may be underestimating the trade-offs (at many different levels such as development time, maintenance, worst-case failure modes, scalability, etc) of avoiding the assistance of runtime algorithms. Essentially, since non-deterministic resource mgmt is an invariant of programming, you’ll end up writing a runtime GC algorithm to handle it, so I can’t see how your stance makes any sense whatsoever:
For the highest quality software (i.e. mission critical stuff) perhaps yes (although I think perhaps algorithmically managed non-determinism will become more reliable than handcrafted), but I see already from my study of the research that gradually (even suddenly) state-of-the-art GC will continuously improve and cannibalize most of it until you’re relegated to programming languages that most people have no interest in.1 You always knew I was interested in mainstream programming languages. So if you remain glued to that stance then we're likely headed in different directions. I thought I was going to be supporting static memory allocation analysis for Lucid, until I realized that generational allocation can be as performant as compile-time stack based for the frequent young objects (and I have a new algorithm in mind to make it more so by eliminating most of the copying, i.e. “sudden” may be about to happen in a few hours). Btw, I’m known for writing prophetic predictions. There’s some hubris back at ya. 😘
This has all been solved as of today. I will be writing down the details shortly.
You mean @keean’s way. I will prove to you that is not the right way w.r.t. to this issue, although I do believe you have many important insights into programming. 1 However, I think it may be correct to allow some form of manual control over memory allocation for programmers to work around corner case issues with the automated GC. |
Original Author: @keean

If we are targeting JavaScript, this is irrelevant, as we have to live with the web browser's GC. Mozilla's work with their Rust browser has shown you get better performance by bringing the DOM into the JavaScript GC, rather than having it reference counted. I think if you have GC the best model is to have everything managed by the GC, otherwise there seem to be more corner cases. For example, mixing JavaScript and WebAsm seems to be bad for memory performance. So either we compile to web-assembly and manage the memory ourselves, or we should use the web browser's GC. If we have any interaction with GC-managed systems we are probably better off using the web browser's GC.

However, when compiling to native, it would be nice to have a static analysis so you don't need a runtime for the binary. I would really like it if a program to add two numbers consisted of a couple of loads, an add and a store, and nothing else.

As stated above, the benefit of GC is easy development, and freedom from some errors. Unfortunately people don't seem to realise you can still leak memory with GC; it eliminates some simple error cases, but you can still write programs that leak memory like a sieve until they crash. In my experience it is much harder to debug memory leaks with GC than it is with manual allocation, because the allocation is invisible to the programmer. So GC makes some easy coding easier, and some hard coding harder, but the easy cases are common and the hard ones rarer. |
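To make the reachable-but-unused leak concrete, here is a small sketch, written in Rust only for concreteness; the same shape leaks in any GC'd language, since a collector can only free what is unreachable. The cache type and sizes are invented for illustration.

```rust
use std::collections::HashMap;

struct SessionCache {
    by_request: HashMap<u64, Vec<u8>>, // e.g. rendered pages kept per request
}

impl SessionCache {
    fn remember(&mut self, key: u64, blob: Vec<u8>) {
        // Bug: entries are inserted on every request but nothing ever evicts them,
        // so everything stays reachable and no collector will ever reclaim it.
        self.by_request.insert(key, blob);
    }
}

fn main() {
    let mut cache = SessionCache { by_request: HashMap::new() };
    for request_id in 0..100_000u64 {
        cache.remember(request_id, vec![0u8; 4096]); // keys never repeat, so the map only grows
    }
    println!("cached entries: {}", cache.by_request.len());
}
```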
Original Author: @shelby3 Inserted:
|
Original Author: @shelby3 So after all that @keean has now effectively agreed I was correct from the start, lol. (Although he may not realize it, lol) @keean so you’ve adopted strong and weak references with a paradigm for being explicit about finalization. You have regurgitated what I claimed:
|
Original Author: @keean @shelby3 You said RAII couldn't do 'X'... I just tried to understand what 'X' was, and when I did I presented the solution (explicit Weak references). So we have shown that RAII does not have the problems you were claiming. My point that resource lifetimes are isomorphic to handle lifetimes still stands, because the weak reference lets you destroy the handle. |
Original Author: @shelby3
Nope. Firstly because a weak reference does not let you relinquish the resource — only the strong reference can relinquish it. Also because by conflating concerns you will force inner encapsulated concerns to become propagated to outer separate concerns:
IOW, the by-default release due to ARC ownership is presenting to the programmer an assumption of isomorphic lifetime between ARC and resources, which causes the programmer to become lazy about thinking about optimal resource lifetimes. And when the programmer does decide to think about it, they realize the semantics have nothing to do with the ARC lifetime in these cases and thus they use explicit mechanisms to override ARC. So ARC was the wrong paradigm to start with. Put the programmer in the wrong mental model. Even the case where you said the programmer could use explicit braced blocks or the compiler could automatically infer, are a conflation of the fact that resource lifetimes and ARC lifetimes are not isomorphic (i.e. I pointed out that the automatic inference case can’t choose between an ambiguous order when there’s more than one expiring in the same inferred scope, thus an unprincipled paradigm with heuristic outcomes). Also a non-ARC, GC’ed language can’t intermix ARC’ed resource handles into non-ARC, GC’ed data structures — you create a What Color Is Your Function bifurcation clusterfuck. |
Original Author: @keean
Explicit weak references are better than implicit weak references. How will you tell if a reference is weak or not? If all references are weak, then you must test the reference is still valid before every dereference. That sounds like a lot of boilerplate.
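For concreteness, the kind of per-access validity test being described looks roughly like Rust's `Rc`/`Weak`, used here only as an illustration of the pattern, not as the proposed language design:

```rust
use std::rc::{Rc, Weak};

fn main() {
    let strong: Rc<String> = Rc::new("resource handle".to_string());
    let weak: Weak<String> = Rc::downgrade(&strong);

    // Every access through the weak reference needs this check-before-use dance.
    if let Some(handle) = weak.upgrade() {
        println!("still alive: {}", handle);
    }

    drop(strong); // the last strong reference is gone, so the value is destroyed

    match weak.upgrade() {
        Some(handle) => println!("unexpectedly alive: {}", handle),
        None => println!("already released, cannot dereference"),
    }
}
```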
True, and this is the point I was making much further up: a GC language can only have finalizers, and that is not good enough, so you must have explicit close statements for resources. If this is the way you choose to go, then using the type system to statically track whether a file handle is open will help, because it will make sure you don't use-after-close statically, and can enforce a type-guard check in the dynamic case. My point all along has been that deterministic destructors avoid the need for all that boilerplate, and can provide all the same functionality (with explicit weak references), but you need to use RC for your automatic memory management. I commented that I thought this was why Apple went with RC for Swift, because it prevents more resource bugs than GC, which only prevents memory bugs. However I have shown how you can avoid resource bugs with GC by using the type system, so it just comes down to code clutter/boilerplate. |
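A rough sketch of the static use-after-close tracking mentioned above, using move semantics as a stand-in for a typestate/linear type. The `OpenFile` type and its methods are invented for illustration:

```rust
struct OpenFile {
    path: String, // stand-in for a real OS file handle
}

impl OpenFile {
    fn open(path: &str) -> OpenFile {
        OpenFile { path: path.to_string() }
    }

    fn read(&self) -> String {
        format!("contents of {}", self.path)
    }

    fn close(self) {
        // Taking `self` by value consumes the handle, so it cannot be used afterwards.
        println!("closing {}", self.path);
    }
}

fn main() {
    let f = OpenFile::open("data.txt");
    println!("{}", f.read());
    f.close();
    // f.read(); // compile error: `f` was moved by `close`, i.e. use-after-close is rejected statically
}
```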
Original Author: @shelby3 Inserted:
|
Original Author: @keean @shelby3 I think implicit weak references are more problematic than RAII for handles. It means that some references (to memory) behave differently (as strong references) to other references (to resources, as weak references). By including explicit weak references we can treat any reference (to a resource or memory) the same, as a cache that we can clear at any time, and we can rely on them being strong references if not explicitly weak. This seems a more consistent approach to me. |
Original Author: @shelby3
It does seem that the optimal answer which will also accommodate a GC’ed PL is being explicit about which single reference is the strong one, so that the type system can attempt to track it to ensure it is explicitly released. Thus this is orthogonal to ARC. ARC is a heuristic way of getting automatic release in many cases, but it isn’t complete because it isn’t isomorphic; explicitness of the strong reference is required in order to not be heuristic (in some scenarios) and to not be oblivious (to inadvertent resource starvation) in other scenarios, where the programmer became too comfortable relying on implicit ARC instead of thinking about the explicit resource lifetime.
Or throw an exception. The point is to make the programmer think about when he releases his single strong reference, so that he is sure that all the weak references are no longer in use, even if they still exist. Otherwise with ARC, he must communicate to the owners of all those weak references to tell them to assign null.

One could argue to just make all references strong and make sure you structure your code so that all references have died when you want the resource lifetime to end. But my thought is that this is like trying to turn the Titanic. You have encouraged the programmer to scatter references to the wind (e.g. inside layers of encapsulation) and now you want him to unconflate all that in order to conflate it with ARC, but in a way that optimizes resource lifetimes instead of ARC lifetimes.

Also I am wary of your proposed optimization that the resource lifetime ends implicitly upon the last access to the resource handle. There was my out-of-order heuristic point, which admittedly may be a bit contrived or overemphasized. What if I have some callback set up to inform me when the resource’s buffer has flushed to disk? Then the inferred ARC optimization you proposed closes the file before the flush, but let’s say I have another callback set up to do something after closing the file which depends on that buffer being flushed to disk. Maybe that is also overly contrived.
That is related to my last point upthread to @sighoya that by having only one strong reference responsible for the explicit resource lifetime, the compiler could perhaps track it via escape analysis and warn if the programmer has not placed an explicit intent (e.g. a variant of …).

That proposal is probably much less intrusive than some complex linear typing to statically check for resource lifetimes with a proliferation of strong references — which I presume will not handle the range of dynamism in general programming language usage. And remember I wrote yesterday at the very start of this tangential discussion that my original motivation was unifying the issue with non-ARC, GC’ed PLs. So conflating with ARC isn’t even an option for my goal anyway.
If you presume the strong reference is always optimally congruent with implicit ARC destruction then you seem to have already forgotten my (even recent) points about it being non-isomorphic. Ultimately we need to be explicit with the strong reference sometimes. And the other times I argue it would be best to be explicit about defaulting to stack scope. Even your map example is being explicit, because it assigns a null.
I think with a single strong reference there does not need to be more clutter/boilerplate with non-ARC, GC (ARC is also a form of GC), but I am still not yet convinced that requiring curt explicitness is worse than letting the compiler release implicitly. IOW a single strong reference with escape analysis should, as @sighoya pointed out, be implicitly resolvable at compile-time (although somewhat dubiously I argue not perfectly optimally… yet I would agree that a single strong reference can at least catch all the absence of explicitness at compile-time?). |
Original Author: @shelby3 Note the edits:
|
Original Author: @keean @shelby3 I think we are coming to a consensus. Let me try and summarise:
No it does not. My example language has non-nullable references. A reference, if it exists, must be valid (a strong reference). An explicit weak reference you must test with (reference.isValid()) before you can dereference it, so there is no null here either.
Again no, the explicit Weak reference tells the programmer that the resource can disappear at any time. All other references are strong. A clarification: GC and RC are both forms of automatic memory management. RC is not GC because Garbage Collection is a specific algorithm that is not Reference Counting. We both agree that RC + Weak references allows RAII to be used for all cases. We both agree that a combination of static checks and dynamic tests can make GC + explicit destruction safe, and that finalizers are not good enough. What seems to remain is a preference for a stylistic difference: I prefer explicit weak references with implicit destructors, you prefer explicit destructors with implicit weak references? |
Original Author: @NodixBlockchain

Am I correct if I say shelby's issue is along the lines of: for example, if you have an image class that contains a file handle, the lifetime of the image class might be longer than the minimum required lifetime of the file handle? And thus the file handle might remain open longer than it should if released only based on RAII? |
Original Author: @shelby3 Note another edit:
|
Original Author: @shelby3
That’s not the complete conceptualization, because of course the image class could set its copy of the file handle to null. |
Original Author: @shelby3
I see you alluded to move semantics for C++, but that is still being explicit. You move the reference out of the map (and what do you leave in its place except a null?).
I was referring to the implicit release of the resource upon the destruction of the strong reference in the map example.
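As a concrete illustration of that implicit release (a Rust sketch with an invented `FileHandle` type, not anyone's proposed design): moving the strong owner out of the map leaves no null behind, and the destructor releases the resource deterministically when the moved-out owner dies.

```rust
use std::collections::HashMap;

struct FileHandle(String);

impl Drop for FileHandle {
    fn drop(&mut self) {
        println!("releasing {}", self.0); // stand-in for closing a real OS handle
    }
}

fn main() {
    let mut open_files: HashMap<u32, FileHandle> = HashMap::new();
    open_files.insert(1, FileHandle("a.txt".into()));
    open_files.insert(2, FileHandle("b.txt".into()));

    // Move the strong owner out of the map: the entry is simply gone (no null left behind).
    if let Some(taken) = open_files.remove(&1) {
        println!("still usable here: {}", taken.0);
    } // <- `taken` is destroyed here, so the resource is released deterministically

    // The remaining entry is released when `open_files` goes out of scope at the end of main.
}
```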
I cited the literature near the top of this issues thread #35 that conflicts with your desired taxonomy. Even Rust’s docs use the term generally:
Although I agree your taxonomy seems to make some sense (actually not), the legacy literature seems to use the term GC to refer to all forms of automatic memory management deallocation, and to use the adjective ‘tracing’ GC to distinguish from ARC. But there is GC which is not ARC and is not tracing, such as my new ALP proposal, so I would prefer your taxonomy, although then I am not sure if I should refer to the collection of garbage in my ALP idea as GC or as only AMM? So then we have two terms with the same meaning. IOW, memory has no more life if it can’t be referenced anymore, and it continues to have a life for as long as any reference can access it. Whereas non-memory resources live on beyond (or need to die before) the life of the reference to the handle to the resource. The reference to the file handle is not isomorphic with the life of the resource referred to by the file handle.
But that semantically conflates, for example, explicit removal from a …
Even GC + implicit destructors if we use static escape analysis with a single explicitly typed strong reference.
Well we could even have a single explicit strong reference (thus any weak references are implicitly typed by default, although of course access to a weak reference always requires either explicit conditional test or an implicit runtime exception when use-after-destructed/use-after-finalized) with implicit (or explicit) destructors and make it work with either ARC or non-ARC, GC. Thus I see no advantage to conflating with ARC? And I do conceive disadvantages to conflating it with ARC, not only including that it won’t work with non-ARC, but also because conflating with ARC encourages the programmer to conflate reference access lifetimes with resource lifetimes which are not semantically isomorphic. So yes I think you are getting closer to understanding my stance, but there still appears to be some amount of disconnect? |
Original Author: @shelby3 Inserted:
|
Original Author: @sighoya

You can't require the compiler to detect in the interior when to release/finalize a resource, only at the terminus. But the latter can't be required for dynamic cases either, i.e. a finalization must either be aborted if some other entity is blocking your resource, or you force a disruption of the access to that resource and cause the other entities to possibly fail in their obligations. |
Original Author: @thickAsABrick Hello: I'll confirm that I, @thickAsABrick, did not post anything to, or contribute to, this conversation. Not sure why I am receiving these notifications. |
Original Author: @shelby3
Hahaha. Sorry I did not check to see you really exist. I guess I should have taken the visual cue when it was bolded in the comment. Apologies. I will not do it again. I hope you are not offended if I find this to be funny. Hey I wish I had thought of that username before you did. And @Ichoran thought I was chasing away new community, lol. (He is correct) |
Original Author: @thickAsABrick No problem. Just making sure my account was not hacked. Quite honestly, I do not know much about the subject matter that you are discussing. |
Original Author: @shelby3
Thanks for the understanding. And your last sentence fits perfectly then — not that you should be expected to jump into an esoteric 601+ comments issues thread about highly obscure, abstruse programming language design topics and be able to assimilate it. I actually like the Jethro Tull song even though it received poor reviews. |
Original Author: @shelby3
If there is only one reference (or at least only one strong reference) to the resource handle then the compiler should be able to reason (by tracking the implicit linear type) when that reference dies, either upon exit from its terminal stack scope or upon assignment of null.

I was not proposing to track only in the interior (to stack frame scope) when to release. I was proposing to track in the interior whether any code paths don’t escape the stack frame scope and thus make sure the programmer has been explicit about whether to release those automatically or has explicitly released them — or alternatively we could discuss whether the compiler should just release them implicitly by default without any explicitness. |
Original Author: @shelby3 Note the edit:
|
Original Author: @keean

@shelby3 I can find a few references to automatic memory management and GC vs ARC, but it seems not widespread. To avoid confusion we should probably say MS (mark-sweep) and RC (reference counting) are forms of AMM (automatic memory management), and avoid use of the term GC altogether.
You can always refactor so the life of the handle is the life of the resource. If the resource needs to be shorter lived you can use a Weak reference. If the resource needs to be longer lived you can store it in a global container (global array, hash map, singleton object etc).
There is no reason for them not to have the same scope. If I have access to the handle, I should be able to read from the file. If I don't need access any more, destroy the handle. There is no need to have a handle to a closed file.
No, because a strong reference must never be invalid, so you cannot close a strong reference to a file. A strong reference to a file must be RAII. Basically with a strong reference you must always be able to call read on the handle without an error. By definition, if you want to be able to explicitly close the handle, it is by definition a weak reference to a file. So with GC the close is explicit, the weakness of the handle is implicit. You have no choice over this (unless you want unsoundness).
To repeat the above, you should not use a strong reference to the resource with GC, because that would rely on finalizers to release the handle, and that can lead to resource starvation. It's not safe. Basically mark-sweep = weak resource handle.

Edit: Regarding C++, yes you are right you would swap a null there, but that's C++, which is not an ARC language. This would imply that "Weak" is defined:
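The code block that followed here was lost in the reconstruction. Purely as an illustration of the prose description (not keean's original definition), a `Weak` wrapper with a private slot, a validity test, and a `delete()` that runs the destructor and empties the slot might look like:

```rust
struct Weak<T> {
    value: Option<T>, // private: callers can never write a bare null here themselves
}

impl<T> Weak<T> {
    fn new(value: T) -> Self {
        Weak { value: Some(value) }
    }

    fn is_valid(&self) -> bool {
        self.value.is_some()
    }

    fn get(&self) -> Option<&T> {
        self.value.as_ref() // must be checked before use
    }

    fn delete(&mut self) {
        // Dropping the contained value runs its destructor; afterwards the slot is empty.
        self.value = None;
    }
}

fn main() {
    let mut handle = Weak::new(String::from("file handle stand-in"));
    if let Some(h) = handle.get() {
        println!("using {}", h);
    }
    handle.delete();
    assert!(!handle.is_valid()); // use-after-delete is caught by the check
}
```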
And therefore Weak would be a nullable reference to the strong file handle. However you would not be allowed to just write a null. Weak is an abstract datatype, so the value is private, and you would have to call weak.delete(), which calls the destructor on the object contained, and then replaces it with a null. |
Original Author: @keean
|
Original Author: @Ichoran

@shelby3 - Um, your supposed code counterexample isn't a counterexample. In fact, the disproof of it being a counterexample was in my Rust example! Here it is again: … Here is the code, in case you want to see it here instead of running the runnable example there:

struct Str { string: String }
…

Rust drops at block closure, but it resolves lifetimes to last use when needed ("non-lexical lifetimes"), so it could do it automatically at the point where it says … . So your first example would magically just work. If not, the explicit drop with move semantics (so you can't use it again) would non-magically work.

Note that … also ties up resources forever. Because the blocking is secret (forgotten), it doesn't look like a problem. But … with automatic dropping of … you don't even have to know the secret. It just works. It's just correct. So … Here's another problem with … :

The problem? You've still got that file handle because you needed it live when you acquired some other resource. In contrast, RAII handles this flawlessly. Having to remember to deal with resources is a pain. If the compiler can figure out that they need to be closed, they should just plain be closed. If you need help understanding why your life is so easy now, then an IDE could help you out. |
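Ichoran's full Rust code was lost in the reconstruction; as a stand-in only, here is a minimal sketch of the point being made, i.e. with RAII the destructor runs at block closure, and an explicit `drop` with move semantics releases the resource earlier while the compiler rejects any later use. The field contents are invented; only the `Str { string: String }` shape comes from the comment above.

```rust
struct Str {
    string: String, // stand-in for a resource such as a file handle
}

impl Drop for Str {
    fn drop(&mut self) {
        println!("releasing {}", self.string);
    }
}

fn main() {
    let s = Str { string: String::from("some resource") };
    println!("using {}", s.string);
    drop(s); // explicit early release with move semantics
    // println!("{}", s.string); // compile error: `s` was moved, so it cannot be used again

    let t = Str { string: String::from("another resource") };
    println!("using {}", t.string);
} // <- `t` is released here automatically, at block closure
```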
Original Author: @thickAsABrick My apologies for asking: is there something I can do at my end to stop these emails? I tried unsubscribe, but it does not seem to be helping? |
Huge thanks to @NodixBlockchain for coding and running the script to recover this thread (appending to what I had manually recovered from my backup) which was accidentally deleted by @keean. Note the thread has been continued in a new issue, so I am going to close this thread so that new comments are not added: |
Thanks @NodixBlockchain |
Original Author: @shelby3
Original URL: https://github.com/keean/zenscript/issues/35#issue-243345358
Original Date: July 17, 2017
… `shared_ptr` as the default over GC). … `Rc<T>`.

Afaics, there are some marvelous simplifying implications of this axiom:

Presuming all functions adhere except where they declare references to be GC, afaics the compiler can automatically implement RAII (in `try-finally` blocks² with an equivalent `finally` concept for `Promise` callbacks) for deterministic memory deallocation (and automatically deterministically destruct instances of a type that have a destructor¹, which is a major feature lacking from GC languages, the `with` of Python being inferior because it is manual and the object isn’t deallocated deterministically; noting that such types can not be used with GC references directly or indirectly as a member of a data structure that is GC referenced).

Afaics, the same code will be sound and applicable whether the data structures are implemented by the compiler unboxed without GC or boxed with GC. In other words, other than the aforementioned explicit GC references for algorithmic convenience and performance, the programmer never needs to indicate whether the data structures should be boxed or unboxed³. I want to strip away the noise/complexity of `Ref<>`, `Box<>`, etc. For read-only inputs (i.e. the unannotated default) the programmer need not even think about whether the inputs are cloned-by-value (i.e. primitive type) or passed-by-reference; and cloned-by-value type inputs should never be allowed to be annotated as mutable on the function signature (although they can be mutated inside the function body unless they are explicitly annotated read-only; the point being that the programmer never needs to think about whether the type is primitive or not, and can see if there is a mutability implication by looking at the function signature). Although I presume an optimizing compiler will always strive for unboxed (because this will be more performant), if we start writing code in say TypeScript we can strive to adhere to this pattern (or have our compiler generate TypeScript) without all the unboxing mess if we so desire.

Goodbye to Rust’s lifetime parameters and 3 kinds of pointers noise/tsuris/complexity; and due to the proposed single-threaded concurrency event model, goodbye also to Rust’s mutability borrowing restrictions/complexity/tsuris.

When thinking about interfacing with ASM.js (or WebAssembly aka WASM) with minimum (or no) marshaling overhead, then since (which IMO is a major paradigmatic flaw) there is no way yet to have those interact directly with GC references (ostensibly because, since there is no static typing, JavaScript thinks these are unbounded polymorphic, i.e. `Any` aka `Object`), we can (until some future where we write our own VM) favor unboxed and `malloc`ing these via Emscripten (or similar transpilation to ASM.js/WASM of our own C-like language features).

¹ But not cloning resources such as file handles.

² And Rust’s error handling has no `try` (`panic!` doesn't `throw` an error up the stack call hierarchy) and instead forces all exceptions to be bubbled up as `Result<T, E>` return values, which is excessive boilerplate noise/tsuris.

³ Even arrays of data structures that have variable-sized members such as `String` can be cleverly implemented (by a library or the compiler) to use indices (byte offsets) into a second array which contains the variable-sized values. This can even work with mutable variable-size values, by creating a new value at the end of the second array, growing the array as necessary, and/or implementing a `malloc` and `free` on the second array, i.e. this second array heap is freed from the parent heap as a monolith. Note if the second array is encapsulated so it can never be referenced directly, then it can be repositioned in virtual address space as needed to minimize virtual address space fragmentation. Monomorphisation could make such libraries more efficient.

Note bounded polymorphism never absolutely requires boxing. Unbounded polymorphism would, but we are refusing to support `Any`, i.e. all unbounded polymorphism in modules must be bounded when consumed by compiled/linked code.

Subsequent discussion in this thread will explain further the rough sketch so that readers can gain clarity. Let’s see what caveats and flaws are revealed via discussion. I look forward to @keean’s feedback and perhaps he will explain to what degree it corresponds to some features of the Ada PL.
I do not claim any of these ideas are novel inventions. @keean has alluded to for example cloning as an alternative to lifetimes and/or GC. What I am doing above is making a more holistic presentation of various ideas that have been floating around.
Of course typeclasses or ML functor/modules on top of this for HLL abstractions. If typeclasses, there is an option to go with implicit function arguments instead of `requires`, which makes it more readily implementable in TypeScript.

@shelby wrote:
@shelby3 wrote:
EDIT: on the coding for readability point, “Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.”
@shelby3 wrote:
@shelby3 wrote:
@shelby3 wrote:
@keean wrote:
@keean wrote:
@shelby3 wrote:
@keean replied: