Macros and DSLs #31
I don't want macros :-) I don't want a second weaker version of application in a language. What you are basically saying is that the data format for expressing objects combined with the AST is the language syntax. I am not sure I want this - as the AST currently features things like application nodes which you would not want in the language syntax. I think homoiconicity is over-rated, and I would rather have a nice user friendly syntax without it compromising the AST, and vice versa. |
Many programmers want them. Also DSLs.
Huh?
Me too. I didn't see any flaw in my proposal for macros which would prevent that. Perhaps you misread the OP. |
Macros are bad, and are included in languages because normal function application is not powerful enough. If you want a function, write a function. |
Making things too powerful is often worse than a proper separation-of-concerns. I doubt you can achieve the same things that macros and DSLs can achieve without making the type system so powerful and obtusely convoluted (and with mind bending and numbing type inference) that nobody can understand the code any more, not even the person who wrote it. K.I.S.S.. |
Macros provide an escape hatch to bypass the type system (in some languages), which is a bad thing. Messing about with the program code itself using macro splicing is a bad idea; just write a function to do the thing in the first place. With generics you should never need a macro anyway. |
Macros also make debugging harder because they are applied at compile time. Any errors reported at runtime will be in the post-macro-processed code, so the line numbers and context cannot be found in the program source. I have yet to see one compelling use of macros in the last 20 years (and this comes from someone who wrote their own macro-assembler and used to think that macros were all you needed to build a compiler and a high-level language). |
Maybe you can persuade me, if you can provide an example of a macro that cannot be more cleanly implemented in another way? |
Macros enable editing the source code of the arguments before they are evaluated. Functional lazy evaluation of those arguments will not enable the transformation of their source code. Macros can thus perform automated operations based on the structure of the source code within their arguments, not just the evaluated value of the arguments. That is turtles all the way down, recursively. For example, suppose we have a list of functions we want to call with different numbers of arguments, they each have different types, and we want to add a new common argument to the end of each of these calls and collect the results in a list. It is much easier to call a macro to do that, as it doesn't need to fight with the type system: the type checking is performed after the macro has done its code transformation. This example could also be done with partial application without a macro, except the typing can still get unwieldy, because those functions might have different type signatures for the remaining default arguments, or we have to manually specify the default arguments. Macros are basically a way the programmer can rescue himself from the fact that language development moves too slowly and can't always do everything that you and I could imagine it might do one day in the distant future. Macros are basically useful anywhere you need to do transpiling, so you don't have to write your own whole-program compiler. And there have been many times I wanted to transform code. Also, much of the functionality needed for macros is needed for evaluating and running code sent over the wire, i.e. dynamic code compilation and modularity. We can't expect to link everything statically in this era of the Internet. DSLs enable friendlier syntax such as SQL queries that are checked grammatically by the compiler. Edit: a possible new idea for the utility of AST macros: #11 (comment) |
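[To make the distinction concrete, here is a minimal Template Haskell sketch, my illustration rather than code from this thread, of something no ordinary function can do: observe the source of its argument as well as its value. The name `debugShow` is hypothetical.]

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- A function only ever sees the evaluated value of its argument; a macro
-- also sees the argument's AST. debugShow pairs the pretty-printed source
-- of an expression with the value it evaluates to.
debugShow :: Q Exp -> Q Exp
debugShow qe = do
  e <- qe
  let src = pprint e            -- the argument's source, known at compile time
  [| (src, $(return e)) |]

-- Usage (in a different module, as TH requires):
--   $(debugShow [| 1 + 2 |])   -- evaluates to roughly ("1 + 2", 3)
```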
I know what macros are; I am saying there is always a better way to do the same thing, without source-to-source rewriting going on in the program code, which obfuscates errors and results in another case of write-only code. |
Someday. But someday can be too many years (even decades) too late. Refresh. |
I have not written a macro for about 20 years, so that someday was decades ago :-) |
I don't limit my analysis to only your anecdotal limited needs. I'm interested in mine and those of others. I think DSLs are the most important thing that can't be done reasonably without transforming code. I explained how to enable the "macros" (parser + compiler transformation of code) for those, based on detecting the type of the expression. Show me how to write a SQL query DSL (without any noise in the syntax that isn't in normal SQL syntax!) that is checked grammatically at compile (evaluation) time, and that can be done without code transformation. Scala can sort of finagle it by using infix function names as keywords, but that is not compile-time parsing of SQL-specific syntax issues. |
You do not need macros to create DSLs.
Sure I did it in Haskell back in 2004 :-) |
I am not interested in your obtuse HList gobbledygook. A DSL should be elegant and easy to comprehend and work correctly. I asked you to show me how to do this, not asking you to brag. |
What's obtuse about this:

```haskell
-- Move every animal of the given type that has the given contamination
-- to a new farm location.
moveContaminatedAnimal :: DAnimalType -> DCntdType -> DFarmId -> Query ()
moveContaminatedAnimal animalType contaminationType newLocation = do
  a1 <- table animalTable                                          -- all animals
  a2 <- restrict a1 (\r -> r!AnimalType `SQL.eq` animalType)       -- of this type
  c1 <- table contaminatedTable                                    -- all contamination records
  c2 <- restrict c1 (\r -> r!CntdType `SQL.eq` contaminationType)  -- of this kind
  j1 <- join SqlInnerJoin a2 c2 (\r -> r!AnimalId `SQL.eq` r!CntdAnimal)
  j2 <- project j1 (CntdAnimal .*. HNil)                           -- keep only the animal id
  doUpdate animalTable
    (\_ -> AnimalLocation .=. toSqlType newLocation .*. HNil)      -- set the new location
    (\r -> r!AnimalId `SQL.elem` relation j2)                      -- for the joined ids
```
|
That doesn't look like unadulterated SQL to me! And you are filling up my thread with (obtuse Haskell and braggart) noise! You have no chance in hell of creating anything popular if you think that gobbledygook is more readable than SQL syntax for the person who wants to express SQL. |
That's relational algebra :-) Here's an example, taken from here: https://www.cs.rochester.edu/~nelson/courses/csc_173/relations/algebra.html
C# LINQ seems pretty popular and it's based on exactly the same ideas. |
It's not elegant. It's gobbledygook. |
It's very elegant. Relational algebra avoids several problems that SQL has as a language. SQL is not a proper algebra; it's got a lot of ad-hoc elements that are pretty bad. |
It's not SQL. I asked you to show me how to do unadulterated SQL without a code transformation. |
It's better than SQL. Why would I recreate a flawed query model along with all its problems? I took the opportunity to do something better. In any case you are not going to be able to parse SQL syntax in the host language with macros either, because you cannot represent SQL in the language's AST. |
Whether SQL is good or not is not germane to the point of this thread, which is whether it is possible to model the unadulterated syntax of another language without a code transformation. Yes or no?
Incorrect. Re-read the OP. |
No, and it's not possible to model it with code transformation either. |
Incorrect. |
prove it :-) |
Re-read the OP. It is explained there. |
That's not a proof, I want code :-) It's not possible without custom parsers that can be implemented for new types of literals. |
Which is exactly what I wrote in the OP. |
Back on examples where macros can't be done with functions: there is no way to write a function that will execute its argument expressions in the lexical scope of the caller and provide the names of the variables as a separate argument. There is no way to form that functional closure with the replacement lexical binding without a code transformation. Paul Graham's anaphoric macro eliminates the boilerplate of lambdas; it couldn't be done without boilerplate nor macros in Scala. Another example is when you want to declare numerous types and/or functions (in the lexical scope of the caller) based on some pattern (that would be an absolute pita to manually enter from the keyboard). A function can't declare new types and functions. You can only do this with code generation. There are instances where only code transformation will suffice. |
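[As a concrete illustration of the "declare numerous functions from a pattern" case, here is a minimal Template Haskell sketch of mine, with hypothetical names; a plain function could never introduce these top-level bindings.]

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

-- Generates the declarations x1 = 1, x2 = 2, ..., xn = n at compile time.
mkConsts :: Int -> Q [Dec]
mkConsts n = concat <$> mapM mk [1 .. n]
  where
    mk :: Int -> Q [Dec]
    mk i = do
      body <- [| i |]    -- lift the integer i into an expression AST
      pure [ValD (VarP (mkName ("x" ++ show i))) (NormalB body) []]

-- Usage (at the top level of another module):
--   $(mkConsts 3)
--   total = x1 + x2 + x3   -- 6
```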
And why would you ever want to do that?
Which is again the kind of trick I don't want to allow. If I wanted that kind of thing I could provide first-class environments, but it just makes code harder to understand for no real reason. What algorithm do you need the above to implement? Does it help me implement a sort function?
Needing to do this is a symptom of a lack of sufficient abstractive power in the language itself. The 'numerous functions' should be a single generic function. Generics remove the need for macros in the language by providing a type-safe way to abstract code over parameters. |
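[For comparison with the code-generation sketch above, a sketch of this claim in Haskell: one generic, typeclass-constrained function replaces a family of monomorphic ones, with no code generation involved.]

```haskell
-- One polymorphic function instead of sumInts, sumDoubles, sumRationals, ...
sumAll :: Num a => [a] -> a
sumAll = foldr (+) 0

main :: IO ()
main = do
  print (sumAll [1, 2, 3 :: Int])        -- 6
  print (sumAll [0.5, 1.5 :: Double])    -- 2.0
```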
Tikhon Jelvis opines that Haskell would benefit from macros for syntax sugar to clean up some of the syntactic noise:
Any reactions? |
Personally I would find that a terrible idea. I like that once I have learned the syntax and semantics I understand the language and can read anyone's code. Macros would take that away. There are other reasons, but I think this is such an important one we don't need to go into the others. Haskell has a great system for generics using type-classes, and even allows operator overloading. You can create your own DSL using the built-in syntax, so why would you want macros, apart from to make it not look like Haskell, which would be bad. |
If you have optional laziness combined with inlining you can cover a wide range of use cases for macros.
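[A small sketch of that point: in a lazy language, control constructs that would require macros elsewhere are ordinary functions.]

```haskell
-- Laziness lets user code define its own control flow; neither branch is
-- evaluated until myIf chooses one, so no macro is required.
myIf :: Bool -> a -> a -> a
myIf True  t _ = t
myIf False _ e = e

main :: IO ()
main = putStrLn (myIf (1 < 2) "then branch" (error "never forced"))
```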
Wrong, and I never understood the argument for "writing code in any order". There is always an order in code evaluation; it depends on what you want to have evaluated first. |
@sighoya Order matters; it's why you cannot write programs (apart from proofs) in Prolog without 'cut'. In fact, where it really shows is that some programs work in Prolog without 'cut' because they are logically correct, but will not terminate in your lifetime without it :-) |
Well guys, I argued in the Why Not Haskell? thread that we want to code as much as possible so that order does not matter, e.g. the Actor paradigm. Yes, order matters sometimes, but when it does not need to matter, then we shouldn't fail to take advantage of that increased degrees-of-freedom. @sighoya why did you thread-jack? This thread is not about purity. We have a thread for that. You should have posted your comment in the correct thread. Sorry, my German perfectionism can't handle this chaos. Let's just start writing randomly in any thread, okay? 😵 I think those who are sloppy in their communications will also write sloppy code. |
I wrote the following in an email to @analyticbastard
https://clojure.org/guides/faq#why_keywords
Also: https://m.oursky.com/why-i-chose-clojure-over-javascript-24f045daab7e |
@shelby3 if you want to see how much help types are, you only have to compare programming JavaScript with programming TypeScript. In production code TypeScript is much more reliable, especially if you use strict null checking. I also think a complete separation between code and data is a good idea. Data should not be executable, and code should be static and unchanging. In this way the correctness of code can be verified before delivery to the user. I think the kind of dynamic LISP-like environments that mix code and data are great where the programmer is also the end user. I think macros, staged compilation and even code generators are not good ideas in general. I think syntax should have a consistent meaning, and datatype-generic programming allows similar conciseness without the same problems. I do like the idea of allowing operator overloading and custom operators, but I think it's important that operator semantics are consistent (i.e. predictable by the programmer without consulting the implementation). I think code should be readable in isolation; that means I should be able to read and understand the code in a module without looking at other modules, just by understanding the syntax of the language. This does mean applying axioms sometimes, for example when overloading '+' we should be able to check it's associative and commutative. (Should floating point addition use the '+' operator, as it's not always associative?) |
Hickey points out that performance optimization may be the most important use of static typing.
I will reserve judgement until I gain more experience. But Hickey is correct that eventually at some level of complexity typing runs out of usable runway. I may find that I need the additional flexibility and power that would otherwise be intractable with typing. I think perhaps you are being too dogmatic. Don’t throw out the baby with the bath water. But again I will wait until I have real world use cases from experience to discuss with you. Until then, I dunno.
Yes strive for that but don’t even attempt to prove everything and don’t require a global, total order because remember I already debated you in the Subtyping thread #8 that universal, total orders do not exist. I will not repeat that mathematical and philosophical debate again. You simply can’t hold the world fully ordered in the palm of your hand Keean. Don’t let the impossible quest for perfection be the enemy of good, and accept it doesn’t entirely fit into your ideals. We can’t reconcile such a rigid idealism, and still flex, with unbounded nature. Note I do appreciate the points you have made in this thread and others. I learned a lot. Thanks.
Hickey’s valid point is that types are only really valuable for very low-level, low-value "typos" level of checks. The orders-of-magnitude more important semantics can’t be checked with types without abusing types so much that programming becomes refactoring hell and rigor(ous) mortis. Unit tests are much more important. |
I am not sure that everything runs out of runway. A Turing machine can execute any computable program. The end of the runway is really the limit of computability; if you want something more, you need a quantum computer. So if the limit of computability is fixed, what is left is expressive power. Generics have coped with the needs of mathematics for thousands of years (whether mathematicians understood that or not is a separate question). Stepanov's "From Mathematics to Generic Programming" presents programming as part of mathematics' thousands of years of history. What more flexibility and power could you need than the whole of mathematics? Even an intractable example would not overthrow thousands of years of mathematical progress. That would be like creationists claiming one fossil could "dis-prove" evolution... it would not; it is more likely something is wrong with the fossil (mis-dated, a fake, etc.).

If universal total orders do not exist, then what is the periodic table? Here we have a rigid ideal that not only allows us to "flex" with nature, it lets us find new elements; it lets us discover things we would not be able to do without it. In any case, nobody is talking about being able to prove everything. Some things are unprovable within any system (see Gödel). However, operator overloading can lead to cryptic programs if '+' can stand for any binary function. Of course people can write cryptic programs using any function, so 'add' could divide for example, so we can't stop every obfuscation. Instead I suggest we focus on stopping the ones that are easy to stop, and in that way we reduce the cognitive load on the programmer, allowing them to focus on the truly complex problems.

We can see this cognitive-load principle with JavaScript vs TypeScript. When I code in JS, I often find it takes a long time to get all the typos out of the code. After getting the program working, I am never totally sure it's correct in all circumstances, and yes, it needs thorough testing, but this means running the program and making sure we get 100% code coverage in tests. This takes a long time, and just the process of doing this takes up time that could be used validating higher-level properties. With TS the program would not even compile for a lot of typos, saving a huge amount of time going round the develop/test loop. So even if types only prevented typos (they don't) they would still be worth it.

An example where types help semantically: the other day I was trying to debug a graphics rendering problem. It took a long time to find the cause, which was not obvious from the symptoms. It turns out that for one particular type of render object a Uint8Array was being passed instead of an ArrayBuffer. Without typing this would have been impossible to spot; I have written similar-sized JS programs, and after 3 or 4 years of continuous development you just have no idea what types are being passed around. Here is a case where testing found the problem, but it was no help in solving the problem. You simply have test 'X' fails. You look at the bitmap comparison, and yes, the rendered image is distorted compared to the reference.

Test-driven development is a great idea when you know what it is you are developing. When writing a compiler, tests are where I would start. A lot of software is more experimental though, and you can't write the tests before you know what you are testing. But my point above is that the develop/build/test cycle slows down development.
It's a lot quicker if the compiler points out the error than if you have to build and run the whole test suite. It's even quicker if your IDE points out the error. For example, I have started using the eslint "compat" plugin, which lets you set target browser versions and will flag any functions that are not supported in any of the selected browsers and versions; this saves a lot of cross-browser testing time. Of course you still do a full test across a range of platforms when releasing software, but you don't want this burden in your main development loop. Running a complete set of tests across different platforms can take days for large software projects, especially when looking for resource starvation bugs like memory leaks. So when the IDE tells you the type error as you code, you can save a lot of time. Types also facilitate independent testing of modules because they specify the interface precisely, which allows modules to be tested independently, again resulting in an exponential (in number of modules) speed-up in testing. In summary, I find developing without types tediously slow, because you have to keep interrupting coding to run tests, which takes longer and longer as the software grows. |
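[The render-buffer anecdote above translates directly into types. Here is a hedged Haskell sketch with hypothetical names, where the mix-up becomes a compile-time error instead of a distorted bitmap found by a test.]

```haskell
import Data.Word (Word8)

-- Two representations that are easy to confuse dynamically; nominally
-- distinct types make passing the wrong one impossible to compile.
newtype RawBuffer = RawBuffer [Word8]
newtype ByteView  = ByteView  [Word8]

render :: RawBuffer -> String
render (RawBuffer bs) = "rendering " ++ show (length bs) ++ " bytes"

main :: IO ()
main = putStrLn (render (RawBuffer [1, 2, 3]))
-- main = putStrLn (render (ByteView [1, 2, 3]))  -- rejected by the compiler
```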
Think about the intractability of composing Monads with Monad transformers. Then explode your brain by imagining needing to abstract that to composing Monad transformers. I hope you understand the runway is indeed limited. And mathematics is highly limited. As I had argued in the Subtyping thread even the proofs are becoming computational and thus the Halting problem has now invaded mathematics.
I am not going to reopen that discussion/debate we engaged in the Subtyping thread. At least not now. Too busy. The periodic table is not a total order; it's aliasing error at best (one myopic attempt to summarize nature). Again, I will not reply if you attempt to rebut that. That would be a thread-jack anyway.
I did not claim typing is useless. I claimed we should not abuse typing, because otherwise we will end up in (refactoring, modularity and composability) rigor(ous) mortis. Examples of abusing typing are probably Monad compositions and perhaps even GADTs. It eventually becomes an unwieldy hairball and, if pushed far enough, a full-blown dumpster fire. I am currently enthusiastic about typeclasses, and I think types may be important for certain performance optimizations. I do not disagree with your point that types also add some information. But I also agree with Hickey that types often say very little about the semantics, if anything at all.
Somehow you must have missed my statement that typing can’t check all semantic errors. I was speaking about what typing can’t realistically do and thus testing is the ONLY way to catch those errors anyway.
Semantic errors can leak across typed APIs and still type check. |
@shelby3 so I watched the whole clojure video and I liked a lot of what he said. I don't agree with him on types, and think he is missing something by only thinking about values. In Stepanov's introduction to "Elements of Programming" he talks about entities (values), species, and genus. Clojure is only operating at the level of the first of these. I agree with a lot of things in that video, and I agree that we need to consider the "wire" protocol, and that "classes" don't work. If all he had said was that types (as in C++) don't work, then I could agree, but he also says algebraic datatypes don't work (as in Haskell), yet goes on to claim that being an algebra (composition etc.) is important. There's a lot of good stuff in that video, and the Epochal timeline is exactly what I was proposing about using state-machines. Taking Stepanov's model, I would argue there is a lot of value in adding Species (types) and Genus (typeclasses) to a language. The comment that (int,int,float,string) is semantically meaningless is true, but I see it more as an attack on product types vs Maps than on types themselves. I would see this as more a strong argument for nominal types vs structural types. But I would also suggest that using species/types correctly, a product can contain information; for example (Dog, Sheep): it's important to know which is the sheep and which the sheep-dog, even if we don't know anything about that particular set of sheep and dog. Further, Genus also provides semantic information. I think he's completely correct, if you look at types from a certain point of view, but I think it completely missed the point about Entity, Species and Genus. |
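[A sketch of the (Dog, Sheep) point, assuming nothing beyond standard Haskell: even though both are just strings underneath, the nominal types keep the positions of the product meaningful.]

```haskell
newtype Dog   = Dog   String
newtype Sheep = Sheep String

-- The product (Dog, Sheep) carries information a bare (String, String)
-- does not: the components cannot be swapped by accident.
herds :: (Dog, Sheep) -> String
herds (Dog d, Sheep s) = d ++ " herds " ++ s

main :: IO ()
main = putStrLn (herds (Dog "Rex", Sheep "Dolly"))
```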
I totally agree, been there, done that, I don't think it's the right way. Note that Monads are only important when you approach combining functional purity with static types in a certain way. "Clean" is a statically typed pure functional language with no 'Monads'... Algebraic effects seem a better approach that is composable.
I am not saying you don't need to test, I am saying that catching as many errors as possible, as early as possible, saves you time and makes you more productive. In my experience programming in TypeScript is a lot more enjoyable, productive and results in a lot less errors in shipped code than programming in JavaScript (because no testing is perfect), and here you have a good comparison because the value level syntax and semantics of the languages are the same. Also note that "spec" is practically a type system: https://clojure.org/guides/spec Which is designed for specifying things like "wire protocols", but also note that in the video he says (paraphrased) we should write the internals of our applications like wire protocols. So we should be using "spec" to internally define APIs and interfaces, which is really an ad-hoc type system by another name. To me it seems like "type systems are bad" has become a box he can't break out of, and rather than admit he was wrong, he's going to great lengths to make "spec" not a type system... Or maybe he just has a limited view of what a type system is. By the way I agree with him about academics :-) databases are important, wire protocols are important. I would add that specifying interfaces both wire and internal are important (especially with multi-person dev teams, working with libraries, even geographically distributed teams). |
[…]
That is what I wrote in my first post about this yesterday.
Yeah but the “we can type everything” purists have the analogous dilemma from the other side of the coin. |
@shelby3 Also I think libraries are a big problem, and by his own admission unsolved by clojure. For me types and generics are an important part of the library problem because they allow interfaces to be specified (and a library interface should be treated just like a wire protocol). Generics solve his problem of having a string at one point and a string somewhere else, without all the intervening code needing to know about the string, and needing to be refactored if the string changes type. By generics here I mean the combination of types (Species) and typeclasses (Genus). |
I agree with that, but Hickey also made the point that even composable typing can become unwieldy to compose, requiring refactoring, etc. I think when we were discussing how to modularize typeclassing so we can multifurcate the abstract algebra, we realized that there could be the Cambrian explosion that Hickey implicitly alludes to. So I can understand that to be, philosophically, why he may be trying to avoid adding strong typing. |
I find typing helps with refactoring, because when I change the type, the type checker finds all the code I need to change so I don't miss any. I find my refactorings in TypeScript are a lot easier and quicker than refactoring the same JavaScript. I think this is slightly missing the point, as he probably means refactorings are a symptom of the problem, and I agree, but generics with typeclasses and row-polymorphism are one answer to this. I agree with his points on composition, but I find they are not actually opposed to types, but more opposed to structural types. For one, I would strongly be in favour of having relations as a built-in type, along with relational algebra. I think this answers a lot of his points about composability without reducing everything to a typeless map. His points about composing data: that's a relational Join; and subsets: that's a relational Restrict. Relations are typed, and yet maintain all the properties of Maps that he values, as far as I can tell. His problem with this would be needing to know the complete types of relations, but again generics and row polymorphism come to the rescue. I can specify a function that inputs a relation that contains a Name type, but I don't care about the rest and it just gets passed along. |
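[A minimal sketch of that last point, using a typeclass to approximate row polymorphism; the names are mine. The function demands only that the record contains a Name and ignores whatever else it carries.]

```haskell
-- "HasName" stands in for "this relation contains a Name column".
class HasName r where
  name :: r -> String

data Employee = Employee { empName :: String, empDept :: String }

instance HasName Employee where
  name = empName

-- greet works for any record with a Name, whatever other columns it has.
greet :: HasName r => r -> String
greet r = "hello, " ++ name r

main :: IO ()
main = putStrLn (greet (Employee "Ada" "Engineering"))
```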
I believe that is a facet of his point. That you actually have to change them all. That’s the Cambrian explosion. Type inference should not be employed for APIs.
I believe that is another facet of his point. To compose is analogous to Robert Harper’s criticism of the monolithic abstract algebra for Haskell’s typeclasses: https://existentialtype.wordpress.com/2011/04/16/modules-matter-most/ “As a consequence, using type classes is, in Greg Morrisett’s term, like steering the Queen Mary: you have to get this hulking mass pointed in the right direction so that the inference mechanism resolves things the way you want it to.” Hickey would prefer we can just pretend the APIs compose fully and use testing to make sure we did not violate any of our assumptions about facets of the API our callers don’t actually engage.
I agree that structural compared to nominal typing increases the lack of specificity which widens the Cambrian explosion or mass of the hulking typing morass. I think I am inclined to be clever about how I modularize and structure APIs to minimize the issue. I do think he has a point that at some level we punt to untyped semantics and employ more testing. |
@shelby3 I prefer type inference. If you can write code that looks untyped, without the extra clutter, but it catches some problems, then this is only a benefit. Then at the module/interface/wire-protocol level the type system forms a specification. If the wire protocol spec and the type system are the same, then you get rid of any impedance matching required, and therefore the boilerplate types/code. |
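[A sketch of that trade-off in Haskell: the definition reads as if untyped, yet inference still rejects the bad call at compile time.]

```haskell
-- No annotation written, yet the compiler infers double :: Num a => a -> a
double x = x + x

main :: IO ()
main = print (double (21 :: Int))   -- 42
-- bad = double "oops"              -- rejected by inference, no test run needed
```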
@keean I think you are just entirely missing the point that requiring everything to have a type, whether it is inferred or not, can become too heavy to lift. [Inserted edit: I am sympathetic to Hickey’s point, although I fall more on the side of @keean’s preference for strong typing and at least typeclasses. Where I may differ from @keean is that I want to punt on trying to type the semantics; I want to draw the line somewhere in terms of how much typing tsuris I would tolerate. Which is why I don’t like Rust, because I think the benefits of manual lifetime typing don’t outweigh the tsuris and the overly complex, brittle type system.] You think you have a typing solution for everything, e.g. row polymorphism, typed relational algebra, etc., but that belies Hickey’s point. As you well know, the more features you add to a typing system, the more byzantine it becomes, due to the myriad ways that type features interact. I think he has a very strong point. Perfection is the enemy of good. Since we need to write unit tests anyway, typing at the level of abuse that you seem to be advocating is just more pita than it is actually beneficial to any great extent. Finding the right balance is what I am trying to figure out. Whereas you seem determined to remain a type-system zealot, aka purist. I say actually try to build your Ada-on-steroids and see how it works out in reality. |
@keean please correct me if I am mistaken as follows. And please feel free to discuss this. I am trying to determine whether Clojure offers a viable option for my need to write code both for JavaScript clients and performant, highly scalable servers. Currently I am thinking the optimal solution for my needs would be Scala 3 compiled to ScalaJS and perhaps to Go if I create a Scala compiler plugin for transpiling (a subset of Scala that I use) to Go.
I replied on Quora:
|
Clojure developers mostly use persistent data structures with log2 performance. Ref is only for critical sections of mutable transitions, or you can just leverage Java for it. Some points about Clojure:
|
@sighoya good to see that you’re still alive. I was worried when you went silent. I hope we can both find solutions for our health. I replied to your email about that.
[…]
Edward Kmett’s 2009 post is how I first became aware that immutable data structures have a logarithmic performance penalty for algorithms that require random-access writes.
Strong typing is an ill fit for Lisp.
I criticized that:
|
A detailed follow-up discussion about the tradeoffs with static typing. And a new idea that requires static typing for achieving its posited benefits. |
A homoiconic programming language can parse itself to an AST expressed as native data that can be transformed within the language, evaluated, or isomorphically serialized to human readable syntax.
I don’t think the sacrifices in syntactical and grammatical clarity caused by reduction to a minimum consistent grammar, such as the hierarchical lists-of-lists of s-expressions, are necessary to achieve the homoiconicity which enables macros and DSLs. Although Lisp's low-level semantics distills to Seven Primitive Operators, which makes its AST `eval` simple to implement, the complexity of the higher-level semantics which can be expressed (and thus the potential semantic complexity of a macro's semantics) is not limited any more so than for any Turing-complete language.

Requirements:
For macros we’d only need #1 - 3, because macros take some or all of their arguments as AST references. We wouldn’t want macros to depend on wide-scale inference, as this could result in live or dead locks on inference decidability. K.I.S.S..

For DSLs such as a SQL query language, we need to select a customized parser. If we know the data type of the expression being parsed is, for example, `SQL.Query`, and if that type has an associated customized parser and (separately) compiler, then that parser is run to create the customized AST of the said expression (and if it is not a macro, then the associated compiler is run). There are cases where the expression determines its own type (i.e. the expression has side-effects and it is not assigned, or the type of the reference it is assigned to is inferred from the expression's type), so the current parser is presumed. This strategy would not work with any wide-scale type inference, which is yet another reason I am against non-localized type inference.

A customized parser and compiler could invoke another parser and compiler for an expression within its grammar (perhaps using a similar strategy of selecting by target data type, or by explicit delimiters, e.g. the use of curly braces in React JSX), and this could even be recursive.
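[A toy model of the type-directed parser selection described above; all names are hypothetical and the "ASTs" are just strings. The expected type of the expression chooses which parser runs, so SQL syntax errors surface at compile time.]

```haskell
-- The expected type of an expression selects the parser for its source text.
data Expected = HostExpr | SqlQuery

parseExpr :: Expected -> String -> Either String String
parseExpr HostExpr src = Right ("host-ast(" ++ src ++ ")")
parseExpr SqlQuery src
  | take 7 src == "SELECT " = Right ("sql-ast(" ++ src ++ ")")
  | otherwise               = Left ("compile-time SQL syntax error in: " ++ src)

main :: IO ()
main = do
  print (parseExpr SqlQuery "SELECT name FROM animals")  -- Right (sql-ast ...)
  print (parseExpr SqlQuery "SELEKT name")               -- Left (syntax error)
```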