
Macros and DSLs #31

Open · shelby3 opened this issue Mar 15, 2017 · 97 comments

Comments

@shelby3

shelby3 commented Mar 15, 2017

A homoiconic programming language can parse itself to an AST expressed as native data that can be transformed within the language, evaluated, or isomorphically serialized to human readable syntax.

I don’t think the sacrifice in syntactical and grammatical clarity caused by reduction to a minimal consistent grammar, such as the hierarchical lists-of-lists of s-expressions, is necessary to achieve the homoiconicity which enables macros and DSLs. Although Lisp's low-level semantics distills to Seven Primitive Operators, which makes its AST eval simple to implement, the complexity of the higher-level semantics it can express (and thus the potential semantic complexity of a macro) is no more limited than for any other Turing-complete language.

Requirements:

  1. Self-hosted compiler.
  2. Bracketed quotes.
  3. Type inference localized to expressions.
  4. Everything-as-an-expression syntax so that we can select the parser based on the data type the expression is assigned to.
  5. Incremental AST compilation whilst parsing so the compiler knows the data type for #4.

For macros we’d only need #1–3, because macros take some or all of their arguments as AST references. We wouldn’t want macros to depend on wide-scale inference, as this could result in livelocks or deadlocks on inference decidability. K.I.S.S. For DSLs such as a SQL query language, we need to select a customized parser. If we know the data type of the expression being parsed is, for example, SQL.Query, and that type has an associated customized parser and (separately) compiler, then that parser is run to create the customized AST of the said expression (and if it is not a macro, the associated compiler is run). There are cases where the expression determines its own type (i.e. the expression has side-effects and is not assigned, or the type of the reference it is assigned to is inferred from the expression's type), so the current parser is presumed. This strategy would not work with any wide-scale type inference, which is yet another reason I am against non-localized type inference.

A customized parser and compiler could invoke another parser and compiler for an expression within its grammar (perhaps using a similar strategy of selecting by target data type or by explicit delimiters, e.g. the use of curly braces in React JSX), and this could even be recursive.
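The type-directed parser selection described above can be sketched as follows. This is only a toy illustrating the dispatch idea; the names (`parsers`, `parseFor`, `SqlQuery`) and the registry-by-type-name mechanism are invented for illustration, not part of any existing implementation:

```typescript
// Toy sketch of type-directed parser selection. The registry maps a
// target type's name to a parser for that type; a compiler following
// the strategy above would consult it when it knows the data type the
// expression is assigned to. All names here are hypothetical.

type SqlQuery = { kind: "SqlQuery"; text: string };

const parsers: { [typeName: string]: (src: string) => unknown } = {
  // A real entry would hold a full SQL grammar; this toy only checks
  // for a leading SELECT keyword.
  SqlQuery: (src: string): SqlQuery | null =>
    src.startsWith("SELECT") ? { kind: "SqlQuery", text: src } : null,
};

function parseFor(typeName: string, src: string): unknown {
  const parse = parsers[typeName];
  return parse ? parse(src) : null;
}

console.log(parseFor("SqlQuery", "SELECT * FROM t")); // a SqlQuery node
console.log(parseFor("SqlQuery", "not sql"));         // null (no parse)
```

A nested DSL (as in the JSX example) would correspond to one registered parser recursively invoking `parseFor` for a sub-expression with a different target type.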

@keean
Owner

keean commented Mar 15, 2017

For macros we only need #1–3

I don't want macros :-) I don't want a second weaker version of application in a language.

What you are basically saying is that the data format for expressing objects combined with the AST is the language syntax. I am not sure I want this - as the AST currently features things like application nodes which you would not want in the language syntax. I think homoiconicity is over-rated, and I would rather have a nice user friendly syntax without it compromising the AST, and vice versa.

@shelby3
Author

shelby3 commented Mar 15, 2017

I don't want macros :-)

Many programmers want them. Also DSLs.

a second weaker version of application

Huh?

as the AST currently features things like application nodes

Huh?

I would rather have a nice user friendly syntax without it compromising the AST, and vice versa

Me too. I didn't see any flaw in my proposal for macros which would prevent that. Perhaps you misread the OP.

@keean
Owner

keean commented Mar 15, 2017

Macros are bad, and are included in languages because normal function application is not powerful enough. If you want a function, write a function.

@shelby3
Author

shelby3 commented Mar 15, 2017

Making things too powerful is often worse than a proper separation-of-concerns.

I doubt you can achieve the same things that macros and DSLs can achieve without making the type system so powerful and obtusely convoluted (and with mind bending and numbing type inference) that nobody can understand the code any more, not even the person who wrote it. K.I.S.S..

@keean
Owner

keean commented Mar 15, 2017

Macros provide an escape hatch to bypass the type system (in some languages); this is a bad thing. Messing about with the program code itself using macro splicing is a bad idea; just write a function to do the thing in the first place. With generics you should never need a macro anyway.

@keean
Owner

keean commented Mar 15, 2017

Macros also make debugging harder as they are applied at compile time. Any errors reported at runtime will be in the post macro processed code, so the line numbers and context cannot be found in the program source. I have yet to see one compelling use of macros in the last 20 years (and this comes from someone that wrote their own macro-assembler and used to think that macros were all you needed to build a compiler and a high level language).

@keean
Owner

keean commented Mar 15, 2017

Maybe you can persuade me, if you can provide an example of a macro that cannot be more cleanly implemented in another way?

@shelby3
Author

shelby3 commented Mar 15, 2017

Macros enable editing the source code of the arguments before they are evaluated. A functional lazy evaluation of those arguments will not enable the transformation of their source code. Macros can thus perform some automated operations based on the structure of the source code within its arguments, not just the evaluated value of the arguments. That is turtles all the way down recursively.

For example, suppose we have a list of functions we want to call with different numbers of arguments, they each have different types, and we want to add a new common argument to the end of each of these calls and collect the results in a list. It is much easier to call a macro to do that, as it doesn't need to fight the type system: type checking is performed after the macro has done its code transformation. This example could also be done with partial application without a macro, except the typing can still get unwieldy, because those functions might have different type signatures for the remaining default arguments, or we have to manually specify the default arguments. Macros are basically a way the programmer can rescue himself from the fact that language development moves too slowly and can't always do everything that you and I could imagine it might do one day in the distant future.
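A sketch of the partial-application alternative mentioned above, assuming each function is pre-applied to its differing arguments so that all share one residual type (all names here are invented for illustration):

```typescript
// Partial-application workaround for the macro example above:
// functions of different arities are partially applied until each has
// the same residual type (x: number) => number, so they can sit in one
// homogeneous list and all receive the common final argument.

const f1 = (x: number): number => x + 1;
const f2 = (s: string) => (x: number): number => s.length + x;
const f3 = (b: boolean) => (y: number) => (x: number): number =>
  b ? y + x : x;

// Apply the common argument to every residual function, collecting results.
function callAllWith(common: number, fs: Array<(x: number) => number>): number[] {
  return fs.map((f) => f(common));
}

console.log(callAllWith(10, [f1, f2("ab"), f3(true)(5)])); // [11, 12, 15]
```

The cost this paragraph describes is visible here: the caller must manually supply `"ab"` and `true`/`5` at each site, which a macro rewriting the call sites would automate.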

Macros are basically useful anywhere you need to do transpiling, so you don't have to write your own whole-program compiler. And there have been many times I wanted to transform code, e.g. to convert if as an expression to a lambda function compatible with TypeScript. If TypeScript had powerful macros, I might already be done with some of the features I wanted for my transpiler! Instead I will have to reinvent the wheel.

Also much of the functionality needed for macros is also needed for evaluating and running code sent over the wire, i.e. dynamic code compilation and modularity. We can't expect to link everything statically in this era of the Internet.

DSLs enable friendlier syntax such as SQL queries that are checked grammatically by the compiler.

Edit: a possible new idea for the utility of AST macros: #11 (comment)

@keean
Owner

keean commented Mar 15, 2017

Macros enable editing the source code of the arguments before they are evaluated.

I know what macros are. I am saying there is always a better way to do the same thing, without having source-to-source rewriting going on in the program code, which obfuscates errors and results in another case of write-only code.

@shelby3
Author

shelby3 commented Mar 15, 2017

there is always a better way to do the same thing

Someday. But someday can be too many years (even decades) too late. Refresh.

@keean
Owner

keean commented Mar 15, 2017

I have not written a macro for about 20 years, so that someday was decades ago :-)

@shelby3
Author

shelby3 commented Mar 15, 2017

I don't limit my analysis to only your anecdotal limited needs. I’m interested in mine and those of others.

I think DSLs are the most important thing that can't be done reasonably without transforming code. I explained how to enable the "macros" (parser + compiler transformation of code) for those, based on detecting the type of the expression.

Show me how to write a SQL query DSL (without any noise in the syntax that isn't in normal SQL syntax!) that is grammatically checked at compile (evaluation) time, and that can be done without code transformation. Scala can sort of finagle it by using infix function names as keywords, but that is not compile-time parsing of SQL-specific syntax issues.

@keean
Owner

keean commented Mar 15, 2017

You do not need macros to create DSLs.

Show me how to write a SQL query DSL

Sure I did it in Haskell back in 2004 :-)

@shelby3
Author

shelby3 commented Mar 15, 2017

Sure I did it in Haskell back in 2004 :-)

I am not interested in your obtuse HList gobbledygook.

A DSL should be elegant and easy to comprehend and work correctly. I asked you to show me how to do this, not asking you to brag.

@keean
Owner

keean commented Mar 15, 2017

What's obtuse about this:

moveContaminatedAnimal :: DAnimalType -> DCntdType -> DFarmId -> Query ()
moveContaminatedAnimal animalType contaminationType newLocation = do
    a1 <- table animalTable
    a2 <- restrict a1 (\r -> r!AnimalType `SQL.eq` animalType)
    c1 <- table contaminatedTable
    c2 <- restrict c1 (\r -> r!CntdType `SQL.eq` contaminationType)
    j1 <- join SqlInnerJoin a2 c2 (\r -> r!AnimalId `SQL.eq` r!CntdAnimal)
    j2 <- project j1 (CntdAnimal .*. HNil)
    doUpdate animalTable
        (\_ -> AnimalLocation .=. toSqlType newLocation .*. HNil)
        (\r -> r!AnimalId `SQL.elem` relation j2)

@shelby3
Author

shelby3 commented Mar 15, 2017

That doesn't look like unadulterated SQL to me! And you are filling up my thread with (obtuse Haskell and braggart) noise!

You have no chance in hell of creating anything popular...

...if you think that gobbledygook is more readable than SQL syntax for the person who wants to express SQL.

@keean
Owner

keean commented Mar 15, 2017

That's relational algebra :-)

Here's an example:

R1 = join(CSG,SNAP)
R2 = select(CDH,Day="Monday" and Hour="Noon")
R3 = join(R1,R2)
R4 = select(R3,Name="Amy")
R5 = join(R4,CR)
R6 = project(R5,Room)

Taken from here: https://www.cs.rochester.edu/~nelson/courses/csc_173/relations/algebra.html

You have no chance in hell of creating anything popular.

C# LINQ seems pretty popular and its based on exactly the same ideas.

@shelby3
Author

shelby3 commented Mar 15, 2017

That's relational algebra :-)

It's not elegant. It's gobbledygook.

@keean
Owner

keean commented Mar 15, 2017

I don't care! It's not elegant.

It's very elegant. Relational algebra avoids several problems that SQL has as a language. SQL is not a proper algebra; it's got a lot of ad-hoc elements that are pretty bad.

@shelby3
Author

shelby3 commented Mar 15, 2017

It's not SQL.

I asked you to show me how to do unadulterated SQL without a code transformation.

@keean
Owner

keean commented Mar 15, 2017

It's better than SQL. Why would I recreate a flawed query model along with all its problems? I took the opportunity to do something better.

In any case you are not going to be able to parse SQL syntax in the host language with macros either, because you cannot represent SQL in the language's AST.

@shelby3
Author

shelby3 commented Mar 15, 2017

Whether SQL is good or not is not germane to the point of this thread, which is whether it is possible to model the unadulterated syntax of another language without a code transformation. Yes or no?

In any case you are not going to be able to do that with macros either because you cannot represent SQL in the language's AST

Incorrect. Re-read the OP.

@keean
Owner

keean commented Mar 15, 2017

No, and it's not possible to model it with code transformation either.

@shelby3
Author

shelby3 commented Mar 15, 2017

it's not possible to model it with code transformation either.

Incorrect.

@keean
Owner

keean commented Mar 15, 2017

prove it :-)

@shelby3
Author

shelby3 commented Mar 15, 2017

prove it :-)

Re-read the OP. It is explained there.

@keean
Owner

keean commented Mar 15, 2017

That's not a proof, I want code :-)

It's not possible without custom parsers that can be implemented for new types of literals.

@shelby3
Author

shelby3 commented Mar 15, 2017

It's not possible without custom parsers that can be implemented for new types of literals.

Which is exactly what I wrote in the OP.

@shelby3
Author

shelby3 commented Mar 15, 2017

Back on examples where macros can't be replaced by functions: there is no way to write a function that will execute its argument expressions in the lexical scope of the caller and provide the names of the variables as a separate argument. There is no way to form that functional closure with the replacement lexical binding without a code transformation.

Paul Graham’s anaphoric macro eliminates the boilerplate of lambdas. It couldn’t be done without boilerplate or macros using Scala’s _ shorthand, which only works for lambdas that take one argument and not when nested within another lambda.
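For readers unfamiliar with the anaphoric pattern, a sketch of the function-based approximation (the `aif` helper below is invented for illustration; a real anaphoric macro would bind `it` implicitly, without the caller writing a lambda):

```typescript
// What Graham's anaphoric `aif` buys: the macro implicitly binds the
// test result to `it` inside its body. A plain function can get close,
// but the caller must introduce the binding explicitly with a lambda,
// which is exactly the boilerplate the macro removes.

function aif<A, B>(value: A | null, thenB: (it: A) => B, elseB: () => B): B {
  return value !== null ? thenB(value) : elseB();
}

// The explicit `(it) =>` lambda is what the macro version would hide.
console.log(aif("value", (it) => `got ${it}`, () => "nothing")); // got value
console.log(aif(null as string | null, (it) => `got ${it}`, () => "nothing")); // nothing
```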

Another example is when you want to declare numerous types and/or functions (in the lexical scope of the caller) based on some pattern (that would be an absolute pita to manually enter from the keyboard). A function can’t declare new types and functions. You can only do this with code generation.

There are instances where only code transformation will suffice.

@keean
Owner

keean commented Mar 15, 2017

there is no way to write a function that will execute its argument expressions in the lexical scope of the caller and provide the names of the variables as a separate argument.

And why would you ever want to do that?

There is no way to form that functional closure with the replacement lexical binding without a code transformation.

Which is again the kind of trick I don't want to allow. If I wanted that kind of thing I could provide first-class environments, but it just makes code harder to understand for no real reason.

What algorithm do you need the above to implement? Does it help me implement a sort function?

Another example is when you want to declare numerous types and/or functions (in the lexical scope of the caller) based on some pattern

Needing to do this is a symptom of a lack of sufficient abstractive power in the language itself. The 'numerous functions' should be a single generic function.

Generics remove the need for macros in the language by providing a type safe way to abstract code over parameters.
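The "single generic function" point can be illustrated with a minimal sketch: one parametrically polymorphic function replaces a family of per-type functions that might otherwise be macro-generated (names are illustrative):

```typescript
// Instead of macro-generating describeNumber, describeBoolean,
// describeArray, ... one generic function abstracts over the value
// type and is type-checked once by the compiler.

function describe<T>(label: string, value: T): string {
  return `${label}: ${JSON.stringify(value)}`;
}

console.log(describe("count", 42));        // count: 42
console.log(describe("flag", true));       // flag: true
console.log(describe("items", [1, 2, 3])); // items: [1,2,3]
```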

@shelby3
Author

shelby3 commented Aug 27, 2018

Tikhon Jelvis opines that Haskell would benefit from macros for syntax sugar to clean up some of the syntactic noise:

it provides ad hoc syntax sugar instead of a flexible, disciplined macro system, which can lead to a bit of a syntactic mess.

I would love to see something that mostly looks and feels like Haskell but with a macro system and no language-level syntactic sugar.

Any reactions?

@keean
Owner

keean commented Aug 27, 2018

@shelby3

I would love to see something that mostly looks and feels like Haskell but with a macro system and no language-level syntactic sugar.

Personally I would find that a terrible idea. I like that once I have learned the syntax and semantics I understand the language and can read anyone's code. Macros would take that away. There are other reasons, but I think this is such an important one we don't need to go into the others. Haskell has a great system for generics using type-classes, and even allows operator overloading. You can create your own DSL using the built-in syntax, so why would you want macros, apart from to make it not look like Haskell, which would be bad.

@sighoya

sighoya commented Aug 27, 2018

@shelby3

If you have optional laziness combined with inlining you can cover a wide range of use cases for macros.
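One way to read "optional laziness": explicit thunks let an ordinary function behave like a control structure, a classic macro use case in strict languages. A minimal sketch (names invented):

```typescript
// Optional laziness via explicit thunks: because the branches are
// passed unevaluated, `myIf` behaves like built-in `if`, and the
// untaken branch is never run. With inlining, a compiler could also
// remove the thunk-call overhead.

function myIf<T>(cond: boolean, thenB: () => T, elseB: () => T): T {
  return cond ? thenB() : elseB();
}

const result = myIf(
  true,
  () => "taken",
  () => {
    throw new Error("never evaluated"); // safe: branch not taken
  }
);
console.log(result); // taken
```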

@sighoya

sighoya commented Aug 27, 2018

"Laziness by default is absolutely not a mistake."

Wrong, and I never understood the argument of "writing code in any order". There is always an order in code evaluation; it depends on what you want evaluated first.
When you implement laziness with coroutines you will see there is an order.

@keean
Owner

keean commented Aug 27, 2018

@sighoya Order matters; it's why you cannot write programs (apart from proofs) in Prolog without 'cut'. In fact where it really shows is that some programs work in Prolog without 'cut' because they are logically correct, but will not terminate in your lifetime without it :-)

@shelby3
Author

shelby3 commented Aug 27, 2018

Well guys, I argued in the Why Not Haskell? thread that we want to code as much as possible so that order does not matter, e.g. the Actor paradigm.

Yes order matters sometimes, but when it does not need to matter, then we shouldn’t fail to take advantage of that increased degrees-of-freedom.

@sighoya why did you thread jack? This thread is not about purity. We have a thread for that. You should have posted your comment in the correct thread.

Sorry my German perfectionism can’t handle this chaos.

Let's just start writing randomly in any thread okay? 😵

I think those who are sloppy in their communications will also write sloppy code.

@shelby3
Author

shelby3 commented Dec 9, 2018

@shelby3
Author

shelby3 commented Sep 4, 2020

Why not Clojure (or any LISP)?

@analyticbastard is a Clojure expert. Clojure is a variant of LISP which normally runs on the JVM (Java Virtual Machine). He tells me that he loves homoiconicity. I explained that homoiconicity is very good for the programmer who creates the code because homoiconicity allows them to create private syntaxes and macros which enable their code to do magical jumps through circus hoops and do more with less code.

But as we discussed up-thread, it is not so good for the readers of the code who have to try to learn all these new private syntaxes invented for each code base. IOW, it lacks the regularity that I wrote about in my prior comment on Apr 13.

But the counter argument is that private syntaxes are more readable than the cruft and boilerplate that would otherwise be required in a regularized PL without homoiconicity.

@keean argued up-thread that all the necessary cruft can be avoided with polymorphism (aka generics) and functions (along with typeclasses). I tried to find examples of macros that could refute @keean and I was unable to.

@analyticbastard can you present a macro that leverages homoiconicity and that refutes @keean?

I wrote the following in an email to @analyticbastard

Originally I wanted to target Go, but that just seems impossible to achieve right now from any high-level PL with strong typing.

Hickey is incorrect in some of his claims about the value of typing because, for example in the following, he is thinking of classes and not typeclasses:

https://lispcast.com/clojure-and-types/#types-as-concretions

https://youtu.be/2V1FtfBDsLU?t=1141

A typeclass is just a requested interface. There is no concrete abstraction, because additional interfaces can be requested by any function. Whereas classes bind interfaces to data at instantiation.
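The "requested interface" point might be sketched via dictionary passing, one common way to approximate typeclasses in languages without them (all names here are illustrative):

```typescript
// A typeclass as a "requested interface": the consumer requests the
// operations it needs at its own site, and instances can be supplied
// retroactively. The Point data commits to no interfaces when it is
// constructed, unlike a class binding interfaces at instantiation.

type Point = { x: number; y: number };

// The "typeclass": a dictionary of requested operations over some T.
interface Render<T> {
  render: (value: T) => string;
}

// A later-written instance for Point; Point was never declared
// "renderable" up front.
const renderPoint: Render<Point> = {
  render: (p) => `(${p.x},${p.y})`,
};

// A function requests the interface explicitly as an argument.
function show<T>(dict: Render<T>, value: T): string {
  return dict.render(value);
}

console.log(show(renderPoint, { x: 1, y: 2 })); // (1,2)
```

In Haskell, Scala, Rust, or Swift the compiler passes these dictionaries (instances/witnesses) implicitly; the sketch just makes the mechanism visible.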

That is why we will utilize typeclasses and afaik only Haskell, Scala, Rust and Swift offer typeclasses at this time.

I do agree with him (and I disagree with Keean) about the futility of attempting to type check semantics at runtime. Much of our logic is in the semantics, so types are not as important as some people think they are. But types are also useful when not abused.

https://youtu.be/2V1FtfBDsLU?t=2097
(Place Oriented Programming)

My answer to this is not just immutability but also an ALP Arena paradigm, because otherwise it’s not going to scale to massively multicore due to for example one GC for all threads.

https://youtu.be/2V1FtfBDsLU?t=2266
(Information)

Hickey may not realize he is championing typeclasses on this point.

https://youtu.be/2V1FtfBDsLU?t=2605
(Clojure and Information)

I do not fully grok this point. Why do we need names to be “first class” instead of just using strings for keys of a map?

https://clojure.org/guides/faq#why_keywords

https://youtu.be/2V1FtfBDsLU?t=2724
(Brittleness and Coupling)

This I believe is mitigated to great extent with degrees-of-freedom with typeclasses. Also Scala and JS both support named-parameters.

https://youtu.be/2V1FtfBDsLU?t=3055
(Language Model Complexity)

I agree with him that reducing the language model complexity is an important goal. We will be using only a subset of Scala and Java’s model, e.g. we will not use classes nor subtyping of classes (i.e. subclassing).

https://youtu.be/2V1FtfBDsLU?t=3164
(Parochialism - names)

He constructs a strawman. The culprit is not static typing per se.

https://youtu.be/2V1FtfBDsLU?t=4008
(Joy of Types rebuttal)

Typeclasses are names (in my design, not structural as in Haskell)! Thus they do specify an interface. Typeclasses enable the degrees-of-freedom Hickey laments don’t exist with classes. The problem is that Haskell forces a structural match on typeclasses instead of names (which I think is a design error), and Haskell forces a consistent algebra for each typeclass, which I think is also an unobtainable delusion (aka unobtainium). I disagree with both Keean and Skaller about, for example, proving axioms for typeclasses. That’s taking it too far into the sort of useless and counterproductive activity that Hickey laments. The only real, scalable utility of typeclasses is a named interface that can be bound at the function site. And yes, extensive unit testing cannot be replaced by static typing except in small-world, toy examples.

In summary of all his points, yes a keyed-map is a lingua franca and code-as-data with dynamism enables more capabilities. And yes maybe some programs want to include a LISP interpreter library. But static typing is still important, especially for high-performance optimization, space-efficient data structures and cryptographic algorithms.

Note I agree with the advantages, in the following links, of pulling from a pool of enthusiastic, knowledgeable, and pragmatic (i.e. not dogmatic, pious) programmers, as well as not having to wait for slow compiles, and the flexibility of dynamic typing when needed:

https://lispcast.com/what-is-the-business-value-of-clojure/

https://clojure.org/about/rationale

Apparently there’s a Clojure to C++ compiler which in theory could be compiled to ASM.js with Emscripten:

https://ferret-lang.org/

There’s typed Clojure:

https://typedclojure.org/

Although it’s apparently heresy:

https://stackoverflow.com/questions/5110292/why-is-clojure-dynamically-typed/5110459#5110459

Also:

https://clojurescript.org/

https://m.oursky.com/why-i-chose-clojure-over-javascript-24f045daab7e

@keean
Owner

keean commented Sep 5, 2020

@shelby3 if you want to see how much help types are, you only have to compare programming JavaScript with programming TypeScript. In production code TypeScript is much more reliable, especially if you use strict null checking.

I also think a complete separation between code and data is a good idea. Data should not be executable, and code should be static and unchanging. In this way the correctness of code can be verified before delivery to the user.

I think the kind of dynamic LISP like environments that mix code and data are great where the programmer is also the end user.

I think macros, staged compilation and even code generators are not good ideas in general. I think syntax should have a consistent meaning, and datatype-generic programming allows similar conciseness without the same problems. I do like the idea of allowing operator overloading and custom operators, but I think it's important that operator semantics are consistent (i.e. predictable by the programmer without consulting the implementation). I think code should be readable in isolation; that means I should be able to read and understand the code in a module without looking at other modules, just by understanding the syntax of the language. This does mean applying axioms sometimes: for example, when overloading '+' we should be able to check it's associative and commutative. (Should floating-point addition use the '+' operator, as it's not always associative?)
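The floating-point aside is easy to demonstrate; a hand-rolled associativity check over a few sample values (standing in for a property-testing library) shows why an associativity axiom would reject IEEE double '+':

```typescript
// IEEE double addition is not associative: the order of rounding
// differs between the two groupings, so an associativity axiom on an
// overloaded '+' would rightly flag floating-point addition.

function associativeOn(
  op: (a: number, b: number) => number,
  a: number,
  b: number,
  c: number
): boolean {
  return op(op(a, b), c) === op(a, op(b, c));
}

const add = (a: number, b: number) => a + b;

console.log(associativeOn(add, 1, 2, 3));       // true for these values
console.log(associativeOn(add, 0.1, 0.2, 0.3)); // false: rounding differs
```

Here `(0.1 + 0.2) + 0.3` yields `0.6000000000000001` while `0.1 + (0.2 + 0.3)` yields `0.6`.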

@shelby3
Author

shelby3 commented Sep 6, 2020

Hickey points out that performance optimization may be the most important use of static typing.

I also think a complete separation between code and data is a good idea. Data should not be executable, and code should be static and unchanging. In this way the correctness of code can be verified before delivery to the user.

I think the kind of dynamic LISP like environments that mix code and data are great where the programmer is also the end user.

I think macros, staged compilation and even code generators are not good ideas in general. I think syntax should have a consistent meaning, and datatype-generic programming allows similar conciseness without the same problems.

I will reserve judgement until I gain more experience. But Hickey is correct that eventually at some level of complexity typing runs out of usable runway. I may find that I need the additional flexibility and power that would otherwise be intractable with typing.

I think perhaps you are being too dogmatic. Don’t throw out the baby with the bath water. But again I will wait until I have real world use cases from experience to discuss with you. Until then, I dunno.

I do like the idea of allowing operator overloading and custom operators, but I think it's important that operator semantics are consistent (ie predictable by the programmer without consulting the implementation). I think code should be readable in isolation, that means I should be able to read and understand the code in a module without looking at other modules, just by understanding the syntax of the language. This does mean applying axioms sometimes, for example when overloading '+' we should be able to check it's associative and commutative. (Should floating point addition use the '+' operator as it's not always associative?)

Yes strive for that but don’t even attempt to prove everything and don’t require a global, total order because remember I already debated you in the Subtyping thread #8 that universal, total orders do not exist. I will not repeat that mathematical and philosophical debate again.

You simply can’t hold the world fully ordered in the palm of your hand Keean. Don’t let the impossible quest for perfection be the enemy of good, and accept it doesn’t entirely fit into your ideals. We can’t reconcile such a rigid idealism, and still flex, with unbounded nature.

Note I do appreciate the points you have made in this thread and others. I learned a lot. Thanks.

if you want to see how much help types are, you only have to compare programming JavaScript with programming TypeScript. In production code TypeScript is much more reliable, especially if you use strict null checking.

Hickey’s valid point is that types are only really valuable for very low-level, low-value “typo”-level checks. The orders-of-magnitude more important semantics can’t be checked with types without abusing types so much that programming becomes refactoring hell and rigor(ous) mortis. Unit tests are much more important.

[image]

@keean
Owner

keean commented Sep 6, 2020

I am not sure that everything runs out of runway. A Turing machine can execute any computable program. The end of the runway is really the limit of computability; if you want something more, you need a quantum computer. So if the limit of computability is fixed, what is left is expressive power. Generics have coped with the needs of mathematics for thousands of years (whether mathematicians understood that or not is a separate question). Stepanov's "From Mathematics to Generic Programming" presents programming as part of mathematics' thousands-of-years history.

What more flexibility and power could you need than the whole of mathematics? Even an intractable example would not overthrow thousands of years of mathematical progress. That would be like creationists claiming one fossil could "dis-prove" evolution... It would not, it is more likely something is wrong with the fossil (mis-dated, a fake etc).

If universal total orders do not exist, then what is the periodic table? Here we have a rigid ideal that not only allows us to "flex" with nature, it lets us find new elements; it lets us discover things we would not be able to discover without it.

In any case, nobody is talking about being able to prove everything. Some things are unprovable within any system (see Gödel). However operator overloading can lead to cryptic programs if '+' can stand for any binary function. Of course people can write cryptic programs using any function, so 'add' could divide for example, so we can't stop every obfuscation. Instead I suggest we focus on stopping the easy ones, and in that way we reduce the cognitive load on the programmer, allowing them to focus on the truly complex problems.

We can see this cognitive-load principle with JavaScript vs TypeScript. When I code in JS, I often find it takes a long time to get all the typos out of the code. After getting the program working, I am never totally sure it's correct in all circumstances; yes, it needs thorough testing, but this means running the program and making sure we get 100% code coverage in tests. This takes a long time, and just the process of doing this takes up time that could be used validating higher-level properties. With TS the program would not even compile for a lot of typos, saving a huge amount of time going round the develop/test loop. So even if types only prevented typos (they don't), they would still be worth it.

An example where types help semantically: the other day I was trying to debug a graphics rendering problem - it took a long time to find the cause, which was not obvious from the symptoms. It turns out that for one particular type of render object a Uint8Array was being passed instead of an ArrayBuffer. Without typing this would have been impossible to spot - I have written similar-sized JS programs, and after 3 or 4 years of continuous development you just have no idea what types are being passed around. Here is a case where testing found the problem but was no help in solving it. You simply have "test 'X' fails". You look at the bitmap comparison, and yes, the rendered image is distorted compared to the reference.

Test-driven development is a great idea when you know what it is you are developing. When writing a compiler, tests are where I would start. A lot of software is more experimental though, and you can't write the tests before you know what you are testing. But my point above is that the develop/build/test cycle slows down development. It's a lot quicker if the compiler points out the error than if you have to build and run the whole test suite. It's even quicker if your IDE points out the error. For example, I have started using the eslint "compat" plugin, where you set target browser versions and it will lint any functions that are not supported in any of the selected browsers and versions; this saves a lot of cross-browser testing time. Of course you still do a full test across a range of platforms when releasing software, but you don't want this burden in your main development loop. Running a complete set of tests across different platforms can take days for large software projects, especially when looking for resource starvation bugs like memory leaks. So when the IDE tells you the type error as you code, you can save a lot of time.

Types also facilitate independent testing of modules because they specify the interface precisely, which allows modules to be tested independently, again resulting in an exponential (in number of modules) speed up in testing.
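For instance, a typed interface makes the seam between modules explicit, so a module can be exercised against a trivial fake instead of the real backend. A minimal TypeScript sketch (the `Store` interface, `increment` function, and `MemoryStore` fake are invented for illustration):

```typescript
// Hypothetical interface: the type is the precise contract between modules.
interface Store {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// The module under test depends only on the interface, not a concrete backend.
function increment(store: Store, key: string): number {
  const n = Number(store.get(key) ?? "0") + 1;
  store.set(key, String(n));
  return n;
}

// A trivial in-memory fake satisfies the same type, so `increment` can be
// tested without the real (e.g. networked or persistent) implementation.
class MemoryStore implements Store {
  private data = new Map<string, string>();
  get(key: string) { return this.data.get(key); }
  set(key: string, value: string) { this.data.set(key, value); }
}
```

Because the interface is checked by the compiler, the fake cannot silently drift from the real implementation's shape.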

In summary I find developing without types tediously slow, because you have to keep interrupting coding to run tests, which takes longer and longer as the software grows.

@shelby3
Author

shelby3 commented Sep 6, 2020

What more flexibility and power could you need than the whole of mathematics?

Think about the intractability of composing Monads with Monad transformers. Then explode your brain by imagining needing to abstract that to composing Monad transformers.

I hope you understand the runway is indeed limited. And mathematics is highly limited. As I had argued in the Subtyping thread even the proofs are becoming computational and thus the Halting problem has now invaded mathematics.

If universal total orders do not exist, then what is the periodic table?

I am not going to reopen that discussion/debate from the Subtyping thread. At least not now. Too busy. The periodic table is not a total order; it’s aliasing error at best (one myopic attempt to summarize nature). Again, I will not reply if you attempt to rebut that. It would be a thread-jack anyway.

So even if types only prevented typos (they don't) they would still be worth it.

I did not claim typing is useless. I claimed we should not abuse typing because otherwise we will end up in (refactoring, modularity and composability) rigor(ous) mortis.

Examples of abusing typing are probably Monad compositions and perhaps even GADTs. It eventually becomes an unwieldy hairball and if pushed far enough a full blown dumpster fire.

I am currently enthusiastic about typeclasses, and I think types may be important for certain performance optimizations. I do not disagree with your point that types also add some information. But I also agree with Hickey that types often say very little about the semantics, if anything at all.

It's a lot quicker if the compiler points out the error than if you have to build and run the whole test suite. It's even quicker if your IDE points out the error.

Somehow you must have missed my statement that typing can’t check all semantic errors. I was speaking about what typing can’t realistically do and thus testing is the ONLY way to catch those errors anyway.

Types also facilitate independent testing of modules because they specify the interface precisely, which allows modules to be tested independently, again resulting in an exponential (in number of modules) speed up in testing.

Semantic errors can leak across typed APIs and still type check.

@keean
Owner

keean commented Sep 6, 2020

@shelby3 so I watched the whole clojure video and I liked a lot of what he said. I don't agree with him on types, and think he is missing something by only thinking about values. In Stepanov's introduction to "Elements of Programming" he talks about entities (values), species, and genus. Clojure is only operating at the level of the first of these.

I agree with a lot of things in that video, and I agree that we need to consider the "wire" protocol, and that "classes" don't work. If all he had said was that types (as in C++) don't work, then I could agree, but he also says algebraic datatypes don't work (as in Haskell), yet goes on to claim that being an algebra (composition etc) is important.

There's a lot of good stuff in that video, and the Epochal timeline is exactly what I was proposing about using state-machines.

I think taking Stepanov’s model, I would argue there is a lot of value in adding Species (types) and Genus (typeclasses) to a language. The comment that (int,int,float,string) is semantically meaningless is true, but I see it more as an attack on product types vs Maps than on types themselves. I would see this as a strong argument for nominal types vs structural types. But I would also suggest that, using species/types correctly, a product can contain information; for example with (Dog, Sheep) it's important to know which is the sheep and which the sheep-dog, even if we don't know anything about that particular sheep and dog. Further, Genus also provides semantic information, like Four-legged a => (a, a, a)
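The (Dog, Sheep) point can be sketched in TypeScript with branded types, which recover a nominal-style distinction on top of structural typing (the `Dog`/`Sheep` brands and the `herd` function are made up for illustration):

```typescript
// Branded (nominal-style) types: structurally both are strings, but the
// phantom `__brand` field keeps the compiler from confusing them.
type Dog = string & { readonly __brand: "Dog" };
type Sheep = string & { readonly __brand: "Sheep" };

const dog = (name: string): Dog => name as Dog;
const sheep = (name: string): Sheep => name as Sheep;

// The tuple (Dog, Sheep) now says which position is which animal, even
// though we know nothing else about the particular dog and sheep.
function herd(pair: [Dog, Sheep]): string {
  const [d, s] = pair;
  return `${d} herds ${s}`;
}
```

Passing the pair the wrong way round, e.g. `herd([sheep("Dolly"), dog("Rex")] as [Dog, Sheep])`, requires a visible cast, which is the point of the brand.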

I think he's completely correct if you look at types from a certain point of view, but I think he completely misses the point about Entity, Species and Genus.

@keean
Owner

keean commented Sep 6, 2020

@shelby3

Think about the intractability of composing Monads with Monad transformers. Then explode your brain by imagining needing to abstract that to composing Monad transformers.

I totally agree; been there, done that, and I don't think it's the right way. Note that Monads are only important when you approach combining functional purity with static types in a certain way. "Clean" is a statically typed pure functional language with no 'Monads'...

Algebraic effects seem a better approach that is composable.

Somehow you must have missed my statement that typing can’t check all semantic errors. I was speaking about what typing can’t realistically do and thus testing is the ONLY way to catch those errors anyway.

I am not saying you don't need to test; I am saying that catching as many errors as possible, as early as possible, saves you time and makes you more productive. In my experience programming in TypeScript is a lot more enjoyable and productive, and results in far fewer errors in shipped code than programming in JavaScript (because no testing is perfect), and here you have a good comparison because the value-level syntax and semantics of the two languages are the same.

Also note that "spec" is practically a type system:

https://clojure.org/guides/spec

Which is designed for specifying things like "wire protocols", but also note that in the video he says (paraphrased) we should write the internals of our applications like wire protocols.

So we should be using "spec" to internally define APIs and interfaces, which is really an ad-hoc type system by another name. To me it seems like "type systems are bad" has become a box he can't break out of, and rather than admit he was wrong, he's going to great lengths to make "spec" not a type system... Or maybe he just has a limited view of what a type system is.
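A toy illustration of how close spec-style validation sits to a type system: in TypeScript, a few composable predicates already form an ad-hoc checker for data at a boundary (the `Spec` type and `record` combinator here are invented, not clojure.spec's actual API):

```typescript
// A spec is just a predicate that narrows an unknown value to a type.
type Spec<T> = (x: unknown) => x is T;

const isString: Spec<string> = (x): x is string => typeof x === "string";
const isNumber: Spec<number> = (x): x is number => typeof x === "number";

// Compose field specs into a record spec, much like a map spec in
// clojure.spec: validation of plain data at a boundary.
function record<T>(fields: { [K in keyof T]: Spec<T[K]> }): Spec<T> {
  return (x: unknown): x is T =>
    typeof x === "object" && x !== null &&
    Object.entries(fields).every(([k, spec]) =>
      (spec as (v: unknown) => boolean)((x as Record<string, unknown>)[k]));
}

const personSpec = record({ name: isString, age: isNumber });
```

The difference from a static type system is when the check runs; the shape of the specification is essentially the same.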

By the way I agree with him about academics :-) databases are important, wire protocols are important. I would add that specifying interfaces both wire and internal are important (especially with multi-person dev teams, working with libraries, even geographically distributed teams).

@shelby3
Author

shelby3 commented Sep 6, 2020

I would see this as more a strong argument for nominal types Vs structural types.

[…]

I think he's completely correct, if you look at types from a certain point of view, but I think it completely missed the point about […typeclasses]

That is what I wrote in my first post about this yesterday.

So we should be using "spec" to internally define APIs and interfaces, which is really an ad-hoc type system by another name. To me it seems like "type systems are bad" has become a box he can't break out of, and rather than admit he was wrong, he's going to great lengths to make "spec" not a type system... Or maybe he just has a limited view of what a type system is.

Yeah but the “we can type everything” purists have the analogous dilemma from the other side of the coin.

@keean
Owner

keean commented Sep 6, 2020

@shelby3 Also I think libraries are a big problem, and by his own admission unsolved by clojure. For me types and generics are an important part of the library problem because they allow interfaces to be specified (and a library interface should be treated just like a wire protocol).

Generics solve his problem of having a string at one point and a string somewhere else, without all the intervening code needing to know about the string, or needing to be refactored if the string changes type. By generics here I mean the combination of types (Species) and typeclasses (Genus).
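A sketch of that pass-through property with TypeScript generics (the `Tagged`/`relabel` names are hypothetical): the intermediate code treats the payload opaquely, so changing its type touches only the producer and the consumer.

```typescript
// Generic container: intervening code handles the payload opaquely.
interface Tagged<T> { id: number; payload: T }

// This function never inspects the payload, so it needs no refactoring
// if the payload's type changes from string to anything else.
function relabel<T>(t: Tagged<T>, id: number): Tagged<T> {
  return { id, payload: t.payload };
}
```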

@shelby3
Author

shelby3 commented Sep 6, 2020

@shelby3 Also I think libraries are a big problem, and by his own admission unsolved by clojure. For me types and generics are an important part of the library problem because they allow interfaces to be specified (and a library interface should be treated just like a wire protocol).

Generics solve his problem of having a string at one point and at string somewhere else, without all the intervening code needing to know about the string, and needing to be refactored if the string changes type. By generics here I mean the combination of types (Species) and typeclasses (Genus).

I agree with that, but Hickey also made the point that even composable typing can become unwieldy to compose, requiring refactoring etc. I think when we were discussing how to modularize typeclassing so we could multifurcate the abstract algebra, we realized that there could be the Cambrian explosion that Hickey implicitly alludes to.

So I can understand that to be philosophically why he may be trying to avoid adding strong typing.

@keean
Owner

keean commented Sep 6, 2020

@shelby3

I agree with that but Hickey also made the point that even composable typing can become unwieldy to compose requiring refactoring etc..

I find typing helps with refactoring, because when I change a type, the type checker finds all the code I need to change so I don't miss any. I find my refactorings in TypeScript are a lot easier and quicker than refactorings of the same JavaScript. I think this slightly misses the point, as he probably means refactorings are a symptom of the problem, and I agree; but generics with typeclasses and row polymorphism are one answer to this.

I agree with his points on composition, but I find they are not actually opposed to types, but more opposed to structural types.

For one, I would strongly be in favour of having relations as a built-in type, along with relational algebra. I think this answers a lot of his points about composability without reducing everything to a typeless Map. His point about composing data: that's a relational Join; and subsets: that's a relational Restrict. Relations are typed, and yet maintain all the properties of Maps that he values, as far as I can tell. His problem with this would be needing to know the complete types of relations, but again generics and row polymorphism come to the rescue. I can specify a function that inputs a relation containing a Name type, but I don't care about the rest and it just gets passed along.
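Restrict and a key-based Join can be sketched generically in TypeScript; the row types flow through, so a caller that only cares about one column still type-checks (hard-coding the join key as `id` is a simplification for illustration):

```typescript
// Restrict: relational selection, preserving the row type R.
function restrict<R>(rel: R[], pred: (r: R) => boolean): R[] {
  return rel.filter(pred);
}

// Join on a shared `id` column; the result row combines both row types,
// so downstream code that only needs e.g. `name` still type-checks
// without knowing the complete relation type.
function join<A extends { id: number }, B extends { id: number }>(
  left: A[], right: B[]
): (A & B)[] {
  const out: (A & B)[] = [];
  for (const a of left)
    for (const b of right)
      if (a.id === b.id) out.push({ ...a, ...b });
  return out;
}
```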

@shelby3
Author

shelby3 commented Sep 6, 2020

I find typing helps with refactoring, because when I change the type, the type checker finds all the code I need to change so I don't miss any.

I believe that is a facet of his point. That you actually have to change them all. That’s the Cambrian explosion. Type inference should not be employed for APIs.

I think this is slightly missing the point, as he probably means refactorings are a symptom of the problem, and I agree, but generics with typeclasses and row-polymorphism is one answer to this.

I believe that is another facet of his point. To compose is analogous to Robert Harper’s criticism of the monolithic abstract algebra for Haskell’s typeclasses:

https://existentialtype.wordpress.com/2011/04/16/modules-matter-most/

“As a consequence, using type classes is, in Greg Morrisett’s term, like steering the Queen Mary: you have to get this hulking mass pointed in the right direction so that the inference mechanism resolves things the way you want it to.”

Hickey would prefer we can just pretend the APIs compose fully and use testing to make sure we did not violate any of our assumptions about facets of the API our callers don’t actually engage.

I agree with his points on composition, but I find they are not actually opposed to types, but more opposed to structural types.

I agree that structural compared to nominal typing increases the lack of specificity which widens the Cambrian explosion or mass of the hulking typing morass.

I think I am inclined to be clever about how I modularize and structure APIs to minimize the issue. I do think he has a point that at some level we punt to untyped semantics and employ more testing.

@keean
Owner

keean commented Sep 6, 2020

@shelby3 I prefer type inference. If you can write code that looks untyped, without the extra clutter, but it catches some problems, then this is only a benefit.

Then at the module/interface/wire-protocol level the type system forms a specification.

If the wire protocol spec and the type system are the same, then you get rid of any impedance matching required, and therefore of boilerplate types/code.

@shelby3
Author

shelby3 commented Sep 6, 2020

@keean I think you are just entirely missing the point that requiring everything to have a type whether it is inferred or not can become too heavy to lift.

[Inserted edit: I am sympathetic to Hickey’s point, although I fall more on the side of @keean’s preference for strong typing and at least type classes. Where I may differ from @keean is that I want to punt to not trying to type the semantics — I want to draw the line somewhere in terms of how much typing tsuris I would tolerate. Which is why I don’t like Rust, because I think the benefits of manual lifetime typing don’t outweigh the tsuris and overly complex, brittle type system.]

You think you have a typing solution for everything, e.g. row polymorphism, typed relational algebra, etc., but that belies Hickey’s point. As you well know, the more features you add to a type system, the more byzantine it becomes due to the myriad ways that type features interact.

I think he has a very strong point. Perfection is the enemy of good. Since we need to write unit tests anyway, typing at the level of abuse you seem to be advocating is more of a pita than it is actually beneficial to any great extent.

Finding the right balance is what I am trying to figure out. Whereas you seem determined to remain a type-system zealot aka purist. I say actually try to build your Ada-on-steroids and see how it works out in reality.

@shelby3
Author

shelby3 commented Sep 10, 2020

@keean please correct me if I am mistaken as follows. And please feel free to discuss this.

I am trying to determine whether Clojure offers a viable option for my need to write code both for JavaScript clients and performant, highly scalable servers. Currently I am thinking the optimal solution for my needs would be Scala 3 compiled to ScalaJS and perhaps to Go if I create a Scala compiler plugin for transpiling (a subset of Scala that I use) to Go.

  1. Clojure lacks implicit interface instance resolution at call sites, i.e. type classes, which you and I think is an essential feature for genericity, composability and extension.

  2. Clojure allows sharing of references (either immutable or software transaction memory mutables) between threads, c.f. also Agents. This is not a paradigm for a Pony-esque (or my contemplated ALP-like partitioning) thread-local garbage collection (JVM doesn’t support it also) and eliminating hardware cache coherency — and thus it can’t feasibly scale to massively multicore which will become a reality soon (c.f. the Parallelism Parallelism #41 and Mutable Objects Mutable Objects #32 threads).

    However, Clojure’s pmap, future and (for Go-like channels) Core.async offers asynchronous processing which could hypothetically be employed with only copied sharing between threads. In any case, this is not even at par with Go’s elegant, lightweight green threads because Clojure runs on the dreaded JVM.

    (Note that sharing mutable state via STM, or only sharing immutable state, is an orthogonal issue to the insoluble fact that in any conceivable programming language, the sharing of state machines between threads, even via STM or Actors, will be susceptible to transaction conflicts.)

  3. Ditto as is the case for ScalaJS, ClojureScript doesn’t support (c.f. also) native JavaScript async / await and instead advocates portability (e.g. to JVM on server) attained along with the less byzantine code achieved when employing core.async.

    However, I have proposed to modify the ScalaJS compiler (perhaps with a Scala compiler plugin) to implement portable Promise interop that would also hypothetically compile (without requiring compile-time, output-target case logic, i.e. write-once and fully portable) to both native JavaScript async / await and Go’s green-threads model. I do not envision how to accomplish such a feat with Clojure, although apparently it’s possible with ClojureScript, and I think it would be nearly impossible to source-to-source transpile Clojure to Go because of Clojure’s lack of strong typing.

    Imo it is much more elegant to have native async / await in JavaScript code, such as when debugging, especially given the special compilation model I proposed, which I posit will fix the long-standing bifurcation composability problem with JavaScript’s async / await. And Go’s green-threads model is superior on the server.
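The composability benefit of staying in direct style can be illustrated in TypeScript: once a callback-style API is wrapped in a Promise (the `delayed` helper below is invented for illustration), downstream code composes with native async / await instead of bifurcating into callback chains.

```typescript
// Hypothetical: wrap a callback-style API (setTimeout) in a Promise so
// downstream code can stay in direct style with async/await.
function delayed<T>(value: T, ms: number): Promise<T> {
  return new Promise(resolve => setTimeout(() => resolve(value), ms));
}

// Direct-style composition: each step reads sequentially, no nesting.
async function pipeline(): Promise<string> {
  const a = await delayed("hello", 1);
  const b = await delayed(a + " world", 1);
  return b;
}
```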


I replied on Quora:

I am not currently actively coding in Scala. I am reasonably familiar with the language and have been observing the progress for the Dotty Scala 3 development, which is expected to be stable within a few months perhaps. Scala 3 adds some interesting capabilities and fixes some long-standing soundness issues present in Scala 2.

Ostensibly the main distinction between Scala and Clojure is the former possessing a powerful, strong typing system with higher-kinded types and even Haskell-like type classes. Typeclasses offer more degrees-of-expression and extension than subclassing (aka OOP) which Scala also offers but which I and others think is an antipattern. Higher-kinded types enable abstracting generics for example such as Map, Reduce, Fold over any type of collection. Clojure achieves flexibility by eschewing typing but this comes with some drawbacks and advantages which we are discussing:

Macros and DSLs · Issue #31 · keean/zenscript

I have also been studying the design proposal for generics in Go:

WD-40 Issues thread was accidentally deleted by @keean · Issue #49 · keean/zenscript (#49 (comment))

I could get into a long discussion about why Scala and other PLs such as Clojure, Kotlin, etc are stuck at less than 1% popularity.

I think one critical error for ScalaJS is not supporting JavaScript’s async and await natively. ClojureScript currently also has this deliberate limitation, and it instead supplies its own portable (to all compile targets) API, as does Scala. Thus many people just use the excellent TypeScript instead, which is a superset of JavaScript and thus supports JavaScript’s async and await natively. But TypeScript also doesn’t have higher-kinded typing, which is a long-standing, ignored request that is heavily upvoted on their issues tracker.

I looked at Elm in 2018 and was not impressed at all. Popularity does not indicate level of seriousness nor revenues. So lots of script kiddies like it, but some of us write serious stuff with millions of lines of code.

Higher-kinded types: the difference between giving up, and moving forward

Ironic that I have never heard of Reason. Indicates how out of touch I have become with a lot of the new stuff that flows out. Every time I look at the new script-kiddie flavor-of-the-month I always conclude I wasted that time. So I guess I just tuned out.

If I am not mistaken the optimization with the Google Closure compiler is optional, not required:

Announcing Scala.js 0.6.28

Compilation and optimization pipeline

@sighoya

sighoya commented Sep 10, 2020

Clojure developers mostly use persistent data structures, which have logarithmic performance.

Ref is only for critical sections of mutable transitions, or you can just leverage Java for that.

Some points about Clojure:

  • It is dynamic
  • is tied to the JVM
  • has immutable data structures
  • has a REPL
  • typed clojure is just a plugin, similar to mypy; external tools tend to fall out of sync with the code they inspect
  • has macros
  • no type classes but at least some kind of Interfaces (Protocols)
  • It does not have the best performance, because of persistent data structures
  • It has horrible error messages; they were improved in 1.10, but I don't know if this holds in every case, e.g. debugging
  • The standard implementation of simple functions is hidden behind core Java functions, probably because of performance.

@shelby3
Author

shelby3 commented Sep 10, 2020

@sighoya good to see that you’re still alive. I was worried when you went silent. I hope we can both find solutions for our health. I replied to your email about that.

Clojure Developer use mostly persistent data structures with log2 performance.

[…]

Some points about Clojure:

  • has immutable data structures

Edward Kmett’s 2009 post is how I first became aware that immutable data structures carry a logarithmic performance penalty for algorithms that require random-access writes.
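The penalty comes from path copying: a persistent update rebuilds only the nodes on the root-to-leaf path, which is O(log n) nodes, versus O(1) for an in-place array write. A self-contained TypeScript sketch of a persistent array as a binary trie (a simplification of Clojure's 32-way tries; the `build`/`get`/`set` names are ours):

```typescript
// Each node is either a leaf holding a value or an internal node.
type Node = { leaf?: number; left?: Node; right?: Node };

// Build a balanced trie over the values.
function build(values: number[], lo = 0, hi = values.length): Node {
  if (hi - lo === 1) return { leaf: values[lo] };
  const mid = (lo + hi) >> 1;
  return { left: build(values, lo, mid), right: build(values, mid, hi) };
}

// Random-access read: walks one root-to-leaf path, O(log n).
function get(node: Node, i: number, size: number): number {
  if (size === 1) return node.leaf!;
  const half = size >> 1;
  return i < half ? get(node.left!, i, half) : get(node.right!, i - half, size - half);
}

// Persistent write: returns a NEW root, copying only the O(log n) nodes
// on the path; the old version remains valid and unchanged.
function set(node: Node, i: number, v: number, size: number): Node {
  if (size === 1) return { leaf: v };
  const half = size >> 1;
  return i < half
    ? { left: set(node.left!, i, v, half), right: node.right }
    : { left: node.left, right: set(node.right!, i - half, v, size - half) };
}
```

Both versions of the array stay usable after the update, which is the structural-sharing property Clojure relies on, at the cost of the logarithmic factor on writes.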

  • typed clojure is just a plugin similar to mypy, external tools are afflicted with asynchronism to their inspected target.

Strong typing is an ill fit for Lisp.

  • no type classes but at least some kind of Interfaces (Protocols)

I criticized that:

  1. Clojure lacks implicit interface instance resolution at call sites, i.e. type classes, which you and I think is an essential feature for genericity, composability and extension.

@shelby3
Author

shelby3 commented Sep 15, 2020

A detailed follow-up discussion about the tradeoffs with static typing.

And a new idea that requires static typing for achieving its posited benefits.
