Implement converter from markdown to TeXmacs #15

Open
mgubi opened this issue Jun 3, 2021 · 2 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@mgubi

mgubi commented Jun 3, 2021

I've researched the topic a bit and come up with several realistic solutions.

  • Use a C/C++ parser. There are several good, feasible options, two of which are:

    • libsoldout (http://fossil.instinctive.eu/libsoldout/home) : no dependencies
    • peg-markdown (https://github.com/jgm/peg-markdown) : uses peg/leg to generate the parser from a user-friendly description of the markdown grammar. Con: depends on an external tool. Pro: maybe we can modify the parser description to directly generate C++ code that uses TeXmacs classes. In any case, it is easy to get the syntax tree and then generate a TeXmacs document from it.
  • Do it in Scheme.

    • We could adapt a parser combinator library like comparse (Chicken Scheme) (http://wiki.call-cc.org/eggref/5/comparse) and use the associated markdown parser, lowdown (http://wiki.call-cc.org/eggref/5/lowdown).
    • Or we could adapt the packrat library (http://wiki.call-cc.org/eggref/5/packrat) and use one of the PEG grammars above.
  • Improve/use the TeXmacs packrat parser. We already have a parser (for semantic editing) which is implemented in C++, while the grammars are described in Scheme. Right now I do not understand whether there is a way to obtain/generate a parse tree for a successful parse. If there is, then we just have to adapt one of the above grammars and transform the parse tree into an appropriate TeXmacs document.
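
Whichever parser is chosen, the last step in each option is the same: walk the syntax tree and emit a TeXmacs tree. The following is only a minimal sketch of that step, written in Python for brevity rather than in C++ or Scheme. The input tree shape (tuples of `(kind, children...)`), the node kinds, and the mapping to TeXmacs tag names such as `document`, `concat`, `section` and `strong` are all assumptions made for illustration; none of them comes from libsoldout, peg-markdown or lowdown.

```python
# Hypothetical markdown syntax tree: (kind, child, child, ...) tuples,
# with plain Python strings as text leaves.
# Assumed mapping from markdown node kinds to TeXmacs tag names.
TAG_MAP = {
    "doc": "document",
    "para": "concat",
    "h1": "section",
    "h2": "subsection",
    "strong": "strong",
    "emph": "em",
}


def to_texmacs(node):
    """Convert a syntax tree node into a nested list shaped like a Scheme tree."""
    if isinstance(node, str):
        return node
    kind, *children = node
    return [TAG_MAP[kind]] + [to_texmacs(child) for child in children]


def to_sexp(tree):
    """Serialize the nested list as an s-expression string."""
    if isinstance(tree, str):
        escaped = tree.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    head, *rest = tree
    return "(" + " ".join([head] + [to_sexp(t) for t in rest]) + ")"


md_ast = ("doc", ("h1", "Title"), ("para", "Some ", ("strong", "bold"), " text"))
print(to_sexp(to_texmacs(md_ast)))
# (document (section "Title") (concat "Some " (strong "bold") " text"))
```

A real converter would also need lists, links, code blocks and math, but the recursion would keep the same shape.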

@mdbenito
Contributor

mdbenito commented Jun 3, 2021

I think the path of least resistance is Scheme: a super-fast development cycle, no compilation toolchain to take care of (and easier multiplatform distribution), and decent-enough speed (with improvements maybe to come thanks to @mgubi ;)

Also, the internal representation as a "markdown Scheme tree" could be shared between both sides of the converter, which means that the current tm->md would greatly benefit, because it has, to put it mildly, grown rather "organically" from a few-days hack into an ugly monstrosity.
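
To make the idea of a shared intermediate tree concrete, here is a rough sketch of the serializer side: one function that renders such a tree back to markdown. It is in Python only for illustration, and the tree shape and node kinds (`doc`, `h1`, `para`, `strong`, `emph`) are hypothetical; they are not the representation used by the current tm->md code.

```python
def to_markdown(node):
    """Render a hypothetical intermediate markdown tree as markdown text."""
    if isinstance(node, str):
        return node
    kind, *children = node
    if kind == "doc":
        # Block-level children are separated by a blank line.
        return "\n\n".join(to_markdown(child) for child in children)
    body = "".join(to_markdown(child) for child in children)
    if kind == "h1":
        return "# " + body
    if kind == "para":
        return body
    if kind == "strong":
        return "**" + body + "**"
    if kind == "emph":
        return "*" + body + "*"
    raise ValueError("unknown node kind: " + kind)


tree = ("doc", ("h1", "Title"), ("para", "Some ", ("strong", "bold"), " text"))
print(to_markdown(tree))
```

With this layout, the md->tm direction would produce the same kind of tree from a parser, and the tm->md direction would produce it from a TeXmacs document, so only the two ends differ.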

That being said, my second choice would be the parser currently in TeXmacs, if this didn't mean upstream modifications that would have to be made very carefully. So instead I'd go for the simplest approach: libsoldout?

mdbenito added the "enhancement" and "help wanted" labels Jun 3, 2021
@mgubi
Author

mgubi commented Jun 4, 2021

It could be. The C/C++ way would be just a black box that converts a string into a TeXmacs or Scheme tree. Even if we use the peg/leg system, this just means generating the parser once. A priori these parsers have been used in the wild and are fairly complete.

But having a home-grown solution is also attractive. Developing, or at least understanding how to use (:)), the internal packrat parser would be useful for other parsing tasks (like syntax highlighting) or for parsing other formats. Being implemented in C++ makes it very fast. So maybe we can quickly obtain a Scheme tree from it, given a description of the grammar in Scheme. This would put most of the development on the Scheme side.
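
For readers unfamiliar with the technique, a packrat parser is essentially a PEG recursive-descent parser whose rule results are memoized per input position. The toy sketch below, in Python, shows how such a parser can return a parse tree for a successful parse. It is not the TeXmacs packrat parser and does not use its API; the grammar, the function names and the tree shape are all invented for illustration.

```python
from functools import lru_cache


def make_parser(src):
    """Build a memoizing (packrat-style) parser over `src` for a toy grammar:

        inline <- (strong / text)*
        strong <- "**" text "**"
        text   <- one or more characters up to the next "**"

    Each rule returns (tree, next_position), or None on failure.
    """

    @lru_cache(maxsize=None)  # memoize the rule result per position
    def text(pos):
        end = pos
        while end < len(src) and not src.startswith("**", end):
            end += 1
        return (src[pos:end], end) if end > pos else None

    @lru_cache(maxsize=None)
    def strong(pos):
        if not src.startswith("**", pos):
            return None
        inner = text(pos + 2)
        if inner is None or not src.startswith("**", inner[1]):
            return None
        return (("strong", inner[0]), inner[1] + 2)

    def inline(pos=0):
        children = []
        while pos < len(src):
            result = strong(pos) or text(pos)  # ordered choice
            if result is None:
                break
            node, pos = result
            children.append(node)
        return ("para", *children), pos

    return inline


tree, end = make_parser("Some **bold** text")()
print(tree)
# ('para', 'Some ', ('strong', 'bold'), ' text')
```

On input with an unmatched `**`, `inline` stops early and returns the position it reached, so the caller can tell a partial parse from a complete one.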

If we realise that the TeXmacs parser is still not versatile enough we can adapt one of the Scheme libraries above to our brand of Scheme, TeXmacs Scheme :)
