An experiment at writing a liblouis table parser based on Parsing expression grammar, pest and Rust
$ cargo run ~/src/liblouis/tables/bengali.cti
Rule: miscellaneous_rules
Text: "include braille-patterns.cti"
Rule: include_rule
Text: "include braille-patterns.cti"
Rule: character_definition_rules
Text: "letter \x0981 3"
Rule: letter_rules
Text: "letter \x0981 3"
Rule: character_definition_rules
Text: "letter \x0982 56"
Rule: letter_rules
Text: "letter \x0982 56"
Rule: character_definition_rules
Text: "letter \x0983 6"
Rule: letter_rules
Text: "letter \x0983 6"
...
Many if not most of the CVEs of liblouis are rooted in the hand crafted parsing functions of liblouis.
Moving to Rust for solid memory management and to parser generator like pest would be of tremendeous help to avoid the buffer overflow problems.
The grammar is pretty complete thanks to the work done on the tree-sitter for liblouis.
The parser can handle all of the tables from the liblouis distribution with the exception of the files that contain continuation lines (da-dk-g16-lit.ctb, da-dk-g26-lit.ctb and da-dk-g26l-lit.ctb). Also it fails on two tables that contain weird unicode chars inside a cotext or match opcode (hu-hu-g2.ctb and my-g2.ctb)
- You need the Rust tool chain.
If you have any improvements or comments please feel free to file a pull request or an issue.
Copyright (C) 2019 Swiss Library for the Blind, Visually Impaired and Print Disabled
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.