Semantic Parsing and Generation of Documents and Document Components (LORIA)
The product backlog itself is maintained in the issue tracker of this repository.
The primary goal of this WP is to provide the reversible semantic processor (parser and generator) component
of ModelWriter. This objective is broken down into the following sub-goals:
- To analyse the natural language processing requirements set by the technical documents of the Industrial Use Cases (What vocabulary? What styles? What document structures? What standards? etc.) so as to identify gaps in existing technology.
- To define a target semantic representation language for WP4’s Knowledge Base, for use by the "Model" functions of ModelWriter and the generator.
- To investigate and compare existing symbolic and statistical approaches to deep semantic parsing so as to identify the best option for ModelWriter.
- To integrate knowledge and constraint rules in a statistical machine learning framework for the semantic processing of higher-level semantic information such as argumentative or discourse structure.

Exploring complementary approaches to semantic parsing, data-to-text and text-to-text generation, we will develop a reversible semantic processor (parser and generator) for ModelWriter based on semantic representations (models) that are rich and precise enough to support both natural language generation and the type of knowledge-based reasoning required by the Industrial Use Cases (e.g., consistency checking and redundancy detection).
A reversible semantic processor that maps text to formal representations (ORM) and formal representations to text.
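
To make the idea of reversibility concrete, here is a minimal sketch of a round trip between text and a formal representation. It is a toy illustration only, not the ModelWriter processor: the `Fact` class, the single sentence template, and the `parse` and `generate` functions are all hypothetical names introduced for this example, and the binary fact stands in for a real ORM model.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Toy binary fact (hypothetical; not the actual ModelWriter/ORM format)."""
    subject: str
    predicate: str
    obj: str


# One assumed sentence template, "<Subject> <predicate> <Object>.",
# shared by the parser and the generator so that the two stay inverse.
_PATTERN = re.compile(r"^(?P<subject>\w+) (?P<predicate>\w+) (?P<obj>\w+)\.$")


def parse(sentence: str) -> Fact:
    """Map a sentence that fits the template to a Fact."""
    match = _PATTERN.match(sentence)
    if match is None:
        raise ValueError(f"unsupported sentence: {sentence!r}")
    return Fact(**match.groupdict())


def generate(fact: Fact) -> str:
    """Map a Fact back to a sentence using the same template."""
    return f"{fact.subject} {fact.predicate} {fact.obj}."


if __name__ == "__main__":
    sentence = "Pump feeds Tank."
    fact = parse(sentence)
    # Reversibility: text -> representation -> text is the identity here,
    # and so is representation -> text -> representation.
    assert generate(fact) == sentence
    assert parse(generate(fact)) == fact
    print(fact)
```

Sharing one template between both directions is what keeps this sketch reversible. A real processor would replace the template with a grammar or a learned model covering far more of the language.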