Generating Semantic Graph Corpora with Graph Expansion Grammar

Eric Andersson
(Umeå University)
Johanna Björklund
(Umeå University)
Frank Drewes
(Umeå University)
Anna Jonsson
(Umeå University)

We introduce Lovelace, a tool for creating corpora of semantic graphs. The system uses graph expansion grammar as a representational language, thus allowing users to craft a grammar that describes a corpus with desired properties. When given such a grammar as input, the system generates a set of output graphs that are well-formed according to the grammar, i.e., a graph bank. The generation process can be controlled via a number of configurable parameters that allow the user to, for example, specify a range of desired output graph sizes. Central use cases are the creation of synthetic data to augment existing corpora and use as a pedagogical tool for teaching formal language theory.

In Benedek Nagy and Rudolf Freund: Proceedings of the 13th International Workshop on Non-Classical Models of Automata and Applications (NCMA 2023), Famagusta, North Cyprus, 18th-19th September, 2023, Electronic Proceedings in Theoretical Computer Science 388, pp. 3–15.
Published: 15th September 2023.
