A Corpus-based Toy Model for DisCoCat

Stefano Gogioso
(University of Oxford)

The categorical compositional distributional (DisCoCat) model of meaning rigorously connects distributional semantics and pregroup grammars, and has found a variety of applications in computational linguistics. From a more abstract standpoint, the DisCoCat paradigm is predicated on the construction of a mapping from syntax to categorical semantics. In this work we present a concrete construction of one such mapping, from a toy model of syntax for corpora annotated with constituent structure trees, to categorical semantics taking place in a category of free R-semimodules over an involutive commutative semiring R.

In Dimitrios Kartsaklis, Martha Lewis and Laura Rimell: Proceedings of the 2016 Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science (SLPCS 2016), Glasgow, Scotland, 11th June 2016, Electronic Proceedings in Theoretical Computer Science 221, pp. 20–28.
Published: 2nd August 2016.

ArXived at: http://dx.doi.org/10.4204/EPTCS.221.3
