Adapting to the Behavior of Environments with Bounded Memory

Dhananjay Raju
(The University of Texas at Austin)
Rüdiger Ehlers
(Clausthal University of Technology)
Ufuk Topcu
(The University of Texas at Austin)

We study the problem of synthesizing implementations from temporal logic specifications that must work correctly in all environments that can be represented as transducers with a limited number of states. This problem was originally defined and studied by Kupferman, Lustig, Vardi, and Yannakakis, who provide an NP lower bound and a 2-EXPTIME upper bound for its complexity in the size of the transducer. We tighten the gap by providing a PSPACE lower bound, thereby showing that algorithms for solving this problem are unlikely to scale to large environment sizes. This result is somewhat unfortunate, as solving this problem enables tackling some high-level control problems in which an agent has to infer the environment behavior from observations. To address this limitation, we study a modified synthesis problem in which the synthesized controller must gather information about the environment's behavior safely. We show that the problem of determining whether the behavior of such an environment can be safely learned is only co-NP-complete. Furthermore, in such scenarios, the behavior of the environment can be learned using a Turing machine that requires at most polynomial space in the size of the environment's transducer.
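To make the notion of a bounded-memory environment concrete, the following is a minimal sketch of a deterministic finite-state transducer, assuming a Mealy-style model in which the environment reads one agent action per step and emits one observation. The class `Transducer`, the helper `within_bound`, and the toy door environment are all hypothetical names chosen for illustration; the paper's formal definition may differ in its details.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, List, Set, Tuple

State = Hashable
Symbol = Hashable


@dataclass(frozen=True)
class Transducer:
    """Deterministic Mealy-style transducer (illustrative assumption only)."""

    initial: State
    # Maps (state, input symbol) to (next state, output symbol).
    delta: Dict[Tuple[State, Symbol], Tuple[State, Symbol]]

    def states(self) -> Set[State]:
        """Collect every state mentioned in the transition table."""
        found = {self.initial}
        for (src, _), (dst, _) in self.delta.items():
            found.add(src)
            found.add(dst)
        return found

    def run(self, inputs: Iterable[Symbol]) -> List[Symbol]:
        """Feed a sequence of agent actions and return the emitted outputs."""
        state = self.initial
        outputs = []
        for symbol in inputs:
            state, out = self.delta[(state, symbol)]
            outputs.append(out)
        return outputs


def within_bound(env: Transducer, max_states: int) -> bool:
    """Check that the environment uses at most `max_states` states."""
    return len(env.states()) <= max_states


# Hypothetical two-state environment: a door that toggles when pushed.
door = Transducer(
    initial="closed",
    delta={
        ("closed", "push"): ("open", "open"),
        ("closed", "wait"): ("closed", "closed"),
        ("open", "push"): ("closed", "closed"),
        ("open", "wait"): ("open", "open"),
    },
)

observations = door.run(["wait", "push", "wait", "push"])
```

In this toy setting, an agent that only sees `observations` and knows that `within_bound(door, 2)` holds can rule out every candidate environment with at most two states that disagrees with what it has observed. This is only the intuition behind inferring behavior from observations, not the algorithm studied in the paper.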

In Pierre Ganty and Davide Bresolin: Proceedings 12th International Symposium on Games, Automata, Logics, and Formal Verification (GandALF 2021), Padua, Italy, 20-22 September 2021, Electronic Proceedings in Theoretical Computer Science 346, pp. 52–66.
Published: 17th September 2021.

ArXived at: http://dx.doi.org/10.4204/EPTCS.346.4