Published: 7th August 2020 DOI: 10.4204/EPTCS.320 ISSN: 2075-2180

# Proceedings 8th International Workshop onVerification and Program Transformation and 7th Workshop on Horn Clauses for Verification and Synthesis Dublin, Ireland, 25-26th April 2020

Edited by: Laurent Fribourg and Matthias Heizmann

 Preface Laurent Fribourg and Matthias Heizmann VPT'20 Invited Presentations Alberto Pettorossi, the Great Scientist and My Dear Friend Andrzej Skowron Transforming Programs with Alberto in the Rome of the Last Century Sophie Renault Theory Exploration: Conjecturing, Testing and Reasoning about Programs Moa Johansson Keynote Paper A Historical Account of My Early Research Interests Alberto Pettorossi 1 Invited Papers On the Complexity of the Universality and Inclusion Problems for Unambiguous Context-Free Grammars Lorenzo Clemente 29 From Well Structured Transition Systems to Program Verification Alain Finkel 44 From Big-Step to Small-Step Semantics and Back with Interpreter Specialisation John P. Gallagher, Manuel Hermenegildo, Bishoksan Kafle, Maximiliano Klemen, Pedro López García and José Morales 50 Cons-free Programs and Complexity Classes between LOGSPACE and PTIME Neil D. Jones, Siddharth Bhaskar, Cynthia Kop and Jakob Grue Simonsen 65 Prolog for Verification, Analysis and Transformation Tools Michael Leuschel 80 Transformational Verification of Quicksort Emanuele De Angelis, Fabio Fioravanti and Maurizio Proietti 95 Contributed Papers Generating Distributed Programs from Event-B Models Horatiu Cirstea, Alexis Grall and Dominique Méry 110 Optimizing Program Size Using Multi-result Supercompilation Dimitur Nikolaev Krustev 125 Distilling Programs to Prove Termination G.W. Hamilton 140 An Experiment Combining Specialization with Abstract Interpretation John P. Gallagher and Robert Glück 155 HCSV'20 Contributed Papers Towards an SMT-LIB Theory of Heap Zafer Esen and Philipp Rümmer 159 Horn2VMT: Translating Horn Reachability into Transition Systems Denis Bueno and Karem A. Sakallah 163 The Extended Theory of Trees and Algebraic (Co)datatypes Fabian Zaiser and C.-H. Luke Ong 167 Invited Paper Competition Report: CHC-COMP-20 Philipp Rümmer 197

# Preface

These proceedings are the result of merging the contributions to the VPT 2020 and HCVS 2020 workshops which should have taken place in Dublin in April 2020, co-located with the ETAPS joint conferences, but were cancelled due to the pandemic. The VPT workshop was organized in honour of Alberto Pettorossi on the occasion of his academic retirement from Università di Roma Tor Vergata.

The proceedings consist of a keynote paper by Alberto followed by 6 invited papers written by Lorenzo Clemente (U. Warsaw), Alain Finkel (U. Paris-Saclay), John Gallagher (Roskilde U. and IMDEA Software Institute) et al., Neil Jones (U. Copenhagen) et al., Michael Leuschel (Heinrich-Heine U.) and Maurizio Proietti (IASI-CNR) et al.. These invited papers are followed by 4 regular papers accepted at VPT 2020 and the papers of HCVS 2020 which consist of three contributed papers and an invited paper on the third competition of solvers for Constrained Horn Clauses.

In addition, the abstracts (in HTML format) of 3 invited talks at VPT 2020 by Andrzej Skowron (U. Warsaw), Sophie Renault (European Patent Office) and Moa Johansson (Chalmers U.), are included after this preface.

The editors would like to warmly thank the authors of these contributions, as well as the members of the PCs of VPT 2020 and HCVS 2020 for their constructive reviews of the regular papers.

Last but not least, the editors are grateful to Andrei Nemytykh (Program Systems Institute of RAS) and Alexei Lisitsa (U. Liverpool), organizers of VPT 2020, as well as Maurizio Proietti (IASI-CNR), Alberto's unfailing friend and colleague, who provided invaluable assistance in the making of the proceedings. We also sincerely thank EPTCS publisher for its final help with the publication.

We are ourselves pleased to wish Alberto Pettorossi all the best for his new life which, as we know, will continue to be rich in innumerable observations, indispensable for the progress of our community of program transformers and verifiers.

Grazie Alberto!

Done on June 15, 2020
Laurent Fribourg (CNRS) and Matthias Heizmann (University of Freiburg)

# Alberto Pettorossi, the Great Scientist and My Dear Friend

Andrzej Skowron (University of Warsaw - Poland)

It is not easy to summarize our thoughts about a person whom we have known for a long time.

I have known Professor Alberto Pettorossi for almost 40 years. Our first meeting was somewhere in Poland, during a conference. I remember our trip to forest where it was possible to find nice blueberries. While returning from this trip we stopped in a small cafeteria and ordered ice creams. After adding to them blueberries they became quite special. Looking what we had, everybody entering to the cafeteria started trying to order ice creams with blueberries as well. Of course they ended up in unsuccessful attempt. Perhaps, this was the reason why other customers were looking at us as if we were very special guests.

I was very happy when Alberto invited me to cooperate with him on program transformation techniques related to functional programs. There was quite a long period during which our cooperation was intensive with frequent meetings in Rome, in Warsaw, and in many conferences. Together we have written about 20 papers. This was recognized by the Polish Mathematical Society with the offer of the Janiszewski award in 1988.

Alberto was and is always very much dedicated to perfection in the research and life. This was most probably the reason why he decided to work for his PhD thesis abroad, i.e., in Edinburgh, one of the top research universities in Europe. At that time PhD studies in Italy did not exist. Alberto's attitude towards absolute perfection and precision in research also created some problems in our cooperation. For example, dealing with program optimization for computing complex functions one can expect some limitations on the application of existing techniques provided by mathematical logic. I remember, that during my stay as a visiting professor in the U.S.A., we had recognized this fact while writing a paper for the AAAI conference. However, Alberto was not very much interested in this direction of research. Nowadays, while using Data Science methods for analyzing large data coming from interactions with complex phenomena, we understand more and more that there is also room for the use of experimental data and methods based on soft reasoning, which is different from the method based on the well developed notion of deduction in mathematical logic. This new reasoning method, may be called adaptive judgment, should also take into account issues of induction, abduction, experience, perception, action, etc. Alberto, following his way based on the absolute perfection, in particular precision, decided to continue his research dealing with transformation techniques based on logic programming, without using techniques for reasoning under uncertainty.

Starting from the very first meeting, I recognized Alberto not only as a very dedicated research scientist and a very talented researcher, but also as a great human being, who is totally abided by the principles of Catholic religion. During the whole long period of our friendship I can't recall a moment when he turned away from these values. Often, I was telling him that I would treat him as a saint to degree one. Alberto realized that his way to perfection in life is ahead of many people including myself. For example, during our trip to Castel Gandolfo, when we decided to return to Rome by jogging. Even if it has happened many years ago, I still remember that before I arrived to Alberto's house, he was already able to take a shower and a nap. We have experienced so many miracles during our meetings in Rome, Warsaw, Edinburgh, Charlotte and many other places that we decided to write a book about them. Unfortunately, this has not yet been done.

I was, only very rarely, able to astonish Alberto; he rather has had a critical opinion about me, and has been trying to make me better. One case is related to his visit to the Vatican, with some priests and bishops, for an audience with Pope John Paul II, now our great saint. When the Pope was passing by close to Alberto, he stopped and touched his face saying: You should eat more because you are very slim. After a while, Alberto answered: I am cooperating with a professor from Poland. What is his name?, asked the Pope. When Alberto said my name, the Pope was said: I know him, I know. Alberto was shocked and told this to his mother. Then my position in her eyes grew up very much. I will always remember her great hospitality and openness during my visit in Alberto's family house in Loreto. Unfortunately, this professor, about whom the Pope was speaking, was another professor from Cracow, much older than me, a well-known specialist in biology and colleague of our Pope.

During meetings with Alberto it was always possible to observe his talent in adding some higher values to meetings, and his talent for infecting participants in the meeting with a kind of joy and enthusiasm, together with lot of stimuli for research. In particular, I experienced his great help during the martial law in Poland, and not only that, when he supported very much my family both on the spiritual and material level.

Many thanks, my friend and younger brother, Alberto. I have learned from you so much which is beyond the problems related to the program transformation techniques.

Andrzej Skowron

Warsaw, May 14, 2020

# Transforming Programs with Alberto in the Rome of the Last Century

Sophie Renault (European Patent Office - The Hague, The Netherlands)

When I heard that Alberto was retiring this year, I realized that I had never thought that such a thing would ever happen. Because to me, Alberto has never shown any signs of having made his time or getting tired of his teaching and researching. On the contrary, Alberto has always been looking forward to new challenges, and always with a big smile on his face. And also because Alberto is just not changing over the years. It does not matter how long you haven't seen him, when you meet him again, he will be the same, kind, friendly, smart and alert person who remembers the things you have shared together, in fact remembering everything. Alberto's memory is just impressive.

In 1996-1998, I had the chance to do a post-doctorate program with Università di Roma Tor Vergata and the IASI-CNR with Alberto and Maurizio. Before that, I had met them in a couple of workshops in relation to the LOPSTR network while I was doing my PhD with Pierre Deransart at INRIA on verification of logic programs.

In Roma, Alberto and Maurizio introduced me to their work on unfolding/folding. They had been exploring that field of research for several years already. Together we have been working on partial deduction in particular via unfold/fold rules [1] and had a nice result on the KMP string matching example [2]. Our work was presented by Maurizio in the POPL'97 symposium [3] which was held in Paris La Sorbonne that year.

Apart from the scientific work in Roma with Alberto and Maurizio, I have great memories of my daily life at the IASI during my stay there. First of all going to work every day passing by the Collosseo on my motorino to arrive at Viale Manzoni, where the IASI was located at that time, was a great start to the day. Arriving there at the entrance of the building, there was Mauro, a charismatic guy in his uniform of carabinieri who would check on everyone entering the building. Very friendly guy, but initially quite intimidating as you are not usually used to seeing a man in uniform with a gun checking on you when you arrive at work in a research centre!

Lunch time with Alberto and Maurizio at the IASI was a special moment of the day. We normally had a sandwich in Maurizio's room, the three of us, or the four of us sometimes when Laurent Fribourg was visiting the IASI during his sabbatical year in 1997. During lunch time, Alberto was very talkative and always had something interesting to tell. Most of the time about work, but not only. It could have been about anything in fact. Every aspect of science was a source of interest to him and he would love to share his knowledge. He would use the white board and make drawings to explain things about physics for example. He was talented at explaining difficult things and I could easily tell how great a teacher he was at university. A colleague of mine nowadays, at the European Patent Office, Elisa Cilia, was one of Alberto's student at University di Roma Tor Vergata, and confirmed to me how good Alberto was for teaching and sharing his passion for computer science. Both Alberto and Maurizio are also very knowledgeable in history of art. I have learnt a lot from them. Last time I saw them in Roma, we went to visit the Basilica San Clemente which is not far from the IASI. San Clemente is of special importance among churches in Rome, for it is a unique example of a building that illustrates a sequence of epochs. One can see example of Roman architecture on three levels, from classical antiquity through the early Christian Era, to the late Middle Ages. One can also see spectacular frescoes by the Florentine artists Masolino and Masaccio in the upper church. During the visit with Alberto and Maurizio, I have been astonished by the amount of knowledge they both had on the subject-matter.

During lunch time also, Alberto liked to tell anecdotes about people he had met during his stay in Edinburgh at the beginning of his career, more generally about people he had met throughout his years as a researcher. He inevitably had something nice to say about all those people and always showed great respect towards their work. In telling stories, Alberto could be carried away and be so funny at times. We laughed a lot. His enthusiasm, optimism, and willingness to make you feel good in what you are doing is what I remember most about Alberto during my stay in Roma. I have learnt from him that if you contemplate doing something, whatever it is, then you need to put your best into it, otherwise the whole thing is pointless. And indeed, Alberto has always been committed in everything he is doing.

Beyond work, I have been very lucky to share private moments with Alberto in Italy. He would share about his family, especially about his old mum who was living in Loreto (province of Ancona, Marche) at the time. Alberto would visit her almost at every weekend. He always had nice stories to share about Loreto, about his relatives, his mum, nephews and nieces... The family was important. And Alberto had been very supportive when my Mum became ill and when she eventually passed away. He told me she would be in his prayers. He also gave me something for her, so that she could pray Virgin Mary of Loreto. He would write to her (in a very good French by the way). Although my Mum never met him in person, she always asked about Alberto and asked me to send her love. As we all know, Alberto is a sporty man, very keen on climbing, and very fit. A couple of times he took me with him and his group of friends from the church for a walk in the slopes of the Abbruzzi. Nice memories.

What I also very much appreciate in Alberto is his love for children and the way he interacts with them. I have seen him many times with Marco and Massimilano, Maurizio and Vincenza's boys, in Roma, when they were small but also more recently. Alberto is for them lo zio, and indeed, he was behaving as a loving uncle to them. My children also had the chance to meet Alberto in Roma a couple of years ago, and have a good memory of him. During a meal at Maurizio's, Alberto was telling them stories and playing tricks they still remember.

Now that Alberto is officially retiring, nothing much will change in my view. Alberto will probably carry on keeping an eye on what is happening in the scientific world, and of course especially in the field of computer science. No doubt that he will be remembered for a long time for his great contribution to the field. I am just curious to see where he will decide to live as a pensioner. Roma? Loreto?

Well, Alberto, I would like to say how much inspiring you have been to me. And I am very happy to meet you here in Dublin. I wish you all the best for the future and I hope to see you again. Happy retirement!

Un abbraccio forte !

### References

[1] Alberto Pettorossi, Maurizio Proietti and Sophie Renault (1996): Enhancing Partial Deduction via Unfold/Fold Rules. LOPSTR'96, Stockholm, Sweden, pp. 146–168. doi:10.1007/3-540-62718-9_9.

[2] Alberto Pettorossi, Maurizio Proietti and Sophie Renault (1996): How to Extend Partial Deduction to Derive the KMP String-Matching Algorithm from a Naive Specification (Poster Abstract). JICSLP 96, Bonn, Germany, p. 539.

[3] Alberto Pettorossi, Maurizio Proietti and Sophie Renault (1997): Reducing Nondeterminism while Specializing Logic Programs. POPL'97, Paris, France, pp. 414–427. doi:10.1145/263699.263759.

# Theory Exploration: Conjecturing, Testing and Reasoning about Programs

Moa Johansson (Chalmers University - Gothenburg, Sweden)

Theory Exploration is a technique for automatically discovering (and proving) interesting properties about programs. This has successfully been used for automation of inductive proofs, where theory exploration can be used to discover auxiliary lemmas needed in all but the simplest inductive proofs. A richer background theory, consisting of additional equational properties, is constructed automatically, allowing harder theorems to be proved automatically. I believe theory exploration can also have applications in program transformation and I look forward to discuss such possibilities at this workshop.

We have developed several systems for theory exploration, the most recent ones being QuickSpec and Hipster. QuickSpec is a tool written in Haskell concerned with the task of efficiently conjecturing candidate properties about functional programs, using term generation and property based testing. It produces a set of candidate properties, which can either be directly presented to the user, or passed to an automated theorem prover. Hipster is our link to such a theorem prover, and is built on top of the proof assistant Isabelle/HOL. Hipster communicates with QuickSpec, and will attempt to automatically prove candidate properties, possibly using other discovered properties as lemmas. Naturally, we only want to present interesting properties to the user, not swamping the output with trivial equations. But how do we define what is to be considered interesting? In Hipster, the user can control this by configuring the system with two reasoning strategies: If the system can prove something by the easy reasoning strategy (e.g. simplification) we may discard it as uninteresting, while properties requiring proof by the hard reasoning strategy (e.g. induction) are presented to the user.

I will also mention some of our ongoing work, where we consider extensions of theory exploration for very large theories (or programs), which otherwise are computationally expensive.

# Towards an SMT-LIB Theory of Heap

Zafer Esen (Uppsala University)
Philipp Rümmer (Uppsala University)

### Introduction

Constrained Horn Clauses (CHC) are a convenient intermediate verification language that can be generated by several verification tools, and that can be processed by several mature and efficient Horn solvers. One of the main challenges when using CHC in verification is the encoding of program with heap-allocated data-structures: such data-structures are today either represented explicitly using the theory of arrays (e.g., [4, 2]), or are transformed away with the help of invariants or refinement types (e.g., [6, 1, 5, 3]). Both approaches have disadvantages: they are low-level, do not preserve the structure of a program well, and leave little design choice with respect to the handling of heap to the Horn solver. This abstract presents ongoing work on the definition of a high-level SMT-LIB theory of heap, which in the context of CHC gives rise to standard interchange format for programs with heap data-structures. The abstract presents the signature and intuition behind the theory. We are still working through the theory axioms, which will be published in a subsequent paper. The abstract is meant as a starting point for discussion, and request for comments.

A theory of heap has to cover a number of functionalities, including: (i) representation of the type system associated with heap data, and of pointers; (ii) reading and updating of data on the heap; (iii) handling of object allocation. In our proposal, we use algebraic data-types (ADTs), as already standardised by the SMT-LIB, as a flexible way to handle (i); for (ii) our theory offers operations akin to the theory of arrays, and (iii) is provided by additional allocation functions. The theory is deliberately kept simple, so that it is easy to add support to Horn solvers: a Horn solver can, for instance, internally encode heap using the existing theory of arrays, or implement transformational approaches like [2, 5]. Since we want to stay high-level, arithmetic operations on pointers are excluded in our theory, as are low-level tricks like extracting individual bytes from bigger pieces of data through pointer manipulation. (Object-local pointer arithmetic could be handled in a verification system before encoding a program as CHCs.)

### Syntax and Semantics of the Theory

In order to explain the syntax and semantics of the theory, we start with a simple example program. The example defines a pair named Node in a C-like language, and then creates a linked list on the heap using this type. The new syntax is shorthand for a memory allocation function, which allocates memory on the heap using the passed object, which in this case is a Node.

1 struct Node { int data; Node* next; };
2 Node* list = new Node(0, NULL);
3 list->next = new Node(list->data + 1, NULL);


To encode the program using our theory, first a heap has to be declared that covers the program types, as shown in the upper half of Listing 1. Each heap declaration introduces several sorts: a sort Heap of heaps; a sort Address of heap addresses, and a number of data-types (i.e., Algebraic Data Types) used to represent heap data. The data-type declaration is integrated into the heap declaration because heap objects naturally have to store heap addresses, but is otherwise equivalent to a declare-datatypes command in SMT-LIB 2.6.

Data-types are used to represent the hierarchy of types on the heap, and allow us to have just a single sort for all objects on the heap. Consider the example program, in which type int maps to the mathematical integers Int, pointer types are stored using sort Address, and we can represent the Node type using a data-type Node with two fields data and next. We can assume that the only types stored on the heap are Int, Node, and Address. Again using data-types, these sorts can be wrapped to be represented as a single Object sort. In the declaration of the Heap in Listing 1, the Object sort is defined as a data-type with wrappers for each possible sort on the heap (lines 7–9), and Node as a record with a single constructor (line 10). To get back the data stored in an object, a single selector is defined for each wrapped sort, which is named a getter.

Data-types are a clean and flexible way to represent types in programs. In C, note that nested structs, enums, and unions can all easily be mapped to data-types, while recursive data-types could be used for strings or arrays (although it is probably more efficient to natively integrate the theory of arrays for this purpose). Inheritance in Java-like languages can be represented through a parent field added to sub-classes, and multiple-inheritance in C++ through multiple parent fields; sub-typing becomes explicit.

1  (declare-heap
2   Heap                                   ; name of the heap sort to declare
3   Address                                ; name of the Address sort to declare
4   Object                                 ; object sort, usually one of the data-types
5   (WrappedInt 0)                         ; the default object stored at unallocated addresses
6   ((Object 0) (Node 0))                  ; data-types
7   (((WrappedInt  (getInt Int))           ; constructors for sort Object
8     (WrappedNode (getNode Node))
9     (WrappedAddr (getAddr Address)))
10    ((Node (data Int) (next Address))))) ; constructors for sort Node
11 ; CHC below are given in Prolog notation instead of SMT-LIB syntax for brevity.
12 I1(emptyHeap).
13 I2(ar._1, ar._2)          :- I1(h), ar = allocate(h, WrappedNode(Node(0, NULL))).
14 I3(h, list, n)            :- I2(h, list), n = getNode(read(h, list)).
15 false                     :- I2(h, list), !valid(h, list).
16 false                     :- I2(h, list), !isWrappedNode(read(h, list)).
17 I4(ar._1, list, n, ar._2) :- I3(h, list, n),
18                              ar = allocate(h, WrappedNode(Node(data(n)+1, NULL))).
19 I5(h1, list)              :- I4(h, list, n, p),
20                              x = data(getNode(read(h,list))),
21                              h1 = write(h, list, WrappedNode(Node(x, p))).
22 false                     :- I4(h, list, n, p), !valid(h, list).
23 false                     :- I4(h, list, n, p), !isWrappedNode(read(h, list)).


Listing 1: An SMT-LIB declaration of a Heap and CHC for the example program given in Prolog syntax. Note that all variables are implicitly universally quantified with the correct sort (e.g., ∀h: Heap etc.). The clauses are given in Prolog notation for the sake of brevity, which could also be written in SMT-LIB.

#### Encoding in Horn Clauses

The example program can be encoded in Horn clauses using Prolog notation as given in Listing 1, and the used operations are introduced further below.

Line 12 creates an empty heap term, and in line 13, memory is allocated on the heap using a zero-initialised Node Object. Allocation returns a pair: the new heap after allocation (ar._1), and the address of the allocated location (ar._2), which in SMT-LIB can be handled through a further data-type AllocationResult. The returned Address value ar._2 is assigned to list.

On line 14, list is read from the heap and the returned Node is stored in the temporary variable n, and lines 15–16 assert that this heap access is safe. In C, such checks are relevant for each read and write, due to the possibility of pointer casts and of uninitialised pointers, whereas in Java corresponding checks should be added for reference typecasts. Line 15 can be read as: “It is not the case that I2 holds and Heap h at Address list is unallocated.” Line 16 ensures that the read Object is actually of type Node, using the tester available for each data-type constructor. Lines 22–23 contain similar checks for the other heap accesses in the program.

The clause at lines 17-18 allocates memory for the second Node, and assigns the returned Address to the temporary variable p (i.e., the last argument of I4). Lines 19–21 update the list's next field, by creating a new Node where the data field remains the same, and the next field is assigned the value of p. The constructed Node is then wrapped and written to the the Address pointed to by list.

Representing all pointer types using the single sort Address simplifies the theory; however, an Address has no associated type information. This is closer in semantics to languages like C, where casts between arbitrary pointer types are possible; in languages with a stronger type system, like Java, some memory safety assertions (e.g., lines 16 and 23 in Listing 1) can be turned into assumptions, since those properties are guaranteed by the type system.

#### Functions and Predicates of the Theory

nullAddress :	()	                 → Address
emptyHeap   :	()	                 → Heap
allocate    :	Heap x Object            → Heap x Address (AllocationResult)
read	    :	Heap x Address           → Object
write       :	Heap x Address x Object	 → Heap
valid       :	Heap x Address	         → Bool


Functions and predicates of the theory are given above. Function nullAddress returns an Address which is always unallocated/invalid, while emptyHeap returns the Heap that is unallocated everywhere.

Function allocate takes a Heap and an Object, and returns an AllocationResult. AllocationResult is a data-type representing the pair Heap,Address. The returned Heap at Address contains the passed Object, with all other locations unchanged.

Functions read and write are similar to the array select and store operations; however, unlike an array, a heap is a partial mapping from addresses to objects. This means the read and write functions only behave as their array counterparts if the heap is allocated at the address being read or written to.

The behavior of accessing unallocated memory locations is undefined in many languages. Regarding reads, we left the choice to the user of the theory, who can designate a default Object in the heap declaration (line 5 in Listing 1). This Object, which we named defObj in our axioms, is returned on an invalid read. The function write normally returns a new Heap if the address that is being accessed was allocated. If not, then the original Heap is returned without any changes. (An alternative semantics would be to return the emptyHeap; however, returning the same heap simplifies the read-over-write axiom by removing the validity check from the left-hand side of the implication). Validity of a write can be checked via memory-safety assertions as shown in Listing 1.

We propose a further short-hand notation nthAddress, which is not listed in the list of functions and predicates of the theory, but is useful when presenting satisfying assignments. It is used to concisely represent Address values which would be returned after i allocate calls. This becomes possible with a deterministic allocation axiom added to the theory.

### References

1. Nikolaj Bjørner, Kenneth L. McMillan & Andrey Rybalchenko (2013): On Solving Universally Quantified Horn Clauses. In: Francesco Logozzo & Manuel Fähndrich: Static Analysis - 20th International Symposium, SAS 2013, Seattle, WA, USA, June 20-22, 2013. Proceedings, Lecture Notes in Computer Science 7935. Springer, pp. 105–125, doi:10.1007/978-3-642-38856-9_8.
2. Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi & Maurizio Proietti (2017): Program Verification using Constraint Handling Rules and Array Constraint Generalizations. Fundam. Inform. 150(1), pp. 73–117, doi:10.3233/FI-2017-1461.
3. Temesghen Kahsai, Rody Kersten, Philipp Rümmer & Martin Schäf (2017): Quantified Heap Invariants for Object-Oriented Programs. In: Thomas Eiter & David Sands: LPAR-21, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Maun, Botswana, May 7-12, 2017, EPiC Series in Computing 46. EasyChair, pp. 368–384, doi:10.29007/zrct.
4. Anvesh Komuravelli, Nikolaj Bjørner, Arie Gurfinkel & Kenneth L. McMillan (2015): Compositional Verification of Procedural Programs using Horn Clauses over Integers and Arrays. In: Roope Kaivola & Thomas Wahl: Formal Methods in Computer-Aided Design, FMCAD 2015, Austin, Texas, USA, September 27-30, 2015. IEEE, pp. 89–96. Available at https://dl.acm.org/doi/10.5555/2893529.2893548.
5. David Monniaux & Laure Gonnord (2016): Cell Morphing: From Array Programs to Array-Free Horn Clauses. In: Xavier Rival: Static Analysis - 23rd International Symposium, SAS 2016, Edinburgh, UK, September 8-10, 2016, Proceedings, Lecture Notes in Computer Science 9837. Springer, pp. 361–382, doi:10.1007/978-3-662-53413-7_18.
6. Patrick Maxim Rondon, Ming Kawaguchi & Ranjit Jhala (2008): Liquid types. In: Rajiv Gupta & Saman P. Amarasinghe: Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, Tucson, AZ, USA, June 7-13, 2008. ACM, pp. 159–169, doi:10.1145/1375581.1375602.

# Horn2VMT: Translating Horn Reachability into Transition Systems

Denis Bueno (University of Michigan)
Karem A. Sakallah (University of Michigan)


### 1 Introduction

Relational transition system models have for decades been the bread and butter representation of hardware and software model checking. The VMT format1 provides a means of describing sequential model checking problems using SMT-LIB-compatible combinational formulas. MathSAT [4], EUForia [3], and other tools support VMT natively, but unfortunately they do not support Horn clause formats. While it is well known that linear Horn clauses can be expressed as transition systems, we are not aware of a precise description in the literature. This paper contributes a procedure, proof of correctness, and prototype implementation for translating linear Horn clauses into VMT. Our procedure is able to convert SV-COMP 2 and SeaHorn [6] benchmarks into VMT for use with various model checkers. Our prototype implementation will be available at https://github.com/dbueno/horn2vmt.

### 2 Background

Consider a first-order language with equality with signature $$\mathcal{S}$$ and two common sorts, $$\BoolSort$$s and $$\IntSort$$s. Our language also uses a set of relation symbols $$\mathcal{R}$$. We say a relation $$R$$ of arity $$n=\abs{R}$$ has $$n$$ places; each place refers to a dimension of the relation. For instance, in the formula $$R(1, 2)$$, $$1$$ is in the first place of the relation and $$2$$ is in the second. The semantics of these formulas is standard. A Horn clause or rule is a universally quantified formula with a body part and a head part: $$\forall x_1,\ldots ,x_m. \underbrace{\bigwedge _{k=1}^j P_k(\overline{x}_k) \wedge \phi (x_1,\ldots ,x_m)}_{\text{body}} \Rightarrow \head$$ where for every $$k$$, $$P_k \in \mathcal{R}$$ is an uninterpreted predicate symbol, $$\overline{x}_k \subseteq \{ x_1, \ldots , x_m \}$$, and $$\abs{\overline{x}_k} = \abs{P_k}$$ [8]. The constraint $$\phi$$ is a formula over interpreted atoms. $$\head$$ must either be an application of an uninterpreted predicate or an interpreted formula. In this paper, $$j=1$$, which corresponds to linear Horn clauses.

A transition system [5, 2] is a tuple $$(X, Y, I, T)$$ consisting of (non-empty) sets of state variables $${{X}} = \{{{x_1}, \ldots ,{x_n}} \}$$ and corresponding next-state variables $$X' = \{ x_1', x_2', \ldots , x_n'\}$$, a (possibly empty) set of input variables $${{Y}} = \{{{y_1}, \ldots ,{y_m}} \}$$, and two formulas: $$I(X)$$, the initial states, and $$T(X,Y,X')$$, the transition relation. For every formula $$\sigma$$, the set $$\Vars (\sigma )$$ denotes the set of state variables free in $$\sigma$$. The $$X$$ in $$\sigma (X)$$ indicates that the free variables in $$\sigma$$ are drawn solely from the set $$X$$; we may omit the argument and write $$\sigma$$. $$\Prime (\sigma (X))$$ stands for the formula $$\sigma (X')$$, that is, all state variable occurrences are replaced with primed (i.e., next state) state variables. In this paper, we will also provide a property formula $$P(X)$$ that we wish to prove is an invariant of $$T$$. We write $$\sigma \xrightarrow{T}{\omega }$$ if each state in $$\sigma$$ transitions to some state in $$\omega$$ under $$T$$, i.e., $$\sigma \wedge T \models \omega '$$. A transition system execution is a (possibly-infinite) sequence of transitions $$\sigma _0 \xrightarrow{T}{\sigma _1} \xrightarrow{T}{\sigma _2}, \ldots$$ such that $$\sigma _0 \models I$$. Every reachable state occurs in some execution.

### 3 Translating Horn to Transition Systems

We begin with a motivating example: a program expressed as Horn clauses which we will translate into a transition system. The program's Horn clauses are defined over interpreted symbols $$\{<, +\}$$ and relation symbols $$\mathcal{R} = \{E, L, M, U \}$$; relations model the program's control locations and state (see [1] for program encoding details): \begin{align} \text{true} &\Rightarrow E \label{eq:horn-rule1} \\ E &\Rightarrow L(0) \label{eq:horn-rule-b} \\ \forall x.\ L(x) \wedge (x<5) &\Rightarrow L(x+3) \label{eq:horn-rule2} \\ \forall x.\ L(x) \wedge \neg (x<5) &\Rightarrow M(x) \label{eq:horn-rule4} \\ \forall x.\ M(x) \wedge \neg (x<7) &\Rightarrow U \label{eq:horn-rule5} \end{align}

In order to use a VMT-capable model checker to answer the question, "is the relation $$U$$ inhabited?" we show below how to encode Horn clauses $$\eqref{eq:horn-rule1}$$-$$\eqref{eq:horn-rule5}$$ as a transition system $$\mathbb{A}=(X,\emptyset ,I,T)$$ where $$X=\{ \ell _E, \ell _L, \ell _M, \ell _{U}, P_{L,1}, P_{M,1} \}$$, $$I=(\neg \ell _E \land \neg \ell _L \land \neg \ell _M \land \neg \ell _U)$$, and $$T$$ is defined below. The property $$P=(\neg \ell _U)$$. The variables $$\ell _E, \ell _L, \ell _M, \ell _U$$ are Boolean relation variables that correspond to the relation symbols in the Horn clauses. $$P_{L,1},P_{M,1}$$ are integer place variables that correspond to elements inhabiting Horn clause relations. We use the function $$\Preserve{S} \equiv (\bigwedge _{x\in S} x'=x)$$ to express that values of variables in $$S$$ are preserved, i.e., they don't change. $$\mathbb{A}$$'s transition relation $$T$$, then, is defined as the disjunction of the following constraints: \begin{align} (\ell _E' \land \text{} &\Preserve{X\setminus \{\ell _E\}}) \label{eq:t-entry}\tag{\ref{eq:horn-rule1}*} \\ (\ell _E \land \ell _{L}' \land (P_{L,1}'=0) \land \text{} &\Preserve{X\setminus \{\ell _{L},P_{L,1} \}}) \label{eq:X0-update}\tag{\ref{eq:horn-rule-b}*} \\ (\ell _{L} \land (P_{L,1} < 5) \land \ell _{L}' \land (P_{L,1}'=P_{L,1}+3) \land \text{} &\Preserve{X\setminus \{\ell _{L},P_{L,1}\}}) \label{eq:trans-rule-2}\tag{\ref{eq:horn-rule2}*} \\ (\ell _{L} \land \neg (P_{L,1} < 5) \land \ell _{M}' \land (P_{M,1}'=P_{L,1}) \land \text{} &\Preserve{X\setminus \{\ell _M,P_{M,1}\}}) \label{eq:X1-update}\tag{\ref{eq:horn-rule4}*} \\ (\ell _M \land \neg (P_{M,1}<7) \land \ell _U' \land \text{} &\Preserve{X\setminus \{\ell _U\}}) \label{eq:U-update}\tag{\ref{eq:horn-rule5}*} \end{align}

Each disjunct of $$T$$ corresponds to a single Horn rule; $$\eqref{eq:t-entry}$$ corresponds to $$\eqref{eq:horn-rule1}$$, $$\eqref{eq:X0-update}$$ to $$\eqref{eq:horn-rule-b}$$, and so on. It is possible in $$\mathbb{A}$$ to reach states $$(\ell _L\wedge P_{L,1}=\mathbf{0})$$, $$(\ell _L\wedge P_{L,1}=\mathbf{3})$$, and $$(\ell _L\wedge P_{L,1}=\mathbf{6})$$3 , meaning $$\{0,3,6\}\subseteq L$$. Moreover, every reachable state satisfies $$P$$, implying that clauses $$\eqref{eq:horn-rule1}$$-$$\eqref{eq:horn-rule5}$$ are not satisfiable.

#### 3.1 General Translation

We now present our general translation from set of $$n$$ linear Horn clauses over $$\mathcal{R}$$ into a transition system $$\mathbb{G}=(X,Y,I,T)$$ such that a reachability query (i.e., whether a relation is derivable) holds if and only if a safety property on $$T$$ fails. Without loss of generality, we assume each Horn clause head is a relation occurence and that we wish to solve a single query, a 0-ary relation $$U$$.4

The states of the resulting transition system are defined over a finite set of state variables $$X=\{\ell _R ~|~ R \in \mathcal{R}\} \cup \{ \RelPlaceVar{R}{i} ~|~ R \in \mathcal{R}, 1 \le i \le k \text{ where } R \text{ has arity } k \}$$: Boolean relation variables $$\ell _R$$ and place variables $$\RelPlaceVar{R}{i}$$; and fresh primary inputs of the form $$\{ \InputVar{j}{x} ~|~ 1 \le j \le n \}$$, as explained below. Horn clauses are translated with the help of the syntactic mapping $$\llbracket \cdot \rrbracket$$ defined over quantifier-free formulas. Let (possibly-subscripted) $$e,f,g,s,t$$ be expressions: \begin{align} \llbracket x \rrbracket &\equiv \InputVar{i}{x} \qquad \text{quantified variable $x$ occurs in rule $i$}\\ \label{eq:horn2vmt-relation} \llbracket R(x_1,\ldots ,x_k) \rrbracket &\equiv \ell _R \wedge \RelPlaceVar{R}{1}= \llbracket x_1 \rrbracket \wedge \cdots \wedge \RelPlaceVar{R}{k}= \llbracket x_k \rrbracket \\ \llbracket F(e_1,\ldots ,e_k) \rrbracket &\equiv F(\llbracket e_1 \rrbracket , \ldots , \llbracket e_k \rrbracket ) \qquad \text{for interpreted $F$}\\ \llbracket s = t \rrbracket \equiv \llbracket s \rrbracket = \llbracket t \rrbracket \qquad \llbracket f \wedge g \rrbracket &\equiv \llbracket f \rrbracket \wedge \llbracket g \rrbracket \qquad \llbracket \neg f \rrbracket \equiv \neg \llbracket f \rrbracket \end{align}

During translation, $$T$$ is treated as a disjunction. For every Horn clause with atom $$A$$ in its body, $$(\forall x_1,\ldots ,x_k. A \wedge \phi \Rightarrow \head )$$, add the following disjunct to $$T$$: $$\llbracket A \wedge \phi \rrbracket \wedge \Prime (\llbracket \head \rrbracket ) \wedge \Preserve{X\setminus \Vars (\head )}.$$ The initial state $$I= (\bigwedge _{\ell _R} \neg \ell _R)$$ and the property $$P = \neg \ell _U$$. By cases it can be tediously but straightforwardly shown that if a single Horn clause is satisfiable, the resulting transition system has a corresponding satisfying assignment.

#### 3.2 Proof of Correctness

The transition system has the property that the state $$\ell _R$$ is reachable in $$T$$ if and only if relation $$R$$ is derivable under the Horn clauses.

Direction ($$\Leftarrow$$): We proceed by induction on the length of the derivation of $$R$$. All relations are initially empty; this is correctly modeled by the definition of $$I$$. Length-1 derivations use a single Horn clause whose body contains no uninterpreted relations with head $$R$$. Such a clause translates to a transition that can be similarly satisfied without relation variables and which also satisfies $$\ell _R'$$.

Consider a relation $$R$$ derivable in $$n+1$$ steps. Its last derivation step involves some rule with head $$R$$; by the induction hypothesis, a state satisfying the body of this rule is reachable in $$T$$. Examining $$T$$ and the definition of $$\llbracket \cdot \rrbracket$$ allows us to conclude that the next state satisfies $$\ell _R'$$.

Direction ($$\Rightarrow$$): We proceed by induction on the number of transitions. Initially, no relation variable is reached, due to $$I$$'s definition. Next, suppose that $$I\wedge T \wedge \ell _R'$$ is satisfied. $$T$$ guarantees that the body of the Horn rule corresponding to the transition is satisfied, so $$R$$ is derivable.

Assume that some state $$\sigma \wedge \ell _R'$$ is reachable after $$n+1$$ steps. By the induction hypothesis, the current state, which took $$n$$ steps to reach, has a Horn derivation. The current state corresponds to the body of a rule with head $$R$$, since the only transitions to $$\ell _R'$$ in $$T$$ correspond to such rules. Therefore $$R$$ is derivable. QED.

Our translation takes linear time and uses space linear in the number of Horn clauses and the relation symbols. The number of state variables is proportional to the sum of the relation symbol arities. In addition, an $$n$$-step Horn derivation corresponds to an $$O(n)$$-length execution.

### Footnotes

3Boldface indicates the only difference among the three formulas.

4If we wish to solve a more complex query, for example $$P(1,4,x)$$ (for $$P\in \mathcal{R}$$), simply modify the Horn clauses as follows: add a fresh relation symbol $$U$$ to $$\mathcal{R}$$ and a rule $$(\forall x. P(1,4,x) \Rightarrow U)$$.

### References

[1] Nikolaj Bjørner, Arie Gurfinkel, Kenneth L. McMillan & Andrey Rybalchenko (2015): Horn Clause Solvers for Program Verification. In Lev D. Beklemishev, Andreas Blass, Nachum Dershowitz, Bernd Finkbeiner & Wolfram Schulte, editors: Fields of Logic and Computation II - Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday, Lecture Notes in Computer Science 9300, Springer, pp. 24-51, doi:10.1007/978-3-319-23534-9_2.

[2] Aaron R. Bradley & Zohar Manna (2007): Checking Safety by Inductive Generalization of Counterexamples to Induction. In: Formal Methods in Computer-Aided Design, IEEE Computer Society, pp. 173-180, doi:10.1109/FAMCAD.2007.15.

[3] Denis Bueno & Karem A. Sakallah (2019): euforia: Complete Software Model Checking with Uninterpreted Functions. In Constantin Enea & Ruzica Piskac, editors: Verification, Model Checking, and Abstract Interpretation - 20th International Conference, VMCAI 2019, Cascais, Portugal, January 13-15, 2019, Proceedings, Lecture Notes in Computer Science 11388, Springer, pp. 363-385, doi:10.1017/S1471068415000204.

[4] Alessandro Cimatti, Alberto Griggio, Bastiaan Schaafsma & Roberto Sebastiani (2013): The MathSAT5 SMT Solver. In Nir Piterman & Scott Smolka, editors: Proceedings of TACAS, LNCS 7795, Springer, doi:10.1007/978-3-642-36742-7_7.

[5] Edmund M. Clarke, Orna Grumberg & David E. Long (1994): Model Checking and Abstraction. ACM Trans. Program. Lang. Syst. 16(5), pp. 1512-1542, doi:10.1145/3828.3837.

[6] Arie Gurfinkel, Temesghen Kahsai, Anvesh Komuravelli & Jorge A. Navas (2015): The SeaHorn Verification Framework. In Daniel Kroening & Corina S. Pasareanu, editors: Computer Aided Verification, Lecture Notes in Computer Science 9206, Springer, pp. 343-361, doi:10.1007/978-3-642-31424-7_42.

[7] Roope Kaivola & Thomas Wahl, editors (2015): Formal Methods in Computer-Aided Design. IEEE, doi:10.5555/2893529.

[8] Anvesh Komuravelli, Nikolaj Bjørner, Arie Gurfinkel & Kenneth L. McMillan (2015): Compositional Verification of Procedural Programs using Horn Clauses over Integers and Arrays. In Kaivola & Wahl [7], pp. 89-96, doi:10.5555/2893529.