Published: 29th September 2020 DOI: 10.4204/EPTCS.327 ISSN: 2075-2180

# Proceedings of the Sixteenth International Workshop on theACL2 Theorem Prover and its Applications Worldwide, Planet Earth, May 28-29, 2020

Edited by: Grant Passmore and Ruben Gamboa

 Preface Grant Passmore and Ruben Gamboa Invited Presentation: Formal Verification of Arithmetic RTL: Translating Verilog to C++ to ACL2 David M. Russinoff 1 Iteration in ACL2 Matt Kaufmann and J Strother Moore 16 New Rewriter Features in FGL Sol Swords 32 Computing and Proving Well-founded Orderings through Finite Abstractions Rob Sumners 47 RP-Rewriter: An Optimized Rewriter for Large Terms in ACL2 Mertcan Temel 61 Quadratic Extensions in ACL2 Ruben Gamboa, John Cowles and Woodrow Gamboa 75 Type Inference Using Meta-extract for Smtlink and Beyond Yan Peng and Mark R. Greenstreet 87 Cauchy-Schwarz in ACL2(r) Abstract Vector Spaces Carl Kwan, Yan Peng and Mark R. Greenstreet 90 Minimal Fractional Representations of Integers mod M David Greve 93 Generating Mutually Inductive Theorems from Concise Descriptions Sol Swords 95 Ethereum's Recursive Length Prefix in ACL2 Alessandro Coglio 108 Isomorphic Data Type Transformations Alessandro Coglio and Stephen Westfold 125 Put Me on the RAC David Hardin 142

# Preface

ACL2 is an industrial-strength automated reasoning system, the latest in the Boyer-Moore family of theorem provers. Since 1999, the ACL2 Workshop series has been the major technical forum for users of the ACL2 theorem proving system to present research related to the ACL2 theorem prover and its applications.

It is our privilege to present the 16th ACL2 Workshop, which was originally planned to take place in Austin, Texas, the birthplace of ACL2. Due to the COVID-19 pandemic of 2020, the Workshop was reimagined as a virtual event. It should be mentioned that the switch to a virtual workshop took place after all submissions had been fully refereed, so this workshop continues the tradition of presenting the state-of-the-art in ACL2. Even a casual glance through the table of contents reveals several papers devoted to improvements of the ACL2 system itself, formalizations of mathematics, and ACL2 models of software and hardware systems.

This workshop also featured three invited talks from distinguished researchers in automated reasoning: David Russinoff (Arm), John Harrison (Amazon Web Services), and Leonardo de Moura (Microsoft Research). All three speakers were originally invited for a raucous in-person workshop in Austin, and kindly transitioned their presentations to our virtual event when COVID-19 necessitated it. Their talks span formal methods in ACL2, applications of formal methods to industry, and descriptions of other approaches to formal methods highlight and enrich the many deep connections between both ACL2 and industry, and ACL2 and computational logic at large.

# Type Inference Using Meta-extract for Smtlink and Beyond

Yan Peng (University of British Columbia)
Mark R. Greenstreet (University of British Columbia)

We present clause processors for type inference of ACL2 terms. The motivating application is Smtlink: we need a bridge from the untyped logic of ACL2 to the many-sorted logic of SMT solvers. Unlike the ACL2 typealist, our clause processors support user-defined types. Furthermore, we introduce a typedterm data structure that associates type-judgements with subterms of a clause. This means that when syntactically identical subterms can occur in logically distinct contexts, our clause processors can make distinct type-judgements for each occurrence of the subterm. These capabilities make these clause processors useful for applications beyond Smtlink.

### Overview

The logic of ACL2 is untyped, whereas the logic of SMT solvers is many-sorted. The soundness argument of smtlink [3] relies on a notion of "type-hypotheses": there should be a hypothesis asserting the type of each free-variable appearing in a theorem; a verified clause processor shows that the theorem trivially holds when variables have values outside their intended domain; and the remaining "typed goal" is then discharged by the SMT solver. Our new clause processors improve on this type deduction in three ways. First, missing type-information is detected early in the translation pipeline, allowing for clearer error reporting. Second, smtlink the support for user-defined types is now verified -- using FTY is convenient, but no longer required or trusted. Third, users were required to annotate their goals with type information by using fixing functions; now most of these type judgements are inferred automatically.

ACL2 has typealist [1] which records type information. While, the typealist checks for a small, fixed set of properties used by the ACL2 proof engine, our clause processors support a wide range of user-defined types. The typealist is an alist that maps pseudo-terms to type-sets. If the same pseudo-term occurs in multiple contexts, different type-judgements may occur for each instance. We introduce a new data structure, typedterm, that records context-dependent type judgements.

### The Type Inference Algorithm and the Clause-Processors

We define a new data structure typedterm.

(defprod typed-term ((term pseudo-termp) (path-cond pseudo-termp) (judgements pseudo-termp)))


The term field represents a clause or a subterm of the clause. path-cond is the conjunction of branch conditions that leads to term, and judgements are the type-judgements for term. For example if a goal includes the hypothesis (integerp x) and has the subterm (* x x), the typed-term for (* x x) is

(typed-term '(binary-* x x) '(integerp x) '(if (if (integerp (binary-* x x)) 't 'nil)
(if (if (integerp x) 't 'nil)
(if (if (integerp x) 't 'nil) 't 'nil)
'nil)
'nil))


The shape of the judgement maps directly to the shape of the term. (Note that for each occurrence of x in (binary-* x x), there is a type judgment for it.) The judgement is added as a hypothesis to the main goal. We provide a set of constructor and destructors for typedterms that make it easy to access subterms, their path conditions, and their type judgements. The clause processors are invoked with an argument that lists the type-signatures for functions (by listing the relevant "returns" theorems), and the type-recognizers to be used for type-judgements, along with supertype/subtype relationships (again, specified by naming the relevant theorems). For use in Smtlink, type-inference consists of three clause processors. The first clause processor performs a post-order traversal of the goal, and constructs a typedterm that annotates each subterm with all type recognizers (from the provided list) that it can be shown to be satisfied by the subterm. Supertype and subtype judgements are added taking into account the path condition. The clause processor recursively traverses the pseudo-term for with the following five cases:

• Constants: Types of constants are trivially proved using meta-extract [2].
• Variables: We find type judgements from the path condition. The disjuncts of a clause are sorted so that "type hypotheses" such as (integerp i) appear early in the path. A "type-judgement" is an expression with only one free variable that is implied by the path condition.
• if-expressions: We treat if specially because the condition of an if is used to augment the path-condition for the "then" and "else" arguments.
• Lambdas: We properly shadow the path condition and pass type judgements of the actuals into the formals in the body to infer types in the body.
• Function calls (other than if): Return type information is obtained using meta-extract. Polymorphism is supported by allowing the same function to have multiple :returns theorems.

The second and third clause processors are SMT specific. Having found all valid type-judgements for each subterm with the first processor, the second clause processor chooses type-judgements that are consistent for the many-sorted logic of the SMT solver. For example, (list 1 2 3) satisfies both integer-listp and rational-listp, but Z3 considers these to be distinct sorts. Note that this clause processor "chooses" by discarding the type-judgements (i.e. disjuncts of the clause) that won't be used for the translation. A third clause processor is introduced to do static dispatch of polymorphic functions. For example for function car, we define its typed version integer-list-car:


(defun integer-list-car (x) (if (consp x) (car x) (ifix x)))


The clause processor uses meta-extract with a "replace theorem" as shown below to justify replacing a call to (car term) with a call to (integer-list-car term).

(defthm replace-of-car
(implies (and (integerp-list-p x) (consp x)) (equal (car x) (integer-list-car x))))


By combining type judgements inferred for term and the path condition establishes the hypotheses for the theorem automatically (in most cases). As a fall-back, the user can add fixing functions to force type assertions. The clause processor can generate a new term that is well-typed for Z3 and provably equivalent to the original term.

### Conclusion and Future Work

We implemented three clause-processors for type inference, type unification, and term transformation. These clause-processors will allow smtlink to infer type information for translation of goals to the many-sorted logic of an SMT solver. We are currently completing the correctness proofs of these clause-processors and will apply them to real examples.

### References

1. Robert S. Boyer & J. Strother Moore (1979): A Computational Logic. ACM Monograph Series. Academic Press, doi:10.1016/C2013-0-10411-4.
2. Matt Kaufmann & Sol Swords (2017): Meta-extract: Using Existing Facts in Meta-reasoning. In: Anna Slobodová & Warren A. Hunt Jr.: Proceedings 14th International Workshop on the ACL2 Theorem Prover and its Applications, Austin, Texas, USA, May 22-23, 2017, EPTCS 249, pp. 47–60, doi:10.4204/EPTCS.249.4.
3. Yan Peng & Mark R. Greenstreet (2018): Smtlink 2.0. In: Shilpi Goel & Matt Kaufmann: Proceedings of the 15th International Workshop on the ACL2 Theorem Prover and Its Applications, Austin, Texas, USA, November 5-6, 2018, EPTCS 280, pp. 143–160, doi:10.4204/EPTCS.280.11.

# Cauchy-Schwarz in ACL2(r) Abstract Vector Spaces

Carl Kwan
Yan Peng
Mark R. Greenstreet

We present a proof of the Cauchy-Schwarz inequality for ACL2(r) abstract vector spaces. Moreover, our proof uses Smtlink, an ACL2 book that uses the SMT solver Z3 at the backend, and requires little user-guidance beyond stating the basic inner-product and vector-space properties. By necessity, we also present a formal theory of abstract vector spaces. The proof and theory are based on our previous work on real vector spaces in ACL2(r). Abstraction is obtained via encapsulation, but reasoning about abstract vectors involved building Smtlink support for Z3's theory of uninterpreted functions and sorts, which we also discuss. To our knowledge, this is the first formal proof of Cauchy-Schwarz for abstract vector spaces in a first-order logic.

### Overview

Let $(V,\mathbb R)$ be a vector space and $\langle\cdot,\cdot\rangle:V\times V\to \mathbb R$ an inner product. For any $u,v\in V$, the Cauchy-Schwarz inequality states $$\label{ineq:cs1} \tag{Cauchy-Schwarz} \left|\langle u, v\rangle\right| ^2 \leq \langle u,u\rangle\cdot \langle v,v\rangle.$$ Proof. (sketch) Suppose $u,v\neq 0$. Let $\alpha=\langle v,v\rangle^{-1}\langle u,v\rangle\in \mathbb R$ and observe $$\label{ineq:cs-proof} 0\leq \|u-\alpha v\|^2 =\langle u,u\rangle- 2 \alpha\langle u, v\rangle + \alpha^2\langle v,v\rangle = \langle u,u\rangle-(2\alpha - \alpha) \langle u, v\rangle = \langle u,u\rangle - {\langle v,v\rangle}^{-1}{\langle u,v\rangle}^2.$$ Rearranging produces \ref{ineq:cs1}. For the full proof and statement of the theorem, see [2]. $$\tag*{\blacksquare}$$

This extended abstract presents two novel contributions: (a) an ACL2(r) formalisation of abstract vector spaces over $\mathbb R$ by way of encapsulation; and, (b) a highly-automated Smtlink-leveraged proof of the Cauchy-Schwarz inequality (including square-rooted statements and equality conditions) for such abstract spaces. Previously, we only proved Cauchy-Schwarz for $V=\mathbb R^n$. Here, we encapsulate the definitions of the basic functions (e.g. vector addition, etc.) for $(\mathbb R^n,\mathbb R, \langle\cdot,\cdot\rangle)$ thus suppressing the real properties of $\mathbb R^n$ but exporting the inner-product space axioms. As discussed later, we require the co-domain of inner-product to be $\mathbb{R}$. The subsequent theorems (e.g. \ref{ineq:cs1}, etc.) only depend on the inner-product space axioms and do not assume that the vectors are real.

We will focus the rest of our discussion on (b), our proof using Smtlink. In [2], the ACL2(r) proof of Cauchy-Schwarz involved "hand"-substituting $\alpha$ and guiding the theorem prover through each step of the algebraic manipulation via instantiating the appropriate properties of the inner product and vector space operations. This was onerous. The current proof offloads nearly all algebraic manipulations to the Z3 SMT solver using the Smtlink [3, 4] package for ACL2. While ACL2 excels at induction, rewriting, and reasoning about complex systems, the intricate algebraic manipulations involving inequalities that consistently appear in the proof of Cauchy-Schwarz (as well as other "mathematical" theorems in, e.g. applied analysis) are more efficiently performed by SMT solvers. By introducing user-defined structures with finite rules or axioms and proving theorems over them, we have thematically delved into the realm of SMT solvers wherein an extensive body of quality research has developed procedures and heuristics for finding satisfiable outcomes given some finite number of assumptions. Thus, it is natural to exploit SMT techniques for problems where the reasoning is largely "algebraic."

From a user's point of view, the proof of Cauchy-Schwarz essentially amounts to stating the basic properties of the inner product as hypotheses to Smtlink. Prior to the statement of the main theorem, there are only three lemmas involved in algebraic manipulations! The application of inner product space axioms to prove the desired hypotheses are automatically verified by ACL2(r) itself. The proof in ACL2(r) follows easily and directly from a pencil-and-paper proof of Cauchy-Schwarz.

The somewhat surprising observation underlying this paper is that SMT techniques offer dramatic benefits even when the theorems to be proven are statements over abstract domains. Z3 is a many-sorted logic whereas ACL2 is untyped. For the proof of Cauchy-Schwarz in $(\mathbb R^n,\mathbb R)$, this is a largely benign issue because Z3 supports reasoning over the reals. The exception is when functions act on objects outside their intended domains. For example, many functions in ACL2(r) on $\mathbb R^n$ are fixed to return zero on non-real inputs, but analogous functions in Z3 will return an arbitrary value of the appropriate type. Given the correct hypotheses in the ACL2(r) theorem statement, e.g. (implies (real-vec-p x) ...), this problem is mitigated. For abstract Cauchy-Schwarz, on the other hand, we require the proof to be type-independent, or the hypotheses to be otherwise free of assumptions on the particular domain in which the vectors live. The challenge is that Smtlink requires type-recognisers for each free variable in the hypothesis, discharges the trivial cases when a variable has a value of a wrong'' type, and invokes Z3 to discharge the intended case.

Smtlink supports abstract types by using Z3 theories of uninterpreted sorts and functions. These allow users to define functions (and constants as nullary functions) over abstract types and, given some assumptions about the model, reason about them. Instead of writing the type-recogniser (real-vec-p v), we now may provide Smtlink with a recogniser for an abstract type (a-vec-p v) for which the definition is encapsulated (within the encapsulation, (a-vec-p v) is functionally equivalent to (real-vec-p v)). Theorems dependent on manipulations of inequalities involving functions over abstract algebraic structures are automatically proven with Smtlink. Here is the ACL2(r) proof of \ref{ineq:cs1} above:


(local (defthm cs1-when-v-not-zero
(implies (and (a-vec-p u) (a-vec-p v) (vector-compatible u v) (not (vector-zero-p v)))
(b* ((uu (inner-prod u u)) (uv (inner-prod u v)) (vv (inner-prod v v)))
(<= (* uv uv) (* uu vv))))
:hints
:functions((vector-add :formals ((u a-vec-p) (v a-vec-p))
:returns ((sum a-vec-p))
:level 0) ... ) ;; elided functions
:hypotheses( ...  			    ;; elided hypotheses
((equal (inner-prod (vector-add u (scalar-vector-prod (- (aa u v)) v))
(vector-add u (scalar-vector-prod (- (aa u v)) v)))
(+ (inner-prod u u)
(* (- 2) (aa u v) (inner-prod u v))
(* (aa u v) (aa u v) (inner-prod v v)))))))))))



Another form of introducing abstraction into ACL2 is via monoids [1]. However, encapsulating the type-recogniser for abstract vectors was more amenable to leveraging Smtlink, which greatly simplified the proof of Cauchy-Schwarz. Regarding our choice of $\mathbb R$ as the co-domain of inner-prod: Cauchy-Schwarz typically only deals with vector spaces over $\mathbb R$ or $\mathbb C$, and an ordered subfield is necessary for inequalities to be well-defined. This immediately excludes finite fields and the usual induced norm by taking the square root of the inner product eliminates the last conventional candidate for fields, $\mathbb Q$. In the case of $\mathbb C$, supporting complex numbers in Z3 would require representing constants as uninterpreted nullary functions, which significantly increases the number of hypotheses to Smtlink. Nevertheless, we leave the ACL2 proof of Cauchy-Schwarz for $(V,\mathbb C)$ as ongoing research.

### References

1. Sebastiaan J. C. Joosten, Bernard van Gastel & Julien Schmaltz (2013): A Macro for Reusing Abstract Functions and Theorems. Electronic Proceedings in Theoretical Computer Science 114, pp. 29-41, doi:10.4204/eptcs.114.3.
2. Carl Kwan & Mark R. Greenstreet (2018): Real Vector Spaces and the Cauchy-Schwarz Inequality in ACL2(r). Electronic Proceedings in Theoretical Computer Science 280, pp. 111-127, doi:10.4204/eptcs.280.9.
3. Yan Peng & Mark Greenstreet (2015): Extending ACL2 with SMT Solvers. Electronic Proceedings in Theoretical Computer Science 192, pp. 61-77, doi:10.4204/eptcs.192.6.
4. Yan Peng & Mark R. Greenstreet (2018): Smtlink 2.0. Electronic Proceedings in Theoretical Computer Science 280, pp. 143-160, doi:10.4204/eptcs.280.11.

# Minimal Fractional Representations of Integers mod M

David Greve

We say that the fraction $\frac{N}{D}$ represents $x \in \mathbf{Z}/m\mathbf{Z}$ when $x*D \equiv N \ (\text{mod} \ {M})$. Our definition admits many possible fractional representations. We say that $\frac{N}{D}$ is a minimal representation of $x$ if no smaller denominator ($D$) results in a numerator with a magnitude less than the magnitude of $N$. We introduce a function for computing such fractional representations and prove that it generates minimal fractions. We also prove that every $x \in \mathbf{Z}/m\mathbf{Z}$ has a minimal fractional representation in which the magnitude of $N$ and $D$ are less or equal to $\sqrt{M}$.

### Introduction

We say that the fraction $\frac{N}{D}$ represents $x \in \mathbf{Z}/m\mathbf{Z}$ when $x*D \equiv N\ (\text{mod} \ {M})$. We denote this relationship as $x \cong \frac{N}{D}\ (\text{mod} \ {M})$ (we might say "congruent to" when $D \perp M$ but we don't require this condition). This definition admits many possible fractional representations of $x$, some possibly not reduced. For example, $7 \ (\text{mod} \ 17)$ has the following representations:

$\{ \,7/1,\, 14/2,\, 4/3,\, 11/4,\, 1/5,\, 8/6,\, 15/7,\, 5/8,\, 12/9 ,\, 2/10,\, 9/11,\, 16/12 ,\,6/13,\, 13/14,\, 3/15,\, 10/16,\, 0/17 \, \}$

In this work we consider both positive and negative residues in the numerator. A positive residue is defined as $(x*D \bmod M)$ for $0 < D \leq Q$ while a negative residue is defined as $(x*D \bmod M) - M$ for $0 \leq D < Q$. Using the negative residue the fractional representations of $7 \ (\text{mod} \ 17)$ are:

$\{ -17/0,\, -10/1,\, -3/2,\, -13/3,\, -6/4,\, -16/5,\, -9/6,\, -2/7,\, -12/8,\, -5/9,\, -15/10,\, -8/11,\, -1/12,\, -11/13,\, -4/14,\, -14/15,\, -7/16\, \}$

Within the same residue class (positive or negative), we say that $\frac{N}{D}$ is a minimal representation of $x$ if no denominator smaller than ($D$) results in a numerator with a magnitude less than the magnitude of $N$. ($7/1$) is minimal for $7 \ (\text{mod} \ {17})$ simply because we don't consider positive residues with denominators less than 1. The fraction ($-3/2$) is also minimal because $\lvert -3 \rvert$ is less than the magnitude of the numerators of both negative fractions with denominators less than $2$, ($-17/0$) and ($-10/1$). ($-6/4$), however, is not minimal because ($-3/2$) has both a smaller magnitude numerator and a smaller denominator. Our proof of correctness actually requires a stronger, more general notion of minimality that is invariant over our algorithm for computing minimal fractions. This more general invariant is expressed with respect to a pair of fractions: one negative residual and one positive residual. The pair is considered minimal if, for all possible denominators ($d$), if the magnitude of either the positive or negative residual of $d*x$ is less than the sum of the magnitudes of the numerators of the pair of fractions, then $d$ must be greater than or equal to the denominator of the fraction with the same residual sign. Under this generalization the pair of fractions $(-3/2,\,4/3)$ is considered minimal because no denominator less than $3$ has a positive residual and no denominator less than $2$ has a negative residual less than $\lvert -3 \rvert + \lvert 4 \rvert = 7$. Our computation of minimal fractions relies on the following property of the mediant computation (as used in the generation of Farey sequences [1] except that, in our case, $N_1*D_2 - N_2*D_1$ is equal to $M$ rather than $1$): \begin{equation*} x \cong \frac{N_1}{D_1} \; \land \; x \cong \frac{N_2}{D_2} \; \implies \; x \cong \frac{N_1 + N_2}{D_1 + D_2} \; \;\ (\text{mod} \ {M}) \end{equation*} Our algorithm takes as input two minimal fractions with differing signs, initially $(x-M)/0$ and $x/1$ (which are trivially minimal). It then recursively replaces the fraction with the larger magnitude numerator with the mediant of the two original fractions until one of the numerators is zero. The key to the termination of our algorithm is the observation that the mediant of two fractions whose numerators differ in sign is a fraction whose numerator is smaller in magnitude than the larger of the magnitudes of the two original numerators. We prove that this algorithm preserves our notion of minimal fractional pairs. The minimal fractional pairs generated by our algorithm for $7 (\text{mod} \ {17})$ are listed below.

$(-17/0,\,7/1),\, (-10/1,\,7/1),\,(-3/2,\,7/1),\,(-3/2,\,4/3),\,(-3/2,\,1/5),\,(-2/7,\,1/5),\,(-1/12,\,1/5),\, (-1/12,\,0/17)\,$

In addition to our minimality invariant we prove that every $x$ has a fractional representation in which both $\lvert N \rvert$ and $D$ are less than or equal to $\sqrt{M}$. This result follows from the fact that $N_1*D_2 - N_2*D_1 = M$ where $N_1,D_1,D_2 >= 0 \;\land \;N2 < 0$ is also an invariant of our algorithm. Consider the case where $D_1 < \sqrt{M}$ and $D_2 < \sqrt{M}$ but $D_1 + D_2 \geq \sqrt{M}$. If both $N_1 > \sqrt{M}$ and $- N_2 > \sqrt{M}$, then in the following step, $\lvert (D_1 + D_2)*N_i \rvert$ will be greater than $M$, violating our invariant. Thus, at least one $\lvert N_i \rvert \leq \sqrt{M}$. It is possible for a number to have more than one representation whose coefficients are less than $\sqrt{M}$. For example, $12 \ (\text{mod} \ {17})$ is represented by both ($-3/4$) and ($2/3$), both of whose coefficient magnitudes are less than $\sqrt{17}$. Deciding which is the minimum is a judgment call. We say that the minimum fraction is the one with the smallest maximum coefficient, resolving ties with the smaller denominator. Under those conditions the minimum fractional representation of $12 \ (\text{mod} \ {17})$ is ($2/3$). Using this minimality criteria the minimum fractional representations for each of the numbers $1\dots16 \ (\text{mod} \ {17})$ are:

$\{ \,1,\,2,\,3,\,4,\,-2/3,\,1/3,\,-3/2,\,-1/2,\,1/2,\,3/2,\,-1/3,\,2/3,\,-4,\,-3,\,-2,\,-1\, \}$

### Conclusion

The properties presented in this paper were verified usng ACL2 and are distributed along with the ACL2 source code in the directory /books/workshops/2020/greve. In that book we verify an algorithm for computing minimal fractional representations $\frac{N}{D}$ of numbers $x \in \mathbf{Z}/m\mathbf{Z}$. We also prove that all such numbers have a representation in which the magnitude of both $N$ and $D$ are less than or equal to $\sqrt{M}$. In the cryptographic community there is interest in finding smooth numbers that result from specific computations. The quadratic sieve algorithm [2], for example, attempts to find small numbers (numbers on the order of $\sqrt{M}$) in hopes of factoring them into a smooth factor base. We show that any residue relatively prime to $M$ can be represented as a quotient of two numbers less than or equal to $\sqrt{M}$.

### References

1. (2020): Farey Sequence. Available at https://en.wikipedia.org/wiki/Farey_sequence.
2. (2020): General Number Field Sieve. Available at https://en.wikipedia.org/wiki/General_number_field_sieve.

# Put Me on the RAC

David Hardin (Collins Aerospace)

We have a keen research interest in Hardware/Software Co-Assurance. This interest is motivated in part by emerging application areas, such as autonomous and semi-autonomous platforms for land, sea, air, and space, that require sophisticated algorithms and data structures, are subject to stringent accreditation/certification, and encourage hardware/software co-design approaches. As part of our research, we have conducted several experiments employing a state-of-the-art toolchain, due to Russinoff and O'Leary, and originally designed for use in floating-point hardware verification [7], to determine its suitability for the creation of various safety-critical/security-critical applications in numerous domains.

Algorithmic C [5] defines C++ header files that enable compilation to both hardware and software platforms, including support for the peculiar bit widths employed, for example, in floating-point hardware design. The Russinoff-O'Leary Restricted Algorithmic C (RAC) toolchain translates a subset of Algorithmic C source to the Common Lisp subset supported by the ACL2 theorem prover, as augmented by Russinoff's Register Transfer Logic (RTL) books. In this extended abstract, we summarize several recent experiments on the use of the RAC toolchain in various safety-critical/security-critical domains, focusing first on software implementations, but extending to hardware design/verification, as well as hardware/software co-assurance (in future work).

In a first experiment, we created a number of algebraic data types with fixed maximum size, suitable for both hardware and software implementation, in Restricted Algorithmic C, including lists, stacks, queues, deques, and binary trees. This approach was inspired by our previous work on fixed-size formally verified algebraic data types [4], and produced a set of reusable data structures that could be employed in further work. We used the RAC toolchain to translate these algebraic data types to ACL2, performed proofs of correctness in ACL2, and validated the data types by testing. This testing was conducted using a C++ compilation toolchain, as well as within ACL2. This dual validation was not redundant effort, as the RAC translator is not itself formally verified. Our success with this effort gave us confidence that the RAC toolchain could be productively used in domains outside of floating-point hardware design and verification.

As a simplified example, consider the development of a basic stack data type, implemented using a fixed-size array. The Algorithmic C header file declaration for this type is given below. We arbitrarily set the maximum number of stack elements to 16383 for this example.


#define STK_MAX_SZ 16383
#define STK_OK 0
#define STK_OCCUPANCY_ERR 255

struct STKObj {
ui14 nodeTop;
array nodeArr;
};


Note the use of specific integer widths, such as the ui14 declaration for the 14-bit unsigned integer indices, that are not readily available in "vanilla" C++. One might deem these exacting type declarations to be bothersome, but, in our experience, the benefits of such strong typing in areas such as early error identification outweigh the costs. Also note that the RAC toolchain translates this struct-of-array into an ACL2 record, with the usual ACL2 record AG (get) and AS (set) operators. The body of the stack code consists of a number of basic C++ functions implementing various stack operations. Three such stack operators are the push, pop, and top operators, depicted below.

STYP STK_push (i64 v, STKObj amp(SObj)) {
if (SObj.nodeTop > 0) {
// Note: Stack grows down
SObj.nodeTop--;
SObj.nodeArr[SObj.nodeTop] = v;
}
return SVAL;
}

STYP STK_pop (STKObj amp(SObj)) {
if (SObj.nodeTop < STK_MAX_SZ) {
SObj.nodeTop++;
}
// Note: Pop of empty stack is a nop
return SVAL;
}

tuple STK_top (STKObj amp(SObj)) {
if (SObj.nodeTop < STK_MAX_SZ) {
return tuple(STK_OK, SObj.nodeArr[SObj.nodeTop]);
} else {
return tuple(STK_OCCUPANCY_ERR, 0);
}
}


Note that the stack "grows" down, towards smaller array indices, and that the top function returns a two-tuple consisting of an error code as well as the top-of-stack value. Once the stack data type code has been translated into ACL2 by the RAC toolchain, we can begin to reason about the translated functions in the ACL2 environment, using the RTL books, as well as other ACL2 books. One functional correctness property to prove of our stack representation is that the top-of-stack resulting from a push followed by a pop is the same as the original top-of-stack, given that space exists for the push. This is expressed in ACL2 as

(defthm STK_top-of-STK_pop-of-STK_push-val--thm
(implies
(and
(stkp Obj)
(spacep Obj)
(acl2::signed-byte-p 64 n))
(equal (nth 1 (STK_top (STK_pop (STK_push n Obj))))
(nth 1 (STK_top Obj)))))


where stkp is a well-formedness predicate for the stack object, and spacep is true if there is space for more elements on the stack. ACL2 readily proves this theorem after a few basic lemmas are introduced. In a second experiment, we created an Instruction Set Architecture (ISA) simulator for a representative 64-bit RISC ISA in the RAC C++ subset [2]. It is common practice to create such a simulator early in the microprocessor development process, usually written in C/C++, in order to allow for early compiler toolchain development, to serve as a platform to perform tradeoffs for various candidate ISA enhancements, as well as to act as an "oracle" for subsequent detailed microprocessor development. We used the RAC tool to translate the simulator code to ACL2, produced small binary programs for the ISA used to validate the simulator, and then utilized the ACL2 Codewalker decompilation-into-logic facility to prove those test programs correct. In a third effort, we implemented a high-assurance filter for JSON-formatted data used in an Unmanned Air Vehicle (UAV) application [3] using the RAC toolchain. Our JSON filter was built using a table-driven lexer/parser, supported by mathematically-proven lexer and parser table generation technology, as well as a verified stack data structure (reused from our earlier RAC data structure work). As in previous experiments, the JSON filter implementation was validated via tests generated via C++ compilation, as well as in ACL2, with the outputs of the two test environments compared for equality. We also gathered performance data, indicating that our RAC-generated lexer/parser for JSON was very competitive in speed (up to 20% faster) when compared to JSON parsers generated by non-verified parser generators. The aforementioned efforts have focused on software implementations of RAC applications. Recently, we have begun work on targeting Field-Programmable Gate Array (FPGA) hardware using the Mentor Graphics Catapult [6] toolsuite, in collaboration with colleagues at Kansas State University. A goal of this collaboration is to enable the generation of high-assurance hardware and/or software from high-level architectural specifications expressed in the Architecture Analysis and Design Language (AADL) [1]. Future experiments will include the generation of high-assurance hardware/software co-designs with proofs of correctness in ACL2.

### References

1. Peter Feiler & David Gluch (2012): Model-Based Engineering with AADL: An Introduction to the SAE Architecture Analysis and Design Language. Addison-Wesley Professional.
2. David S. Hardin (2019): Instruction Set Architecture Specification, Verification, and Validation using Algorithmic C and ACL2. In: Workshop on Instruction Set Architecture Specification (SpISA 19). Available at https://www.cl.cam.ac.uk/~jrh13/spisa19/paper_09.pdf.
3. David S. Hardin (2020): Verified Hardware/Software Co-Assurance: Enhancing Safety and Security for Critical Systems. In: Proceedings of the 2020 IEEE Systems Conference (to appear).
4. David S. Hardin & Konrad L. Slind (2018): Using ACL2 in the Design of Efficient, Verifiable Data Structures for High-Assurance Systems. In: Shilpi Goel & Matt Kaufmann: Proceedings of the 15th International Workshop on the ACL2 Theorem Prover and its Applications, EPTCS 280, pp. 61–76, doi:10.4204/EPTCS.280.5.
5. Mentor Graphics Corporation (2019): Algorithmic C (AC) Datatypes. Available at https://www.mentor.com/hls-lp/downloads/ac-datatypes.
6. Mentor Graphics Corporation (2020): Catapult High-Level Synthesis. Available at https://www.mentor.com/hls-lp/catapult-high-level-synthesis/.
7. David M. Russinoff (2018): Formal Verification of Floating-Point Hardware Design: A Mathematical Approach. Springer, doi:10.1007/978-3-319-95513-1.