Published: 14th August 2019
DOI: 10.4204/EPTCS.298
ISSN: 2075-2180

EPTCS 298

Proceedings Third Joint Workshop on
Developments in Implicit Computational complExity
and
Foundational & Practical Aspects of Resource Analysis
Prague, Czech Republic, April 6-7, 2019

Edited by: Thomas Seiller and Steffen Jost

Preface
Steffen Jost
Applications of Linear Defeasible Logic: combining resource consumption and exceptions to energy management and business processes
Francesco Olivieri, Guido Governatori, Claudio Tomazzoli and Matteo Cristani
On the Elementary Affine Lambda-Calculus with and Without Fixed Points
Lê Thành Dũng Nguyen
Pointers in Recursion: Exploring the Tropics
Paulin Jacobé de Naurois
Type-Based Resource Analysis on Haskell
Franz Siglmüller
Type-two Iteration with Bounded Query Revision
Bruce M. Kapron and Florian Steinberg
A Hierarchical System of Linear Logic for Exponential Time Complexity
Brian F. Redmond
Modular Runtime Complexity Analysis of Probabilistic While Programs
Martin Avanzini, Michael Schaper and Georg Moser

Preface

These proceedings present the accepted regular papers and some selected extended abstracts from the 3rd joint DICE-FOPARA workshop, which was held in Prague, Czech Republic on April 6-7, 2019, as a part of ETAPS. The joint workshop provides synergies by combining two complementary communities.

The 10th DICE workshop explores the area of Implicit Computational Complexity (ICC), which grew out of several proposals to use logic and formal methods to provide languages for complexity-bounded computation (e.g. Ptime or Logspace computation). It aims at studying the computational complexity of programs without referring to external measuring conditions or a particular machine model, but only by considering language restrictions or logical/computational principles that entail complexity properties. Several approaches have been explored for that purpose, such as restrictions on primitive recursion and ramification, rewriting systems, linear logic, types and lambda calculus, and interpretations of functional and imperative programs.

The 6th FOPARA workshop serves as a forum for presenting original research results that are relevant to the analysis of resource (e.g. time, space, energy) consumption by computer programs. The workshop aims to bring together researchers who work on foundational issues with those who focus more on practical results. Therefore, both theoretical and practical contributions are encouraged, as are papers that combine the two.

This third joint DICE-FOPARA workshop at ETAPS 2019 follows the successful co-locations of DICE and FOPARA at ETAPS 2015 in London and at ETAPS 2017 in Uppsala.

Program Committee

Steering Committee DICE

Steering Committee FOPARA

Acknowledgements

We thank the support staff of ETAPS 2019 for their excellent assistance during the workshop, everyone at EPTCS for their help in publishing the proceedings of our joint workshop, and especially Rob van Glabbeek for his active technical support.

A Hierarchical System of Linear Logic for Exponential Time Complexity

Brian F. Redmond

Abstract

We show that Lafont's system of Soft Linear Logic together with a hierarchical version of digging captures the complexity class EXPTIME.

Introduction

Lafont's Soft Linear Logic (SLL$_2$) is a system of second-order intuitionistic linear logic with modified exponentials that captures polynomial time computation ([1]). In other words, any SLL$_2$ proof net reduces to a unique normal form in a polynomial number of steps, and conversely, any polynomial time Turing machine algorithm can be encoded in the system. The exponential rules in SLL$_2$ have a particularly simple form: soft promotion (i.e. monoidal functoriality) and multiplexing of rank $n$, $n \in \mathbb{N}$. These rules enforce a predicative control on the duplication of resources, and in fact the polynomial time bound on reduction is easy to prove. However, programming in SLL$_2$ is more of a challenge. SLL$_2$ has a strict programming discipline whereby programs are required to be generic (i.e. they do not use multiplexing) and data is required to be homogeneous (i.e. all multiplexors have the same rank). In this paper, we maintain the genericity of programs but slightly relax the homogeneity restriction on data. In the conclusion of Lafont's paper, it was suggested that introducing a hierarchy of modalities $!_0, !_1,!_2,\dots$ with a hierarchical version of digging would lead to a system for elementary recursive computation. We show that, in fact, only EXPTIME complexity is obtained.

Soft Linear Logic with Hierarchical Digging

The setting for this paper is Lafont's Soft Linear Logic ([1]) extended with hierarchical digging. Formulas are given by the following grammar: \begin{eqnarray*} A,B &::=& \alpha \mid A \& B \mid {\bf 1} \mid A \otimes B \mid A \multimap B \mid !_s A \mid \forall \alpha.A \end{eqnarray*} where $\alpha$ is an atomic type variable and $s \in \mathbb{N}$. We shall use the abbreviation $!_s^{n}$ to denote the iteration $!_s\cdots !_s$ ($n \geq 1$ times) and $!_{s,r}$ to denote $!_s!_{s-1}\dots !_{r}$ whenever $s \geq r \geq 0$. Further to this, the notation $A^{(n)}$ is short for $A,\dots,A$ ($n\geq 0$ times). Sequents are written intuitionistically in the form $\Gamma \vdash A$ where $\Gamma$ is a finite (possibly empty) list of formulas and $A$ is a single formula. Also, $!_s\Gamma = !_s A_1,\dots,!_s A_k$ if $\Gamma = A_1,\dots,A_k$. The rules are as follows:

Proofs in this paper are presented using the sequent calculus formulation above, but are implicitly identified with their corresponding proof nets (see [1]). Furthermore, only external reduction is assumed. Proof nets can be interpreted in second-order lambda calculus with explicit pairs. It is straightforward to show that SLL$_2$ with hierarchical digging satisfies cut-elimination.
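
Read operationally, the formula grammar and the abbreviations $!_s^n$ and $!_{s,r}$ transcribe directly to a datatype. The following Haskell sketch is ours, purely illustrative, and plays no role in the formal development:

    -- Formulas of SLL_2 with the hierarchy of modalities !_s, s in N.
    -- Constructor names are ours.
    type TVar  = String
    type Level = Int                 -- the level s of a modality !_s

    data Formula
      = Atom TVar                    -- alpha
      | With Formula Formula         -- A & B
      | One                          -- 1
      | Tensor Formula Formula       -- A (x) B
      | Lolli Formula Formula        -- A -o B
      | Bang Level Formula           -- !_s A
      | Forall TVar Formula          -- forall alpha. A
      deriving (Eq, Show)

    -- !_s^n A: n-fold iteration of !_s (n >= 1 in the text's convention).
    bangIter :: Level -> Int -> Formula -> Formula
    bangIter s n a = iterate (Bang s) a !! n

    -- !_{s,r} A = !_s !_{s-1} ... !_r A, assuming s >= r >= 0.
    bangRange :: Level -> Level -> Formula -> Formula
    bangRange s r a = foldr Bang a [s, s-1 .. r]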

Theorem 1: SLL$_2$ with hierarchical digging satisfies cut-elimination.

Proof: The three non-standard reduction steps in the cut-elimination procedure are as follows. The first two already appear in SLL$_2$. \begin{eqnarray*} \cfrac{\cfrac{\cfrac{\vdots}{\Gamma \vdash A}}{!_s\Gamma \vdash !_s A} \quad \cfrac{\cfrac{\vdots}{\Delta, A \vdash B}}{!_s\Delta, !_sA \vdash !_s B}}{!_s \Gamma, !_s \Delta \vdash !_sB} &\quad\leadsto\quad& \cfrac{\cfrac{\cfrac{\vdots}{\Gamma \vdash A} \quad \cfrac{\vdots}{\Delta,A \vdash B}}{\Gamma, \Delta \vdash B}}{!_s \Gamma, !_s \Delta \vdash !_sB} \\ \cfrac{\cfrac{\cfrac{\vdots}{\Gamma \vdash A}}{!_0\Gamma \vdash !_0A} \quad \cfrac{\cfrac{\vdots}{\Delta,A^{(n)} \vdash B}}{\Delta, !_0A \vdash B}}{!_0\Gamma,\Delta \vdash B} &\quad\leadsto\quad& \cfrac{\cfrac{\cfrac{\vdots}{\Gamma \vdash A} \quad \cdots \quad \cfrac{\vdots}{\Gamma\vdash A} \quad \cfrac{\vdots}{\Delta,A^{(n)} \vdash B}}{\Gamma^{(n)},\Delta \vdash B}}{!_0 \Gamma,\Delta \vdash B} \\ \cfrac{\cfrac{\cfrac{\vdots}{\Gamma \vdash A}}{!_{s+1}\Gamma \vdash !_{s+1}A} \quad \cfrac{\cfrac{\vdots}{\Delta,!_s^nA \vdash B}}{\Delta,!_{s+1}A \vdash B}}{!_{s+1}\Gamma,\Delta \vdash B} &\quad\leadsto\quad& \cfrac{\cfrac{\cfrac{\cfrac{\vdots}{\Gamma \vdash A}}{!_s^n\Gamma \vdash !_s^nA} \quad \cfrac{\vdots}{\Delta, !_s^nA \vdash B}}{!_s^n\Gamma,\Delta \vdash B}}{!_{s+1}\Gamma, \Delta \vdash B} \end{eqnarray*} In the last reduction, note that soft promotion is applied successively $n \geq 1$ times and digging is used for each of the formulas in $!_s^n\Gamma$. $$\tag*{$\blacksquare$}$$

The following are the important measures on proof nets in this setting:

If $s = 0$, we take the empty product $Y_1 \cdots Y_s$ to be 1 and $X^0 =1 $ so that $W_{!_0 u} = XW_u +1$, as expected. The following theorem gives the complexity of (external) reduction.

Theorem 2: A proof net $u$ of level $s$ and rank $n$ reduces to a unique normal form in at most $W_u(n,\dots,n)$ steps.

Proof: We shall omit the first two cases above as they are the same as in SLL$_2$. We shall prove the third by comparing weights before and after reduction. \begin{eqnarray*} W_{!^n_s u} + W_{v} &=& X^{Y_1 \cdots Y_s}W_{!_s^{n-1}u}+ (s+1)\sum_{i=0}^{Y_1 \cdots Y_s-1} X^i + W_v\\ &=& X^{Y_1 \cdots Y_s}\left( X^{Y_1 \cdots Y_s}W_{!_s^{n-2}u} + (s+1)\sum_{i=0}^{Y_1 \cdots Y_s-1} X^i \right) + (s+1)\sum_{i=0}^{Y_1 \cdots Y_s-1} X^i + W_v\\ &=& X^{2Y_1 \cdots Y_s}W_{!_s^{n-2}u} + (s+1)\sum_{i=Y_1 \cdots Y_s}^{2Y_1 \cdots Y_s-1} X^i + (s+1)\sum_{i=0}^{Y_1 \cdots Y_s -1}X^i + W_v \\ &=& X^{2Y_1 \cdots Y_s}W_{!_s^{n-2}u}+ (s+1)\sum_{i=0}^{2Y_1 \cdots Y_s-1} X^i + W_v \\ && \qquad \qquad \qquad \qquad \vdots \\ &=& X^{nY_1 \cdots Y_s}W_{u}+ (s+1)\sum_{i=0}^{nY_1 \cdots Y_s-1} X^i + W_v \\ &<& X^{Y_1 \cdots Y_s Y_{s+1}} W_u + (s+2)\sum_{i=0}^{Y_1 \cdots Y_s Y_{s+1}-1} X^i + W_v \\ &=& W_{!_{s+1}u} + W_v \end{eqnarray*} $$\tag*{$\blacksquare$}$$

By induction on the structure of $u$, it is not difficult to show that if $u$ is a proof net of degree $d$ and level $s$, then $W_u(n,\dots,n) \leq k n^{d n^s}$ where $k = W_u(1,\dots,1)$. Hence, in general, reduction requires more than exponential time. As we shall see in the next section, however, the level $s$ of data remains fixed, so reduction is in exponential time as claimed.
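
Concretely, for fixed $s$ and $d$ the bound instantiates to \[ W_u(n,\dots,n) \;\leq\; k\, n^{d n^s} \;=\; k\, 2^{d n^s \log_2 n} \;\leq\; k\, 2^{d n^{s+1}} \qquad (n \geq 2), \] so reduction of a net whose level is bounded by a constant $s$ takes time exponential in a polynomial of its rank $n$.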

Encoding Exponential Time

Data types in this system have a somewhat peculiar form. For example, the data type of natural numbers is defined as follows: \begin{eqnarray*} {\bf N}_{s} &=& \forall \alpha. !_s ( \alpha \multimap \alpha) \multimap \alpha \multimap \alpha \\ {\bf N}_{s+1/2} &=& \forall \alpha. !_{s+1} (!_{s} \alpha \multimap \alpha) \multimap !_{s+1} \alpha \multimap \alpha \end{eqnarray*} where $s \in \mathbb{N}$. Nets of the above types translate to the usual Church numerals. For example, the following is a level 1 non-homogeneous proof for the number 3, which has type ${\bf N}_{1/2}$, rank $3$, and degree $3$. Homogeneity can be maintained for data at level 0, but it seems that higher-level data generally require additional rank 1 instances of both multiplexing and digging. \[ \cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{}{\alpha \vdash \alpha}}{!_0 \alpha \vdash !_0 \alpha} \quad \cfrac{}{\alpha \vdash \alpha}}{ !_0 \alpha,!_0 \alpha \multimap \alpha \vdash \alpha }}{!_0!_0\alpha, !_0(!_0 \alpha \multimap \alpha) \vdash !_0 \alpha} \quad \cfrac{}{\alpha \vdash \alpha}}{!_0!_0\alpha, !_0(!_0\alpha \multimap \alpha), !_0\alpha \multimap \alpha \vdash \alpha}}{!_0!_0!_0\alpha, !_0!_0(!_0\alpha \multimap \alpha), !_0(!_0\alpha \multimap \alpha) \vdash !_0 \alpha} \quad \cfrac{}{\alpha \vdash \alpha}}{!_0!_0!_0\alpha, !_0!_0(!_0\alpha \multimap \alpha), !_0(!_0\alpha \multimap \alpha),!_0 \alpha \multimap \alpha \vdash \alpha}}{!_0!_0!_0 \alpha, !_0!_0(!_0 \alpha \multimap \alpha)^{(3)} \vdash \alpha }}{!_0!_0!_0 \alpha, !_0!_0!_0(!_0\alpha \multimap \alpha) \vdash \alpha}}{!_1 \alpha, !_1(!_0 \alpha \multimap \alpha) \vdash \alpha}}{\vdash !_1 (!_0 \alpha \multimap \alpha) \multimap !_1\alpha \multimap \alpha}}{\vdash \forall \alpha. !_1 (!_0 \alpha \multimap \alpha) \multimap !_1 \alpha \multimap \alpha} \] In this system there is a proof for the exponential function: \[ \cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\cfrac{\vdots}{l_0: {\bf N}_0 \vdash l_0 : !_0 A \multimap A}}{l_0: !_1 {\bf N}_0 \vdash l_0 : !_1(!_0 A \multimap A)} \quad \cfrac{ \cfrac{}{y: !_1(\alpha \multimap \alpha) \vdash y: !_1A} \quad \cfrac{}{z: A \vdash z: \alpha \multimap \alpha}}{y: !_1(\alpha \multimap \alpha), g: !_1A \multimap A \vdash gy : \alpha \multimap \alpha } }{y: !_1 (\alpha \multimap \alpha),l_0: !_1 {\bf N}_0, l_1: !_1(!_0 A \multimap A) \multimap !_1 A \multimap A \vdash l_1 l_0 y: \alpha \multimap \alpha}}{y:!_1(\alpha \multimap \alpha), l_0: !_1 {\bf N}_0, l_1: {\bf N}_{1/2} \vdash l_1 l_0 y : \alpha \multimap \alpha} }{l_0: !_1 {\bf N}_0, l_1: {\bf N}_{1/2} \vdash \lambda y.l_1 l_0 y :!_1 (\alpha \multimap \alpha) \multimap \alpha \multimap \alpha }}{l_0: !_1 {\bf N}_0, l_1:{\bf N}_{1/2} \vdash \lambda y.l_1 l_0 y: {\bf N}_1 } \] where $A = \alpha \multimap \alpha$. Indeed, up to $\eta$-equivalence this leads to the lambda term $\lambda l_0l_1.l_1 l_0$. Also note that this is a generic proof! The above proof generalizes by induction on $s \geq 1$ to give a hierarchy of exponential maps: \begin{eqnarray*} l_0: !_1 {\bf N}_0 , l_1 : {\bf N}_{1/2} &\vdash& l_1 l_0 : {\bf N}_1 \\ l_0: !_2 !_1{\bf N}_0, l_1: !_2 {\bf N}_{1/2}, l_2: {\bf N}_{3/2} &\vdash& l_2 (l_1 l_0) : {\bf N}_2 \\ l_0: !_3 !_2 !_1{\bf N}_0, l_1: !_3 !_2 {\bf N}_{1/2}, l_2: !_3 {\bf N}_{3/2}, l_3: {\bf N}_{5/2} &\vdash& l_3(l_2(l_1 l_0)): {\bf N}_3 \\ &\vdots& \end{eqnarray*} where terms are written up to $\eta$-equivalence for clarity.
For each $s \geq 1$, these terms compute the exponential function $l_0^{l_1 l_2 \cdots l_s}$.
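
At the level of the underlying lambda terms, the exponential map can be tested directly with Church numerals. The Haskell sketch below is ours; it fixes a carrier type and therefore elides the second-order (and linear) typing:

    -- Church numerals over a fixed carrier: church n f x = f (f (... (f x))).
    type Church a = (a -> a) -> a -> a

    church :: Int -> Church a
    church n f = foldr (.) id (replicate n f)

    unchurch :: Church Int -> Int
    unchurch n = n (+ 1) 0

    -- The generic exponential map \l0 l1. l1 l0 of the paper,
    -- sending numerals l0 and l1 to l0^l1.
    expo :: Church a -> Church (a -> a) -> Church a
    expo l0 l1 = l1 l0

    main :: IO ()
    main = print (unchurch (expo (church 2) (church 10)))  -- prints 1024 = 2^10

Iterating the same pattern, $l_2(l_1 l_0)$ computes $(l_0^{l_1})^{l_2} = l_0^{l_1 l_2}$, and so on up the hierarchy.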

Other data types that are required in our encoding include binary strings, Turing machines (with $k$ states) and booleans: \begin{eqnarray*} {\bf S}_{s} &=& \forall \alpha. !_s ( ( \alpha \multimap \alpha) \& (\alpha \multimap \alpha)) \multimap \alpha \multimap \alpha \\ {\bf S}_{s+1/2} &=& \forall \alpha. !_{s+1} ((!_{s} \alpha \multimap \alpha) \& (!_{s} \alpha \multimap \alpha)) \multimap !_{s+1} \alpha \multimap \alpha \\ {\bf M}_s &=& \forall \alpha. !_s \left((\alpha \multimap \alpha) \& (\alpha \multimap \alpha) \right) \multimap {\bf F}_k \otimes {\bf F}_3 \otimes (\alpha \multimap \alpha)^2 \\ {\bf B} &=& \forall \alpha. (\alpha \& \alpha) \multimap \alpha \end{eqnarray*} where $s \in \mathbb{N}$ and ${\bf F}_k = {\bf 1} \oplus \cdots \oplus {\bf 1}$ ($k$ times) is a finite type ([1]).
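
Stripped of the linear typing, booleans and binary strings are the familiar Church encodings; a Haskell sketch (ours, again with a fixed carrier):

    -- B = forall a. (a & a) -o a: a boolean selects one of two alternatives.
    type ChurchBool a = (a, a) -> a

    true, false :: ChurchBool a
    true  (t, _) = t
    false (_, f) = f

    -- S = forall a. !((a -o a) & (a -o a)) -o a -o a: a binary string is
    -- its own fold, with one step function per bit.
    type ChurchStr a = (a -> a, a -> a) -> a -> a

    str :: [Bool] -> ChurchStr a
    str bs (f0, f1) x = foldr (\b -> if b then f1 else f0) x bs

    -- e.g. str [True, False, True] (('0' :), ('1' :)) "" == "101"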

Our encoding of exponential time algorithms is similar to Lafont's. We start by building a machine with a sufficiently long tape to handle the entire computation, and then iterate the transition function an exponential number of times. In the following theorem, we do not consider constant factors in our exponential bound for simplicity. However, a more precise statement can be given by following Lafont's accounting techniques (see Theorem 6 in [1]).

Theorem 3: If a predicate on boolean strings is computable by a Turing machine in time and space bounded by $n^{n^s}$ (for some fixed $s>0$), then there is a generic proof for the sequent $$!_{s,1} {\bf S}_0^{(2)}, !_{s,2} {\bf S}_{1/2}^{(2)}, !_{s,3} {\bf S}_{3/2}^{(2)},\dots, {\bf S}_{s-1/2}^{(2)}, {\bf S}_0 \vdash {\bf B}$$ which corresponds to this predicate.

Proof: (Sketch) There is a generic proof for the sequent $!_{s,1} {\bf S}_0, !_{s,2} {\bf S}_{1/2}, !_{s,3} {\bf S}_{3/2},\dots, {\bf S}_{s-1/2} \vdash {\bf M}_s$ that constructs a Turing machine with tape of length $n^{n^s}$, filled with 0s, and with the head at the beginning of the tape. There is a generic proof for the sequent ${\bf S}_0, {\bf M}_s \vdash {\bf M}_s$ that writes a string on the tape of the machine (assuming the tape bounds the length of the string) and puts the machine in the starting configuration. There is a generic proof for the sequent ${\bf M}_s \vdash {\bf M}_s$ that encodes the transition function of the machine; it is identical to Lafont's, but lifted to level $s$. There is a generic proof for the sequent ${\bf N}_s, {\bf M}_s \vdash {\bf M}_s$ that iterates the transition function a given number of times on a given starting configuration. There is a generic proof for the sequent $!_{s,1} {\bf S}_0, !_{s,2} {\bf S}_{1/2}, !_{s,3} {\bf S}_{3/2},\dots, {\bf S}_{s-1/2} \vdash {\bf N}_s$ that constructs a natural number of size $n^{n^s}$, bounding the total number of steps in the computation. Finally, there is a generic proof for the sequent ${\bf M}_s \vdash {\bf B}$ that determines whether the machine is in an accepting state. Putting this all together, we get a generic proof of the sequent $!_{s,1} {\bf S}_0^{(2)}, !_{s,2} {\bf S}_{1/2}^{(2)}, !_{s,3} {\bf S}_{3/2}^{(2)},\dots, {\bf S}_{s-1/2}^{(2)}, {\bf S}_0 \vdash {\bf B}$ which corresponds to the predicate. $$\tag*{$\blacksquare$}$$
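
Abstracting from the proof nets, the pipeline of Theorem 3 is Church-numeral iteration of the transition function. A Haskell sketch (ours), reusing Church from above, with Config an abstract stand-in for ${\bf M}_s$ configurations:

    data Config = Config   -- abstract stand-in for an M_s configuration

    runTM :: Church Config       -- step budget: a numeral of size n^(n^s)
          -> (Config -> Config)  -- the transition function (M_s |- M_s)
          -> Config              -- starting configuration, input on the tape
          -> Config
    runTM steps stepFn conf0 = steps stepFn conf0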

Conclusion

In this brief note, we have explored a natural extension of SLL$_2$ that captures EXPTIME complexity, adding to the growing list of the so-called light linear logics. Although we did not prove it here, it is interesting to note that hierarchical digging can be used to give a compressed-rank representation of data, and that generic proofs for sequents of the form ${\bf S}_s^{(d)} \vdash {\bf B}$ correspond to polynomial time algorithms (for all $s \geq 0)$. Finally, it appears EXPTIME cannot be captured in this system by collapsing to a finite number of levels.

References

  1. Yves Lafont (2004): Soft linear logic and polynomial time. Theoretical Computer Science 318(1-2), pp. 163–180, doi:10.1016/j.tcs.2003.10.018.

Modular Runtime Complexity Analysis of Probabilistic While Programs

Martin Avanzini (INRIA Sophia Antipolis)
Michael Schaper (University of Innsbruck)
Georg Moser (University of Innsbruck)

Introduction

We are concerned with the average case runtime complexity analysis of a prototypical imperative language \(\mathsf{pWhile}\) in the spirit of Dijkstra's Guarded Command Language. This language is endowed with primitives for sampling and probabilistic choice so that randomized algorithms can be expressed. Complexity analysis in this setting is particularly appealing as efficiency is one striking reason why randomized algorithms have been introduced and studied: in many cases, the most efficient algorithm for the problem at hand is randomized [5].

Our starting point towards an automated analysis is the \(\mathsf{ert}\)-calculus of Kaminski et al. [4], which constitutes a sound and complete method for deriving expected runtimes of probabilistic programs. The \(\mathsf{ert}\)-calculus has recently been automated in [6], with encouraging results. Indeed, their prototype Absynth can derive accurate bounds on the expected runtime of a wealth of non-trivial, albeit academic, imperative programs with probabilistic choice. Since the average case runtime of probabilistic programs is inherently non-modular (see e.g. [4]), different program fragments cannot, in general, be analysed independently within the \(\mathsf{ert}\)-calculus. This work aims at overcoming this situation by enriching the calculus with a form of expected value analysis. Conceptually, our result rests on the observation that if \(f\) and \(g\) measure the runtime of deterministic programs \(\mathtt{C}\) and \(\mathtt{D}\) as a function of the variable assignment \(\sigma\) before executing the command, then \(f(\sigma) + g(\sigma')\), for \(\sigma'\) the store after the execution of \(\mathtt{C}\), gives the runtime of the composed command \(\mathtt{C};\mathtt{D}\). Estimating \(\sigma'\) in terms of \(\mathtt{C}\) and \(\sigma\), and ensuring some monotonicity property on \(g\), gives rise to a compositional analysis. When \(\mathtt{C}\) exhibits probabilistic behavior, though, the command \(\mathtt{D}\) may be executed after \(\mathtt{C}\) on several probabilistic branches \(b\), each with probability \(p_b\) and a variable assignment \(\sigma_b\). Assuming bounding functions \(f\) and \(g\) on the expected runtimes of \(\mathtt{C}\) and \(\mathtt{D}\), respectively, yields a bound \(f(\sigma) + \sum_b p_b \cdot g(\sigma_b)\) on the expected runtime of the probabilistic program \(\mathtt{C};\mathtt{D}\). As the number of probabilistic branches \(b\) is unbounded for all but the most trivial programs \(\mathtt{C}\), estimating all assignments \(\sigma_b\) in terms of \(\sigma\) soon becomes infeasible. The crux of our approach towards a compositional analysis lies in the observation that if we can give the runtime of \(\mathtt{D}\) in terms of a concave function (i.e., described by a multi-linear polynomial), the expected runtime \(\sum_b p_b \cdot g(\sigma_b)\) can be bounded in terms of \(g\) and the variable assignment \(\sum_b p_b \cdot \sigma_b\) expected after executing \(\mathtt{C}\). This way, a compositional analysis is recovered. This observation then also enables some form of modularity for the analysis of nested loops.

To prove this machinery sound, we first give a novel structural operational semantics in terms of weighted probabilistic ARSs. These refine the probabilistic ARSs introduced by Bournez and Garnier [2] in that operations need not have uniform cost. Probabilistic ARSs give rise to a reduction relation on (multi-)distributions that is equivalent to the standard operational semantics via stochastic processes [1]. We then generalise the \(\mathsf{ert}\)-calculus to one for reasoning about expected costs consumed by a command \(\texttt{tick}\texttt{(} \cdot \texttt{)}\), and expected values in final configurations. This machinery is proven sound and complete with respect to our new operational semantics. Finally, we conclude with some words on a prototype implementation that we are currently developing.

A Probabilistic While Language

For a finite set of integer-valued variables \(\mathsf{Var}\), we denote by \(\Sigma\triangleq\mathsf{Var}\to \mathbb{Z}\) the set of stores. The syntax of program commands \(\mathsf{Cmd}\) over \(\mathsf{Var}\) is given as follows: \[\begin{aligned} \mathtt{C},\mathtt{D} & \mathrel{::=}\texttt{tick}\texttt{(} r \texttt{)} \mid x\,\texttt{:=}\,d \mid \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \mid \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \mid \texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}} \mid \texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}} \mid \mathtt{C}; \mathtt{D}\;.\end{aligned}\] In this grammar, \(\phi\in \mathsf{BExp}\) denotes a Boolean expression over \(\mathsf{Var}\) and \(d\in \mathsf{DExp}\) an integer-valued distribution expression over \(\mathsf{Var}\). With \([\![\cdot]\!] \colon\mathsf{DExp}\to \Sigma\to \mathcal{D}(\mathbb{Z})\) we denote the evaluation function of distribution expressions, i.e., \([\![d]\!](\sigma)\) gives the result of evaluating \(d\) under the current store \(\sigma\), resulting in a probability distribution over \(\mathbb{Z}\). For Boolean expressions \(\phi\in \mathsf{BExp}\) and \(\sigma\in \Sigma\), we indicate with \(\sigma\vDash\phi\) that \(\phi\) holds when the variables in \(\phi\) take values according to \(\sigma\). Program commands are fairly standard. The command \(\texttt{tick}\texttt{(} r \texttt{)}\) consumes \(r \in \mathbb{Q}^+\) resource units but otherwise acts as a no-op. The command \(x\,\texttt{:=}\,d\) assigns a value sampled from \([\![d]\!](\sigma)\) to \(x\), for \(\sigma\) the current store. This generalises the usual non-probabilistic assignment \(x\,\texttt{:=}\,e\) for \(e\) an integer expression. The commands \(\texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}}\) and \(\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\) have the usual semantics, with \(\psi\) being an invariant. Here, an invariant holds along all probabilistic branches (e.g. with probabilistic choice over-approximated by non-determinism) and can in practice be inferred with off-the-shelf methods. If \(\psi\) does not hold, the program terminates abnormally. The command \(\texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}}\) executes either \(\mathtt{C}\) or \(\mathtt{D}\), in a non-deterministic fashion. In contrast, the probabilistic choice \(\texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}}\) executes \(\mathtt{C}\) with probability \(0 \leq p \leq 1\) and \(\mathtt{D}\) with probability \(1-p\).
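
For reference, the command syntax (together with \(\mathtt{skip}\) and \(\mathtt{abort}\), which appear in Figure 2 below) transcribes to the following Haskell datatype. The sketch is ours; representing \(\mathsf{DExp}\) as finite probability/expression lists anticipates the restriction adopted in the implementation section:

    type Var  = String
    type Prob = Rational

    data AExp = AVar Var | ALit Integer | Add AExp AExp | Mul AExp AExp
    data BExp = BLit Bool | Leq AExp AExp | Not BExp | And BExp BExp
    type DExp = [(Prob, AExp)]        -- finite distribution expressions only

    data Cmd
      = Skip                          -- skip
      | Abort                         -- abort
      | Tick Rational                 -- tick(r)
      | Assign Var DExp               -- x := d
      | If BExp BExp Cmd Cmd          -- if [psi] (phi) {C} {D}
      | While BExp BExp Cmd           -- while [psi] (phi) {C}
      | Nondet Cmd Cmd                -- {C} <> {D}
      | Choice Prob Cmd Cmd           -- {C} [p] {D}
      | Seq Cmd Cmd                   -- C ; D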

We give the small step operational semantics for our language via a (weighted) probabilistic ARS \({\stackrel{}{\to}}\) over configurations \(\mathsf{Conf}\triangleq(\mathsf{Cmd}\times \Sigma) \cup \Sigma\cup \{\bot\}\). Elements \((\mathtt{C},\sigma) \in \mathsf{Conf}\) are denoted by \(\langle \mathtt{C} \rangle(\sigma)\) and signal that the command \(\mathtt{C}\) is to be executed under the current store \(\sigma\), whereas \(\sigma\in \mathsf{Conf}\) and \(\bot\in \mathsf{Conf}\) indicate that the computation has halted, abnormally in the latter case. The probabilistic ARS \(\stackrel{}{\to}\) is depicted in Figure 1. In this reduction system, rules have the form \(\gamma\stackrel{w}{\to} \mu\) for \(\gamma\in\mathsf{Conf}\) and \(\mu\) a multidistribution over \(\mathsf{Conf}\), i.e., countable multisets of the form \(\{\!\{ p_i:\gamma_i \}\!\}_{i \in I}\) for probabilities \(0 < p_i \leq 1\) with \(\sum_{i \in I} p_i \leq 1\) and \(\gamma_i \in \mathsf{Conf}\) (\(i \in I\)). A rule \(\gamma\stackrel{w}{\to} \{\!\{ p_i:\gamma_i \}\!\}_{i \in I}\) signals that \(\gamma\) reduces with probability \(p_i\) to \(\gamma_i\), consuming cost \(w\). By identifying Dirac multidistributions \(\{\!\{ 1:\gamma \}\!\}\) with \(\gamma\), we may write \(\gamma \stackrel{w}{\to} \gamma'\) for a reduction step without probabilistic effect.

\[\infer{\text{[Tick]}}{\langle \texttt{tick}\texttt{(} r \texttt{)} \rangle(\sigma) \stackrel{r}{\to} \sigma}{} \quad \infer{\text{[Assign]}}{\langle x\,\texttt{:=}\,d \rangle(\sigma)\stackrel{0}{\to} \{\!\{ p_i : \sigma[x \mathrel{:=} i] \mid i \in \mathbb{Z}, p_i = [\![d]\!](\sigma)(i) > 0 \}\!\}} {}\] \[\infer{\text{[IfT]}} {\langle \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{C} \rangle(\sigma)} {\phantom{abcdefgh} \sigma\vDash\psi\land \phi\phantom{abcdefgh}} \quad \quad \infer{\text{[IfF]}} {\langle \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{D} \rangle(\sigma)} { \sigma\vDash\psi\land \neg\phi}\] \[\infer{\text{[WhT]}} {\langle \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{C}; \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma)} { \sigma\vDash\psi\land \phi} \quad \quad \infer{\text{[WhF]}} {\langle \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \sigma} { \sigma\vDash\psi\land \neg\phi}\] \[\infer{\text{[AbortIf]}} {\langle \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \bot} { \sigma\vDash\neg\psi} \quad \quad \infer{\text{[AbortWhile]}} {\langle \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \bot} { \sigma\vDash\neg\psi}\] \[\infer{\text{[ChoiceL]}} {\langle \texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{C} \rangle(\sigma)}{} {} \quad \quad \infer{\text{[ChoiceR]}} {\langle \texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{D} \rangle(\sigma)}{} {}\] \[\infer{\text{[ProbChoice]}} {\langle \texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \{\!\{ p : \langle \mathtt{C} \rangle(\sigma), 1-p : \langle \mathtt{D} \rangle(\sigma) \}\!\} } {}\] \[\begin{array}{c} \infer{\text{[Compose]}} {\langle \mathtt{C};\mathtt{D} \rangle(\sigma) \stackrel{r}{\to} \{\!\{ p_i:\mathsf{step}_\mathtt{D}(\gamma_i) \}\!\}_{i \in I}} {\langle \mathtt{C} \rangle(\sigma) \stackrel{r}{\to} \{\!\{ p_i:\gamma_i \}\!\}_{i \in I}} \end{array} \text{ where } \mathsf{step}_\mathtt{D}(\gamma) \triangleq \begin{cases} \langle \mathtt{C};\mathtt{D} \rangle(\sigma) & \text{if $\gamma= \langle \mathtt{C} \rangle(\sigma)$} \\ \langle \mathtt{D} \rangle(\sigma) & \text{if $\gamma= \sigma\in \Sigma$} \\ \bot& \text{if $\gamma= \bot$.} \end{cases}\]
Figure 1: One-step reduction relation as weighted probabilistic ARS (Excerpt).
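
An executable excerpt of Figure 1, reusing the syntax sketched above (ours; the nondeterministic rules and the [Compose] case are elided):

    type Store   = Var -> Integer
    data Conf    = Run Cmd Store | Halt Store | Fault   -- Cmd x Sigma, Sigma, bottom
    type MDist a = [(Rational, a)]                      -- multidistributions

    -- One step: the cost of the step together with the successor
    -- multidistribution (rules [Tick] and [ProbChoice]).
    step :: Conf -> Maybe (Rational, MDist Conf)
    step (Run (Tick r) s)       = Just (r, [(1, Halt s)])
    step (Run (Choice p c d) s) = Just (0, [(p, Run c s), (1 - p, Run d s)])
    step _                      = Nothing               -- remaining rules elided

    -- Expected value of f on the halted part of mu (the restriction to Sigma).
    expect :: (Store -> Rational) -> MDist Conf -> Rational
    expect f mu = sum [p * f s | (p, Halt s) <- mu]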

The weighted one-step reduction relation of \(\stackrel{}{\to}\) is defined by (i) \(\mu\stackrel{0}{\twoheadrightarrow} \mu\), (ii) \(\{\!\{ 1:\gamma \}\!\} \stackrel{w}{\twoheadrightarrow} \mu\) if \(\gamma \stackrel{w}{\to} \mu\), and (iii) \(\biguplus_{i\in I} p_i\cdot\mu_i \stackrel{w}{\twoheadrightarrow} \biguplus_{i\in I} p_i\cdot\nu_i\) where \(w = \sum_{i \in I} p_i \cdot w_i\), \(\mu_i \stackrel{w_i}{\to} \nu_i\) for all \(i \in I\) and \(\sum_{i\in I}p_i \le 1\) for probabilities \(0 < p_i \leq 1\). Here, \(\biguplus_{i\in I} p_i\cdot\mu_i\) denotes the countable convex union of multidistributions \(\mu_i\) (\(i \in I\)), e.g., \(\frac{1}{2} \cdot \{\!\{ 1 : a \}\!\} \uplus \frac{1}{2} \cdot \{\!\{ \frac{1}{3} : a, \frac{1}{2} : b \}\!\} = \{\!\{ \frac{1}{2} : a, \frac{1}{6} : a, \frac{1}{4} : b \}\!\}\).

Finally, with \(\stackrel{w}{\twoheadrightarrow}\) we denote the weighted multi-step reduction relation defined by \(\mu\stackrel{w}{\twoheadrightarrow} \nu\) if \(\mu= \mu_0 \stackrel{w_1}{\twoheadrightarrow} \cdots \stackrel{w_n}{\twoheadrightarrow} \mu_n = \nu\) and \(w = \sum_{i=1}^n w_i\). Expected cost and value functions for \(f \colon\Sigma\to \mathbb{R}_{\ge 0}^{\infty}\) are defined by \[\begin{aligned} \mathsf{ec}[\mathtt{C}](\sigma) & \triangleq\sup\{ w \mid \langle \mathtt{C} \rangle(\sigma) \stackrel{w}{\twoheadrightarrow} \mu\} & \mathsf{ev}[\mathtt{C}](\sigma)(f) & \triangleq\sup\{ \mathbb{E}_{{\mu{\downharpoonright}\Sigma}}(f) \mid \langle \mathtt{C} \rangle(\sigma) \stackrel{w}{\twoheadrightarrow} \mu\}\;,\end{aligned}\] where \({\mu{\downharpoonright}\Sigma}\) denotes the restriction of \(\mu\) to elements of \(\Sigma\) and \(\mathbb{E}_{\nu}(f) \triangleq\sum_{i \in I} p_i \cdot f(a_i)\) gives the expected value of \(f\) with respect to \(\nu= \{\!\{ p_i : a_i \}\!\}_{i \in I}\).

Expectation Transformers

To overcome the problems concerning composability, Kaminski et al. [4] express the expected runtime in continuation passing style, via an expectation transformer \(\mathsf{ert}[\cdot] \colon\mathsf{Cmd}\to \mathbb{T}\to \mathbb{T}\) over expectations \(\mathbb{T}\triangleq\Sigma\to \mathbb{R}_{\ge 0}^{\infty}\). Given the cost \(f\) of executing a program fragment \(\mathtt{D}\), \(\mathsf{ert}[\mathtt{C}](f)\) computes the cost of first executing \(\mathtt{C}\) and then \(\mathtt{D}\). We adapt this transformer into two transformers \(\mathsf{ect}[\mathtt{C}]\) and \(\mathsf{evt}[\mathtt{C}]\) that compute the expected cost and expected value function of the program \(\mathtt{C}\), respectively. Their definitions coincide except for the case \(\mathtt{C}= \texttt{tick}\texttt{(} r \texttt{)}\), where the former takes into account the cost \(r\) while the latter ignores it. We thus generalise \(\mathsf{ect}[\cdot] \colon\mathsf{Cmd}\to \mathbb{T}\to \mathbb{T}\) and \(\mathsf{evt}[\cdot] \colon\mathsf{Cmd}\to \mathbb{T}\to \mathbb{T}\) to a function \(\mathsf{et}_{c}[\mathtt{C}]\) and set \(\mathsf{ect}[\mathtt{C}] \triangleq\mathsf{et}_{\top}[\mathtt{C}]\) and \(\mathsf{evt}[\mathtt{C}] \triangleq\mathsf{et}_{\bot}[\mathtt{C}]\), where \(\mathsf{et}_{c}[\mathtt{C}]\) is given in Figure 2. Here, functions \(f \colon(\mathbb{R}_{\ge 0}^{\infty})^k \to \mathbb{R}_{\ge 0}^{\infty}\) are extended pointwise on expectations and denoted in bold face, e.g., for each \(r \in \mathbb{R}_{\ge 0}^{\infty}\) we have a constant function \(\mathbf{r}(\sigma) \triangleq r\), \(f \mathrel{\mathbf{+}}g \triangleq\lambda \sigma. f(\sigma) + g(\sigma)\) for \(f,g \in \mathbb{T}\), etc. For \(\phi\in \mathsf{BExp}\) we use Iverson's bracket \([\phi]\) to denote the expectation function \([\phi](\sigma) \triangleq 1\) if \(\sigma\vDash\phi\), and \([\phi](\sigma) \triangleq 0\) otherwise. Finally, with \(\mu F.e\) we denote the least fixed point of the function \(\lambda F.e \colon\mathbb{T}\to \mathbb{T}\) with respect to the pointwise ordering \(\preceq\) on expectations. It can be shown that (\(\mathbb{T},{\preceq}\)) forms an \(\omega\)-CPO with bottom element \(\mathbf{0}\) and top element \(\mathbf{\infty}\), and that the transformer \(\mathsf{et}_{c}[\mathtt{C}]\) is \(\omega\)-continuous. Consequently, \(\mathsf{et}_{c}[\mathtt{C}]\) is well-defined.

\[\begin{aligned} \mathsf{et}_{c}[\mathtt{skip}](f) & \triangleq f \\ \mathsf{et}_{c}[\texttt{tick}\texttt{(} r \texttt{)}](f) & \triangleq\mathbf{[} c\mathbf{]} \mathrel{\mathbf{*}}\mathbf{r} \mathrel{\mathbf{+}}f \\ \mathsf{et}_{c}[\mathtt{abort}](f) & \triangleq\mathbf{0}\\ \mathsf{et}_{c}[x\,\texttt{:=}\,d](f) & \triangleq\lambda \sigma. \mathbb{E}_{[\![d]\!](\sigma)}(\lambda i. f(\sigma[x \mathrel{:=} i])) \\ \mathsf{et}_{c}[\texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}}](f) & \triangleq\mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{C}](f) \mathrel{\mathbf{+}}\mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{D}](f) \\ \mathsf{et}_{c}[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}](f) & \triangleq\mu F. \mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{C}](F) \mathrel{\mathbf{+}}\mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}f \\ \mathsf{et}_{c}[\texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}}](f) & \triangleq\mathbf{max}(\mathsf{et}_{c}[\mathtt{C}](f),\mathsf{et}_{c}[\mathtt{D}](f)) \\ \mathsf{et}_{c}[\texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}}](f) & \triangleq\mathbf{p} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{C}](f) \mathrel{\mathbf{+}}(\mathbf{1-p}) \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{D}](f) \\ \mathsf{et}_{c}[\mathtt{C}; \mathtt{D}](f) & \triangleq\mathsf{et}_{c}[\mathtt{C}](\mathsf{et}_{c}[\mathtt{D}](f)) \end{aligned}\]
Figure 2: Definition of expectation transformer.
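
The clauses of Figure 2 read off as an interpreter on expectations. The Haskell sketch below is ours: it uses Double (whose \(1/0\) is \(\infty\)) as carrier for \(\mathbb{R}_{\ge 0}^{\infty}\), and since the least fixed point is the supremum of its Kleene approximants, the while clause replaces \(\mu F\) by an \(n\)-fold iteration from \(\mathbf{0}\) and thus approximates it from below:

    type Exp' = Store -> Double   -- carrier for T = Sigma -> R>=0 (extended)

    iverson :: Bool -> Double
    iverson b = if b then 1 else 0

    update :: Var -> Integer -> Store -> Store
    update x v s y = if y == x then v else s y

    evalA :: AExp -> Store -> Integer
    evalA (AVar x)  s = s x
    evalA (ALit n)  _ = n
    evalA (Add a b) s = evalA a s + evalA b s
    evalA (Mul a b) s = evalA a s * evalA b s

    evalB :: BExp -> Store -> Bool
    evalB (BLit b)  _ = b
    evalB (Leq a b) s = evalA a s <= evalA b s
    evalB (Not b)   s = not (evalB b s)
    evalB (And a b) s = evalB a s && evalB b s

    -- et cost fuel cmd f: cost = True gives ect, cost = False gives evt;
    -- fuel bounds the Kleene iteration used for while loops.
    et :: Bool -> Int -> Cmd -> Exp' -> Exp'
    et _ _ Skip         f = f
    et _ _ Abort        _ = const 0
    et c _ (Tick r)     f = \s -> iverson c * fromRational r + f s
    et _ _ (Assign x d) f = \s ->
      sum [fromRational p * f (update x (evalA a s) s) | (p, a) <- d]
    et c n (If psi phi p q) f = \s ->
        iverson (evalB psi s && evalB phi s)       * et c n p f s
      + iverson (evalB psi s && not (evalB phi s)) * et c n q f s
    et c n (Nondet p q)    f = \s -> max (et c n p f s) (et c n q f s)
    et c n (Choice pr p q) f = \s ->
        fromRational pr * et c n p f s + fromRational (1 - pr) * et c n q f s
    et c n (Seq p q)       f = et c n p (et c n q f)
    et c n (While psi phi body) f = iterate phiF (const 0) !! n
      where phiF g s =
                iverson (evalB psi s && evalB phi s)       * et c n body g s
              + iverson (evalB psi s && not (evalB phi s)) * f s

As the fuel n grows, the while clause converges to \(\mu F\) from below; et True plays the role of \(\mathsf{ect}\) and et False that of \(\mathsf{evt}\).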

We note that \(\mathsf{evt}[\mathtt{C}]\) coincides with the weakest precondition transformer \(\mathsf{wp}[\mathtt{C}]\) of Olmedo et al. [7] on fully probabilistic programs, i.e., those without non-deterministic choice. In contrast to \(\mathsf{evt}[\mathtt{C}]\), \(\mathsf{wp}[\mathtt{C}]\) minimises over non-deterministic choice.

For expectations \(f\), we lift the function \(\mathsf{et}_{c}[\cdot](f) \colon\mathsf{Cmd}\to \Sigma\to \mathbb{R}_{\ge 0}^{\infty}\) to a function \(\overline{\mathsf{et}}_{c}(f) \colon\mathsf{Conf}\to \mathbb{R}_{\ge 0}^{\infty}\) by \(\overline{\mathsf{et}}_{c}(f)(\langle \mathtt{C} \rangle(\sigma)) \triangleq\mathsf{et}_{c}[\mathtt{C}](f)(\sigma)\), \(\overline{\mathsf{et}}_{c}(f)(\sigma) \triangleq f(\sigma)\) and \(\overline{\mathsf{et}}_{c}(f)(\bot) \triangleq 0\). The following constitutes our first technical result. It tells us that \(\mathsf{et}_{c}[\cdot](f)\) decreases in expectation along reductions, taking into account the cost of steps in the case of \(\mathsf{ect}[\cdot](f)\).

Theorem 1 \(\mathbb{E}_{\mu}(\overline{\mathsf{et}}_{c}(f)) = \sup \{ [c] \cdot w + \mathbb{E}_{\nu}(\overline{\mathsf{et}}_{c}(f)) \mid \mu\stackrel{w}{\twoheadrightarrow} \nu\}\).

To prove this theorem, we first show its variations based on the probabilistic ARS \(\stackrel{}{\to}\) and the single-step reduction relation \(\stackrel{}{\twoheadrightarrow}\). Both of these intermediate results follow by a straightforward induction on the corresponding reduction relation. The following is then immediate:

Corollary 1 (Soundness and Completeness of Expectation Transformers). For all commands \(\mathtt{C}\in \mathsf{Cmd}\) and stores \(\sigma\in \Sigma\), (i) \(\mathsf{ec}[\mathtt{C}](\sigma) = \mathsf{ect}[\mathtt{C}](\mathbf{0})(\sigma)\) and (ii) \(\mathsf{ev}[\mathtt{C}](\sigma)(f) = \mathsf{evt}[\mathtt{C}](f)(\sigma)\).

By (i), the expected cost of running \(\mathtt{C}\) is given by \(\mathsf{ect}[\mathtt{C}](\mathbf{0})\). When \(\mathtt{C}\) does not contain loops, the latter is easily computable. To treat loops, Kaminski et al. [4] propose to search for upper invariants: \(I_f \colon\mathbb{T}\) is an upper invariant for \(\mathtt{C}= \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{D} \texttt{\}}\) with respect to \(f \in \mathbb{T}\) if it is a pre-fixpoint of the functional through which \(\mathsf{et}_{c}[\mathtt{C}](f)\) is defined.

Proposition 1 \(\mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{D}](I_f) \mathrel{\mathbf{+}}\mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}f \preceq I_f \implies \mathsf{et}_{c}[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{D} \texttt{\}}](f) \preceq I_f\).

This immediately suggests the following two stage approach towards an automated expected runtime analysis of a program \(\mathtt{C}\) via Corollary 1 (i): In the first stage, one evaluates \(\mathsf{et}_{c}[\mathtt{C}](\mathbf{0})\) symbolically on some form of cost expressions \(\mathsf{CExp}\), generating constraints according to Proposition 1 whenever a while-loop is encountered. Based on the collection of generated constraints, in the second stage concrete upper invariants can be synthesised. From these, a symbolic upper bound on the expected cost \(\mathsf{ec}[\mathtt{C}]\) can be constructed. Conceptually, this is the approach taken by Absynth [6], where \(\mathsf{ert}[\mathtt{C}]\) is formulated in terms of a Hoare style calculus, and \(\mathsf{CExp}\) is amenable to Linear Programming.
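
As a small worked example of Proposition 1 (ours), consider the loop \(\mathtt{C}_{\mathrm{geo}} = \texttt{while [true] (x = 1) \{ tick(1); \{x := 0\} [1/2] \{x := 1\} \}}\) and \(f = \mathbf{0}\), and take \(I_f \triangleq \mathbf{[} x = 1 \mathbf{]} \mathrel{\mathbf{*}} \mathbf{2}\). For the loop body \(\mathtt{D}\), Figure 2 gives \[\mathsf{et}_{\top}[\mathtt{D}](I_f)(\sigma) = 1 + \tfrac{1}{2} \cdot I_f(\sigma[x \mathrel{:=} 0]) + \tfrac{1}{2} \cdot I_f(\sigma[x \mathrel{:=} 1]) = 1 + 0 + 1 = 2\;,\] so \(\mathbf{[} x = 1 \mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{\top}[\mathtt{D}](I_f) \mathrel{\mathbf{+}}\mathbf{[} x \neq 1 \mathbf{]} \mathrel{\mathbf{*}}\mathbf{0} = I_f\) and the premise of Proposition 1 holds. Hence \(\mathsf{ect}[\mathtt{C}_{\mathrm{geo}}](\mathbf{0}) \preceq I_f\): starting from \(x = 1\), at most \(2\) is consumed in expectation, matching the expected number of trials of a geometric distribution with success probability \(\tfrac{1}{2}\).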

Towards A Compositional Analysis

With Proposition 1 alone it is in general not possible to modularize this procedure so that individual components can be treated separately. In particular, nested loops generate mutual constraints that cannot be solved independently. Of course, this situation is in general unavoidable, as the problem itself is inherently non-modular. Nevertheless, with Theorem 2 below, we give conditions under which this global analysis can be broken down into a local one.

For expectations \(\vec{g} = g_{1},\ldots,g_{k}\) and \(f \colon(\mathbb{R}_{\ge 0}^{\infty})^k \to \mathbb{R}_{\ge 0}^{\infty}\), let us denote the composition \(\lambda \sigma. f(\vec{g}(\sigma))\) by \(f \mathrel{\circ}\vec{g}\). Call \(f\) concave if \(f(p \cdot \vec{r} + (1-p) \cdot \vec{s}) \geq p \cdot f(\vec{r}) + (1-p) \cdot f(\vec{s})\) (where \(0 \leq p \leq 1\)) and (weakly) monotone if \(\vec{r} \geq \vec{s}\) implies \(f(\vec{r}) \geq f(\vec{s})\). The following presents our central observation:

Lemma 1 \(\mathsf{ect}[\mathtt{C}](f \mathrel{\circ}(g_{1},\ldots,g_{k})) \preceq\mathsf{ec}[\mathtt{C}] \mathrel{\mathbf{+}}f \mathrel{\circ}(\mathsf{evt}[\mathtt{C}](g_1),\dots,\mathsf{evt}[\mathtt{C}](g_k))\) if \(f\) is monotone and concave.
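
The role of concavity is to let expectations be pushed inside \(f\): by Jensen's inequality, for probabilities \(p_b\) with \(\sum_b p_b \leq 1\) (sub-distributions are covered since \(f(\vec{0}) \geq 0\)), \[\sum_b p_b \cdot f(\vec{g}(\sigma_b)) \;\leq\; f\Bigl(\sum_b p_b \cdot \vec{g}(\sigma_b)\Bigr)\;,\] which is exactly the step needed to bound the expected runtime \(\sum_b p_b \cdot g(\sigma_b)\) from the introduction.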

The intuition behind this lemma is as follows. The functions \(g_i \colon\Sigma\to \mathbb{R}_{\ge 0}^{\infty}\), also referred to as norms, represent an abstract view on program stores \(\sigma\). In the simplest case, \(g_i\) could denote the absolute value of the \(\text{$i$}^{\text{th}}\) variable. If \(f\) measures the expected resource consumption of \(\mathtt{D}\) in terms of the \(g_i\), i.e., \(\mathsf{ec}[\mathtt{D}](\sigma) \leq f(g_1(\sigma),\dots,g_k(\sigma))\), by monotonicity of \(\mathsf{ect}[\mathtt{C}]\) this lemma then tells us that \[\mathsf{ec}[\mathtt{C};\mathtt{D}](\sigma) \leq \mathsf{ect}[\mathtt{C}](f \mathrel{\circ}(g_1,\dots,g_k))(\sigma) \leq \mathsf{ec}[\mathtt{C}](\sigma) + f(\mathsf{evt}[\mathtt{C}](g_1)(\sigma),\dots, \mathsf{evt}[\mathtt{C}](g_k)(\sigma)) \; .\] The expected cost of \(\mathtt{C};\mathtt{D}\) is thus the expected cost of \(\mathtt{C}\), plus the expected cost of \(\mathtt{D}\) measured in the values \(\mathsf{evt}[\mathtt{C}](g_i)\) of the norms \(g_i\) expected after executing \(\mathtt{C}\). Note that concavity can be dropped when \(\mathtt{C}\) admits no probabilistic behaviour. Combining this lemma with Proposition 1 then yields:

Theorem 2 For monotone and concave \(g\), \[\begin{gathered} \mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\bigl(\mathsf{ec}[\mathtt{C}] \mathrel{\mathbf{+}}g \mathrel{\circ}(\mathsf{evt}[\mathtt{C}](g_1),\dots,\mathsf{evt}[\mathtt{C}](g_k))\bigr) \preceq g \mathrel{\circ}(g_{1},\ldots,g_{k}) \\ \land \mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}f \preceq g \mathrel{\circ}(g_{1},\ldots,g_{k}) \implies \mathsf{ect}[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}](f) \preceq g \mathrel{\circ}(g_{1},\ldots,g_{k}) \; . \end{gathered}\]

Implementation

At the moment, we are working on a prototype implementation that accepts \(\mathsf{pWhile}\) programs with finite distributions over integer expressions \(a\) in probabilistic assignments. Integer expressions \(a,b\) and cost expressions \(c,d \in \mathsf{CExp}\) over variables \(x \in \mathsf{Var}\), integers \(z \in \mathbb{Z}\), and constants \(q \in \mathbb{Q}_{\ge 0}\) are given as follows: \[a,b \mathrel{::=}x \mid z \mid a + b \mid a * b \mid \dots \qquad c,d \mathrel{::=}q \mid \operatorname{\mathsf{nat}}(a) \mid \mathbf{[} \phi\mathbf{]} *c\mid c +d \mid c *d \mid \mathsf{max}(c,d)\] Norms \(\operatorname{\mathsf{nat}}(a)\) lift expressions that depend on the store to cost expressions. For brevity, the interpretation of norms is fixed to \(\operatorname{\mathsf{nat}}(a) \triangleq\max(0,a)\). All other operations are interpreted in the expected way. We denote the evaluation function of cost expressions also by \([\![\cdot]\!] \colon \mathsf{CExp}\to \Sigma \to \mathbb{Q}_{\ge 0}\). Notice that \([\![c]\!] \in \mathbb{T}\). To automate the cost inference of programs we provide a variation of the expectation transformer, \(\mathsf{et}^\sharp_{c}[\cdot]\colon{\mathsf{Cmd}\to \mathsf{CExp}\to \mathsf{CExp}}\) (as well as \(\mathsf{ect}^\sharp\) and \(\mathsf{evt}^\sharp\)), sound in the following sense:

Theorem 3 \(\mathsf{et}_{c}[\mathtt{C}]([\![f]\!]) \preceq[\![\mathsf{et}^\sharp_{c}[\mathtt{C}](f)]\!]\), for all commands \(\mathtt{C}\in \mathsf{Cmd}\) and cost expressions \(f \in \mathsf{CExp}\).

The function \(\mathsf{et}^\sharp_{c}[\cdot]\) is defined along the lines of \(\mathsf{et}_{c}[\cdot]\) from Figure 2. As an example, consider the assignment case, which is defined by \(\mathsf{et}^\sharp_{c}[x\,\texttt{:=}\,\{p_1\colon a_1, \ldots, p_k \colon a_k\}](f) \triangleq\sum_{1 \leqslant i \leqslant k} {p_i} \cdot f[a_i/x]\). To obtain closed-form expressions on while loops we make use of decomposition (cf. Theorem 2) and, should that fail, upper invariants (cf. Proposition 1). Notably, using decomposition we can define a recursive strategy that infers bounds on loops individually. We comment on the application of Theorem 2 in the implementation. Assume that we want to compute \(\mathsf{ect}^\sharp[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}](f)\). First, we compute \(g = \mathsf{ect}^\sharp[\mathtt{C}](0)\). We heuristically select norms \(g_1,\ldots,g_k\) based on the invariants and conditions of the program (e.g. \(\operatorname{\mathsf{nat}}(x-y)\) for condition \(x > y\)). Second, we recursively compute \(h_i = \mathsf{evt}^\sharp[\mathtt{C}](g_i)\) for all \(g_i\). We have \(\mathsf{ect}[\mathtt{C}](\mathbf{0}) \preceq[\![g]\!]\) and \(\mathsf{evt}[\mathtt{C}]({[\![g_i]\!]}) \preceq[\![h_i]\!]\). Third, we express the necessary conditions as constraints over cost expressions: \[\begin{aligned} \psi\wedge \phi& \vDash g + \mathtt{h}\mathrel{\circ}(h_1,\ldots,h_k)\leqslant \mathtt{h}\mathrel{\circ}(g_1,\ldots,g_k)\\ \psi\wedge \neg \phi& \vDash f \leqslant \mathtt{h}\mathrel{\circ}(g_1,\ldots,g_k)\; . \end{aligned}\] A constraint \(\phi\vDash c \leqslant d\) holds if \([\![c]\!](\sigma) \leq [\![d]\!](\sigma)\) for all stores \(\sigma\) with \(\sigma\vDash\phi\). When generating constraints, only \(\mathtt{h}\) is unknown. To obtain a concrete cost expression for \(\mathtt{h}\) we follow the method presented in [3]. Here \(\mathtt{h}\) is a template expression with undetermined coefficients \(q_i\) (e.g. \(\lambda h_1 \cdots h_k.\ \sum_i q_i \cdot h_i\)), and we search for an assignment such that all constraints hold and \(q_i \geqslant 0\). We apply case-elimination and case-distinction to reduce the problem \(\phi\vDash c \leqslant d\) to inequality constraints over polynomials. For example, given a norm \(\operatorname{\mathsf{nat}}(a) = \max(0,a)\), we eliminate \(\max\) when we can show that \([\![a]\!] \geqslant 0\) for all states that satisfy \(\phi\). The obtained polynomial inequality constraints contain undetermined coefficient variables; we reduce the problem to certification of non-negativity, which can then be solved using SMT solvers.
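
To illustrate, the symbolic layer can be sketched as follows (our transcription, reusing the syntax datatypes from the sketches above; the prototype's actual representation may differ). Substitution pushes through to the arithmetic leaves, and the assignment clause of \(\mathsf{et}^\sharp\) becomes a probability-weighted sum of substitution instances:

    -- Cost expressions CExp, following the grammar above.
    data CExp
      = Q Rational                    -- constants q in Q>=0
      | Nat AExp                      -- nat(a) = max(0, a)
      | Guard BExp CExp               -- [phi] * c
      | CAdd CExp CExp | CMul CExp CExp | CMax CExp CExp

    substA :: Var -> AExp -> AExp -> AExp
    substA x a (AVar y)  = if y == x then a else AVar y
    substA x a (Add b c) = Add (substA x a b) (substA x a c)
    substA x a (Mul b c) = Mul (substA x a b) (substA x a c)
    substA _ _ b         = b

    substB :: Var -> AExp -> BExp -> BExp
    substB x a (Leq b c) = Leq (substA x a b) (substA x a c)
    substB x a (Not b)   = Not (substB x a b)
    substB x a (And b c) = And (substB x a b) (substB x a c)
    substB _ _ b         = b

    -- c[a/x]: substitution on the leaves of a cost expression.
    substC :: Var -> AExp -> CExp -> CExp
    substC x a (Nat b)     = Nat (substA x a b)
    substC x a (Guard g c) = Guard (substB x a g) (substC x a c)
    substC x a (CAdd c d)  = CAdd (substC x a c) (substC x a d)
    substC x a (CMul c d)  = CMul (substC x a c) (substC x a d)
    substC x a (CMax c d)  = CMax (substC x a c) (substC x a d)
    substC _ _ c           = c

    -- et#[x := {p1: a1, ..., pk: ak}](f) = sum_i pi * f[ai/x]
    -- (assumes a nonempty finite distribution).
    etSharpAssign :: Var -> DExp -> CExp -> CExp
    etSharpAssign x d f = foldr1 CAdd [CMul (Q p) (substC x a f) | (p, a) <- d]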

References

Introduction

We are concerned with the average case runtime complexity analysis of a prototypical imperative language \(\mathsf{pWhile}\) in the spirit of Dijkstra's Guarded Command Language. This language is endowed with primitives for sampling and probabilistic choice so that randomized algorithms can be expressed. Complexity analysis in this setting is particularly appealing as efficiency is one striking reason why randomized algorithms have been introduced and studied: in many cases, the most efficient algorithm for the problem at hand is randomized [5].

Our staring point towards an automated analysis is the \(\mathsf{ert}\)-calculus of Kaminski et al. [4], which constitutes a sound and complete method for deriving expected runtimes of probabilistic programs. The \(\mathsf{ert}\)-calculus has been recently automated within [6], showing encouraging results. Indeed, their prototype Absynth can derive accurate bounds on the expected runtime of a wealth of non-trivial, albeit academic, imperative programs with probabilistic choice. Since the average case runtime of probabilistic programs is inherently non-modular (see e.g. [4]), different program fragments cannot be analysed in general independently within the \(\mathsf{ert}\)-calculus. This work aims at overcoming this situation, by enriching the calculus with a form of expected value analysis. Conceptually, our result rests on the observation that if \(f\) and \(g\) measure the runtime of deterministic programs \(\mathtt{C}\) and \(\mathtt{D}\) as a function in the variables assignment \(\sigma\) before executing the command, then \(f(\sigma) + g(\sigma')\) for \(\sigma'\) the store after the execution of \(\mathtt{C}\) gives the runtime of the composed command \(\mathtt{C};\mathtt{D}\). Estimating \(\sigma'\) in terms of \(\mathtt{C}\) and \(\sigma\), and ensuring some monotonicity property on \(g\), gives rise to a compositional analysis. When \(\mathtt{C}\) exhibits some probabilistic behavior though, the command \(\mathtt{D}\) may be executed after \(\mathtt{C}\) on several probabilistic branches \(b\), each with probability \(p_b\) with a variable assignment \(\sigma_b\). Assuming bounding functions \(f\) and \(g\) on the expected runtime of \(\mathtt{C}\) and \(\mathtt{D}\) respectively, yields a bound \(f(\sigma) + \sum_b p_b \cdot g(\sigma_b)\) on the expected runtime of the probabilistic program \(\mathtt{C};\mathtt{D}\). As the number of probabilistic branches \(b\) is unbounded for all but the most trivial programs \(\mathtt{C}\), estimating all assignments \(\sigma_b\) in terms of \(\sigma\) soon becomes infeasible. The crux of our approach towards a compositional analysis lies in the observation that if we can give the runtime of \(\mathtt{D}\) in terms of a concave function (i.e., described by a multi-linear polynomial), the expected runtime \(\sum_b p_b \cdot g(\sigma_b)\) can be bounded in terms of \(g\) and the variable assignment \(\sum_b p_b \cdot \sigma_b\) expected after executing \(\mathtt{C}\). This way, a compositional analysis is recovered. This observation then also enables some form of modularity for the analysis of nested loops.

To prove this machinery sound, we first give a novel structural operational semantics in terms of weighted probabilistic ARSs. These constitute a refinement to probabilistic ARSs introduced by Bournez and Garnier [2] where operations do not necessarily have uniform cost. Probabilistic ARSs give rise to a reduction relation on (multi-)distributions that is equivalent to the standard operational semantic via stochastic processes [1]. We then generalise the \(\mathsf{ert}\)-calculus to one for reasoning about expected costs consumed by a command \(\texttt{tick}\texttt{(} \cdot \texttt{)}\), and expected values in final configurations. This machinery is proven sound and complete with respect to our new operational semantics. Finally, we conclude with some words on a prototype implementation that we are currently developing.

A Probabilistic While Language

For a finite set of integer-valued variables \(\mathsf{Var}\), we denote by \(\Sigma\triangleq\mathsf{Var}\to \mathbb{Z}\) the set of stores. The syntax of program commands \(\mathsf{Cmd}\) over \(\mathsf{Var}\) is given as follows: \[\begin{aligned} \mathtt{C},\mathtt{D} & \mathrel{::=}\texttt{tick}\texttt{(} r \texttt{)} \mid x\,\texttt{:=}\,d \mid \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \mid \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \mid \texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}} \mid \texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}} \mid \mathtt{C}; \mathtt{D}\;.\end{aligned}\] In this grammar, \(\phi\in \mathsf{BExp}\) denotes a Boolean expression over \(\mathsf{Var}\) and \(d\in \mathsf{DExp}\) an Integer-valued distribution expression over \(\mathsf{Var}\). With \([\![\cdot]\!] \colon\mathsf{DExp}\to \Sigma\to \mathcal{D}(\mathbb{Z})\) we denote the evaluation functions of distribution expressions, i.e., \([\![d]\!](\sigma)\) gives the result of evaluating \(d\) under the current store \(\sigma\), resulting in a probability distribution over \(\mathbb{Z}\). For Boolean expressions \([\![\phi]\!] \in \mathsf{BExp}\) and \(\sigma\in \Sigma\), we indicate with \(\sigma\vDash\phi\) that \(\phi\) holds when the variables in \(\phi\) take values according to \(\sigma\). Program commands are fairly standard. The command \(\texttt{tick}\texttt{(} r \texttt{)}\) consumes \(r \in \mathbb{Q}^+\) resource units but otherwise acts as a no-op. The command \(x\,\texttt{:=}\,d\) assigns a value sampled from \(d(\sigma)\) to \(x\), for \(\sigma\) the current store. This generalises the usual non-probabilistic assignment \(x\,\texttt{:=}\,e\) for \(e\) an integer expression. The commands \(\texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}}\) and \(\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\) have the usual semantics, with \(\psi\) being an invariant. Here, an invariant holds along all probabilistic branches (e.g. probabilistic choice over-approximated with non-determinism) and can in practice be inferred with off-the shelf methods. In case that \(\psi\) does not hold, the program terminates abnormally. The command \(\texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}}\) executes either \(\mathtt{C}\) or \(\mathtt{D}\), in a non-deterministic fashion. In contrast, the probabilistic choice \(\texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}}\) executes \(\mathtt{C}\) with probability \(0 \leq p \leq 1\) and \(\mathtt{D}\) with probability \(1-p\).

We give the small step operational semantics for our language via a (weighted) probabilistic ARS \({\stackrel{}{\to}}\) over configurations \(\mathsf{Conf}\triangleq(\mathsf{Cmd}\times \Sigma) \cup \Sigma\cup \{\bot\}\). Elements \((\mathtt{C},\sigma) \in \mathsf{Conf}\) are denoted by \(\langle \mathtt{C} \rangle(\sigma)\) and signal that the command \(\mathtt{C}\) is to be executed under the current store \(\sigma\), whereas \(\sigma\in \mathsf{Conf}\) and \(\bot\in \mathsf{Conf}\) indicate that the computation has halted, abnormally in the latter case. The probabilistic ARS \(\stackrel{}{\to}\) is depicted in Figure 1. In this reduction system, rules have the form \(\gamma\stackrel{w}{\to} \mu\) for \(\gamma\in\mathsf{Conf}\) and \(\mu\) a multidistribution over \(\mathsf{Conf}\), i.e., countable multisets of the form \(\{\!\{ p_i:\gamma_i \}\!\}_{i \in I}\) for probabilities \(0 < p_i \leq 1\) with \(\sum_{i \in I} p_i \leq 1\) and \(\gamma_i \in \mathsf{Conf}\) ( \(i \in I\)). A rule \(\gamma\stackrel{w}{\to} \{\!\{ p_i:\gamma_i \}\!\}_{i \in I}\) signals that \(\gamma\) reduces with probability \(p_i\) to \(\gamma_i\), consuming cost \(w\). By identifying dirac multidistributions \(\{\!\{ 1:\gamma \}\!\}\) with \(\gamma\), we may write \(\gamma \stackrel{w}{\to} \gamma'\) for a reduction step without probabilistic effect.

\[\infer{\text{[Tick]}}{\langle \texttt{tick}\texttt{(} r \texttt{)} \rangle(\sigma) \stackrel{r}{\to} \sigma}{} \quad \infer{\text{[Assign]}}{\langle x\,\texttt{:=}\,d \rangle(\sigma)\stackrel{0}{\to} \{\!\{ p_i : \sigma[x \mathrel{:=} i] \mid i \in \mathbb{Z}, p_i = [\![d]\!](\sigma)(i) > 0 \}\!\}} {}\] \[\infer{\text{[IfT]}} {\langle \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{C} \rangle(\sigma)} {\phantom{abcdefgh} \sigma\vDash\psi\land \phi\phantom{abcdefgh}} \quad \quad \infer{\text{[IfF]}} {\langle \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{D} \rangle(\sigma)} { \sigma\vDash\psi\land \neg\phi}\] \[\infer{\text{[WhT]}} {\langle \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{C}; \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma)} { \sigma\vDash\psi\land \phi} \quad \quad \infer{\text{[WhF]}} {\langle \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \sigma} { \sigma\vDash\psi\land \neg\phi}\] \[\infer{\text{[AbortIf]}} {\langle \texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \bot} { \sigma\vDash\neg\psi} \quad \quad \infer{\text{[AbortWhile]}} {\langle \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \bot} { \sigma\vDash\neg\psi}\] \[\infer{\text{[ChoiceL]}} {\langle \texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{C} \rangle(\sigma)}{} {} \quad \quad \infer{\text{[ChoiceR]}} {\langle \texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \langle \mathtt{D} \rangle(\sigma)}{} {}\] \[\infer{\text{[ProbChoice]}} {\langle \texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \{\!\{ p : \langle \mathtt{C} \rangle(\sigma), 1-p : \langle \mathtt{D} \rangle(\sigma) \}\!\} } {}\] \[\begin{array}{c} \infer{\text{[Compose]}} {\langle \mathtt{C};\mathtt{D} \rangle(\sigma) \stackrel{r}{\to} \{\!\{ p_i:\mathsf{step}_\mathtt{D}(\gamma_i) \}\!\}_{i \in I}} {\langle \mathtt{C} \rangle(\sigma) \stackrel{r}{\to} \{\!\{ p_i:\gamma_i \}\!\}_{i \in I}} \end{array} \text{ where } \mathsf{step}_\mathtt{D}(\gamma) \triangleq \begin{cases} \langle \mathtt{C};\mathtt{D} \rangle(\sigma) & \text{if $\gamma= \langle \mathtt{C} \rangle(\sigma)$} \\ \langle \mathtt{D} \rangle(\sigma) & \text{if $\gamma= \sigma\in \Sigma$} \\ \bot& \text{if $\gamma= \bot$.} \end{cases}\]
Figure 1: One-step reduction relation as weighted probabilistic ARS (Excerpt).
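To illustrate the rules (our example), the probabilistic choice between an assignment and a tick reduces by [ProbChoice] as \[\langle \texttt{\{} x\,\texttt{:=}\,0 \texttt{\}}\, \texttt{[} 1/2 \texttt{]}\, \texttt{\{} \texttt{tick}\texttt{(}1\texttt{)} \texttt{\}} \rangle(\sigma) \stackrel{0}{\to} \{\!\{ \tfrac{1}{2} : \langle x\,\texttt{:=}\,0 \rangle(\sigma),\ \tfrac{1}{2} : \langle \texttt{tick}\texttt{(}1\texttt{)} \rangle(\sigma) \}\!\}\;,\] after which \(\langle x\,\texttt{:=}\,0 \rangle(\sigma) \stackrel{0}{\to} \sigma[x \mathrel{:=} 0]\) by [Assign] (a Dirac multidistribution, since the assigned expression is deterministic), and \(\langle \texttt{tick}\texttt{(}1\texttt{)} \rangle(\sigma) \stackrel{1}{\to} \sigma\) by [Tick].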

The lifting of \(\to\) to a weighted one-step reduction relation \(\twoheadrightarrow\) over multidistributions is defined by (i) \(\mu\stackrel{0}{\twoheadrightarrow} \mu\), (ii) \(\{\!\{ 1:\gamma \}\!\} \stackrel{w}{\twoheadrightarrow} \mu\) if \(\gamma \stackrel{w}{\to} \mu\), and (iii) \(\biguplus_{i\in I} p_i\cdot\mu_i \stackrel{w}{\twoheadrightarrow} \biguplus_{i\in I} p_i\cdot\nu_i\) if \(\mu_i \stackrel{w_i}{\twoheadrightarrow} \nu_i\) for all \(i \in I\), where \(w = \sum_{i \in I} p_i \cdot w_i\) and the probabilities \(0 < p_i \leq 1\) satisfy \(\sum_{i\in I}p_i \le 1\). Here, \(\biguplus_{i\in I} p_i\cdot\mu_i\) denotes the countable convex union of the multidistributions \(\mu_i\) (\(i \in I\)), e.g., \(\frac{1}{2} \cdot \{\!\{ 1 : a \}\!\} \uplus \frac{1}{2} \cdot \{\!\{ \frac{1}{3} : a, \frac{1}{2} : b \}\!\} = \{\!\{ \frac{1}{2} : a, \frac{1}{6} : a, \frac{1}{4} : b \}\!\}\).

Finally, \(\stackrel{w}{\twoheadrightarrow}\) also denotes the weighted multi-step reduction relation, defined by \(\mu\stackrel{w}{\twoheadrightarrow} \nu\) if \(\mu= \mu_0 \stackrel{w_1}{\twoheadrightarrow} \cdots \stackrel{w_n}{\twoheadrightarrow} \mu_n = \nu\) and \(w = \sum_{i=1}^n w_i\). The expected cost of a command \(\mathtt{C}\) and the expected value of \(f \colon\Sigma\to \mathbb{R}_{\ge 0}^{\infty}\) after running \(\mathtt{C}\) are defined by \[\begin{aligned} \mathsf{ec}[\mathtt{C}](\sigma) & \triangleq\sup\{ w \mid \langle \mathtt{C} \rangle(\sigma) \stackrel{w}{\twoheadrightarrow} \mu\} & \mathsf{ev}[\mathtt{C}](\sigma)(f) & \triangleq\sup\{ \mathbb{E}_{{\mu{\downharpoonright}\Sigma}}(f) \mid \langle \mathtt{C} \rangle(\sigma) \stackrel{w}{\twoheadrightarrow} \mu\}\;,\end{aligned}\] where \({\mu{\downharpoonright}\Sigma}\) denotes the restriction of \(\mu\) to elements of \(\Sigma\) and \(\mathbb{E}_{\nu}(f) \triangleq\sum_{i \in I} p_i \cdot f(a_i)\) is the expected value of \(f\) with respect to \(\nu= \{\!\{ p_i : a_i \}\!\}_{i \in I}\).
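For illustration (an example of ours, reused below), consider the geometric loop \[\mathtt{C}_{\mathsf{geo}} \;\triangleq\; \texttt{while}\ \texttt{[} \top \texttt{]}\ \texttt{(} x = 1 \texttt{)}\ \texttt{\{} \texttt{tick}\texttt{(}1\texttt{)}; \texttt{\{} x\,\texttt{:=}\,0 \texttt{\}}\,\texttt{[} 1/2 \texttt{]}\,\texttt{\{} \mathtt{skip} \texttt{\}} \texttt{\}}\;.\] Starting from a store \(\sigma\) with \(\sigma(x) = 1\), the loop performs exactly \(n\) iterations with probability \(2^{-n}\), each contributing cost \(1\), so \(\mathsf{ec}[\mathtt{C}_{\mathsf{geo}}](\sigma) = \sum_{n \ge 1} n \cdot 2^{-n} = 2\), while \(\mathsf{ec}[\mathtt{C}_{\mathsf{geo}}](\sigma) = 0\) whenever \(\sigma(x) \neq 1\).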

Expectation Transformers

To overcome the problems concerning composability, Kaminski et al. [4] express the expected runtime in continuation passing style, via an expectation transformer \(\mathsf{ert}[\cdot] \colon\mathsf{Cmd}\to \mathbb{T}\to \mathbb{T}\) over expectations \(\mathbb{T}\triangleq\Sigma\to \mathbb{R}_{\ge 0}^{\infty}\). Given the cost \(f\) of executing a program fragment \(\mathtt{D}\), \(\mathsf{ert}[\mathtt{C}](f)\) computes the cost of first executing \(\mathtt{C}\) and then \(\mathtt{D}\). We adapt this transformer to two transformers \(\mathsf{ect}[\mathtt{C}]\) and \(\mathsf{evt}[\mathtt{C}]\) that compute the expected cost and the expected value function of the program \(\mathtt{C}\), respectively. Their definitions coincide except in the case \(\mathtt{C}= \texttt{tick}\texttt{(} r \texttt{)}\), where the former takes the cost \(r\) into account while the latter ignores it. We thus generalise \(\mathsf{ect}[\cdot] \colon\mathsf{Cmd}\to \mathbb{T}\to \mathbb{T}\) and \(\mathsf{evt}[\cdot] \colon\mathsf{Cmd}\to \mathbb{T}\to \mathbb{T}\) to a single function \(\mathsf{et}_{c}[\cdot]\), parameterised by a flag \(c\), and set \(\mathsf{ect}[\mathtt{C}] \triangleq\mathsf{et}_{\top}[\mathtt{C}]\) and \(\mathsf{evt}[\mathtt{C}] \triangleq\mathsf{et}_{\bot}[\mathtt{C}]\), where \(\mathsf{et}_{c}[\mathtt{C}]\) is given in Figure 2. Here, functions \(f \colon(\mathbb{R}_{\ge 0}^{\infty})^k \to \mathbb{R}_{\ge 0}^{\infty}\) are extended pointwise to expectations and denoted in bold face, e.g., for each \(r \in \mathbb{R}_{\ge 0}^{\infty}\) we have a constant function \(\mathbf{r}(\sigma) \triangleq r\), and \(f \mathrel{\mathbf{+}}g \triangleq\lambda \sigma.\, f(\sigma) + g(\sigma)\) for \(f,g \in \mathbb{T}\), etc. For \(\phi\in \mathsf{BExp}\), the Iverson bracket \([\phi]\) denotes the expectation \([\phi](\sigma) \triangleq 1\) if \(\sigma\vDash\phi\), and \([\phi](\sigma) \triangleq 0\) otherwise. Finally, \(\mu F.e\) denotes the least fixed point of the function \(\lambda F.e \colon\mathbb{T}\to \mathbb{T}\) with respect to the pointwise ordering \(\preceq\) on expectations. It can be shown that \((\mathbb{T},{\preceq})\) forms an \(\omega\)-CPO with bottom element \(\mathbf{0}\) and top element \(\mathbf{\infty}\), and that the functional defining \(\mathsf{et}_{c}[\mathtt{C}]\) is \(\omega\)-continuous. Consequently, \(\mathsf{et}_{c}[\mathtt{C}]\) is well-defined.

\[\begin{aligned} \mathsf{et}_{c}[\mathtt{skip}](f) & \triangleq f \\ \mathsf{et}_{c}[\texttt{tick}\texttt{(} r \texttt{)}](f) & \triangleq\mathbf{[} c\mathbf{]} \mathrel{\mathbf{*}}\mathbf{r} \mathrel{\mathbf{+}}f \\ \mathsf{et}_{c}[\mathtt{abort}](f) & \triangleq\mathbf{0}\\ \mathsf{et}_{c}[x\,\texttt{:=}\,d](f) & \triangleq\lambda \sigma. \mathbb{E}_{[\![d]\!](\sigma)}(\lambda i. f(\sigma[x \mathrel{:=} i])) \\ \mathsf{et}_{c}[\texttt{if}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\ \texttt{\{} \mathtt{D} \texttt{\}}](f) & \triangleq\mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{C}](f) \mathrel{\mathbf{+}}\mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{D}](f) \\ \mathsf{et}_{c}[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}](f) & \triangleq\mu F. \mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{C}](F) \mathrel{\mathbf{+}}\mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}f \\ \mathsf{et}_{c}[\texttt{\{} \mathtt{C} \texttt{\}} \, \texttt{<>} \, \texttt{\{} \mathtt{D} \texttt{\}}](f) & \triangleq\mathbf{max}(\mathsf{et}_{c}[\mathtt{C}](f),\mathsf{et}_{c}[\mathtt{D}](f)) \\ \mathsf{et}_{c}[\texttt{\{} \mathtt{C} \texttt{\}}\, \texttt{[} p \texttt{]}\, \texttt{\{} \mathtt{D} \texttt{\}}](f) & \triangleq\mathbf{p} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{C}](f) \mathrel{\mathbf{+}}(\mathbf{1-p}) \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{D}](f) \\ \mathsf{et}_{c}[\mathtt{C}; \mathtt{D}](f) & \triangleq\mathsf{et}_{c}[\mathtt{C}](\mathsf{et}_{c}[\mathtt{D}](f)) \end{aligned}\]
Figure 2: Definition of expectation transformer.
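Figure 2 has a direct functional reading. The following Haskell sketch (our own, deliberately naive) implements \(\mathsf{et}_{c}\) for a fragment of the language, approximating \(\mathbb{R}_{\ge 0}^{\infty}\) by Double and the least fixed point \(\mu F\) by fuel-bounded Kleene iteration from \(\mathbf{0}\), which by \(\omega\)-continuity approaches the true value from below; probabilistic assignments and the invariant annotations \(\psi\) are omitted for brevity.

    import qualified Data.Map.Strict as Map

    type Store       = Map.Map String Integer
    type Expectation = Store -> Double   -- T = Σ → R∞≥0, approximated by Double

    data Cmd
      = Skip
      | Tick Double
      | Assign String (Store -> Integer)   -- deterministic x := a only
      | Seq Cmd Cmd
      | PChoice Double Cmd Cmd             -- {C} [p] {D}
      | NChoice Cmd Cmd                    -- {C} <> {D}
      | While (Store -> Bool) Cmd          -- guard φ; invariant ψ omitted

    -- et c C f, where the flag c selects ect (True) or evt (False).
    et :: Bool -> Cmd -> Expectation -> Expectation
    et _ Skipf            f = f
    et c (Tick r)         f = \s -> (if c then r else 0) + f s
    et _ (Assign x a)     f = \s -> f (Map.insert x (a s) s)
    et c (Seq p q)        f = et c p (et c q f)
    et c (PChoice p a b)  f = \s -> p * et c a f s + (1 - p) * et c b f s
    et c (NChoice a b)    f = \s -> max (et c a f s) (et c b f s)
    et c (While phi body) f = kleene (1000 :: Int) (const 0)
      where
        -- one unfolding of the loop: [φ] * et_c[body](F) + [¬φ] * f
        unfold g s | phi s     = et c body g s
                   | otherwise = f s
        kleene 0 g = g
        kleene n g = kleene (n - 1) (unfold g)
      where Skipf = Skip  -- (typo guard; see note below)

    -- The geometric loop  while [⊤] (x = 1) { tick(1); {x := 0} [1/2] {skip} }:
    geo :: Cmd
    geo = While (\s -> Map.findWithDefault 0 "x" s == 1)
                (Seq (Tick 1) (PChoice 0.5 (Assign "x" (const 0)) Skip))
    -- et True geo (const 0) (Map.fromList [("x", 1)])  evaluates to ≈ 2.0

(In the first clause of et, read `Skip` for `Skipf`.) On the geometric loop from above this reproduces the expected cost 2; an actual analyser manipulates cost expressions symbolically rather than iterating fixed points, as the transformer \(\mathsf{et}^\sharp\) in the Implementation section does.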

We note that \(\mathsf{evt}[\mathtt{C}]\) coincides with the weakest precondition transformer \(\mathsf{wp}[\mathtt{C}]\) of Olmedo et al. [7] on fully probabilistic programs, i.e., those without non-deterministic choice. In contrast to \(\mathsf{evt}[\mathtt{C}]\), \(\mathsf{wp}[\mathtt{C}]\) minimises over non-deterministic choice.

For expectations \(f\), we lift the function \(\mathsf{et}_{c}[\cdot](f) \colon\mathsf{Cmd}\to \Sigma\to \mathbb{R}_{\ge 0}^{\infty}\) to a function \(\overline{\mathsf{et}}_{c}(f) \colon\mathsf{Conf}\to \mathbb{R}_{\ge 0}^{\infty}\) on configurations by \(\overline{\mathsf{et}}_{c}(f)(\langle \mathtt{C} \rangle(\sigma)) \triangleq\mathsf{et}_{c}[\mathtt{C}](f)(\sigma)\), \(\overline{\mathsf{et}}_{c}(f)(\sigma) \triangleq f(\sigma)\) and \(\overline{\mathsf{et}}_{c}(f)(\bot) \triangleq 0\). The following constitutes our first technical result; it states that \(\overline{\mathsf{et}}_{c}(f)\) decreases in expectation along reductions, taking into account the cost of steps in the case of \(\mathsf{ect}[\cdot](f)\).

Theorem 1 \(\mathbb{E}_{\mu}(\overline{\mathsf{et}}_{c}(f)) = \sup \{ [c] \cdot w + \mathbb{E}_{\nu}(\overline{\mathsf{et}}_{c}(f)) \mid \mu\stackrel{w}{\twoheadrightarrow} \nu\}\).

To prove this theorem, we first establish variants of it for the probabilistic ARS \(\to\) and for the one-step reduction relation \(\twoheadrightarrow\). Both intermediate results follow by a straightforward induction on the corresponding reduction relation. The following is then immediate:

Corollary 1 (Soundness and Completeness of Expectation Transformers). For all commands \(\mathtt{C}\in \mathsf{Cmd}\) and stores \(\sigma\in \Sigma\), (i) \(\mathsf{ec}[\mathtt{C}](\sigma) = \mathsf{ect}[\mathtt{C}](\mathbf{0})(\sigma)\) and (ii) \(\mathsf{ev}[\mathtt{C}](\sigma)(f) = \mathsf{evt}[\mathtt{C}](f)(\sigma)\).

By (i), the expected cost of running \(\mathtt{C}\) is given by \(\mathsf{ect}[\mathtt{C}](\mathbf{0})\). When \(\mathtt{C}\) does not contain loops, the latter is easily computable. To treat loops, Kaminski et al. [4] propose searching for upper invariants: \(I_f \in\mathbb{T}\) is an upper invariant for \(\mathtt{C}= \texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{D} \texttt{\}}\) with respect to \(f \in \mathbb{T}\) if it is a pre-fixed point of the functional through which \(\mathsf{et}_{c}[\mathtt{C}](f)\) is defined (cf. Figure 2).

Proposition 1 \(\mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{c}[\mathtt{D}](I_f) \mathrel{\mathbf{+}}\mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}f \preceq I_f \implies \mathsf{et}_{c}[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{D} \texttt{\}}](f) \preceq I_f\).
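To illustrate (continuing our geometric example from above), let \(\mathtt{D}_{\mathsf{geo}} \triangleq \texttt{tick}\texttt{(}1\texttt{)}; \texttt{\{} x\,\texttt{:=}\,0 \texttt{\}}\,\texttt{[} 1/2 \texttt{]}\,\texttt{\{} \mathtt{skip} \texttt{\}}\) be the body of \(\mathtt{C}_{\mathsf{geo}}\). For \(f = \mathbf{0}\), the expectation \(I \triangleq \mathbf{2} \mathrel{\mathbf{*}} \mathbf{[} x{=}1 \mathbf{]}\) is an upper invariant, since \(\mathsf{et}_{\top}[\mathtt{D}_{\mathsf{geo}}](I)(\sigma) = 1 + \tfrac12 \cdot I(\sigma[x \mathrel{:=} 0]) + \tfrac12 \cdot I(\sigma)\), and hence \[\mathbf{[} \top\land x{=}1 \mathbf{]} \mathrel{\mathbf{*}}\mathsf{et}_{\top}[\mathtt{D}_{\mathsf{geo}}](I) \mathrel{\mathbf{+}}\mathbf{[} \top\land x{\neq}1 \mathbf{]} \mathrel{\mathbf{*}}\mathbf{0} = \mathbf{[} x{=}1 \mathbf{]} \mathrel{\mathbf{*}}\bigl(\mathbf{1} \mathrel{\mathbf{+}}\tfrac12 \cdot \mathbf{0} \mathrel{\mathbf{+}}\tfrac12 \cdot \mathbf{2}\bigr) \preceq I\;.\] Proposition 1 thus certifies \(\mathsf{ect}[\mathtt{C}_{\mathsf{geo}}](\mathbf{0}) \preceq\mathbf{2} \mathrel{\mathbf{*}} \mathbf{[} x{=}1 \mathbf{]}\), matching the exact expected cost computed earlier.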

This immediately suggests a two-stage approach towards an automated expected runtime analysis of a program \(\mathtt{C}\) via Corollary 1 (i). In the first stage, one evaluates \(\mathsf{et}_{c}[\mathtt{C}](\mathbf{0})\) symbolically over some form of cost expressions \(\mathsf{CExp}\), generating constraints according to Proposition 1 whenever a while-loop is encountered. In the second stage, concrete upper invariants are synthesised from the collected constraints. From these, a symbolic upper bound on the expected cost \(\mathsf{ec}[\mathtt{C}]\) can be constructed. Conceptually, this is the approach taken by Absynth [6], where \(\mathsf{ert}[\mathtt{C}]\) is formulated in terms of a Hoare-style calculus and \(\mathsf{CExp}\) is amenable to linear programming.

Towards a Compositional Analysis

With Proposition 1 alone, it is in general not possible to modularise this procedure so that individual components can be treated separately. In particular, nested loops generate mutual constraints that cannot be solved independently. This situation is in general unavoidable, as the problem itself is inherently non-modular. Nevertheless, with Theorem 2 below we give conditions under which this global analysis can be broken down into a local one.

For expectations \(\vec{g} = g_{1},\ldots,g_{k}\) and \(f \colon(\mathbb{R}_{\ge 0}^{\infty})^k \to \mathbb{R}_{\ge 0}^{\infty}\), let us denote the composition \(\lambda \sigma.\, f(\vec{g}(\sigma))\) by \(f \mathrel{\circ}\vec{g}\). Call \(f\) concave if \(f(p \cdot \vec{r} + (1-p) \cdot \vec{s}) \geq p \cdot f(\vec{r}) + (1-p) \cdot f(\vec{s})\) for all \(0 \leq p \leq 1\) and all \(\vec{r},\vec{s}\), and (weakly) monotone if \(\vec{r} \geq \vec{s}\) implies \(f(\vec{r}) \geq f(\vec{s})\). The following presents our central observation:

Lemma 1 \(\mathsf{ect}[\mathtt{C}](f \mathrel{\circ}(g_{1},\ldots,g_{k})) \preceq\mathsf{ec}[\mathtt{C}] \mathrel{\mathbf{+}}f \mathrel{\circ}(\mathsf{evt}[\mathtt{C}](g_1),\dots,\mathsf{evt}[\mathtt{C}](g_k))\), provided \(f\) is monotone and concave.

The intuition behind this lemma is as follows. The functions \(g_i \colon\Sigma\to \mathbb{R}_{\ge 0}^{\infty}\), also referred to as norms, represent an abstract view on program stores \(\sigma\). In the simplest case, \(g_i\) could denote the absolute value of the \(i^{\text{th}}\) variable. If \(f\) measures the expected resource consumption of \(\mathtt{D}\) in terms of these norms, i.e., \(\mathsf{ec}[\mathtt{D}](\sigma) \leq f(g_1(\sigma),\dots,g_k(\sigma))\), then by monotonicity of \(\mathsf{ect}[\mathtt{C}]\) the lemma tells us that \[\mathsf{ec}[\mathtt{C};\mathtt{D}](\sigma) \leq \mathsf{ect}[\mathtt{C}](f \mathrel{\circ}(g_1,\dots,g_k))(\sigma) \leq \mathsf{ec}[\mathtt{C}](\sigma) + f(\mathsf{evt}[\mathtt{C}](g_1)(\sigma),\dots, \mathsf{evt}[\mathtt{C}](g_k)(\sigma)) \; .\] The expected cost of \(\mathtt{C};\mathtt{D}\) is thus bounded by the expected cost of \(\mathtt{C}\), plus the expected cost of \(\mathtt{D}\) measured in the values \(\mathsf{evt}[\mathtt{C}](g_i)\) of the norms \(g_i\) expected after executing \(\mathtt{C}\). Note that concavity can be dropped when \(\mathtt{C}\) admits no probabilistic behaviour. Combining this lemma with Proposition 1 then yields:

Theorem 2 For monotone and concave \(g\), \[\begin{gathered} \mathbf{[} \psi\land \phi\mathbf{]} \mathrel{\mathbf{*}}\bigl(\mathsf{ec}[\mathtt{C}] \mathrel{\mathbf{+}}g \mathrel{\circ}(\mathsf{evt}[\mathtt{C}](g_1),\dots,\mathsf{evt}[\mathtt{C}](g_k))\bigr) \preceq g \mathrel{\circ}(g_{1},\ldots,g_{k}) \\ \land \mathbf{[} \psi\land \neg\phi\mathbf{]} \mathrel{\mathbf{*}}f \preceq g \mathrel{\circ}(g_{1},\ldots,g_{k}) \implies \mathsf{ect}[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}](f) \preceq g \mathrel{\circ}(g_{1},\ldots,g_{k}) \; . \end{gathered}\]
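Note that every linear template \(f(r_1,\ldots,r_k) = q_0 + \sum_{i} q_i \cdot r_i\) with coefficients \(q_i \geq 0\) is monotone and concave, so Lemma 1 and Theorem 2 apply in particular to the templates synthesised by the implementation described below. As a small illustration of Lemma 1 (our example): take \(f\) the identity, the norm \(g_1(\sigma) = \max(0,\sigma(x))\), the tick-free command \(\mathtt{C}\equiv\texttt{\{} x\,\texttt{:=}\,2 * x \texttt{\}}\, \texttt{[} 1/2 \texttt{]}\, \texttt{\{} x\,\texttt{:=}\,0 \texttt{\}}\), and any \(\mathtt{D}\) with \(\mathsf{ec}[\mathtt{D}] \preceq g_1\). Then \(\mathsf{evt}[\mathtt{C}](g_1)(\sigma) = \tfrac12 \max(0,2\sigma(x)) + \tfrac12 \max(0,0) = g_1(\sigma)\) and \(\mathsf{ec}[\mathtt{C}] = \mathbf{0}\), so Lemma 1 yields \(\mathsf{ec}[\mathtt{C};\mathtt{D}] \preceq g_1\).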

Implementation

At the moment, we are working on a prototype implementation that accepts \(\mathsf{pWhile}\) programs with finite distributions over integer expressions \(a\) in probabilistic assignments. Integer expressions \(a,b\) and cost expressions \(c,d \in \mathsf{CExp}\) over variables \(x \in \mathsf{Var}\), integers \(z \in \mathbb{Z}\) and constants \(q \in \mathbb{Q}_{\ge 0}\) are given as follows: \[a,b \mathrel{::=}x \mid z \mid a + b \mid a * b \mid \dots \qquad c,d \mathrel{::=}q \mid \operatorname{\mathsf{nat}}(a) \mid \mathbf{[} \phi\mathbf{]} *c\mid c +d \mid c *d \mid \mathsf{max}(c,d)\] Norms \(\operatorname{\mathsf{nat}}(a)\) lift integer expressions, which depend on the store, to cost expressions. For brevity, the interpretation of norms is fixed to \(\operatorname{\mathsf{nat}}(a) \triangleq\max(0,a)\); all other operations are interpreted in the expected way. We denote the evaluation function of cost expressions also by \([\![\cdot]\!] \colon \mathsf{CExp}\to \Sigma \to \mathbb{Q}_{\ge 0}\); notice that \([\![c]\!] \in \mathbb{T}\). To automate the cost inference of programs we provide a variation of the expectation transformer, \(\mathsf{et}^\sharp_{c}[\cdot]\colon{\mathsf{Cmd}\to \mathsf{CExp}\to \mathsf{CExp}}\) (and likewise \(\mathsf{ect}^\sharp\) and \(\mathsf{evt}^\sharp\)), which is sound in the following sense:
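A direct representation of this grammar, together with the evaluation function \([\![\cdot]\!]\), might look as follows in Haskell (a sketch with our own names; guards \([\phi]\) are represented semantically as store predicates here, whereas the prototype keeps them syntactic):

    import qualified Data.Map.Strict as Map

    type Store = Map.Map String Integer

    data AExp                          -- integer expressions a, b
      = Var String | Lit Integer
      | Add AExp AExp | Mul AExp AExp

    data CExp                          -- cost expressions c, d
      = Const Rational                 -- q ∈ Q≥0
      | Nat AExp                       -- nat(a) ≜ max(0, a)
      | Guard (Store -> Bool) CExp     -- [φ] * c
      | Plus CExp CExp | Times CExp CExp | Max CExp CExp

    evalA :: Store -> AExp -> Integer
    evalA s (Var x)   = Map.findWithDefault 0 x s   -- absent variables read as 0
    evalA _ (Lit z)   = z
    evalA s (Add a b) = evalA s a + evalA s b
    evalA s (Mul a b) = evalA s a * evalA s b

    -- [[·]] : CExp → Σ → Q≥0
    eval :: CExp -> Store -> Rational
    eval (Const q)   _ = q
    eval (Nat a)     s = fromInteger (max 0 (evalA s a))
    eval (Guard p c) s = if p s then eval c s else 0
    eval (Plus c d)  s = eval c s + eval d s
    eval (Times c d) s = eval c s * eval d s
    eval (Max c d)   s = max (eval c s) (eval d s)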

Theorem 3 \(\mathsf{et}_{c}[\mathtt{C}]([\![f]\!]) \preceq[\![\mathsf{et}^\sharp_{c}[\mathtt{C}](f)]\!]\), for all commands \(\mathtt{C}\in \mathsf{Cmd}\) and cost expressions \(f \in \mathsf{CExp}\).

The function \(\mathsf{et}^\sharp_{c}[\cdot]\) is defined along the lines of \(\mathsf{et}_{c}[\cdot]\) from Figure 2. As an example, consider the probabilistic assignment, for which \(\mathsf{et}^\sharp_{c}[x\,\texttt{:=}\,\{p_1\colon a_1, \ldots, p_k \colon a_k\}](f) \triangleq\sum_{1 \leqslant i \leqslant k} {p_i} \cdot f[a_i/x]\). To obtain closed-form expressions for while loops we make use of decomposition (cf. Theorem 2) and, should that fail, of upper invariants (cf. Proposition 1). Notably, using decomposition we can define a recursive strategy that infers bounds on loops individually.

We now comment on the application of Theorem 2 in the implementation. Assume that we want to compute \(\mathsf{ect}^\sharp[\texttt{while}\ \texttt{[} \psi \texttt{]}\ \texttt{(} \phi \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}](f)\). First, we compute \(g = \mathsf{ect}^\sharp[\mathtt{C}](0)\) and heuristically select norms \(g_1,\ldots,g_k\) based on the invariants and conditions of the program (e.g. \(\operatorname{\mathsf{nat}}(x-y)\) for a condition \(x > y\)). Second, we recursively compute \(h_i = \mathsf{evt}^\sharp[\mathtt{C}](g_i)\) for all \(g_i\). By Theorem 3, \(\mathsf{ect}[\mathtt{C}](\mathbf{0}) \preceq[\![g]\!]\) and \(\mathsf{evt}[\mathtt{C}]({[\![g_i]\!]}) \preceq[\![h_i]\!]\). Third, we express the premises of Theorem 2 as constraints over cost expressions: \[\begin{aligned} \psi\wedge \phi& \vDash g + \mathtt{h}\mathrel{\circ}(h_1,\ldots,h_k)\leqslant \mathtt{h}\mathrel{\circ}(g_1,\ldots,g_k)\\ \psi\wedge \neg \phi& \vDash f \leqslant \mathtt{h}\mathrel{\circ}(g_1,\ldots,g_k)\; . \end{aligned}\] A constraint \(\phi\vDash c \leqslant d\) holds if \([\![c]\!](\sigma) \leqslant [\![d]\!](\sigma)\) for all stores \(\sigma\) with \(\sigma\vDash\phi\). When generating constraints, only \(\mathtt{h}\) is unknown. To obtain a concrete cost expression for \(\mathtt{h}\) we follow the method presented in [3]: \(\mathtt{h}\) is a template expression with undetermined coefficients \(q_i\) (e.g. \(\mathtt{h} = \lambda h_1 \ldots h_k.\, \sum_i q_i \cdot h_i\)), and we search for an assignment with \(q_i \geqslant 0\) under which all constraints hold. We apply case-elimination and case-distinction to reduce a problem \(\phi\vDash c \leqslant d\) to inequality constraints between polynomials. For example, given a norm \(\operatorname{\mathsf{nat}}(a) = \max(0,a)\), we eliminate \(\max\) when we can show that \([\![a]\!](\sigma) \geqslant 0\) for all states satisfying \(\phi\). The resulting polynomial inequalities, whose coefficients contain the undetermined variables \(q_i\), are finally reduced to certificates of non-negativity, which can be discharged by SMT solvers.
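As a simple illustration of this strategy (our example, not taken from the prototype), consider the countdown loop \(\texttt{while}\ \texttt{[} \top \texttt{]}\ \texttt{(} x > 0 \texttt{)}\ \texttt{\{} \mathtt{C} \texttt{\}}\) with body \(\mathtt{C}\equiv\texttt{tick}\texttt{(}1\texttt{)}; x\,\texttt{:=}\,x-1\) and continuation \(f = 0\). The first step yields \(g = \mathsf{ect}^\sharp[\mathtt{C}](0) = 1\), and the guard suggests the norm \(g_1 = \operatorname{\mathsf{nat}}(x)\); recursively, \(h_1 = \mathsf{evt}^\sharp[\mathtt{C}](g_1) = \operatorname{\mathsf{nat}}(x-1)\). For the template \(\mathtt{h} = \lambda h_1.\, q_0 + q_1 \cdot h_1\), the generated constraints read \[\begin{aligned} x > 0 & \vDash 1 + q_0 + q_1 \cdot \operatorname{\mathsf{nat}}(x-1) \leqslant q_0 + q_1 \cdot \operatorname{\mathsf{nat}}(x) \\ x \leqslant 0 & \vDash 0 \leqslant q_0 + q_1 \cdot \operatorname{\mathsf{nat}}(x)\;. \end{aligned}\] Under \(x > 0\), case-elimination replaces \(\operatorname{\mathsf{nat}}(x-1)\) by \(x-1\) and \(\operatorname{\mathsf{nat}}(x)\) by \(x\), so the first constraint simplifies to \(1 \leqslant q_1\); the assignment \(q_0 = 0\), \(q_1 = 1\) satisfies all constraints and certifies the bound \(\operatorname{\mathsf{nat}}(x)\) on the expected (here in fact deterministic) cost of the loop.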

References

  1. M. Avanzini, U. Dal Lago & A. Yamada (2018): On Probabilistic Term Rewriting. In: Proc. of 14th FLOPS, LNCS 10818, pp. 132-148, doi:10.1007/978-3-319-90686-7_9.
  2. O. Bournez & F. Garnier (2005): Proving Positive Almost-Sure Termination. In: Proc. of 16th RTA, LNCS 3467, pp. 323-337, doi:10.1007/11805618_27.
  3. C. Fuhs, J. Giesl, A. Middeldorp, P. Schneider-Kamp, R. Thiemann & H. Zankl (2007): SAT Solving for Termination Analysis with Polynomial Interpretations. In: Proc. of 10th SAT, LNCS 4501, pp. 340-354, doi:10.1007/978-3-540-72788-0_33.
  4. B. L. Kaminski, J.-P. Katoen, C. Matheja & F. Olmedo (2016): Weakest Precondition Reasoning for Expected Run-Times of Probabilistic Programs. In: Proc. of 25th ESOP, LNCS 9632, pp. 364-389, doi:10.1007/978-3-662-49498-1_15.
  5. R. Motwani & P. Raghavan (1995): Randomized Algorithms. Cambridge University Press, doi:10.1017/CBO9780511814075.
  6. V. C. Ngo, Q. Carbonneaux & J. Hoffmann (2018): Bounded Expectations: Resource Analysis for Probabilistic Programs. In: Proc. of 39th PLDI, pp. 496-512, doi:10.1145/3192366.3192394.
  7. F. Olmedo, B. L. Kaminski, J.-P. Katoen & C. Matheja (2016): Reasoning about Recursive Probabilistic Programs. In: Proc. of 31st LICS, pp. 672-681, doi:10.1145/2933575.2935317.