UNIVERSITÀ DEGLI STUDI “G. d’Annunzio” CHIETI - PESCARA
SCUOLA SUPERIORE “G. d’Annunzio” DOTTORATO DI RICERCA in SCIENZE
Ciclo XXVI
Software Verification and Synthesis using Constraints and Program Transformation
Dipartimento di Economia
Settore Scientifico Disciplinare INF/01
DOTTORANDO COORDINATORE
Xxxx. Xxxxxxxx Xx Xxxxxxx Prof.ssa Xxxxxxx Xxxx
TUTOR
Xxxx. Xxxxx Xxxxxxxxxx
Xxxx Xxxxxxxxxx 2011/2013
Contents
Contents 2
Acknowledgements 7
Introduction 9
I Verification 13
1 Software Model Checking by Program Transformation 15
1.1 A Transformation-based Verification Framework 18
1.2 Related Work 26
2 Preliminaries 29
2.1 Constraint Logic Programming 29
2.1.1 Syntax of Constraint Logic Programs 29
2.1.2 Semantics of Constraint Logic Programs 31
2.2 Transformation Rules 32
3 Generating Verification Conditions by Specializing Interpreters 35
3.1 Encoding Imperative Programs into CLP 36
3.2 Encoding Operational Semantics into CLP 38
3.3 Encoding Partial Correctness into CLP 41
3.4 Soundness of the CLP Encoding 43
3.5 The Specialization Strategy 44
3.5.1 Termination and Soundness of the Specialization Strategy 45
3.6 Related Work 51
4 Verifying Programs by Specializing Verification Conditions 53
4.1 The Verification Method 55
4.2 The Specialization Strategy 60
4.2.1 Generalization Operators 62
4.2.2 Generalization Strategy 65
4.2.3 Termination and Soundness of the Specialization Strategy 67
4.3 Experimental Evaluation 67
4.4 Related Work 70
5 Iterated Program Specialization 73
5.1 The Verification Method 74
5.2 The Iterated Specialization Strategy 76
5.2.1 Propagation of Constraints 76
5.2.2 Lightweight Correctness Analysis 82
5.2.3 The Reverse Transformation 84
5.2.4 Soundness of the Iterated Specialization Strategy 88
5.3 Experimental Evaluation 88
5.4 Related Work 100
6 Verifying Array Programs 103
6.1 Constraint Logic Programs on Arrays 104
6.2 The Verification Method 105
6.3 The Transformation Strategy 111
6.3.1 Termination and Soundness of the Transformation Strategy 115
6.4 Experimental Evaluation 118
6.5 Related Work 124
7 Recursively Defined Properties 127
8 The VeriMAP Software Model Checker 133
8.1 Architecture 134
8.2 Usage 136
8.3 Proving Partial Correctness with the VeriMAP Tool 136
II Synthesis 145
1 Synthesizing Concurrent Programs using Answer Set Programming 147
1.1 Preliminaries 148
1.1.1 Guarded commands 148
1.1.2 Groups 149
1.1.3 Computation Tree Logic 150
1.1.4 Answer Set Programming 151
2 Specifying Concurrent Programs 153
3 Synthesizing Concurrent Programs 159
4 Synthesis examples 165
4.1 Comparison of ASP Solvers on the Synthesis Examples 170
4.2 Related Work 174
5 Proofs 179
6 Source code 191
Conclusions 195
Bibliography 199
Dedicated to the memory of my mother Xxxxxxx
Acknowledgements
This is the hardest part to write, especially for a shy person like me. I will start with three invaluable persons: Xxxxx Xxxxxxxxxx, Xxxxxxxx Xxxxxxxx, and Xxxxxxx Xxxxxxxxxx, without whom this thesis would not have been possible.
Xxxxx, my tutor and supervisor, deserves my deep respect and admiration for the devotion he puts into his role of supervisor. Every day he puts great effort, frequently including his free time, into teaching me and helping me improve my skills. His advice and constant presence are invaluable. On a personal level, I would like to thank him for his understanding, patience, and generosity. His friendship is really important to me. He is an extraordinary person: in him I found more than an outstanding supervisor; Xxxxx is the big brother I never had.
I am indebted to Xxxxxxxx, my supervisor, for supporting me every day since I started to work with him during my master’s degree. He always believed in me, encouraging me to apply for a PhD position. I will never be able to pay him back for all the precious time he spent with me. The passion he puts into his work is a source of inspiration. In addition to his generosity and patience in sharing his expertise with me, I would like to thank him for his friendship and also for the kind understanding shown during some difficult moments of my life.
Xxxxxxx has been the supervisor of both my bachelor’s and master’s degrees. I would like to thank Xxxxxxx for having taught me so much. His devotion to students and the remarkable care he puts into his work inspire me every day to work harder and to pursue higher academic achievement.
I would like to thank Xxxx. Xxxx Xxxxxxxxx and Xxxx. Xxxxxxx Xxxxxxxx for their comments on the preliminary version of this thesis. I would also like to thank Xxxx. Xxxxx Xxxxxx Xxx, Xxxx. Xxxxxxx Xxxx and Xxxx. Xxxxxxxx Xxxxxx Xxxxxxxxxx, who are the Director of the Computer Science Program, the Head of the PhD Program, and her predecessor, respectively.
The work presented in this thesis has been financially supported by the University of Chieti-Pescara, the Institute for Systems Analysis and Computer Science (IASI–CNR), the Italian Association for Logic Programming (GULP), the ICLP Doctoral Consortium 2012, and the National Group of Computing Science (GNCS–INDAM).
Finally, I would like to thank Xxxxxxx Xxxxx for his friendship and my father, relatives and friends for their love.
Xxxxxxxx Xx Xxxxxxx
March 2014
Introduction
In the last decade, formal methods applied to software production have received renewed attention as the basis of a methodology for increasing the reliability of software artefacts (for example, source code, as well as analysis and design models) and for reducing the cost of software production (for example, by reducing the time to market).
In particular, a massive effort has been made to devise automatic verification techniques, such as software model checking, for proving the correctness of programs with respect to their specifications. This thesis addresses the problem of program verification by combining and extending ideas developed in the fields of logic programming, constraint solving, abstract interpretation, and automated theorem proving. In particular, we consider program transformation of constraint logic programs to define a general verification framework which is parametric with respect to the programming language and the logic used for specifying the properties of interest. Program transformation is a software development methodology that consists in manipulating the program text by applying semantics-preserving rules. It turns out to be a very flexible and general methodology by which one can rapidly accommodate modifications of the semantics of the imperative language under consideration and of the logics used for expressing the properties of interest. Moreover, constraint logic programming, that is, logic programming extended with constraint solving, has been shown to be a powerful and flexible metalanguage for specifying the program syntax, the operational semantics, and the proof rules for many different programming languages and program properties.
A complementary approach to program verification is program synthesis which, starting from a formal specification of the intended behaviour, has the objective of automatically deriving a program that complies with the given specification. However, program synthesis is not an alternative to verification in all cases. Indeed, synthesis techniques open up the possibility of producing software artefacts that satisfy their specifications by construction, but they are much harder to put into practice, especially when scalability becomes a critical factor. This thesis addresses the problem of program synthesis by using, as done for the verification of programs, techniques based on logic and constraint solving. In particular, we consider answer set programming to define a framework for automatically deriving the synchronization protocols of concurrent programs. The formal specification of the protocol is given by temporal logic formulas. The design of a protocol is reduced to finding the stable models, also called answer sets, of the logic program that encodes the temporal specification and the semantics of both the temporal logic and the protocol implementation language. Then, the different protocols satisfying the given specification can be derived by a simple decoding of the computed answer sets.
Overview of the Thesis
The first part of this thesis is devoted to the presentation of the verification framework [37, 38, 39, 40, 41, 42, 43, 44, 45].
In Chapter 1 we introduce the reader to the verification problem and we outline our verification framework. We also show, by developing a complete example taken from [44], how the verification framework can be instantiated to prove partial correctness of programs written in a simple imperative programming language. This chapter is based on the work presented in [37, 38, 41, 42].
In Chapter 2 we recall basic notions of constraint logic programming (CLP) [83]. In particular, we present the transformation rules for CLP programs [57, 60] which will be used to define the transformation strategies that realize the various steps of the verification framework presented in Chapter 1.
In Chapter 3 we show how to generate verification conditions for proving partial correctness of imperative programs written in a subset of the C programming language. We present the CLP encoding of the imperative programs and the CLP interpreter defining the semantics of the imperative language and the logic used to reason about partial correctness. We also introduce a specialization strategy, based on the unfold/fold transformation rules, that performs the so-called removal of the interpreter: given as input the CLP encoding of an incorrectness triple specifying the partial correctness problem, it returns a set of CLP clauses expressing the verification conditions for that problem. This chapter is based on the work presented in [42, 44] and lays the foundation for
the transformation techniques which will be used to check the satisfiability of the verification conditions presented in the subsequent chapters.
In Chapter 4 we show how program specialization can be used not only as a preprocessing step to generate verification conditions, but also as a means of analysis on its own [39], as an alternative to static analysis techniques for CLP programs. We extend the specialization strategy presented in Chapter 3 and we introduce generalization operators which are used both to discover program invariants and to ensure the termination of the specialization process. The specialization strategy is an instance of the general specialization strategy presented in [42].
In Chapter 5 we further extend the verification method by introducing the iterated specialization strategy, which is based on a repeated application of program specialization [42]. During the various iterations we may apply different strategies for specializing and transforming constraint logic programs, and different operators for generalizing predicate definitions. We also perform an extended experimental evaluation of the method by using a benchmark set consisting of a significant collection of programs taken from the literature, and we compare the results with some state-of-the-art CLP-based software model checkers [43].
In Chapter 6 we extend the verification method to perform verification of imperative programs that manipulate arrays. We propose a transformation strategy which is based on the specialization strategy for generating verification conditions and the propagation of constraints as presented in Chapters 3–5. The novel transformation strategy makes use of the constraint replacement rule that is based on an axiomatization of the array data structure. This chapter basically reports the work presented in [44], which generalizes and extends the verification method presented in [40, 41].
In Chapter 7 we present a further extension of the transformation strategy to deal with recursively defined properties. In particular, we introduce the rules of conjunctive definition introduction and conjunctive folding. We show, through an example, that this extended method can be applied in a rather systematic way, and is amenable to automation by transferring to the field of program verification many techniques developed in the field of program transformation [40].
In Chapter 8 we present the VeriMAP tool, which implements the verification framework and the different strategies we have proposed. This chapter, based on the tool paper [45], also shows how to use the system by means of two examples.
In the second part of this thesis we present the synthesis framework [4, 46].
In Chapter 1 we recall basic notions of answer set programming (ASP), that is, logic programming with the stable model semantics. ASP is the metalanguage we use to define the synthesis framework. We also recall some basic notions of Computation Tree Logic and group theory, which will be used to specify the behavioural and structural properties of the concurrent programs to be synthesized.
In Chapter 2 we present the syntax and the semantics of concurrent programs. In particular, we introduce symmetric structures and symmetric concurrent programs which allow us to specify structural properties shared by the processes which compose the concurrent programs. We also introduce the usual behavioural properties such as mutual exclusion, starvation freedom, and bounded overtaking.
In Chapter 3 we introduce our synthesis procedure by defining the answer set programs which encode structural and behavioural properties. We present a result establishing the soundness and completeness of the synthesis procedure, and we also prove that this procedure has optimal time complexity.
In Chapter 4 we present some examples of synthesis of symmetric concurrent programs and we compare the results obtained by using different state-of-the-art answer set solvers.
In the Appendices we give the proofs of the results presented and the ASP source code of our synthesis procedure.
VERIFICATION
Software Model Checking by Program Transformation
Software model checking is a body of formal verification techniques for imperative programs that combine and extend ideas and techniques developed in the fields of static program analysis and model checking (see [90] for a recent survey).
Unfortunately, even for small programs manipulating integer variables, an exhaustive exploration of the state space generated by the execution of programs is practically infeasible, and simple properties such as safety (which essentially states that ‘something bad never happens’) are undecidable. Despite these limitations, software model checking techniques work in many practical cases [9, 90, 114]. Indeed, in order to tackle these issues, many software model checking techniques follow approaches based on abstraction [30], by which the data domain is mapped into an abstract domain so that reachability is preserved, in the sense that if a concrete configuration is reachable, then the corresponding abstract configuration is reachable. By a suitable choice of the abstract domain one can design reachability algorithms that terminate and, whenever they prove that an abstract error configuration is not reachable from any abstract initial configuration, then the program is proved to be correct. However, due to the use of abstraction, the reachability of an abstract error configuration does not necessarily imply that the program is indeed incorrect. It may happen that the abstract reachability algorithm produces a spurious counterexample, that is, a sequence of configurations leading to an abstract error configuration which does not correspond to any concrete computation. Hence, constructing such an abstraction is a critical aspect of software model checking, as it tries to meet two
somewhat conflicting requirements. On the one hand, in order to make the verification process of large programs viable in practice, it has to construct a model by abstracting away as many details as possible. On the other hand, it would be desirable to have a model which is as precise as possible, so as to reduce the number of spurious error detections.
Notable abstractions are those based on convex polyhedra, that is, conjunctions of linear inequalities, also called constraints. In many software model checking techniques, the notion of constraint has been shown to be very effective, both for constructing models of programs and for reasoning about them [13, 16, 17, 34, 42, 49, 63, 74, 87, 86, 85, 119, 123]. Several types of constraints have been considered, such as equalities and inequalities over the booleans, the integers, the reals, and finite or infinite trees. By using constraints we can represent in a symbolic, compact way (possibly infinite) sets of values computed by programs and, more generally, sets of states reached during program executions. Then, we can use constraint solvers, that is, ad hoc theorem provers specifically designed for the various types of constraints, to reason about program properties in an efficient way.
In particular, Horn clauses and constraints have been advocated by many researchers as suitable logical formalisms for the automated verification of imperative programs [16, 73, 119]. Indeed, the verification conditions (VC’s) that express the correctness of a given program can often be expressed as constrained Horn clauses [17], that is, Horn clauses extended with constraints in specific domains such as the integers or the rationals. For instance, let us consider the following C-like program:
int x, y, n;
while (x < n) {
    x = x + 1;
    y = y + 2;
}
Listing 1.1: Program double
and let us assume that we want to prove the following Hoare triple:
{x = 0 ∧ y = 0 ∧ n≥ 1} double {y>x}.
This triple holds if the following three verification conditions are satisfiable:
1. x = 0 ∧ y = 0 ∧ n ≥ 1 → P (x, y, n)
2. P (x, y, n) ∧ x < n → P (x + 1, y + 2, n)
3. P (x, y, n) ∧ x ≥ n → y>x
that is, if there exists an interpretation for P such that, for all x, y and n, the
three implications hold. Constraints, such as the equalities and inequalities in clauses 1–3, are formulas defined in a background (possibly non-Horn) theory.
The correctness of a program is implied by the satisfiability of the verification conditions. Various methods and tools for Satisfiability Modulo Theories (see, for instance, [47]) prove the correctness of a given program by finding an interpretation (that is, a relation specified by constraints) that makes the verification conditions true. For instance, one such interpretation for the VC’s of the program double is:
P (x, y, n) ≡ (x = 0 ∧ y = 0 ∧ n ≥ 1) ∨ y > x
It has been noted (see, for instance, [17]) that verification conditions can also be viewed as constraint logic programs, or CLP programs [83]. Indeed, clauses 1 and 2 above can be considered as clauses of a CLP program over the integers, and clause 3 can be rewritten as the following goal:
4. P (x, y, n) ∧ x≥n ∧ y≤x → false
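To make the notion of satisfiability concrete, the following Python sketch (ours, not part of the thesis; the function names `P` and `vcs_hold` are our own) checks, over a finite grid of integers, that the interpretation for P given above makes verification conditions 1–3 true. This is only a finite-domain sanity check, not a proof, since the VC’s quantify over all integers.

```python
# Candidate interpretation for the VC's of program `double`:
#   P(x, y, n) == (x = 0 and y = 0 and n >= 1) or y > x
def P(x, y, n):
    return (x == 0 and y == 0 and n >= 1) or y > x

def vcs_hold(x, y, n):
    # Each VC is an implication A -> B, encoded as (not A) or B.
    vc1 = not (x == 0 and y == 0 and n >= 1) or P(x, y, n)
    vc2 = not (P(x, y, n) and x < n) or P(x + 1, y + 2, n)
    vc3 = not (P(x, y, n) and x >= n) or y > x
    return vc1 and vc2 and vc3

R = range(-5, 6)
assert all(vcs_hold(x, y, n) for x in R for y in R for n in R)
print("VCs 1-3 hold on all sampled points")
```

The check succeeds on every sampled point, which is consistent with (but does not replace) the general argument that this interpretation satisfies the three implications for all integers.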
Various verification methods based on constraint logic programming have been proposed in the literature (see, for instance, [2, 42, 49, 63, 79, 119] and the papers on which this thesis is based [39, 40, 42, 44]). These methods consist of two steps: (i) the first one is the translation of the verification task into a CLP program, and (ii) the second one is the analysis of that CLP program.
In this thesis we will show how program transformation of constraint logic programs can be used as a means to prove the satisfiability of the verification conditions. In particular, in Chapters 4 and 5 we will see that in many cases it is helpful to transform a CLP program (expressing a set of verification conditions) into an equisatisfiable program whose satisfiability is easier to show. For instance, if we propagate, according to the transformations which we will see in the next section and, in detail, in Chapter 5, the two constraints representing the initialization condition (x = 0 ∧ y = 0 ∧ n ≥ 1) and the error condition (x ≥ n ∧ y ≤ x), then from clauses 1, 2, and 4 we derive the following new verification conditions:
5. Q(x, y, n) ∧ x < n ∧ x > y ∧ y ≥ 0 → Q(x + 1, y + 2, n)
6. Q(x, y, n) ∧ x ≥ n ∧ x ≥ y ∧ y ≥ 0 ∧ n ≥ 1 → false
This propagation of constraints preserves the least model and hence, by the extension of the van Emden–Kowalski Theorem [137] to CLP programs and constrained Horn clauses, the verification conditions expressed by clauses 5 and 6 are satisfiable if and only if clauses 1–3 are satisfiable. Now, proving the satisfiability of clauses 5 and 6 is trivial because they are made true by simply taking Q(x, y, n) to be the predicate false.
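The final observation can be spelled out with a small sketch (ours; the names `Q`, `clause5`, `clause6` are our own). With Q taken to be identically false, the premise of each implication is false, so both implications hold vacuously:

```python
# Taking Q to be the predicate false makes clauses 5 and 6 true:
# each premise contains Q(x, y, n), hence is false, and a false
# premise makes an implication vacuously true.
def Q(x, y, n):
    return False

def clause5(x, y, n):
    return not (Q(x, y, n) and x < n and x > y and y >= 0) \
           or Q(x + 1, y + 2, n)

def clause6(x, y, n):
    # conclusion is `false`, so the clause holds iff the premise fails
    return not (Q(x, y, n) and x >= n and x >= y and y >= 0 and n >= 1)

R = range(-5, 6)
assert all(clause5(x, y, n) and clause6(x, y, n)
           for x in R for y in R for n in R)
print("clauses 5-6 are vacuously true under Q == false")
```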
1.1 A Transformation-based Verification Framework
Program transformation is a software development methodology that consists in manipulating the program text by applying semantics preserving rules [24, 121, 135].
We consider program transformation techniques as a basis for building a general verification framework in which software model checking of programs can be performed in a very agile, effective way. The main advantage of the transformational approach to program verification over other approaches is that it allows one to construct highly configurable verification tools. Indeed, transformation-based verification techniques can easily be adapted to the various syntaxes and semantics of the programming languages under consideration and also to the various logics in which the properties of interest may be expressed.
The verification framework exploits a variety of transformation techniques
(i) to generate a set of verification conditions for proving the correctness of a program, and (ii) to check the satisfiability of the generated verification conditions. In particular, it makes use of a notable program transformation technique, called program specialization, whose objective is the adaptation of a program to its specific context of use, with the aim of deriving an equivalent program where the verification task may be more easily carried out. Indeed, by following this approach, which can be regarded as a generalization of the one proposed in [119], given a program Prog written in a programming language L, and a property ϕ specified in a logic M, we can verify whether or not ϕ holds for Prog by:
(i) writing the interpreter I, in a suitable metalanguage, defining the semantics
of L and M , (ii) specializing the interpreter with respect to Prog and ϕ, thus deriving a specialized interpreter I', and finally (iii) analyzing I'. In particular, the interpretation overhead is compiled away by specialization, and in I' no ref- erence to statements of the program Prog is present, thereby representing the verification conditions for Prog in purely logical form. Moreover, since the out- put of a program transformation is a semantically equivalent program where the properties of interest are preserved, we can apply a sequence of transformations, more powerful than those needed for program specialization, thereby refining the analysis to the desired degree of precision.
In this thesis we develop an instance of the general transformation-based verification framework for proving partial correctness of imperative programs. We reduce the problem of verifying partial correctness to a reachability problem, and we adopt constraint logic programming as a metalanguage for representing imperative programs, their executions, and their properties. We assume that the imperative program Prog is written in an imperative language L whose semantics is defined by a transition relation, denoted =⇒, between configurations. Each configuration is a pair ⟨c, e⟩ of a command c and an environment e. An environment e is a function that maps variable identifiers to their values. We also assume that the property ϕ is specified as a pair of formulas ϕinit and ϕerror, describing a set of initial and error configurations, respectively. Then, we address the problem of verifying whether or not, starting from any initial configuration (that is, one satisfying the property ϕinit), the execution of Prog eventually leads to an error configuration (that is, one satisfying the property ϕerror). This problem is formalized by defining an incorrectness triple of the form {ϕinit} Prog {ϕerror}. We say that a program Prog, whose free variables are assumed to be among z1, . . . , zr, is incorrect with respect to ϕinit and ϕerror if there exist environments einit and eerror such that: (i) ϕinit(einit(z1), . . . , einit(zr)) holds, (ii) ⟨c0, einit⟩ =⇒∗ ⟨halt, eerror⟩, and (iii) ϕerror(eerror(z1), . . . , eerror(zr)) holds, where c0 is the first command of Prog, halt is the (unique) command of Prog which, when executed, causes the termination of Prog, and =⇒∗ is the reflexive, transitive closure of =⇒. A program is said to be correct with respect to ϕinit and ϕerror if and only if it is not incorrect with respect to ϕinit and ϕerror. Note that this notion of correctness is equivalent to the usual notion of partial correctness specified by the Hoare triple {ϕinit} Prog {¬ϕerror}.
We translate the problem of checking whether or not the program Prog is incorrect with respect to the properties ϕinit and ϕerror into the problem of checking whether or not the nullary predicate incorrect (standing for the predicate false) is a consequence of the CLP program I defining the interpreter for proving the partial correctness of Prog with respect to ϕinit and ϕerror. Therefore, the correctness of Prog is defined as the negation of incorrect. The CLP interpreter I defines the following predicates:
(i) incorrect, which holds iff there exists an execution of the program Prog
that leads from an initial configuration to an error configuration;
(ii) tr (short for transition relation), which encodes the semantics of the imperative language L. In particular, the predicate tr encodes the transition relation =⇒ from any given configuration to the next one.
Given an incorrectness triple {ϕinit} Prog {ϕerror} and the interpreter I, the following four steps are performed:
Step 1: CLP Translation. The given incorrectness triple is translated into the CLP program T, where Prog is encoded as a set of CLP facts, and ϕinit and ϕerror are encoded by CLP predicates defined by (possibly recursive) clauses.
Step 2: Verification Conditions Generation. A set of verification conditions is generated by specializing the interpreter I with respect to the CLP program T, thereby deriving a new CLP program V where tr does not occur explicitly (in this sense the interpreter is removed, or compiled away, and V represents a set of verification conditions for the given incorrectness triple). We have that Prog is incorrect with respect to ϕinit and ϕerror if and only if incorrect holds in V.
Step 3: Verification Conditions Transformation. By applying a sequence of program transformations, the constraints encoding the properties ϕinit and ϕerror are propagated through the structure of the CLP program V. This step has the effect of modifying the structure of V and explicitly adding new constraints that denote invariants of the computation, thereby producing a new CLP program S such that incorrect holds in V if and only if incorrect holds in S. During this step we apply transformation techniques that are more powerful than those needed for program specialization and, in particular, for the removal of the interpreter performed by the Verification Conditions Generation step.
Step 4: Verification Conditions Analysis. The CLP program S is transformed into an equivalent CLP program Q where one of the following conditions holds: either (i) Q contains the fact incorrect, and the verification process stops reporting that Prog is incorrect with respect to ϕinit and ϕerror , or (ii) Q contains no clauses for the predicate incorrect, and the verification process stops reporting that Prog is correct with respect to ϕinit and ϕerror , or (iii) Q contains a clause of the form incorrect :- G, where G is not the empty goal, and the verification process returns to Step 3.
Obviously, due to undecidability limitations, it may be the case that we never get a program Q such that either Property (i) or Property (ii) holds, and hence the verification process does not terminate.
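The reachability reading of the four steps can be made concrete with a small Python sketch (ours; the names `run_double` and `reaches_error` are hypothetical). It exhaustively executes double from a bounded family of initial states satisfying ϕinit and checks that no terminating run satisfies ϕerror; unlike the transformational method, such testing covers only finitely many initial states:

```python
# Bounded reachability check for program `double`:
# from every initial state satisfying phi_init (x=0, y=0, n>=1),
# run the loop to termination and test the error property y <= x.
def run_double(x, y, n):
    while x < n:              # the single while loop of `double`
        x, y = x + 1, y + 2
    return x, y, n

def reaches_error(n):
    x, y, _ = run_double(0, 0, n)   # initial state from phi_init
    return y <= x                   # phi_error

assert not any(reaches_error(n) for n in range(1, 100))
print("no error configuration reachable for n in 1..99")
```

For n ≥ 1 the loop terminates with x = n and y = 2n, so y ≤ x would require 2n ≤ n, which is impossible; the bounded check agrees with the correctness proof obtained from clauses 5 and 6.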
In order to give the reader a snapshot of our method for performing software model checking, we develop the example presented in the previous section by providing the output of each step of the verification framework in Figure 1, and the sketches of the program transformations we will present in detail in the following chapters of this thesis.
Example 1 (Proving correctness of the double program). Let us consider again the program double presented in Listing 1.1 and let us suppose that we want to prove the partial correctness of double with respect to ϕinit(x, y, n) =def
[Figure 1 here: a diagram of the verification framework. Step 1 (CLP Translation) translates the program Prog (written in L) and the properties ϕinit and ϕerror (specified in M) into the CLP encoding T of the incorrectness triple. Step 2 (Verification Conditions Generation) specializes the interpreter I (defining the semantics of L and M) with respect to T, producing the verification conditions (VC’s) V. Step 3 (Verification Conditions Transformation) transforms V into the transformed VC’s S. Step 4 (Verification Conditions Analysis) checks whether or not incorrect holds in S, answering true, false, or unknown.]
Figure 1: Verification Framework. The verification framework relies on the subsidiary modules, represented by rectangular boxes, responsible for the operations associated with Steps 1–4.
x = 0 ∧ y = 0 ∧ n ≥ 1 and ϕerror(x, y, n) =def y ≤ x. That is, we want to show that if a configuration satisfies ϕinit(x, y, n), then, by executing the program double, no configuration satisfying ϕerror(x, y, n) is reached.
In order to check whether or not {ϕinit(x, y, n)} double {ϕerror(x, y, n)} holds, we instantiate the verification framework of Figure 1 and we show the verification method in action on the given incorrectness triple.
We start off by executing the CLP Translation step, which translates the incorrectness triple {ϕinit(x, y, n)} double {ϕerror(x, y, n)} into a CLP program where the predicates at, phiInit, and phiError represent the commands of double, the initial property ϕinit, and the error property ϕerror, respectively.
First, the commands in double are translated into a set of CLP facts of the form at(L,C), where L is the label associated with the command C.
7. at(0,ite(bexp(lt(exp(id(x)),exp(id(n)))),1,h)).
8. at(1,asgn(id(x),exp(plus(exp(id(x)),exp(int(1)))),2)).
9. at(2,asgn(id(y),exp(plus(exp(id(y)),exp(int(2)))),0)).
10. at(h,halt).
The C-like commands are translated into the commands of a simpler, non-structured core language consisting of conditionals, assignments, and jumps. For instance, the while loop of double is translated into the conditional (ite) at line 7. The first argument of ite represents the expression in the while loop, where: (i) lt represents the ‘<’ operator, (ii) bexp and exp represent boolean and arithmetic expressions, respectively, and (iii) id represents program identifiers. The second and third arguments of ite represent the labels of the first commands occurring in the two conditional branches.
Then, the initial and error properties are translated into the following CLP clauses defining the predicates phiInit and phiError, respectively.
11. phiInit([[int(x),X],[int(y),Y],[int(n),N]]) :- X=0, Y=0, N≥1.
12. phiError([[int(x),X],[int(y),Y],[int(n),N]]) :- Y≤X.
Now, in order to proceed with the Verification Conditions Generation step, we need to define the CLP interpreter I for proving partial correctness of C-like programs. This will be done by introducing the predicates below. Partial correctness is defined as the negation of the predicate incorrect, specified by the following CLP clauses, which encode the reachability relation between configurations:
13. incorrect :- initConf(X), reach(X).
14. reach(X) :- tr(X,X1), reach(X1).
15. reach(X) :- errorConf(X).
16. initConf(cf(cmd(0,C),E)) :- at(0,C), phiInit(E).
17. errorConf(cf(cmd(h,halt),E)) :- phiError(E).
where: (i) initConf encodes the initial configuration (that is, initConf(X) holds if X is a configuration consisting of the first command of the imperative program and an environment E where the initial property phiInit holds), (ii) reach encodes the reachability of a configuration (that is, reach(X) holds if an error configuration can be reached from the configuration X), (iii) tr encodes the transition relation corresponding to the operational semantics of the C-like language (see clauses 18–23 below), and (iv) errorConf encodes the error configurations (that is, errorConf(X) holds if X is a configuration consisting of the command halt and an environment E where the error property phiError holds). The predicate tr is defined as follows.
18. tr(cf(cmd(L,asgn(Id,Ae,L1)),E), cf(cmd(L1,C),Ep)) :- eval(Ae,E,Val), update(Id,Val,E,Ep), at(L1,C).
19. tr(cf(cmd(L,ite(Be,L1,L2)),E),cf(cmd(L1,C),E)) :- eval(Be,E), at(L1,C).
20. tr(cf(cmd(L,ite(Be,L1,L2)),E),cf(cmd(L2,C),E)) :- eval(not(Be),E), at(L2,C).
21. eval(exp(plus(Ae1,Ae2)),E,Val) :- Val=Val1+Val2, eval(Ae1,E,Val1), eval(Ae2,E,Val2).
22. eval(exp(id(I)),E,X) :- X=Y, lookup(I,E,Y).
23. eval(exp(int(C)),E,V) :- V=C.
where each configuration cf(cmd(L,C),E) consists of (i) a labelled command cmd(L,C), and (ii) an environment E. The environment E is a function that maps program variables to their values and for the program double is represented by the list [[int(x),X],[int(y),Y],[int(n),N]]. The environment is manipulated by two auxiliary predicates: (i) update(Id,Val,E,Ep) updates the environment E by binding the variable Id to the value Val, thereby constructing a new environment Ep, and (ii) lookup(I,E,Y) retrieves the value Y associated with the program variable I in the environment E.
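The list representation of the environment can be mirrored directly in Python. The following sketch (ours, illustrative only) implements lookup and the non-destructive update used by the interpreter:

```python
# Sketch (ours) of the environment operations used by the interpreter,
# mirroring the CLP list representation [[int(x),X],[int(y),Y],[int(n),N]].
# An environment is a list of [identifier, value] pairs.

def lookup(ident, env):
    """lookup(I,E,Y): the value bound to identifier I in environment E."""
    for name, value in env:
        if name == ident:
            return value
    raise KeyError(ident)

def update(ident, val, env):
    """update(Id,Val,E,Ep): a new environment binding Id to Val."""
    return [[name, val if name == ident else value] for name, value in env]

env = [["int(x)", 0], ["int(y)", 0], ["int(n)", 3]]
env1 = update("int(x)", 1, env)
assert lookup("int(x)", env1) == 1
assert lookup("int(x)", env) == 0   # update is non-destructive, as in CLP
```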
In order to derive the verification conditions for the program double, we perform the Verification Conditions Generation step by specializing clauses 13–23 with respect to the clauses in T that encode the given incorrectness triple. This step is performed also in other specialization-based techniques for program verification (see, for instance, [119]). We get the CLP program V consisting of the following clauses:
24. incorrect :- X=0, Y=0, N≥1, new1(X,Y,N).
25. new1(X,Y,N) :- X<N, X1=X+1, Y1=Y+2, new1(X1,Y1,N).
26. new1(X,Y,N) :- X≥N, Y≤X.
where new1 is a new predicate symbol automatically introduced by the specialization algorithm. We have that the satisfiability of clauses 24–26 implies the satisfiability of the corresponding VC’s 1–3 of Example 1.1.
Unfortunately, it is not possible to check by direct evaluation whether or not incorrect is a consequence of the above CLP clauses. Indeed, the evaluation of the query incorrect using the standard top-down strategy enters into an infinite loop. Tabled evaluation [35] does not terminate either, as infinitely many tabled atoms are generated. Analogously, bottom-up evaluation is unable to return an answer, because infinitely many facts for new1 should be generated for deriving that incorrect is not a consequence of the given CLP clauses.
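The divergence of the top-down evaluation can be made concrete. In the following Python sketch (ours, for illustration), we keep N symbolic, so the guard X<N of clause 25 is always satisfiable and the chain of calls to new1 grows forever:

```python
# Sketch (ours) of why top-down evaluation of clauses 24-26 loops:
# with N constrained only by N >= 1, the recursive clause 25 is always
# applicable, so the sequence of calls to new1 never terminates.

def top_down_calls(limit):
    """Return the first `limit` calls new1(X,Y) generated by clause 25,
    keeping N symbolic: X < N stays satisfiable for every X."""
    calls, x, y = [], 0, 0            # clause 24: X=0, Y=0, N>=1
    for _ in range(limit):
        calls.append((x, y))
        # clause 25 applies: X < N is satisfiable for some N >= 1
        x, y = x + 1, y + 2
    return calls

print(top_down_calls(5))   # [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)], and so on forever
```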
Our verification method avoids the direct evaluation of the above clauses, and applies symbolic evaluation methods based on program transformation. In particular, starting from the CLP program V obtained by the removal of the interpreter, we execute the Verification Conditions Transformation step, which performs further transformations, called constraint propagation. These transformations propagate the constraints ‘X=0, Y=0, N≥1’ and ‘Y≤X’, characterizing the initial and error configurations, respectively.
We execute the Verification Conditions Transformation step, which transforms the program V by propagating the constraint ‘X=0, Y=0, N≥1’, and we get the following program S:
27. incorrect :- X=0, Y=0, N≥1, new2(X,Y,N).
28. new2(X,Y,N) :- X=0, Y=0, N≥1, X1=1, Y1=2, new3(X1,Y1,N).
29. new3(X,Y,N) :- X<N, X1=X+1, Y1=Y+2, X1≥1, Y1≥2, new3(X1,Y1,N).
30. new3(X,Y,N) :- X≥N, Y≤X, Y≥0, N≥1.
where new2 and new3 are new predicate symbols introduced by the transforma- tion algorithm.
This step produces a new program whose constraints are those occurring in the CLP program V enriched with new constraints derived by various constraint manipulation operations which we will see in Chapters 4 and 5. The reader may see the effect of this transformation on clauses 28–30 where we have the extra constraints (underlined in the program) ‘X1=1, Y1=2’, ‘X1≥1, Y1≥2’, and ‘Y≥0, N≥1’, respectively.
Now, we perform the Verification Conditions Analysis step. This step is parametric with respect to the analyzer which is used (see Chapters 4 and 5 for two possible choices). Here, we use a lightweight analyzer which is based on a simple inspection of the program S and is guaranteed to terminate. Unfortunately, the analyzer is unable to check whether or not incorrect is a consequence of the above CLP clauses. (Note that a different analyzer, for instance one based on bottom-up evaluation, does not terminate because infinitely many facts for new3 should be generated for deriving that incorrect is not a consequence of the given CLP clauses.) Therefore, we proceed with a further transformation by executing again Step 3 on a new CLP program, say R, obtained from the CLP program S by reversing the direction of the state space exploration of the reachability relation reach. In particular, the CLP program S, which checks the reachability of the error configurations from the initial configurations, is transformed into the following CLP program R, which checks the reachability of the initial configurations from the error configurations.
31. incorrect :- X≥N, Y≤X, Y≥0, N≥1, new3(X,Y,N).
32. new3(X1,Y1,N) :- X=0, Y=0, N≥1, X1=1, Y1=2, new2(X,Y,N).
33. new3(X1,Y1,N) :- X<N, X1=X+1, Y1=Y+2, X1≥1, Y1≥2, new3(X,Y,N).
34. new2(X,Y,N) :- X=0, Y=0, N≥1.
The Verification Conditions Transformation step, applied to the CLP program R, propagates the constraints ‘X≥N, Y≤X, Y≥0, N≥1’ (that is, the constraint ‘Y≤X’ representing ϕerror together with the constraints added by the previous execution of Step 3). By doing so, we get the following clauses:
35. incorrect :- X≥N, Y≤X, Y≥0, N≥1, new4(X,Y,N).
36. new4(X1,Y1,N) :- X<N, X1=X+1, Y1=Y+2, X>Y, Y≥0, new4(X,Y,N).
where new4 is a new predicate symbol introduced by the transformation algorithm, corresponding to the predicate Q of the verification condition 5, encoded by clause 36, and the verification condition 6, encoded by clause 35. Since among these final clauses 35 and 36 there are no constrained facts, their least model is empty. Thus, incorrect does not hold in the least model of clauses 35 and 36. Then, by the correctness of the CLP encoding (Theorem 2 on page 43) and by the correctness of CLP program specialization with respect to the least model semantics (Theorem 3 on page 45 and Theorem 4 on page 67) we get, as desired, that the given imperative program double is correct with respect to ϕinit and ϕerror.
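The emptiness argument is purely syntactic: a predicate can hold in the least model only if it is derivable, in finitely many steps, from some constrained fact. It can be sketched as a predicate-level fixpoint computation (a Python illustration of ours, not the analyzer actually used):

```python
# Sketch (ours) of the lightweight emptiness check: clauses are
# abstracted to (head_predicate, [body_predicates]); constraints are
# ignored, which is safe for concluding *non*-derivability.

clauses = [
    ("incorrect", ["new4"]),   # clause 35
    ("new4", ["new4"]),        # clause 36
]

def derivable(clauses):
    """Least fixpoint of the predicate-level consequence operator."""
    known = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known

# No clause has an empty body (no constrained facts), so nothing is derivable:
assert derivable(clauses) == set()
print("incorrect is not in the least model: double is correct")
```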
1.2 Related Work
The use of logic programming techniques for program analysis is not novel. For instance, Datalog (the function-free fragment of logic programming) has been proposed for performing various types of program analysis such as dataflow analysis, shape analysis, and points-to analysis [22, 125, 139]. In order to encode the properties of interest into Datalog, all these analysis techniques make an abstraction of the program semantics. In contrast, our transformational method manipulates a CLP program which encodes the full semantics (up to a suitable level of detail) of the program to be analyzed. An advantage of our method is that the output of a transformation-based analysis is a new program which is equivalent to the initial one, and thus the analysis can be iterated to the desired degree of precision.
Program verification techniques based on constraint logic programming, or equivalently constrained Horn clauses, have gained increasing popularity in recent years [17, 74, 119]. As evidence of the expressive power and flexibility of constraints, we point out that CLP programs have recently been proposed in [16] as a common intermediate language for exchanging program properties between software verifiers. In the same line of work, [74] presents a method for the automatic synthesis of software verification tools that use proof rules for reachability and termination written in a formalism similar to Horn clauses. Moreover, the use of CLP-based techniques enhances the reasoning capabilities within Horn clause logic, because one may take advantage of the many special purpose solvers that are available for various data domains, such as integers, arrays, and other data structures.
This renewed attention to program verification techniques based on CLP can also be explained as a consequence of the development of very efficient constraint solving tools [127]. Indeed, several CLP-based verification tools are currently available. Among them, ARMC [123], Duality [113], ELDARICA [82], HSF [73], TRACER [85], and µZ [81] implement reasoning techniques within CLP by following approaches based on interpolants, satisfiability modulo theories, counterexample-guided abstraction refinement, and symbolic execution of CLP programs.
As pointed out in the first section of this chapter, defining suitable program models that capture the properties of interest is a critical step in the software model checking process. In constructing such models, a software model checker may follow two dual approaches. It may start from a concrete model and then progressively abstract away some irrelevant facts. This is the approach followed
by the verification method we propose in this thesis and, in a broader sense, by approaches based on symbolic execution [86]. Conversely, it may start from a coarse model and then progressively refine it by incorporating new facts. This approach is represented by the well established and widely used technique called Counter-Example Guided Abstraction Refinement (CEGAR) [90], which is implemented by several software model checkers (e.g., SLAM and BLAST). In particular, given a program P and a property ϕ, CEGAR uses an abstract model α(P) to check whether or not P satisfies ϕ. If α(P) satisfies ϕ then P satisfies ϕ; otherwise a counterexample, i.e., an execution that leads to an error configuration, is produced. The counterexample is then analyzed: if it turns out to correspond to a real execution of P (genuine counterexample) then the program is proved to be incorrect, otherwise it has been generated because of a too coarse abstraction (spurious counterexample) and α(P) is refined to a more concrete model.
Transformation of Constraint Logic Programs
This chapter introduces the transformation rules for constraint logic programs. Transformation rules are the basic operations for defining transformation procedures which, together with suitable strategies to control their application, will be used to perform the following steps of our verification framework: Verification Conditions Generation, Verification Conditions Transformation, and Verification Conditions Analysis.
2.1 Constraint Logic Programming
We first recall some basic notions and terminology concerning constraint logic programming. We consider CLP(D) programs, which are parametric with respect to the constraint domain D. For more details the reader may refer to [83]. We assume that the reader is familiar with the basic notions of first order logic and logic programming [105].
2.1.1 Syntax of Constraint Logic Programs
We consider a first order language L generated by an infinite set V of variables, a set F of function symbols with arity n ≥ 0, and a set P of predicate symbols with arity n ≥ 0. We assume that F is the union of two disjoint sets: (i) the set Fc of constraint function symbols, and (ii) the set Fu of user defined function symbols. Similarly, we also assume that P is the union of two disjoint sets:
(i) the set Pc of constraint predicate symbols, including true and false, and
(ii) the set Pu of user defined predicate symbols.
A term is either a variable in V or an expression of the form f(t1, . . . , tm), where f is a symbol in F and t1, . . . , tm are terms.
An atomic formula is an expression of the form p(t1, . . . , tm), where p is a symbol in P and t1, . . . , tm are terms.
A formula of L is either an atomic formula or a formula constructed, as usual, from already constructed formulas by means of the connectives (¬, ∧, ∨, ←, ↔) and the quantifiers (∃, ∀). We also denote ‘←’ by ‘:-’ and ‘∧’ by ‘,’.
Let e be a term or a formula. The set of variables occurring in e is denoted by vars(e). Similar notation will be used for denoting the set of variables occurring in a set of terms or formulas. A term or a formula is ground if it contains no variables. Given a set X = {X1, . . . , Xn} of variables, by ∀X ϕ we denote the formula ∀X1 . . . ∀Xn ϕ. By ∀(ϕ) we denote the universal closure of ϕ, that is, the formula ∀X ϕ, where X is the set of the variables occurring free in ϕ. Analogously, by ∃(ϕ) we denote the existential closure of ϕ. We denote a formula ϕ whose free variables are among X1, . . . , Xn also by ϕ(X1, . . . , Xn).
An atomic constraint is an atomic formula p(t1, . . . , tm), where p is a predicate symbol in Pc and t1, . . . , tm are terms.
A constraint c is either true, or false, or an atomic constraint, or a conjunc- tion c1,c2 of constraints.
An atom is an atomic formula of the form p(t1, . . . , tm), where p is a predicate symbol in Pu and t1, . . . , tm are terms.
A CLP program P is a finite set of clauses of the form A :- c, B, where A is an atom, c is a constraint, and B is a (possibly empty) conjunction of atoms. A is called the head and c, B is called the body of the clause. The clause A :- c is called a constrained fact. When the constraint c is true then it is omitted and the constrained fact is called a fact. A goal is a formula of the form :- c, B (standing for c ∧ B → false or, equivalently, ¬(c ∧ B)). A CLP program is said to be linear if all its clauses are of the form A :- c, B, where B consists of at most one atom.
The definition of a predicate p in a program P is the set of all clauses of P
whose head predicate is p.
We say that a predicate p depends on a predicate q in a program P if either in P there is a clause of the form p( . . . ) :- c, B such that q occurs in B, or there exists a predicate r such that p depends on r in P and r depends on q in P.
The set of useless predicates in a program P is the maximal set 𝘚 of predicates occurring in P such that p is in 𝘚 iff every clause γ with head predicate p is of the form p(. . . ) ← c ∧ G1 ∧ q( . . . ) ∧ G2 for some q in 𝘚 . A clause in a program P
is useless if the predicate of its head is a useless predicate in P .
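Under this definition, the set of useless predicates can be computed as a greatest fixpoint: start from all predicates and repeatedly discard any p having a clause whose body contains no atom with a predicate still in the set. A Python sketch of ours, on a hypothetical two-predicate program:

```python
# Sketch (ours) of computing useless predicates (greatest fixpoint of
# the condition in the definition above). Clauses are abstracted to
# (head_predicate, [body_predicates]); constraints are irrelevant here.

def useless(predicates, clauses):
    s = set(predicates)
    changed = True
    while changed:
        changed = False
        for p in list(s):
            # p stays in S only if every clause with head p
            # has at least one body atom whose predicate is in S
            for head, body in clauses:
                if head == p and not any(q in s for q in body):
                    s.discard(p)
                    changed = True
                    break
    return s

# Hypothetical program: q has no constrained fact and only calls itself,
# so q is useless; p has a fact, so p is not useless.
clauses = [("p", []), ("p", ["q"]), ("q", ["q"])]
assert useless({"p", "q"}, clauses) == {"q"}
```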
2.1.2 Semantics of Constraint Logic Programs
We introduce a parametric (with respect to the interpretation of constraints [83, 84]) semantics of constraint logic programming. The definitions introduced in this section rely and build upon the notions introduced for logic programs [5, 105].
A constraint interpretation D consists of: (1) a non-empty set D, called carrier, (2) an assignment of a function fD : Dn → D to each n-ary constraint function symbol f in Fc, and (3) an assignment of a relation pD over Dn to each n-ary constraint predicate symbol p in Pc. We say that f is interpreted as fD and p is interpreted as pD. In particular, for any constraint interpretation D, true is interpreted as the relation D0 = {⟨⟩}, that is, the singleton whose unique element is the empty tuple, and false is interpreted as the empty set.
Given a formula ϕ where all predicate symbols belong to Pc, we consider the satisfaction relation D ⊨ ϕ, which is defined as usual in the first order predicate calculus (see, for instance, [5, 105]).
Let TD denote the set of ground terms built out of the elements of D and the function symbols in Fu in the language of P. Given a constraint interpretation D, an interpretation of the predicate symbols in Pu is called a D-interpretation and is defined as follows.
A D-interpretation is an interpretation with universe TD such that: (i) it assigns to Fc and Pc the meaning according to D, and (ii) it is the Herbrand interpretation [105] for function and predicate symbols in Fu and Pu. We can identify a D-interpretation I with the set of ground atoms (with arguments in TD) which are true in I.
We write D ⊨ ϕ if ϕ is true in every D-interpretation. A constraint c is satisfiable if D ⊨ ∃(c). A constraint is unsatisfiable if it is not satisfiable. A constraint c entails a constraint d, denoted c ⊑ d, if D ⊨ ∀(c → d).
We say that a clause of the form H :- c, B is satisfiable (unsatisfiable) if c is satisfiable (unsatisfiable).
We say that a clause of the form H :- c, B is subsumed by the constrained fact H :- d if c ⊑ d.
We say that a clause newp(X) :- c(X), Q(X) is more general than a clause
newq(X) :- d(X), Q(X) if d(X) ⊑ c(X).
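In an actual CLP system the entailment c ⊑ d is decided by a constraint solver. As a rough illustration only, the following Python sketch (ours) tests ∀(c → d) for integer constraints by exhaustive search on a finite window, which is of course sound only on that window:

```python
# Sketch (ours): checking an entailment c |= d, i.e. D |= forall(c -> d),
# for constraints over two integer variables by exhaustive search on a
# finite window. Real CLP systems use a constraint solver instead.
from itertools import product

def entails(c, d, lo=-20, hi=20):
    """True if c(x, y) implies d(x, y) at every point of the window."""
    return all(d(x, y)
               for x, y in product(range(lo, hi + 1), repeat=2)
               if c(x, y))

# 'X=0, Y=0' entails 'Y=<X' (the subsumption test of the definition above):
assert entails(lambda x, y: x == 0 and y == 0, lambda x, y: y <= x)
# but 'Y=<X' does not entail 'X=0, Y=0':
assert not entails(lambda x, y: y <= x, lambda x, y: x == 0 and y == 0)
```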
The semantics of a CLP program P is defined to be the least D-model of P, denoted M(P), that is, the least D-interpretation that makes true every clause of P.
2.2 Transformation Rules
The verification method we propose is based on transformations of CLP programs that preserve the least D-model semantics [57, 60]. The process of transforming a given CLP program P, hence deriving a new program Pt, consists in constructing a transformation sequence P0, . . . , Pn of CLP programs such that:
(i) the initial program P0 is the input program P ,
(ii) the final program Pn is the output program Pt,
(iii) for k = 0, . . . , n − 1, the program Pk+1 is obtained from Pk by applying one of the transformation rules presented in the following Definitions 1–5.
We assume that in every clause the head arguments denoting elements of D are distinct variables. This assumption is needed to guarantee that the use of unification in the unfolding rule preserves the least D-model semantics. The transformation rules we present are variants of the unfold/fold rules considered in the literature for transforming (constraint) logic programs [11, 57, 60, 106, 131, 135].
We start by presenting the unfolding rule which essentially consists in replac- ing an atom A occurring in the body of a clause in Pk by the bodies of the clauses which define A in P0.
Definition 1 (Unfolding). Given a clause C in Pk of the form H :- c,L,A,R, where H and A are atoms, c is a constraint, and L and R are (possibly empty) conjunctions of atoms, let {Ki :- ci,Bi | i = 1, . . . , m} be the set of the (renamed apart) clauses in program P0 such that, for i = 1, . . . , m, A is unifiable with Ki via the most general unifier ϑi and (c,ci) ϑi is satisfiable. We define the following function Unf :
Unf (C, A) = {(H :- c,ci,L,Bi,R) ϑi | i = 1, . . . , m}
Each clause in Unf (C, A) is said to be derived by unfolding C with respect to A. By unfolding C in Pk w.r.t. A we derive the program Pk+1 = (Pk −{C}) ∪Unf (C, A).
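For linear clauses whose atoms have tuples of distinct variables as arguments, the mgu ϑi reduces to a renaming of the head variables of the unfolded definition. The rule can then be sketched as follows (a Python illustration of ours, in which clause 24 of the introductory example is unfolded using clauses 25 and 26; the arithmetic updates of clause 25 are omitted for brevity):

```python
# Sketch (ours) of unfolding for linear clauses with distinct-variable
# arguments, where the mgu is just a renaming. A clause is
# (head, constraints, body_atom_or_None); an atom is (predicate, args);
# constraints are tuples of tokens, e.g. ("X", "<", "N").

def unfold(clause, defs):
    """Unfold the body atom of `clause` using the clauses in `defs`."""
    head, cs, atom = clause
    pred, args = atom
    derived = []
    for d_head, d_cs, d_atom in defs:
        d_pred, d_args = d_head
        if d_pred != pred:
            continue
        ren = dict(zip(d_args, args))            # the mgu, as a renaming
        rn = lambda v, ren=ren: ren.get(v, v)
        new_cs = cs + [tuple(rn(t) for t in c) for c in d_cs]
        new_atom = (None if d_atom is None
                    else (d_atom[0], tuple(rn(t) for t in d_atom[1])))
        derived.append((head, new_cs, new_atom))
    return derived

# Clauses 24-26 (updates X1=X+1, Y1=Y+2 of clause 25 omitted here):
c24 = (("incorrect", ()), [("X", "=", "0"), ("Y", "=", "0"), ("N", ">=", "1")],
       ("new1", ("X", "Y", "N")))
c25 = (("new1", ("X", "Y", "N")), [("X", "<", "N")], ("new1", ("X1", "Y1", "N")))
c26 = (("new1", ("X", "Y", "N")), [("X", ">=", "N"), ("Y", "<=", "X")], None)

r = unfold(c24, [c25, c26])
assert len(r) == 2           # one derived clause per matching definition
assert r[1][2] is None       # the clause derived from the constrained fact
```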
We present a rule to remove: (i) unsatisfiable clauses, (ii) subsumed clauses, and (iii) useless clauses.
Definition 2 (Clause removal). Let C be a clause in Pk of the form H :- c, B. By clause removal we derive the program Pk+1 = Pk − {C} if either (i) the constraint c is unsatisfiable, or (ii) C is subsumed by a clause occurring in Pk − {C}, or (iii) C is useless in Pk.
We present a rule to replace a constraint occurring in the body of a clause by an equivalent disjunction of constraints.
Definition 3 (Constraint replacement). Given a clause C of the form:
H :- c0, B, and some constraints c1, . . . , cn such that
D ⊨ ∀((∃X0 c0) ↔ (∃X1 c1 ∨ . . . ∨ ∃Xn cn))
where, for i = 0, . . . , n, Xi = vars(ci) − vars(H, B), then we derive n clauses C1, . . . , Cn obtained by replacing in the body of C the constraint c0 by the n constraints c1, . . . , cn, respectively. By constraint replacement we derive the program Pk+1 = (Pk − {C}) ∪ {C1, . . . , Cn}.
A transformation sequence P0, . . . , Pk introduces a set Defsk of new predicate definitions through the following rule.
Definition 4 (Definition introduction). A definition is a clause C of the form: newp(X) :- c, A, where: (i) newp is a fresh new predicate symbol not occurring in {P0, . . . , Pk}, (ii) X is the tuple of distinct variables occurring in A,
(iii) c is a constraint, and (iv) every predicate symbol occurring in A also occurs in P0. By definition introduction we derive the program Pk+1 = Pk ∪ {C}.
The folding rule consists in replacing an instance of the body of the definition of a predicate by the corresponding head.
Definition 5 (Folding). Given a clause E: H :- e, L, A, R in Pk and a clause
D: K :- d, D in Defsk. Suppose that, for some substitution ϑ, (i) A = D ϑ, and
(ii) e ⊑ dϑ. Then by folding E using D we derive the clause F : H :- e, L, K ϑ, R
and the program Pk+1 = (Pk − {E}) ∪ {F}.
The following theorem states that, under suitable conditions, the transformation rules preserve the least D-model of the initial program P0.
Theorem 1 (Correctness of the Transformation Rules). Let P0 be a CLP program and let P0, . . . , Pn be a transformation sequence obtained by applying the transformation rules. Let us assume that for every k, with 0 < k < n − 1, if Pk+1 is derived by applying folding to a clause in Pk using a clause D in Defsk, then there exists j, with 0 < j < n − 1, such that: (i) D belongs to Pj, and
(ii) Pj+1 is derived by unfolding D with respect to the only atom in its body. Then, for every ground atom A whose predicate occurs in P0, we have that A ∈ M (P0) iff A ∈ M (Pn).
Proof. The result is proved in [57, 61].
Generating Verification Conditions by Specializing Interpreters
Verification conditions can be automatically generated either from a formal spec- ification of the operational semantics of programs [119] or from the proof rules that formalize program correctness in an axiomatic way [74].
We address the problem of generating verification conditions by following an approach based on program specialization techniques. Program specialization is a program transformation technique which, given a program P and a portion in1 of its input data, returns a specialized program Ps that is equivalent to P in the sense that when the remaining portion in2 of the input of P is given, then Ps(in2) = P (in1, in2) [65, 93, 101].
In this thesis we follow an approach which can be regarded as a generalization of the one proposed in [119]. Given an incorrectness triple of the form {ϕinit} Prog {ϕerror}, where: (i) the imperative program Prog is written in a programming language L, and (ii) the formulas ϕinit and ϕerror are specified in a logic M, we generate the verification conditions for checking whether or not {ϕinit} Prog {ϕerror} holds by: (1) writing the interpreter I encoding the semantics of L and M in CLP, and (2) specializing the interpreter I with respect
to the given incorrectness triple.
The result of this transformation step, called Verification Conditions Generation, is a CLP program, say V, which is equivalent with respect to the properties of interest and in which the clauses of the interpreter I no longer occur. Since any reference to the statements of the original imperative program Prog is compiled away, the set of clauses V represents a set of verification conditions for Prog, and we say that they are satisfiable if and only if incorrect ∉ M(V) (or, equivalently, V ⊭ incorrect).
Thus, the satisfiability of the verification conditions V guarantees that Prog is correct with respect to ϕinit and ϕerror .
This chapter is organized as follows. In Section 3.1 we present the syntax of our imperative language and the CLP encoding of imperative programs produced by the CLP Translation step. In Section 3.2 we define the CLP program encoding the semantics of the imperative language. In Section 3.3 we present the proof rules for proving partial correctness of imperative programs. In Section 3.4 we establish the soundness of the encodings presented in Sections 3.1, 3.2 and 3.3. Finally, in Section 3.5, we present the specialization strategy to perform the Verification Conditions Generation step.
3.1 Encoding Imperative Programs into CLP
We consider an imperative programming language, a subset of the C language [94], whose abstract syntax is shown in Figure 2. We assume that: (i) every label occurs in every program at most once, (ii) every variable occurrence is either a global occurrence or a local occurrence with respect to any given function definition, and (iii) in every program one may statically determine whether any given variable occurrence is global or local. Note that there are no blocks, and thus no nested levels of locality. We also assume that: (i) there are no definitions of recursive functions, (ii) there are no side effects when evaluating expressions and functions, and (iii) there are neither structures nor pointers.
A program Prog whose initial declarations are of the form: type z1; . . . ; type zr, is said to act on the global variables z1, . . . , zr. Without loss of generality, we also assume that the last command of every program is lh: halt which, when executed, causes program termination (it essentially encodes the return of the main function of C programs).
In order to simplify the analysis we use the C Intermediate Language [115] tool to compile the program into a simplified subset of the C language, and we use the CIL API to define a visitor that realizes the CLP Translation step of the verification framework depicted in Figure 1. In particular, we use the CIL front-end to translate any C program into a simpler program Prog which consists only of labelled commands (for reasons of brevity, we will feel free to say ‘command’, instead of ‘labelled command’). For instance, the while command is reduced to a suitable sequence of if-else and goto commands. This translation makes the task of generating verification conditions for the program easier. In particular, given a C program Prog, our CIL visitor produces a set of facts defining the predicate at(Lab,Cmd) which encodes commands, where Lab is the
x, y, . . . ∈ Vars (variable identifiers)
ƒ, g, . . . ∈ Functs (function identifiers)
l, l1, . . . ∈ Labs (labels)
const ∈ Z (integer constants, character constants, . . .)
type ∈ Types (int, char, . . .)
uop, bop ∈ Ops (unary and binary operators: +, −, ≤, . . .)
⟨prog⟩ ::= ⟨decl⟩∗ ⟨fundef⟩∗ ⟨labcmd⟩+; lh : halt (programs)
⟨decl⟩ ::= type x (declarations)
⟨fundef⟩ ::= type f (⟨decl⟩∗) { ⟨decl⟩∗ ⟨labcmd⟩+ } (function definitions)
⟨labcmd⟩ ::= l : ⟨cmd⟩ (labelled commands)
⟨cmd⟩ ::= x = ⟨expr⟩
| x = f (⟨expr⟩∗)
| return ⟨expr⟩
| goto l
| if (⟨expr⟩) l1 else l2
⟨expr⟩ ::= const | x | uop ⟨expr⟩ | ⟨expr⟩ bop ⟨expr⟩ (expressions)
Figure 2: Abstract syntax of the imperative language. The superscripts + and ∗
denote non-empty and possibly empty finite sequences, respectively.
label associated with the command Cmd.
We show how this encoding can be done through the following example.
Example 2 (Encoding of an imperative program). Let us consider the following program sum acting on the global variables x, y, and n:
int x, y, n;
while (x<n) {
  x=x+1;
  y=x+y;
}
Listing 3.1: Program sum
The while command is reduced into the following sequence of labelled com- mands by the CIL visitor:
1. l0 : if (x<n) l1 else lh;
2. l1 : x=x+1;
3. l2 : y=x+y;
4. l3 : goto l0;
5. lh : halt;
Then, this sequence of statements is translated into the following set of facts:
1. at(0,ite(less(id(x),id(n)),1,h)).
2. at(1,asgn(id(x),expr(plus(id(x),int(1))),2)).
3. at(2,asgn(id(y),expr(plus(id(x),id(y))),3)).
4. at(3,goto(0)).
5. at(h,halt).
The facts 1–5 encode the labelled commands at lines 1–5. Note that the definition of at also includes information about the control flow of the program. For instance, the third argument of the fact at line 2 represents the label of the command to be executed after the assignment is performed, that is, the command at line 3.
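The control-flow reading of the at facts can be made executable. The following Python sketch (ours, illustrative only) walks the facts 1–5 and runs the program for concrete values of x, y, and n:

```python
# Sketch (ours): executing the control-flow encoding of program sum by
# walking the at(Lab,Cmd) facts. Commands are kept as simple Python tuples.

at = {
    0: ("ite", "x<n", 1, "h"),          # fact 1
    1: ("asgn", "x", "x+1", 2),         # fact 2
    2: ("asgn", "y", "x+y", 3),         # fact 3
    3: ("goto", 0),                     # fact 4
    "h": ("halt",),                     # fact 5
}

def run(env):
    lab = 0
    while True:
        cmd = at[lab]
        if cmd[0] == "halt":
            return env
        if cmd[0] == "goto":
            lab = cmd[1]
        elif cmd[0] == "ite":
            # jump to the label of the chosen branch
            lab = cmd[2] if eval(cmd[1], {}, env) else cmd[3]
        elif cmd[0] == "asgn":
            _, lhs, rhs, nxt = cmd
            env[lhs] = eval(rhs, {}, env)   # tiny stand-in for the CLP eval
            lab = nxt

env = run({"x": 0, "y": 0, "n": 3})
assert (env["x"], env["y"]) == (3, 6)    # y = 1 + 2 + 3
```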
3.2 Encoding Operational Semantics into CLP
In this section we present the CLP clauses which encode the semantics of the imperative language L shown in Figure 2.
Let us first introduce the following data structures.
(i) A global environment δ : Vars → Z is a function that maps global variables to their values.
(ii) A local environment σ : Vars → Z is a function that maps function parameters and local variables to their values.
(iii) An activation frame is a triple of the form ⟨l, y, σ⟩, where: (1) l is the label where to jump after returning from a function call, (2) y is the variable that stores the value returned by a function call, and (3) σ is the local environment to be initialized when making a function call.
(iv) A configuration is a pair of the form ⟨c, e⟩, where c is a labelled command and e is an environment of the form ⟨δ, τ⟩ consisting of a global environment δ and a list τ of activation frames. We operate on the list τ of activation frames by the usual head (hd) and tail (tl) functions and the right-associative list constructor cons (:). The empty list is denoted by [ ]. Given a function ƒ, a variable identifier x, and an integer v, the term update(ƒ, x, v) denotes the function ƒ′ that is equal to ƒ, except that ƒ′(x) = v.
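These domains can be sketched in Python as follows (an illustration of ours, using dictionaries for environments and tuples for activation frames):

```python
# Sketch (ours) of the semantic domains of this section: a global
# environment delta, local environments sigma, and a list tau of
# activation frames <label, variable, sigma>, with the update operation.

def update(f, x, v):
    """update(f, x, v): the function equal to f except that f'(x) = v."""
    g = dict(f)
    g[x] = v
    return g

delta = {"z": 10}                       # global environment
sigma = {"a": 1}                        # local environment of the caller
tau = [("l1", "y", sigma)]              # list of activation frames

# Assignment to a local variable (rule R1, second case, below): update
# the local environment of the topmost activation frame.
l, y, s = tau[0]
tau1 = [(l, y, update(s, "a", 7))] + tau[1:]
assert tau1[0][2]["a"] == 7 and sigma["a"] == 1   # non-destructive
```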
For any program Prog (see Figure 2), for any label l, (i) at(l) denotes the command in Prog with label l, and (ii) nextlab(l) denotes the label of the com- mand in Prog that is written immediately after the command with label l. Given a function identifier ƒ , firstlab(ƒ ) denotes the label of the first command of the
definition of the function ƒ in Prog. For any expression e, any global environment δ, and any local environment σ, ⟦e⟧ δ σ is the integer value of e. For instance, if x is a global variable and δ(x) = 5, then ⟦x+1⟧ δ σ = 6.
The semantics of our language is defined by a transition relation, denoted =⇒, between configurations. The relation =⇒ is defined by the following rules R1–R5.
(R1). Assignment. Let hd(τ) be the activation frame ⟨l′, y, σ⟩ and v be the integer ⟦e⟧ δ σ.
If x is a global variable:
⟨l : x = e, ⟨δ, τ⟩⟩ =⇒ ⟨at(nextlab(l)), ⟨update(δ, x, v), τ⟩⟩
If x is a local variable:
⟨l : x = e, ⟨δ, τ⟩⟩ =⇒ ⟨at(nextlab(l)), ⟨δ, ⟨l′, y, update(σ, x, v)⟩ : tl(τ)⟩⟩
Informally, an assignment updates either the global environment δ or the local environment σ of the topmost activation frame ⟨l', y, σ⟩.
(R2). Conditional. Let hd(τ) be the activation frame ⟨l′, y, σ⟩.
If ⟦e⟧ δ σ = true:
⟨l : if (e) l1 else l2, ⟨δ, τ⟩⟩ =⇒ ⟨at(l1), ⟨δ, τ⟩⟩
If ⟦e⟧ δ σ = false:
⟨l : if (e) l1 else l2, ⟨δ, τ⟩⟩ =⇒ ⟨at(l2), ⟨δ, τ⟩⟩
(R3). Jump.
⟨l : goto l′, ⟨δ, τ⟩⟩ =⇒ ⟨at(l′), ⟨δ, τ⟩⟩
(R4). Function call. Let hd(τ) be the activation frame ⟨l′, y, σ⟩. Let {x1, . . . , xk} and {y1, . . . , yh} be the set of the formal parameters and the set of the local variables, respectively, of the definition of the function ƒ.
⟨l : x = ƒ(e1, . . . , ek), ⟨δ, τ⟩⟩ =⇒ ⟨at(firstlab(ƒ)), ⟨δ, ⟨nextlab(l), x, σ′⟩ : τ⟩⟩
where σ′ is a local environment of the form: {⟨x1, ⟦e1⟧ δ σ⟩, . . . , ⟨xk, ⟦ek⟧ δ σ⟩, ⟨y1, n1⟩, . . . , ⟨yh, nh⟩}, for some values n1, . . . , nh in Z (indeed, when the local variables y1, . . . , yh are declared, they are not initialized). Note that since the values of the ni’s are left unspecified, this transition is nondeterministic.
Informally, a function call creates a new activation frame with the label where to jump after returning from the call, the variable where to store the returned value, and the new local environment.
(R5). Return. Let τ be ⟨l′, y, σ⟩ : ⟨l″, z, σ′⟩ : τ″ and v be the integer ⟦e⟧ δ σ.
If y is a global variable:
⟨l : return e, ⟨δ, τ⟩⟩ =⇒ ⟨at(l′), ⟨update(δ, y, v), tl(τ)⟩⟩
If y is a local variable:
⟨l : return e, ⟨δ, τ⟩⟩ =⇒ ⟨at(l′), ⟨δ, ⟨l″, z, update(σ′, y, v)⟩ : τ″⟩⟩
Informally, a return command first evaluates the expression e and computes the value v to be returned, then erases the topmost activation frame ⟨l′, y, σ⟩, and then updates either the global environment δ or the local environment σ′ of the new topmost activation frame ⟨l″, z, σ′⟩.
Obviously, no rule is given for the command halt, because no new configuration is generated when halt is executed.
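To make the transition rules concrete, the following Python sketch implements a restricted form of rules R1–R3 (assignment, conditional, jump) for programs acting on global variables only; the representation (commands as tuples, expressions as lambdas, labels as integers) is ours and not the thesis's notation.

```python
# Illustrative interpreter for rules R1-R3 over global variables only.
# A configuration is a pair (label, delta), delta being the global environment.

def nextlab(label):
    # labels are consecutive integers, so the next command is at label+1
    return label + 1

def step(prog, config):
    """One =⇒ step of the transition relation."""
    label, delta = config
    cmd = prog[label]                                 # at(l)
    if cmd[0] == "asgn":                              # R1:  l: x = e
        _, x, e = cmd
        return (nextlab(label), dict(delta, **{x: e(delta)}))  # update(delta, x, v)
    if cmd[0] == "ite":                               # R2:  l: if (e) l1 else l2
        _, e, l1, l2 = cmd
        return (l1 if e(delta) else l2, delta)
    if cmd[0] == "goto":                              # R3:  l: goto l'
        return (cmd[1], delta)
    return None                                       # halt: no successor

def run(prog, delta):
    config = (0, delta)
    while prog[config[0]][0] != "halt":
        config = step(prog, config)
    return config[1]

# The program sum of Example 2, with labels 0..3 and halt at label 4:
#   l0: if (x<n) l1 else lh;  l1: x=x+1;  l2: y=x+y;  l3: goto l0;  lh: halt
sum_prog = {
    0: ("ite", lambda d: d["x"] < d["n"], 1, 4),
    1: ("asgn", "x", lambda d: d["x"] + 1),
    2: ("asgn", "y", lambda d: d["x"] + d["y"]),
    3: ("goto", 0),
    4: ("halt",),
}
```

For instance, running sum_prog from x = 0, y = 0, n = 3 yields y = 1 + 2 + 3 = 6.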
The CLP interpreter for the imperative language is given by the following clauses for the binary predicate tr that relates old configurations to new configurations and defines the transition relation =⇒. A configuration of the form ⟨c, ⟨δ, τ⟩⟩, where c is a command and ⟨δ, τ⟩ is an environment, is encoded by using the term cf(c,e), where c and e are terms encoding c and ⟨δ, τ⟩, respectively. We have the following clauses encoding rules R1–R5: (R1) assignments (clause 6), (R2) conditionals (clauses 7 and 8), (R3) jumps (clause 9), (R4) function calls (clause 10), and (R5) function returns (clause 11), defined as follows.
6. tr(cf(cmd(L,asgn(Id,Ae,Lp)),E), cf(cmd(Lp,C),Ep)) :- aeval(Ae,E,Val), update(Id,Val,E,Ep), at(Lp,C).
where: (i) the term asgn(Id,Ae,Lp) encodes the assignment of the value of the expression Ae to a variable Id, and (ii) E is a pair of lists of the form (Glb,Afl) encoding the global environment δ and the list of activation frames τ , respectively. The predicate aeval(Ae,E,Val) computes the value Val of the expression Ae in the environment E. The predicate update(Id,Val,E,Ep) updates the environment E by binding the variable Id to the value Val, thereby constructing a new environment Ep. In particular, it updates either the list Glb, or the list Loc encoding the local environment of the topmost activation frame [Ret,RId,Loc] in the list Afl. The information about the scope of a program variable is encoded in Id.
7. tr(cf(cmd(L,ite(Be,Lt,Le)),E), cf(cmd(Lt,C),E)) :- beval(Be,E), at(Lt,C).
8. tr(cf(cmd(L,ite(Be,Lt,Le)),E), cf(cmd(Le,C),E)) :- beval(not(Be),E), at(Le,C).
where the term ite(Be,Lt,Le) encodes the boolean expression Be and the labels Lt and Le where to jump according to the evaluation beval(Be,E) of Be in the environment E. The predicate beval(Be,E) holds if Be is true in the environment E.
9. tr(cf(cmd(L,goto(Lp)),E), cf(cmd(Lp,C),E)) :- at(Lp,C).
where Lp is the label of the next command cmd(Lp,C) to be executed.
10. tr(cf(cmd(L,call(F)),E),cf(cmd(Lp,C),Ep)) :- prologue(F,E,Lp,Ep), at(Lp,C).
where F is a list of the form [Ael,RId,Lp,Ret], defined as follows: (i) Ael is the list of the actual parameters, (ii) RId is the variable where to store the returned value, (iii) Lp is the label of the first command C in the definition of the function F, and (iv) Ret is the label where to jump after returning from the function call. The predicate prologue(F,E,Lp,Ep): (1) builds the new local environment Loc, where the body of the function should be executed, by evaluating the list of the actual parameters Ael in the environment E, and (2) adds a new activation frame of the form [Ret,RId,Loc] to E, thereby producing a new environment Ep.
11. tr(cf(cmd(L,ret(Ae)),E),cf(cmd(Lp,C),Ep)) :- epilogue(Ae,E,Lp,Ep), at(Lp,C).
where ret(Ae) encodes the expression Ae returned from a function call. The predicate epilogue(Ae,E,Lp,Ep): (1) computes the value Val of the expression Ae in the environment E, (2) updates the environment E by binding the variable RId (that is, the variable where to store the value returned by the called function) to the value Val, and (3) erases the topmost activation frame [Lp,RId,Loc] from E, thereby producing a new environment Ep.
Note that the CLP clauses 6–11 are clauses without constraints in their bodies. However, constraints are used in the definitions of the predicates aeval and beval.
3.3 Encoding Partial Correctness into CLP
In this section we introduce the clauses that encode the reachability relation which allows us to prove the partial correctness of imperative programs.
The problem of verifying the correctness of a program Prog is the problem of checking whether or not, starting from an initial configuration, the execution of Prog leads to a so-called error configuration. This problem is formalized by defining an incorrectness triple of the form:
{ϕinit(z1, . . . , zr) } Prog {ϕerror (z1, . . . , zr) }
where:
(i) Prog is a program acting on the global variables z1, . . . , zr,
(ii) ϕinit(z1, . . . , zr) is a disjunction of constraints that characterizes the values of the global variables in the initial configurations, and
(iii) ϕerror (z1, . . . , zr) is a disjunction of constraints that characterizes the values of the global variables in the error configurations.
Note that our notion of correctness is equivalent to the usual notion of partial correctness specified by the Hoare triple {ϕinit } Prog {¬ϕerror }.
We say that a program Prog is incorrect with respect to a set of initial con- figurations satisfying ϕinit(z1, . . . , zr) and a set of error configurations satisfying ϕerror (z1, . . . , zr) or simply, Prog is incorrect with respect to ϕinit and ϕerror , if there exist two global environments δinit and δhalt such that the following conjunction holds:
ϕinit(δinit(z1), . . . , δinit(zr))
∧ ⟨cinit, ⟨δinit, [ ]⟩⟩ =⇒∗ ⟨chalt, ⟨δhalt, [ ]⟩⟩    (ψ)
∧ ϕerror (δhalt(z1), . . . , δhalt(zr))
where: cinit is the first command of Prog and, as already mentioned, chalt is the last command of Prog. As usual, =⇒∗ denotes the reflexive, transitive closure of =⇒. The definition of incorrectness (ψ) can be easily translated into CLP as follows:
12. incorrect :- initConf(X), reach(X).
13. reach(X) :- tr(X,X1), reach(X1).
14. reach(X) :- errorConf(X).
15. initConf(cf(cmd(0,C),E)) :- at(0,C), phiInit(E).
16. errorConf(cf(cmd(h,halt),E)) :- phiError(E).
where: (i) the predicate tr represents the transition relation =⇒, (ii) the predicate reach represents the reachability of the error configurations, (iii) the predicate initConf represents the initial configurations, (iv) the predicate phiInit represents the initial property ϕinit, (v) the predicate errorConf represents the error configurations, and (vi) the predicate phiError represents the error property ϕerror.
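The logic of clauses 12–16 can be sketched operationally as a reachability check. In the following Python sketch (names are ours) the transition relation is an explicit finite successor map, whereas in the thesis it is the CLP-encoded relation =⇒:

```python
# incorrect holds iff an error configuration is reachable from an initial one
# (clauses 12-16), here over an explicit finite transition relation.

def incorrect(init_confs, tr, is_error):
    seen, frontier = set(), list(init_confs)   # clause 12: start from initConf
    while frontier:
        c = frontier.pop()
        if c in seen:
            continue
        seen.add(c)
        if is_error(c):                        # clause 14: reach(X) :- errorConf(X)
            return True
        frontier.extend(tr.get(c, []))         # clause 13: reach(X) :- tr(X,X1), reach(X1)
    return False

# toy transition graph: 0 -> 1 -> 2
tr = {0: [1], 1: [2]}
```

For example, with configuration 2 as the only error configuration, incorrect([0], tr, ...) holds, while it fails for an unreachable error configuration.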
Example 3 (Encoding of an incorrectness triple). Let us consider the incorrectness triple {ϕinit(x, y, n) } sum {ϕerror (x, y, n) }, where: (i) ϕinit(x, y, n) is x = 0 ∧ y = 0, (ii) ϕerror (x, y, n) is x>y, and (iii) sum is the program defined in Example 2.
The formulas ϕinit(x, y, n) and ϕerror (x, y, n) are encoded as follows:
17. phiInit(([[int(x),X],[int(y),Y],[int(n),N]],[])) :- X=0, Y=0.
18. phiError(([[int(x),X],[int(y),Y],[int(n),N]],[])) :- X>Y.
In clauses 17 and 18 the pair ([[int(x),X],[int(y),Y],[int(n),N]],[]) encodes the program environment. The first component, that is the global envi- ronment, is the list [[int(x),X],[int(y),Y],[int(n),N]] that provides the bindings for the global variables x, y, and n, respectively. The second compo- nent, that is, the list of activation frames, is the empty list [].
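A hypothetical Python mirror of this environment term may help fix intuitions: the environment is a pair (globals, frames), where globals is a list of [name, value] bindings and frames is the (here empty) list of activation frames. The helper names below are ours, chosen to echo the CLP predicate update:

```python
# Python mirror of the environment term ([[int(x),X],[int(y),Y],[int(n),N]],[]).

def lookup(env, name):
    glb, _frames = env
    for n, v in glb:
        if n == name:
            return v
    raise KeyError(name)

def update(env, name, val):
    # non-destructive, like the CLP predicate update building a new Ep from E
    glb, frames = env
    return ([[n, val if n == name else v] for n, v in glb], frames)

env0 = ([["x", 0], ["y", 0], ["n", 5]], [])   # phiInit: X=0, Y=0
env1 = update(env0, "x", 1)
```

Note that update builds a fresh binding list rather than mutating env0, mirroring the logical (non-destructive) construction of the new environment Ep.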
3.4 Soundness of the CLP Encoding
In this section we establish the soundness of the CLP encodings presented in the previous sections.
Let us first introduce the definition of CLP Encoding.
Definition 6 (CLP Encoding). Let us consider an incorrectness triple of the form {ϕinit } Prog {ϕerror }. Let T be the CLP program consisting of the clauses that define the predicates phiInit, at, and phiError encoding the initial property ϕinit, the imperative program Prog, and the error property ϕerror, respectively. Let I be the CLP program consisting of the clauses 6–16 encoding the interpreter for the correctness problem. The CLP program P which is the union of T and I is called the CLP Encoding of the correctness problem for {ϕinit } Prog {ϕerror }.
The following result establishes that the CLP Encoding is sound.
Theorem 2 (Soundness of the CLP Encoding). Let P be the CLP Encoding of the correctness problem for {ϕinit } Prog {ϕerror }. The program Prog is correct with respect to ϕinit and ϕerror if and only if incorrect ∈/ M (P).
Proof. In the CLP program P, the predicate tr encodes the transition relation =⇒ associated with the given imperative program Prog, that is, P |= tr(cf1,cf2) if and only if cf1 =⇒ cf2, where cf1 and cf2 are terms encoding the configurations cf1 and cf2, respectively. The predicates initConf and errorConf encode the initial and error configurations, respectively, that is, the following Properties (A) and (B) hold.
Property (A): P |= initConf(init-cf) iff init-cf is the term encoding a configuration of the form ⟨l0 : c0, ⟨δinit, [ ]⟩⟩ such that ϕinit(δinit(z1), . . . , δinit(zr)) holds, and
Property (B): P |= errorConf(error-cf) iff error-cf is the term encoding a configuration of the form ⟨lh : halt, ⟨δhalt, [ ]⟩⟩ such that ϕerror (δhalt(z1), . . . , δhalt(zr)) holds.
By clauses 13 and 14 of the CLP program P and Property (B), for any configuration cf encoded by the term cf, we have that P |= reach(cf) iff there exists a configuration cfh of the form ⟨lh : halt, ⟨δhalt, [ ]⟩⟩ such that ϕerror (δhalt(z1), . . . , δhalt(zr)) holds and cf =⇒∗ cfh.
Now, by clause 12 of the CLP program P and Property (A), we get that P |= incorrect iff there exist configurations cf0 and cfh such that the following hold:
(i) cf0 is of the form ⟨l0 : c0, ⟨δinit, [ ]⟩⟩,
(ii) ϕinit(δinit(z1), . . . , δinit(zr)),
(iii) cf0 =⇒∗ cfh,
(iv) cfh is of the form ⟨lh : halt, ⟨δhalt, [ ]⟩⟩, and
(v) ϕerror (δhalt(z1), . . . , δhalt(zr)).
Thus, by the definition of incorrectness, P |= incorrect iff Prog is incorrect with respect to the properties ϕinit and ϕerror. The thesis follows from the fact that P |= incorrect iff incorrect ∈ M (P) [84].
3.5 The Specialization Strategy
In this section we present a transformation strategy, called Specializevcg, which realizes the Verification Conditions Generation step of the verification framework. In particular, Specializevcg unfolds away the relations on which incorrect depends and introduces new predicate definitions corresponding to (a subset of) the 'program points' of the original imperative program. The transformation strategy Specializevcg specializes the CLP interpreter I, consisting of clauses 6–16, with respect to the CLP encoding of the incorrectness triple T, consisting of the clauses defining the predicates at, phiInit, and phiError.
The result of this first transformation step is a new CLP program V, such that
incorrect ∈ M (P) iff incorrect ∈ M (V).
The transformation strategy Specializevcg, shown in Figure 3, makes use of the following transformation rules: Unfolding, Clause removal, Definition introduc- tion, and Folding.
In order to perform the Unfolding phase, we need to annotate the atoms occurring in the bodies of clauses as either unfoldable or not unfoldable. This annotation, which is a critical part of a transformation strategy, ensures that any sequence of clauses constructed by unfolding w.r.t. unfoldable atoms is finite. We will see a possible choice for this annotation in Example 4. We refer to [101] for a survey of techniques for controlling unfolding that guarantee this finiteness property.
In order to perform the Definition Introduction phase, we make use
of a set Defs of definitions arranged as a tree whose root is the initial input clause incorrect :- initConf(X), reach(X) considered at the beginning of the specialization process. Each new definition D introduced during specialization determines a new node of the tree, which is placed as a child of the definition C, represented by child(D, C), if D has been introduced to fold a clause obtained by unfolding C. The root is represented by child(D, T). We define the ancestor relation as the reflexive, transitive closure of the child relation.
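The tree of definitions and the ancestor relation can be sketched as follows (the parent-map representation and all names are ours, purely illustrative):

```python
# Defs as a tree: child(D, C) stored as a child -> parent map; ancestor is
# the reflexive, transitive closure of child.

parent = {}

def add_child(d, c):
    # child(D, C): definition D was introduced to fold a clause obtained
    # by unfolding C; the root has parent None (child(D, T) in the text)
    parent[d] = c

def ancestor(a, d):
    while d is not None:        # reflexive: every definition is its own ancestor
        if d == a:
            return True
        d = parent.get(d)       # walk up the child relation
    return False

add_child("incorrect", None)    # the root clause incorrect :- initConf(X), reach(X)
add_child("new1", "incorrect")
add_child("new2", "new1")
```
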
3.5.1 Termination and Soundness of the Specialization Strategy
The following theorem establishes the termination and soundness of the Special- izevcg strategy.
Theorem 3 (Termination and Soundness of the Specializevcg strategy).
(i) The Specializevcg strategy terminates. (ii) Let program V be the output of the Specializevcg strategy applied to the input program P. Then incorrect ∈ M (P) iff incorrect ∈ M (V).
Proof. (i) In order to prove the termination of the Specializevcg procedure we assume that the unfoldable and not unfoldable annotations guarantee the termination of the unfolding while-loop (α1). Since the clause removal while-loop (α2), the definition introduction while-loop (α3), and the folding while-loop (β) clearly terminate, we are left with showing that the first, outermost while-loop (α) terminates, that is, that a finite number of new predicate definitions is added to InCls by definition introduction. This finiteness is guaranteed by the following facts:
(1) all new predicate definitions are of the form newk(Y) :- reach(cf(cmd,E)) where reach(cf(cmd,E)) is the atom with fresh new variables occurring in the body of the clause to be folded (this construction ensures that every other clause whose atom occurring in the body is a variant of reach(cf(cmd,E)) can be folded using such a definition), and (2) only a finite set of configurations is
Input: Program P.
Output: Program V such that incorrect ∈ M (P) iff incorrect ∈ M (V).
Initialization:
V := ∅;
InCls := {incorrect :- initConf(X), reach(X)};
Defs := ∅;
(α) while in InCls there is a clause C that is not a constrained fact do
  Unfolding:
  SpC := Unf (C, A), where A is the leftmost atom in the body of C;
  (α1) while in SpC there is a clause D whose body contains an occurrence of an unfoldable atom A do
    SpC := (SpC − {D}) ∪ Unf (D, A);
  end-while;
  Clause Removal:
  (α2) while in SpC there are two distinct clauses D1 and D2 such that D1 subsumes D2 do
    SpC := SpC − {D2};
  end-while;
  Definition Introduction:
  (α3) while in SpC there is a clause D of the form H(X) :- c(X,Y), reach(cf(cmd,E)), where: c(X,Y) is a constraint and cf(cmd,E) is a configuration with Y ⊆ vars(E), such that D cannot be folded using a definition in Defs do
    Defs := Defs ∪ {child(newk(Y) :- reach(cf(cmd,E)), C)};
    InCls := InCls ∪ {newk(Y) :- reach(cf(cmd,E))};
  end-while;
  InCls := InCls − {C};
  V := V ∪ SpC;
end-while;
Folding:
(β) while in V there is a clause E that can be folded by a clause D in Defs do
  V := (V − {E}) ∪ {F}, where F is derived by folding E using D;
end-while;
Remove from V all clauses for predicates on which incorrect does not depend.
Figure 3: The Specializevcg Procedure.
generated during the specialization process. All configurations are of the form cf(cmd,E), where: (2.1) the term cmd ranges over the set of commands belonging to the imperative program, and (2.2) the term E ranges over the set of terms encoding environments. Since the set of commands in a program is finite, the set at point (2.1) is finite. An environment E is encoded as a pair of lists: a list of global variables and a list of activation frames (each of which contains a local environment). Since the number of program variables is finite, the global and the local environments are finite lists. Moreover, since we do not consider recursive functions, the list of activation frames is finite. Hence, the set at point (2.2) is finite. (3) No two new predicate definitions that are equal modulo the head predicate name are introduced by definition introduction (indeed, definition introduction introduces a new predicate definition only if the definitions already present in Defs cannot be used to fold clause E).
(ii) In order to prove the soundness of Specializevcg we need to ensure that Specializevcg enforces all the applicability conditions for the unfolding and folding rules presented in Chapter 2. The Specializevcg procedure constructs a transformation sequence P0, . . . , Pn, such that:
(1) P0 is P, and
(2) Pn is (P − {incorrect :- initConf(X), reach(X)}) ∪ P′, where P′ is the value of V at the exit of the Folding while-loop (β).
The hypothesis of Theorem 1 (Correctness of the Transformation Rules) is fulfilled, as all clauses in Defs are unfolded. Thus, incorrect ∈ M (P) iff incorrect ∈ M (Pn). The thesis follows from the fact that, by deleting from Pn the clauses defining predicates on which incorrect does not depend, we get a final program V such that incorrect ∈ M (V) iff incorrect ∈ M (Pn).
Example 4 (Generating Verification Conditions). Let us consider again the incorrectness triple {x = 0 ∧ y = 0} sum {x>y}. In order to generate the verification conditions for sum we apply the Specializevcg strategy.
In order to guarantee the termination of the Unf procedure, an atom A in the body of a clause Cl is unfoldable iff one of the following conditions holds:
(i) the predicate of A is different from reach,
(ii) A is of the form reach(cf(cmd(L,C),E)) and C is either ret(Ae), or
asgn(Id,expr(Ae),L1), or halt,
(iii) A is of the form reach(cf(cmd(L,goto(L1)),E)) and Cl has not been derived (in one or more steps) by unfolding a clause with respect to an atom of the form reach(cf(cmd(L,goto(L1)),E)).
Note that a reach atom containing a command of the form asgn(x,call(...)) is assumed to be not unfoldable. Condition (iii) allows unfolding with respect to a reach atom containing a goto command, but prevents infinite unfolding. (Recall that we have assumed that the imperative program Prog does not contain definitions of recursive functions.) Finally, note that a reach atom containing an ite command is not unfoldable, and hence a potential exponential blowup due to the unfolding of conditionals is avoided. Indeed, it can easily be shown that the size of the output V of Specializevcg is linear with respect to the size of P (and thus, also with respect to the size of the imperative program Prog of the triple encoded by T).
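The unfoldability annotation of conditions (i)–(iii) can be rendered as a small predicate. The following Python sketch uses a simplified command representation of our own (predicate name, command tuple, and a flag recording whether the clause was derived by unfolding w.r.t. a goto atom):

```python
# Sketch of the unfoldability annotation of Example 4 (our representation).

def unfoldable(pred, cmd, derived_from_goto):
    if pred != "reach":
        return True                        # condition (i): e.g. tr, errorConf
    kind = cmd[0]
    if kind in ("ret", "halt"):
        return True                        # condition (ii)
    if kind == "asgn" and cmd[1] == "expr":
        return True                        # condition (ii): asgn(Id,expr(Ae),L1)
    if kind == "goto":
        return not derived_from_goto       # condition (iii): at most once per goto
    return False                           # ite, asgn of a call: not unfoldable
```
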
In order to perform folding, a clause C is folded using either (i) the clause introduced during the Definition introduction, (ii) the clause, say D, from which C has been derived, or (iii) the most recent ancestor of D.
We apply the Specializevcg strategy as we now show.
The Initialization procedure performs the assignments: V := ∅, InCls :=
{incorrect :- initConf(X), reach(X)} (that is, clause 12), and Defs := ∅.
Since InCls ≠ ∅, Specializevcg enters the while-loop (α). First, the Unfolding loop (α1) executes the assignment SpC := Unf (12, initConf(X)), which unfolds clause 12 w.r.t. the atom initConf(X). We get:
19. incorrect :- X=0, Y=0, reach(cf(cmd(0,ite(less(int(x),int(n)),1,h)), [[int(x),X],[int(y),Y],[int(n),N]],[ ])).
The reach atom in clause 19 is not unfoldable because it contains an ite command; hence, the while-loop (α1) terminates. No clause can be removed by the Clause Removal loop (α2), and therefore the while-loop (α) continues by executing the Definition Introduction loop (α3). In order to fold clause 19, the Definition Introduction loop introduces the following clause:
20. new1(X,Y,N) :- reach(cf(cmd(0,ite(less(int(x),int(n)),1,h)), [[int(x),X],[int(y),Y],[int(n),N]],[ ])).
and performs the assignments: Defs := Defs ∪ {child(20, T)} = {20}, and InCls := InCls ∪ {20} = {12, 20}.
The first execution of the while-loop (α) terminates by performing the follow- ing assignments: InCls := InCls − {12} = {20}, and V := V ∪ SpC = {19}.
Since InCls ≠ ∅, we perform a second iteration of the while-loop (α). The Unfolding procedure executes SpC := Unf (20, reach(cf(cmd(0,ite(. . .)),. . .))). We get two clauses:
21. new1(X,Y,N) :- tr(cf(cmd(0,ite(less(int(x),int(n)),1,h)), [[int(x),X],[int(y),Y],[int(n),N]],[ ]),Z), reach(Z).
22. new1(X,Y,N) :- errorConf(cf(cmd(0,ite(less(int(x),int(n)),1,h)), [[int(x),X],[int(y),Y],[int(n),N]],[ ])).
The tr and errorConf atoms in the bodies of clauses 21 and 22, respectively, are unfoldable. Thus, the while-loop (α1) performs some more unfolding steps and, from clause 21, after a few steps that can be viewed as mimicking the symbolic evaluation of the conditional command, we get the following two clauses:
23. new1(X,Y,N) :- X<N, reach(cf(cmd(1,asgn(int(x),expr(plus(int(x),int(1))))), [[int(x),X],[int(y),Y],[int(n),N]],[ ])).
24. new1(X,Y,N) :- X≥N, reach(cf(cmd(h,halt),[[int(x),X],[int(y),Y],[int(n),N]],[ ])).
Clauses 23 and 24 represent the branches of the conditional command of the imperative program sum. Indeed, the test on the condition less(int(x),int(n)) in the command of clause 21 generates the two constraints X<N and X≥N.
Then, we delete clause 22 because by unfolding it we derive the empty set of clauses (indeed, the term cmd(0,...) does not unify with the term cmd(h,...)).
The reach atom in the body of clause 23 is unfoldable, because it contains a command of the form asgn(int(x), expr(...)). From clause 23, after two unfolding steps executed by the while-loop (α1), we get:
25. new1(X,Y,N) :- X<N, tr(cf(cmd(1,asgn(int(x),expr(plus(int(x),int(1))))), [[int(x),X],[int(y),Y],[int(n),N]],[ ]),Z), reach(Z).
Then, by unfolding clause 25 with respect to the tr atom in its body, we get: 26. new1(X,Y,N) :- X<N, X1=X+1,
reach(cf(cmd(2,asgn(int(y),expr(plus(int(x),int(y))))),
[[int(x),X1],[int(y),Y],[int(n),N]],[ ])).
The reach atom in the body of clause 26 is unfoldable. After two unfolding steps, we get:
27. new1(X,Y,N) :- X<N, X1=X+1,
tr(cf(cmd(2,asgn(int(y),expr(plus(int(x),int(y))))), [[int(x),X1],[int(y),Y],[int(n),N]],[ ]),Z), reach(Z).
By unfolding clause 27 we get:
28. new1(X,Y,N) :- X<N, X1=X+1, Y1=X1+Y, reach(cf(cmd(3,goto(0)), [[int(x),X1],[int(y),Y1],[int(n),N]],[ ])).
The sequence of clauses 19, 23, 26, and 28, which we have obtained by unfolding, mimics the execution of the sequence of the four commands: (i) l0 : if (x<n) l1 else lh, (ii) l1 : x = x+1, (iii) l2 : y = x+y, and (iv) l3 : goto l0 (note that in those clauses the atoms reach(cf(cmd(i,C),E)) occur, for i = 0, 1, 2, 3). Indeed, in general, by unfolding one can perform the symbolic execution of the commands of any given program. The conditions that should hold so that a particular command cmd(i,C) is executed are given by the constraints in the clause where the atom reach(cf(cmd(i,C),E)) occurs.
The reach atom in the body of clause 28 is unfoldable, because clause 28 has not been derived from another clause containing a goto command. From clause 28, after some more unfolding steps, we get:
29. new1(X,Y,N) :- X<N, X1=X+1, Y1=X1+Y,
reach(cf(cmd(0,ite(less(int(x),int(n)),1,h)), [[int(x),X1],[int(y),Y1],[int(n),N]],[ ])).
The reach atom in the body of clause 29 is not unfoldable, because it contains the ite command.
Also the reach atom in the body of clause 24, which represents the else branch of the conditional command, is unfoldable. After two unfolding steps, we get:
30. new1(X,Y,N) :- X≥N, errorConf(cf(cmd(h,halt), [[int(x),X],[int(y),Y],[int(n),N]],[ ])).
Then, by unfolding clause 30 we get:
31. new1(X,Y,N) :- X≥N, X>Y.
Since no unfoldable atoms occur in clauses 29 and 31, the while-loop (α1) terminates with SpC := {29, 31}.
The Definition Introduction procedure terminates without introducing any new definition. Indeed, clause 29 can be folded using definition 20. The sec- ond execution of the while-loop (α) terminates by performing the assignments: InCls := InCls − {20} = ∅, and V := V ∪ SpC = {19, 29, 31}.
Since InCls = ∅, the while-loop (α) terminates, and the strategy proceeds by executing the Folding procedure, which concludes the Specializevcg strategy. By folding clauses {19, 29} using definition 20 we get the following final program V:
32. incorrect :- X=0, Y=0, new1(X,Y,N).
33. new1(X,Y,N) :- X<N, X1=X+1, Y1=X1+Y, new1(X1,Y1,N).
34. new1(X,Y,N) :- X≥N, X>Y.
Note that the application of the folding rule on clause 29 using the definition for predicate new1 has been possible because the execution of the program goes back to the ite command to which the definition of new1 refers.
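Clauses 32–34 define a simple transition system over (X, Y, N). As a bounded sanity check (not a proof, which is the task of the later analysis steps), the following sketch runs the loop of clause 33 to exhaustion from the initial constraint of clause 32 and then tests the exit constraint of clause 34:

```python
# Bounded check that incorrect is not derivable from the residual program V
# (clauses 32-34), for sampled values of N.

def incorrect_derivable(n):
    x, y = 0, 0                 # clause 32: X=0, Y=0
    while x < n:                # clause 33: X<N, X1=X+1, Y1=X1+Y
        x += 1
        y = x + y
    return x >= n and x > y     # clause 34: X>=N, X>Y

assert not any(incorrect_derivable(n) for n in range(-5, 50))
```

Indeed, at the loop exit y = 1 + 2 + · · · + x ≥ x, so the constraint X>Y of clause 34 is never satisfied.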
3.6 Related Work
The idea of encoding imperative programs into CLP programs for reasoning about their properties was presented in various papers [16, 63, 88, 119, 127], which show that through CLP programs one can express in a simple manner both (i) the symbolic executions of imperative programs, and (ii) the invariants that hold during their executions.
The use of constraint logic program specialization for analyzing imperative programs has also been proposed by [119], where the interpreter of an imperative language is encoded as a CLP program. Then the interpreter is specialized with respect to a specific imperative program to obtain a residual program on which a static analyzer for CLP programs is applied.
The verification method presented in [63] is based on a semantics preserving translation from an imperative language with dynamic data structures and recursive functions into CLP. This translation reduces the verification of the (in)correctness of imperative programs to a problem of constraint satisfiability within standard CLP systems.
4 Verifying Programs by Specializing Verification Conditions
In this chapter we show how program specialization can be used not only as a preprocessing step to generate verification conditions, but also as a means of analysis on its own, as an alternative to static analysis techniques of CLP programs.
Let us consider an incorrectness triple of the form {ϕinit } Prog {ϕerror }. According to the method proposed in Chapter 3, the CLP program V obtained at the end of the Verification Conditions Generation step consists of a set of clauses representing the verification conditions for proving the partial correctness of Prog with respect to ϕinit and ϕerror. By Theorems 2 and 3, checking the satisfiability of the verification conditions for Prog reduces to checking whether or not the atom incorrect is a consequence of V.
Unfortunately, the problem of deciding whether or not incorrect is a consequence of V is undecidable. Consequently, verification methods based on top-down, bottom-up, and tabled query evaluation strategies may not terminate. In order to cope with this undecidability limitation, and improve the termination of the verification process, we propose a strategy based on program specialization. In particular, instead of applying program analysis techniques to the CLP program V, in the Verification Conditions Transformation step we further specialize V with respect to the initial property ϕinit, thereby deriving a new CLP program S, which is equivalent to V with respect to the property of interest, that is, incorrect ∈ M (V) iff incorrect ∈ M (S). The effect of this further transformation is the modification of the structure of V and the explicit addition of new constraints that denote invariants of the computation. Through various
experiments we show that by exploiting these invariants, the construction of the least model of the CLP program S, which is realized in the Verification Conditions Analysis step through a bottom-up evaluation procedure, terminates in many interesting cases and, thus, it is possible to verify the correctness of Prog with respect to ϕinit and ϕerror by simply inspecting that model.
Essential ingredients of program specialization are the generalization operators, which introduce new predicate definitions representing invariants of the program executions. Generalizations are used to enforce the termination of program specialization (recall that program specialization terminates when no new predicate definitions are introduced) and, in this respect, they are similar to the widening operators used in static program analysis [30, 34]. One problem encountered with generalizations is that sometimes they introduce predicate definitions which are too general, thereby making specialization useless. We introduce a new generalization strategy, called constrained generalization, whose objective is indeed to avoid the introduction of new predicate definitions that are too general.
The basic idea of constrained generalization is related to the branching behavior of the unfolding steps, as we now indicate. Given a sequence of unfolding steps performed during program specialization, we may consider a symbolic evaluation tree made out of clauses, such that every clause has as children the clauses which are generated by unfolding that clause. Suppose that a clause γ has n children which are generated by unfolding using clauses γ1, . . . , γn, and suppose that during program specialization we have to generalize clause γ. Then, we would like to perform this generalization by introducing a new predicate definition, say δ, such that by unfolding clause δ we get again, if possible, n children, and these children are due to the same clauses γ1, . . . , γn.
Since in this generalization the objective of preserving, if possible, the branching structure of the symbolic evaluation tree is realized by adding extra constraints to the clause obtained after a usual generalization step (using, for instance, the widening operator [30] or the convex-hull operator [34]), we call this generalization a constrained generalization. Similar proposals have been presented in [15, 78] and in Section 4.4 we will briefly compare those proposals with ours.
This chapter is organized as follows. In Section 4.1 we outline our software model checking method by developing an example taken from [75]. In Section 4.2 we describe our strategy for specializing CLP programs, and in Section 4.2.1 we present some generalization operators and, in particular, our novel constrained generalization technique. In Section 4.3 we report on some experiments we have performed by using a prototype implementation based on the MAP transformation system [108]. We also compare the results we have obtained using the MAP system with the results we have obtained using state-of-the-art software model checking systems such as ARMC [123], HSF(C) [73], and TRACER [85].
4.1 The Verification Method
In this section we outline our method for software model checking, which is obtained from the general verification framework (see Figure 1) by providing suitable subsidiary procedures that realize Steps 1–4.
The Software Model Checking Method
Input: An incorrectness triple {ϕinit } Prog {ϕerror } and the CLP program I defining the predicate incorrect.
Output: The answer correct iff Prog is correct with respect to ϕinit and ϕerror.
Step 1: T := C2CLP (Prog, ϕinit, ϕerror ); P := T ∪ I;
Step 2: V := Specializevcg(P);
Step 3: S := Specializeprop(V);
Step 4: M (S) := BottomUp(S);
Return the answer correct iff incorrect ∉ M (S).
Figure 4: The Verification Method
The CLP Translation step (Step 1) and the Verification Conditions Generation step (Step 2), of the verification method shown in Figure 4, rely on the C2CLP and Specializevcg procedures, respectively. In particular, in order to guarantee the termination of the Unf subsidiary procedure of Specializevcg (see Figure 3), an atom A is selected for unfolding only if it has not been derived by unfolding a variant of A itself.
The verification method shown in Figure 4 avoids the direct evaluation of the clauses in the CLP program V and applies symbolic evaluation methods based on program specialization. Indeed, starting from the CLP program V, the Verifi- cation Conditions Transformation step (Step 3) performs a further specialization, called the propagation of the constraints, which consists in specializing V with respect to the constraint representing the initial property ϕinit, with the aim of deriving, if possible, a CLP program S whose least model M (S) is a finite set
of constrained facts. The least model M (S) is computed by using a bottom-up evaluation procedure.
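The bottom-up evaluation of Step 4 can be illustrated by a naive fixpoint computation. The following Python fragment is only a sketch for a ground, constraint-free program; the atom names and the toy program are invented for illustration and are not the thesis encoding.

```python
# A minimal sketch of bottom-up (fixpoint) evaluation, restricted to a
# ground, constraint-free program: atoms are strings, and a rule derives
# its head once all its body atoms are in the current model.

def bottom_up(rules):
    # rules: list of (head, [body atoms]); a fact has an empty body
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model

rules = [("a", []), ("b", ["a"]), ("incorrect", ["b", "c"])]
m = bottom_up(rules)
print("incorrect" in m)  # -> False: the answer is "correct"
```

The loop stops as soon as a full pass over the rules adds no new atom, at which point `model` is the least model of the program.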
In order to perform the Verification Conditions Transformation step, we propose a specialization strategy, called Specializeprop (see Figure 6), which extends Specializevcg with a more powerful Definition Introduction phase. In particular, the Definition Introduction of Specializeprop makes use of generalization operators that are related to some abstract interpretation techniques [30] and play a role similar to that of abstraction in the verification methods described in [26, 36, 1]. However, since it is applied during the verification process, and not before it, our generalization is often more flexible than abstraction.
By means of an example borrowed from [75], we argue that program specialization can prove program correctness in some cases where the CEGAR method (as implemented in ARMC [123]) does not work. In particular we show that the construction of the least model M (S) terminates and we can prove the correctness of the imperative program Prog with respect to ϕinit and ϕerror by showing that the atom incorrect does not belong to that model.
Example 5 (Proving the partial correctness of an imperative program). Let us consider the following incorrectness triple
{ϕinit(x, y, n) } doubleLoop {ϕerror (x, y, n) }
where: (i) ϕinit(x, y, n) is x=0 ∧ y=0 ∧ n≥0, (ii) ϕerror(x, y, n) is x<y, and
(iii) doubleLoop is the program:
while (x<n) {
  x = x+1;
  y = y+1;
}
while (x>0) {
  x = x-1;
  y = y-1;
}
Listing 4.1: Program doubleLoop
We want to prove that doubleLoop is correct with respect to ϕinit(x, y, n) and ϕerror(x, y, n), that is, there is no execution of doubleLoop with input values of x, y, and n satisfying ϕinit(x, y, n), such that a configuration satisfying ϕerror(x, y, n) is reached.
As shown in Table 4.1 of Section 4.3, CEGAR fails to prove this property, because an infinite set of counterexamples is generated (see the entry '∞' for Program doubleLoop in the ARMC column). Conversely, by applying the specialization-based software model checking method depicted in Figure 4 we will be able to prove that doubleLoop is indeed correct. By performing the CLP Translation step (Step 1) and the Verification Conditions Generation step (Step 2) of our framework we get the following CLP clauses encoding the verification conditions for doubleLoop.
1. incorrect :- a(X,Y,N), new1(X,Y,N).
2. new1(X,Y,N) :- X<N, X1=X+1, Y1=Y+1, new1(X1,Y1,N).
3. new1(X,Y,N) :- X≥1, X≥N, X1=X-1, Y1=Y-1, new2(X1,Y1,N).
4. new1(X,Y,N) :- X≤0, X≥N, b(X,Y,N).
5. new2(X,Y,N) :- X≥1, X1=X-1, Y1=Y-1, new2(X1,Y1,N).
6. new2(X,Y,N) :- X≤0, b(X,Y,N).
where:
7. a(X,Y,N) :- X=1, Y=1, N≥1.
8. b(X,Y,N) :- X<Y.
encode the specialized initial and error configurations (note that according to the Unf function of the Specializevcg procedure each loop is unrolled once, and therefore we get the atomic constraint occurring in clause 8).
Unfortunately, it is not possible to check by direct evaluation whether or not the atom incorrect is a consequence of the above CLP clauses. Indeed, the evaluation of the query incorrect using the standard top-down strategy enters into an infinite loop. Tabled evaluation [35] does not terminate either, as infinitely many tabled atoms are generated. Analogously, bottom-up evaluation is unable to return an answer, because infinitely many facts for new1 and new2 should be generated for deriving that incorrect is not a consequence of the given clauses.
Then, the Verification Conditions Transformation step (Step 3) specializes the CLP program V with respect to the property ϕinit, thereby deriving the specialized program S. During Step 3 the constraints occurring in the definitions of new1 and new2 are generalized according to a suitable generalization strategy based both on widening [30, 58, 62] and on the novel constrained generalization strategy.
We proceed by following the specialization pattern described in Figure 3 for Step 2, but in this specialization we also show a novel definition introduction approach.
We start off by unfolding clause 1 with respect to a and we get:
9. incorrect:- X=1, Y=1, N≥1, new1(X,Y,N).
Since no clause in Defs can be used to fold clause 9 we introduce the following definition:
10. new3(X,Y,N) :- X=1, Y=1, N≥1, new1(X,Y,N).
Each new definition introduced during specialization determines a new node of a tree, called Defs, whose root is clause 10, which is the first definition we have introduced. The tree Defs of all the definitions introduced during the Verification Conditions Transformation step, can be depicted as in Figure 5.
Then, we unfold clause 10 and we get:
11. new3(X,Y,N) :- X=1, Y=1, N≥2, X1=2, Y1=2, new1(X1,Y1,N).
12. new3(X,Y,N) :- X=1, Y=1, N=1, X1=0, Y1=0, new2(X1,Y1,N).
Now, we should fold these two clauses. Let us deal with them one at a time, and let us first consider clause 11. In order to fold clause 11 we consider a definition, called the candidate definition, which is of the form:
13. new4(X,Y,N) :- X=2, Y=2, N≥2, new1(X,Y,N).
The body of this candidate definition is obtained by (i) projecting the constraint in clause 11 with respect to the variable N and the primed variables X1 and Y1, and (ii) renaming the primed variables to unprimed variables. Since in Defs there is an ancestor definition, namely the root clause 10, with the predicate new1 in the body, we apply the widening operator introduced in [62] to clause 10 and clause 13, and we get the definition:
14. new4(X,Y,N) :- X≥1, Y≥1, N≥1, new1(X,Y,N).
(Recall that the widening operation of two clauses c1 and c2, after replacing every equality A=B by the equivalent conjunction A≥B, A≤B, returns the atomic constraints of clause c1 which are implied by the constraint of clause c2.)
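For this example, the widening step can be reproduced by a small sketch, assuming (our simplification) that every atomic constraint is a bound on a single variable; the triple encoding and the function names are ours, not part of the MAP system.

```python
# Widening of clause c1 w.r.t. clause c2: split equalities X=k into
# X>=k, X<=k, then keep the atomic constraints of c1 implied by c2.
# Constraints are encoded as (variable, op, integer) triples.

def split_equalities(constraints):
    out = []
    for var, op, k in constraints:
        if op == "=":
            out += [(var, ">=", k), (var, "<=", k)]
        else:
            out.append((var, op, k))
    return out

def implied_by(cs, atom):
    # cs implies "X >= k" iff cs contains "X >= k2" with k2 >= k (dually for <=)
    var, op, k = atom
    return any(v == var and o == op and
               (k2 >= k if op == ">=" else k2 <= k)
               for v, o, k2 in cs)

def widen(c1, c2):
    c2 = split_equalities(c2)
    return [a for a in split_equalities(c1) if implied_by(c2, a)]

c10 = [("X", "=", 1), ("Y", "=", 1), ("N", ">=", 1)]  # body of clause 10
c13 = [("X", "=", 2), ("Y", "=", 2), ("N", ">=", 2)]  # body of clause 13
print(widen(c10, c13))  # X>=1, Y>=1, N>=1: the body of clause 14
```

The constraints X≤1 and Y≤1 of clause 10 are dropped because they are not implied by X=2, Y=2, which yields exactly the body of clause 14.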
At this point, we do not introduce clause 14 (as we would do if we performed the usual generalization using widening alone, as indicated in [58, 62]), but we apply our constrained generalization, which imposes the addition of some extra constraints to the body of clause 14, as we now explain.
With each predicate newk we associate a set of constraints, called the regions for newk, which are all the atomic constraints on the unprimed variables (that is, the variables in the heads of the clauses) occurring in any one of the clauses for newk in the CLP program V consisting of clauses 1–8. Now, let newq(. . . ) :- d, newk be the candidate definition (clause 13, in our case). Then, we add to the body of the generalized definition obtained by widening, say newp(. . . ) :- c, newk (clause 14, in our case), all negated regions for newk which are implied by d.
In our example, the regions for new1 are: X<N, X≥1, X≥N, X≤0, X<Y (see clauses 2, 3, 4 and 8) and the negated regions are, respectively: X≥N, X<1, X<N, X>0, X≥Y. The negated regions implied by the constraint X=2, Y=2, N≥2, occurring in the body of the candidate clause 13, are: X>0 and X≥Y.
Thus, instead of clause 14, we introduce the following clause 15 (we wrote neither X>0 nor X≥1 because those constraints are implied by X≥Y, Y≥1):
15. new4(X,Y,N) :- X≥Y, Y≥1, N≥1, new1(X,Y,N).
and we say that clause 15 has been obtained by constrained generalization from clause 13. Clause 15 is placed in Defs as a child of clause 10, as clause 11 has been derived by unfolding clause 10.
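The computation of the negated regions implied by the candidate constraint can be sketched as follows. The sketch abstracts the candidate X=2, Y=2, N≥2 by per-variable intervals and checks entailment on intervals (sound but incomplete, yet sufficient here); the encoding of atomic constraints as triples is ours.

```python
import math

# Negated regions for new1 implied by the candidate constraint of clause 13.
# An atomic constraint is (lhs_var, op, rhs), where rhs is a variable name
# or a number; the candidate constraint is a map from variables to intervals.

def neg(atom):
    lhs, op, rhs = atom
    return (lhs, {"<": ">=", ">=": "<", "<=": ">", ">": "<="}[op], rhs)

def entails(bounds, atom):
    lhs, op, rhs = atom
    lo1, hi1 = bounds[lhs]
    lo2, hi2 = bounds[rhs] if isinstance(rhs, str) else (rhs, rhs)
    return {"<": hi1 < lo2, "<=": hi1 <= lo2,
            ">": lo1 > hi2, ">=": lo1 >= hi2}[op]

# regions for new1, read off clauses 2, 3, 4 and 8
regions = [("X", "<", "N"), ("X", ">=", 1), ("X", ">=", "N"),
           ("X", "<=", 0), ("X", "<", "Y")]

d = {"X": (2, 2), "Y": (2, 2), "N": (2, math.inf)}  # X=2, Y=2, N>=2

implied = [neg(r) for r in regions if entails(d, neg(r))]
print(implied)  # the two negated regions X>0 and X>=Y
```

Only the negations of X≤0 and X<Y survive the entailment test, matching the constraints X>0 and X≥Y added by the constrained generalization.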
Now, it remains to fold clause 12 and in order to do so, we consider the following candidate definition:
16. new5(X,Y,N) :- X=0, Y=0, N=1, new2(X,Y,N).
Clause 16 is placed in Defs as a child of clause 10, as clause 12 has been derived by unfolding clause 10. We do not make any generalization of this clause, because no definition with new2 in its body occurs as an ancestor of clause 16 in Defs.
Now, we consider the last two definition clauses we have introduced, that is, clauses 15 and 16. First, we deal with clause 15. Starting from that clause, we perform a sequence of unfolding-definition steps similar to the sequence we have described above. During this sequence of steps, we introduce two predicates, new6 and new7 (see the definition clauses for those predicates in Figure 5), for performing the required folding steps.
Then, we deal with clause 16. Again, starting from that clause we perform a sequence of unfolding-definition steps. By unfolding clause 16 w.r.t. new2 we get an empty set of clauses for new5. Then, we also delete clause 12, which should be folded with definition 16, because there are no clauses for new5.
Eventually, we get the program S made out of the following folded clauses:
17. incorrect:- X=1, Y=1, N≥1, new3(X,Y,N).
18. new3(X,Y,N) :- X=1, Y=1, N≥2, X1=2, Y1=2, new4(X1,Y1,N).
19. new4(X,Y,N) :- X≥Y, X<N, Y>0, X1=X+1, Y1=Y+1, new4(X1,Y1,N).
20. new4(X,Y,N) :- X≥Y, X≥N, Y>0, N>0, X1=X-1, Y1=Y-1, new6(X1,Y1,N).
21. new6(X,Y,N) :- X>0, X≥Y, X≥N-1, Y≥0, N>0, X1=X-1, Y1=Y-1, new7(X1,Y1,N).
Defs : new3(X,Y,N) :- X=1,Y=1,N≥1,new1(X,Y,N).
new4(X,Y,N) :- X≥Y,Y≥1,N≥1,new1(X,Y,N).
new5(X,Y,N) :- X=0,Y=0,N=1,new2(X,Y,N).
new6(X,Y,N) :- X≥Y,X+1≥N,Y≥0,N≥1,new2(X,Y,N).
new7(X,Y,N) :- X≥Y,N≥1,new2(X,Y,N).
Figure 5: The definition tree Defs.
22. new7(X,Y,N) :- X>0, X≤Y, N>0, X1=X-1, Y1=Y-1, new7(X1,Y1,N).
This concludes the Verification Conditions Transformation step.
Now, we can perform the Verification Conditions Analysis step of our method. This phase terminates immediately because in S there are no constrained facts (that is, clauses whose bodies consist of constraints only) and M (S) is the empty set. Thus, incorrect ∉ M (S) and we conclude that the imperative program Prog is correct with respect to ϕinit and ϕerror.
□
One can verify that if we were to do the generalization of Step 3 using the widening technique alone (without the constrained generalization), we could not derive a program that allows us to prove correctness, because during Step 4 the execution of the BottomUp procedure would not terminate.
4.2 The Specialization Strategy
In this section we present the specialization strategy Specializeprop shown in Figure 6. Initially, Specializeprop considers the clauses of the form:
incorrect :- c1(X), A1(X), . . . , incorrect :- cj(X), Aj(X) (ψ)
where, for 1 ≤ i ≤ j, ci is either an atom or a constraint and Ai is an atom.
The Unfolding phase consists in unfolding a clause C with respect to the leftmost atom in its body. It makes use of the Unf function which takes as input a clause D and an atom A, and returns as output a set SpC of satisfiable clauses
Input: Program P (either V or R).
Output: Program S such that incorrect ∈M (P ) iff incorrect ∈M (S).
Initialization:
S := ∅;
InCls := { incorrect:- c1(X), A1(X), . . . , incorrect:- cj(X), Aj(X) };
Defs := ∅;
(α) while in InCls there is a clause C that is not a constrained fact do
Unfolding:
SpC := Unf (C, A), where A is the leftmost atom in the body of C;
Clause Removal:
(α1) while in SpC there are two distinct clauses E1 and E2 such that
E1 subsumes E2 do SpC := SpC − {E2};
end-while;
Definition Introduction:
(α2) while in SpC there is a clause E that is not a constrained fact and cannot be folded using a definition in Defs do
G := Gen(E, Defs);
Defs := Defs ∪ {child(G, C)};
InCls := InCls ∪ {G};
end-while;
InCls := InCls − {C};
S := S ∪ SpC;
end-while;
Folding:
(β) while in S there is a clause E that can be folded by a clause D in Defs do
S := (S − {E}) ∪ {F}, where F is derived by folding E using D;
end-while;
Remove from S all clauses for predicates on which incorrect does not depend.
Figure 6: The Specializeprop Procedure.
derived from D by a single application of the unfolding rule (see Definition 1), which consists in: (i) replacing an atom A occurring in the body of a clause by the bodies of the clauses in P whose head is unifiable with A, and (ii) applying the unifying substitution.
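The unfolding rule can be illustrated, in a propositional setting that ignores constraints and unification, by the following sketch; the pair encoding of clauses and the toy program are ours.

```python
# A propositional sketch of the unfolding rule: the leftmost body atom of a
# clause is replaced by the bodies of the program clauses with the same head.

def unfold(clause, program):
    head, body = clause
    atom, rest = body[0], body[1:]
    return [(head, prog_body + rest)
            for prog_head, prog_body in program if prog_head == atom]

program = [("new1", ["a", "new1"]),   # new1 :- a, new1.
           ("new1", ["b"])]           # new1 :- b.
print(unfold(("incorrect", ["new1"]), program))
# two clauses: incorrect :- a, new1.  and  incorrect :- b.
```

In the full Unf procedure each replacement is followed by the application of the unifying substitution and the removal of clauses whose constraints are unsatisfiable.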
At the end of the Unf procedure, Clause Removal removes subsumed clauses.
The specialization strategy proceeds to the Definition Introduction phase and terminates when no new definitions are needed for performing the subsequent Folding phase. Unfortunately, an uncontrolled application of the Definition Introduction procedure may lead to the introduction of infinitely many new definitions, thereby causing the nontermination of the specialization procedure. In order to deal with this potential nontermination issue we introduce a subsidiary procedure, called Gen, which introduces new definitions and is parametric with respect to generalization operators.
In the following section we will define suitable generalization operators which guarantee the introduction of finitely many new definitions.
4.2.1 Generalization Operators
In this section we define some generalization operators which are used to ensure the termination of the specialization strategy. As mentioned in the introduction, we also introduce constrained generalization operators that generalize the constraints occurring in a candidate definition and, by adding suitable extra constraints, aim at preventing the set of clauses generated by unfolding the generalized definition from being larger than the set of clauses generated by unfolding the candidate definition. In this sense we say that the objective of constrained generalization is to preserve the branching behaviour of the candidate definitions.
Further generalization operators used for the specialization of logic programs and constraint logic programs can be found in [58, 62, 102, 101, 118].
We will consider linear constraints over the set R of the real numbers. The set C is the minimal set of constraints which: (i) includes all atomic constraints of the form either p1 ≤ p2 or p1 < p2, where p1 and p2 are linear polynomials with variables X1, . . . , Xk and integer coefficients, and (ii) is closed under conjunction (which we denote by ',' and also by '∧'). An equation p1 = p2 stands for p1 ≤ p2 ∧ p2 ≤ p1. The projection of a constraint c onto a tuple X of variables, denoted project(c, X), is a constraint such that R ⊨ ∀X (project(c, X) ↔ ∃Y c), where Y is the tuple of variables occurring in c and not in X.
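The projection project(c, X) can be computed, for linear constraints, by Fourier–Motzkin elimination of the variables not in X. The following sketch uses our own encoding, with each constraint written as p ≤ 0, and eliminates one variable; strict inequalities are handled analogously and are omitted.

```python
from fractions import Fraction

# Fourier–Motzkin elimination of variable v from a conjunction of
# constraints, each encoded as (coeffs, const), meaning
#   sum(coeffs[var] * var) + const <= 0.

def eliminate(constraints, v):
    pos, neg, rest = [], [], []
    for coeffs, const in constraints:
        a = coeffs.get(v, 0)
        (pos if a > 0 else neg if a < 0 else rest).append((coeffs, const))
    # combine every upper bound on v (from pos) with every lower bound (from neg)
    for c1, k1 in pos:
        a = c1[v]
        for c2, k2 in neg:
            b = -c2[v]
            merged = {u: b * c1.get(u, 0) + a * c2.get(u, 0)
                      for u in set(c1) | set(c2) if u != v}
            rest.append((merged, b * k1 + a * k2))
    return rest

# c is  X - Y <= 0  and  Y - 5 <= 0 ; projecting onto {X} eliminates Y
c = [({"X": 1, "Y": -1}, Fraction(0)), ({"Y": 1}, Fraction(-5))]
print(eliminate(c, "Y"))  # a single constraint: X - 5 <= 0, i.e. X <= 5
```

Each pair formed by an upper and a lower bound on the eliminated variable yields one new constraint, so the result is a constraint over the remaining variables that is equivalent to the existential projection.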
In order to introduce the notion of a generalization operator we need the following definition [50].
Definition 7 (Well-Quasi Ordering ≼). A well-quasi ordering (or wqo, for short) on a set S is a reflexive, transitive relation ≼ on S such that, for every infinite sequence e0 e1 . . . of elements of S, there exist i and j such that i<j and ei ≼ ej. Given e1 and e2 in S, we write e1 ≈ e2 if e1 ≼ e2 and e2 ≼ e1. A wqo ≼ is thin iff for all e ∈ S, the set {e' ∈ S | e ≈ e'} is finite.
The use of a thin wqo guarantees that during the Specializeprop procedure each definition can be generalized a finite number of times only, and thus the termi- nation of the procedure is guaranteed.
The thin wqo Maxcoeff, denoted by ≼M, compares the maximum absolute values of the coefficients occurring in polynomials. It is defined as follows. For any atomic constraint a of the form p < 0 or p ≤ 0, where p is q0 + q1X1 + . . . + qkXk, we define maxcoeff(a) to be max{|q0|, |q1|, . . . , |qk|}. Given two atomic constraints a1 of the form p1 < 0 and a2 of the form p2 < 0, we have that a1 ≼M a2 iff maxcoeff(a1) ≤ maxcoeff(a2).
Similarly, if we are given the atomic constraints a1 of the form p1 ≤ 0 and a2 of the form p2 ≤ 0. Given two constraints c1 ≡ a1, . . . , am, and c2 ≡ b1, . . . , bn, we have that c1 ≼M c2 iff, for i = 1, . . . , m, there exists j ∈ {1, . . . , n} such that ai ≼M bj. For example, we have that:
(i) (1−2X1 < 0) ≼M (3+X1 < 0),
(ii) (2−2X1+X2 < 0) ≼M (1+3X1 < 0), and
(iii) (1+3X1 < 0) ⋠M (2−2X1+X2 < 0).
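The Maxcoeff examples above can be checked with a few lines of Python; the list encoding of polynomials is our own convention.

```python
# An atomic constraint p < 0 (or p <= 0) is represented by the coefficient
# list [q0, q1, ..., qk] of the polynomial p = q0 + q1*X1 + ... + qk*Xk;
# a conjunction is a list of such lists.

def maxcoeff(a):
    return max(abs(q) for q in a)

def leq_M(c1, c2):
    # c1 is Maxcoeff-below c2: every atomic constraint of c1 is matched
    # by some atomic constraint of c2 with a maxcoeff at least as large
    return all(any(maxcoeff(a) <= maxcoeff(b) for b in c2) for a in c1)

print(leq_M([[1, -2]], [[3, 1]]))     # example (i):   2 <= 3, so True
print(leq_M([[1, 3]], [[2, -2, 1]]))  # example (iii): 3 > 2, so False
```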
Definition 8 (Generalization Operator ⊖). Let ≼ be a thin wqo on the set C of constraints. A function ⊖ from C × C to C is a generalization operator with respect to ≼ if, for all constraints c and d, we have: (i) d ⊑ c ⊖ d, and (ii) c ⊖ d ≼ c.

A trivial generalization operator is defined as c ⊖ d = true, for all constraints c and d (without loss of generality we assume that true ≼ c for every constraint c).
Definition 8 generalizes several operators proposed in the literature, such as the widening operator [30] and the most specific generalization operator [102, 132].
Other generalization operators are defined in terms of relations and operators on constraints, such as widening and convex hull, which have been proposed for the static analysis of programs [30, 34] and also applied to the specialization of constraint logic programs (see, for instance, [62, 118]). These generalization operators have been extensively studied in the above cited papers.
Here we define some generalization operators which have been used in the experiments we have performed (see also [62]).
• (W) Given any two constraints c ≡ a1, . . . , am, and d, the operator Widen, denoted ⊖W, returns the constraint ai1, . . . , air, such that {ai1, . . . , air} = {ah | 1 ≤ h ≤ m and d ⊑ ah}. Thus, Widen returns all atomic constraints of c that are entailed by d (see [30] for a similar widening operator used in static program analysis). The operator ⊖W is a generalization operator w.r.t. the thin wqo ≼M.
• (WM) Given any two constraints c ≡ a1, . . . , am, and d ≡ b1, . . . , bn, the operator WidenMax, denoted ⊖WM, returns the conjunction ai1, . . . , air, bj1, . . . , bjs, where: (i) {ai1, . . . , air} = {ah | 1 ≤ h ≤ m and d ⊑ ah}, and (ii) {bj1, . . . , bjs} = {bk | 1 ≤ k ≤ n and bk ≼M c}.
The operator WidenMax is a generalization operator w.r.t. the thin wqo ≼M. It is similar to Widen but, together with the atomic constraints of c that are entailed by d, it returns also the conjunction of a subset of the atomic constraints of d.
Next we define a generalization operator by using the convex hull operator, which is often used in static program analysis [34].
• (CH) The convex hull of two constraints c and d in C, denoted by ch(c, d), is the least (w.r.t. the ⊑ ordering) constraint h in C such that c ⊑ h and d ⊑ h. (Note that ch(c, d) is unique up to equivalence of constraints.)
• (CHWM) Given any two constraints c and d, we define the operator CHWidenMax, denoted ⊖CHWM, as follows: c ⊖CHWM d = c ⊖WM ch(c, d). The operator ⊖CHWM is a generalization operator w.r.t. the thin wqo ≼M. CHWidenMax returns the conjunction of a subset of the atomic constraints of c and a subset of the atomic constraints of ch(c, d).
Constrained Generalization
Now we describe a method for deriving, from any given generalization operator ⊖, a new version of that operator, denoted ⊖cns, which adds some extra constraints and still is a generalization operator. The operator ⊖cns is called the constrained generalization operator derived from ⊖.
In order to specify the constrained generalization operator we need the fol- lowing notions.
Let P be the input program of the Specializeprop procedure. For any constraint d and atom A, we define the unfeasible clauses for the pair (d, A), denoted UnfCl(d, A), to be the set {(H1 :- c1,G1), . . . , (Hm :- cm,Gm)} of (renamed apart) clauses of P such that, for i = 1, . . . , m, A and Hi are unifiable via the most general unifier ϑi and (d ∧ ci) ϑi is unsatisfiable.
The head constraint of a clause of the form H :- c, A is the constraint project(c, X), where X is the tuple of variables occurring in H. For any atomic constraint a, neg(a) denotes the negation of a defined as follows: neg(p < 0) is −p ≤ 0 and neg(p ≤ 0) is −p < 0. Given a set C of clauses, we define the set of the negated regions of C, denoted NegReg(C), as follows:
NegReg(C) = {neg(a) | a is an atomic constraint of a head constraint of a clause in C}.
For any constraint d and atom A, we define the following constraint:
cns(d, A) = ⋀ {r | r ∈ NegReg(UnfCl(d, A)) ∧ d ⊑ r}.
We have that d ⊑ cns(d, A). Now, let ⊖ be a generalization operator with respect to the thin wqo ≼. We define the constrained generalization operator derived from ⊖, as follows:
⊖cns(c, d, A) = (c ⊖ d) ∧ cns(d, A).
Now we show that ⊖cns is indeed a generalization operator w.r.t. the thin wqo ≼B we now define. Given a finite set B of (not necessarily atomic) constraints, a constraint c1 ∧ . . . ∧ cn, where c1, . . . , cn are atomic, and a constraint d, we define the binary relation ≼B on constraints as follows: c1 ∧ . . . ∧ cn ≼B d iff either (i) (c1 ∧ . . . ∧ cn) ≼ d, or (ii) there exists i ∈ {1, . . . , n} such that ci ∈ B and (c1 ∧ . . . ∧ ci−1 ∧ ci+1 ∧ . . . ∧ cn) ≼B d. It can be shown that ≼B is a thin wqo.
We observe that, for all constraints c, d, and all atoms A: (i) since d ⊑ c ⊖ d and d ⊑ cns(d, A), then d ⊑ ⊖cns(c, d, A), and (ii) by definition of ≼B, for all constraints e, if c ⊖ d ≼ e, then ⊖cns(c, d, A) ≼B e, where B = NegReg(P).
Thus, we have the following result.
Proposition 1. For any program P ∪ {γ0} given as input to the Specializevcg procedure, for any atom A, the operator ⊖cns(_, _, A) is a generalization operator with respect to the thin well-quasi ordering ≼B, where B = NegReg(P).
4.2.2 Generalization Strategy
The Specializeprop procedure introduces new definitions by using the subsidiary Gen strategy which, given a clause E and a set Defs of definitions, yields a new definition clause G.
Definition 9 (Generalization Strategy). Let E be a clause of the form H(X) :- e(X,X1), Q(X1), where X and X1 are tuples of variables, e(X,X1) is a constraint, and Q(X1) is an atom. Let Defs be a set of definitions. Then, Gen(E, Defs) is a clause G: newp(X) :- g(X), Q(X), such that: (i) newp is a new predicate symbol, and (ii) e(X,X1) ⊑ g(X1).
For any infinite sequence E1, E2, . . . of clauses, let G1, G2, . . . be a sequence of clauses constructed as follows: (1) G1 = Gen(E1, ∅), and (2) for every i > 0, Gi+1 = Gen(Ei+1, {G1, . . . , Gi}). We assume that the sequence G1, G2, . . . stabilizes, that is, there exists an index k such that, for every i>k, Gi is equal, modulo the head predicate name, to a clause in {G1, . . . , Gk}.
In order to control the application of the generalization operators, we follow an approach which is similar to the one considered in the context of partial deduction [101, 109]. In particular, the Gen strategy makes use of a set Defs of definitions arranged as a forest of trees whose roots are among the clauses (ψ) considered at the beginning of the specialization process. Each new definition G introduced during specialization determines a new node child(G, C) of a tree, which is placed as a child of definition C if G is introduced to fold a clause derived by unfolding C.
Figure 7 shows the generalization strategy used in Specializeprop. In Section 4.3 we will see the specific (constrained) generalization operators (⊖cns) which will be used in the experimental evaluation.
Input: A clause E and a set Defs of definitions structured as a tree.
Output: A new definition G.
Let E be a clause of the form q :- d, A(X) and dX = project(d, X). if there exists a clause D in Defs such that:
(i) D is of the form p(Y) :- c, A(Y), and
(ii) D is the most recent ancestor of E in Defs
such that A(Y) is a variant of A(X)
then G := newp(X) :- g, A(X), where g is either c ⊖ dX or ⊖cns(c, dX, A)
else G := newp(X) :- dX, A(X)
Figure 7: The generalization strategy Gen(E,Defs) of Specializeprop
4.2.3 Termination and Soundness of the Specialization Strategy
The following theorem establishes the termination and soundness of the Specializeprop strategy.
Theorem 4 (Termination and Correctness of the Specializeprop strategy). (i) The Specializeprop procedure always terminates. (ii) Let program S be the output of the Specializeprop procedure. Then incorrect ∈ M (V) iff incorrect ∈ M (S).
Proof. (i) Since the Unfolding subsidiary procedure, the Clause Removal while-loop (α1), the Definition Introduction while-loop (α2), and the Folding while-loop (β) clearly terminate, we are left with showing that the first, outermost while-loop (α) terminates, that is, a finite number of new predicate definitions is added to InCls by Definition Introduction. This finiteness is guaranteed by the following facts:
(1) all new predicate definitions are introduced by the Gen strategy,
(2) by Definition 9 the set of all new predicate definitions generated by a se- quence of applications of a generalization operator is finite, modulo the head predicate names, and
(3) no two new predicate definitions that are equal modulo the head predicate name are introduced by Definition Introduction (indeed, Definition Introduction introduces a new predicate definition only if the definitions already present in Defs cannot be used to fold clause E).
(ii) see the proof of point (ii) of Theorem 3.
4.3 Experimental Evaluation
In this section we present some preliminary results obtained by applying our Software Model Checking method to some benchmark programs taken from the literature. The results show that our approach is viable and competitive with the state-of-the-art software model checkers.
Programs ex1, f1a, f2, and interp have been taken from the benchmark set of DAGGER [75]. Programs substring and tracerP are taken from [91] and [86], respectively. Programs doubleLoop and singleLoop have been introduced to illustrate the constrained generalization strategy. Finally, selectSort is an encoding of the Selection sort algorithm where references to arrays have been abstracted away to perform array bounds checking. The source code of all the above programs is available at xxxx://xxx.xxxxxxx0.xx/xxx/.
The experiments have been performed by using the VeriMAP software model checker (see Chapter 8) that implements our verification method. We have also run three state-of-the-art CLP-based software model checkers on the same set of programs, and we have compared their performance with that of our model checker. In particular, we have used: (i) ARMC [123], (ii) HSF(C) [73], and (iii) TRACER [85]. ARMC and HSF(C) are CLP-based software model checkers which implement the CEGAR technique. TRACER is a CLP-based model checker which uses Symbolic Execution (SE) for the verification of partial correctness properties of sequential C programs using approximated preconditions or approximated postconditions.
Table 4.1 reports the results of our experimental evaluation which has been performed on an Intel Core Duo E7300 2.66GHz processor with 4GB of memory under the GNU Linux operating system.
In Columns W and CHWM we report the results obtained by the MAP system when using the procedure presented in Section 4.2.1 and the generalization operators Widen and CHWidenMax [62], respectively. In Columns Wcns and CHWMcns we report the results for the constrained versions of those generalization operators, called Widencns and CHWidenMaxcns, respectively. In the remaining columns we report the results obtained by ARMC, HSF(C), and TRACER using the strongest postcondition (SPost) and the weakest precondition (WPre) options, respectively.
On the selected set of examples, the MAP system with the CHWidenMaxcns operator is able to verify 9 properties out of 9, while the other tools verify at most 7. Also the verification time is generally comparable to that of the other tools, and it is not much greater than that of the fastest ones. Note that there are two examples (doubleLoop and singleLoop) where constrained generalization operators based on widening and convex hull are strictly more powerful than the corresponding operators which are not constrained.
We also observe that the use of a constrained generalization operator usually causes a very small increase of the verification time with respect to the non-constrained counterparts, thus making constrained generalization a promising technique that can be used in practice for software verification.
In Table 4.2 we present in some more detail the time taken for proving the properties of interest by using our method for software model checking with the generalization operators Widen (Column W) and CHWidenMax (Column CHWM), and the constrained generalization operators derived from them,
Program | MAP W | MAP Wcns | MAP CHWM | MAP CHWMcns | ARMC | HSF(C) | TRACER SPost | TRACER WPre
ex1 | 1.08 | 1.09 | 1.14 | 1.25 | 0.18 | 0.21 | ∞ | 1.29 |
f1a | ∞ | ∞ | 0.35 | 0.36 | ∞ | 0.20 | ⊥ | 1.30 |
f2 | ∞ | ∞ | 0.75 | 0.88 | ∞ | 0.19 | ∞ | 1.32 |
interp | 0.29 | 0.29 | 0.32 | 0.44 | 0.13 | 0.18 | ∞ | 1.22 |
doubleLoop | ∞ | 0.33 | 0.33 | 0.33 | ∞ | 0.19 | ∞ | ∞ |
selectSort | 4.34 | 4.70 | 4.59 | 5.57 | 0.48 | 0.25 | ∞ | ∞ |
singleLoop | ∞ | ∞ | ∞ | 0.26 | ∞ | ∞ | ⊥ | 1.28 |
substring | 88.20 | 171.20 | 5.21 | 5.92 | 931.02 | 1.08 | 187.91 | 184.09 |
tracerP | 0.11 | 0.12 | 0.11 | 0.12 | ∞ | ∞ | 1.15 | 1.28 |
Table 4.1: Time (in seconds) taken for performing model checking. ‘∞’ means ‘no answer within 20 minutes’, and ‘⊥’ means ‘termination with error’.
Widencns (Column Wcns) and CHWidenMaxcns (Column CHWMcns), respec- tively.
Columns Step 1-2, Step 3, and Step 4 show the time required for performing the corresponding Step, respectively, of our Software Model Checking method presented in Section 4.1. The sum of these three times for each phase is reported in Column Tot.
4.4 Related Work
The specialization of logic programs and constraint logic programs has also been used in techniques for the verification of infinite state reactive systems [59, 62, 61, 103]. By using these techniques one may verify properties of Kripke structures, instead of properties of imperative programs as we did here. In [103] the authors do not make use of constraints, which are a key ingredient of the technique we present here. In [61, 62, 103] program specialization is combined with the computation of the least model of the specialized program, or the computation of an overapproximation of the least model via abstract interpretation.
The use of program specialization for the verification of properties of imperative programs is not novel. It has been investigated, for instance, in [119]. In that paper a CLP interpreter for the operational semantics of a simple imperative language is specialized with respect to the input program to be verified. Then, a static analyzer for CLP programs is applied to the residual program for computing invariants (that is, overapproximations of the behavior) of the input imperative program. These invariants are used in the proof of the properties of interest. Unlike [119], our verification method does not perform any static analysis phase separated from the specialization phase and, instead, we discover program invariants during the specialization process by applying suitable generalization operators. These operators are defined in terms of operators and relations on constraints such as widening and convex hull [30, 34, 62]. As in [119], we also use program specialization to perform the so-called removal of the interpreter, but in addition, in this paper we use specialization for propagating the information about the constraints in the initial configurations. In particular, the CLP program is specialized with respect to the property to be verified, by using constrained generalization operators which have the objective of preserving, if possible, the branching behaviour of the definitions to be generalized. In this way we may avoid loss of precision and, at the same time, we enforce the termination of the specialization process (Step 3).
The idea of constrained generalization which has the objective of preserving
Program | Step 1-2 | W Step 3 | W Step 4 | W Tot | Wcns Step 3 | Wcns Step 4 | Wcns Tot | CHWM Step 3 | CHWM Step 4 | CHWM Tot | CHWMcns Step 3 | CHWMcns Step 4 | CHWMcns Tot
ex1 | 1.02 | 0.05 | 0.01 | 1.08 | 0.07 a | 0 | 1.09 | 0.11 | 0.01 | 1.14 | 0.23 a | 0 | 1.25 |
f1a | 0.35 | 0.01 | ∞ | ∞ | 0.01 | ∞ | ∞ | 0 a | 0 | 0.35 | 0.01 a | 0 | 0.36 |
f2 | 0.71 | 0.03 | ∞ | ∞ | 0.13 | ∞ | ∞ | 0.03 a | 0.01 | 0.75 | 0.17 a | 0 | 0.88 |
interp | 0.27 | 0.01 | 0.01 | 0.29 | 0.02 a | 0 | 0.29 | 0.04 | 0.01 | 0.32 | 0.17 a | 0 | 0.44 |
doubleLoop | 0.31 | 0.01 | ∞ | ∞ | 0.02 a | 0 | 0.33 | 0.02 a | 0 | 0.33 | 0.02 a | 0 | 0.33 |
selectSort | 4.27 | 0.06 | 0.01 | 4.34 | 0.43 a | 0 | 4.70 | 0.3 | 0.02 | 4.59 | 1.3 a | 0 | 5.57 |
singleLoop | 0.22 | 0.02 | ∞ | ∞ | 0.02 | ∞ | ∞ | 0.03 | ∞ | ∞ | 0.04 a | 0 | 0.26 |
substring | 0.24 | 0.01 | 87.95 | 88.20 | 0.02 | 170.94 | 171.2 | 4.96 a | 0.01 | 5.21 | 5.67 a | 0.01 | 5.92 |
tracerP | 0.11 | 0 a | 0 | 0.11 | 0.01 a | 0 | 0.12 | 0 a | 0 | 0.11 | 0.01 a | 0 | 0.12 |
71
∞
Table 4.2: Time (in seconds) taken for performing software model checking with the MAP system. ‘ ’ means ‘no answer within 20 minutes’. Times marked by ‘a’ are relative to the programs obtained after Step 3 and have no constrained facts (thus, for those programs the times of Step 4 are very small (≤ 0.01 s)).
the branching behaviour of a clause, is related to the technique for preserving characteristic trees while applying abstraction during partial deduction [104]. Indeed, a characteristic tree provides an abstract description of the tree gener- ated by unfolding a given goal, and abstraction corresponds to generalization. However, the partial deduction technique considered in [104] is applied to ordi- nary logic programs (not CLP programs) and constraints such as equations and inequations on finite terms, are only used in an intermediate phase.
In order to get a conservative model of a program, different generalization operators have been introduced in the literature. In particular, in [15] the authors introduce the bounded widen operator c ∇B d, defined for any given constraints c and d and any set B of constraints. This operator, which improves the precision of the widen operator introduced in [30], has been applied in the verification of synchronous programs and linear hybrid systems. A similar operator c ∇B d, called widening up to B, has been introduced in [78]. In this operator the set B of constraints is statically computed once the system to be verified is given. There is also a version of that operator, called interpolated widen, in which the set B is dynamically computed [75] by using the interpolants which are derived during the counterexample analysis.
Similarly to [15, 34, 75, 78], the main objective of the constrained generalization operators introduced in this thesis is the improvement of precision during program specialization. In particular, our generalization operator, similarly to the bounded widen operator, limits the possible generalizations on the basis of a set of constraints defined by the CLP program obtained as output of Step 3. Since this set of constraints which limits the generalization depends on the output of Step 3, our generalization is more flexible than the one presented in [15]. Moreover, our generalization operator is more general than the classical widening operator introduced in [30]. Indeed, we only require that the constraints which have a non-empty intersection with the generalized constraint c ∇ d are entailed by d.
Iterated Program Specialization
We have shown that program specialization can be used not only as a preprocessing step to generate verification conditions, but also as a means of analysis on its own. Indeed, by specializing the CLP program V with respect to the constraints characterizing the input values of Prog, in some cases one can derive a new CLP program S whose least model M(S) can be computed in finite time, because S can be represented by a finite (possibly empty) set of constraints. Thus, in these cases it is possible to verify whether or not Prog is correct with respect to ϕinit and ϕerror by simply inspecting that model.
However, due to the undecidability of partial correctness, it is impossible to devise a specialization technique that always terminates and produces a specialized program whose least model can be finitely computed. Thus, the best one can do is to propose a verification technique based on some heuristics and show that it works well in practice.
In this chapter we present a method, called iterated specialization, which extends the method proposed in Chapter 4 and is based on the repeated application of program specialization. By iterated specialization we produce a sequence of programs of the form S1, S2, S3, . . . Each program specialization step terminates and has the effect of modifying the structure of the CLP program (and consequently the structure of the corresponding set of verification conditions) by explicitly adding new constraints that denote invariants of the computation. The effect of the iterated specialization is the propagation of these constraints from one program version to the next, and since each new iteration starts from the output of the previous one, we can refine the program analysis and possibly increase the level of precision.
The use of iterated specialization avoids the least model construction performed, after program specialization, by the technique presented in Chapter 4.
Iterated specialization terminates at step k, if a lightweight analysis based on a simple inspection of program Sk is able to decide whether or not the given incorrectness triple {ϕinit} Prog {ϕerror} holds. While each transformation step is guaranteed to terminate, due to the undecidability of partial correctness the overall process may not terminate.
In order to validate the heuristics used in our verification method from an experimental point of view, we have used our prototype verification system called VeriMAP (see Chapter 8). We have performed verification tests on a significant set of over 200 programs taken from various publicly available benchmarks. The precision of our system, that is the ratio of the successfully verified programs over the total number of programs, is about 85 percent. We have also compared the results we have obtained using the VeriMAP system with the results we have obtained using other state-of-the-art software model checking systems, such as ARMC [123], HSF(C) [73], and TRACER [85]. These results show that our verification system has a considerably higher precision.
This chapter is organized as follows. In Section 5.1 we describe the overall strategy of iterated specialization, and in Section 5.2 we present some specific strategies for performing the individual specialization steps. In Section 5.3 we report on the experiments we have performed by using our prototype implementation, and we compare our results with the results we have obtained using ARMC, HSF(C), and TRACER. Finally, in Section 5.4 we discuss the related work and, in particular, we compare our approach with other existing methods of software model checking.
5.1 The Verification Method
As an alternative to the construction of the least model and to standard query evaluation strategies, we present a software model checking method based on iterated specialization, which performs a sequence of program specializations, thereby producing a sequence S1, S2, S3, . . . of specialized CLP programs. During the various specializations we may apply different strategies for propagating constraints (either propagating forward from an initial configuration to an error configuration, or propagating backward from an error configuration to an initial configuration) and different operators (such as the widening and convex hull operators) for generalizing predicate definitions.
Iterated specialization has the objective of deriving a new CLP program Si+1 such that: (i) incorrect ∈ M(Si) iff incorrect ∈ M(Si+1), and (ii) Si+1 either contains the fact incorrect or contains no clauses with head incorrect. In the former case the incorrectness triple {ϕinit} Prog {ϕerror} holds and the given imperative program Prog is incorrect with respect to ϕinit and ϕerror, while in the latter case the incorrectness triple does not hold and the given imperative program Prog is correct with respect to ϕinit and ϕerror.
The Software Model Checking Method

Input: An incorrectness triple {ϕinit} Prog {ϕerror} and the CLP program I defining the predicate incorrect.
Output: If Prog is correct with respect to ϕinit and ϕerror then ‘correct’ else ‘incorrect’.

Step 1: T := C2CLP(Prog, ϕinit, ϕerror); P := T ∪ I;
Step 2: V := Specializevcg(P);
Step 3: S := Specializeprop(V);
Step 4: if CorrectnessTest(S) = ‘correct’ then return ‘correct’
        elseif CorrectnessTest(S) = ‘incorrect’ then return ‘incorrect’
        else { V := Reverse(S); goto Step 3 }

Figure 8: The Iterated Verification Strategy
The verification method is outlined in Figure 8. The given incorrectness triple {ϕinit} Prog {ϕerror} is processed by the CLP Translation and Verification Conditions Generation steps presented in Chapter 3.
Then, the Iterated Specialization strategy applies the procedure Specializeprop, which propagates the constraints of the initial configuration. The constraints of the initial configuration can be propagated through the program V obtained after removing the interpreter, by specializing V itself with respect to ϕinit, thereby deriving a new specialized program S.
Next, our Iterated Specialization strategy performs a lightweight analysis, called the CorrectnessTest, to check whether or not incorrect belongs to M (S), that is, whether or not Prog is correct. In particular, CorrectnessTest checks whether S can be transformed into an equivalent program Q where one of the following conditions holds: either (i) the fact incorrect belongs to Q, hence there is a computation leading to an error configuration and the strategy halts reporting ‘incorrect’, or (ii) Q has no constrained facts, hence no computation leads to an error configuration and the strategy halts reporting ‘correct’, or
(iii) Q contains a clause of the form incorrect :- G, where G is not the empty goal (thus, neither (i) nor (ii) holds), and hence the strategy proceeds to the subsequent step.
In that subsequent step our strategy propagates the constraints of the error configuration. This is done by: (i) first applying the Reverse procedure, which, so to say, inverts the flow of computation by interchanging the roles of the initial configuration and the error configuration, and (ii) then specializing (using the procedure Specializeprop again) the ‘reversed’ program with respect to ϕerror .
The strategy iterates the applications of the Reverse and the Specializeprop procedures until hopefully CorrectnessTest succeeds, thereby reporting either ‘correct’ or ‘incorrect’. Obviously, due to the undecidability of partial correctness, the Iterated Specialization strategy may not terminate. However, we will show that each iteration terminates, and hence we can refine the analysis and possibly increase the level of precision by starting each new iteration from the CLP program obtained as output of the previous iteration.
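The control flow of Figure 8 can be summarized by the following skeleton (a sketch only: the three procedures are passed in as parameters, and the iteration bound is added just to make the sketch total, whereas the actual strategy may loop forever):

```python
# Skeleton of the iterated verification loop of Figure 8.
# specialize_prop, correctness_test and reverse are hypothetical stand-ins
# for the Specializeprop, CorrectnessTest and Reverse procedures.
def iterated_verification(v, specialize_prop, correctness_test, reverse,
                          max_iterations=100):
    for _ in range(max_iterations):
        s = specialize_prop(v)          # Step 3: propagate constraints
        verdict = correctness_test(s)   # Step 4: lightweight analysis
        if verdict in ('correct', 'incorrect'):
            return verdict
        v = reverse(s)                  # swap initial/error configurations
    return 'unknown'                    # bound reached (added for the sketch)
```

Each pass through the loop is guaranteed to terminate; only the number of passes is unbounded in the actual strategy.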
5.2 The Iterated Specialization Strategy
In this section we describe the basic components required to realize the Iter- ated Specialization strategy, that is, (i) the Specializeprop procedure, (ii) the CorrectnessTest procedure, and (iii) the Reverse procedure.
5.2.1 Propagation of Constraints
The procedure Specializeprop specializes the CLP program V, obtained after the application of Specializevcg, by propagating the constraints that characterize the initial or the error configuration. (The fact that they characterize either an initial or an error configuration depends on the number of applications of the Reverse procedure.)
Now we describe how the unfolding and generalization steps are performed during Specializeprop.
In order to guide the application of the unfolding rule during the application of Specializeprop, we stipulate that every atom with a new predicate introduced during the previous specialization is unfolded only once at the first step of the Unfolding phase. This choice guarantees the termination of the Unfolding phase (notice that new predicates can depend on themselves), and also avoids an undesirable, excessive increase of the number of clauses in the specialized program. Obviously, more sophisticated strategies for guiding unfolding could be applied. For instance, one may allow unfolding only if it generates at most one clause. This unfolding policy, called determinate unfolding in [65], does not increase the number of clauses and is useful in most cases. However, as we will show in Section 5.3, our simple choice is effective in practice.
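The two policies can be contrasted on a toy clause representation (a sketch: the encoding of programs as (head, body) pairs and all the names below are ours, not the system's):

```python
# Toy clause representation: a program is a list of (head_predicate, body) pairs.
def matching_clauses(program, pred):
    return [c for c in program if c[0] == pred]

# "Unfold once" policy: every new predicate is unfolded exactly once,
# regardless of how many clauses match its atom.
def unfold_once_allowed(already_unfolded, pred):
    return pred not in already_unfolded

# Determinate unfolding [65]: unfold only when exactly one clause matches,
# so the unfolding step cannot increase the number of clauses.
def determinate_unfolding_allowed(program, pred):
    return len(matching_clauses(program, pred)) == 1
```

For example, with clauses p :- q and two clauses for q, the determinate policy allows unfolding an atom p but forbids unfolding an atom q, while the unfold-once policy allows both, once each.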
In order for Specializeprop to be effective, we need to use a generalization strategy that retains as much information as possible, while guaranteeing the termination of the specialization.
Now we will consider four generalization strategies and in Section 5.3 we will compare them with respect to their strength and efficacy for the verification of program properties. These generalization strategies are based on the widening and convex hull generalization operators presented in Chapter 4.
Let us first present two monovariant generalization strategies that during the construction of Defs, for any given atom Q(X) to be folded, introduce a new definition of the form newp(X) :- c(X), Q(X), which is more general than any other definition in Defs with the atom Q(X) in its body. Thus, when using these generalization operators, the definitions in Defs whose body contains the atom Q(X) are linearly ordered with respect to the ‘more general’ relation.
Monovariant Generalization with Widening. This strategy is denoted GenM . Let E be a clause of the form H(X) :- e(X,X1), Q(X1) and Defs be a set of predicate definitions. Then,
(µ1) if in Defs there is no clause whose body atom is (a variant of) Q(X1), then GenM(E, Defs) is defined as the clause newp(X1) :- ep(X1), Q(X1), and
(µ2) if in Defs there is a definition D of the form newq(X1) :- d(X1), Q(X1) and D is the most general such definition in Defs, then GenM (E, Defs) is the clause newp(X1) :- w(X1), Q(X1), where w(X1) is the widening of d(X1) with respect to ep(X1).
Note that at any given time the last definition of the form: newp(X1) :- c(X1), Q(X1) that has been introduced in Defs is the most general one with the atom Q(X1) in its body.
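Case (µ2) can be made concrete on a simplified constraint domain in which every atomic constraint is a bound on a single variable (a hedged sketch: the representation and all names are illustrative, and the thesis operates on arbitrary linear constraints; widen(c, d) keeps the atomic constraints of c that are entailed by d):

```python
# Atomic constraints are single-variable bounds (var, op, k), with op in
# {'>=', '>', '<=', '<'}; a constraint is a list (conjunction) of bounds.
# Simplified model: the actual operator works on full linear constraints.

LOWER, UPPER = {'>=', '>'}, {'<=', '<'}

def implies(b1, b2):
    """Does the single bound b1 imply the single bound b2 (over the reals)?"""
    (v1, op1, k1), (v2, op2, k2) = b1, b2
    if v1 != v2:
        return False
    if op1 in LOWER and op2 in LOWER:     # e.g. X >= 2 implies X > 1
        return k1 > k2 or (k1 == k2 and (op1 == op2 or op1 == '>'))
    if op1 in UPPER and op2 in UPPER:     # e.g. X < 1 implies X <= 1
        return k1 < k2 or (k1 == k2 and (op1 == op2 or op1 == '<'))
    return False

def entails(d, bound):
    """Does the conjunction d entail the atomic constraint bound?"""
    return any(implies(b, bound) for b in d)

def widen(c, d):
    """Widening of c w.r.t. d: the atomic constraints of c entailed by d."""
    return [b for b in c if entails(d, b)]
```

On the constraints of Example 6 below (the body of clause 36 against the projection of the body of clause 37), this widening yields X≥1, Y≥1, N>0, matching the body of definition 39.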
Monovariant Generalization with Widening and Convex Hull. This strategy, denoted GenMH, alternates the computation of convex hull and widening. Indeed, in Case (µ2) above, if D has been derived by projection or widening, then GenMH(E, Defs) is newp(X1) :- ch(X1), Q(X1), where ch(X1) is the convex hull of d(X1) and ep(X1). In all other cases the function GenMH(E, Defs) is defined like the function GenM(E, Defs).
Other generalization strategies can be defined by computing any fixed number of consecutive convex hulls before applying widening.
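The convex hull that GenMH alternates with widening can be sketched on the interval abstraction, where the hull of two constraints is taken componentwise (illustrative only: the actual operator computes the convex hull of two polyhedra):

```python
from math import inf

# Interval abstraction: a constraint maps each variable to an interval (lo, hi).
# Illustrative sketch; the thesis computes convex hulls of general polyhedra.
def convex_hull(c, d):
    """Smallest interval constraint containing both c and d."""
    hull = {}
    for v in set(c) | set(d):
        lo_c, hi_c = c.get(v, (-inf, inf))
        lo_d, hi_d = d.get(v, (-inf, inf))
        hull[v] = (min(lo_c, lo_d), max(hi_c, hi_d))
    return hull
```

For instance, the hull of X=1, Y=1 and X=2, Y=3 is 1≤X≤2, 1≤Y≤3, which retains strictly more information than immediately widening would.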
Now we present two polyvariant generalization strategies, which may introduce several distinct, specialized definitions (with different constraints) for each atom. Polyvariant strategies allow, in principle, more precision with respect to monovariant operators, but in some cases they could also cause the introduction of too many new predicates, and hence an increase of both the size of the specialized program and the time needed for verification. We will consider this issue in Section 5.3 when we discuss the outcome of our experiments.
When we use a polyvariant generalization strategy, for any given atom Q(X), the definitions in Defs whose body contains Q(X) are not necessarily linearly ordered with respect to the ‘more general’ relation. However, the definitions in the same path of the definition tree Defs are linearly ordered, and the last definition introduced is more general than all its ancestors.
Polyvariant Generalization with Widening. This strategy is denoted GenP . Let E be a clause of the form H(X) :- e(X,X1), Q(X1) and Defs be a tree of predicate definitions. Suppose that E has been derived by unfolding a definition C. Then, (π1) if in Defs there is no ancestor of C whose body atom is of the form Q(X1), then GenP (E, Defs) is the clause newp(X1) :- ep(X1), Q(X1), and
(π2) if C has a most recent ancestor D in Defs of the form: newq(X1) :- d(X1), Q(X1), then GenP (E, Defs) is the clause newp(X1) :- w(X1), Q(X1), where w(X1) is the widening of d(X1) with respect to ep(X1).
Polyvariant Generalization with Widening and Convex Hull. This strategy, denoted GenPH, is defined as GenP, except that it alternates the computation of convex hull and widening. Formally, Case (π2) above is modified as follows: (π2PH) if C has a most recent ancestor D in Defs of the form newq(X1) :- d(X1), Q(X1) that has been derived by projection or widening, then GenPH(E, Defs) is the clause newp(X1) :- ch(X1), Q(X1), where ch(X1) is the convex hull of d(X1) and ep(X1); otherwise GenPH(E, Defs) is the clause newp(X1) :- w(X1), Q(X1), where w(X1) is the widening of d(X1) with respect to ep(X1).
Now we show that all four strategies guarantee the soundness and termination of Specializeprop.
Proposition 2. The strategies GenM, GenMH, GenP, and GenPH are generalization strategies.
Proof. By Definition 8, we have to show that, for each strategy Gen in {GenM, GenMH, GenP, GenPH}, the following two properties hold.

Property (P1): for every clause E of the form H(X) :- e(X,X1), Q(X1), and for every clause newp(X1) :- g(X1), Q(X1) obtained by applying the strategy Gen to E and some set Defs of definitions, we have that e(X,X1) ⊑ g(X1), and
Property (P2): for every infinite sequence E1, E2, . . . of clauses, for every infinite sequence G1, G2, . . . of clauses constructed as follows: (1) G1 = Gen(E1, ∅), and (2) for every i > 0, Gi+1 = Gen(Ei+1, {G1, . . . , Gi}), there exists an index k such that, for every i > k, Gi is equal, modulo the head predicate name, to a clause in {G1, . . . , Gk}.
(GenM is a generalization strategy)

– Let us prove that Property (P1) holds for GenM. If g(X1) is ep(X1) (see case (µ1) above), then e(X,X1) ⊑ ep(X1) because ep(X1) is the projection of e(X,X1) onto X1. If g(X1) is w(X1) (see case (µ2) above), then e(X,X1) ⊑ w(X1) because w(X1) is the widening of d(X1) with respect to ep(X1), and hence ep(X1) ⊑ w(X1). Thus, e(X,X1) ⊑ g(X1).
– Let us prove that Property (P2) holds for GenM. This property is a straightforward consequence of the following two facts:
(i) each new definition introduced by GenM is a clause of the form: newp(X1) :- g(X1),Q(X1) , where Q(X1) is a function-free atom whose predicate symbol occurs in the input program I (recall that program I is obtained by Specializevcg, and this strategy removes, by folding, all function symbols occurring in the atoms of its input program); hence case (µ1) can occur a finite number of times only, and
(ii) if g(X1) is the widening of d(X1) with respect to ep(X1) and g(X1) is different from d(X1), then the set of atomic constraints of g(X1) is a proper subset of the atomic constraints of d(X1), and hence case (µ2) will eventually generate new predicate definitions whose body is equal to the body of previously generated definitions.
(GenMH is a generalization strategy) The proof is similar to that for GenM .
– In order to prove that Property (P1) holds for GenMH, we use the fact that e(X,X1) ⊑ ch(X1). Indeed, e(X,X1) ⊑ ep(X1) (because ep(X1) is the projection of e(X,X1) onto X1) and ep(X1) ⊑ ch(X1) (because ch(X1) is the convex hull of d(X1) and ep(X1)).
– In order to show that Property (P2) holds for GenMH, it is enough to note that Property (P2) is preserved if one interleaves the projection and widening operators with an application of the convex hull operator, and hence the proof already done for GenM readily extends to GenMH.
(GenP and GenPH are generalization strategies)
The proof is a straightforward extension of the proof for GenM and GenMH, respectively. In particular, in order to show that Property (P2) holds for GenP and GenPH, it suffices to use the following fact. Suppose that G1, G2, . . . is an infinite sequence of predicate definitions. Let T be an infinite tree of definitions such that:
(i) if G occurs in G1, G2, . . . , then G occurs in T ,
(ii) if A is an ancestor of B in T , then A precedes B in G1, G2, . . . (that is,
G1, G2, . . . is a linear order consistent with the ancestor relation in T ),
(iii) T is finitely branching, and
(iv) every branch in T stabilizes. Then G1, G2, . . . stabilizes.
Different strategies can be adopted for applying the folding rule during the Folding phase. These folding strategies depend on the generalization strategy that is used for introducing new definitions. In the case where we use a monovariant strategy (either GenM or GenMH), we fold every clause of the form H(X) :- e(X,X1), Q(X1) using the most general definition of the form newq(X1) :- g(X1), Q(X1) occurring in the set Defs obtained at the end of the execution of the while-loop (α). We call this strategy the most general folding strategy. In the case where we use a polyvariant strategy (that is, GenP or GenPH), we fold every clause E using the definition computed by applying the generalization strategy to E. We call this strategy the immediate folding strategy.
Example 6 (Propagation of the constraints of the initial configuration). Let us consider the program in Example 2. We perform our second program specialization starting from the CLP program V we have derived by applying the Verification Conditions Generation step in Example 4 on page 47. This second specialization propagates the constraint ‘X=0, Y=0’ characterizing the initial configuration, which occurs in clause 32.
We apply the Specializeprop strategy with the GenP generalization operator. We start off by executing the while-loop (α) that repeats the Unfolding, Clause Removal, and Definition Introduction phases.
First execution of the body of the while-loop (α).
Unfolding. We unfold clause 32 with respect to the atom new1(X,Y,N) and we get:

35. incorrect :- X1=1, Y1=1, N>0, new1(X1,Y1,N).
(Note that new1(X,Y,N) is also unifiable with the head of clause 34, but the constraint ‘X=0, Y=0, X≥N, X>Y’ is unsatisfiable.) No Clause Removal can be applied.
Definition Introduction. Clause 35 cannot be folded, and hence we define the following new predicate:
36. new2(X,Y,N) :- X=1, Y=1, N>0, new1(X,Y,N).
Thus, the set Defs of new predicate definitions consists of clause 36 only.
Second execution of the body of the while-loop (α).
Unfolding. Now we unfold the last definition which has been introduced (clause 36) and we get:
37. new2(X,Y,N) :- X=1, Y=1, X1=2, Y1=3, N>1, new1(X1,Y1,N).
Definition Introduction. Clause 37 cannot be folded by using any definition in Defs. Thus, we apply the generalization operator GenP based on widening. This operator matches the constraint appearing in the body of clause 37 against the constraint appearing in the body of clause 36, which is the only clause in Defs. First, the constraint in clause 36 is rewritten as a conjunction c of inequalities: X≥1, X≤1, Y≥1, Y≤1, N>0. Then the variables of clause 37 are renamed so that the atom in its body is identical to the atom in the body of clause 36, as follows:
38. new2(Xr,Yr,Nr) :- Xr=1, Yr=1, X=2, Y=3, N>1, new1(X,Y,N).
Then the generalization operator GenP computes the projection of the constraint appearing in clause 38 onto the variables of the atom new1(X,Y,N), which is the constraint d: X=2, Y=3, N>1. The widening of c with respect to d is the constraint ‘X≥1, Y≥1, N>0’, obtained by taking the atomic constraints of c that are entailed by d. Thus, GenP introduces the following new predicate definition:
39. new3(X,Y,N) :- X≥1, Y≥1, N>0, new1(X,Y,N).
which is added to Defs.
Third execution of the body of the while-loop (α).
Unfolding. We unfold clause 39 and we get:
40. new3(X,Y,N) :- X≥1, Y≥1, X<N, X1=X+1, Y1=X1+Y, new1(X1,Y1,N).
41. new3(X,Y,N) :- X≥N, X>Y, Y≥1, N>0.
Clause 40 can be folded using clause 39 in Defs. Thus, the while-loop (α) terminates without introducing any new definition.
Folding. Now we fold clause 35 using definition 36 and clauses 37 and 40 using definition 39. We get the following final program S:
42. incorrect :- N>0, X1=1, Y1=1, new2(X1,Y1,N).
43. new2(X,Y,N) :- X=1, Y=1, N>1, X1=2, Y1=3, new3(X1,Y1,N).
44. new3(X,Y,N) :- X≥1, Y≥1, X<N, X1=X+1, Y1=X1+Y, new3(X1,Y1,N).
45. new3(X,Y,N) :- X≥N, X>Y, Y≥1, N>0.
The application of Specializeprop has propagated the constraints defining the initial configuration. For instance, the constrained fact (clause 45) has the extra constraint ‘Y≥1, N>0’ with respect to the constrained fact in program V, which has only the constraint ‘X≥N, X>Y’ (see clause 34 on page 50). However, in this example the presence of a constrained fact does not allow us to conclude that the least model of S is empty, and hence at this point we are not able to show the correctness of our program sum.
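Although the analysis cannot conclude anything at this point, the specialized clauses 42–45 can be executed as a transition system for concrete values of N, to gain empirical confidence (a bounded simulation in Python; this is a finite test of our own, not a proof):

```python
def error_reachable(n, max_steps=10000):
    """Simulate clauses 42-45 for a concrete value N = n."""
    # Clauses 42-43: the only reachable initial state is X=2, Y=3, when N > 1.
    if not n > 1:
        return False
    x, y = 2, 3
    for _ in range(max_steps):
        if x >= n and x > y and y >= 1 and n > 0:   # constrained fact 45
            return True
        if x >= 1 and y >= 1 and x < n:             # loop clause 44
            x, y = x + 1, x + 1 + y
        else:
            return False                            # no clause applies
    return False

# No error state is ever found: Y stays strictly greater than X.
print(any(error_reachable(n) for n in range(-5, 60)))   # → False
```

The simulation suggests (but, unlike the subsequent specialization steps, does not establish) that the constrained fact 45 is unreachable.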
5.2.2 Lightweight Correctness Analysis
The procedure CorrectnessTest (see Figure 9) analyzes the CLP program S and tries to determine whether or not incorrect belongs to M(S).
The analysis performed by CorrectnessTest is lightweight in the sense that, unlike the Iterated Specialization strategy, it always terminates, possibly returning the answer ‘unknown’.
Note that the output S of Specializeprop is a linear program. The CorrectnessTest procedure transforms S into a new, linear CLP program Q by using two auxiliary functions: (1) the UnfoldCfacts function, which takes a linear CLP program Q1 and replaces as long as possible a clause C in Q1 by Unf(C, A), whenever A is defined by a set of constrained facts, and (2) the Remove function, which takes a linear CLP program Q2 and removes every clause C that satisfies one of the following two conditions: either (i) the head predicate of C is useless in Q2, or (ii) C is subsumed by a clause in Q2 distinct from C.
The CorrectnessTest procedure iterates the application of the function UnfoldCfacts followed by Remove until a fixpoint, say Q, is reached.
Now we prove that CorrectnessTest constructs program Q in a finite number of steps. Moreover, Q is equivalent to S with respect to the least model semantics, and hence incorrect ∈ M (S) iff incorrect ∈ M (Q).
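The fixpoint iteration and the final three-way verdict can be sketched as follows (a hedged skeleton: here a program is represented as a list of (head, body) clauses, a fact being a clause with an empty body, and UnfoldCfacts and Remove are passed in as parameters):

```python
def correctness_test(s, unfold_cfacts, remove):
    """Fixpoint of Remove . UnfoldCfacts, then a three-way verdict (Figure 9)."""
    q = s
    while True:
        q1 = remove(unfold_cfacts(q))
        if q1 == q:                  # fixpoint reached
            break
        q = q1
    if any(head == 'incorrect' and not body for (head, body) in q):
        return 'incorrect'           # a fact for incorrect: error reachable
    if all(head != 'incorrect' for (head, body) in q):
        return 'correct'             # no clause for incorrect: error unreachable
    return 'unknown'                 # only incorrect :- G with G non-empty
```

The undecidability of the problem shows up only in the third branch: the procedure itself always terminates.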
Theorem 5 (Termination and Correctness of CorrectnessTest). Let S be a linear CLP program defining the predicate incorrect. Then, (i) CorrectnessTest terminates for the input program S, and (ii.1) if CorrectnessTest(S) = ‘correct’ then incorrect ∉ M(S), and (ii.2) if CorrectnessTest(S) = ‘incorrect’ then incorrect ∈ M(S).
Input: A linear CLP program S defining the predicate incorrect.
Output: Either ‘correct’ (implying incorrect ∉ M(S)), or ‘incorrect’ (implying incorrect ∈ M(S)), or ‘unknown’.

Q := S;
while Q ≠ Remove(UnfoldCfacts(Q)) do
  Q := Remove(UnfoldCfacts(Q));
end-while;
if incorrect has a fact in Q then return ‘incorrect’
elseif no clause in Q has predicate incorrect then return ‘correct’
else return ‘unknown’

Figure 9: The CorrectnessTest Procedure.

Proof. (i) The procedure CorrectnessTest constructs a fixpoint of the function λQ.Remove(UnfoldCfacts(Q)) in a finite number of steps, as we now show. For a CLP program P, let pn(P) denote the number of distinct predicate symbols that occur in the body of a clause in P, and let cn(P) denote the number of clauses in P. Then, the following facts hold:
(1) either P = UnfoldCfacts(P ) or pn(P ) > pn(UnfoldCfacts(P )),
(2) pn(P) ≥ pn(Remove(P)),
(3) either P = Remove(P ) or cn(P ) > cn(Remove(P )).
Thus, we have that ⟨pn(P), cn(P)⟩ ≥lex ⟨pn(Remove(UnfoldCfacts(P))), cn(Remove(UnfoldCfacts(P)))⟩, where ≥lex is the lexicographic ordering on pairs of integers.
Since >lex is well-founded, CorrectnessTest eventually gets a program Q such that ⟨pn(Q), cn(Q)⟩ = ⟨pn(Remove(UnfoldCfacts(Q))), cn(Remove(UnfoldCfacts(Q)))⟩, and hence, by (1) and (3), Q = Remove(UnfoldCfacts(Q)).
(ii.1) No application of the folding rule is performed by CorrectnessTest, and hence the condition for the correctness of the transformation rules of Theorem 1 is trivially satisfied. Thus, incorrect ∈ M(S) iff incorrect ∈ M(Q). If CorrectnessTest(S) = ‘correct’, then no clause in Q has predicate incorrect, and then incorrect ∉ M(Q). Hence, incorrect ∉ M(S).
(ii.2) If CorrectnessTest(S) = ‘incorrect’, then a fact in Q has predicate incorrect, and incorrect ∈ M(Q). Since at Point (ii.1) we have shown that incorrect ∈ M(S) iff incorrect ∈ M(Q), we conclude that incorrect ∈ M(S).
In our running example, program S is the CLP program obtained by applying the procedure Specializeprop. Now program S consists of clauses 42–45 and therefore we are not able to determine whether or not the atom incorrect is a consequence of S. Indeed, (i) in program S no predicate is defined by constrained facts only, and hence UnfoldCfacts has no effect, (ii) in S no predicate is useless and no clause is subsumed by any other, and hence also Remove leaves S unchanged, and (iii) in S there is a clause for incorrect which is not a fact.
5.2.3 The Reverse Transformation
The Reverse procedure implements a transformation that reverses the flow of computation: the top-down evaluation (that is, the evaluation from the head to the body of a clause) of the transformed program corresponds to the bottom-up evaluation (that is, the evaluation from the body to the head) of the given program. In particular, if the Reverse procedure is applied to a program that checks the reachability of the error configurations by exploring the transitions in a forward way starting from the initial configurations, then the reversed program checks the reachability of the initial configurations by exploring the transitions in a backward way starting from the error configurations. Symmetrically, from a program that checks reachability by a backward exploration of the transitions, Reverse derives a program that checks reachability by a forward exploration of the transitions.
Let us consider a linear CLP program S of the form:
incorrect :- a1(X), p1(X).
· · ·
incorrect :- ak(X), pk(X).
q1(X) :- t1(X,X1), r1(X1).
· · ·
qm(X) :- tm(X,X1), rm(X1).
s1(X) :- b1(X).
· · ·
sn(X) :- bn(X).

where: (i) a1(X), . . . , ak(X), t1(X,X1), . . . , tm(X,X1), b1(X), . . . , bn(X) are constraints, and (ii) the pi's, qi's, ri's, and si's are possibly non-distinct predicate symbols.
The Reverse procedure transforms program S in two steps as follows.
Step 1. Program S is transformed into a program T of the following form (round parentheses make a single argument out of a tuple of arguments):
t1. incorrect :- a(U), r1(U).
t2. r1(U) :- trans(U,V), r1(V).
t3. r1(U) :- b(U).
a((p1,X)) :- a1(X).
· · ·
a((pk,X)) :- ak(X).
trans((q1,X),(r1,X1)) :- t1(X,X1).
· · ·
trans((qm,X),(rm,X1)) :- tm(X,X1).
b((s1,X)) :- b1(X).
· · ·
b((sn,X)) :- bn(X).
Step 2. Program T is transformed into a program R by replacing the first three clauses t1–t3 of T with the following ones:
r1. incorrect :- b(U), r2(U).
r2. r2(V) :- trans(U,V), r2(U).
r3. r2(U) :- a(U).
The correctness of the transformation of S into R is shown by the following result.
Theorem 6 (Soundness of the Reverse procedure). Let R be the program derived from program S by the Reverse procedure. Then incorrect ∈M (R) iff incorrect ∈M (S).
Proof. (Step 1.) By unfolding clauses t1, t2, and t3 of program T with respect to a(U), trans(U,V), and b(U), respectively, we get the following CLP program T1:
incorrect :- a1(X), r1((p1,X)).
· · ·
incorrect :- ak(X), r1((pk,X)).
r1((q1,X)) :- t1(X,X1), r1((r1,X1)).
· · ·
r1((qm,X)) :- tm(X,X1), r1((rm,X1)).
r1((s1,X)) :- b1(X).
· · ·
r1((sn,X)) :- bn(X).
By the correctness of the unfolding rule (see Theorem 1), we get that incorrect ∈ M(T) iff incorrect ∈ M(T1).
Then, by rewriting all atoms in T1 of the form r1((pred,Z)) into pred(Z), we get back S. (The occurrences of predicate symbols in the arguments of a, trans, and b should be considered as individual constants.) The correctness of this rewriting is straightforward, as it is based on the syntactic isomorphism between S and T1. A formal proof of correctness can be made by observing that the above rewriting can also be realized by introducing predicate definitions of the form pred(Z) :- r1((pred,Z)), and applying the unfolding and folding rules. Thus, incorrect ∈ M(T1) iff incorrect ∈ M(S).
(Step 2.) The transformation of T into R (and the opposite transformation) can be viewed as a special case of the grammar-related transformation studied in [23]. We refer to that paper for a proof of correctness. Thus, we have that incorrect ∈ M(T) iff incorrect ∈ M(R), and we get the thesis.
The predicates a, trans, and b are assumed to be unfoldable in the subsequent application of the Specializeprop procedure.
Example 7 (Propagation of the constraints of the error configuration). Now let us continue our running example. The program S of Example 6 derived by the Specializeprop procedure can be transformed into a program T of the form t1–t3, where the predicates a, trans, and b are defined as follows:
45. a((new2,X1,Y1,N)) :- N>0, X1=1, Y1=1.
46. trans((new2,X,Y,N),(new3,X1,Y1,N)) :- X=1, Y=1, N>1, X1=2, Y1=3.
47. trans((new3,X,Y,N),(new3,X1,Y1,N)) :- X≥1, Y≥1, X<N, X1=X+1, Y1=X1+Y.
48. b((new3,X,Y,N)) :- Y≥1, N>0, X≥N, X>Y.
Then, the reversed program R is as follows.
49. incorrect :- b(U), r2(U).
50. r2(V) :- trans(U,V), r2(U).
51. r2(U) :- a(U).
together with clauses 45–48 above.
The idea behind program reversal is best understood by considering the reachability relation in the (possibly infinite) transition graph whose transitions are defined by the (instances of) clauses 46 and 47. Program S checks the reachability of a configuration c2 satisfying b(U) from a configuration c1 satisfying a(U), by moving forward from c1 to c2. Program R checks the reachability of c2 from c1, by moving backward from c2 to c1. Thus, in the case where a(U) and b(U) are predicates that characterize the initial and final configurations, respectively, by the reversal transformation we derive a program that checks the reachability of an error configuration starting from an initial configuration by moving backward from the error configuration. In particular, in the body of the clause for incorrect in R the constraint b(U) contains, among others, the constraint X>Y characterizing the error configuration (see clause 48) and, by specializing R, we will propagate the constraint of the error configuration.
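The forward/backward duality can be checked concretely on any explicit finite transition graph: reversing the program corresponds to transposing the edge relation and swapping the roles of the initial and error configurations. A minimal Python sketch of this equivalence (the graph and its configuration labels are illustrative, not taken from the running example):

```python
# Forward vs. backward reachability on an explicit finite transition graph.
# Reversing the program transposes the edge relation and swaps the roles of
# the initial configurations (a) and the error configurations (b).

def reachable(edges, sources, targets):
    """Return True if some target configuration is reachable from some source."""
    seen, stack = set(sources), list(sources)
    while stack:
        c = stack.pop()
        if c in targets:
            return True
        for (u, v) in edges:
            if u == c and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

# An illustrative transition graph (configurations are just labels here).
edges = [("c0", "c1"), ("c1", "c2"), ("c1", "c3")]
init, err = {"c0"}, {"c3"}

forward = reachable(edges, init, err)            # program S: move forward
transposed = [(v, u) for (u, v) in edges]
backward = reachable(transposed, err, init)      # program R: move backward
assert forward == backward
```

The assertion holds for every graph, which is the content of Theorem 6 restricted to this finite setting.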
Now let us specialize the CLP program R consisting of clauses 49–51 and 45–48 by applying again Specializeprop. We start off from the first while-loop (α) of that procedure.
First execution of the body of the while-loop (α).
Unfolding. We unfold the clause for incorrect (clause 49) with respect to
the leftmost atom b(U) and we get:
52. incorrect :- Y≥1, N>0, X≥N, X>Y, r2((new3,X,Y,N)).
Definition Introduction. In order to fold clause 52 we introduce the definition:
53. new4(X,Y,N) :- Y≥1, N>0, X≥N, X>Y, r2((new3,X,Y,N)).
Second execution of the body of the while-loop (α).
Unfolding. Then we unfold clause 53 and we get:
54. new4(X,Y,N) :- Y≥1, N>0, X≥N, X>Y, a((new3,X,Y,N)).
55. new4(X1,Y1,N) :- Y1≥1, N>0, X1≥N, X1>Y1,
    trans(U,(new3,X1,Y1,N)), r2(U).
By unfolding, clause 54 is deleted because the head of clause 45 is not unifiable with a((new3,X,Y,N)).
By unfolding clause 55 with respect to trans(U,(new3,X1,Y1,N)), also this clause is deleted because unsatisfiable constraints are derived. Thus, no new definition is introduced, and the while-loop (α) terminates.
Folding. By folding clause 52 using definition 53 we get:
56. incorrect :- Y≥1, N>0, X≥N, X>Y, new4(X,Y,N).
and the final, specialized CLP program S consists of clause 56 only.
Now we apply the CorrectnessTest procedure to program S, which detects that incorrect is a useless predicate. Then, the Iterated Specialization terminates by reporting the correctness of the given imperative program sum.
Thus, in this example we have seen that by iterating the specializations which propagate the constraints occurring in the initial configuration and in the error configuration we have been able to show the correctness of the given program.
It can be shown that, if we perform our specializations (using the same unfolding and generalization procedures) by taking into account only the constraints of the initial configuration or only the constraints of the error configuration, it is not possible to prove program correctness in our example. Thus, as we advocate here, if we perform a sequence of program specializations, we may gain extra power when we have to prove program properties. This is confirmed by the
experiments we have performed on various examples taken from the literature. We will report on those experiments in Section 6.4.
5.2.4 Soundness of the Iterated Specialization Strategy
Finally, we get the following soundness result for the Iterated Specialization method.
Theorem 7 (Soundness of the Iterated Verification method). Let P be the CLP encoding of the incorrectness problem for {ϕinit} Prog {ϕerror}. If the Iterated Specialization strategy terminates for the input program P and returns ‘correct’, then Prog is correct with respect to ϕinit and ϕerror. If the strategy terminates and returns ‘incorrect’, then Prog is incorrect with respect to ϕinit and ϕerror.
Proof. The Iterated Specialization method terminates for the input program P and returns ‘correct’ (respectively, ‘incorrect’) if and only if there exists n such that:
V := Specializevcg(P); S1 := Specializeprop(V);
R2 := Reverse(S1); S2 := Specializeprop(R2);
· · ·
Rn := Reverse(Sn−1); Sn := Specializeprop(Rn);
and CorrectnessTest(Sn) = ‘correct’ (respectively, CorrectnessTest(Sn) = ‘incorrect’). Then (by Theorem 5),
incorrect ∉ M(Sn) (respectively, incorrect ∈ M(Sn)) if and only if (by Theorems 4 and 6 and Proposition 2)
incorrect ∉ M(P) (respectively, incorrect ∈ M(P)) if and only if (by Theorem 2)
Prog is correct (respectively, incorrect) with respect to ϕinit and ϕerror .
5.3 Experimental Evaluation
We have performed an experimental evaluation of our software model checking method on several benchmark programs taken from the literature. The results of our experiments show that our approach is competitive with state-of-the-art software model checkers.
The benchmark set used in our experiments consists of 216 verification problems of C programs (179 of which are correct, and the remaining 37 are incorrect). Most problems have been taken from the benchmark sets of other tools used in software model checking, like DAGGER [75] (21 problems), TRACER [85] (66 problems) and InvGen [76] (68 problems), and from the TACAS 2013 Software Verification Competition [12] (52 problems). The size of the input programs ranges from a dozen to about five hundred lines of code.
We have realized the VeriMAP software model checker [138] that implements our verification method (see Chapter 8). Our software model checker has been configured to execute the following program transformation:
Specializevcg; Specializeprop; CorrectnessTest; (Reverse; Specializeprop; CorrectnessTest)∗
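The transformation sequence above can be read as a driver loop: specialize once to remove the interpreter, then alternate reversal, propagation, and the correctness test until a verdict is reached. The following Python sketch is purely schematic; the procedure names are placeholders for the actual VeriMAP components, and the toy instantiation at the end is ours:

```python
# Schematic driver for the Iterated Specialization strategy:
# Specializevcg; Specializeprop; CorrectnessTest; (Reverse; Specializeprop; CorrectnessTest)*

def iterated_specialization(P, specialize_vcg, specialize_prop,
                            correctness_test, reverse, max_iterations=10):
    S = specialize_prop(specialize_vcg(P))   # remove interpreter, then propagate
    for _ in range(max_iterations):
        verdict = correctness_test(S)        # 'correct', 'incorrect', or 'unknown'
        if verdict != "unknown":
            return verdict
        S = specialize_prop(reverse(S))      # swap propagation direction and retry
    return "unknown"                         # the strategy may not terminate in general

# Toy instantiation: each pass "refines" an integer state; the test oracle
# succeeds once enough constraint propagation has been performed.
verdict = iterated_specialization(
    0,
    specialize_vcg=lambda p: p,
    specialize_prop=lambda p: p + 1,
    correctness_test=lambda p: "correct" if p >= 3 else "unknown",
    reverse=lambda p: p)
print(verdict)   # correct
```

The `max_iterations` bound stands in for the external timeout used in the experiments: the underlying verification problem is undecidable, so the loop cannot be guaranteed to reach a verdict.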
It executes: (i) a first program specialization consisting of a single application of the Specializevcg procedure that performs the removal of the interpreter, and (ii) a sequence of applications of the Specializeprop procedure (from now on called iterations) that performs the propagation of the constraints of the initial and the error configurations. After the removal of the interpreter, the first application of Specializeprop propagates the constraints of the initial configuration. This corresponds to a forward propagation along the graph of configurations associated with the reachability relation tr (see Section 3.2), while the propagation of the constraints of the error configuration corresponds to a backward propagation along the graph of configurations. The Specializeprop procedure has been executed by using the four generalization operators presented in Section 5.1:
(i) GenM, that is, a monovariant generalization with widening only, (ii) GenMH, that is, a monovariant generalization with widening and convex hull, (iii) GenP, that is, a polyvariant generalization with widening only, and (iv) GenPH, that is, a polyvariant generalization with widening and convex hull.
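The widening and convex hull operators that distinguish these four strategies can be illustrated on interval constraints, the simplest instance of the linear constraints handled during generalization. The sketch below is only illustrative: the function names are ours, and the actual operators work on general linear constraints, not intervals:

```python
# Interval versions of the convex hull and widening operators used during
# generalization. Convex hull keeps precision; widening trades precision
# for termination by extrapolating unstable bounds to infinity.

INF = float("inf")

def hull(a, b):
    """Convex hull of two intervals: the smallest interval containing both."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Classical interval widening: drop each bound of a that b does not stabilize."""
    lo = a[0] if a[0] <= b[0] else -INF
    hi = a[1] if a[1] >= b[1] else INF
    return (lo, hi)

# Constraints on a variable X at the same program point on two iterations:
d1 = (1, 1)      # X = 1
d2 = (1, 2)      # 1 <= X <= 2
print(hull(d1, d2))    # (1, 2): precise, but iterating hulls may not terminate
print(widen(d1, d2))   # (1, inf): the upper bound grew, so it is extrapolated
```

Combining the two, as GenMH and GenPH do, means hulling candidate definitions first and widening only when the sequence of constraints keeps growing.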
We have also tested the following three state-of-the-art CLP-based software model checkers for C programs: ARMC [123], HSF(C) [73], and TRACER [85]. ARMC and HSF(C) are based on the Counter-Example Guided Abstraction Refinement technique (CEGAR) [29, 90, 128], while TRACER uses a technique based on approximated preconditions and approximated postconditions. We have compared the performance of those software model checkers with the performance of VeriMAP on our benchmark programs.
All experiments have been performed on an Intel Core Duo E7300 2.66GHz processor with 4GB of memory under the GNU Linux operating system Ubuntu 12.10 (64 bit) (kernel version 3.2.0-27). A timeout limit of five minutes has been set for all model checkers.
In Table 5.1 we summarize the verification results obtained by the four software verification tools we have considered. In the column labelled by VeriMAP (GenPH) we have reported the results obtained by using the VeriMAP system that implements our Iterated Specialization method with the generalization operator GenPH. In the remaining columns we have reported the results obtained, respectively, by ARMC, HSF(C), and TRACER using the strongest postcondition (SPost) and the weakest precondition (WPre) options.
Line 1 reports the total number of correct answers, of which those for correct problems and incorrect problems are indicated in lines 2 and 3, respectively. Line 4 reports the number of verification tasks that ended with an incorrect answer. These verification tasks refer to correct programs that have been proved incorrect (false alarms, at line 5), and incorrect programs that have been proved correct (missed bugs, at line 6). Line 7 reports the number of verification tasks that aborted due to some errors (originating from inability of parsing or insufficient memory). Line 8 reports the number of verification tasks that did not provide an answer within the timeout limit of five minutes.
The total score obtained by each tool using the score function of the TACAS 2013 Software Verification Competition [12], is reported at line 9 and will be used in Figure 10. The score function assigns to every program p the integer score(p) determined as follows: (i) 2, if p is correct and has been correctly verified, (ii) 1, if p is incorrect and has been correctly verified, (iii) −4, if a false alarm has been generated, and (iv) −8, if a bug has been missed. Programs that caused errors or timed out do not contribute to the total score.
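The score function just described can be transcribed directly. In the Python sketch below the string encoding of verdicts is ours; the final check reproduces the ARMC entry of line 9 of Table 5.1 from its false-alarm and missed-bug counts:

```python
# Score function of the TACAS 2013 Software Verification Competition, as
# described above: correct verdicts earn points, wrong verdicts pay a larger
# penalty, and errors and timeouts contribute nothing.

def score(is_correct_program, answer):
    """answer is one of 'correct', 'incorrect', 'error', 'timeout'."""
    if answer in ("error", "timeout"):
        return 0
    if answer == ("correct" if is_correct_program else "incorrect"):
        # right verdict: 2 points for proving correctness, 1 for finding a bug
        return 2 if is_correct_program else 1
    # wrong verdict: -4 for a false alarm, -8 for a missed bug
    return -4 if is_correct_program else -8

# ARMC row of Table 5.1: 112 proofs, 26 bugs found, 8 false alarms, 1 missed bug
total = 112 * 2 + 26 * 1 + 8 * (-4) + 1 * (-8)
print(total)   # 210, matching line 9 of Table 5.1
```

The asymmetry of the penalties (−8 versus −4) reflects the competition's view that a missed bug is worse than a false alarm.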
At line 9 we have indicated between round parentheses the negative component of the score due to false alarms and missed bugs. Line 10 reports the total CPU time, in seconds, taken to run the whole set of verification tasks: it includes the time taken to produce (correct or incorrect) answers and the time spent on tasks that timed out (we did not include the negligible time taken for tasks that aborted due to errors). Line 11 reports the average time needed to produce a correct answer, which is obtained by dividing the total time (line 10) by the number of correct answers (line 1).

|    |                     | VeriMAP (GenPH) | ARMC     | HSF(C)   | TRACER (SPost) | TRACER (WPre) |
| 1  | correct answers     | 185             | 138      | 159      | 91             | 103           |
| 2  | correct problems    | 154             | 112      | 137      | 74             | 85            |
| 3  | incorrect problems  | 31              | 26       | 22       | 17             | 18            |
| 4  | incorrect answers   | 0               | 9        | 5        | 13             | 14            |
| 5  | false alarms        | 0               | 8        | 3        | 13             | 14            |
| 6  | missed bugs         | 0               | 1        | 2        | 0              | 0             |
| 7  | errors              | 0               | 18       | 0        | 20             | 22            |
| 8  | timed-out problems  | 31              | 51       | 52       | 92             | 77            |
| 9  | total score         | 339 (0)         | 210 (−40)| 268 (−28)| 113 (−52)      | 132 (−56)     |
| 10 | total time          | 10717.34        | 15788.21 | 15770.33 | 27757.46       | 23259.19      |
| 11 | average time        | 57.93           | 114.41   | 99.18    | 305.03         | 225.82        |

Table 5.1: Verification results using VeriMAP, ARMC, HSF(C) and TRACER. For each column the sum of the values of lines 1, 4, 7, and 8 is 216, which is the number of the verification problems we have considered. The timeout limit is five minutes. Times are in seconds.
On the set of verification problems we have considered, the VeriMAP system is able to provide correct answers to 185 problems out of 216. It is followed by HSF(C) (159 correct answers), ARMC (138), TRACER(WPre) (103), and TRACER(SPost) (91). Moreover, VeriMAP has produced no errors and no incorrect answers, while the other tools generate from 4 to 14 incorrect answers. Thus, VeriMAP exhibits the best precision, defined as the ratio between the number of correct answers and the number of verification problems.
The total time taken by VeriMAP is smaller than the time taken by any of the other tools we have considered. This result is somewhat surprising, if we consider the generality of our approach and the fact that our system has not been specifically optimized for software model checking. This good performance of VeriMAP is due to: (i) the small number of tasks that timed out, and (ii) the fact that VeriMAP takes very little time on most programs, while it takes much more time on a few, complex programs (see Figure 10). In particular, VeriMAP is able to produce 169 answers taking at most 5 seconds each, and this is indeed a good performance if we compare it with HSF(C) (154 answers), ARMC (122), TRACER(SPost) (88) and TRACER(WPre) (101).
In order to ease the comparison of the performance of the software model checkers we have considered, we adopt a visualization technique using score-based quantile functions (see Figure 10), similar to that used by the TACAS 2013 Software Verification Competition [12]. By using this technique, the performance of each tool is represented by a quantile function, which is a set of pairs (x, y) computed from the following set of pairs:
V = { (p, t) | the correct answer for the (correct or incorrect) program p is produced in t seconds }.
Given the set V, for each pair (p, T) ∈ V we produce a pair (x, y) computed as follows:
(i) x = XS + Σ_{(p,t) ∈ V ∧ t ≤ T} score(p), where XS, called the x-shift, is the sum of all the negative scores due to incorrect answers (x is called the accumulated score), and
(ii) y = YS + T, where YS, called the y-shift, is a number of seconds equal to the number of the timed-out programs.
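Computing the pairs (x, y) amounts to sorting the correct answers by verification time and accumulating their scores. A Python sketch with made-up data (in practice the shifts XS and YS are determined by the tool's incorrect answers and timed-out problems):

```python
# Building a score-based quantile function from a set V of (program, time)
# pairs, as defined above. The programs, times, scores, XS and YS below are
# illustrative values, not taken from the experiments.

def quantile_points(V, scores, XS, YS):
    """V: list of (program, time) for correct answers; scores: program -> score."""
    points = []
    acc = XS
    for p, t in sorted(V, key=lambda pt: pt[1]):
        acc += scores[p]                 # x: accumulated score, shifted left by XS
        points.append((acc, YS + t))     # y: verification time, shifted up by YS
    return points

# Three correctly verified programs, one false alarm (XS = -4), two timeouts (YS = 2)
V = [("p1", 0.5), ("p2", 3.0), ("p3", 1.5)]
scores = {"p1": 2, "p2": 2, "p3": 1}
print(quantile_points(V, scores, XS=-4, YS=2))
# [(-2, 2.5), (-1, 3.5), (1, 5.0)]
```

Sorting by time makes the resulting function monotone: each further point pairs a higher accumulated score with a verification time at least as large.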
Quantile functions are discrete monotone functions but, for reasons of simplicity, in Figure 10 we depicted them as continuous lines.

Figure 10: Score-based quantile functions for TRACER(SPost), TRACER(WPre), ARMC, HSF(C), and VeriMAP(GenPH). Markers are placed along the lines of the functions, from left to right, every 10 programs that produce correct answers, starting from the program whose verification took the minimum time.

The line of a quantile function for a generic verification tool should be interpreted as follows:
(i) the x coordinate of the k-th leftmost point of the line is the sum of the scores of the k fastest correctly verified programs plus the (non-positive) x-shift, and measures the reliability of the tool,
(ii) the y coordinate of the k-th leftmost point of the line is the verification time taken for the k-th fastest correctly verified program plus the (non-negative) y-shift, and measures the inability of the tool to provide an answer (either a correct or an incorrect one),
(iii) the span of the line along the x-axis, called the x-width, measures the precision of the tool, and
(iv) the span of the line along the y-axis, called the y-width, is the difference of verification time between the slowest and the fastest correctly verified programs.
Moreover, along each line of the quantile functions of Figure 10 we have put a marker every ten answers. In particular, for n ≥ 1, the n-th marker, in the left-to-right order, denotes the point (x, y) corresponding to the (10 × n)-th fastest correct answer that has been produced.
Informally, for the line of a quantile function we have that: good results move the line to the right (because of lower total negative score XS), stretch it horizontally (because of higher positive accumulated score), move it towards the x-axis (because of fewer timed-out problems and a better time performance), and compress it vertically (because of a lower difference between worst-case and best-case time performance). We observe that the line of the quantile function for VeriMAP(GenPH) starts with a positive x-value (indeed, VeriMAP provides no incorrect answers and XS = 0), is the widest one (indeed, VeriMAP has the highest positive accumulated score, due to its highest precision among the tools we have considered), and is the lowest one (indeed, VeriMAP(GenPH) has the smallest number of timed-out problems). The height of the line for VeriMAP(GenPH) increases only towards the right end (at an accumulated score value of 300) after having produced 162 correct answers, and this number is greater than the number of all the correct answers produced by any of the other tools we have considered.
In Table 5.2 we report the results obtained by running the VeriMAP system with the four generalization operators presented in Section 5.1.
Each column is labelled by the name of the associated generalization opera- tor. Line 1 reports the total number of correct answers. Lines 2 and 3 report the number of correct answers for correct and incorrect problems, respectively. Line 4 reports the number of tasks that timed out. As already mentioned, the
VeriMAP system has produced neither incorrect answers nor errors. Line 5 reports the total time, including the time spent on tasks that timed out, taken to run the whole set of verification tasks. Line 6 reports the average time needed to produce a correct answer, that is, the total time (line 5) divided by the number of correct answers (line 1). Line 7 reports the total correct answer time, that is, the total time taken for producing the correct answers, excluding the time spent on tasks that timed out. Lines 7.1–7.4 report the percentages of the total correct answer time taken to run the C2CLP module, the Specializevcg procedure, the Specializeprop procedure, and the CorrectnessTest procedure, respectively. Line 8 reports the average correct answer time, that is, the total correct answer time (line 7) divided by the number of correct answers (line 1). Line 9 reports the maximum number of iterations of the Iterated Specialization strategy that were needed, after the removal of the interpreter, for verifying the correctness property of interest on the various programs. Line 10 reports the total number of predicate definitions introduced by the Specializeprop procedure during the sequence of iterations.
The data presented in Table 5.2 show that polyvariance always gives better precision than monovariance. Indeed, the polyvariant generalization operator with convex hull GenPH achieves the best precision (it provides the correct answer for 85.65% of the 216 programs of our benchmark), followed by the polyvariant generalization operator without convex hull GenP (73.61%). (For this reason in Table 5.1 we have compared the other verification systems against VeriMAP (GenPH).) As already mentioned, polyvariant generalization may introduce more than one definition for each program point, which means that the Specialize procedure yields a more precise abstraction of the program to be verified and, consequently, it may increase the effectiveness of the analysis. The increase of precision obtained by using polyvariant operators rather than monovariant ones is particularly evident when proving incorrect programs (the precision is increased by about 100%, from 15 to 29–31).
On the other hand, monovariant operators enjoy the best trade-off between precision and average correct answer time. For example, the GenM operator, despite a significant loss of precision (about 20%, from 159 to 128) with respect to its polyvariant counterpart GenP, is about 140% faster when we consider the average correct answer time (3.25 seconds versus 7.88 seconds).
The good performance of monovariant operators is also justified by the much smaller number of definitions (about one tenth) introduced by the Specializeprop procedure with respect to those introduced in the case of polyvariant operators.
|     |                             | VeriMAP generalization operators |          |          |          |
|     |                             | GenM     | GenMH    | GenP     | GenPH    |
| 1   | correct answers             | 128      | 147      | 159      | 185      |
| 2   | correct problems            | 113      | 132      | 130      | 154      |
| 3   | incorrect problems          | 15       | 15       | 29       | 31       |
| 4   | timed-out problems          | 88       | 69       | 57       | 31       |
| 5   | total time                  | 26816.64 | 21362.93 | 18353.11 | 10717.34 |
| 6   | average time                | 209.51   | 145.33   | 115.43   | 57.93    |
| 7   | total correct answer time   | 416.64   | 662.93   | 1253.11  | 1417.34  |
| 7.1 | C2CLP                       | 2.27%    | 1.63%    | 0.96%    | 0.98%    |
| 7.2 | SpecializeRemove            | 6.81%    | 4.74%    | 4.44%    | 4.06%    |
| 7.3 | Specializeprop              | 90.33%   | 93.16%   | 42.77%   | 44.68%   |
| 7.4 | CorrectnessTest             | 0.59%    | 0.46%    | 51.83%   | 50.27%   |
| 8   | average correct answer time | 3.25     | 4.51     | 7.88     | 7.66     |
| 9   | max number of iterations    | 4        | 7        | 10       | 7        |
| 10  | number of definitions       | 5623     | 6248     | 54977    | 58226    |
Also the size of the programs produced by monovariant operators is much smaller than that of the programs produced by polyvariant operators. (The size of the final programs obtained by specialization is not reported in Table 5.2, but it is approximately proportional to the number of definitions.) This also explains why the impact on the total correct answer time of the CorrectnessTest analysis is much lower for monovariant operators (less than 1%) than for polyvariant operators (about 50%).
The weight of the two preliminary phases (translation from C to CLP and removal of the interpreter) on the overall verification process is very limited. The execution times for the C2CLP module are very low (about 50 milliseconds per program) and have a low impact on the total correct answer time (at most 2.27%, as reported on line 7.1). The time taken by SpecializeRemove for removing the interpreter ranges from a few tens of milliseconds to about four seconds, for the most complex programs, and its impact on the total correct answer time is between 4% and 7% (see line 7.2).
In the set of problems we have considered, the higher average correct answer time of polyvariant operators does not prevent them from being more precise than monovariant operators. Indeed, by using polyvariant operators we get fewer timed-out problems with respect to those obtained by using monovariant operators, and thus for the verifications that use polyvariant operators we also get smaller total and average times (which take into account the number of timed-out problems).
We also observe that generalization operators using convex hull always give greater precision than their counterparts that do not use convex hull. This confirms the effectiveness of the convex hull operator, which may help infer relations among program variables, and may ease the discovery of useful program invariants.
Some of the programs are verified by the VeriMAP system during the first iteration after the removal of the interpreter by propagating the constraints of the initial configuration only. Nonetheless, as indicated in Figure 11, which shows the precision achieved by the VeriMAP generalization operators during the first ten iterations, our approach of iterating program specialization is effective and determines a significant increase of the number of correct answers (up to 115% for GenP, from 74 to 159). Moreover, we have that the verification process is able to prove programs by performing up to 7 or 10 iterations when using the more precise generalization operators GenPH or GenP, respectively.
The highest increase of precision is given by the second iteration and, although most correct answers are provided within the fourth iteration, all generalization operators (except for the monovariant operator GenM) keep increasing their precision from the fifth iteration onwards, providing up to 6 additional correct answers each.

Figure 11: Cumulative precision achieved by the VeriMAP generalization operators during the first ten iterations.
The iteration of specialization is more beneficial when using polyvariant generalization operators, where the increase of the number of answers reaches 115%, while the increase for monovariant generalization operators is at most 52%. For example, at the first iteration GenP is able to prove fewer properties than GenMH, but it outperforms the monovariant operator with convex hull by recovering precision when iterating the specialization.
The increase of precision due to the iteration of program specialization is also higher for generalization operators that do not use the convex hull operator (GenM and GenP), compared to their counterparts that use convex hull (GenMH and GenPH).
We would also like to point out that the use of convex hull is very useful during the first iterations. Indeed, the generalization operators using convex hull can verify 104 programs at the first iteration, while operators not using convex hull can verify 74 programs only. In this case the choice of using a polyvariant vs. monovariant generalization operator has no effect on the number of verified programs at the first iteration.
Finally, we note that the sets of programs with correct answers obtained by using different operators are not always comparable. Indeed, the total number of different programs with correct answers obtained by using any of the generalization operators we have considered is 190, while the number of programs for which a single operator produced a correct answer is at most 185. Moreover, there are programs that can be verified by operators with lower precision but that cannot be verified by operators with higher precision, within the considered timeout limit. For example, in our experiments the least precise operator GenM was able to prove four programs for which the most precise operator GenPH timed out.
This confirms that different generalization operators can give, in general, different results in terms of precision and performance. If we do not consider time limitations, generalization operators having higher precision should be preferred over less precise ones, because they may offer more opportunities to discover the invariants that are useful for proving the properties of interest. In some cases, however, the use of more precise generalization operators determines the introduction of more definition clauses, each requiring an additional iteration of the while-loop (α) of the Specialize Procedure, thereby slowing down the verification process and possibly preventing the termination within the considered timeout limit. In practice, the choice of the generalization operator to be used for any