SPADE Proof Checker

User Manual

 

 

 

 

 

 

 

 

 

 

 

Checker_UM

Issue: 4.5

Status: Definitive

12th October 2010

 

DOCTYPE:Praxis title:varchar=title reference:varchar=reference status:varchar=status issue:varchar=issue date:varchar=date projectname:varchar=DocumentSet

Issue 4.3, Definitive, 11th September 2009

000_000 User Manual

 

 

 

Originator

 

 

SPARK Team

 

 

 

Approver

 

 

SPARK Team Line Manager

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 



Copyright

The contents of this manual are the subject of copyright and all rights in it are reserved.  The manual may not be copied, in whole or in part, without the written consent of Altran Praxis Limited.

 

Limited Warranty

Altran Praxis Limited save as required by law makes no warranty or representation, either express or implied, with respect to this software, its quality, performance, merchantability or fitness for a purpose.  As a result, the licence to use this software is sold ‘as is’ and you, the purchaser, are assuming the entire risk as to its quality and performance.

Altran Praxis Limited accepts no liability for direct, indirect, special or consequential damages nor any other legal liability whatsoever and howsoever arising resulting from any defect in the software or its documentation, even if advised of the possibility of such damages.  In particular Altran Praxis Limited accepts no liability for any programs or data stored or processed using Altran Praxis Limited products, including the costs of recovering such programs or data.

1          Overview

2          Rationale and Logical Background

2.1      Rationale

2.2      Logical background

3          Interface with SPADE and SPARK

3.1      Typechecking

3.2      FDL data types

3.3      Quantifiers and FDL

3.4      Proofs with quantifiers

3.5      Prolog notation

3.6      Upper and lower cases

3.7      The “mod” operator

3.8      Extensions for SPARK

4          Verification Condition Files

4.1      Integration with the SPADE and SPARK tools

4.2      Format of a VC file

4.3      Format of a saved VC file

4.4      The FDL file

5          Proof Checker Commands

5.1      Some points on syntax and use

5.2      Proof management commands

5.3      Listing & viewing commands

5.4      Inference commands

5.5      Substitution & simplification commands

5.6      Proof by cases commands

5.7      Quantifier manipulation commands

5.8      Other proof frame commands

6          Configuring the Proof Checker

7          Common Proof Techniques

7.1      Standardisation followed by forward chaining

7.2      Searching rules with infer command

7.3      Subgoaling

7.4      Equality (and other) substitutions

7.5      Examining the rules database

7.6      Proofs by cases

7.7      Proof by contradiction

7.8      Proofs with quantification

7.9      Look at conclusion structure

8          Log Files

8.1      Proof logs

8.2      Command logs

9          Some Examples

9.1      Partial correctness of a division by repeated subtraction algorithm

9.2      Partial correctness of a hardware multiplication algorithm

9.3      Partial correctness of a maximum value in an array algorithm

9.4      Absence of certain run-time errors in a sort algorithm

10       Getting Started

10.1    First steps

10.2    What to do next

A          Appendix: Command Syntax Summary

Document Control and References

File under

Changes history

Changes forecast

Document references

 

SPADE is a registered trademark of Altran Praxis Limited.

Note: The SPARK programming language is not sponsored by or affiliated with SPARC International Inc. and is not based on SPARC™ architecture.

 


Contents

 

 


1                       Overview

This document provides technical information on the use of the commands and features of the SPADE Proof Checker.  It is not intended to describe the concept of mathematical proof nor to provide training in the pragmatics of using the Proof Checker.  For the former, we recommend as an introduction one of the growing number of books on mathematics for computing and for further reading books on mathematical logic, and for the latter we recommend the SPADE Proof Checker three-day training course, run by Altran Praxis, on which training in the use of the checker and tutorial exercises and further literature is provided.  Having explored what this manual is not, let’s now consider what is included.

Chapter 2 covers some aspects of the logical system underpinning the SPADE Proof Checker and its background.  Chapter 3 briefly describes the relationship between the Proof Checker and FDL, its type system and the similarities and differences, for instance between FDL and Checker expressions.  Chapter 4 outlines the format of the files read by the checker, as an aid to writing your own .VCG files (for instance to prove formulae which are not derived directly from a program model).  Chapter 5 covers the syntax and operation of each of the Proof Checker’s commands.  Chapter 6 states how you can define your own local or system-wide checker initialisation file, to set the value of various flags and consult your own proof rule libraries automatically at the start of each proof session.  Chapter 7 provides a (non-exhaustive) collection of hints and tips on some common proof strategies which may be of use to you in using the Checker.  Chapter 8 makes some brief remarks about the proof logs created by the Proof Checker.  Chapter 9 provides some simple examples of use of the Proof Checker and chapter 10 is a brief guide to get you started using the Checker.  Finally, an appendix provides a one-page command syntax summary.

A sister-document to this User Manual is the SPADE Proof Checker Rules Manual which describes the built-in proof rule libraries supplied with the Checker, how to define rules and create user-defined rule files and how to build rule files and declarations into standalone proof rule libraries.

This manual is intended for Windows and SUN users of the SPADE Proof Checker.  Where file naming conventions are used, these are mostly SUN (e.g. /home/user1/ion/myprog.vcg). Lowercase filenames are assumed throughout, though the built-in rule library uses uppercase names (INTINEQS.RUL).

(This version of the SPADE Proof Checker supports both for output by the SPADE VCG and for VCs generated by the Examiner; extensions specific to the Examiner are discussed in section 3.8 of Chapter 3.)

2                       Rationale and Logical Background

2.1               Rationale

The SPADE Proof Checker has been developed over a number of years as a result of research work at the University of Southampton.  Some of the first tools for the Proof Checker were written in 1983-4.  Since September 1986, further research work and a significant amount of development and engineering of the Proof Checker have taken place first within Program Validation Limited and then within Altran Praxis.

The SPADE suite of static and semantic analysis tools enables rigorous models of programs to be constructed, analysed and annotated for proof.  With such annotated code, the SPADE Verification Condition Generator (VCG) can be employed to generate logical formulae known as Verification Conditions (VCs) automatically.  The truth of these VCs corresponds to the correctness of the code under analysis with respect to its specification as embedded in these annotations.

From the earliest research, however, it was clear that the proofs of correctness even of quite simple code could generate large and apparently complicated formulae whose truth was not obvious.  Two approaches were common: one was to attempt to develop powerful automatic theorem provers so that these formulae could be fed in and, hopefully, out would come the answer “Proved”.  This approach has led to admirable research efforts, but as yet we are still a long way from tools powerful enough to carry out automatic proofs of all but the most straightforward of code.  The other approach was to carry out manual proofs, or semi-formal justifications, of these formulae: a noble effort, but with a great risk of error or failure to spot inadequacies in the arguments used.

The reasoning behind the development of the SPADE Proof Checker was to develop a tool which was capable of taking the best from both worlds.  It is an interactive tool, so that the user’s skill and experience can be used to drive the proof attempt with the result that a great deal of fruitless proof effort can be avoided, yet it also provides some powerful proof tools which can be deployed on the formula to be proved.

Another goal behind the development of the SPADE Proof Checker was to build a tool which could be fully integrated into SPADE, accepting input directly from the SPADE VCG and thereby avoiding the possibility of introducing errors as the formulae are converted from one form to another.  Furthermore, this integration also means that all the inherent semantic features of SPADE’s modelling language, FDL, are made explicit in the built-in proof rules of the SPADE Proof Checker.

The SPADE Proof Checker was also to provide the “missing link” between the proof functions and predicates used in the specification of the code and the VCs derived from the code itself.  By making it possible for the user to add new proof rules, defining properties of the proof functions and predicates used in code annotations, and by insisting on the labelling of these rules (both user-supplied and built-in), a proof log could be generated which details exactly which proofs made use of which proof rules. In this way, any proof rules which are discovered subsequently to be at fault (e.g. by the omission of a constraint) can be isolated and the effect of a change to such a rule can be determined, so that exactly which VCs may require re-proof (and which do not) is apparent from the proof log.

2.2               Logical background

The SPADE Proof Checker is designed around first-order predicate calculus with classical logic.  It is important to understand the implications of the preceding statement.

Firstly, the fact that the Proof Checker is essentially first-order means that we have no notation, and no proof mechanisms, for dealing with second- and higher-order quantification.  Thus, if in a specification we make use of such higher-order quantification to express a definition, this specification will require manual remodelling before it is in a form from which we can directly derive proof rules suitable for use with the SPADE Proof Checker.  Having said this, however, it should be noted that the SPADE Proof Checker, which inherits FDL’s type discipline, does permit quantification over such objects as FDL arrays, records, sequences and sets.  Thus, a relation such as perm(A,B) expressing the fact that array A is a permutation of array B, can be defined in a first-order way by using an array of indices (and such a definition has been used with the Proof Checker to prove various properties of perm, e.g. that perm is commutative, that any array is a permutation of itself and that swapping two elements preserves permutation).

Secondly, the fact that the SPADE Proof Checker is designed around classical logic means that we must be very careful about what it is that we are claiming to have proved.  We use the words “classical logic” to emphasise that we are not making use of some non-standard logic tailored for work on a specific class of problem.  Instead, as FDL is a modelling language designed to model a wide variety of imperative programming languages, we shift the onus of avoiding the pitfalls of using classical logic on to the annotations.  A consequence of using classical logic is that, whenever we read in a VC, we assume that all subexpressions occurring in that VC are well-defined.  Thus, if we have a hypothesis c div d = x, there is an assumption that c div d is defined (i.e. d is not zero): if we also have a hypothesis that d = 0, therefore, classical logic ceases to be applicable, as we have a hypothesis whose value is undefined.

If we are to avoid the possibility of undefined subexpressions of a VC leading us into problems such as the construction of a proof whose validity is dubious, we must ensure that all subexpressions are defined.  So how can an expression be undefined?  One way is by failure to assign a value to a variable.  Here, SPADE flow analysis can be used to check for the absence of any data flow errors of this kind (though with structured types, such as arrays, flow analysis results may not be so clear cut: we may assign a value to a[1], but leave a[2] undefined, for example).  Another means of introducing undefined expressions into our VC is by using a function which is “partial” with respect to the types of its arguments, as in our “div” example: c div d is only defined if d is non-zero.  Thus, if we are to be entirely safe, we should introduce an FDL check statement before every use of a partial function, wherever it occurs in our FDL model (proof context or not).

So the result of using classical logic is that we must pay attention to the possibility of undefined subexpressions in our VCs and guard against such a possibility by code annotations whose truth proves such undefinedness impossible.  Thus, we make explicit our reasoning about partial functions in our code annotations, rather than attempt to bury away our reasoning about them in a non-classical logic.

We wish to use classical logic because it brings significant advantages. One, for instance, is that (A or not A) is true (the “law of excluded middle”), which can be particularly useful for reasoning by cases.  Another is that we can use proof by contradiction, which is particularly appropriate for VCs which are derived from program paths which are not executable in practice. Finally, it is a simple logical system, with a wide audience of people who are familiar with it (even if they do not know it explicitly by that name), making the proof work amenable to the mathematically-minded without the need for specific logic specialists.

The SPADE Proof Checker embodies its classical logic in a number of tools, and also in the way it works.  It uses a natural deduction system, reflected in the operation of the Checker’s simple inference engine, the proof by cases, proof by implication, proof by contradiction, logical deduction, forwardchain, unwrap and instantiate commands and the built-in rules database.  As an example, consider “or-elimination”.  Given a conclusion R, and a hypothesis (P or Q), if we can prove R when we assume P and if we can prove R when we assume Q, then we have proved R by cases: this is the foundation for the Proof Checker’s “prove ... by cases” command.

Finally, induction.  At present, the SPADE Proof Checker provides a built-in strategy only for proof by mathematical induction over a subrange of the integers bounded at one end.  It is not possible for the user to write a general induction rule for any other data types, as this would involve an essentially second-order rule, but one can write a rule which is an instance of an induction scheme for other data types.  To understand why this is, consider an induction scheme for sequences.  We might suggest:

(P([]) and (P(S) -> P([X] @ S))) -> P(ANY_SEQUENCE)

but we notice that the property of sequences to be proved, P, is not a specific predicate, but a general one, which we must instantiate to a specific formula before we can use the induction scheme (e.g. if we put P(Seq) = (length(Seq) >= 0), we then have a specific formula to prove). Thus, this scheme says “for all P, ...”, which is quantification over a predicate and hence not first-order.  However, with our example formula for P, we can write an instance of the above induction scheme as a proof rule thus:

length_induction(1): length(ANY) >= 0 may_be_deduced_from

                    [ length([]) >= 0,

                      length(s) >= 0 -> length([x] @ s) >= 0 ].

(In fact, we would generate new unique identifiers rather than use “x” and “s” in our rule: see chapter 5 of the SPADE Proof Checker Rules Manual for details of proof rules syntax.)  With this, if we could prove that the length of the empty sequence is non-negative and that if the length of some sequence s is non-negative then the length of the same sequence with an extra element on the front is also non-negative, then we know that all sequences are non-negative (since every sequence is either the empty sequence, or a sequence with an element on the front and another, “smaller” sequence as its tail).

3                       Interface with SPADE and SPARK

3.1               Typechecking

The Proof Checker is designed to interface with the SPADE VCG and/or the Examiner.  The VCs generated by the SPADE VCG (or the Examiner for SPARK) are fed in as input to the Checker together with the inherited declarations from the FDL model (of types, constants, variables and functions), as described in chapter 4 of this document.  The Checker needs to read these FDL declarations because it polices a liberal form of the type system of FDL to ensure that no proof violates this type system.

The type checking performed by the Proof Checker is liberal in that any two expressions are regarded as type-compatible if they have the same base type.  Thus, given the FDL declarations

type day =  1..31;

type year = 1900..1999;

var d: day;

var y: year;

the expression

d = y

is properly typed as far as the Checker is concerned, as it is always (mathematically) valid to compare two integers for equality.  Thus a VC which included a hypothesis to the effect that d and y were equal would cause no problem to the Checker, and may well be a proveable formula; but if this VC corresponded to an executable path in the FDL model, we should be concerned about the probability of a run-time error along this path, as d and y should, according to the above declarations, belong to disjoint subsets of the type integer.

3.2               FDL data types

3.2.1           Numbers

Just as in FDL, the Proof Checker regards integers as the true mathematical objects, so that m+1>m is always true and a valid expression for any integer m (whereas it is not in Pascal, for instance: if m=maxint, m+1 should cause an error).  If you are concerned to prove that all integer expressions within a program (as reflected by its FDL model) are within some range and do not cause run-time errors (through division by zero, for instance), you must add FDL check statements to check each expression and each subexpression of the program is sufficiently within bounds. The Examiner can do this automatically for a large class of potential run-time errors in SPARK.

Similar comments apply to the type real.  The Proof Checker, like FDL, regards the reals as real numbers in the mathematical sense, and any problems likely to arise through rounding errors or overflow must be dealt with by adding check annotations to generate proof obligations, the truth of which ensure the absence of such a problem.

As far as the Proof Checker is concerned, all integers are real (though the converse does not hold).  Thus, if i is declared to be of type integer and r of type real, i>r is a well-typed expression (though we cannot infer i>=r+1 from it, as for this to be true, both i and r would have to be of type integer).

All of the standard arithmetic operators of FDL (+, - (binary and unary), *, div, mod, /) are supported by the Proof Checker, together with the built-in FDL functions abs(x) and sqr(x).

3.2.2           Booleans

The full range of boolean operators (i.e. and, or, not, ->, <->) within FDL are supported by the Checker.  In addition, the built-in FDL function odd(x) is supported.  Note that, unlike FDL, the Proof Checker does not permit the boolean type to be regarded as an enumerated type, however, so the Proof Checker does not regard the expression false<true as well-typed.  See also section 3.3 below on quantifiers.

3.2.3           Enumerated Types

The Proof Checker supports user-declared enumerated types, and the Checker’s simplifier will make use of the ordering information, so that given the enumerated type declaration

type colour = (red, orange, yellow, green, blue, indigo, violet)

for instance, the expression yellow>red will be simplified to true.  The FDL pred(x) and succ(x) are also supported for each user-declared enumerated type (but not, as may be gathered from the preceding section, for the type boolean).

3.2.4           Arrays

The Proof Checker supports the FDL update(_,_,_) and element(_,_) functions for array types.  Knowledge of these functions is built into the Checker’s simplifier and rules expressing their properties are available through the built-in rules database.

3.2.5           Records

FDL defines 2n record functions for each record type with n fields (an “update” and an “access” function for each field), and the Checker similarly makes such functions available for each record type.  In addition, the Checker’s simplifier will make use of the properties of these functions. These properties may also be used via the “simplify” rule, or explicitly invoked through the “record” family of rules (though these rules are “single-step” and consequently less powerful than the simplifier).

3.2.6           Sequences

On reading in VCs involving sequences, expressions may be modified slightly.  In FDL, for each sequence-type S ( = sequence of T, for some type T), a sequence constructor S [ ] exists; such expressions are modified by the Checker on reading in the VCs by stripping off the type identifier S from the front.  In Checker commands, expressions and proof rules, you should only use the Proof Checker’s sequence syntax, not FDL’s (i.e. omit the sequence-type identifier).

The full range of FDL sequence operators and functions (i.e. @, first, last, nonfirst, nonlast) is supported by the Checker.  In addition, rules are provided for a length function (which should soon be available within FDL also).  If you are using a version of SPADE which does not support such a length function at present, you will need to declare a separate length_S function for each sequence-type S.  You may then make use of the seqlen family of rules by rewriting each length_S(x) expression to length(x) for each sequence-type S with a rule such as:

conv_len(1): length_intseq(X) may_be_replaced_by length(X).

3.2.7           Sets

FDL sets are also supported by the Checker, but as with sequences expressions involving sets undergo some rewriting as they are read in from a VC file.  As with sequences, you should only use the Checker’s syntax once inside the Checker in all commands, expressions and proof rules.

The changes which occur are shown in the table below.

 

FDL syntax

Checker syntax

S []

set []

S [x1,...,xn]

set [x1,...,xn]

x + y

x \/ y

x * y

x /\ y

x - y

x \ y

x <= y

x subset_of y

x >= y

y subset_of x

x < y

x strict_subset_of y

x > y

y strict_subset_of x

(I.e. the type identifier prefix is converted to the prefix word set, the set union, intersection and lacking operators are changed and the relational operators for subsets are converted to [strict_]subset_of.)  After these changes, the possible confusion of set expressions and numeric expressions is eliminated.  Note that in and not_in are retained.

3.3               Quantifiers and FDL

Quantifiers, which may not be fully supported by the version of SPADE which you are using, are nevertheless supported by the SPADE Proof Checker.  The syntax differs slightly from that of FDL.  A quantified expression for use by the Proof Checker has one of the forms

for_all(VAR : TYPE, PREDICATE)

for_some(VAR : TYPE, PREDICATE)

where VAR is an identifier, TYPE is the type for this identifier and PREDICATE is the predicate which is quantified by this quantifier expression.  The scope of each quantifier written in this form is clear and unambiguous.  Note that using a subrange type for the TYPE identifier does not at present cause the Checker to add range information automatically.  Thus, given the FDL type

type indexrange = lowerint .. upperint,

the two Checker expressions

for_all(i:integer, p(i))

and

for_all(i:indexrange, p(i))

carry the same amount of information (or proof obligation).  If, in the latter case, we explicitly require that i is within lowerint..upperint, we must instead write this universally-quantified expression as

for_all(i:indexrange, (lowerint <= i and i <= upperint) -> p(i))

The same comment applies to existential quantification, but to require that i is within range in an existentially-quantified expression, we instead AND the range information with the predicate.  Thus, the following equivalences apply:

 

Quantifier with range

Checker equivalent

for_all(X:T in L..U, P(X))

for_all(X:T, (L<=X and X<=U) -> P(X))

for_some(X:T in L..U, P(X))

for_some(X:T, L<=X and X<=U and P(X))

 

If you wish to include quantified formulae in VCs, you must edit the VCG file into this format.  If you are using a version of SPADE whose FDL does not support quantification, it is still possible to mimic quantification.  For example, suppose we wish to add an FDL assertion:

assert for_all(i:integer, (1 <= i and i <= current)
       -> p(i)) and

       for_some(j:integer, 1 <= j and j <= current
       and q(j))

We can mimic this, basically by declaring

proof var i, j : integer;        {the “bound” variables: do not use for anything else}

proof function all_int (integer, boolean) : boolean; {”for_all”  for integers}

proof function some_int(integer, boolean) : boolean; {”for_some” for integers}

and then writing our assertion as

assert all_int(i, (1 <= i and i <= current) -> p(i)) and

       some_int(j, 1 <= j and j <= current and q(j)).

We then have two alternatives for converting this to Checker quantification format.  Either we can edit expressions of the form all_int(I, EXP) into the form for_all(I:integer, EXP) (and similarly for some_int) manually, for instance with an editor, or we can provide the Proof Checker with proof rules

all_int(1):  all_int(I, EXP)  may_be_replaced_by

    for_all(I:integer, EXP).

some_int(1): some_int(I, EXP) may_be_replaced_by      for_some(I:integer, EXP).

and rewrite such terms with the Proof Checker.  Either way, to make use of the Checker’s knowledge of quantification, it is necessary to rewrite to Checker syntax.  Finally, notice that such a mimicry of quantification in FDL requires a different function for each type to be quantified (e.g. we may also need all_bool for quantification over booleans, etc.) and dedicated proof variables of the appropriate type which are neither “save”d to nor used anywhere apart from in quantification.

3.4               Proofs with quantifiers

The Proof Checker provides two commands specifically for carrying out proofs involving quantification: “unwrap” and “instantiate”.  Before we consider these commands further, let us first explore what we intuitively understand by a quantified hypothesis or conclusion.

If we have a hypothesis for_all(x:t, p(x)), we are saying that no matter what value of type t we choose for x, p(x) will be true.  Thus this hypothesis is really a whole family of facts: p(V1), p(V2), ..., for each possible value Vi of type t.  If, on the other hand, we have a hypothesis for_some(x:t, p(x)), we are saying that at least one value v of type t satisfies p(v): we do not know anything else about v except that it satisfies p.  Thus, in this case we have only a single fact, p(v), where v is some new identifier of type t.

For conclusions, the position is reversed.  If we have a conclusion for_all(x:t, p(x)), we want to show that every value v of type t satisfies p; to do this, we can introduce a new identifier v of type t, about which we know nothing else; if we can then prove p(v), it follows that for_all(x:t, p(x)) must be true, since we have proved p(v) for an arbitrary element v of the type t.  If we have a conclusion for_some(x:t, p(x)), however, we need only find a value for x which makes p(x) true.  To do this, we can regard our conclusion as a whole family of goals p(V1), p(V2), ..., for each possible value Vi of type t all “or”ed together since, if we prove p(Vj) for one of these j’s, then we have proved that there does indeed exist an x (namely, Vj) for which p(x) holds.

How do we reflect this approach in the Proof Checker’s commands and natural deduction system?  We do so by allowing the user to “unwrap” quantified hypotheses and conclusions, introducing new unique identifiers as we do so. Given a hypothesis such as

H3: for_all(x:t, p(x))

we can always unwrap this to give

H4: p(t_X_1).

Here, t_X_1 is a new unique universal variable introduced by the “unwrap” command.  It has type t, and we can instantiate it whenever we choose to a particular value.  (If we need to use both p(a) and p(b) for a, b of type t, we just unwrap H3 twice, instantiating one unwrapped version with a, the other with b.)  Now, if instead we had

H3: for_some(x:t, p(x))

as our hypothesis, we can again use the “unwrap” command (though this time we introduce an ordinary variable, rather than a universal variable), but only if there are no universal variables in the formula p(x): in other words, before we unwrap an existentially-quantified hypothesis, we must instantiate all the universal variables in it.  There is a good reason for this.  Suppose we have

H3: for_all(x:integer, for_some(y:integer, y > x))

(i.e. every integer x has an integer y which is bigger than it: perfectly true.)  If we unwrap this once, we get

H4: for_some(y:integer, y > int_X_1)

where int_X_1 is our universal variable introduced by unwrapping.  If we were then allowed to unwrap H4, we would get

H5: int_y_1 > int_X_1

(Notice that you can spot universal variables by the presence of capital letters in them, reminding us (for good reason) of a Prolog variable, i.e. something which we can instantiate!)  Unfortunately, we can now instantiate int_X_1 to int_y_1 and get

H5: int_y_1 > int_y_1

which is false, despite the fact that our original, “wrapped” H3 did not contain a contradiction.  Basically, the problem is that our H4 says that there exists a y which is a function of int_X_1 in that, given a value for int_X_1, we can choose a value for y such that it is bigger than our value of int_X_1.  By permitting the unwrapping of H4 before int_X_1 is instantiated, we lose this dependency.  Thus, the Checker prevents such incorrectness by first checking that any existentially-quantified hypothesis is free of universal variables before it is unwrapped.

As we might suspect, converse comments apply to quantified conclusions.  Given

C2: for_all(x:t, p(x))

“unwrapping” this conclusion causes us to enter a new proof frame in which our goal is

C1: p(t_x_1)

i.e. p for some arbitrary object t_x_1 of type t.  We can only do this, however, if our expression p(x) is free of universal variables.  If we unwrap

C2: for_some(x:t, p(x))

however, we always get the following goal in our new proof frame

C1: p(t_X_1)

where t_X_1 is a universal variable.

Let’s round off with an example.  Given the VC

H1: for_all(x : integer, (0 <= x and x <= n) -> p(x))

H2: p(n + 1)

-->

  C1: for_all(x : integer, (0 <= x and x <= n + 1) -> p(x))

(which is, I hope you’ll agree, fairly obvious) we get, if we unwrap H1 and C1, a new proof frame with the following VC state:

H1: for_all(x : integer, (0 <= x and x <= n) -> p(x))

H2: p(n + 1)

H3: (0 <= int_X_1 and int_X_1 <= n) -> p(int_X_1)

-->

  C1: (0 <= int_x_1 and int_x_1 <= n + 1) -> p(int_x_1).

If we now enter a proof by implication attempt on C1, we enter another proof frame, and now the VC state is

H1: for_all(x : integer, (0 <= x and x <= n) -> p(x))

H2: p(n + 1)

H3: (0 <= int_X_1 and int_X_1 <= n) -> p(int_X_1)

H4: 0 <= int_x_1

H5: int_x_1 <= n + 1

-->

  C1: p(int_x_1).

We can then prove C1 by cases on (int_x_1 < n + 1 or int_x_1 = n + 1).  For case 1, int_x_1 < n + 1, so (since x and n are integers, not reals) int_x_1 <= n.  We may therefore instantiate int_X_1 to int_x_1 and forwardchain on H3 to get C1 for this case.  Case 2 is int_x_1 = n + 1, and we can see C1 by substitution of int_x_1 for n + 1 in H2.  This completes the proof by cases and in fact of the whole VC since there are no other outstanding subgoals in any proof frame.

3.5               Prolog notation

The fundamental objects of Prolog are integers and atoms.  Integers are self-explanatory, and Prolog supports variable-length arithmetic, so it is very unlikely that you will manage to cause an arithmetic overflow!  Simple Prolog atoms correspond roughly to words which begin with a lowercase letter; thus the following are all valid Prolog atoms:

w  x1  cat  atom  aTOM_ATOm  this_is_a_rather_long_atom .

Prolog is case sensitive, so aa and aA are different atoms.  Within the Proof Checker, all identifiers inherited from the FDL model (constants, variables, type identifiers, function names) are represented by Prolog atoms.

Prolog also provides variables.  Prolog variables commence with a capital letter (or an underscore), in general, to distinguish them from Prolog atoms.  A Prolog variable should be thought of as standing for some definite but unidentified object: it is not “variable” in the sense of imperative programming languages, where the same variable may have two or more values during the course of a successful execution of the program.  Rather, once a Prolog variable is instantiated to a particular value, all instances of that variable become likewise instantiated.  Note, however, that the scope of a Prolog variable is limited to the expression in which it occurs: if we type “X” in answer to one question from the Checker, then are prompted to answer another question and type “X” again, this is a different variable to the earlier “X”.  For the purposes of understanding the Proof Checker, think of a Prolog variable’s scope as limited to the expression it is in, which is terminated by the next full-stop.

It should be noted at this point that the full-stop character is a special symbol in Prolog: (almost) every command that you type to the SPADE Proof Checker and every reply to a question asked by the Proof Checker should be terminated by a full-stop and a carriage-return.

3.6               Upper and lower cases

In reading the declarations from an FDL text, all identifiers are converted to lowercase.  The SPADE Proof Checker will only operate with releases of the SPADE VCG made on or after August 1988 (V4.3 onwards) and releases of the Examiner (with VCG) made on or after August 1993 (V1.3 onwards), each of which produce .VCG files with VCs entirely in lowercase.  You should therefore employ only lowercase identifiers (e.g. for functions and constants) in any local rulefiles that you write to ensure compatibility.

3.7               The “mod” operator

If the Checker detects that a VC has been generated by the Examiner, then simplification and standardisation of the “mod” operator is implemented according to Ada’s definition of “mod”. For VCs generated from other languages, no simplification, standardisation, or rules for “mod” are provided. Instead, you should provide your own proof rules, appropriate to the definition of the programming language under analysis.

3.8               Extensions for SPARK

Version 1.3 onwards of the SPADE Proof Checker is also intended for use in reasoning about VCs generated by the Examiner for SPARK.  To support this, a number of additions to the Checker’s expression language have been made (making it an extension of FDL).  The three additions, which we consider in turn below, are an exponentiation operator, array aggregates and record aggregates.  Note that Ada’s REM and XOR operators are not needed: (A rem B) is rewritten by the Examiner to (A - A div B * B) and (P xor Q) is rewritten using FDL’s and, or and not operators.

3.8.1           Exponentiation

An exponentiation operator, **, is supported by the Checker as an extension to FDL.  An expression of the form a ** b is well-typed if both a and b are integers, or if a is real and b is an integer.  The definition of the exponentiation operator for integers raised to a non-negative integer power is compatible with that given in the Ada Language Reference Manual, section 4.5.6.  For reals, there is the difference that the Proof Checker’s real numbers are the mathematical reals, rather than the rational subset supported by programming languages such as Ada.  The exponentiation operator binds more tightly than the other binary arithmetic operators.

3.8.2           Array aggregates

SPARK array aggregate expressions, as output by the Examiner, take the form:

mk__array(Arg1, Arg2, ..., ArgN)

(where N>=1).  Each ArgI is of the form

[I] := X

where, if I is an expression, then this denotes that the element whose index is given by the value of I is assigned the value of expression X.  Alternatively, I could be an index range of the form J..K, meaning that all elements with indices between J and K inclusive have the value X.  One final variant is that the first argument, Arg1, may be just an expression; if this is the case, this is the value assigned to the “others” elements by the array aggregate, and is the value of any element not otherwise given an explicit value by the other argument associations.

The value of an element of an array aggregate expression is most easily determined by scanning from right to left: if we want to know the value of the Hth. element, we first check to see if H is one of the index values associated with  the last argument, ArgN; if it is, the value required is that on the right of the := symbol for ArgN.  If, however, H cannot satisfy the index associations of ArgN, we instead consider ArgN-1 and so on, until we either hit Arg1 and this turns out to be the others part, or we run out of arguments in which case the expression cannot be evaluated, or we reach a point where the Checker cannot determine whether or not H satisfies some index association (because the inference engine is not smart enough, for instance, or the hypotheses not strong enough to determine either way).  In this way, the arguments of the array aggregate expression can be thought of as a kind of “override” running from left to right: we start with an “empty” array, then apply Arg1’s mappings, then Arg2’s and so on, so that if both ArgI and ArgJ assign a value to element K, then the later (reading from left to right) is the one which prevails, as it “overwrites” the earlier one. (This interpretation of array aggregate expressions is a conservative extension to SPARK’s array aggregates, in which static semantic constraints should ensure that each array element is assigned to precisely once.)  Because of this left-to-right overriding, it makes sense to look for matches against the desired index starting from the right and scanning backwards (left) until a match is found, hence the above strategy.

3.8.3           Record aggregates

SPARK record aggregate expressions generated by the Examiner take the form:

mk__record(fld1 := val1, fld2 := val2, ..., fldN := valN)

for a record with N fields.  This assigns valI to the field named fldI. Clearly, for I<>J, field names fldI and fldJ must be distinct.  To determine the value of a field, we simply find the argument of the mk__record expression associating a value with the field name and use that value.

3.8.4           Identifier length

SPARK identifiers have “unbounded” length, i.e. there is no truncation of any identifiers appearing in an FDL file when SPARK output is read.  This currently contrasts with SPADE, in which FDL identifiers are truncated to a maximum of 24 significant characters in SPADE VCG output.  The switch of behaviour is determined by the header in the .VCG file, as described in section 4.2 of the next chapter.

4                       Verification Condition Files

4.1               Integration with the SPADE and SPARK tools

As the Proof Checker is designed primarily for use with SPADE and the Examiner for SPARK, it is natural that it should accept as input files produced by these tools.  Consequently, the Checker assumes that the VCs presented to it have been generated by the SPADE VCG or the Examiner and expects a certain format, described in the next section.

On entering the Checker, the user is prompted to give a filename for the current session, unless this has already been provided on the command-line.  The filename should not include any extension, as the extensions .VCG and .FDL are each appended to this name in turn to give the names of the files containing the VCs (from the SPADE VCG or the Examiner) and the FDL type declarations applicable to the VCs respectively.  The proof log produced by the Checker is by default given a .PLG extension, and any command logs produced are by default given the extension .CMD.  Proof logs and command logs are each dealt with in chapter 8 of this documentation.

4.2               Format of a VC file

The VC file, assumed to have been generated by the SPADE VCG or the Examiner, must meet  certain requirements for the Proof Checker to be able successfully to scan its contents.  If you only use the SPADE VCG or Examiner to generate VC files for input to the Checker, the rest of this section will not be of any great interest to you.  If, however, you wish to create a VC file manually, you should do so in such a way that you produce a “forgery” of a SPADE VCG output file.  The basic requirements for this are:

1         The header text must consist of at least thirteen lines.  If the VCs were generated by the Examiner, line three of the header text should consist of a sequence of space characters, followed by the characters “Examiner”.  The presence of such a line three enables access to the SPARK array and record aggregate extensions discussed in sections 3.8.2 and 3.8.3 respectively, and allows the rules for these aggregate functions to be accessed.  Without such a line, the mk__array and mk__record identifiers may be used as ordinary FDL identifiers and other SPARK-specific behaviour (involving, for instance, the length of identifiers — see 3.8.4) will be inhibited. 

2         Each VC must be preceded by its name and number (its name consisting only of letters, underscores and digits, with the first character being a letter, and with the VC number separated from this name by an underscore character), terminated by a full-stop and a carriage-return.

3         In between two VCs there must be at least two blank lines.

4         Each hypothesis must be preceded by an “H” and each conclusion by a “C”, followed (without intervening characters) by a number and a colon, then zero or more spaces followed by the expression itself.  The “H” or “C” must be the first character on the line and there must be no blank lines between one hypothesis and the next or between one conclusion and the next.

5         Each hypothesis/conclusion expression of the VC must be a well-formed Prolog expression, terminated by a full-stop (“.”) and a carriage-return.

6         In the event that the VC itself has simplified to either “true” or “false”, the VC may be a single line, commencing with either *** true or !!! false as appropriate and must be followed by at least one blank line before the start of the next VC.  It should be noted that “true” VCs, which require no proof work by the user, are discarded automatically by the Proof Checker as they are read in, so that there may be gaps in the numbering of VCs if this occurs.  A message recording eliminated true verification conditions is issued to the terminal after the VCs have been read in, and similar messages appear at the start of the proof log for the session.

Examination of examples of VC files produced by the SPADE VCG and/or the Examiner should clarify these requirements further; alternatively, VCs may be presented in a saved format, which we consider next.

4.3               Format of a saved VC file

A saved VC file (extension .CSV) consists of a number of Prolog “facts” and is entirely in Prolog syntax.  This type of file needs no careful scanning by the Checker: it is simply consulted, so that the facts in the file are added as facts to the  Checker’s knowledge database.  It is therefore important that you do not attempt to modify a saved VC file.  Note that .CSV files generated by one version of the Proof Checker may not be compatible with another version: fact storage (and the facts to be stored) may change slightly from release to release, and attempts to use older .CSV files with a newer release of the Checker may result in anomalous behaviour.  Any partially-completed proofs should be resolved with the existing version of the Checker before a new version is installed in its place.

4.4               The FDL file

This file is read to provide the Checker with the inherited type information from the FDL (or SPARK source) program, for use by the Checker’s built-in type-checker.  The Checker reads the declarations part of the FDL file only: it stops once it reaches the start statement of the FDL text, or a free-standing end; as employed in the FDL files generated by the Examiner.  If you are using a valid FDL text (such as the one used to generate the VCs in the .VCG file) or an FDL file generated by the Examiner, the Checker will encounter no problems in reading the file.

5                       Proof Checker Commands

The Checker offers the user the following commands:

case CASE_NUMBER

consult FILENAME

declare

deduce FORMULA from HYPLIST

delete HYPLIST

done C#LO-UP & ...

execute FILENAME

exit

forceexit

forget HYPLIST

forwardchain HC#N

help COMMAND

infer FORMULA using RULES from HYPLIST

instantiate VARIABLE with EXPRESSION

list HC#LO-UP & ...

newvc NUMBER

printvc

prove FORMULA by cases on DISJUNCTION

prove FORMULA by contradiction

prove FORMULA by implication

prove FORMULA by induction

quit

remember HYPLIST

replace HC#N: OLD by NEW using RULES

save

set FLAG to VALUE

show

simplify

standardise FORMULA

status

traverse FORMULA

undelete HYPLIST

unwrap HC#N

In this chapter, we shall consider the syntax and use of each of these commands, while chapter 7 of this documentation covers some of the practicalities of using these commands to prove VCs.  First, however, we consider some general points on command and expression syntax.

5.1               Some points on syntax and use

5.1.1           Command syntax

In the following sections (and summarised in the Appendix) we give the general syntax of each command.  It is possible to omit one or more arguments in general: in this case, either the user will be prompted for the missing argument(s), or one or more of the argument(s) may be unnecessary and the Checker will proceed without these arguments.

A general rule of thumb is that each command is composed of a command verb, followed by zero or more arguments.  The general format of a command is:

commandverb Arg1 keyword1 Arg2 keyword2 ... keywordN ArgN+1.

Note that, for commands with more than one argument, each argument apart from the first follows a keyword.  If we wish to omit arguments, we may do so by omitting the rightmost one or more arguments, with their preceding keywords. Arguments can only be peeled off from the right: we cannot omit the Nth argument unless the (N+1)th... argument(s) are also omitted.  This is a general principle.

Another general principle about commands and arguments concerns the termination of a command.  A command with no arguments may be terminated by either a full-stop followed by a carriage-return, or by a carriage-return alone: this is the only circumstance in which the omission of the full-stop will not cause the Checker to prompt “|:” and wait for you to type one.  If one or more arguments are typed on the command line, the command verb must be followed by a space character (not a full-stop or a carriage-return, as then the Checker will think it is a command with no arguments!) and the sequence of one or more arguments may then be typed, terminated by a full-stop and a carriage-return.  The arguments-part of a command may be spread over several lines, with carriage-return then acting as a white-space character, so long as you remember to use a space character after the command verb (to indicate to the Checker that the command is to have arguments).  If you use a space character when you had not intended to type any arguments, the empty-list [] as single argument will act as though no arguments were typed.

If, in typing arguments to a command, you make a mistake which means that the argument(s) expression is syntactically invalid, you will be warned that this is the case by the underlying Prolog system within the Proof Checker.  In this case, if the command verb was recognised, you should retype the argument(s) without the command verb.  If, however, the command verb was rejected (by a “Command not recognised.  Please retype” message), you must retype both the command verb and any arguments.

In general, Prolog is case-sensitive.  Command verbs, however, may be typed in any combination of upper/lower case characters and may also be abbreviated as indicated in the Appendix.  Keywords (i.e. argument separators within commands), however, may not be abbreviated and must always be in lowercase only.

5.1.2           Expression syntax

Speaking almost circularly, an expression is a well-formed, fully-instantiated Prolog expression.  FDL expressions, with the changes of syntax described in chapter 3, and with all identifiers in lowercase, are examples of valid Checker expressions provided that, in the case of function applications (e.g. f(x)) there must be no intervening white-space characters between the function name and the opening-parenthesis (so f( x) is o.k., but f (x) is not).

Here are some examples of some simple expressions (assuming appropriate type, constant, variable and function declarations have been made in the FDL text):

a+b*c

to_be or not to_be

f(g(x, y), h(x, y))

x>=2 -> sqr(x)>=4

element(update(a, [i], x), [j])

upf_length(seqrec, fld_length(seqrec)+1).

As well as such simple expressions (FDL arithmetic, boolean, array, record operators and functions are predeclared, as are the operators for set and sequence types as covered in chapter 3), we may also use a where clause, essentially to save typing (and so avoid finger trouble!).  The format of an expression with a where clause is:

EXPR_WITH_VARS where BINDING1 & BINDING2 & ... & BINDINGn.

Here, EXPR_WITH_VARS is like a simple expression, but one or more Prolog variables occur within it.  (A Prolog variable is like an atom (examples of atoms are x, y, a1, cat, dog, to_be, quite_a_long_atom) except it begins with an uppercase letter.)  These Prolog variables are “instantiated” (i.e. given a value, that value itself being a simple expression) by the bindings.  Each binding BINDINGi may take either of two forms:

(1) PROLOG_VAR = SIMPLE_EXPRESSION

(2) HYPORCONC = PATTERN

Format (1) instantiates the Prolog variable PROLOG_VAR to the value SIMPLE_EXPRESSION, which is a simple expression (as above).  Format (2) matches the hypothesis/conclusion HYPORCONC (e.g. h#6, c#1) to the pattern PATTERN, which is like a simple expression, but one or more Prolog variables may occur within it one or more times in place of variable identifiers or constants.  To give you a feel for expressions with “where” clauses, some examples with their actual values are provided below.

 

EXPRESSION

SIMPLE EQUIVALENT

X+X where X=a*b+2

(a*b+2)+(a*b+2)

X=X where h#1=(X=0 or Y)

x*x-4*x+4=x*x-4*x+4

X -> Y where h#1=(X or Z) & Y=(x=2)

x*x-4*x+4=0 -> x=2

Note that, in the above examples, we have assumed the following VC:

H1:  x * x - 4 * x + 4 = 0 or x = 0

-->

C1:  x * x = 2 * x

Finally, note that “where” clauses go around the expression as a whole: it is not possible to have where’d subexpressions of an expression (i.e. (a+(B where B=3))*c is illegal: it should be (a+B)*c where B=3).  The “where” clause can be particularly useful when manipulating large expressions.

5.1.3           Rule specification syntax

In a “replace” or “infer” command, we can specify the rule(s) to be tried (either as one of the arguments on the command line or, if this is omitted, when prompted for it by the Checker).  The rules to be used may be specified by one or more basic rule patterns anded together with &.  Each basic rule pattern may have one of the following forms:

(1) FAMILYNAME(INTEGER)

(2) FAMILYNAME(LOWER-UPPER)

(3) FAMILYNAME

(4) PROLOG_VARIABLE

These forms are decreasingly specific.  The first form specifies a particular rule within a rule family (e.g. arith(4)), the second a range of rules within a family (e.g. arith(3-8) specifies arith(3), arith(4), ..., arith(8)), the third specifies a whole rule family (e.g. arith) and the last means any (apparently) suitable rule whatsoever.  Note that, when this last form is used, there is little point in &ing it with any other basic rule pattern, as all are subsumed by this wildcard rulename.  Thus, possible values for the rules to be used in a “replace” or “infer” command might be:

assoc & commut(1) & distrib(3-4)

inequals(60-80) & transitivity

ANYTHING

5.1.4           Keywords

A number of symbols and atoms are used by the Proof Checker as keywords; of these, four — by, using, with and where — are of a form that might conceivably be used as variables within an FDL text.  (The keyword “from” is an FDL reserved word and so could not be so used.)  These keywords are declared as infix operators, so that they can be used to bind multiple arguments of a command into a single expression, hence our command syntax.  You should therefore be careful about using these keywords in FDL texts which you wish to prove: if you use any of these keywords as variables, in some expressions it may be necessary to write the corresponding variable within parentheses (e.g. sofar=(where)) to avoid confusion over operator precedences.

5.1.5           Notational conventions

The Checker allows you to refer to hypotheses and conclusions directly by the form h#N or c#N, where N is an integer.  Note that the h or c must be lowercase and the # (hash/sharp) symbol is compulsory (since h1 without the # might be a program identifier).  Some commands accept ranges of hypotheses and/or conclusions, which may be specified by h#N-M or c#N-M, where N and M are integers (and N<=M).  Also, these ranges can often be joined together, e.g. h#2-4 & h#7 & h#11-14 is shorthand for hypotheses 2, 3, 4, 7, 11, 12, 13 and 14.

5.1.6           Interrupts

If you type control-C before the Checker has finished reading the .FDL and .VCG (or .CSV) files, it will abort to the operating system level.  If, however, you type control-C during normal use, you should return to the “CHECK|:” prompt.  In this latter case, you should be extremely careful about use of control-C, as you may interrupt at a critical moment when the Checker is updating the VC and destroy vital information.  In addition, it is possible that control-C may leave the Checker in an unusable state, which may result in the loss of one’s proof (though reconstruction via execution of an edited version of the command-log may be possible — see 6.2).  You are therefore strongly advised to avoid use of control-C altogether: it may be used to interrupt a rule search by the “infer” or “replace” commands, but in fact this can be achieved more safely by waiting for twenty matches and then selecting the “abandon search” option when prompted by the Checker (as documented in sections 5.4.1 and 5.5.1).

5.1.7           Calls to the operating system

You may execute a command as though it had been typed at the operating system prompt at any time you see the “CHECK|:” prompt by preceding the command by a “$” (dollar) symbol.  When the command terminates, you will be given a “|:” prompt, but you will still be within the Checker as though a “CHECK|:” prompt had appeared.

5.1.8           Invoking the checker from the operating system

The Proof Checker may be invoked by typing the command “checker” (or an alternative local alias of your choice) on the command line.  If this is typed, but no other information follows on the command-line, the Checker is invoked and prompts for a filename to use (as in version 1.1 and earlier).  It is possible, however, to specify optional qualifiers and the filename on the operating system command-line.  The syntax which should be used is:

checker [ [ -qual1 ] [ -qual2 ] ... [ -qualn ] filename ]

Each qualifier qualn may be one of the following forms:

help

version

command_log=filename

execute=filename

proof_log=filename

resume

plain

overwrite_warning

All of the command-line qualifiers are optional and so may either be present or absent.  Order within qualifiers does not matter.  If given, the filename should be typed without extension (to read in .FDL and .VCG or .CSV files): the appropriate extension(s) will be added according to whether or not the resume qualifier is present.  If, however, one wishes to read in a VC file which has been simplified by the SPADE Automatic Simplifier, the filename should be typed with extension .SIV (the extension used by the Simplifier for its output file) and the resume qualifier should not be used; in this case, the .FDL and .SIV files are read into the Checker to be used for the proof session.  The qualifiers may be abbreviated to their initial letters, however the “=” must not be omitted where required.  Hence -p=a.plg refers to the command proof_log, whereas -p invokes plain. We shall now consider each qualifier in turn.

The help qualifier instructs the Checker to summarise its command-line qualifiers on the screen and exit.

The version qualifier instructs the Checker to report its version details on the screen and exit.

The command_log=filename qualifier may be used to redirect the Checker’s command-logging output (see Chapter 8) to a different file: by default, it is written to name.cmd, where name is the filename from which the VCs were read (i.e. NAME.VCG, NAME.CSV or NAME.SIV as appropriate for this proof session).

The execute=filename qualifier may be used to invoke the Proof Checker in “batch” mode, taking its input from the named file instead of from the user.  Input will be read from the named file and obeyed as though the user had typed the text (the execution being echoed to the screen as usual) until either an exit command is executed and the proof session terminates or the end of the file of commands is reached.  In the latter case, control then reverts to the user, and interactive proof may be commenced from where the batch session has left off.  (Use of this qualifier is similar to the execute command, though this will play from the very beginning, rather than from after the user has selected (interactively) the number of the first VC to be worked on, which is always prompted for when a non-resumed proof session is commenced.)

The proof_log=logname qualifier may be used to specify an alternative filename for the proof log to be generated by the proof session.  By default, when this qualifier is not present, the proof log is written to a file with the same name as the file(s) (either FDL and VCG or CSV) read in, but with extension .PLG.  The filename specified after -proof_log= (or -p=) may either include or omit a file extension.  If the extension is omitted entirely, the default extension .PLG is added to the logname specified.

The resume qualifier allows a previously saved proof session to be resumed.

The plain qualifier affects the output of the .plg file.  The date and time fields are suppressed, and the location of rules referenced is not given in full.  This enables easier comparison of .plg files.

The overwrite_warning qualifier switches on a warning to alert you when old command log or proof log files are deleted.  If the Checker needs to create a command log file and there is already a file in the directory of that name, it will move it.  So, for example, name.cmd is moved to name.cmd-.  If both files are already in the directory, the Checker will delete the name.cmd- file first, then move the existing name.cmd file to name.cmd-, leaving it free to create a new name.cmd file for the current command log.  The same behaviour exists for the proof log.  Overwrite_warning will consult the user before deleting the file rather than deleting it silently.

 

5.2               Proof management commands

5.2.1           The newvc command

This command takes the form

CHECK|: newvc NUMBER.

If NUMBER is present, this VC is copied afresh into the current workspace, otherwise the user is prompted to give a VC number from the list of those loaded but not yet proved.  If no unproven VCs remain, no VC will be loaded.  If you are already working on the proof of a VC, and this has not yet been completed, the Checker will first prompt you to confirm that you wish to abandon the current proof attempt; if you reply yes (case irrelevant, may be abbreviated e.g. to y or Y), the newvc command will proceed (and overwrite the current proof attempt), otherwise you should answer no (case irrelevant, may be abbreviated to n or N) in which case no action will be taken by newvc and you will instead be returned to the Checker command processor and may continue with the current proof attempt.  Whenever newvc loads a new VC, this is listed on the screen.  (When the auto_newvc flag is set to on — see section 5.2.6 — the newvc command is not normally needed, since the Checker automatically performs a newvc each time a VC is proved.)

5.2.2           The consult command

With this command, you can declare a new file of user-defined proof rules to the Checker.  The Checker scans through the file, checking that each rule is well-formed, in accordance with the requirements in chapter 5 of the SPADE Proof Checker Rules Manual. You may either type the command on its own, in which case you will be prompted for a filename, or you may type the filename (in single quotes) as the single argument to this command.  Either way, the filename should be a Prolog atom.  You should enclose the filename, including any extension, in single quotes: no default file extension is assumed.  A non-existent file will result in an error message, but you may then proceed as normal.

Note that this command can also be used in an initialisation file (see chapter 6).

Windows users should note that / should be used as a directory separator instead of \ – e.g. “consult ‘../../rules/mod.rul’” should be used instead of “consult ‘..\..\rules\mod.rul’”

5.2.3           The forget & remember commands

The forget command tells the Checker to cease displaying one or more hypotheses when list is used to view the current VC; it has no effect on the provability of any formula, as the knowledge embodied in the forgotten hypothesis or hypotheses is still retained.  To forget a single hypothesis, e.g. hypothesis number 3, the form

CHECK|: forget h#3.

may be used, while to forget a number of hypotheses simultaneously, a list of hypothesis numbers may be used as the argument, e.g.

CHECK|: forget [2,7,4].

forgets hypotheses 2, 4 and 7 simultaneously.

The remember command may be used, with identical argument patterns, to remember one or more hypotheses previously forgotten through use of the forget command.

5.2.4           The delete & undelete commands

The delete command tells the Checker to cease using one or more hypotheses in attempting to prove things.  The Checker will also cease displaying any deleted hypotheses when list is used.  To delete a single hypothesis, e.g. hypothesis number 3, the form

CHECK|: delete h#3.

may be used, while to delete a number of hypotheses simultaneously, a list of hypothesis numbers may be used as the argument, e.g.

CHECK|: delete [2,7,4].

deletes hypotheses 2, 4 and 7 simultaneously.

The undelete command may be used, with identical argument patterns, to restore one or more hypotheses previously deleted through use of the delete command.

5.2.5           The declare command

The declare command allows the user to make any valid type, constant, variable or function declaration which might otherwise have been included in the .FDL declarations.  The declaration must be entirely in the form of a valid FDL declaration (see SPADE FDL Manual); the word “proof” is not needed, as all declarations made in this way are regarded as proof declarations.

The only permissible format for this command is:

CHECK|: declare.

When you have typed this, you will be prompted for the declaration; you should type it exactly as though it were an FDL declaration, terminating it with a semi-colon rather than the customary full-stop.

In the (somewhat peculiar) eventuality that you should wish to include a comment within such a declaration, the FDL comment delimiters { and } should be used (rather than the Prolog /* and */).

5.2.6           The set command

This command can be used to set any of the user-configurable flags within the Checker to a new value.  It can be used as an interactive command, but its most common usage is within the user’s global/local Checker initialisation file, checker.ini (see chapter 6).  The flags configurable by the user are:

auto_done

This flag affects the behaviour of the done command.  If this flag is on, a done command issued directly by the user will not only check whether or not the given conclusion(s) have been proved, but also (if the command was issued without any arguments) if there are any remaining conclusions within the current proof frame.  If there are no more left, done will exit from the current proof frame automatically, and carry out the same check within the earlier proof frame into which it has returned, and so on until it has exited automatically to as early a proof frame as possible.  With auto_done off, done will only exit one frame at a time, and only when specifically instructed to do so by the user.

auto_infer_from_false

When this flag is set to on then the done command will discharge a VC if one of the hypothesis is false. The default value is on.

auto_newvc

If this flag is set to on (its default value), then whenever a VC’s proof is completed by the user and there are still more VCs left to prove, the Proof Checker will behave as though a newvc command (with no arguments) had been typed, prompting the user to choose which VC to prove next.  When this flag is set to off, newvc must be typed manually to move on to the next VC — this corresponds to the Checker’s behaviour in releases 1.2 and earlier.

command_logging

When this flag is set to off, the current command log-file (if any) is closed.  When this flag is set to on, commands typed by the user are echoed into a log file which may be used to replay the proof session (or parts of it, after editing) subsequently via the execute command or command-line qualifier.  Note that if logging is on, and the user turns it off, the set command_logging to off. command will be the last entry in that log file.  If the user interactively turns logging on via a set command when it was formerly off, the first entry in the new log file is set command_logging to on. to reflect this, even though this command would not logically otherwise appear in the command log.  Note that if logging is turned off then back on, output is not appended to the earlier log, but overwrites the earlier log file. A “$” command to the operating system (see section 5.1.7) may be used to rename the old log file before this happens if it is desired to avoid such naming conflicts.

display_subgoals_max

This flag takes a non-negative integer as its value.  It affects the “verbosity” of the Proof Checker during a rule search.  When a rule match is found, a message displaying the match on the user’s screen will only be shown if the number of unsatisfied subgoals of this match is less than or equal to the value of this flag.  Thus, by giving this flag a low number, only “close” matches will be shown, while by giving it the value zero, only exact matches will be displayed.

display_var_free_only

This flag determines the type of rule match to be displayed.  If it is off, all matches found (whose number of unsatisfied subgoals is less than the value of display_subgoals_max) will be displayed, while if it is on, only those matches whose subgoals are all free of Prolog variables, i.e. for which the match was sufficiently close to instantiate all Prolog variables in the rule, will be displayed.

echo

If echo is on, the contents of the .FDL and .VCG files will be displayed on the screen as they are read; if it is off, no such display will occur.  This flag has no effect when loading a saved VC file with -resume, and clearly there is no point in setting its value interactively (as by the time you are using the Checker interactively, the reading of the .FDL and .VCG files will already have occurred).

indentation_increment

The non-negative integer value of this flag determines the number of additional spaces by which the proof log will be indented on entering a new proof frame.  If it is zero, no indentation will occur.  It is prudent to set the value of this flag only within an initialisation file, as changing it interactively may give a somewhat unusual proof log.

inverse_video

This flag may be set to a list of zero or more integers in the range 1..127.  Each integer will be interpreted as an ASCII character code.  This sequence of characters will be written to the terminal whenever the Checker wishes to employ inverse-video mode.  For DEC VT100, VT200, VT300 and VT400 terminals, the setting of [27,91,55,109] should work.  You may set this flag to another character-code sequence if necessary or, if you wish to inhibit inverse video entirely, use the default setting of [], the empty list.

newline_after_prompts

With this flag set to on, every prompt for user input – either of a command, or of command-arguments or “yes/no” responses – will be followed by a newline (carriage-return).  This is to support the use of the Proof Checker within editor windowing environments, where the editor needs to have visibility of the prompt in order to identify that the next command or response may now be submitted.  The default value for newline_after_prompts, for normal interactive use outside such environments, is off.

normal_video

This flag may be set to a list of zero or more integers in the range 1..127.  Each integer will be interpreted as an ASCII character code.  This sequence of characters will be written to the terminal whenever the Checker wishes to revert from inverse-video mode back into normal-video mode.  For DEC VT100, VT200, VT300 and VT400 terminals, the default of [27,91,48,109] should work.  You may set this flag to another character-code sequence if necessary or, if you wish to inhibit inverse video entirely, use the default setting of [], the empty list.

prooflog_width

This flag may be set to an integer value.  It may be set to zero, in which case proof logging will take less time but the proof log for large formulae may contain lines of up to 512 characters in width.  If it is not set to zero, it must be given a value in the range 80..255 inclusive, and the proof log will then write out formulae in such a way that they wrap around within the specified line width.  When prooflog_width has a value within the range 80..255, however, you may notice slightly longer pauses after commands, especially after a single replace command in which several substitutions have been carried out one after another.  For more complex proofs, it is recommended that you stick to a prooflog_width of zero.  (It is intended to phase this flag out in a future Checker release, leaving the task of proof log wrapping to a separate postprocessor implemented in an imperative language.)

record_consults

This flag determines whether uses of the consult command are recorded in the proof log generated by the Proof Checker.  The default is on, causing each consult command to be logged as a proof step within the proof log file.

replace_more

The flag determines if the user is prompted for more replacements after a replace command. The default value is off.

show_vc_changes

With this flag set to on (the default), any changes to the hypotheses and/or conclusions of the current VC resulting from the execution of a Checker command are displayed on the screen.  Thus, with this flag on, execution of a successful infer command will display on the screen the new hypothesis which is added to the VC by the inference.  Equally, using a simplify, or a replace-all  command will show each hypothesis/conclusion which is modified in some way by execution of the command.  If this flag is turned off, these changes are not echoed to the screen and the Checker’s behaviour reverts to its less verbose version 1.2 and earlier behaviour.

simplify_during_load

If this flag is on, the Checker will simplify each hypothesis/conclusion as it is read in from the .VCG file.  If it is off, the file reading will be quicker but no simplification of the VCs will have taken place.  (This flag has no effect if a .CSV saved VC file is loaded with the resume qualifier.)

simplify_in_infer

If this flag is true, the Proof Checker will attempt to use its built-in expression simplifier whenever it uses its inference engine.  The result is that, if it is on, the inference engine will be more powerful, but perhaps slower (typically by around 10 to 20 per cent, though this may vary markedly depending upon VC and expression complexity).  If it is off, the inference engine will, conversely, be faster but dumber, e.g. the goal element(update(a, [i], x), [i]) = x will succeed if this flag is enabled, but fail if it is not.

typechecking

It is strongly recommended that this flag should always be on.  It affects the actions of the Checker when satisfying rules.  Its effect on rule searches is that if it is on, only rules which typecheck properly (i.e. with all subgoals in the conditions list of type boolean and with the goal to be inferred of type boolean for an inference rule or, for a substitution rule, with the left and right hand sides of the rule typechecking to the same type) will be used, while if it is off, any rule whose conditions can all be satisfied will be used, though even with it off the Checker will not use integer-only inequalities rules to infer real inequalities.  With typechecking off, there is a marginal increase in speed, but a risk of violating the type constraints inherited from FDL.  Given the efficiency of the typechecker, this marginal speed benefit is unlikely to justify setting this flag to off except perhaps when handling extremely large formulae.

typechecking_during_load

This flag defaults to on.  However, if you are confident that all of the formulae in the .VCG file being read will typecheck correctly to boolean (because the file was generated by the SPADE VCG and has not been altered in any way, for instance), you may set this flag to off in the checker.ini file. The VC-reader will then operate more quickly, as it will not check that each hypothesis and conclusion typechecks to boolean.  If you use this with a .VCG file which does not typecheck validly, you may end up with VCs which you cannot manipulate properly, since other Checker commands will also try to enforce typechecking and may then fail.  The value of this flag is only relevant when reading a new .VCG file and so should only be set in a checker.ini file: there is no point setting it interactively, and it will have no effect when restoring a saved image (.CSV) file with the resume qualifier.

use_subst_rules_for_equality

If this flag is set to on (the default setting), the infer command will also be able to use substitution rules to attempt to satisfy formulae of the form X=Y, as described in section 5.4.1 below.  If you set it to off, the infer command will only use inference rules and will ignore substitution rules, thus reverting to its old (version 1.0/1.1) behaviour.

5.2.7           The show command

The show command displays the current settings of the various user-modifiable flags discussed in the previous section.  It also gives statistics indicating the amount of CPU time (and garbage collection time) used so far, the amount of memory used and the amount of memory currently free.

5.2.8           The execute command

The execute command may be used to “replay” a sequence of commands and responses recorded in a separate file.  This file may be a (pre-edited section of a) command-log file generated from a previous proof session, or it may be generated entirely manually, though in the latter case it is difficult to achieve a correct file, as one must correctly anticipate all replies requested by the Proof Checker and include these at the appropriate places in the file to be executed.

The execute command takes a single argument, which is the name of the file to be executed.  This should be typed in full, within single-quotes (as with the consult command, section 5.2.2).  This argument may either be typed on the command-line, after the “execute” command, or it may be omitted from the command-line, in which case the Proof Checker will prompt for the name of the file to be used.

As noted in section 8.2, command-files may themselves contain further execute commands which, when encountered, will be executed “in-line” before execution of the enclosing command-file is resumed.  Suppose,  for instance, that file A.CMD contains a call to execute file B.CMD (among other things); then issuing an execute ‘A.CMD’ command results in the following execution messages as the commands of each are replayed:

CHECK|: execute 'A.CMD'.

<<< Commencing command script A.CMD >>>

...

<<< Commencing command script B.CMD >>>

...

<<< End of command script B.CMD >>>

<<< Resuming command script A.CMD >>>

...

<<< End of command script A.CMD >>>

<<< Returning to interactive input >>>

CHECK|: _

Refer to section 8.2 and the command_logging flag of subsection 5.2.6 for more on command logs and the replaying of command scripts.

5.2.9           The save command and resume qualifier

The save command may be issued to save a copy of the current VC state to disk.  It will be saved with the same filename as you typed at the start of the current proof session, with the file extension .CSV.  This command requires no arguments and thus takes the form:

CHECK|: save.

You may resume a saved proof session by starting the Proof Checker with the resume command-line qualifier.

5.2.10       The exit command

The exit command terminates the current session with the Checker and returns the user to the operating system.  It may not be abbreviated.  If there are still unproven VCs remaining from the current proof session and non-trivial commands have been issued since the last save command (if any), the Checker will prompt first before exiting:

There are still VCs to prove and you have not “save”d recently.

Do you still wish to exit (yes/no)? _

to which you should type yes (or y) if you wish to exit immediately, or no (or n) if you wish first to issue a save (or any other) command before exiting back to the operating system.

5.2.11       The forceexit command

The forceexit command acts in the same way as the exit command, but there is no prompt before exiting.  In addition the forceexit command can be used at the Filename? prompt.

5.3               Listing & viewing commands

5.3.1           The list command

This command, without any arguments, lists the current visible hypotheses and conclusions of the VC in numerical order.  It can also be used with an argument, however.  In this case, the argument must be one or more hypothesis or conclusion ranges and/or keywords forgotten and deleted &’ed together, e.g.

CHECK|: list h#1 & c#1-4 & h#3-7 & c#6 & deleted.

Hypotheses which have been forgotten or deleted with the forget and delete commands respectively (q.v.) are not normally visible with the standard list command.  They may, however, be viewed with list by using the keywords forgotten and/or deleted as arguments to list.  These keywords may be abbreviated as required (though lowercase only should be used), and may also be chained together with themselves and other hypotheses/conclusions as in the above example.

5.3.2           The printvc command

The printvc command takes no arguments and may be used to print out the current VC to a file. This can be particularly useful when in a complex proof attempt and/or with a large VC, since list information is quickly lost from a VDU screen but the VC may have changed significantly from that present in the .VCG file during the proof.

The printvc command works by creating a file called VCn.LIS in the current default directory (to which the user must have write access), where n is the VC number.

The VC listing printed by printvc contains the information which would be displayed on the terminal by list, plus a listing of all forgotten and deleted hypotheses and a summary of the current proof state: how many steps have been taken (the step number used in the proof log), what proof frame depth the proof attempt is at and a traceback of all previous proof frames.

5.3.3           The traverse command

This command may be used to “traverse” an expression tree, thus providing a means of exploring complex expressions (typically hypotheses or conclusions, or components thereof) interactively, to determine their main operators/functions (“functors”) and arguments.

As an example, consider the expression


- (max15 + 1) <= wordtoint(a) + wordtoint(b) and wordtoint(a) +

wordtoint(b) <= max15 and wordtoint(indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a) + wordtoint(b) or wordtoint(a) + wordtoint(b) < - (max15 + 1) and indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) = hextoword(hex_8000) or wordtoint(a) + wordtoint(b) > max15 and indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) = hextoword(hex_7fff).


It takes some effort to determine that this expression is a disjunction (... or ...), and some knowledge of operator predecence and associativity to determine which or is the “principal functor” (i.e. topmost operator or function of the expression-tree) of the expression.

To make the handling of such large expressions easier, the traverse command provides facilities for examining the principal functor and arguments of an expression, and to navigate around components of the expression tree examining these in turn.  To enter the traverse command, one types the command

CHECK|: traverse.

or

CHECK|: traverse EXPR.

where EXPR is h#N to traverse hypothesis N, c#N to traverse c#N, or some other expression to be traversed (such as X where h#2=(X or Y), for instance).  One is then in the traverse “environment” and may type traverse environment commands to navigate around the expression and its subexpressions.  Commands which may be typed to the traverse tool are single letters (though, in the case of d (down), a positive integer must follow the command to indicate which argument is to be descended.  (The only exception to this is when there is only a single argument of the current functor, in which case the 1 may be omitted).  More than one command may be typed at a time in answer to the

Traverse-Command>>|:

prompt, and these will be evaluated in left-to-right order.  Spaces, tabs and full-stops may be typed before, after and/or between commands; these are ignored, as is case: the traverse environment is not case-sensitive.

The traverse commands which may be issued are:

 

U or -

UP (ascend expression tree a level)

Dn or +n

DOWN (descend argument n; n may be omitted if only 1 argument)

h or ?

HELP (display brief summary of traverse environment commands)

s

SHOW (displays toplevel expression being traversed)

l

LOCATE (displays location of current subexpression in toplevel expression)

t

TYPE (displays typechecker information about current subexpression)

r

REDISPLAY (redisplays current subexpression information)

x

EXIT (exit from traverse environment back to Checker command processor)

 

Locate l may optionally be followed by b (brief) or f (full); “brief” is the default.  “Locate brief” abbreviates other subexpressions of the toplevel expression to ... in the display of current position, while “full” does not.  To illustrate these different commands, consider the following example dialogue:


CHECK|: traverse c#1.

Depth: 0,       Trace: []

*** - (max15 + 1) <= wordtoint(a) + wordtoint(b) and wordtoint(a) +

wordtoint(b) <= max15 and wordtoint(indstore(r1, address_a,

addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a)

+ wordtoint(b) or wordtoint(a) + wordtoint(b) < - (max15 + 1) and indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) = hextoword(hex_8000) or wordtoint(a) + wordtoint(b) > max15 and indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) =

hextoword(hex_7fff)

Principal functor: or  (2 arguments)

ARG1: - (max15 + 1) <= wordtoint(a) + wordtoint(b) and wordtoint(a) + wordtoint(b) <= max15 and wordtoint(indstore(r1, address_a,

addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a)

+ wordtoint(b) or wordtoint(a) + wordtoint(b) < - (max15 + 1) and indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) =

hextoword(hex_8000)

ARG2:  wordtoint(a) + wordtoint(b) > max15 and indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) = hextoword(hex_7fff)

Traverse-Command>>|: d1

Depth: 1,       Trace: [1]

*** - (max15 + 1) <= wordtoint(a) + wordtoint(b) and wordtoint(a) +

wordtoint(b) <= max15 and wordtoint(indstore(r1, address_a,

addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a)

+ wordtoint(b) or wordtoint(a) + wordtoint(b) < - (max15 + 1) and indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) = hextoword(hex_8000)

Principal functor: or  (2 arguments)

ARG1:  - (max15 + 1) <= wordtoint(a) + wordtoint(b) and wordtoint(a) + wordtoint(b) <= max15 and wordtoint(indstore(r1, address_a,

addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a)

+ wordtoint(b)

ARG2:  wordtoint(a) + wordtoint(b) < - (max15 + 1) and indstore(r1,

address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))) = hextoword(hex_8000)

Traverse-Command>>|: d1

Depth: 2,       Trace: [1, 1]

*** - (max15 + 1) <= wordtoint(a) + wordtoint(b) and wordtoint(a) +

wordtoint(b) <= max15 and wordtoint(indstore(r1, address_a,

addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a)

+ wordtoint(b)

Principal functor: and  (2 arguments)

ARG1:  - (max15 + 1) <= wordtoint(a) + wordtoint(b) and wordtoint(a) + wordtoint(b) <= max15

ARG2:  wordtoint(indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a) + wordtoint(b)

Traverse-Command>>|: l

*HERE* or ... or ...

Traverse-Command>>|: d2

Depth: 3,       Trace: [1, 1, 2]

*** wordtoint(indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b)))) = wordtoint(a) + wordtoint(b)

Principal functor: =  (2 arguments)

ARG1:  wordtoint(indstore(r1, address_a, addword(indread(r1, address_a, a), indread(r2, address_b, b))))

ARG2:  wordtoint(a) + wordtoint(b)

Traverse-Command>>|: t

Type information for function =:

integer = integer: boolean.

Traverse-Command>>|: d2

Depth: 4,       Trace: [1, 1, 2, 2]

*** wordtoint(a) + wordtoint(b)

Principal functor: +  (2 arguments)

ARG1:  wordtoint(a)

ARG2:  wordtoint(b)

Traverse-Command>>|: t

Type information for function +:

integer + integer: integer.

Traverse-Command>>|: d1

Depth: 5,       Trace: [1, 1, 2, 2, 1]

*** wordtoint(a)

Principal functor: wordtoint  (1 argument)

ARG1:  a

Traverse-Command>>|: t

Type information for function wordtoint:

wordtoint(word): integer.

Traverse-Command>>|: d

Depth: 6,       Trace: [1, 1, 2, 2, 1, 1]

*** a

Principal functor: a  (atomic object: leaf of expression tree)

Traverse-Command>>|: l

 ... and ... = wordtoint(*HERE*) + ... or ... or ...

Traverse-Command>>|: u

Depth: 5,       Trace: [1, 1, 2, 2, 1]

*** wordtoint(a)

Principal functor: wordtoint  (1 argument)

ARG1:  a

Traverse-Command>>|: x

CHECK|:


5.3.4           The help command

This command may be used to display either a general description of the commands and explanations available, or a specific explanation of a particular command.  For general advice, the command is

CHECK|: help.

while for explanation of a particular command, e.g. infer, the format is

CHECK|: help infer.

The command on which help is required may be abbreviated in the same way as on the command-line; the only difference is that the command must commence with a lowercase letter (though both lower- and uppercase may be used thereafter).

As well as help text on Checker commands, you can also access helpful details about the built-in rule families.  For general advice about the rule families, type

CHECK|: help rules.

(rules may be abbreviated to rul) while for specific advice about a particular rule family xxx, use the form

CHECK|: help rules = xxx.

Rule family names may also be abbreviated in many instances, provided the help command can determine which rule is to be described.  Thus, the standardisation rule family may be abbreviated to stan (since no other built-in rules begin with these letters).  If in doubt, try a sensible abbreviation.  Finally, if you type help xxx and the Checker does not recognise xxx as an abbreviation for a command, help will search its rule-library help as though you had typed help rules = xxx, thus increasing the chances of finding some suitable information.

If a block of help-text to be displayed for a particular command or rule family exceeds 20 lines in length, it is displayed in pages of 20 lines at a time, with a pause between each page.  To continue from the pause, you must hit the Return key, as prompted on screen.

5.4               Inference commands

5.4.1           The infer command

The infer command, invoked by giving the Checker a goal such as

CHECK|: infer x+y>0 using transitivity from [2,3].

searches for a rule satisfying the user’s constraints which can be used to infer the goal specified by the user directly or, failing such success, for such a rule which may be used to “subgoal”, should the user so desire.

If the optional from HYPLIST clause is present in the command, only the hypotheses listed may be used to satisfy any of the subgoals of any rule which matches (i.e. the Checker behaves as though all those hypotheses not listed have been temporarily deleted with the delete command (q.v.)).  The rulename may be any standard rulename specification and the expression to be inferred may be any standard expression specification.  If either of these arguments is omitted (note: the expression can only be omitted if all other arguments are omitted, as required by the left-right convention), the user will be prompted for the missing argument(s).

Once the user has specified the goal formula and the rulename pattern, the Checker searches through the rules database, finding rules with both a rulename matching the specified pattern and a goal pattern matching the given goal formula.  For each match found, the Checker then attempts to satisfy the list of conditions associated with the rule.  Either it succeeds for one of these rule matches, in which case we get a message such as

*** x>0 by transitivity(24)

or no such match occurs.  If the Checker finds 20 rule matches, none of which fully succeeds, it will prompt the user:

%%% TWENTY MATCHES FOUND: Do you wish to continue the search?

    Type Y(es) to continue search, N(o) to abandon it...

 

    Continue search (yes/no)? _

If you wish to continue, type yes, otherwise type no and further searching will be abandoned.  (Yes/no replies may be abbreviated, and the case does not matter: a full-stop terminator is not essential, as the carriage-return terminates the reply.)  If yes is selected, the search will continue, but it will again pause in this way if yet another twenty matches are found; this continues until the search either terminates (with a successful match, or through exhaustion of possible matches) or is abandoned.

If the search finds matches which are fully-instantiated, but none of which succeed without unproven subgoals, the user is asked:

Do you wish to subgoal (yes/no)?

and must reply either yes or no.  In the case of a no reply, the attempted inference is abandoned; in the yes case, the Checker displays each possible rule match in turn, together with the list of conditions which cannot be satisfied directly by the Checker at present, and asks whether or not we wish to use the given instantiation to subgoal, until either there are no more matches, in which case the attempted inference is abandoned, or the user answers yes, in which case the Checker enters a new proof frame in which the unsatisfied conditions of the rule selected are added as conclusions; if the user subsequently proves all of these subgoals, the frame is exited and the formula on which the user subgoaled is added as a new hypothesis to the VC.  Only rule matches which give a list of conditions without any uninstantiated Prolog variables are allowed for subgoaling: we are not permitted to introduce a “wildcard”, in the form of a Prolog variable, into the VC.

As an example of subgoaling, consider the VC

H1:  x > 0

H2:  y > 0

-->

C1:  (x * x) * y > 0.

An attempt to infer C1 directly from the VC as it stands might lead to the following interaction:

INFER -- Formula? c#1.

Rulename (or X)? inequals.

 ...

Do you wish to subgoal? yes.

 ...

inequals(63): (x * x) * y > 0 may_be_deduced_from [x * x > 0]

Use this rule (yes/no)? yes.

CHECK|: list.

H1:  x > 0

H2:  y > 0

-->

C1:  x * x > 0

and we may then infer C1 from H1, in which case we have proved our original conclusion, namely that (x*x)*y>0.

With the use_subst_rules_for_equality flag (described in section 5.2.6 above) set to on (its default value), infer will also attempt to make use of substitution commands when given a goal of the form X=Y to infer.

5.4.2           The deduce command

The deduce command, referred to earlier in this chapter, uses a truth table approach to logical deduction.  It may be invoked by the goal

CHECK|: deduce c#4 from [2,5,6].

If one or more arguments are omitted, the user is prompted for it/them.  As an example of deduce in operation, consider the VC

H1:  x > 0

H2:  x > 0 -> f(x) > 0

-->

C1:  f(x) > 0

we can use deduce by:

CHECK|: deduce.

DEDUCE -- Type formula to deduce.

DEDUCE -- Formula? c#1.

Hypotheses to be used? [1,2].

*** x>0 and (x>0 -> f(x)>0) -> f(x)>0 by logical deduction

*** PROVED C1: f(x)>0

but if instead we have

H1:  x > 0

H2:  x >= 0 -> f(x) > 0

-->

C1:  f(x) > 0

we find

CHECK|: deduce.

DEDUCE -- Type formula to deduce.

DEDUCE -- Formula? c#1.

Hypotheses to be used? [1,2].

*** FAILED.

This is because, in the former case, the formula passed to the truth table deduction tool is

(A and (A -> B)) -> B,

which is logically true for any formulae A, B, whereas in the latter case, the formula is

(A and (B -> C)) -> C,

which is not logically true: the failure is because the truth table tool does not make any logical connection between the distinct atomic formulae x>0 and x>=0.  We can only use logical deduction in this latter case by first inferring a new hypothesis

H3: x >= 0

from the existing hypothesis

H1: x > 0

and then using deduce with [2,3] (or even [1,2,3], though this is less efficient) as the list of hypotheses to be used in the logical deduction.

5.4.3           The forwardchain command

This command is especially useful when a call to the standardiser has introduced an implication or logical equivalence into the VC.  It attempts to fulfil the conditions on the left of an implication (or, in the case of a logical equivalence, on one or other side) and, if successful, asks the user if the expression on the other side of the relevant logical connective should be saved as a new fact in place of the previous implication (or equivalence).  The command may either be issued with an argument, as in

CHECK|: forwardchain h#7.

 or

CHECK|: forwardchain c#2.

or it may be issued direct, without arguments, as in

CHECK|: forwardchain.

in which case the user is prompted to specify which hypothesis or conclusion is to be used.  If a conclusion is chosen, the Checker looks for a hypothesis in which this conclusion is on the right of an implication (or on either side of an equivalence), while if a hypothesis is selected, it must be an implication or equivalence.

5.4.4           The done command

The done command checks to see if any of the current conclusions of the VC can be inferred using the Checker’s inference engine and, if so, these conclusions have been proved in the prevailing circumstances (e.g. taking into account any active proof by cases or other proof frames).  The command form

CHECK|: done.

will try each conclusion of the VC and, on finding a successful match will report this, while the more specific form

CHECK|: done c#3 & c#7-11.

checks only the conclusion numbers given (in this example, C3, C7, C8, C9, C10 and C11).  This command may be called by the user when required, in which case either a successful match will be found and reported, or the Checker will report

*** Cannot deduce any conclusions directly at present

or the Checker may implicitly call this command itself, after a successful call to another Checker command, to update its knowledge of the current state of the VC.  If the auto_done flag is on (see the set command), done without any arguments will be more helpful; if it is off, you may have to type done more than once to exit from nested proof frames.

5.5               Substitution & simplification commands

5.5.1           The replace command

The replace command may be used to replace a subexpression of one of the current hypotheses or conclusions by another expression, provided that the replacement may be justified by one of the replacement rules in the rules database.  The user may supply patterns for the appropriate expressions and may also supply a specification of the name(s) of the rule(s) to be used.

The command may be invoked on the appropriate hypothesis or conclusion by a command such as

If any of the arguments are omitted, the user is prompted to supply them by a sequence of questions and answers; we shall consider a typical sequence when working on C1 of the VC

H1:  k = i + 1

H2:  k = x * y

-->

C1:  gcd(k, i + 1) = 2 * k - x * y.

After invocation, we are prompted

OLD EXPRESSION: gcd(k, i + 1) = 2 * k - x * y

Pattern?

(where the old expression is the current value of the selected hypothesis or conclusion) and must type the pattern of the subexpression to be replaced.  If we select k, we are asked first to confirm our choice:

Subexpression is k

Change this subexpression (yes/no)?

and may type yes to proceed, or no to stop.  (In the replace command, as elsewhere, yes/no answers may be given in any mixture of upper- and lower-case, and may be abbreviated e.g. to Y or N.  Additionally, a full-stop terminator is not necessary for such answers, as the carriage-return suffices.)  If there had been more than one distinct possible matches to the pattern we specified (e.g. if we had typed A*B there would have been two matches: 2*k and x*y), we would first have been prompted to decide which particular instantiation of the pattern is to be used.

The next step, in this case, is to determine which occurrence(s) of k are to be replaced: this is only necessary when the selected subexpression occurs more than once in the main expression.  We are prompted

Change which occurrence (number/none/all)?

and may reply with a number (in this case, either 1 or 2) to specify which occurrence, reading from left to right, of the subexpression is to be replaced, or none to stop or all to replace every occurrence of the subexpression simultaneously.  Let’s suppose that we reply all on this occasion;  we must now specify the new subexpression pattern and a pattern for the rulename of the rule to be used to justify the replacement.  First, the Checker prompts

Type new subexpression pattern:

and we may type a pattern, as before.  The more instantiated the pattern, the fewer potential replacements will be found, but this will speed up the search to some extent also.  In this case, let’s type X as the most general possible pattern.  Next, we are asked

By which rule?

and may type a rulename pattern, in much the same way as for the infer command.  If we type eq in our example, specifying that we wish to do an equality substitution (as rule eq(1) allows one of two equal but distinct objects to be replaced by the other), we shall then be presented with the following choice:

Possible replacements for k are:

    1. i + 1

           according to rule eq(1)

    2. x * y

    according to rule eq(1)

Select (number/none):

and may type either a number (either 1 or 2 in this case) to specify which option we wish to use, or none to stop the attempt.  If, in our example, we type 2, we shall see

NEW EXPRESSION: gcd(x * y, i + 1) = 2 * (x * y) - x * y

Is this OK (yes/no)?

and may type yes to proceed with the replacement, or no to abandon it.  Finally, whatever we type, we are offered the opportunity of further replacement attempts by the prompt

Replace more (yes/no)?

and may reply as required.

During rule searches with the replace command, every time more than twenty rule matches are found the Checker will pause and ask the user whether or not to continue the search, as also happens with rule searches conducted with the infer command (section 5.4.1).  Select yes to continue with the search, which may find more matches, or no to stop searching, in which case only those successful matches found up to that point will be considered further.

An additional way in which the replace command may be used is to perform a replace-all to replace all occurrences of a particular expression in all (undeleted) hypotheses and conclusions of the current VC by some replacement, or even in a specified set of current hypotheses and conclusions.

If, for example, you wish to replace every occurrence of the expression x*1 in the current VC by x, you can do so by typing a command such as:

CHECK|: replace all: x*1 by x using RULE_PATTERN.

As another example, suppose one has a hypothesis

H4: c = a1 * m + b1 * n

and one wishes to replace all occurrences of a1*m+b1*n by c, except in H4 (which would render it redundant and make it impossible to undo the substitutions).  This could be done, for instance, by:

CHECK|: replace h#1-3 & h#5-12 & c#1-6: X+Y by c using eq.

It should be noted that all arguments must be specified with the all or multiple hypotheses/conclusions: you may not omit the old expression, new expression or rulename parts from the actual replace command typed on the Checker command line.  In addition, the old expression must be at least partially-instantiated: it may not consist of a single wildcard (though X+Y in the above example is acceptable, as it is partially-instantiated, to an expression whose main operator is +).  Both the new expression and the rulename pattern may be wild and/or partially specified, as normal in the standard use of the replace command.  Where the old pattern is only partially-instantiated, as above, matches to the pattern are found by comparison with all current hypotheses and conclusions of the VC (except hypotheses which have been deleted).

Note that, on finding a successful replacement, the replace command will check that you wish to proceed (or, if there is more than one possible substitution, ask which one you would like to use) and, if so, replace all occurrences of the chosen expression by the new expression in every hypothesis and conclusion specified (either all, or those listed on the command-line), apart from any deleted hypotheses.  Thus, if you do not want the substitution performed on particular hypotheses, you can prevent it either by using the list of hypotheses and conclusions approach, as above, or by deleting them (with delete — see section 5.2.4), applying the replace-all, then using undelete to get the unmodified hypotheses back.  If you do not want the substitution performed on particular conclusions, this is best achieved using the list approach above, though you can also achieve this (provided you don’t mind deferring their proof until the other conclusions have been proved) by using

CHECK|: infer OTHER_CONCS_CONJ using inference(1).

and subgoaling on this.  In this command, OTHER_CONC_CONJ should be a conjunction (... and ... and ...) of all of the conclusions to which you DO want the substitution applied (e.g. X and Y and Z where c#2=X & c#3=Y & c#5=Z); after entering the subgoaling proof frame, only these conclusions will be present as goals and you can then apply the replace-all.  On proving the conclusions, you can then exit successfully from the proof-frame back to the previous level, where you will then only have the other conclusions (to which you did not want the replace-all applied) left to prove.

One final thing to note about the replace command is that, should you ask for a list of subexpressions matching some pattern containing wildcards, you may on occasion see dummy identifiers starting with a dollar symbol ($1, $2, etc.) as subexpressions of expressions containing quantifiers.  This is because such bound variables are renamed by the replace command before and after substitution to prevent variable capture or unwanted substitution of bound variables; this is not a cause for concern, merely an additional safeguard to prevent the user from compromising the Checker’s underlying logic.

5.5.2           The standardise command

This command may be used to convert a current hypothesis or conclusion into standard form, or to standardise a user-supplied expression. Once calculated, the standardised expression is displayed, and the user has the choice of whether or not the new expression should be saved.  If the user opts to save the standardised expression, it either overwrites the old expression (if the expression which has been standardised was specified to be an existing hypothesis or conclusion) or it is added as a new hypothesis, e.g. saving the fact that i-1+1 has the standard form i results in the addition of a new hypothesis i-1+1=i.  If a user-supplied boolean expression was given, the result will be saved either by saving the original expression (if the standard form is true) or by adding the new hypothesis OLD_EXPR <-> NEW_EXPR.

The command may be invoked in a number of ways.  To standardise an existing hypothesis or conclusion, a command such as

CHECK|: standardise h#5.

CHECK|: standardise c#3.

may be used.  If we wish to standardise an expression without modifying any existing hypothesis or conclusion, however, we may use the form

CHECK|: standardise i-1+1.

CHECK|: standardise x>=0 -> x+2>1.

in which the desired expression is given as the argument.  The type-checker will be used to ensure that the expressions given are correctly typed.  Note that the expressions given must be fully instantiated: inclusion of a Prolog variable will cause the standardisation attempt to be rejected.

After a successful call to the standardiser, via one of these invocations, the user is prompted by a message such as:

OLD: x>=0 -> x+2>1

NEW: true

Shall I save this result (yes/no)?

and may reply either yes to save the result (in this case, as a new hypothesis) or no to inhibit the addition of the result to the current VC (using abbreviations, e.g. y, n if desired).

Note that the standardiser is particularly useful on integer, real and relational expressions and their equivalences, but less powerful on other data types at present.  For these other data types, the simplifier (which may be invoked via the built-in simplify family of rewrite rules) is often just as good, and much faster.  Do not use the standardiser with monolithic expressions, as it will take a very long time and probably get you nowhere (except, perhaps, out of memory).

5.5.3           The simplify command

This command “simplifies” a VC by applying a number of different transformations to the VC.  These transformations are basically:

·          simplify and move negation as far in as possible, e.g.

H: not (a or b)         -->     H: (not a) and (not b)

H: not not a            -->     H: a

H: not a>b              -->     H: a=<b

·          split conjunctions up, e.g.

H: a and b              -->     H: a

                                H: b

C: a and b              -->     C: a

                                C: b

·          forward chain from implications and equivalences, e.g.

H: a -> b

H: a            give a new      H: b

H: a -> b

H: not b        give a new      H: not a

(and similarly both ways for equivalences).

This command is useful for hypotheses and conclusions introduced, for instance through subgoaling, which can be simplified in this way to make them useable. Its syntax, very simply, is

CHECK|: simplify.

5.6               Proof by cases commands

5.6.1           The prove ... By cases command

To commence a proof by cases attempt, the user may employ the prove by cases command.  The command may either be typed with an additional argument, as in

CHECK|: prove c#3 by cases on h#2.

which tells the Checker to use hypothesis 2 (which must be a disjunction of two or more formulae) as the case-generating formula to use in attempting to prove c#3, or

CHECK|: prove c#3 by cases on x<0 or x=0 or x>0.

which tells the Checker to check that the given disjunction formula follows from the current hypotheses of the VC and, if so, to introduce it as a new hypothesis and use it as the case-generating formula, or it may be typed without the disjunction argument, as in

CHECK|: prove c#3 by cases.

in which case the user is prompted for the case-generating formula by

Type hypothesis no. or cases formula...

CASES>

and should type either an integer for the hypothesis number or a valid case-generating formula.  Once in a proof by cases, the case, status and quit commands can be used, and the operation of the done command will be modified to take the circumstances into account.

It is important to get the message PROVED ... FOR CASE n and allow the Checker to move you to another case.  Furthermore, if auto_done is off (see the set command), you may need to type extra done commands to exit successfully from a proof by cases.  It is always wise to try an extra done rather than risk throwing away what you have just proved by not doing so.

5.6.2           The case command.

This command can only be used with an argument, e.g.

CHECK|: case 2.

It tells the Checker which case number of the cases at the current depth in a proof by cases to consider.  When the prove ... by cases command is issued, the Checker automatically begins by considering case 1; then, when the conclusion is proved for case 1, the Checker automatically moves on to case 2, and so on until all cases have proved the conclusion.  The user can override this sequence with the case command, however, moving back to an earlier case or forward to a later case, as desired.  (It can lead to confusion of what has been done, however, so we recommend you use this command sparingly.)

5.7               Quantifier manipulation commands

5.7.1           The unwrap command

This command may be applied to either a hypothesis or a conclusion which is of one of the forms for_all(V:TYPE, FORMULA) or for_some(V:TYPE, FORMULA). It may either be typed on its own, as

CHECK|: unwrap.

or with an argument, e.g.

CHECK|: unwrap h#3.

In the former case, the user is prompted to specify which hypothesis or conclusion is to be unwrapped.  If successfully applied to a hypothesis, this command generates a new hypothesis with this outermost quantifier stripped off and an appropriate new unique variable or universal variable introduced, whereas if it is successfully applied to a quantified conclusion, it generates one or more conclusions to be proved and enters a new proof frame in which these are the only conclusions.  See chapter 3 of this documentation for more details of the way in which quantifiers are handled by the Checker.

There are circumstances in which the command will fail: namely, should the user attempt to unwrap a universally-quantified conclusion, or an existentially-quantified hypothesis, before instantiating all universal variables present in the formula.  As an example of the use of unwrap, consider the VC:

H1:  for_all(x : integer, f(x) >= 0)

-->

  C1:  for_all(a : integer, for_all(b : integer, f(a + b) >= 0)).

If we unwrap H1, with the command

CHECK|: unwrap h#1.

we get a new hypothesis

H2:  f(int_X_1) >= 0.

Applying the unwrap command to C1, we first enter a new proof frame with

C1:  for_all(b : integer, f(int_a_1 + b) >= 0)

as its only conclusion; applying

CHECK|: unwrap c#1.

once more, we now get

  C1:  f(int_a_1 + int_b_1) >= 0.

Finally, we use the instantiate command (see next subsection) to instantiate  the universal variable int_X_1 in H2 to int_a_1+int_b_1, which succeeds because they are of compatible types, and done will then spot that we have completed the proof of the VC.

5.7.2           The instantiate command

Given a universal variable (see chapter 3) introduced by stripping a universal quantifier from a hypothesis or an existential quantifier from a conclusion, the proof of a VC will often entail the instantiation of this universal variable with a particular instance, an expression of the appropriate type.  For example, in the preceding subsection, the proof was concluded by spotting that the universal variable int_X_1, when instantiated with the particular instance int_a_1 + int_b_1, matched up with the conclusion and thus concluded the proof.

The command may either be issued without any arguments, thus:

CHECK|: instantiate.

or with up to two arguments, specifying the universal variable to be instantiated and the expression with which it is to be instantiated, e.g.

CHECK|: instantiate int_X_1 with 42.

The Checker should spot when there is only one universal variable in the current VC and save you the job of choosing from a list of one, if the universal variable argument is omitted; otherwise, the user is prompted to supply any arguments not specified in the command.

The Checker checks that the expression which is to be used to instantiate the chosen universal variable has a compatible type and, if so, succeeds.  If a different instantiation is instead required, the user must either generate the same hypothesis again by a similar unwrapping, or quit from the current proof frame and reenter so that the same subgoal is generated afresh.

5.8               Other proof frame commands

5.8.1           The prove ... by contradiction command

The prove by contradiction command may be used to attempt to prove a conclusion of the VC by contradiction.  The command must be typed with a single argument, the conclusion to be proved by contradiction.

The Checker enters a new proof frame, one deeper than the current depth, and retracts all the conclusions of the VC, introducing instead the single conclusion

C1:  false.

It also adds the negation of the conclusion selected for a proof by contradiction as a new hypothesis.  If you succeed in establishing false, given the assumption that the desired conclusion is false, you have established the proof by contradiction and the proof frame is exited successfully, with the selected conclusion now established.

5.8.2           The prove ... by implication command

This command may be applied to any conclusion of the current VC which has the form A -> B (i.e. -> as its principal functor).  It allows the user to attempt to prove this conclusion by entering a new proof frame in which the formula to the left of the implication (i.e. A) is assumed to hold and the formula to the right (namely, B) must then be shown to be true also.  It can only be applied to a conclusion, not to a hypothesis, so it requires only a single argument: the conclusion.

An example of the command with argument is:

CHECK|: prove c#2 by implication.

to prove conclusion 2 by implication.

5.8.3           The prove ... by induction command

A simple form of proof by mathematical induction over the integers is currently available in the Checker.  Proof by induction may only be called on a single conclusion and will enter a new proof frame.  The induction may only be over a well-founded increasing subset of the integers (i.e. the subset must be bounded below) and single incrementation only is allowed.  The user must supply both the formula to be proved by induction, the integer subexpression over which induction is to occur and the base case for the induction.  For example, given the VC

H1:  x >= b

-->

  C1:  p(x)

an attempt to prove C1 by induction on x (with base case b, since the VC guarantees that x >= b, as we require for bounding of x from below) might be entered by the following dialogue:

CHECK|: prove c#1 by induction.

 

INDUCTION -- On what? |: x.

 

INDUCTION -- Base case? |: b.

 

ENTERING PROOF BY INDUCTION ATTEMPT

After this, we would be in a new proof frame with the following VC:

H1:  x >= b

-->

  C1:  p(b)

  C2:  ind_x_1 >= b -> (for_all(ind_x_2 : integer, (ind_x_2
       >= b and ind_x_2 <= ind_x_1) -> p(ind_x_2)) ->
       p(ind_x_1 + 1))

i.e. two conclusions: one the base case and the other the induction case.

5.8.4           The status command

To assist the user in keeping track of which conclusions have been proved for which cases at the current depth in a proof by cases, or where in a proof attempt the user is, the status command displays such status information.  For example, in a proof with three cases to consider, we might see:

CHECK|: status.

[DEPTH: 1]

CASES: x * x >= 0

*** PROVED FOR CASE 1

<Case 2 pending>

<Case 3 pending>

The status command also shows where we are when we have subgoaled, entered a proof by contradiction attempt, entered a proof by implication attempt or unwrapped a quantified conclusion (i.e. when we have executed any command which puts us into a new proof frame); it displays the formula which will have been shown to be true if we manage to prove all the conclusions of the current proof frame as well as our current depth.  For example, we might see:

CHECK|: status.

[DEPTH: 2]

QUANTIFICATION: for_all(x:integer, x * x > 0)

if we have stripped the universal quantifier off this conclusion in our proof attempt.  On proving all conclusions at our current depth (2, in the case of the above example), we exit to the previous depth and the formula shown by the status command at our current depth then holds, i.e. it is added to the hypotheses at the previous depth.

5.8.5           The quit command

This command can only be used when in a proof frame: if it is called when at the top level it will fail.  The command takes no arguments, so may only be used in the form

CHECK|: quit.

The effect of calling this command is to quit the current proof frame and to reenter the previous proof frame without success, i.e. as though the proof frame from which we have just quit had never been entered in the first place.

 

6                       Configuring the Proof Checker

Whenever you commence a proof session with the Proof Checker, the Checker will look for two files: firstly for the file $SPADE_CHECKER/checker.ini.  If this environment variable is not set it will look in <root>/lib/checker/rules/checker.ini.  It will then look for checker.ini in your current default directory.  In the event of finding either, or both, of these files, they will be read (in that order if both exist).  These files may be used to tailor the Proof Checker to your system’s, and your own, preferred defaults.

Within these initialisation files, you may issue set commands to set any of the Proof Checker’s flags to a specific value, consult commands to consult one or more rulefiles.  In such a case, however, the command word (set or consult) must be given in unabbreviated form and totally in lowercase only, and all arguments must be provided on the command line.  Finally, as in interactive use of the Proof Checker, each command must be terminated by a full-stop (and some white space characters, e.g. a carriage return).  Any commands which are not recognised will be rejected by the Checker and an appropriate warning will appear on the screen before the prompt for a filename.

The reason for permitting the checker.ini files to contain consult commands is to enable you, during a project, to provide your own automatically-loadable libraries of rule files without the nuisance of having to type in the appropriate consult commands interactively at the beginning of every proof session.  For example, if you are carrying out proofs of assembly code for a particular microprocessor, you may define a global rule library of rules for the modelling functions and include the consult commands to make these accessible in the system-wide, or in particular users’, Checker initialisation files.  As in a consult command issued interactively, warnings will be issued if any errors are discovered as the rulefiles are consulted. 

One other value that can be set in a checker.ini file is the memory limit.  The version of the Checker distributed has a limit of 500,000 words.  You may change this (only in a checker.ini file) by a command of the form

set memory_limit to VALUE.

where VALUE is an integer greater than or equal to 250,000.  Note that, if you increase this quantity dramatically, for instance to manipulate larger formulae, you may also have to change various system parameters (refer to the Installation Notes for details).

 

7                       Common Proof Techniques

As one gains experience in attempting to prove VCs with the Checker, it soon becomes apparent that there is a number of useful techniques which are used frequently in the kind of proof which tends to crop up with real VCs. Here, we explain a number of these techniques.

7.1               Standardisation followed by forward chaining

Quite often, one has an expression A, and wishes to prove that expression B is true.  It is often the case that B is A written in a slightly different way, i.e. A and B are equivalent expressions.  Standardisation of A may not be enough to prove B, however, as a different formula, C, may be the standard form of A.  In this case, a useful sequence of events may be:

1         standardise the Boolean expression A -> B, since if A and B have the same standard form (which is very likely, if their equivalence is obvious from inspection), this will standardise to “true”;

2         if this does indeed standardise to true, save the result;

3         we now have a hypothesis A, and a hypothesis A -> B; using the “forwardchain” command, we can replace the latter by B, thereby proving B, as required.

For example, if i and n are of type integer, then with

A: i=<n-1

B: n>i

standardise i=<n-1 -> n>i results in true, so we can use the above technique.

If A and B are virtually identical, it may be easier to use the replace command instead, however, e.g.

A: a+b=c

B: b+a=c

replace a+b by b+a in A to get B by commut.

Note also that A and B must be equivalent: if they have different standard forms, this technique will not work, e.g.

A: i<n

B: i=<n

standardise(boolean,i<n -> i=<n) gives -i+n>0 -> 1-i+n>0, so we cannot use this technique.  In this particular case, it is possible for the Checker’s inference engine to “infer” i=<n from A directly, however, without additional work.

7.2               Searching rules with infer command

As already discussed, the user can adjust the flags used to determine which possible rule matches are displayed at the terminal during an attempted inference/replacement.  One technique is to set these flags to reasonably general settings.  Then it can be useful, when the next step is not obvious, to try to infer the current goal-formula (usually a conclusion of the VC) using a reasonably wildcard rulename, i.e. either a Prolog variable (for the most liberal search), or a combination of family rulenames, such as “transitivity & inequals” (which searches through all pertinent transitivity and inequals rules).  In these circumstances, each possible match meeting the flag constraints is displayed, together with the list of unsatisfied conditions, until either a perfect match is found, in which case the formula has been proved, or until all possible matches have been tried.  Furthermore, to speed up the search and cut down on spurious matches, you may wish to use the from clause with the infer command, to ensure that only hypotheses which you believe will be relevant to the inference are used: hypotheses which are not listed in the from clause will be temporarily deleted, so that they are not used by the Checker’s inference engine in attempting to satisfy subgoals of any rule tried.

As the search for matches proceeds, it is useful to watch for matches with unsatisfied conditions which appear to follow from the current hypotheses of the VC: if you believe that the conditions from any of these matches can be proved, the formula may also be proved using that particular rule instantiation.  This, in effect, is a manual search for subgoals which are judged to be satisfiable.

7.3               Subgoaling

Following on from the previous technique, if the search for possible rule matches gives one whose remaining conditions appear to be satisfiable, it may be worthwhile using the option to subgoal with this rule.  Subgoaling on the match

rulename: FORMULA follows from [COND1, COND2, ..., CONDn]

results in a new proof frame being entered and the unsatisfied subgoals in [COND1, COND2, ..., CONDn] are added as the goals to be proved (i.e. they become the conclusions within this proof frame).  If you can then successfully construct a proof of each of these new subgoals within this proof frame, on exiting from the frame the rule is invoked and the desired formula FORMULA has then been established.

The advantages of doing this are that the subgoals that need to be proved are provided as the goals of the VC (so they do not need to be remembered or written down for reference) and any subgoals which require surgery with the “replace” command or further subgoaling as part of their proof can be so manipulated.

7.4               Equality (and other) substitutions

Often, some of the hypotheses of a VC are simple equalities.  It is quite common, given an equality A=B for some expressions A and B, to replace A by B (or B by A) in one or more of the hypotheses and/or conclusions of the VC.  If the equality is of the form IDENTIFIER=EXPRESSION, for example, it often happens that substituting EXPRESSION for IDENTIFIER will result in a new expression which can be simplified and proved, as expressions often have to be expanded before they can be contracted down to a provable formula.

The replace command allows the user to replace one expression by another, equal, expression.  With replace, the user can also employ replacement rules which are not simple equality rules, however, but which follow from properties of user-defined predicates and functions.

The replace command will accept wildcards, both as rulenames and as the goal expression, so that the user may explore different possible replacements for a particular expression.  This can be useful in looking for ways to simplify a complicated expression.  As with wildcard rulenames with infer, however, in general the wilder the card the longer the search!

One other function for which replace can be used is to eliminate a common subexpression which occurs in a number of hypotheses and/or conclusions, by using the replace all:... variant of the command.  This can save time and effort in simplifying VCs with common substitutions before further tailored manipulation is attempted.

7.5               Examining the rules database

In any proof attempt, it is important to have a plan of the strategy to be used in searching for a proof.  If the attempt succeeds, the strategy was valid; if it fails, the point at which the strategy breaks down may well provide important information about something that has been overlooked or something that has been assumed to follow from the hypotheses but which in fact does not.  Without a strategy, the user is likely to waste a great deal of time and effort, with a much lower chance of success.

An important part of successful strategies, it appears, has been to refer where applicable to the domain-specific rules for each conclusion and to try to find a rule for which it seems likely to be possible to match the conclusion with its goal pattern and for which the subgoals which result from this instantiation are all either known to be true or follow from the hypotheses.  Sometimes, there may be more than one rule which matches with the conclusion, covering various different cases, and a proof by cases, as described in the next section, may prove necessary.  One of the most important choices in a proof attempt, however, is of the rule to be used in decomposing a goal into more manageable subgoals.

To aid in the process of hunting the appropriate rules, a copy of the relevant rules from the rules database and the ability for the user to refer to the rules database while searching for a proof can be very useful.  For this reason, the built-in rules databases are listed in the sister document to this manual — “SPADE Proof Checker Rules Manual” — and are provided in textual form with the Proof Checker.  It is recommended that, during a proof attempt, a copy of any user-supplied rules being used is also made available for reference to the person attempting the proof.

7.6               Proofs by cases

A strategy which has come up on a number of occasions in proofs, often quite unexpectedly, is the proof by cases.  Sometimes it is fairly obvious that a proof by cases may be necessary, e.g. when a hypothesis involving a key program variable consists of a disjunction of conditions, such as x div 2*2=x or x div 2*2=x-1.  In this case, for instance, it may be necessary to substitute for x div 2*2 in an expression in order to simplify it, and to do so the user will almost certainly have to consider each of these cases separately.

More frequently, however, in choosing the proof rule to be used for a particular conclusion, for example, one may find that the VC’s hypotheses are marginally too weak to meet all of the conditions of the rule.  For example, in proving the correctness of a bubblesort procedure, several of the VCs gave rise to this situation, because the obvious proof rule to use in each case required, indirectly, that n>=3, when the constraints implied by the VC in each case could only constrain n to be at least 2.  For each of the VCs, considering the cases n<3 or n>=3 (which covers every possible value of n) led to a proof: the case with n>=3 allowed the proof to proceed by the chosen strategy, while the case n<3, combined with the fact that n>=2 from the hypotheses, gave n=2, from which the proof also became trivial by application of a different domain-specific proof rule.  (The problem arose in swapping the jth and (j+1)th elements of an array, where j=1: if n>=3 we use the hypothesis that the array is ordered from its 2nd to its nth element to show that the array after the swap is ordered, while if n=2, we use the fact that the array with its elements swapped is ordered from its 1st to its 2nd elements and n=2, so the array is ordered.)

The important point of this strategy is that, if one of the conditions of a rule match cannot be met because the constraints of the VC are not strong enough, it may be worthwhile to attempt the proof in two parts.  If a proof can be found for the case where the extra constraint needed does not hold, this can be combined with the proof from the case where the additional constraint is met to give a proof of the required formula by cases.

7.7               Proof by contradiction

As we have seen earlier in discussion of the design of the Checker and the natural deduction rules employed, proof by contradiction is supported.  If we have two mutually contradictory hypotheses in the VC at any point, we should be able to arrive at two hypotheses, one of the form X and one of the form not X, where X is some formula.  We can then use deduce to deduce false as a hypothesis and, from this, we may use infer to infer any remaining conclusions.  Such proofs typically arise from VCs for non-executable paths: either the path traversal condition itself is false, or it contradicts the assertion at the top of the path (i.e. one or more of the hypotheses of the VC).  In such cases, it is unnecessary to use the Checker’s prove ... by contradiction command, as the negation of the goal formula is not necessary to establish a path-traversal contradiction.

7.8               Proofs with quantification

In eliminating conclusions involving quantification, unwrapping as many as possible of the outer levels of quantification is usually a good first step.  For existential quantification, it is then necessary to find one or more objects (of the appropriate type) with which to instantiate the formula so derived.  For universal quantification, the resulting formula must be shown to be true.  As formulae in this latter case are often of the form A -> B, where A is the restriction to a specific domain, proof by implication is then usually a worthwhile approach.

With quantified hypotheses, unwrapping produces new useful formulae.  For a universally-quantified hypothesis, we may gain a “general fact” about one or more variables which, when instantiated with objects of the appropriate type, gives us an instance of this general fact.  With existentially-quantified hypotheses, unwrapping provides us with an object satisfying the formula so quantified.  Consider, for example, the VC

H1:  for_all(v:t, (a <= v and v < b) -> p(v))

H2:  p(b)

-->

  C1:  for_all(v:t, (a <= v and v <= b) -> p(v))

where type t is integer or an integer subrange.  Unwrapping the conclusion formula yields a new C1:

C1: a <= t_v_1 and t_v_1 <= b -> p(t_v_1)

which suggests a proof by implication as mentioned earlier.  Proof by implication, followed by unwrapping H1, gives us the following VC:

H1:  for_all(v:t, (a <= v and v < b) -> p(v))

H2:  p(b)

H3:  a <= t_v_1

H4:  t_v_1 <= b

H5:  a <= t_V_1 and t_V_1 < b -> p(t_V_1)

-->

  C1:  p(t_v_1).

Now we have two cases for t_v_1: either it is strictly less than b, in which case instantiating t_V_1 to t_v_1 in H5 gives us C1, or it is equal to b (it must be one or the other, by H4), in which case H2 demonstrates that C1 is correct (after a little substitution).  Thus a proof by cases on t_v_1 < b or t_v_1 = b will allow us to establish p(t_v_1) in each case, and so by eliminating the only subgoal we establish the truth of the original formula.

7.9               Look at conclusion structure

Often a clue to a worthwhile approach is present in the conclusion to be proved.  For instance, if we are attempting to prove a conclusion of the form A -> B, it is probably worthwhile to try using proof by implication.  If the conclusion is (universally or existentially) quantified, it is worth using unwrap and proceeding from there (possibly after first using the quantifier rewrite and inference rules in the built-in rule family quant).  If the conclusion is of the form A or B, inspection of A and B in turn (possibly using the traverse command to make it easier to identify and inspect the separate components) may suggest whether you can prove A or you can prove B, then you could use inference(3) or inference(4) respectively to subgoal on and concentrate on the chosen subcomponent.  If the goal conclusion is a logical equivalence, you might try splitting it up into two implications by subgoaling on inference(9), or using an appropriate rule from the LOGIC.RUL built-in rulefile.  In this way, certain conclusion structures can themselves suggest a worthwhile approach to take in order to simplify the proof attempt and carry you forward to more tractable subgoals.

 

8                       Log Files

8.1               Proof logs

Whenever you use the Proof Checker to carry out a proof, all non-trivial details of the proof attempt are recorded in a proof log.  This file has the same main filename as the file(s) read in at the start of the proof session, but a file extension of “.PLG”, unless the proof_log= qualifier is used to specify an alternative filename.  The date and time recorded at the beginning of the proof log reflects the date and time at which the proof session was commenced (which may be some time before it is completed!).

Each “step” recorded in a proof log corresponds, in general, to the successful execution of a single command.  Not all commands give rise to the recording of a step in the proof log, however: list, status, help, forget, remember, delete, undelete, printvc and traverse are all ignored for proof-logging purposes.  Furthermore, many of the other commands will only be registered if they succeed: thus, if you attempt to infer a formula with the infer command, but do not find an exact match and do not attempt to subgoal, no record will be made (as there will have been no effect on the state of the VC).

When a successful inference or substitution rule match is found and used in a proof session, the subgoals which the Checker had to satisfy to use the rule are not echoed to the screen; they are, however, echoed to the proof log.  Thus, by examining the proof log, one can check, for instance, that the rule used in the actual proof attempt was indeed the rule which appears in the rule file.

Within a proof log, the proof rules used in the proof attempt of each VC are recorded, together with a final summary of all the proof rules used in the proof session as a whole.  Notice that only proof rules used in successful proofs of VCs are recorded: if we give up our attempt to prove VC 5, and this unsuccessful attempt to prove VC 5 was the only place where we made use of rule xxx(1), then xxx(1) will not be recorded as used in the proof session in the final summary at the bottom of the proof log.

Thus, the proof log provides a detailed record of all proofs carried out and may be used to check the proof work performed.

8.2               Command logs

When the command_logging flag is set to on (see section 5.2.6), all user input- commands, command arguments, replies to prompts from the Checker (e.g. for a yes/no answer, for argument details not supplied on the command-line, etc.) — is echoed to a command-log.  The log is not character-for-character exact in every sense: Prolog variables used as wildcards, for instance, appear as _1, _2, etc.  However, the log is complete and “replayable”.  Thus, if you are half way through a complicated proof when the power to your computer gets shut off, you can recover to exactly where you were by “replaying” the proof session via the execute=... command-line qualifier, or via the execute command if this is more appropriate.

A command log created from the beginning of the session will contain the VC number you first selected for proof, then the sequence of commands and responses that you typed subsequently.  If you switch between logging and no logging (by using set command_logging to on and set command_logging to off, the log files that you create will only be for “portions” of a proof session, and will therefore only make sense if replayed in the right context: there’s no point replaying a replace h#13: ... command on a VC which doesn’t have a H13!  Thus, it is advisable to keep a log for an entire proof session and, if there are parts of it which are useful for replaying, these can be excised from the log with an editor and replayed via the execute command.

A useful application of replaying via the execute command occurs where two or more VCs which are hard to prove have sufficient proof work in common that a section of the proof for one can be used on the others.  Where the differences cover different hypothesis numbering, it is necessary for the user to take this into account and make adjustments manually before replaying the desired sequence of common commands.

A second useful application of command logs is to provide further auditable evidence of the impeccability of one’s proof.  The log for each session can be retained and, should an auditor wish to check that nothing amiss occurred in the proof session, the command log can be replayed to replicate the proof and the proof log so generated compared with the original to check correspondence, the only difference in such circumstances being the date and time in the proof log banner.

Note that execute commands can be nested, so that replaying one command log which contains an execute command is permitted.  The command log will record the top level execute command only, and recommence logging after it has been completed.  For example, consider the following two files:

 

A.CMD

B.CMD

newvc 2.

replace all: d by 0 using eq.

simplify.

standardise c#1.

execute 'B.CMD'.

delete h#4.

done.

printvc.

If we issue the command execute ‘A.CMD’ (and assuming all the commands in A.CMD and B.CMD execute o.k.), we get the following command-log for the proof session:

execute 'A.CMD'.

newvc 2.

simplify.

execute 'B.CMD'.

done.

Replaying of the command log is, obviously, dependent on whether any changes are made to B.CMD.

One final word of warning about command logs and replaying: a command log does not record “environmental” actions arising from the system-wide or local checker.ini file(s).  Thus, the replay must always be in exactly the same context, with the same flag settings, if behaviour is to be guaranteed to be identical.  This caveat is not in practice as daunting as it sounds, however, as most flag settings are “cosmetic” and do not affect the Checker’s proof performance, while those that do (e.g. auto_done, simplify_in_infer) are rarely modified by most Checker users.

9                       Some Examples

Provided with the SPADE Proof Checker are a number of example files (see Installation Notes).  Below, we go through these examples, as a simple guide to get you started with the SPADE Proof Checker.  At the end of this chapter, “model” proof logs for each of these examples are provided, though you may find it instructive to try to do the exercises without looking at these.

9.1               Partial correctness of a division by repeated subtraction algorithm

The FDL text DIVBYSUB is already annotated to allow proof of its partial correctness.  Generate VCs from this FDL text using the (latest) SPADE VCG, without simplification.

Once you have done this, enter the SPADE Proof Checker via the command

checker

and, when prompted for the filename type “divbysub.” (without the quotes).  The .FDL and .VCG files are read, then you will be prompted by the Proof Checker with a CHECK|: prompt.  You can now type commands and prove the VCs.  For this very simple example, you can prove all of the VCs using only the done and standardise commands (plus, of course, newvc and exit). In fact, if you try it again using simplification in the VCG, you will only need the done command!  This is one of the rare cases when using the VCG’s simplifier can make life slightly easier, rather than more difficult.

Figures 1(a),(b),(c),(d) show the annotated FDL model, the .VCG file (unsimplified), and a proof log and sample command log for the above proof.

9.2               Partial correctness of a hardware multiplication algorithm

In this case, we have two files, hwmult.fdl and hwmult.rls.  The FDL model is of hardware multiplication by additions and repeated shifts (right and left) modelled by div 2 and * 2.  There are two rules in the local rule file hwmult.rls, which are useful for the proof.  We can generate VCs from this annotated FDL model and carry out a proof.

Figures 2(a),(b),(c),(d) show the annotated FDL model, the .RLS rule file, the .VCG file and the .PLG proof log for this program.

9.3               Partial correctness of a maximum value in an array algorithm

Here, we have an annotated FDL model maxfn.fdl of a text which searches through an array 1..n of integers, returning the largest value found.  From this, we can generate verification conditions, but to prove them we need to know something about the proof function maxof used in the annotations.  For this, we provide a rule file containing four rules for maxof, called max(1)-(4). Having generated VCs and consulted this rulefile, it is quite straightforward to prove each of the four VCs, using only the infer, standardise and done commands.

Figures 3(a),(b),(c),(d) show the FDL text, the local proof rules defining the maxof function, the output from the SPADE VCG and a model proof log for this proof.

9.4               Absence of certain run-time errors in a sort algorithm

As well as partial correctness, we can also attempt to prove the absence of certain types of run-time error.  In the file sort.fdl is a program annotated with “check” conditions so that immediately before accessing or updating the Nth element of our array, we check that N is within its indexrange.  As can be seen from the model, some of the “checks” can be omitted as redundant, as they follow from other check statements in the vicinity.

From such an annotated text, we can generate VCs with the SPADE VCG and prove them to show that no “index out of bounds” errors will occur when the program is run.

Figures 4(a),(b),(c) show the annotated FDL sort algorithm, the verification conditions generated from it and a proof log demonstrating the correctness of each of these VCs.

    title procedure divide;

    var m, n, q, r: integer;

    derives q from m, n &

            r from m, n;

    pre  (m >= 0) and (n > 0);

    post (m = q * n + r) and (0 <= r) and (r < n);

 

    start

    q := 0;

    r := m;

1:  assert (m = q * n + r) and (r >= 0);

    case r >= n of

      true : skip;

      false: goto 2

    end;

    q := q + 1;

    r := r - n;

    goto 1;

2:  skip

    finish

 

Fig. 1(a): FDL model of DIVBYSUB


        ****************************************************

                 RESULTS OF SPADE SEMANTIC ANALYSIS

                     PVL SPADE TOOL VERSION : 4.3

          Program Validation Limited, Southampton, England

        ****************************************************

 

                          procedure divide

 

               DATE : 25-FEB-1988 TIME : 09:51:03.19

 

                      VERIFICATION CONDITIONS

                          (not simplified)

 

For path(s) from statement 2 to (assertion) statement 4 :

 

procedure_divide_1.

H1:   m >= 0.

H2:   n > 0.

      ->

C1:   m = 0 * n + m.

C2:   m >= 0.

 

For path(s) from statement 4 to (assertion) statement 4 :

 

procedure_divide_2.

H1:   m = q * n + r.

H2:   r >= 0.

H3:   r >= n.

      ->

C1:   m = (q + 1) * n + (r - n).

C2:   r - n >= 0.

 

For path(s) from statement 4 to (assertion) statement 10 :

 

procedure_divide_3.

H1:   m = q * n + r.

H2:   r >= 0.

H3:   not (r >= n).

      ->

C1:   m = q * n + r.

C2:   0 <= r.

C3:   r < n.

 

 

Fig 1(b): Output of SPADE VCG for DIVBYSUB

 


        ****************************************************

           SPADE Transcript of Interactive Proof Session

               PVL SPADE PROOF CHECKER VERSION: 1.0a

            Program Validation Limited, Southampton U.K.

        ****************************************************

 

                DATE : 26-FEB-1988  TIME : 16:01:10

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_divide_1

H1:  m >= 0

H2:  n > 0

-->

  C1:  true

  C2:  m >= 0

 

 

STEP 2

Command: done

*** Proved C1: true

*** Proved C2: m >= 0

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

 

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_divide_2

H1:  m = q * n + r

H2:  r >= 0

H3:  r >= n

-->

  C1:  m = (q + 1) * n + (r - n)

  C2:  r - n >= 0

 

 

STEP 2

Command: done

*** Proved C2: r - n >= 0

 

STEP 3

Command: standardise

Apply standardiser on C1

>>> New goal C1: m - n * q - r = 0

 

STEP 4

Command: standardise

Apply standardiser on H1

*** New H1: m - n * q - r = 0

 

STEP 5

Command: done

*** Proved C1: m - n * q - r = 0

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

 

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_divide_3

H1:  m = q * n + r

H2:  r >= 0

H3:  not r >= n

-->

  C1:  m = q * n + r

  C2:  0 <= r

  C3:  r < n

 

 

STEP 2

Command: done

*** Proved C1: m = q * n + r

*** Proved C2: 0 <= r

*** Proved C3: r < n

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

 

 

 

*** END OF PROOF SESSION

 

The above proof session did not make use of the proof rules database

 

Command: exit

 

 

Fig. 1(c): Proof log for DIVBYSUB proof


1.

done

2.

done

list

standardise c # 1.

y

standardise h # 1.

y

list

done

3.

done

newvc

exit

 

Fig 1(d): Command log for DIVBYSUB proof

 

   title function product;

 

   var a, b, p, x, y, product: integer;

 

   derives product from x, y;

   pre  (x >= 0) and (y >= 0);

   post product = x * y;

 

   start

   a := x;

   b := y;

   p := 0;

1: assert p = x * y - a * b and a >= 0;

   case a <> 0 of

     true : skip;

     false: goto 2

   end;

   case odd(a) of

     true : skip;

     false: goto 3

   end;

   p := p + b;

3: a := a div 2;

   b := b * 2;

   goto 1;

2: product := p

   finish

 

 

Fig. 2(a): FDL model of HWMULT

 

 

/*** LOCAL RULES FOR ODDNESS/EVENNESS ***/

 

rule_family local_odd: A * B requires [ A:i, B:i ].

 

local_odd(1): ( X div 2 * (Y * 2) may_be_replaced_by X * Y

                     if [ not odd(X)     ] ).

local_odd(2): ( X div 2 * (Y * 2) may_be_replaced_by (X-1) * Y

                     if [ odd(X), X >= 0 ] ).

 

 

Fig. 2(b): Local proof rules file for HWMULT


        ****************************************************

                 RESULTS OF SPADE SEMANTIC ANALYSIS

                     PVL SPADE TOOL VERSION : 4.3

          Program Validation Limited, Southampton, England

        ****************************************************

 

                          function product

 

               DATE :  1-MAY-1988 TIME : 16:10:16.91

 

                      VERIFICATION CONDITIONS

                          (not simplified)

 

For path(s) from statement 2 to (assertion) statement 5 :

 

function_product_1.

H1:   x >= 0.

H2:   y >= 0.

      ->

C1:   0 = x * y - x * y.

C2:   x >= 0.

 

For path(s) from statement 5 to (assertion) statement 5 :

 

function_product_2.

H1:   p = x * y - a * b.

H2:   a >= 0.

H3:   a <> 0.

H4:   not (odd(a)).

      ->

C1:   p = x * y - a div 2 * (b * 2).

C2:   a div 2 >= 0.

 

function_product_3.

H1:   p = x * y - a * b.

H2:   a >= 0.

H3:   a <> 0.

H4:   odd(a).

      ->

C1:   p + b = x * y - a div 2 * (b * 2).

C2:   a div 2 >= 0.

 

For path(s) from statement 5 to (assertion) statement 13 :

 

function_product_4.

H1:   p = x * y - a * b.

H2:   a >= 0.

H3:   not (a <> 0).

      ->

C1:   p = x * y.

 

Fig. 2(c): Output of SPADE VCG for HWMULT


        ****************************************************

           SPADE Transcript of Interactive Proof Session

               PVL SPADE PROOF CHECKER VERSION: 1.0a

            Program Validation Limited, Southampton U.K.

        ****************************************************

                DATE :  1-MAY-1988  TIME : 16:17:29

 

STEP 1

Command: newvc

Now attempting proof of VC: function_product_1

H1:  x >= 0

H2:  y >= 0

-->

  C1:  0 = x * y - x * y

  C2:  x >= 0

 

STEP 2

Command: standardise

Apply standardiser on C1

>>> New goal C1: true

 

STEP 3

Command: done

*** Proved C1: true

*** Proved C2: x >= 0

*** Proved all conclusions

*** PROVED VC

The above proof did not make use of the proof rules database

 

STEP 1

Command: newvc

Now attempting proof of VC: function_product_2

H1:  p = x * y - a * b

H2:  a >= 0

H3:  a <> 0

H4:  not odd(a)

-->

  C1:  p = x * y - a div 2 * (b * 2)

  C2:  a div 2 >= 0

 

STEP 2

Command: infer

Successful inference with rule: inequals(90)

  Proved subgoal: a >= 0

  Proved subgoal: 2 > 0

Therefore a div 2 >= 0

*** New H5: a div 2 >= 0

*** Proved C2: a div 2 >= 0

 

STEP 3

Command: replace(c # 1)

Successful substitution with rule: local_odd(1)

  Proved subgoal: not odd(a)

Allowing substitution of a * b

for a div 2 * (b * 2)

>>> New goal C1: p = x * y - a * b

 

 

STEP 4

Command: done

*** Proved C1: p = x * y - a * b

*** Proved all conclusions

*** PROVED VC

The following rules were used in proving the above VC:

          HWMULT.RLS::local_odd(1) plain output?????

          SPADE$CHECKER:INTINEQS.RUL::inequals(90)

 

STEP 1

Command: newvc

Now attempting proof of VC: function_product_3

H1:  p = x * y - a * b

H2:  a >= 0

H3:  a <> 0

H4:  odd(a)

-->

  C1:  p + b = x * y - a div 2 * (b * 2)

  C2:  a div 2 >= 0

 

STEP 2

Command: replace(c # 1)

Successful substitution with rule: local_odd(2)

  Proved subgoal: odd(a)

  Proved subgoal: a >= 0

Allowing substitution of (a - 1) * b

for a div 2 * (b * 2)

>>> New goal C1: p + b = x * y - (a - 1) * b

 

STEP 3

Command: standardise

Apply standardiser on C1

>>> New goal C1: a * b + p - x * y = 0

 

STEP 4

Command: standardise

Apply standardiser on H1

*** New H1: a * b + p - x * y = 0

 

STEP 5

Command: infer

Successful inference with rule: inequals(90)

  Proved subgoal: a >= 0

  Proved subgoal: 2 > 0

Therefore a div 2 >= 0

*** New H5: a div 2 >= 0

*** Proved C2: a div 2 >= 0

 

STEP 6

Command: done

*** Proved C1: a * b + p - x * y = 0

*** Proved all conclusions

*** PROVED VC

The following rules were used in proving the above VC:

          HWMULT.RLS::local_odd(2)

          SPADE$CHECKER:INTINEQS.RUL::inequals(90)

 

 

STEP 1

Command: newvc

Now attempting proof of VC: function_product_4

H1:  p = x * y - a * b

H2:  a >= 0

H3:  not a <> 0

-->

  C1:  p = x * y

 

STEP 2

Command: simplify

*** New H3: a = 0

 

STEP 3

Command: replace(h # 1)

Successful substitution with rule: eq(1)

  Proved subgoal: a = 0

  Met constraint: a \= 0

Allowing substitution of 0

for a

*** New H1: p = x * y - 0 * b

 

STEP 4

Command: standardise

Use of standardisation

     on p = x * y - 0 * b -> p = x * y

  gives true

*** New H4: p = x * y - 0 * b -> p = x * y

 

STEP 5

Command: done

*** Proved C1: p = x * y

*** Proved all conclusions

*** PROVED VC

The only rule used in proving the above VC was:  SPADE$CHECKER:SPECIAL.RUL::eq(1)

 

*** END OF PROOF SESSION

 

The following rules were used during the above proof session:

          HWMULT.RLS::local_odd(1)

          HWMULT.RLS::local_odd(2)

          SPADE$CHECKER:INTINEQS.RUL::inequals(90)

          SPADE$CHECKER:SPECIAL.RUL::eq(1)

 

Command: exit

 

Fig. 2(d): Proof log for HWMULT

 

   title function max;

   const n: integer = pending;

   type indexrange  = 1 .. n;

   type intarray    = array[indexrange] of integer;

   var a: intarray;

   var indexval: indexrange;

   var temp, max: integer;

   proof function maxof(integer, intarray, indexrange, indexrange): boolean;

   derives max from a;

   pre  n > 0;

   post maxof(max, a, 1, n);

 

   start

   indexval := 1;

   temp := element(a, [1]);

1: assert (1 <= indexval) and (indexval <= n) and

    maxof(temp, a, 1, indexval);

   case indexval < n of

     true : skip;

     false: goto 2

   end;

   indexval := indexval + 1;

   case element(a, [indexval]) > temp of

     true : skip;

     false: goto 1

   end;

   temp := element(a, [indexval]);

   goto 1;

2: max := temp

   finish

 

Fig. 3(a): FDL model of MAXFN

 


rule_family max: maxof(X,A,L,U) requires [ X:i, A:any, L:i, U:i ] &

            not X requires [ X:any ].

max(1): ( not maxof(X,A,L,U) may_be_deduced_from [ L>U ] ).

max(2): ( maxof(element(A,[L]),A,L,L) may_be_deduced_from [] ).

max(3): ( maxof(X,A,L,U+1) may_be_deduced_from

           [ maxof(X,A,L,U), X >= element(A,[U+1]) ] ).

max(4): ( maxof(element(A,[U+1]),A,L,U+1) may_be_deduced_from

           [ maxof(X,A,L,U), element(A,[U+1]) >= X ] ).

/****************************************************************/

 These follow from the definition:

 maxof(X, A, L, U) = for_all(i:integer, (L <= i and i <= U) ->

                                element(A, [i]) <= X) and

                     for_some(j:integer, L <= j and j <= U and

                                element(A, [j]) = X).

*****************************************************************/

 

Fig. 3(b): Local proof rules file for MAXFN

 


        ****************************************************

                 RESULTS OF SPADE SEMANTIC ANALYSIS

                     PVL SPADE TOOL VERSION : 4.3

          Program Validation Limited, Southampton, England

        ****************************************************

                            function max

               DATE : 24-FEB-1988 TIME : 19:52:25.12

                      VERIFICATION CONDITIONS

 

For path(s) from statement 2 to (assertion) statement 4 :

function_max_1.

H1:   n > 0.

      ->

C1:   1 <= 1.

C2:   1 <= n.

C3:   maxof(element(a,[1]),a,1,1).

 

For path(s) from statement 4 to (assertion) statement 4 :

function_max_2.

H1:   1 <= indexval.

H2:   indexval <= n.

H3:   maxof(temp,a,1,indexval).

H4:   indexval < n.

H5:   not (element(a,[indexval + 1]) > temp).

      ->

C1:   1 <= indexval + 1.

C2:   indexval + 1 <= n.

C3:   maxof(temp,a,1,indexval + 1).

 

function_max_3.

H1:   1 <= indexval.

H2:   indexval <= n.

H3:   maxof(temp,a,1,indexval).

H4:   indexval < n.

H5:   element(a,[indexval + 1]) > temp.

      ->

C1:   1 <= indexval + 1.

C2:   indexval + 1 <= n.

C3:   maxof(element(a,[indexval + 1]),a,1,indexval + 1).

 

For path(s) from statement 4 to (assertion) statement 11 :

 

function_max_4.

H1:   1 <= indexval.

H2:   indexval <= n.

H3:   maxof(temp,a,1,indexval).

H4:   not (indexval < n).

      ->

C1:   maxof(temp,a,1,n).

 

Fig. 3(c): Output from SPADE VCG for MAXFN


        ****************************************************

           SPADE Transcript of Interactive Proof Session

               PVL SPADE PROOF CHECKER VERSION: 1.0a

            Program Validation Limited, Southampton U.K.

        ****************************************************

                DATE : 26-FEB-1988  TIME : 18:12:32

 

STEP 1

Command: newvc

Now attempting proof of VC: function_max_1

H1:  n > 0

-->

  C1:  true

  C2:  1 <= n

  C3:  maxof(element(a, [1]), a, 1, 1)

 

STEP 2

Command: done

*** Proved C1: true

 

STEP 3

Command: standardise

Apply standardiser on C2

>>> New goal C2: n > 0

 

STEP 4

Command: done

*** Proved C2: n > 0

 

STEP 5

Command: infer

Successful inference with rule: max(2)

  (unconstrained rule: no subgoals)

Therefore maxof(element(a, [1]), a, 1, 1)

*** New H2: maxof(element(a, [1]), a, 1, 1)

*** Proved C3: maxof(element(a, [1]), a, 1, 1)

 

STEP 6

Command: done

*** Proved all conclusions

*** PROVED VC

The only rule used in proving the above VC was:  MAXFN.RLS::max(2)

 

STEP 1

Command: newvc

Now attempting proof of VC: function_max_2

H1:  1 <= indexval

H2:  indexval <= n

H3:  maxof(temp, a, 1, indexval)

H4:  indexval < n

H5:  not element(a, [indexval + 1]) > temp

-->

  C1:  1 <= indexval + 1

  C2:  indexval + 1 <= n

  C3:  maxof(temp, a, 1, indexval + 1)

 

 

STEP 2

Command: simplify

*** New H5: element(a, [indexval + 1]) <= temp

 

STEP 3

Command: standardise

Apply standardiser on C2

>>> New goal C2: - indexval + n > 0

 

STEP 4

Command: done

*** Proved C1: 1 <= indexval + 1

*** Proved C2: - indexval + n > 0

 

STEP 5

Command: infer

Successful inference with rule: max(3)

  Proved subgoal: maxof(temp, a, 1, indexval)

  Proved subgoal: temp >= element(a, [indexval + 1])

Therefore maxof(temp, a, 1, indexval + 1)

*** New H6: maxof(temp, a, 1, indexval + 1)

*** Proved C3: maxof(temp, a, 1, indexval + 1)

 

STEP 6

Command: done

*** Proved all conclusions

*** PROVED VC

The only rule used in proving the above VC was:  MAXFN.RLS::max(3)

 

STEP 1

Command: newvc

Now attempting proof of VC: function_max_3

H1:  1 <= indexval

H2:  indexval <= n

H3:  maxof(temp, a, 1, indexval)

H4:  indexval < n

H5:  element(a, [indexval + 1]) > temp

-->

  C1:  1 <= indexval + 1

  C2:  indexval + 1 <= n

  C3:  maxof(element(a, [indexval + 1]), a, 1, indexval + 1)

 

STEP 2

Command: standardise

Apply standardiser on C2

>>> New goal C2: - indexval + n > 0

 

STEP 3

Command: done

*** Proved C1: 1 <= indexval + 1

*** Proved C2: - indexval + n > 0

 

STEP 4

Command: infer

Successful inference with rule: max(4)

  Proved subgoal: maxof(temp, a, 1, indexval)

  Proved subgoal: element(a, [indexval + 1]) >= temp

Therefore maxof(element(a, [indexval + 1]), a, 1, indexval + 1)

*** New H6: maxof(element(a, [indexval + 1]), a, 1, indexval + 1)

*** Proved C3: maxof(element(a, [indexval + 1]), a, 1,

    indexval + 1)

 

STEP 5

Command: done

*** Proved all conclusions

*** PROVED VC

The only rule used in proving the above VC was:  MAXFN.RLS::max(4)

 

STEP 1

Command: newvc

Now attempting proof of VC: function_max_4

H1:  1 <= indexval

H2:  indexval <= n

H3:  maxof(temp, a, 1, indexval)

H4:  not indexval < n

-->

  C1:  maxof(temp, a, 1, n)

 

STEP 2

Command: simplify

*** New H4: n <= indexval

 

STEP 3

Command: replace(h # 3)

Successful substitution with rule: eq(1)

  Proved subgoal: indexval = n

  Met constraint: indexval \= n

Allowing substitution of n

for indexval

*** New H3: maxof(temp, a, 1, n)

 

STEP 4

Command: done

*** Proved C1: maxof(temp, a, 1, n)

*** Proved all conclusions

*** PROVED VC

The only rule used in proving the above VC was:  SPADE$CHECKER:SPECIAL.RUL::eq(1)

 

*** END OF PROOF SESSION

The following rules were used during the above proof session:

          MAXFN.RLS::max(2)

          MAXFN.RLS::max(3)

          MAXFN.RLS::max(4)

          SPADE$CHECKER:SPECIAL.RUL::eq(1)

Command: exit

 

Fig. 3(d): Proof log for MAXFN

 

title procedure sort;

 

const maxindex: integer = pending;

type indexrange  = 1 .. maxindex;

type vector      = array [indexrange] of integer;

var a: vector;

var n: indexrange;

var i, j: indexrange;

var temp: integer;

 

 

derives a from n, a;

pre  (n >= 1) and (n <= maxindex);

post true;  { if we are only interested in proving array

    subscripts o.k. }

 

        start

        i := 1;

10001:  assert i>=1 and i<=n and n<=maxindex;

        case i < n of

           true : skip;

           false: goto 10002

        end;

        j := 1;

10003:  assert i>=1 and i<n and j>=1 and j<=n-i+1 and

        n<=maxindex;

        case j <= n - i of

           true : skip;

           false: goto 10004

        end;

    check 1<=j and j+1<=maxindex;  {for a[j] & a[j+1] below

    o.k.}

        { N.B.: Conditions 1<=j+1 and j<=maxindex follow from

           above }

        case element(a, [j]) > element(a, [j + 1]) of

           true : skip;

           false: goto 10005

        end;

    { N.B.: Previous check statement ensures a[j] below is

    o.k. }

        temp := element(a, [j]);

    { N.B.: Previous check statement ensures a[j] & a[j+1]

           below are o.k. }

        a := update(a, [j], element(a, [j + 1]));

    { N.B.: Previous check statement ensures a[j+1] below

           is o.k. }

        a := update(a, [j + 1], temp);

10005:  j := j + 1;

        goto 10003;

10004:  i := i + 1;

        goto 10001;

10002:  skip

        finish

 

 

Fig. 4(a): FDL model of SORT with array bound checks

 


        ****************************************************

                 RESULTS OF SPADE SEMANTIC ANALYSIS

                     PVL SPADE TOOL VERSION : 4.3

          Program Validation Limited, Southampton, England

        ****************************************************

                           procedure sort

               DATE :  9-JUN-1988 TIME : 16:37:40.76

                      VERIFICATION CONDITIONS

                          (not simplified)

 

For path(s) from statement 2 to (assertion) statement 3 :

 

procedure_sort_1.

H1:   n >= 1.

H2:   n <= maxindex.

      ->

C1:   1 >= 1.

C2:   1 <= n.

C3:   n <= maxindex.

 

For path(s) from statement 6 to (assertion) statement 3 :

 

procedure_sort_2.

H1:   i >= 1.

H2:   i < n.

H3:   j >= 1.

H4:   j <= n - i + 1.

H5:   n <= maxindex.

H6:   not (j <= n - i).

      ->

C1:   i + 1 >= 1.

C2:   i + 1 <= n.

C3:   n <= maxindex.

 

For path(s) from statement 3 to (assertion) statement 6 :

 

procedure_sort_3.

H1:   i >= 1.

H2:   i <= n.

H3:   n <= maxindex.

H4:   i < n.

      ->

C1:   i >= 1.

C2:   i < n.

C3:   1 >= 1.

C4:   1 <= n - i + 1.

C5:   n <= maxindex.

 

For path(s) from statement 6 to (assertion) statement 6 :

 

procedure_sort_4.

H1:   i >= 1.

H2:   i < n.

H3:   j >= 1.

H4:   j <= n - i + 1.

H5:   n <= maxindex.

H6:   j <= n - i.

H7:   not (element(a,[j]) > element(a,[j + 1])).

      ->

C1:   i >= 1.

C2:   i < n.

C3:   j + 1 >= 1.

C4:   j + 1 <= n - i + 1.

C5:   n <= maxindex.

 

procedure_sort_5.

H1:   i >= 1.

H2:   i < n.

H3:   j >= 1.

H4:   j <= n - i + 1.

H5:   n <= maxindex.

H6:   j <= n - i.

H7:   element(a,[j]) > element(a,[j + 1]).

      ->

C1:   i >= 1.

C2:   i < n.

C3:   j + 1 >= 1.

C4:   j + 1 <= n - i + 1.

C5:   n <= maxindex.

 

For path(s) from statement 6 to (check) statement 8 :

 

procedure_sort_6.

H1:   i >= 1.

H2:   i < n.

H3:   j >= 1.

H4:   j <= n - i + 1.

H5:   n <= maxindex.

H6:   j <= n - i.

      ->

C1:   1 <= j.

C2:   j + 1 <= maxindex.

 

For path(s) from statement 3 to (assertion) statement 18 :

 

procedure_sort_7.

***   true. /* transformed postcondition is true */

 

Fig. 4(b): Output from SPADE VCG for SORT with array bound checks

 


        ****************************************************

           SPADE Transcript of Interactive Proof Session

               PVL SPADE PROOF CHECKER VERSION: 1.0a

            Program Validation Limited, Southampton U.K.

        ****************************************************

                DATE : 9-JUN-1988  TIME : 19:38:48

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_sort_1

H1:  n >= 1

H2:  n <= maxindex

-->

  C1:  true

  C2:  1 <= n

  C3:  n <= maxindex

 

STEP 2

Command: done

*** Proved C1: true

*** Proved C2: 1 <= n

*** Proved C3: n <= maxindex

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_sort_2

H1:  i >= 1

H2:  i < n

H3:  j >= 1

H4:  j <= n - i + 1

H5:  n <= maxindex

H6:  not j <= n - i

-->

  C1:  i + 1 >= 1

  C2:  i + 1 <= n

  C3:  n <= maxindex

 

STEP 2

Command: done

*** Proved C1: i + 1 >= 1

*** Proved C3: n <= maxindex

 

STEP 3

Command: standardise

Apply standardiser on C2

>>> New goal C2: - i + n > 0

 

STEP 4

Command: done

*** Proved C2: - i + n > 0

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_sort_3

H1:  i >= 1

H2:  i <= n

H3:  n <= maxindex

H4:  i < n

-->

  C1:  i >= 1

  C2:  i < n

  C3:  true

  C4:  1 <= n - i + 1

  C5:  n <= maxindex

 

STEP 2

Command: done

*** Proved C1: i >= 1

*** Proved C2: i < n

*** Proved C3: true

*** Proved C4: 1 <= n - i + 1

*** Proved C5: n <= maxindex

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_sort_4

H1:  i >= 1

H2:  i < n

H3:  j >= 1

H4:  j <= n - i + 1

H5:  n <= maxindex

H6:  j <= n - i

H7:  not element(a, [j]) > element(a, [j + 1])

-->

  C1:  i >= 1

  C2:  i < n

  C3:  j + 1 >= 1

  C4:  j + 1 <= n - i + 1

  C5:  n <= maxindex

 

STEP 2

Command: done

*** Proved C1: i >= 1

*** Proved C2: i < n

*** Proved C3: j + 1 >= 1

*** Proved C5: n <= maxindex

 

 

 

STEP 3

Command: standardise

Use of standardisation

     on j <= n - i -> j + 1 <= n - i + 1

  gives true

*** New H8: j <= n - i -> j + 1 <= n - i + 1

 

STEP 4

Command: forwardchain(c # 4)

*** New H8: j + 1 <= n - i + 1

*** Proved C4: j + 1 <= n - i + 1

 

STEP 5

Command: done

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_sort_5

H1:  i >= 1

H2:  i < n

H3:  j >= 1

H4:  j <= n - i + 1

H5:  n <= maxindex

H6:  j <= n - i

H7:  element(a, [j]) > element(a, [j + 1])

-->

  C1:  i >= 1

  C2:  i < n

  C3:  j + 1 >= 1

  C4:  j + 1 <= n - i + 1

  C5:  n <= maxindex

 

STEP 2

Command: done

*** Proved C1: i >= 1

*** Proved C2: i < n

*** Proved C3: j + 1 >= 1

*** Proved C5: n <= maxindex

 

STEP 3

Command: standardise

Use of standardisation

     on j <= n - i -> j + 1 <= n - i + 1

  gives true

*** New H8: j <= n - i -> j + 1 <= n - i + 1

 

STEP 4

Command: forwardchain(c # 4)

*** New H8: j + 1 <= n - i + 1

*** Proved C4: j + 1 <= n - i + 1

 

 

STEP 5

Command: done

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

 

STEP 1

Command: newvc

Now attempting proof of VC: procedure_sort_6

H1:  i >= 1

H2:  i < n

H3:  j >= 1

H4:  j <= n - i + 1

H5:  n <= maxindex

H6:  j <= n - i

-->

  C1:  1 <= j

  C2:  j + 1 <= maxindex

 

STEP 2

Command: done

*** Proved C1: 1 <= j

 

STEP 3

Command: standardise

Use of standardisation

     on i >= 1 -> n - i + 1 <= n

  gives true

*** New H7: i >= 1 -> n - i + 1 <= n

 

STEP 4

Command: forwardchain(h # 7)

*** New H7: n - i + 1 <= n

 

STEP 5

Command: replace(h # 6)

Successful substitution with rule: standardisation(1)

  Met constraint: nonvar(standardisation(1))

  Met constraint: standardisation(1) = standardisation(1)

  Met constraint: nonvar(j + 1 <= n - i + 1)

  Met constraint: (j <= n - i) \= (j + 1 <= n - i + 1)

  Met constraint: checktype(j <= n - i, boolean)

  Met constraint: norm_typed_expr(j <= n - i, boolean, 1 - i -

    j + n > 0)

  Met constraint: norm_typed_expr(j + 1 <= n - i + 1, boolean,

     1 - i - j + n > 0)

Allowing substitution of j + 1 <= n - i + 1

for j <= n - i

*** New H6: j + 1 <= n - i + 1

 

STEP 6

Command: done

*** Proved C2: j + 1 <= maxindex

*** Proved all conclusions

*** PROVED VC

 

The only rule used in proving the above VC was:  SPADE$CHECKER:SPECIAL.RUL::standardisation(1)

STEP 1

Command: newvc

Now attempting proof of VC: procedure_sort_7

-->

  C1:  true

 

STEP 2

Command: done

*** Proved C1: true

*** Proved all conclusions

*** PROVED VC

 

The above proof did not make use of the proof rules database

 

 

*** END OF PROOF SESSION

 

 

The only rule used in the above proof session was:  SPADE$CHECKER:SPECIAL.RUL::standardisation(1)

 

 

Command: exit

 

Fig. 4(c): Proof log for SORT with array bound checks

10                 Getting Started

10.1          First steps

If all has been set up correctly, you may type, in response to the operating system prompt:

checker

Then, after a short pause (which may grow longer if you add global or local initialisation files as described in chapter 6, especially if this initialisation involves consulting rule files or loading rule libraries), you will be prompted:

Please type filename, without extension, in lowercase, within single quotes if it is not in this directory, followed by a full-stop.

(FILENAME.vcg and FILENAME.fdl or FILENAME.csv will be read)

Filename? _

The Checker expects the filename to be a Prolog atom (see section 3.5).  If the file is not in your default directory, you will need to use a device and/or directory name; in this case, as the above suggests, you must enclose the filename within single quotes.  The following are all valid responses to the above:

SUN:

Windows

extgcd.

extgcd.

divbysub.

divbysub.

test1.

test1.

’../other/dir/extgcd’.

‘../other/dir/extgcd’.

’/home/user1/checker/maxfn’.

‘h:/checker/maxfn’.

If you type a filename for which the Checker cannot find both the .FDL and the .VCG file, you will be reprimanded and the Checker will list all of the files in your current directory with the extension .FDL or .VCG (whichever file it could not find or, if neither exists, the FDL files).  You will then be asked to try again.  If you have got copies of the files mentioned earlier in your directory, type

divbysub.

Once you type a valid filename, the Checker will read first the FDL declarations and then the .VCG file.  The .VCG file should contain verification conditions: if it contains path functions or weakest preconditions, the Checker will soon give up attempting to make sense of the file.  Furthermore, the version of the SPADE VCG used is important: it must be version 4.3 or later.  Output from the Examiner may also be used, as noted in Chapter 4.

If the .VCG file is acceptable, the Checker will read it in and you will probably see (unless a global or local checker.ini file has set the echo flag to off) the contents of the .FDL file (up to the “start” or “end” statement) and of the .VCG file echoed on the screen as they are read.  Once it has read in all the VCs, you will see a “welcome” message and you will be prompted for a VC to prove.  You must select a VC number from the list displayed.  Given that there must be at least one VC, you can select the first VC by typing

1.

Notice the full-stop: if you have never used Prolog before, you will spend many happy minutes over the next few days waiting for something to happen while the Checker is also sitting waiting for something to happen: namely, for you to terminate your command by typing a full-stop!  If you forget the full-stop, your only clue will be that, once you hit carriage-return, the continuation prompt

|:

will appear; if it does so, it is perfectly acceptable to type the full-stop (if that’s all that is missing) and hit return again.

VC number 1 has now been loaded in ready for our proof attempt to commence and, to reassure us that it has the VC will be displayed on the screen, then a

CHECK|:

prompt will appear.  The Checker is now ready for a command.  To see the VC displayed once again, try

list.

and, to see if any (or all) of the conclusions can be eliminated by the Checker, try

done.

Finally, when you’ve had enough, you can get back out by typing

exit.

and answering yes (or y) if prompted to confirm that you wish to terminate the proof session.

10.2          What to do next

As mentioned in the overview to this document, it is not intended to be a tutorial manual.  We hope that you will be able to attend a SPADE Proof Checker training course as your next step.

If, however, time considerations dictate that you must try to become more familiar with the Proof Checker without such assistance for the present, we suggest the following steps.

1         Chapters 2 and 3 are worth attention at an early stage.

2         You should read through chapter 5 for the first time, paying particular attention to section 5.1.

3         Try to work through the examples covered in chapter 9, without looking at the model proof logs provided.

4         You may find some of the material in chapter 7 meaningful, now that you are a little more battle-hardened after your proof attempts!

5         Finally, the rest of the manual and its sister document on proof rules awaits.  You can dip into the other sections (or refresh yourself on earlier ones) as seems relevant: if you want to start writing your own proof rules, for instance, chapter 5 of the SPADE Proof Checker Rules Manual is a must and the following chapter is also strongly recommended.

 

 

 

A                      Appendix: Command Syntax Summary

 

Command

Abbr.

Full form

case

ca

case NUMBER

consult

co

consult RULEFILENAME

declare

dec

declare

deduce

ded

deduce FORMULA from HYPLIST

delete

del

delete HYPOTHESES

done

do

done CONCLUSIONS

execute

exe

execute COMMANDFILENAME

exit

exit

exit

forceexit

forceexit

forceexit

forget

forg

forget HYPOTHESES

forwardchain

forw

forwardchain HYPORCONC

help

h

help COMMANDORRULEFAMILY

infer

inf

infer FORMULA using RULES from HYPLIST

instantiate

ins

instantiate VARIABLE with EXPRESSION

list

li

list HYPSANDCONCS

newvc

ne

newvc NUMBER

printvc

pri

printvc

prove

pro

prove CONCLUSION by { implication |
    contradiction | induction }

 

 

prove CONCLUSION by cases on DISJUNCTION

quit

quit

quit

remember

rem

remember HYPOTHESES

replace

rep

replace HYPORCONC : OLDEXP by NEWEXP
    using RULES

save

sa

save

set

se

set FLAG to VALUE

show

sh

show

simplify

si

simplify

standardise

stan

standardise FORMULA

status

stat

status

traverse

t

traverse FORMULA

undelete

und

undelete HYPOTHESES

unwrap

unw

unwrap HYPORCONC

Remember that commands can be issued in the full form with all the arguments shown above or with one or more arguments omitted by omission of arguments from the right.  Expressions should be fully instantiated, apart from the subexpression patterns in the replace command.

Expressions may make use of a “where” clause of the form:

PATTERN where MATCH1 & ... & MATCHn

where each MATCHi is either of the form PrologVar=VarFreeExpression or of the form Hyporconc=ExpressionWithPrologVars.

The argument for the rules to be used (in infer and replace) may be of the form

RULE1 & ... & RULEn

where each RULEi is either a specific rule (e.g. arith(3)) or a range of rules within a family (e.g. arith(2-6)) or a whole rule family (e.g. arith) or a Prolog variable (e.g. X) meaning any rule whatsoever.

 

 

Document Control and References

Altran Praxis Limited, 20 Manvers Street, Bath BA1 1PX, UK.

Copyright Altran Praxis Limited 2010.  All rights reserved.

File under

svn/trunk/userdocs/Checker_UM.doc (was formerly S.P0468.73.33)

Changes history

Issue 0.1  (2nd July 1997):  Conversion to new standard document format.

Issue 0.2  (22nd December 1997):  Minor formatting changes made; section headings added to headers.

Issue 0.3  (20th January 1998): After review by Ian O’Neill.

Issue 1.0  (25th January 1998): After formal review.

Issue 1.1  (1st August 2001): Updated to include NT/2000

Issue 1.2  (6th November 2001) Following review.

Issue 2.0  (7th April 2003): Updated to new template format.

Issue 3.0  (29th May 2003): Changes to new template

Issue 3.1  (30th May 2003): Update for Checker 2.00, documenting new behaviour of SPARK “mod” operator.

Issue 3.2 (4 June 2003): Correct formatting.

Issue 3.3 (23 October 2003): Added description of new “newline_after_prompts” flag.

Issue 3.4 (23 November 2004): Changed company name only.

Issue 3.5 (16 August 2005): Updated to reflect change in logging of execute command.

Issue 3.6 (22nd November 2005): Changed Line Manager.

Issue 3.7 (12th December 2005): Updated following review S.P0468.79.90.

Issue 3.8 (15th December 2005): Updated following review.

Issue 3.9 (2nd September 2008): Describe amended qualifier format and the new help and version qualifiers.

Issue 4.0 (18th November 2008): Describe the modified behaviour of the ‘printvc’ command.

Issue 4.1 (2nd February 2009): Modify copyright notice.

Issue 4.2 (24th August 2009): Removed “Replace more” prompt and added “replace_more” option.

Issue 4.3 (11th September 2009): Remove option “replace_more” and added flags “auto_infer_from_false” and “replace_more” flag.

Issue 4.4 (3rd February 2010): Rebrand for Altran Praxis Limited

Issue 4.5 (12th October 2010): Correct references to the Examiner.

Changes forecast

Nil.

Document references

None.