A very simple example

[Next] [Up] [Previous]
Next: Uninterpreted functions Up: Instruction processors Previous: Instruction processors

A very simple example

As a very simple example, let's define an instruction set architecture with just one instruction, performed on values in a register file. Each instruction has two source operands and a destination operand. Thus, an opcode consists of three fields - srca, srcb and dst. For simplicity, we'll make the operation addition. Here's what the ISA model might look like:

  scalarset REG undefined;
  typedef WORD array 0..31 of boolean;

  module main()
  {
    r : array REG of WORD;
    srca, srcb, dst : REG;

    opra, oprb, res : WORD;

    opra := r[srca];
    oprb := r[srcb];
    res  := opra + oprb;
    next(r[dst]) := res;
  }

We've declared a type REG to represent a register index, a type WORD to represent a data word (in this case a 32 bit word). Notice that REG is a undefiend scalarset. That is, we don't say, for the moment, how many registers there are.

Notice, also, that we've given names to the operand values opra and oprb, and to the operation result res. It wasn't necessary to do this. That is, we could have written:

  next(res[dst]) := r[srca] + r[srcb];

This would have been more concise. However, it's convenient to give the intermediate quantities names, since we will use these later in writing refinement relations. Now let's implement this abstract model with a simple 3 stage pipeline, where the fisrt stage fecthes the operands, the second stage does the addition, and the third stage stores the result into the register file. The implementation has a reguster bypass path that forwards the results directly from later stages of pipe to the operand fetch stage.

  /* the implementation */

  /* implementation register file */

  ir : array REG of WORD;

  /* pipe registers */
  
  stage1 : struct {
    valid : boolean;
    dst : REG;
    opra, oprb : WORD;
  }
  
  stage2 : struct{
    valid : boolean;
    dst : REG;
    res : WORD;
  }
  
  /* read stage : fetch operands with bypass */
  
  next(stage1.opra) :=
    case{
    stage1.valid & srca = stage1.dst : alu_output;
    stage2.valid & srca = stage2.dst : stage2.res;
    default : ir[srca];
  };
  next(stage1.oprb) :=
    case{
    stage1.valid & srcb = stage1.dst : alu_output;
    stage2.valid & srcb = stage2.dst : stage2.res;
    default : ir[srcb];
  };
  
  next(stage1.dst) := dst;
  init(stage1.valid) := 0;
  next(stage1.valid) := 1;
  
  
  /* alu stage: add operands */
  
  alu_output : WORD;
  alu_output := stage1.opra + stage1.oprb;
  next(stage2.res) := alu_output;
  
  next(stage2.dst) := stage1.dst;
  init(stage2.valid) := 0;
  next(stage2.valid) := stage1.valid;
  
  
  /* writeback stage: store result in r */
  
  if(stage2.valid)
    next(ir[stage2.dst]) := stage2.res;

Note that each stage has a valid bit, which says whether there is an instruction in it. Initially, these bits are zero.

Now, we would like to write two refinement maps - one which defines the correct operand values in stage1 and the other which defines the correct result at the adder output. To do this, we add some auxiliary state information to each stage tat remembers the correct operand and result values for the given stage, as computed by the abstract model. Let's add the following component to stage1 :

    stage1.aux : struct{
       opra, oprb, res : WORD;
    }

Now, let's add some code to record the correct operand and result values for the first stage:

   next(stage1.aux.opra) := opra;
   next(stage1.aux.oprb) := oprb; 
   next(stage1.aux.res)  := res;

That is, we simply record the abstract model's values for opra, oprb and res. Note, this is why we gave them explicit names in the abstract model. This is all the auxiliary information we'll need to state our refinement relations. However, for e deeper pipeline, we could just pass the auxiliary information down the pipe along with the instructions, as follows:

    next(stage2.aux) := stage1.aux;
    ...

Now, we can state the two refinement maps in terms of the auxiliary state information. For the operands, we specify that, if stage 1 has a valid instruction, then its operands are equal to the correct operand values:

    layer lemma1: {
      if(stage1.valid) stage1.opra := stage1.aux.opra;
      if(stage1.valid) stage1.oprb := stage1.aux.oprb;
    }

For the ALU results, we specify that, if stage1 has a valid instruction, then the ALU output is equal to the correct result value:

    layer lemma2: 
      if(stage1.valid) alu_output := stage1.aux.res;

We would like to show, of course, the correct operands imply correct results, and conversely, correct results imply correct operands. However, since we have an arbitrary number of registers to deal with, we'll need to break lemma1 into cases as a function of which register is being read. The only problem we have in doing this is that we don't know which registers were the source operands for the instruction in stage one, because our implementation does not store this information. This problem is easily solved, however, since we can store the information in our auxiliary state. So let's add two components to the auxiliary state:

   
    next(stage1.aux.srca) := srca;
    next(stage1.aux.srcb) := srcb;

Of course, we have to remember to declare these components in our auxiliary structure (their type is REG). Now, we split the operand refinement maps into cases based on which are the actual source registers of the instruction in stage 1. For the srca operand, we have:

  forall(i in REG)
    subcase lemma1[i] of stage1.opra//lemma1 for stage1.aux.srca = i;

Similarly, for srcb, we have:

  forall(i in REG)
    subcase lemma1[i] of stage1.oprb//lemma1 for stage1.aux.srcb = i;

This way, we only have to consider one register at a time, so we can reduce an arbitrary number of registers to just one, for each case. Note, we don't need to do this for lemma2, the result refinement maps, since it doesn't depend on the register file. It depends only on the operands.

Now we're ready to prove the various cases of our lemmas. For lemma1, we say:

  forall(i in REG)
    using res//free, alu_output//lemma2 prove stage1//lemma1[i];

That is, we assume that the ALU ouput is correct, and show that (future) operands we obtain are correct. Notice that there are several paths that an ALU result might take to get back to the operand registers in stage 1. It might follow the bypass path, or it might get stored in register i. Either way, it should agree with what the abstract model gets. Notice also that the correct storage and forwarding of a result deosn't depend on what the result actually is. For this reason, we free the abstract model's result res. This eliminates the abstract model's ALU from the cone.

To prove the result lemma (lemma2), we assume that operands entering the ALU are correct:

  using opra//free, oprb//free, stage1//lemma1
    prove alu_output//lemma2;

Note, in this case, we don't care what the correct operands actually are - we only care that the abstract model and the implementation agree on them (lemma1). Thus, we free opra and oprb, and eliminate the abstract model register file from the cone. This is important, since this register file is of unbounded size, and in this case we have no single register index to which we can reduce the type REG.

Now, run this example. You'll notice that there are 32 instances to prove for each of

    stage1.opra[i]//lemma1[0]
    stage1.oprb[i]//lemma1[0]
    alu_output[i]//lemma2

where i is a bit index within a word. This is because SMV proves the refinement maps for each of the 32 bits of the data path separately. Later we'll see how to reduce this rather large number of properties. For the moment, however, select property

    stage1.opra[0]//lemma1[0]

and try to verify it. You should get a counterexample. In this counterexemple, the initial value of r[0][0] (a bit in the abstract register file) is zero, while the initial value of ir[0]0] (the corresponding bit in the implementation register file) is one. The problem here is that the abstract model is underspecified. Because we have specified the initial state of the register file, it is nondeterministic. As a result of this, the abstract model and implementation have diverged.

When there is a nondeterministic choice in an abstract model, we sometimes have to provide a ``witness function'' for this choice. That is, as a function of the implementation behavior, we plug in a suitable value in the abstract model. In this case, since the initial value in the specification is complete undefined, we are free to plug in any value we like. So let's write the following:

  init(r) := ir;

That is, we just set the initial value of the abstract model register file to be the same as the initial value of the implementation register file. You might be wondering why we have to do this. That is, why can't SMV figure out what the correct initial value of the register file is. The answer is that it could, for any given property. However, it might use different intial values to prove different properties. As a result, even though we would have ``verified'' all the properties, there would be no single choice that makes all the properties true. Thus, for reasons of soundness, SMV requires you to fix the choice once and for all, and then verifies all the properties for the particular choice you make.

In any event, let's open the new version, with the witness function, and try again to verify stage1.opra[0]//lemma1[0]. You should find that the property is true. Look in the Cone pane, and observe that it contains only 11 boolean state variables. This is bacause we are considering only registers r[0] and ir[0], and only bit 0 of the data path. We obtain only bit 0 of the data path since neither the abstract model ALU nor the implementation ALU is in the cone. The former was eliminated by freeing res, while the latter was eliminated by using lemma2 to drive the ALU output in the implementation.

Now select property alu_output[0]//lemma2. The cone is rather large in this case (66 state variables) because bit 0 depends in this case on all the other bits of the data path through the ALU. (This is because bit 0 is the most significant bit,and depeds on all the others through the carry chain). However, notice the register files are not in the cone in this case, because we have freed opra and oprb, and we have driven the implementation operand registers using lemma1.

Go ahead and verify property alu_output[0]//lemma2. You should find that it checks fairy quickly in spite of the large number of state variables. This is because the ALU operation is addition, and SMV succeeds in finding an ordering of the BDD variables that maes the addition function compact. In fact, select Prop|Verify All to verify all the remaining properties. On my machine, this takes a little under eight seconds.

On the other hand, if we had had a multiplier in the ALU the story would have been different. This is because there is no BDD variable ordering that makes this function compact. The verification of multipliers is beyond the scope of this tutorial. There is, however, a way of separating the problem of airthmentic verification from the processor verification problem. In this way, we can verify the processor design independent of the ALU function. Then we can plug in any ALU function we like.

[Next] [Up] [Previous]
Next: Uninterpreted functions Up: Instruction processors Previous: Instruction processors

Ken McMillan
Fri Nov 6 22:15:28 PST 1998