5.9. The BREAK Statement

Earlier I promised you a BREAK statement to accompany LOOP. This is one I'm sort of proud of. On the face of it a BREAK seems really tricky. My first approach was to just use it as an extra terminator to Block, and split all the loops into two parts, just as I did with the ELSE half of an IF. That turns out not to work, though, because the BREAK statement is almost certainly not going to show up at the same level as the loop itself. The most likely place for a BREAK is right after an IF, which would cause it to exit to the IF construct, not the enclosing loop. Wrong. The BREAK has to exit the inner LOOP, even if it's nested down into several levels of IFs.

My next thought was that I would just store away, in some global variable, the ending label of the innermost loop. That doesn't work either, because there may be a break from an inner loop followed by a break from an outer one. Storing the label for the inner loop would clobber the label for the outer one. So the global variable turned into a stack. Things were starting to get messy.

Then I decided to take my own advice. Remember in the last session when I pointed out how well the implicit stack of a recursive descent parser was serving our needs? I said that if you begin to see the need for an external stack you might be doing something wrong. Well, I was. It is indeed possible to let the recursion built into our parser take care of everything, and the solution is so simple that it's surprising.

The secret is to note that every BREAK statement has to occur within a block … there's no place else for it to be. So all we have to do is to pass into Block the exit address of the innermost loop. Then it can pass the address to the routine that translates the break instruction. Since an IF statement doesn't change the loop level, procedure DoIf doesn't need to do anything except pass the label into its blocks (both of them). Since loops do change the level, each loop construct simply ignores whatever label is above it and passes its own exit label along.

All this is easier to show you than it is to describe. I'll demonstrate with the easiest loop, which is LOOP:

{ Parse and Translate a LOOP Statement }
procedure DoLoop;
var L1, L2: string;
begin
   Match('p');
   L1 := NewLabel;
   L2 := NewLabel;
   PostLabel(L1);
   Block(L2);
   Match('e');
   EmitLn('BRA ' + L1);
   PostLabel(L2);
end;

Notice that DoLoop now has two labels, not just one. The second is to give the BREAK instruction a target to jump to. If there is no BREAK within the loop, we've wasted a label and cluttered up things a bit, but there's no harm done.

Note also that Block now has a parameter, which for loops will always be the exit address. The new version of Block is:

{ Recognize and Translate a Statement Block }
procedure Block(L: string);
begin
   while not(Look in ['e', 'l', 'u']) do begin
      case Look of
       'i': DoIf(L);
       'w': DoWhile;
       'p': DoLoop;
       'r': DoRepeat;
       'f': DoFor;
       'd': DoDo;
       'b': DoBreak(L);
       else Other;
      end;
   end;
end;

Again, notice that all Block does with the label is to pass it into DoIf and DoBreak. The loop constructs don't need it, because they are going to pass their own label anyway.

The new version of DoIf is:

{ Recognize and Translate an IF Construct }
procedure Block(L: string); Forward;
procedure DoIf(L: string);
var L1, L2: string;
begin
   Match('i');
   Condition;
   L1 := NewLabel;
   L2 := L1;
   EmitLn('BEQ ' + L1);
   Block(L);
   if Look = 'l' then begin
      Match('l');
      L2 := NewLabel;
      EmitLn('BRA ' + L2);
      PostLabel(L1);
      Block(L);
   end;
   Match('e');
   PostLabel(L2);
end;

Here, the only thing that changes is the addition of the parameter to procedure Block. An IF statement doesn't change the loop nesting level, so DoIf just passes the label along. No matter how many levels of IF nesting we have, the same label will be used.

Now, remember that DoProgram also calls Block, so it now needs to pass it a label. An attempt to exit the outermost block is an error, so DoProgram passes a null label which is caught by DoBreak:

{ Recognize and Translate a BREAK }
procedure DoBreak(L: string);
begin
   Match('b');
   if L <> '' then
      EmitLn('BRA ' + L)
   else Abort('No loop to break from');
end;

{ Parse and Translate a Program }
procedure DoProgram;
begin
   Block('');
   if Look <> 'e' then Expected('End');
   EmitLn('END')
end;

That almost takes care of everything. Give it a try, see if you can “break” it <pun>. Careful, though. By this time we've used so many letters, it's hard to think of characters that aren't now representing reserved words. Remember: before you try the program, you're going to have to edit every occurence of Block in the other loop constructs to include the new parameter. Do it just like I did for LOOP.

I said almost above. There is one slight problem: if you take a hard look at the code generated for DO, you'll see that if you break out of this loop, the value of the loop counter is still left on the stack. We're going to have to fix that! A shame … that was one of our smaller routines, but it can't be helped. Here's a version that doesn't have the problem:

{ Parse and Translate a DO Statement }
procedure Dodo;
var L1, L2: string;
begin
   Match('d');
   L1 := NewLabel;
   L2 := NewLabel;
   Expression;
   EmitLn('SUBQ #1,D0');
   PostLabel(L1);
   EmitLn('MOVE D0,-(SP)');
   Block(L2);
   EmitLn('MOVE (SP)+,D0');
   EmitLn('DBRA D0,' + L1);
   EmitLn('SUBQ #2,SP');
   PostLabel(L2);
   EmitLn('ADDQ #2,SP');
end;

The two extra instructions, the SUBQ and ADDQ, take care of leaving the stack in the right shape.