14.6. Assignments

Now that we can declare variables of different sizes, it stands to reason that we ought to be able to do something with them. For our first trick, let's just try loading them into our working register, D0. It makes sense to use the same idea we used for Alloc; that is, make a load procedure that can load more than one size. We also want to continue to encapsulate the machine-dependent stuff. The load procedure looks like this:

{ Load a Variable to Primary Register }
procedure LoadVar(Name, Typ: char);
begin
   Move(Typ, Name + '(PC)', 'D0');
end;

On the 68000, at least, many instructions turn out to be MOVEs. It's useful to create a separate code generator just for these instructions, and then call it as needed. (Since LoadVar calls Move, Move must appear ahead of LoadVar in the source file.)

{ Generate a Move Instruction }
procedure Move(Size: char; Source, Dest: String);
begin
   EmitLn('MOVE.' + Size + ' ' + Source + ',' + Dest);
end;

Note that these two routines are strictly code generators; they have no error-checking or other logic. To complete the picture, we need one more layer of software that provides these functions.

First of all, we need to make sure that the type we are dealing with is a loadable type. This sounds like a job for another recognizer:

{ Recognize a Legal Variable Type }
function IsVarType(c: char): boolean;
begin
   IsVarType := c in ['B', 'W', 'L'];
end;

Next, it would be nice to have a routine that will fetch the type of a variable from the symbol table, while checking it to make sure it's valid:

{ Get a Variable Type from the Symbol Table }
function VarType(Name: char): char;
var Typ: char;
begin
   Typ := TypeOf(Name);
   if not IsVarType(Typ) then Abort('Identifier ' + Name +
                                        ' is not a variable');
   VarType := Typ;
end;

Armed with these tools, a procedure to cause a variable to be loaded becomes trivial:

{ Load a Variable to the Primary Register }
procedure Load(Name: char);
begin
   LoadVar(Name, VarType(Name));
end;

(NOTE to the concerned: I know, I know, this is all very inefficient. In a production program, we probably would take steps to avoid such deep nesting of procedure calls. Don't worry about it. This is an EXERCISE, remember? It's more important to get it right and understand it, than it is to make it get the wrong answer, quickly. If you get your compiler completed and find that you're unhappy with the speed, feel free to come back and hack the code to speed it up!)

It would be a good idea to test the program at this point. Since we don't have a procedure for dealing with assignments yet, I just added the lines:

     Load('A');
     Load('B');
     Load('C');
     Load('X');

to the main program. Thus, after the declaration section is complete, they will be executed to generate code for the loads. You can play around with this, and try different combinations of declarations to see how the errors are handled.

I'm sure you won't be surprised to learn that storing variables is a lot like loading them. The necessary procedures are shown next:

{ Store Primary to Variable }
procedure StoreVar(Name, Typ: char);
begin
   EmitLn('LEA ' + Name + '(PC),A0');
   Move(Typ, 'D0', '(A0)');
end;

{ Store a Variable from the Primary Register }
procedure Store(Name: char);
begin
   StoreVar(Name, VarType(Name));
end;

You can test this one the same way as the loads.
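
If you'd rather poke at the load and store routines in isolation, something like the following stand-alone harness should do. It is only a sketch: the ST array, TypeOf, EmitLn, and Abort shown here are simplified stand-ins that I'm assuming for illustration, not the real cradle or symbol table, and the three "declarations" are simply hard-wired in the main program.

```pascal
program LoadStoreDemo;
{ Stand-alone harness. ST, TypeOf, EmitLn and Abort are simplified
  stand-ins (assumptions) for the real cradle and symbol table. }

const TAB = ^I;

var ST: array['A'..'Z'] of char;   { one type code per variable name }
    i: char;

{ Report an Error and Halt }
procedure Abort(s: string);
begin
   WriteLn('Error: ', s, '.');
   Halt;
end;

{ Output a String with Tab and Newline }
procedure EmitLn(s: string);
begin
   WriteLn(TAB, s);
end;

{ Look Up the Type Code of a Name }
function TypeOf(N: char): char;
begin
   TypeOf := ST[N];
end;

{ Generate a Move Instruction }
procedure Move(Size: char; Source, Dest: string);
begin
   EmitLn('MOVE.' + Size + ' ' + Source + ',' + Dest);
end;

{ Load a Variable to Primary Register }
procedure LoadVar(Name, Typ: char);
begin
   Move(Typ, Name + '(PC)', 'D0');
end;

{ Store Primary to Variable }
procedure StoreVar(Name, Typ: char);
begin
   EmitLn('LEA ' + Name + '(PC),A0');
   Move(Typ, 'D0', '(A0)');
end;

{ Recognize a Legal Variable Type }
function IsVarType(c: char): boolean;
begin
   IsVarType := c in ['B', 'W', 'L'];
end;

{ Get a Variable Type, Aborting if the Name is not a Variable }
function VarType(Name: char): char;
var Typ: char;
begin
   Typ := TypeOf(Name);
   if not IsVarType(Typ) then
      Abort('Identifier ' + Name + ' is not a variable');
   VarType := Typ;
end;

procedure Load(Name: char);
begin
   LoadVar(Name, VarType(Name));
end;

procedure Store(Name: char);
begin
   StoreVar(Name, VarType(Name));
end;

begin
   for i := 'A' to 'Z' do ST[i] := '?';   { '?' marks an undeclared name }
   ST['A'] := 'B';                        { byte a }
   ST['B'] := 'W';                        { word b }
   ST['C'] := 'L';                        { long c }
   Load('B');
   Store('B');
   Load('X');                             { undeclared: VarType aborts here }
end.
```

With those settings, the harness should emit a MOVE.W load of B, then the LEA and MOVE.W pair for the store, and then stop with the "Identifier X is not a variable" message. Change the type codes in the main program to see the sizes in the generated MOVEs follow along.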

Now, of course, it's a RATHER small step to use these to handle assignment statements. What we'll do is to create a special version of procedure Block that supports only assignment statements, and also a special version of Expression that only supports single variables as legal expressions. Here they are:

{ Parse and Translate an Expression }
procedure Expression;
begin
   Load(GetName);
end;

{ Parse and Translate an Assignment Statement }
procedure Assignment;
var Name: char;
begin
   Name := GetName;
   Match('=');
   Expression;
   Store(Name);
end;

{ Parse and Translate a Block of Statements }
procedure Block;
begin
   while Look <> '.' do begin
      Assignment;
      Fin;
   end;
end;

(It's worth noting that the new procedures that permit us to manipulate types are, if anything, even simpler and cleaner than what we've seen before. This is mostly thanks to our efforts to encapsulate the code generator procedures.)

There is one small, nagging problem. Before, we used the Pascal terminating period to get us out of procedure TopDecls. This is now the wrong character … it's used to terminate Block. In previous programs, we've used the BEGIN symbol (abbreviated 'b') to get us out. But that is now used as a type symbol.

The solution, while somewhat of a kludge, is easy enough. We'll use an UPPER CASE 'B' to stand for the BEGIN. So change the character in the WHILE loop within TopDecls, from '.' to 'B', and everything will be fine.

Now, we can complete the task by changing the main program to read:

{ Main Program }
begin
   Init;
   TopDecls;
   Match('B');
   Fin;
   Block;
   DumpTable;
end.

(Note that I've had to sprinkle a few calls to Fin around to get us out of Newline troubles.)

OK, run this program. Try the input:

     ba        { byte a }   *** DON'T TYPE THE COMMENTS!!! ***
     wb        { word b }
     lc        { long c }
     B         { begin  }
     a=a
     a=b
     a=c
     b=a
     b=b
     b=c
     c=a
     c=b
     c=c
     .

For each declaration, you should get code generated that allocates storage. For each assignment, you should get code that loads a variable of the correct size, and stores one, also of the correct size.

There's only one small little problem: The generated code is wrong!

Look at the code for a=c above. The code is:

     MOVE.L    C(PC),D0
     LEA       A(PC),A0
     MOVE.B    D0,(A0)

This code is correct. It will cause the lower eight bits of C to be stored into A, which is a reasonable behavior. It's about all we can expect to happen.

But now, look at the opposite case. For c=a, the code generated is:

     MOVE.B    A(PC),D0
     LEA       C(PC),A0
     MOVE.L    D0,(A0)

This is NOT correct. It will cause the byte variable A to be stored into the lower eight bits of D0. According to the rules for the 68000 processor, the upper 24 bits are unchanged. This means that when we store the entire 32 bits into C, whatever garbage was in those high bits will also get stored. Not good.
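
To see the damage in actual numbers, here is a small simulation of the c=a sequence in plain Pascal. It's a sketch under stated assumptions: D0 is modeled as an ordinary LongInt, the leftover value $12345678 is arbitrary, and Hex8 is a hypothetical helper of my own that only handles non-negative values.

```pascal
program HighBitsDemo;
{ Models D0 as a 32-bit integer. The starting value is arbitrary;
  it stands for whatever earlier code left in the register. }

var D0, C: LongInt;
    A: Byte;

{ Format a non-negative value as eight hex digits (hypothetical helper) }
function Hex8(V: LongInt): string;
const Digits: string[16] = '0123456789ABCDEF';
var i: integer;
    s: string;
begin
   s := '';
   for i := 1 to 8 do begin
      s := Digits[(V and 15) + 1] + s;   { peel off the low nibble }
      V := V shr 4;
   end;
   Hex8 := s;
end;

begin
   D0 := $12345678;                  { garbage left over in D0 }
   A  := $05;
   D0 := (D0 and $FFFFFF00) or A;    { MOVE.B A(PC),D0: low 8 bits only }
   C  := D0;                         { MOVE.L D0,(A0), with A0 pointing at C }
   WriteLn('C = $', Hex8(C));
end.
```

This prints C = $12345605, where we wanted $00000005. The low byte is right; the other three bytes are whatever happened to be lying around in D0.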

So what we have run into here, early on, is the issue of TYPE CONVERSION, or COERCION.

Before we do anything with variables of different types, even if it's just to copy them, we have to face up to the issue. It is not the easiest part of a compiler. Most of the bugs I have seen in production compilers have had to do with errors in type conversion for some obscure combination of arguments. As usual, there is a tradeoff between compiler complexity and the potential quality of the generated code, and as usual, we will take the path that keeps the compiler simple. I think you'll find that, with this approach, we can keep the potential complexity in check rather nicely.