15.9. Parsing

At this point, we have distributed all the routines that made up our Cradle into units that we can draw upon as we need them. Obviously, they will evolve further as we continue the process of bootstrapping ourselves up again, but for the most part their content, and certainly the architecture that they imply, is defined. What remains is to embody the language syntax in the parser unit. We won't do much of that in this installment, but I do want to do a little, just to leave us with the good feeling that we still know what we're doing. So before we go, let's generate just enough of a parser to process single factors in an expression. In the process, we'll find that we have, by necessity, created a code generator unit as well.

Remember the very first installment of this series? We read an integer value, say n, and generated the code to load it into the D0 register via an immediate move:

        MOVE #n,D0

Shortly afterwards, we repeated the process for a variable,

        MOVE X(PC),D0

and then for a factor that could be either constant or variable. For old times' sake, let's revisit that process. Define the following new unit:

unit Parser;

interface
uses Input, Scanner, Errors, CodeGen;
procedure Factor;

implementation

{ Parse and Translate a Factor }
procedure Factor;
begin
        LoadConstant(GetNumber);
end;

end.

As you can see, this unit calls a procedure, LoadConstant, which actually emits the assembly-language code. LoadConstant lives in a new unit, CodeGen, which the parser now uses. This step represents the last major change in our architecture from earlier installments: the removal of the machine-dependent code to a separate unit. If I have my way, there will not be a single line of code, outside of CodeGen, that betrays the fact that we're targeting the 68000 CPU. And this is one place where I think that having my way is quite feasible.

For those of you who wish I were using the 80x86 architecture (or any other one) instead of the 68000, here's your answer: Merely replace CodeGen with one suitable for your CPU of choice.

So far, our code generator has only one procedure in it. Here's the unit:

unit CodeGen;

interface
uses Output;
procedure LoadConstant(n: string);

implementation

{ Load the Primary Register with a Constant }
procedure LoadConstant(n: string);
begin
        EmitLn('MOVE #' + n + ',D0');
end;

end.

Copy and compile this unit, and execute the following main program:

program Main;
uses WinCRT, Input, Output, Errors, Scanner, Parser;
begin
        Factor;
end.

There it is, the generated code, just as we hoped it would be.

Now, I hope you can begin to see the advantage of the unit-based architecture of our new design. Here we have a main program that's all of five lines long. That's all of the program we need to see, unless we choose to see more. And yet, all those units are sitting there, patiently waiting to serve us. We can have our cake and eat it too, in that we have simple and short code, but powerful allies.

What remains to be done is to flesh out the units to match the capabilities of earlier installments. We'll do that in the next installment, but before I close, let's finish out the parsing of a factor, just to satisfy ourselves that we still know how. The final version of CodeGen includes the new procedure, LoadVariable:

unit CodeGen;

interface
uses Output;
procedure LoadConstant(n: string);
procedure LoadVariable(Name: string);

implementation

{ Load the Primary Register with a Constant }
procedure LoadConstant(n: string);
begin
        EmitLn('MOVE #' + n + ',D0');
end;

{ Load a Variable to the Primary Register }
procedure LoadVariable(Name: string);
begin
        EmitLn('MOVE ' + Name + '(PC),D0');
end;

end.
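
To make the earlier point about retargeting concrete, here is a rough sketch of what an 80x86 flavor of these two procedures might look like. It is only an illustration, not part of this series' code: the choice of AX as the primary register and the bracketed memory-operand syntax are my assumptions, and EmitLn here is a local stand-in for the one in unit Output, included only so that the sketch compiles on its own.

```pascal
program X86Sketch;

const
  TAB = #9;

{ Local stand-in for EmitLn from unit Output (assumed behavior:
  write a tab, the instruction text, and a newline) }
procedure EmitLn(s: string);
begin
  WriteLn(TAB, s);
end;

{ Load the Primary Register with a Constant
  (assumption: AX plays the role that D0 plays on the 68000) }
procedure LoadConstant(n: string);
begin
  EmitLn('MOV AX,' + n);
end;

{ Load a Variable to the Primary Register
  (assumption: a word-sized variable addressed by its label) }
procedure LoadVariable(Name: string);
begin
  EmitLn('MOV AX,[' + Name + ']');
end;

begin
  LoadConstant('7');
  LoadVariable('X');
end.
```

In a real port, the two Load procedures would sit in unit CodeGen behind the same interface shown above, so neither Parser nor the main program would need to change.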

The interface of the parser unit doesn't change, but its implementation now holds a more complex version of procedure Factor:

{ Parse and Translate a Factor }
procedure Factor;
begin
        if IsDigit(Look) then
                LoadConstant(GetNumber)
        else if IsAlpha(Look) then
                LoadVariable(GetName)
        else
                Error('Unrecognized character ' + Look);
end;
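
If you'd like to watch the three branches of Factor fire without assembling all the units, the following single-file sketch may help. Treat it as a toy under stated assumptions rather than the real thing: the input comes from a string instead of the keyboard, and Source, Cursor, Translate, and the simplified GetChar, GetName, GetNumber, Error, and EmitLn are hypothetical stand-ins for what the Input, Scanner, Errors, and Output units provide. Only Factor, LoadConstant, and LoadVariable are taken as they appear above.

```pascal
program FactorSketch;

const
  TAB = #9;

var
  Source: string;    { hypothetical: the text being parsed }
  Cursor: integer;   { hypothetical: index of the next unread character }
  Look: char;        { lookahead character }

{ Stand-in for the Input unit: fetch the next character, or #0 at the end }
procedure GetChar;
begin
  if Cursor <= Length(Source) then
  begin
    Look := Source[Cursor];
    Inc(Cursor);
  end
  else
    Look := #0;
end;

function IsAlpha(c: char): boolean;
begin
  IsAlpha := UpCase(c) in ['A'..'Z'];
end;

function IsDigit(c: char): boolean;
begin
  IsDigit := c in ['0'..'9'];
end;

{ Simplified stand-ins for the Scanner unit; no error checking here }
function GetName: string;
var
  n: string;
begin
  n := '';
  while IsAlpha(Look) or IsDigit(Look) do
  begin
    n := n + UpCase(Look);
    GetChar;
  end;
  GetName := n;
end;

function GetNumber: string;
var
  n: string;
begin
  n := '';
  while IsDigit(Look) do
  begin
    n := n + Look;
    GetChar;
  end;
  GetNumber := n;
end;

{ Stand-ins for the Errors and Output units }
procedure Error(s: string);
begin
  WriteLn('Error: ', s);
end;

procedure EmitLn(s: string);
begin
  WriteLn(TAB, s);
end;

procedure LoadConstant(n: string);
begin
  EmitLn('MOVE #' + n + ',D0');
end;

procedure LoadVariable(Name: string);
begin
  EmitLn('MOVE ' + Name + '(PC),D0');
end;

{ Parse and Translate a Factor }
procedure Factor;
begin
  if IsDigit(Look) then
    LoadConstant(GetNumber)
  else if IsAlpha(Look) then
    LoadVariable(GetName)
  else
    Error('Unrecognized character ' + Look);
end;

{ Hypothetical driver: prime the lookahead, then parse one factor }
procedure Translate(s: string);
begin
  Source := s;
  Cursor := 1;
  GetChar;
  Factor;
end;

begin
  Translate('42');   { constant branch }
  Translate('x');    { variable branch }
  Translate('?');    { error branch }
end.
```

The three calls at the bottom exercise the constant, variable, and error paths in turn. Notice that Factor itself never mentions a register or an instruction mnemonic; everything 68000-specific stays inside the two Load procedures.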

Now, without altering the main program, you should find that our program will process either a variable or a constant factor. At this point, our architecture is almost complete; we have units to do all the dirty work, and enough code in the parser and code generator to demonstrate that everything works. What remains is to flesh out the units we've defined, particularly the parser and code generator, to support the more complex syntax elements that make up a real language. Since we've done this many times before in earlier installments, it shouldn't take long to get us back to where we were before the long hiatus. We'll continue this process in Installment 16, coming soon. See you then.