CroftSoft / Portfolio

Mini

1999-04-27

A nested-procedure programming language to Java translator using object-oriented design patterns.


Abstract

Whereas some programming languages allow procedures (functions, methods) to be declared and nested within procedures, the Java programming language does not. This causes complications when constructing a tool to automatically translate source code written in such languages to Java. This demonstration shows how a language translator/compiler can be created using object-oriented design patterns and how the inner classes feature of the Java programming language can be used to translate nested procedures in other languages.

Acknowledgements

This project was initiated at the instruction of Iwona Wojciechowska as an assignment for the "Compiler Construction" course at West Virginia University. The "Mini" programming language grammar was provided as part of the course materials.

The demonstration was built using the CUP Parser Generator for Java and JLex: A Lexical Analyzer Generator for Java.

Download

The software is available as licensed under the terms of the Mozilla Public License.

The distribution file, mini.zip, includes the Java source code, compiled bytecode, and documentation.

Any further updates to the source code will be released via the CroftSoft Code Library.

Installation

  1. Install Java 1.2.

  2. Unzip the mini.zip file into a new directory.
    Example:
    
         mkdir C:\mini
    
         copy C:\windows\desktop\mini.zip C:\mini\mini.zip
    
         C:
    
         cd C:\mini
    
         unzip mini.zip
    
         

Execution

  1. Move to the Mini directory. It should contain Mini.jar.
    Example:
    
         C:
    
         cd C:\mini
    
         

  2. Enter the following at command-line prompt:
    
         java -jar Mini.jar test/Test.mini
    
         
    The source code, as translated to Java, will be displayed on your console.

    To save the translated Java source code to a file, add an additional output filename argument to the command:

    
         java -jar Mini.jar test/Test.mini test/Test.java
    
         

Scanning

The JLex tool converts MiniScanner.jlex to MiniScanner.java.

This is used to scan the original Mini source code for tokens.

Parsing

The CUP tool converts MiniParser.jcup to MiniParser.java.

This is used to parse the tokens retrieved from MiniScanner into an abstract syntax tree of MiniNode objects.

See the MiniNode package documentation for a listing of the abstract syntax grammar objects.

Note that in the "Mini" programming language syntax, a "block" may declare a "procedure" and a "procedure" contains a "block". Thus, procedures may be of nested scope.

The final step of creating the abstract syntax tree is to construct a ProgramMiniNode object. Within its constructor, the ProgramMiniNode object calls the checkSemantics(parentMiniNodeStack) method upon itself. Using the "Composite" design pattern, semantics are checked throughout the entire tree by propagating the call to each of the child nodes.

In checking semantics, each MiniNode object must be able to access its ancestor MiniNodes in the object graph. For example, to determine whether a variable NameMiniNode has been declared before use, it must traverse its lineage to determine whether there is a DeclarationMiniNode or ParameterMiniNode which initially declares it.

To potentially support the use of the "Flyweight" or Conservator design pattern in the future, MiniNode objects do not reference their parent MiniNodes. That is, instead of a tree data structure which is currently used, a directed acyclic graph (DAG) may be used in the future where each MiniNode, or just its intrinsic data, may have multiple parents. Thus, a Stack object is passed as the argument to the checkSemantics() method. Prior to calling this method on any children, a MiniNode first pushes itself onto the parentMiniNodeStack. After the method has been called on all of its children with this Stack as the argument, the MiniNode then pops itself off of the Stack. In this manner, a MiniNode receives and passes the lineage during the semantics checking phase without requiring that explicit references be to the parents be stored within the MiniNode objects themselves.

If there is a semantic error, a SemanticErrorException will be thrown and parsing will abort. At this time, a SemanticErrorException may be thrown for any one of the following reasons:

Generation

The "Composite" and "Visitor" design patterns are used to generate code from the abstract syntax tree intermediate representation (IR).

Each MiniNode object implements the method generate(MiniNodeCodeVisitor). When this method is called the abstract syntax node generates code for itself including its descendents. Within the generate() method of a parent MiniNode object, calls are made to generate() method of the children MiniNodes in the style of the Composite design pattern used above during semantic checking.

To allow for the translation of the abstrax syntax tree IR to any arbitrary language of choice, with the decision being made at runtime without the need to redesign or recompile the IR classes, the "Visitor" design pattern is exploited. As an argument to the generate() method, an object that implements the MiniNodeCodeVisitor interface is provided. The MiniNodeCodeVisitor interface defines methods which are capable of generating code for each of the different MiniNode types. The single argument to each of the methods is an instance of the specified type of MiniNode which contains the data and accessor methods required to generate the code.

True to the style of the Visitor design pattern, whenever the generate(MiniNodeCodeVisitor) method of a MiniNode object is called, it in turns calls the appropriate generate<MiniNode-type>() method on the MiniNodeCodeVisitor with itself as an argument. A concrete implementation of a MiniNodeCodeVisitor is created to generate the source code for a particular programming language for each MiniNode class, such as a JavaSourceMiniNodeCodeVisitor. Since the IR is fairly abstract, it should be feasible to implement without redesign a C source code MiniNodeCodeVisitor, a Java bytecode MiniCodeVisitor, or a platform-specific tuple, assembly, or machine language MiniNodeCodeVisitor. Thus, by using the Visitor design pattern, the IR MiniNode classes do not need to be changed to support translation to a novel programming language nor ported to support code generation for a new machine. Instead, only the creation and addition of a single new class, a concrete MiniNodeCodeVisitor implementation for a particular new language or platform, is needed.

Java

The JavaSourceMiniNodeCodeVisitor.java implementation translates a MiniNode IR to Java source code. The constructor takes a PrintStream argument so that the output may be directed to the console or a file.

The Mini grammar defines a "to E do S end" loop structure which loops the statement sequence S a fixed number of times determined by the evaluation of the expression E determined at initiation. Since Java does not support this type of loop structure, it is converted to a "for" loop within the generateDefiniteLoopStatement() implementation. As the initial evaluation of the expression and assignment to a "count" variable requires the creation of a new variable which was not originally in the Mini source code, a new variable must be created. It must be ensured that this new variable not conflict with any variable currently in the namespace of the Mini source code. For this reason, such new variables use underscore characters which are not used within the Mini syntax for variable and procedure names. Additionally, as any new variable introduced in the Java source code must not conflict with any other new variable introduced, the nextTemp() method is used to generate a new integer which is guaranteed to be unique to be used as part of the new variable name.

The Mini grammar defines "read()" and "write()" statements which read and print a variable number of integers from and to the standard input and output respectively. As Java does not support a variable number of arguments to its methods, the generateInputStatement() and generateOutputStatement() implementations simply call the standard i/o methods within a loop. I have chosen to inline code in place for each read() and write() method rather than redirect to a static Java library for the Mini equivalents to avoid the necessity to import such a class with each translation or to add additional methods that did not exist as procedures within the original Mini source code.

Example translations:
Ex1.mini ==> Ex1.java
Ex2.mini ==> Ex2.java
Ex3.mini ==> Ex3.java
NestTest.mini ==> NestTest.java

Note that in the third example, the Mini source code defines a procedure which is then translated into a method in the Java source code. This is fairly simple so long as the Mini source code does not use nested procedures. Note that this version of Mini does not check to see if any of the Mini procedure or variable names are reserved words in Java and will thus need to be renamed.

The fourth example, NestTest.mini, has nested procedures. Procedure "sum3()" is nested within procedure "sum4()" which itself is nested with procedure "average4()". Note that statements within the innermost procedure "sum3()" can call any procedures and access any variables in the encompassing scopes.

The fourth translation, NestTest.java, shows how the semantics of nested procedures can be emulated by using inner classes. Each procedure which contains a nested procedure in translated into an inner class. Where the call to the nested procedure would be, a new instance of the inner class is created and the inner class procedure is then executed.

Java inner classes must not have the same name as any encompassing class. The current version of this translator assigns the inner class the name of the Mini procedure it is translated from which may lead to namespace conflicts. An improvement would be to assign the inner classes new names which were guaranteed to be unique, perhaps using the nextTemp() method described above, and using the procedure names for the inner class methods names without change.

Conclusion

This project demonstrates how a translator/compiler written in Java may be created using object-oriented design patterns. Through the run-time selection of a concrete MiniNodeCodeVisitor implementation, it is suggested that a single abstract syntax IR for a given language may be used to translate one language into many other possible languages or different platform-specific machine code operations without the need to redesign the IR. Finally, a method of automatically translating the semantics of nested procedures found in some languages to the Java programming language, which does not permit nested methods, by using inner classes was presented.

Feedback

Send your questions and comments regarding this program to David Wallace Croft.