mirror of
https://github.com/soconnor0919/eceg431.git
synced 2025-12-11 06:34:43 -05:00
# yacc-based Jack Compiler

A complete implementation of the Jack programming language compiler built using traditional yacc/lex tools. This compiler translates Jack source code into VM code for the Hack virtual machine.

## Overview

This project implements a full Jack compiler using:

- **lex/flex** for lexical analysis (tokenization)
- **yacc/byacc** for syntax analysis and code generation
- **C** for symbol table management and VM code output

The compiler handles all Jack language constructs and compiles all Project 11 test programs from the nand2tetris course.

## Architecture

```
jack.l              # Lexical analyzer (tokenizer)
jack.y              # Parser with embedded code generation
symbol_table.c/h    # Symbol table management
vm_writer.c/h       # VM code output module
Makefile            # Build system
jack_compiler       # Final executable
```

## Features

### ✅ Complete Jack Language Support

- **Classes and Objects**: Constructors, methods, fields, static variables
- **Data Types**: int, char, boolean, arrays, strings, user-defined classes
- **Control Flow**: if/else statements, while loops
- **Expressions**: All operators with proper precedence
- **Function Calls**: Methods, functions, constructors, OS calls
- **Memory Management**: Proper object allocation and deallocation

### ✅ Advanced Compiler Features

- **Two-level symbol tables** (class scope and subroutine scope)
- **Proper variable scoping** and lifetime management
- **Method dispatch** with correct 'this' pointer handling
- **Array indexing** with bounds checking
- **String constants** with automatic memory management
- **Error reporting** with line numbers

## Building

### Prerequisites

- `gcc` compiler
- `byacc` (Berkeley yacc)
- `flex` (Fast lexical analyzer)

On macOS with Homebrew:

```bash
brew install byacc flex
```

### Compilation

```bash
make clean
make
```

This produces the `jack_compiler` executable.

## Usage

Compile a single Jack file:

```bash
./jack_compiler MyProgram.jack
```

This creates `MyProgram.vm` in the same directory.

To run the compiled program:

1. Copy all OS .vm files to the program directory
2. Load the directory in the VM Emulator
3. Run the program

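
The `MyProgram.jack` to `MyProgram.vm` naming described above can be sketched in C as follows. This is an illustration only: `vm_path_for` is a hypothetical helper name, and the real `jack_compiler` may derive its output path differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the .vm path for a .jack input into `out`.
 * Returns 0 on success, -1 if `input` does not end in ".jack" or if
 * `out` is too small to hold the result. */
int vm_path_for(const char *input, char *out, size_t size) {
    assert(input != NULL && out != NULL);
    const char *ext = ".jack";
    size_t len = strlen(input);
    size_t ext_len = strlen(ext);

    /* Require a non-empty stem followed by the ".jack" extension. */
    if (len <= ext_len || strcmp(input + len - ext_len, ext) != 0) {
        return -1;
    }

    /* Copy the stem and append ".vm"; the directory part is left untouched. */
    int n = snprintf(out, size, "%.*s.vm", (int)(len - ext_len), input);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Because only the extension is swapped, the output lands next to the input file, which matches the behavior stated above.
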
## Test Programs

The compiler successfully compiles all official nand2tetris Project 11 test programs:

| Program | Description | Status |
|---------|-------------|--------|
| **Seven** | Simple arithmetic expression | ✅ EXACT MATCH with reference |
| **ConvertToBin** | Binary conversion with loops | ✅ Compiles and runs |
| **Square** | Object-oriented drawing program | ✅ Compiles and runs |
| **Average** | Array processing | ✅ Compiles and runs |
| **Pong** | Complete game with multiple classes | ✅ Compiles and runs |
| **ComplexArrays** | Advanced array operations | ✅ Compiles and runs |

### Testing All Programs

```bash
make test-all
```

## Implementation Details

### Lexical Analysis (jack.l)

- Recognizes all Jack language tokens
- Handles comments (single-line and multi-line)
- Processes string literals and integer constants
- Manages keywords and identifiers

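
Separating keywords from identifiers comes down to checking a scanned word against Jack's 21 reserved words. In a flex specification this is usually done with one rule per keyword; the plain-C lookup below is only a sketch of the same classification, and `is_jack_keyword` is a hypothetical name that does not come from `jack.l`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 21 reserved words of the Jack language. */
static const char *const JACK_KEYWORDS[] = {
    "class", "constructor", "function", "method", "field", "static", "var",
    "int", "char", "boolean", "void", "true", "false", "null", "this",
    "let", "do", "if", "else", "while", "return"
};

/* Returns 1 if `word` is a Jack keyword, 0 if it should be treated as an
 * identifier. The comparison is case-sensitive, as Jack is. */
int is_jack_keyword(const char *word) {
    assert(word != NULL);
    size_t count = sizeof JACK_KEYWORDS / sizeof JACK_KEYWORDS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(word, JACK_KEYWORDS[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```
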
### Syntax Analysis & Code Generation (jack.y)

- Complete Jack grammar with proper precedence
- Embedded actions for direct VM code generation
- Symbol table integration for variable resolution
- Control flow translation with label management

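
Label management for control flow generally means giving each `while` or `if` its own pair of unique labels. The sketch below shows one common scheme, a running counter plus the usual `while` skeleton. The label names `WHILE_EXP`/`WHILE_END` and both function names are assumptions for illustration, not taken from `jack.y`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter; a real compiler might reset it per subroutine. */
static int label_counter = 0;

/* Reserve a fresh numeric id for one control-flow construct. */
int new_label_id(void) {
    return label_counter++;
}

/* Format the VM skeleton of `while (cond) { body }` into `buf`.
 * `cond` and `body` are already-generated VM text, each ending in '\n'.
 * Returns the number of characters snprintf would have written. */
int format_while(char *buf, size_t size, int id,
                 const char *cond, const char *body) {
    assert(buf != NULL && cond != NULL && body != NULL);
    return snprintf(buf, size,
                    "label WHILE_EXP%d\n"
                    "%s"                      /* evaluate the condition  */
                    "not\n"                   /* exit when it is false   */
                    "if-goto WHILE_END%d\n"
                    "%s"                      /* loop body               */
                    "goto WHILE_EXP%d\n"
                    "label WHILE_END%d\n",
                    id, cond, id, body, id, id);
}
```

In a single-pass parser the id has to be reserved before the body is parsed and remembered until the loop closes, so nested loops each keep their own id.
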
### Symbol Table (symbol_table.c)

- Hierarchical scoping (class and subroutine levels)
- Variable classification (static, field, local, argument)
- Automatic index assignment for memory segments
- Type information tracking

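
A two-level table with per-kind running indices could look roughly like the sketch below. It assumes fixed-size arrays and linear search for brevity; every type and function name here is hypothetical, and the actual `symbol_table.c` interface may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { KIND_STATIC, KIND_FIELD, KIND_ARG, KIND_VAR, KIND_COUNT } Kind;

typedef struct {
    char name[64];
    char type[64];
    Kind kind;
    int  index;   /* position within the kind's memory segment */
} Symbol;

#define MAX_SYMBOLS 128

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    int    count;
    int    next_index[KIND_COUNT];  /* running index per kind */
} Scope;

static Scope class_scope;  /* static and field variables */
static Scope sub_scope;    /* arguments and locals       */

void start_subroutine(void) { memset(&sub_scope, 0, sizeof sub_scope); }
void start_class(void) {
    memset(&class_scope, 0, sizeof class_scope);
    start_subroutine();
}

/* Add a symbol to the scope implied by its kind; returns its index. */
int define_symbol(const char *name, const char *type, Kind kind) {
    assert(name != NULL && type != NULL && kind != KIND_COUNT);
    Scope *s = (kind == KIND_STATIC || kind == KIND_FIELD) ? &class_scope
                                                           : &sub_scope;
    assert(s->count < MAX_SYMBOLS);
    Symbol *sym = &s->entries[s->count++];
    snprintf(sym->name, sizeof sym->name, "%s", name);
    snprintf(sym->type, sizeof sym->type, "%s", type);
    sym->kind = kind;
    sym->index = s->next_index[kind]++;
    return sym->index;
}

static const Symbol *find_in(const Scope *s, const char *name) {
    for (int i = 0; i < s->count; i++) {
        if (strcmp(s->entries[i].name, name) == 0) return &s->entries[i];
    }
    return NULL;
}

/* Subroutine scope shadows class scope; NULL means "not a variable". */
const Symbol *lookup_symbol(const char *name) {
    const Symbol *hit = find_in(&sub_scope, name);
    return hit ? hit : find_in(&class_scope, name);
}
```

A `NULL` result from `lookup_symbol` is useful in its own right: in a call such as `Foo.bar()`, a name that is not a known variable can be treated as a class name.
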
### VM Code Output (vm_writer.c)

- Direct VM command generation
- Proper segment mapping (local, argument, this, that, etc.)
- Function calls and returns
- Arithmetic and logical operations

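
Segment mapping follows the standard Jack convention: static variables live in `static`, fields in `this`, arguments in `argument`, and locals in `local`. The sketch below shows that mapping with hypothetical names (`SymKind`, `segment_for`, `format_push`) that are not taken from `vm_writer.c`.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { SYM_STATIC, SYM_FIELD, SYM_ARG, SYM_VAR } SymKind;

/* Map a variable kind to the VM memory segment that stores it. */
const char *segment_for(SymKind kind) {
    switch (kind) {
    case SYM_STATIC: return "static";
    case SYM_FIELD:  return "this";
    case SYM_ARG:    return "argument";
    case SYM_VAR:    return "local";
    }
    assert(!"unknown symbol kind");
    return "";
}

/* Format `push <segment> <index>` for a variable into `buf`.
 * Returns the number of characters snprintf would have written. */
int format_push(char *buf, size_t size, SymKind kind, int index) {
    assert(buf != NULL && index >= 0);
    return snprintf(buf, size, "push %s %d\n", segment_for(kind), index);
}
```

For example, pushing the third field of an object (`SYM_FIELD`, index 2) yields the line `push this 2`.
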
## Code Generation Examples

### Simple Expression

```jack
// Jack code (a function of class Main)
function void main() {
    do Output.printInt(1 + (2 * 3));
    return;
}
```

```vm
// Generated VM code
function Main.main 0
push constant 1
push constant 2
push constant 3
call Math.multiply 2    // 2 * 3
add                     // 1 + 6
call Output.printInt 1
pop temp 0              // discard the void call's return value
push constant 0         // void subroutines return 0
return
```

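
To check that this instruction order really evaluates `1 + (2 * 3)`, the listing can be traced on a tiny stack model. The C sketch below is not part of the compiler: `call_multiply` is a stand-in for the OS routine `Math.multiply`, and all names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_MAX 16

static int16_t stack[STACK_MAX];  /* Hack VM words are 16 bits wide */
static int sp = 0;

static void push(int16_t v) { assert(sp < STACK_MAX); stack[sp++] = v; }
static int16_t pop(void)    { assert(sp > 0); return stack[--sp]; }

/* Stand-in for `call Math.multiply 2`: pops two arguments, pushes the product. */
static void call_multiply(void) {
    int16_t b = pop();
    int16_t a = pop();
    push((int16_t)(a * b));
}

/* The VM `add` command: pops two values, pushes their sum. */
static void op_add(void) {
    int16_t b = pop();
    int16_t a = pop();
    push((int16_t)(a + b));
}

/* Replays the push/call/add prefix of the listing and returns the value
 * that would be handed to Output.printInt. */
int16_t eval_example(void) {
    sp = 0;
    push(1);
    push(2);
    push(3);          /* stack: 1 2 3 */
    call_multiply();  /* stack: 1 6   */
    op_add();         /* stack: 7     */
    return pop();
}
```

The operands are pushed in source order and each operator is emitted after both of its operands, which is why the parenthesized product is computed before the addition.
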
### Object Construction

```jack
// Jack code (class Square declares the fields _x, _y, _size)
constructor Square new(int x, int y, int size) {
    let _x = x;
    let _y = y;
    let _size = size;
    do draw();
    return this;
}
```

```vm
// Generated VM code
function Square.new 0
push constant 3         // number of fields to allocate
call Memory.alloc 1
pop pointer 0           // anchor `this` at the new object
push argument 0
pop this 0              // _x = x
push argument 1
pop this 1              // _y = y
push argument 2
pop this 2              // _size = size
push pointer 0          // pass `this` as the hidden first argument
call Square.draw 1
pop temp 0
push pointer 0          // return this
return
```

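
The first four lines of the constructor listing form a fixed prologue: declare the function, allocate one word per field, and point `this` at the new block. A minimal sketch of emitting that prologue is shown below; `format_constructor_prologue` and its parameters are hypothetical and do not come from the project's sources.

```c
#include <assert.h>
#include <stdio.h>

/* Format the VM prologue of a Jack constructor into `buf`.
 * n_locals: number of `var` declarations in the constructor.
 * n_fields: number of `field` variables declared by the class.
 * Returns the number of characters snprintf would have written. */
int format_constructor_prologue(char *buf, size_t size,
                                const char *class_name, const char *sub_name,
                                int n_locals, int n_fields) {
    assert(buf != NULL && class_name != NULL && sub_name != NULL);
    assert(n_locals >= 0 && n_fields >= 0);
    return snprintf(buf, size,
                    "function %s.%s %d\n"
                    "push constant %d\n"      /* object size in words  */
                    "call Memory.alloc 1\n"   /* returns base address  */
                    "pop pointer 0\n",        /* this = base address   */
                    class_name, sub_name, n_locals, n_fields);
}
```

Calling it with `"Square"`, `"new"`, `0`, and `3` reproduces the opening lines of the listing above.
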
## Technical Achievements

### Compiler Construction Excellence

- **Industry-standard tools**: Uses yacc/lex, long-established tools for building parsers and scanners
- **Syntax-directed translation**: Code generation embedded directly in grammar rules
- **Proper error handling**: Error messages include line numbers
- **Memory efficiency**: Direct code generation without an intermediate AST

### Jack Language Mastery

- **Complete implementation**: Handles all language constructs
- **Semantic correctness**: Proper variable scoping, type handling, memory management
- **VM compliance**: Generates code that runs correctly on the Hack VM
- **Performance**: Fast compilation with minimal overhead

## Comparison with Reference

The yacc compiler generates **functionally equivalent** but sometimes **structurally different** VM code compared to the reference implementation:

| Aspect | Reference | Our Compiler | Status |
|--------|-----------|--------------|--------|
| **Simple Programs** | `Seven` program | Identical output | ✅ EXACT MATCH |
| **Boolean Constant `true`** | `push constant 0; not` | `push constant 1; neg` | ✅ Both correct |
| **Control Flow** | Structured loops | Equivalent logic | ✅ Functionally identical |
| **Object Methods** | Standard dispatch | Standard dispatch | ✅ Compatible |
| **All Test Programs** | Pass VM tests | Pass VM tests | ✅ Full compatibility |

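
The two encodings in the Boolean row are interchangeable because both leave the 16-bit word with all bits set (-1) on the stack, which is how the Hack VM represents `true`. The short C check below models the two VM commands with ordinary 16-bit arithmetic; the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Models `push constant 0` followed by `not` (bitwise NOT). */
int16_t true_via_not(void) {
    int16_t x = 0;
    return (int16_t)~x;
}

/* Models `push constant 1` followed by `neg` (two's-complement negation). */
int16_t true_via_neg(void) {
    int16_t x = 1;
    return (int16_t)-x;
}
```

Both functions return -1, whose bit pattern is `0xFFFF`.
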
## Educational Value

This project demonstrates:

1. **Classical Compiler Theory**: Lexical analysis, syntax analysis, code generation
2. **Tool Mastery**: Professional use of yacc/lex for language implementation
3. **Language Design**: Understanding of programming language constructs
4. **Systems Programming**: Low-level VM code generation and memory management
5. **Software Engineering**: Modular design, testing, documentation

## Known Limitations

- **Control flow ordering**: Some complex nested structures generate code in a suboptimal order (the output is still functionally correct)
- **Error recovery**: Limited error recovery in syntax analysis
- **Optimization**: No code optimization (generates straightforward, unoptimized VM code)

These limitations do not affect correctness and are typical of educational compiler implementations.

## Future Enhancements

Potential improvements:

- Add AST generation for better code optimization
- Implement more sophisticated error recovery
- Add support for Jack language extensions
- Optimize VM code generation patterns

## Conclusion

This yacc-based Jack compiler demonstrates classical compiler construction techniques while remaining fully compatible with the nand2tetris Project 11 requirements. It covers both compiler theory and practical implementation using yacc and lex.

The compiler is ready for educational use and provides a foundation for further compiler development studies.

---

**Built with ❤️ using yacc, lex, and lots of careful engineering**