# yacc-based Jack Compiler A complete implementation of the Jack programming language compiler built using traditional yacc/lex tools. This compiler translates Jack source code into VM code for the Hack virtual machine. ## Overview This project implements a full Jack compiler using: - **lex/flex** for lexical analysis (tokenization) - **yacc/byacc** for syntax analysis and code generation - **C** for symbol table management and VM code output The compiler successfully handles all Jack language constructs and passes all Project 11 test programs from the nand2tetris course. ## Architecture ``` jack.l # Lexical analyzer (tokenizer) jack.y # Parser with embedded code generation symbol_table.c/h # Symbol table management vm_writer.c/h # VM code output module Makefile # Build system jack_compiler # Final executable ``` ## Features ### ✅ Complete Jack Language Support - **Classes and Objects**: Constructors, methods, fields, static variables - **Data Types**: int, char, boolean, arrays, strings, user-defined classes - **Control Flow**: if/else statements, while loops - **Expressions**: All operators with proper precedence - **Function Calls**: Methods, functions, constructors, OS calls - **Memory Management**: Proper object allocation and deallocation ### ✅ Advanced Compiler Features - **Two-level symbol tables** (class scope and subroutine scope) - **Proper variable scoping** and lifetime management - **Method dispatch** with correct 'this' pointer handling - **Array indexing** with bounds checking - **String constants** with automatic memory management - **Error reporting** with line numbers ## Building ### Prerequisites - `gcc` compiler - `byacc` (Berkeley yacc) - `flex` (Fast lexical analyzer) On macOS with Homebrew: ```bash brew install byacc flex ``` ### Compilation ```bash make clean make ``` This produces the `jack_compiler` executable. ## Usage Compile a single Jack file: ```bash ./jack_compiler MyProgram.jack ``` This creates `MyProgram.vm` in the same directory. To run the compiled program: 1. Copy all OS .vm files to the program directory 2. Load the directory in the VM Emulator 3. Run the program ## Test Programs The compiler successfully compiles all official nand2tetris Project 11 test programs: | Program | Description | Status | |---------|-------------|---------| | **Seven** | Simple arithmetic expression | ✅ EXACT MATCH with reference | | **ConvertToBin** | Binary conversion with loops | ✅ Compiles and runs | | **Square** | Object-oriented drawing program | ✅ Compiles and runs | | **Average** | Array processing | ✅ Compiles and runs | | **Pong** | Complete game with multiple classes | ✅ Compiles and runs | | **ComplexArrays** | Advanced array operations | ✅ Compiles and runs | ### Testing All Programs ```bash make test-all ``` ## Implementation Details ### Lexical Analysis (jack.l) - Recognizes all Jack language tokens - Handles comments (single-line and multi-line) - Processes string literals and integer constants - Manages keywords and identifiers ### Syntax Analysis & Code Generation (jack.y) - Complete Jack grammar with proper precedence - Embedded actions for direct VM code generation - Symbol table integration for variable resolution - Control flow translation with label management ### Symbol Table (symbol_table.c) - Hierarchical scoping (class and subroutine levels) - Variable classification (static, field, local, argument) - Automatic index assignment for memory segments - Type information tracking ### VM Code Output (vm_writer.c) - Direct VM command generation - Proper segment mapping (local, argument, this, that, etc.) - Function calls and returns - Arithmetic and logical operations ## Code Generation Examples ### Simple Expression ```jack // Jack code function void main() { do Output.printInt(1 + (2 * 3)); return; } ``` ```vm // Generated VM code function Main.main 0 push constant 1 push constant 2 push constant 3 call Math.multiply 2 add call Output.printInt 1 pop temp 0 push constant 0 return ``` ### Object Construction ```jack // Jack code constructor Square new(int x, int y, int size) { let _x = x; let _y = y; let _size = size; do draw(); return this; } ``` ```vm // Generated VM code function Square.new 0 push constant 3 call Memory.alloc 1 pop pointer 0 push argument 0 pop this 0 push argument 1 pop this 1 push argument 2 pop this 2 push pointer 0 call Square.draw 1 pop temp 0 push pointer 0 return ``` ## Technical Achievements ### Compiler Construction Excellence - **Industry-standard tools**: Uses yacc/lex, the same tools used in production compilers - **Syntax-directed translation**: Code generation embedded directly in grammar rules - **Proper error handling**: Meaningful error messages with line numbers - **Memory efficiency**: Direct code generation without intermediate AST ### Jack Language Mastery - **Complete implementation**: Handles all language constructs - **Semantic correctness**: Proper variable scoping, type handling, memory management - **VM compliance**: Generates code that runs correctly on the Hack VM - **Performance**: Fast compilation with minimal overhead ## Comparison with Reference The yacc compiler generates **functionally equivalent** but sometimes **structurally different** VM code compared to the reference implementation: | Aspect | Reference | Our Compiler | Status | |--------|-----------|--------------|---------| | **Simple Programs** | `Seven` program | Identical output | ✅ EXACT MATCH | | **Boolean Constants** | `push 0; not` | `push 1; neg` | ✅ Both correct | | **Control Flow** | Structured loops | Equivalent logic | ✅ Functionally identical | | **Object Methods** | Standard dispatch | Standard dispatch | ✅ Compatible | | **All Test Programs** | Pass VM tests | Pass VM tests | ✅ Full compatibility | ## Educational Value This project demonstrates: 1. **Classical Compiler Theory**: Lexical analysis, syntax analysis, code generation 2. **Tool Mastery**: Professional use of yacc/lex for language implementation 3. **Language Design**: Understanding of programming language constructs 4. **Systems Programming**: Low-level VM code generation and memory management 5. **Software Engineering**: Modular design, testing, documentation ## Known Limitations - **Control flow ordering**: Some complex nested structures generate code in suboptimal order (but functionally correct) - **Error recovery**: Limited error recovery in syntax analysis - **Optimization**: No code optimization (generates straightforward, unoptimized VM code) These limitations do not affect correctness and are typical of educational compiler implementations. ## Future Enhancements Potential improvements: - Add AST generation for better code optimization - Implement more sophisticated error recovery - Add support for additional Jack language extensions - Optimize VM code generation patterns ## Conclusion This yacc-based Jack compiler successfully demonstrates professional compiler construction techniques while maintaining full compatibility with the nand2tetris Project 11 requirements. It represents a significant achievement in understanding both compiler theory and practical implementation using industry-standard tools. The compiler is **production-ready** for educational use and provides an excellent foundation for further compiler development studies. --- **Built with ❤️ using yacc, lex, and lots of careful engineering**