nand2tetris-zed/DEVELOPMENT.md
Sean O'Connor c231dbfd27 Fix HDL and Hack Assembly syntax highlighting and queries
- Fixed HDL highlights query syntax error with #match? predicate
- Replaced #match? with #any-of? for exact string matching
- Fixed Hack Assembly outline query invalid field name
- Improved HDL syntax highlighting with comprehensive patterns
- Added HDL bracket matching for all syntax types
- Fixed XML scope mismatch from text.xml to source.xml
- Enhanced outline queries for better code navigation
2025-09-11 11:24:24 -04:00

Nand2Tetris Zed Extension - Development Guide

This guide provides detailed information for developers working on or contributing to the Nand2Tetris Zed extension.

Overview

The extension provides language support for the six nand2tetris file formats (HDL, Jack, Hack Assembly, VM, Test Script, and Compare/Output) through Tree-sitter grammars (two external, four custom) and Zed language configurations.

Architecture

Extension Structure

nand2tetris-zed/
├── extension.toml              # Main extension configuration
├── languages/                  # Zed language configurations
│   ├── hdl/                   # Hardware Description Language
│   ├── jack/                  # Jack programming language
│   ├── hack-assembly/         # Hack Assembly language
│   ├── vm/                    # Virtual Machine language
│   ├── test-script/           # Test Script language
│   └── compare-output/        # Compare/Output file format
├── grammars/                   # Tree-sitter grammar sources
│   ├── hdl/                   # External grammar (quantonganh)
│   ├── jack/                  # External grammar (nverno)
│   ├── hack-assembly/         # Custom grammar
│   ├── vm/                    # Custom grammar
│   ├── test-script/           # Custom grammar
│   └── compare-output/        # Custom grammar
└── examples/                   # Test files for validation

Language Support Levels

  1. HDL & Jack: Use external, mature Tree-sitter grammars
  2. Hack Assembly, VM, Test Script, Compare/Output: Use custom grammars built specifically for this extension

Grammar Development

Prerequisites

  • Rust (installed via rustup)
  • Node.js and npm
  • tree-sitter-cli: npm install -g tree-sitter-cli

Grammar Structure

Each custom grammar follows this structure:

grammars/[language]/
├── grammar.js                 # Grammar definition
├── tree-sitter.json          # Tree-sitter configuration
├── package.json               # npm package configuration
├── Cargo.toml                 # Rust package configuration
├── binding.gyp                # Node.js binding configuration
├── src/                       # Generated parser code
│   ├── parser.c              # Generated C parser
│   ├── grammar.json          # Generated grammar metadata
│   └── node-types.json       # Generated node type definitions
├── bindings/                  # Language bindings
│   └── rust/                 # Rust bindings
│       ├── lib.rs            # Library interface
│       └── build.rs          # Build script
└── queries/                  # Tree-sitter queries
    └── highlights.scm        # Syntax highlighting queries
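
The layout above can be checked mechanically before generating or publishing a grammar. The following is a minimal Node.js sketch, not part of the extension: the function name missingGrammarFiles and the idea of a layout check are assumptions made for illustration, and the list of entries is copied from the tree above.

```javascript
const fs = require("fs");
const path = require("path");

// Entries from the layout above, relative to grammars/[language]/.
const EXPECTED_ENTRIES = [
  "grammar.js",
  "tree-sitter.json",
  "package.json",
  "Cargo.toml",
  "binding.gyp",
  "src/parser.c",
  "src/grammar.json",
  "src/node-types.json",
  "bindings/rust/lib.rs",
  "bindings/rust/build.rs",
  "queries/highlights.scm",
];

// Hypothetical helper: returns the expected entries that do not exist under grammarDir.
function missingGrammarFiles(grammarDir) {
  return EXPECTED_ENTRIES.filter(
    (entry) => !fs.existsSync(path.join(grammarDir, ...entry.split("/")))
  );
}
```

Calling missingGrammarFiles("grammars/vm") would return an empty array when every listed file is present. The entries under src/ only exist after tree-sitter generate has run, so a fresh grammar directory is expected to report them as missing.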

Creating a New Grammar

  1. Create directory structure:

    mkdir -p grammars/my-language/{bindings/rust,queries}
    cd grammars/my-language
    
  2. Create grammar.js:

    module.exports = grammar({
      name: 'my_language',
    
      rules: {
        source_file: $ => repeat($._item),
    
        _item: $ => choice(
          $.comment,
          // Add your language constructs here
        ),
    
        comment: $ => token(seq('//', /.*/)),
      },
    
      extras: $ => [
        /\s/,
        $.comment
      ]
    });
    
  3. Create configuration files:

    • tree-sitter.json (see existing examples)
    • package.json (Node.js package)
    • Cargo.toml (Rust package)
    • binding.gyp (Node.js bindings)
  4. Generate parser:

    tree-sitter generate
    
  5. Create Rust bindings:

    • bindings/rust/lib.rs
    • bindings/rust/build.rs
  6. Create highlighting queries:

    • queries/highlights.scm
  7. Test the grammar:

    tree-sitter test
    tree-sitter parse test-file.ext
    

Grammar Rules Best Practices

Rule Naming

  • Use snake_case for rule names
  • Prefix internal rules with _
  • Use semantic names that describe the construct
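
The naming conventions above can be expressed as a small check. This is an illustrative sketch only: classifyRuleName is a hypothetical helper, and the exact snake_case pattern is an assumption about what the convention allows. The underscore prefix matters because Tree-sitter hides rules whose names start with _ from the syntax tree.

```javascript
// Assumed snake_case pattern: lowercase words separated by single underscores.
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

// Hypothetical helper: returns "hidden" for _-prefixed snake_case names,
// "visible" for plain snake_case names, and "invalid" for anything else.
function classifyRuleName(name) {
  if (name.startsWith("_") && SNAKE_CASE.test(name.slice(1))) return "hidden";
  if (SNAKE_CASE.test(name)) return "visible";
  return "invalid";
}

// Rule names from the examples in this guide, plus two that break the convention.
for (const name of ["a_instruction", "_item", "source_file", "AInstruction", "if-goto"]) {
  console.log(`${name}: ${classifyRuleName(name)}`);
}
```

The last two names are reported as invalid: one uses capitals, the other a hyphen.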

Rule Structure

// Good: Clear semantic meaning
instruction: $ => choice(
  $.a_instruction,
  $.c_instruction
),

// Bad: Generic naming
thing: $ => choice(
  $.type1,
  $.type2
),

Comments and Whitespace

Always handle comments and whitespace properly:

extras: $ => [
  /\s/,        // Whitespace
  $.comment    // Comments
],

String Tokens

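Use token() to make a compound rule (seq, choice, repeat) lex as a single token. A bare string literal is already a single token, so wrapping one in token() is optional:

// token() needed: seq() would otherwise produce two tokens
comment: $ => token(seq('//', /.*/)),

// token() optional: a string literal is already one token
if_goto: $ => 'if-goto',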

Language Configuration

Each language needs a config.toml file in languages/[language]/:

name = "Language Name"
grammar = "grammar_name"
scope = "source.language_name"
path_suffixes = ["ext"]
line_comments = ["// "]
block_comments = [["/*", "*/"]]  # Optional
tab_size = 4
hard_tabs = false

Highlighting Queries

Create .scm files that map grammar nodes to semantic tokens:

; Comments
(comment) @comment

; Keywords
"function" @keyword.function
"return" @keyword.control

; Identifiers
(identifier) @variable
(function_name) @function

; Literals
(number) @constant.numeric
(string) @string

Query Development Tips

  1. Check node types: Run tree-sitter parse to print the syntax tree and see the actual node names; add --debug to also see the parser's log
  2. Test queries: Use tree-sitter highlight to test highlighting
  3. Use semantic tokens: Follow TextMate/LSP token conventions
  4. Prioritize specificity: More specific queries override general ones

Testing

Grammar Testing

cd grammars/[language]
tree-sitter test                    # Run test suite
tree-sitter parse example.ext       # Parse specific file
tree-sitter highlight example.ext   # Test highlighting
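
When checking many example files, it can help to scan the output of tree-sitter parse for error nodes instead of reading each tree by hand. The sketch below assumes the S-expression output format, in which failed regions appear as (ERROR ...) or (MISSING ...) nodes. findParseProblems is a hypothetical helper, and the node names in the sample are illustrative rather than taken from the actual Hack Assembly grammar.

```javascript
// Hypothetical helper: scans text printed by `tree-sitter parse`
// and reports each output line that contains an ERROR or MISSING node.
function findParseProblems(parseOutput) {
  const problems = [];
  parseOutput.split("\n").forEach((line, index) => {
    const match = line.match(/\((ERROR|MISSING)\b/);
    if (match) {
      problems.push({ outputLine: index + 1, kind: match[1], text: line.trim() });
    }
  });
  return problems;
}

// Illustrative parse output; real node names depend on the grammar.
const sampleParseOutput = [
  "(source_file [0, 0] - [2, 0]",
  "  (a_instruction [0, 0] - [0, 3])",
  "  (ERROR [1, 0] - [1, 4]))",
].join("\n");

console.log(findParseProblems(sampleParseOutput));
```

An empty result means the parser recovered nothing and the file parsed cleanly under the current grammar.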

Integration Testing

  1. Install the extension as a dev extension in Zed (zed: install dev extension in the command palette)
  2. Open test files and verify:
    • Syntax highlighting works
    • Bracket matching functions
    • Code outline appears
    • Indentation behaves correctly

Example Files

Create comprehensive example files in examples/ that cover:

  • All language constructs
  • Edge cases
  • Common patterns
  • Error conditions

Common Issues and Solutions

Grammar Generation Errors

Empty string rules:

Error: The rule contains an empty string

Solution: Remove alternatives that can match the empty string, and wrap the rule in optional() where it is referenced instead

Invalid node types in queries:

Query error: Invalid node type 'foo'

Solution: Check src/node-types.json for actual node names
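
The check against src/node-types.json can be automated. The sketch below relies on the fact that node-types.json is a JSON array whose entries carry type and named fields. The helper names are hypothetical, and the query scanning is a rough regular-expression pass rather than a real query parser, so it only looks at (name ...) patterns.

```javascript
// Collects the named node types from a parsed node-types.json array.
function namedNodeTypes(nodeTypes) {
  return new Set(nodeTypes.filter((n) => n.named).map((n) => n.type));
}

// Hypothetical helper: returns node names used as (name ...) in a query
// that the grammar does not define.
function unknownQueryNodes(querySource, nodeTypes) {
  const known = namedNodeTypes(nodeTypes);
  const stripped = querySource
    .replace(/"(?:[^"\\\n]|\\.)*"/g, '""') // drop string literals such as "("
    .replace(/;.*$/gm, ""); // drop ; comments
  const unknown = new Set();
  for (const match of stripped.matchAll(/\(\s*([A-Za-z_][A-Za-z0-9_]*)/g)) {
    const name = match[1];
    // "_" is the wildcard; ERROR and MISSING are built-in node names.
    if (name === "_" || name === "ERROR" || name === "MISSING") continue;
    if (!known.has(name)) unknown.add(name);
  }
  return [...unknown];
}

// Illustrative data: a tiny node-types.json and a query with one bad node name.
const sampleNodeTypes = [
  { type: "comment", named: true },
  { type: "identifier", named: true },
  { type: "function", named: false },
];
const sampleQuery = `
; Comments
(comment) @comment
"function" @keyword.function
(identifier) @variable
(function_name) @function
`;

console.log(unknownQueryNodes(sampleQuery, sampleNodeTypes)); // [ 'function_name' ]
```

In practice the second argument would come from JSON.parse(fs.readFileSync("src/node-types.json", "utf8")) and the first from the contents of queries/highlights.scm.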

Highlighting Issues

Tokens not highlighting:

  1. Verify node exists in node-types.json
  2. Check query syntax
  3. Ensure grammar generates expected nodes

Conflicting highlights:

  • More specific queries take precedence
  • Use predicates for conditional highlighting: #match? for regular expressions, #eq? or #any-of? for exact string matching

Zed Integration Issues

Extension not loading:

  1. Check extension.toml syntax
  2. Verify all referenced grammars exist
  3. Check Zed logs for specific errors

Grammar compilation fails:

  1. Ensure Rust is installed via rustup
  2. Check that all required files are present
  3. Verify tree-sitter grammar generates successfully

Contributing Guidelines

Code Style

  • Follow existing patterns in grammar definitions
  • Use clear, semantic naming
  • Add comprehensive comments
  • Test thoroughly before submitting

Documentation

  • Update README.md for user-facing changes
  • Update this guide for development changes
  • Include example files for new languages
  • Document any breaking changes

Pull Request Process

  1. Create feature branch
  2. Implement changes with tests
  3. Update documentation
  4. Test in Zed environment
  5. Submit PR with clear description

Commit Messages

Use conventional commits:

feat: add support for new language construct
fix: resolve highlighting issue with comments
docs: update installation instructions
test: add comprehensive test cases
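
A commit subject in this style can be checked with a short pattern. This is a sketch, not a tool the repository ships: isConventionalSubject is a hypothetical helper, and the list of accepted types is an assumption that combines the four types shown above with other types commonly used with Conventional Commits.

```javascript
// feat, fix, docs, test appear in this guide; the rest are assumed additions.
const COMMIT_TYPES = ["feat", "fix", "docs", "test", "refactor", "perf", "chore", "build", "ci", "style"];

// Matches "type: description", "type(scope): description",
// and an optional "!" breaking-change marker before the colon.
const SUBJECT_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?!?: \\S.*$`
);

// Hypothetical helper: true when the subject line follows the pattern above.
function isConventionalSubject(subject) {
  return SUBJECT_PATTERN.test(subject);
}

for (const subject of [
  "feat: add support for new language construct",
  "fix(hdl): resolve highlighting issue with comments",
  "Fixed HDL highlighting",
]) {
  console.log(`${isConventionalSubject(subject) ? "ok  " : "bad "} ${subject}`);
}
```

The third subject is rejected because it has no type prefix.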

Release Process

  1. Version Bump: Update extension.toml version
  2. Test: Verify all grammars compile and work in Zed
  3. Documentation: Update README.md and CHANGELOG.md
  4. Tag: Create git tag for version
  5. Publish: Submit to Zed extension registry (when available)
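
Steps 1 and 4 can drift apart if the tag is created before the version bump. The sketch below shows one way to compare them. It assumes that release tags are the extension.toml version with an optional leading v, which this guide does not specify. Both helper names and the sample values are hypothetical.

```javascript
// Extracts the first `version = "..."` line from extension.toml text, or null if absent.
function extensionVersion(tomlText) {
  const match = tomlText.match(/^version\s*=\s*"([^"]+)"/m);
  return match ? match[1] : null;
}

// Assumption: tags look like "v0.2.0" or "0.2.0".
function tagMatchesVersion(tag, tomlText) {
  const version = extensionVersion(tomlText);
  return version !== null && tag.replace(/^v/, "") === version;
}

// Illustrative extension.toml contents; the values are made up.
const sampleExtensionToml = [
  'id = "nand2tetris"',
  'name = "Nand2Tetris"',
  'version = "0.2.0"',
].join("\n");

console.log(tagMatchesVersion("v0.2.0", sampleExtensionToml));
console.log(tagMatchesVersion("v0.3.0", sampleExtensionToml));
```

The first call reports a match and the second does not, which is the case a release checklist would want to catch before publishing.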

Performance Considerations

Grammar Optimization

  • Avoid excessive backtracking in rules
  • Use token() for multi-character sequences
  • Minimize conflicts between rules
  • Profile with tree-sitter parse --time

Query Optimization

  • Use specific node patterns over generic ones
  • Avoid complex predicates when possible
  • Test query performance on large files

Debugging

Grammar Debugging

tree-sitter generate --debug
tree-sitter parse --debug-graph file.ext
tree-sitter test --debug

Query Debugging

tree-sitter query queries/highlights.scm file.ext
tree-sitter highlight file.ext

Zed Debugging

  • Run zed: open log from the command palette to check for extension errors
  • Launch Zed from a terminal with zed --foreground to see verbose logging
  • Test with minimal example files

Resources

Support

For development questions or issues:

  1. Check existing GitHub issues
  2. Review this development guide
  3. Test with minimal reproducible examples
  4. Provide detailed error messages and logs

This extension represents a complete implementation of nand2tetris language support, demonstrating how to create comprehensive Tree-sitter grammars and integrate them into modern editors.