
Designing Simplicity & Taste Tools for LLMs

AST-based style analysis for nudging code generation toward boring code

The Problem

LLMs generate functionally correct code that often violates the subtle style patterns that make code maintainable. They overuse abstractions, introduce premature complexity, and miss the "boring code" wisdom that experienced developers apply instinctively.

Current approaches:

  - Few-shot prompting: hand-curate 2-5 example snippets for every prompt
  - Custom instructions: style rules written and maintained by hand
  - Linters (Pylint/Ruff): enforce generic standards, not project-specific taste
  - RAG: retrieves similar code but never states the style explicitly

What's missing: a tool that extracts structural style patterns from a codebase and converts them into natural language fragments that nudge LLMs toward the project's actual coding philosophy.

The Solution: Style Fragment Extraction

Scan a codebase's AST, classify structural patterns, and output natural language "style fragments" for LLM system prompts.

Example Output

style_fragments:
  - "Guard clauses preferred over nested ifs (87% early-return pattern)"
  - "Error handling: check-and-return, minimal nesting (Go if-err style)"
  - "Switch/match over if-else chains (3:1 ratio)"
  - "Functions average 25 LOC, max 80 — short and focused"
  - "Single return value + error, no multi-return beyond (val, err)"
  - "Flat control flow: avg nesting depth 1.2, max 3"

These fragments go into an LLM system prompt to nudge generated code toward the codebase's actual style.
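
The injection step itself is just string assembly. A minimal sketch, assuming a simple prompt template (build_system_prompt and its wording are illustrative, not a fixed API):

```python
# Illustrative sketch: prepend extracted style fragments to an LLM system prompt.
# The function name and template wording are assumptions, not the tool's API.

def build_system_prompt(base_prompt: str, fragments: list[str]) -> str:
    if not fragments:
        return base_prompt
    style_block = "\n".join(f"- {frag}" for frag in fragments)
    return f"{base_prompt}\n\nFollow this codebase's conventions:\n{style_block}"

prompt = build_system_prompt(
    "You are a coding assistant.",
    [
        "Guard clauses preferred over nested ifs (87% early-return pattern)",
        "Flat control flow: avg nesting depth 1.2, max 3",
    ],
)
```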

What Makes This Novel

No existing tool does this. Current AST-based tools:

  - ast-grep: source-like pattern rules for linting and rewriting
  - semgrep/opengrep: security and correctness rules
  - go/ast and other native parsers: single-language analysis
  - Tree-sitter itself: parsing infrastructure with no style layer on top

None of them covers the full pipeline: scan codebase → extract structural patterns → output natural language fragments for prompting.

Technical Architecture

                    ┌─────────────┐
                    │  Tool CLI   │
                    └──────┬──────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
        ┌─────▼──────┐ ┌───▼────┐ ┌─────▼─────┐
        │ regex      │ │ tree-  │ │ go/ast    │
        │ counter    │ │ sitter │ │ (Go only) │
        │ (baseline) │ │ (all)  │ │           │
        └─────┬──────┘ └───┬────┘ └─────┬─────┘
              │            │            │
              └────────────┼────────────┘
                           │
                    ┌──────▼──────┐
                    │  metrics +  │
                    │  style      │
                    │  fragments  │
                    └─────────────┘
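
The regex-counter box is the zero-dependency baseline: crude line-level pattern counts used to sanity-check the AST layers. A minimal sketch (the Go-flavored patterns are illustrative):

```python
import re

# Baseline layer sketch: count guard-style early returns with plain regex.
# Deliberately crude (single-line bodies only, no comment awareness), which is
# exactly why the tree-sitter layer sits next to it. Patterns are illustrative.
GUARD_RE = re.compile(r"if\s+[^{]*\{\s*return\b[^}]*\}")

def count_guards(source: str) -> int:
    return len(GUARD_RE.findall(source))

go_src = (
    "if err != nil { return err }\n"
    "if ok { x = 1 }\n"
)
```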

Layer 1: Tree-sitter (multi-language AST)

Tree-sitter is an incremental parsing library, originally built at GitHub, with 80+ language grammars. It produces concrete syntax trees that preserve every token.

Go bindings:

  - smacker/go-tree-sitter: 34 bundled grammars, widely used
  - tree-sitter/go-tree-sitter: the official bindings

Speed: ~91 SLOC/ms. A 6,000-line file parses in ~80ms. At 16 workers, 10k files × 200 LOC ≈ 1-2s.
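
The cluster estimate is self-consistent; a quick back-of-envelope check using the numbers above:

```python
# Sanity-check the throughput claim: 10k files × 200 LOC at ~91 SLOC/ms.
SLOC_PER_MS = 91
total_sloc = 10_000 * 200                        # 2,000,000 SLOC
sequential_s = total_sloc / SLOC_PER_MS / 1000   # ≈ 22 s on one core
parallel_s = sequential_s / 16                   # ≈ 1.4 s at 16 workers
```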

Query language: S-expression patterns matching AST node types and fields.

;; guard clause detection
(if_statement
  consequence: (block
    (return_statement) @guard_return))

;; nested if detection
(if_statement
  consequence: (block
    (if_statement) @nested_if))

;; early return
(function_declaration
  body: (block
    (return_statement) @early_return
    (_)))

Layer 2: Pattern Classification

For each language, write Tree-sitter queries to detect structural patterns:

| Signal | Query Pattern |
| --- | --- |
| Guard clause | if → single return/continue/break, no else |
| Nested if | if inside if consequence |
| Early return | return not at end of function body |
| Switch vs if-else | Ratio of switch/match to if-else chains |
| Error check-and-return | if err != nil { return } (Go), ? (Rust), catch (TS) |
| Function length | Node span of function bodies |
| Parameter count | Function parameter list length |
| Callback nesting | Function literals inside function calls inside function calls |
Layer 3: Style Fragment Generation

After classifying patterns across the codebase, generate natural language fragments:

def generate_fragments(pattern_stats):
    fragments = []

    # Guard clause preference
    if pattern_stats['guard_clause_ratio'] > 0.7:
        fragments.append(
            f"Guard clauses preferred over nested ifs "
            f"({pattern_stats['guard_clause_ratio']:.0%} early-return pattern)"
        )

    # Error handling style
    if pattern_stats['check_and_return_ratio'] > 0.8:
        fragments.append(
            "Error handling: check-and-return, minimal nesting (Go if-err style)"
        )

    # Function length
    avg_loc = pattern_stats['avg_function_loc']
    max_loc = pattern_stats['max_function_loc']
    fragments.append(
        f"Functions average {avg_loc} LOC, max {max_loc} — short and focused"
    )

    # Control flow complexity
    avg_nesting = pattern_stats['avg_nesting_depth']
    max_nesting = pattern_stats['max_nesting_depth']
    fragments.append(
        f"Flat control flow: avg nesting depth {avg_nesting:.1f}, max {max_nesting}"
    )

    return fragments

Boring Code Wisdom Integration

The tool should detect and surface boring code patterns:

  - Guard clauses and early returns instead of deep nesting
  - Short, single-purpose functions with few parameters
  - Plain check-and-return error handling over clever control flow
  - Flat structure: switch/match over long if-else chains

Implementation Roadmap

Phase 1: MVP (single language)

  1. Integrate smacker/go-tree-sitter with Go grammar
  2. Implement 5-6 core pattern queries (guard clause, nesting, function length)
  3. Generate basic style fragments
  4. CLI: tool analyze --lang=go ./src → outputs fragments as JSON
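
The CLI contract from step 4 might look like the following; the flag names and JSON shape are assumptions that mirror the command above, with the analysis pass stubbed out:

```python
import argparse
import json

# Sketch of the Phase-1 CLI surface: `tool analyze --lang=go ./src` → JSON.
# analyze() is a stub standing in for the tree-sitter pass; field names and
# flags are design assumptions, not a shipped interface.

def analyze(path: str, lang: str) -> dict:
    fragments = [
        "Guard clauses preferred over nested ifs (87% early-return pattern)",
    ]
    return {"lang": lang, "path": path, "style_fragments": fragments}

def run(argv: list[str]) -> str:
    parser = argparse.ArgumentParser(prog="tool")
    sub = parser.add_subparsers(dest="command", required=True)
    analyze_cmd = sub.add_parser("analyze")
    analyze_cmd.add_argument("--lang", required=True)
    analyze_cmd.add_argument("path")
    args = parser.parse_args(argv)
    return json.dumps(analyze(args.path, args.lang), indent=2)
```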

Phase 2: Multi-language

  1. Add Rust, Python, TypeScript grammars
  2. Language-specific pattern queries (Rust ?, Python context managers)
  3. Unified fragment format across languages

Phase 3: LLM Integration

  1. Auto-inject fragments into system prompts (VSCode extension, GitHub Copilot plugin)
  2. Feedback loop: track generated code → measure adherence to fragments → refine queries
  3. Community library of pattern queries per domain (web backends, CLIs, data pipelines)

Alternatives Considered

ast-grep

Rust CLI built on tree-sitter with 13.4k stars. Source-like pattern syntax + YAML rule combinators. More readable than S-expressions but subprocess-only (no Go library). Good secondary tool for complex rules.

rule:
  pattern: if $COND { return $VAL }
  not:
    has:
      kind: else_clause

semgrep / opengrep

Semgrep went proprietary in late 2024; OpenGrep is the community fork (LGPL-2.1, 2.4k stars, 40+ languages). OCaml-based, no Go bindings, and heavier (~12s for 500k LOC of Python). Overkill for style extraction — best suited to security rules.

Per-language native parsers

For a multi-language tool, per-language native parsers aren't practical from Go. go/ast is useful for type-aware Go-specific analysis alongside tree-sitter, but tree-sitter covers everything else.

Existing Tools Landscape

Codebase Context Extraction Tools

These tools pack codebases into AI-friendly formats but don't extract style patterns:

| Tool | Approach | Limitation |
| --- | --- | --- |
| Repomix | Packs codebase into XML/markdown/JSON | Raw context, no style analysis |
| CTX | Organizes codebase into structured docs | Metadata extraction, not patterns |
| Code2Prompt | Converts codebase to single prompt | File concatenation, no analysis |
| Codebase-Digest | AI-friendly packer with 60+ prompts | Generic templates, not codebase-specific |

LLM Code Quality Evaluation Tools

| Tool | What It Measures | Gap |
| --- | --- | --- |
| SimCopilot | Scope sensitivity, contextual dependencies | Evaluates existing code, doesn't guide generation |
| Copilot Arena | User preferences via IDE (4.5M+ suggestions) | Post-generation feedback, not proactive guidance |
| CodeRAG-Bench | Whether retrieval improves generation | Measures RAG impact, not style adherence |
| Pylint/Ruff | PEP 8 compliance, naming conventions | Generic standards, not project-specific taste |

Style Guidance Research

Few-shot prompting is the current state of the art for style control: 2-5 representative examples in the prompt reliably steer output toward the target style, but they must be curated by hand for every codebase.

RAG for code — Retrieval-Augmented Generation retrieves relevant snippets from the codebase to inform generation (survey). Improves factual accuracy but doesn't extract or communicate style patterns explicitly.

Neural steering — Advanced research on identifying style-specific neurons and deactivating unwanted style patterns (paper). Requires fine-tuning access, not practical for most users.

The Gap This Tool Fills

What Exists vs What's Missing

Existing tools:

  - Context packers (Repomix, Code2Prompt): ship raw code to the model
  - Evaluation harnesses (SimCopilot, Copilot Arena): judge code after generation
  - Linters (Pylint/Ruff): enforce generic standards

Missing: Automated extraction of structural style patterns → natural language fragments → proactive LLM guidance

No tool automatically answers: "How does THIS codebase handle errors? Guard clauses or nested ifs? Pure functions or stateful objects?"

Research-Backed Best Practices

Effective LLM Style Guidance (Current State)

  1. Custom instructions — Define conventions once, apply to all interactions (GitHub Copilot, Claude)
  2. Few-shot examples — 2-5 representative implementations showing target style
  3. Explicit directives — "Follow PEP 8" + persona ("senior Python developer who writes idiomatic code")
  4. Codebase context — Provide surrounding code, relevant imports, function dependencies
  5. RAG retrieval — Fetch similar code from the repo to ground generation

This tool automates steps 2-4 by extracting patterns and generating both examples and directives from the actual codebase.
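
Step 2 in particular could be automated by selecting the most "typical" functions as few-shot examples, e.g. those closest to the codebase's median length. A hedged sketch with illustrative data (a real selector would combine several extracted metrics):

```python
import statistics

# Sketch: pick few-shot example functions that best represent the codebase,
# scored here by distance from the median function length. The function names
# and LOC figures below are made up for illustration.

def pick_examples(function_loc: dict[str, int], k: int = 2) -> list[str]:
    median = statistics.median(function_loc.values())
    by_typicality = sorted(function_loc, key=lambda n: abs(function_loc[n] - median))
    return by_typicality[:k]

funcs = {"parse_header": 24, "render_row": 26, "main": 120, "init_flags": 3}
```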

Measuring Adherence

Research shows effective metrics combine:

  - Structural checks: re-run the extraction queries on generated code
  - Complexity metrics: cognitive complexity and nesting depth
  - Human preference: Copilot Arena-style pairwise judgments
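
One concrete adherence score, in the spirit of the Phase 3 feedback loop: re-run the same extraction on generated code and compare pattern ratios against the codebase baseline. A sketch (the averaging formula is an assumption; the stat keys reuse the shape consumed by generate_fragments above):

```python
# Sketch: score generated code by how closely its pattern ratios track the
# codebase baseline (1.0 = identical ratios). The formula is an assumption.

def adherence(baseline: dict[str, float], generated: dict[str, float]) -> float:
    shared = baseline.keys() & generated.keys()
    if not shared:
        return 0.0
    mean_deviation = sum(abs(baseline[k] - generated[k]) for k in shared) / len(shared)
    return 1.0 - mean_deviation

base = {"guard_clause_ratio": 0.87, "check_and_return_ratio": 0.90}
gen = {"guard_clause_ratio": 0.80, "check_and_return_ratio": 0.90}
```
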

Open Questions

  - How many fragments fit in a system prompt before they dilute each other?
  - Do fragments transfer across models, or do they need per-model tuning?
  - What should the tool emit for codebases with inconsistent or conflicting styles?

Comprehensive References

AST Analysis Tools

  - smacker/go-tree-sitter (548 stars, 34 bundled grammars)
  - tree-sitter/go-tree-sitter (official, 228 stars)
  - Tree-sitter query syntax (official documentation)
  - ast-grep (13.4k stars, Rust CLI)
  - ast-grep YAML rules (documentation)
  - OpenGrep (2.4k stars, community fork)
  - go/ast stdlib (Go standard library)

LLM Code Quality & Evaluation

  - CodeRabbit: AST-grep + LLM (production example)
  - SimCopilot (LLM code completion eval)
  - Copilot Arena (code LLM evaluation platform)
  - CodeRAG-Bench (retrieval augmentation eval)
  - Cognitive Complexity (SonarSource paper)

Style Guidance & Prompting Research

  - Few-Shot LLM Code Synthesis (research on example selection)
  - Show and Tell (style control strategies)
  - Style-Specific Neurons (neural steering for LLMs)
  - Prompting LLMs for Code (guidelines paper)
  - RAG for Code Survey (retrieval-augmented generation)
  - Repository-Level Prompts (context generation for LLMs)
  - Context Engineering (Anthropic guide)

Codebase Context Tools

  - Repomix (AI-friendly codebase packer)
  - CTX (Context Hub Generator)
  - Code2Prompt (codebase-to-prompt CLI)
  - Codebase-Digest (60+ coding prompts)
  - Context Generator (MCP, Context as Code)
  - Claude Code Prompts (community prompt library)

Practical Guides

  - GitHub Copilot Prompting (official documentation)
  - JetBrains: AI Agent Guidelines (coding guidelines for AI)
  - Cody Codebase Understanding (Sourcegraph blog)
