Class QuickFixSuggestionExtractor

java.lang.Object
org.ek9lang.compiler.common.QuickFixSuggestionExtractor

public final class QuickFixSuggestionExtractor extends Object
Single source of truth for "Did you mean?" / quick-fix suggestions across the EK9 compiler, IDE, and AI tooling. Returns a filtered, ranked Suggestion list combining two strategies:
  1. Common-literal / keyword pool: Levenshtein-near matches against a small fixed pool of EK9 keywords, literals, and common type names. Catches user typos (e.g. flase → false, treu → true) that the symbol table can't help with because keywords aren't symbols.
  2. Fuzzy symbol matches: symbol-table search results, FILTERED by a Levenshtein-cost threshold AND a length-diff threshold relative to the offending token. Without these filters the search returns whatever the closest k symbols were even when irrelevant (e.g. flase → Class1/Class2).
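
The two-threshold filter in strategy 2 can be sketched roughly as follows. This is an illustration only, not the EK9 implementation: the class name, the threshold values, and the use of a textbook unit-cost Levenshtein distance are all assumptions (the EK9 cost metric is described elsewhere on this page as behaving differently).

```java
import java.util.ArrayList;
import java.util.List;

final class FuzzyFilterSketch {
  // Hypothetical thresholds; the real values are not stated in this documentation.
  static final int MAX_COST = 2;
  static final int MAX_LENGTH_DIFF = 2;

  // Textbook Levenshtein distance with unit cost for insert, delete and substitute.
  static int levenshtein(String a, String b) {
    int[] prev = new int[b.length() + 1];
    int[] curr = new int[b.length() + 1];
    for (int j = 0; j <= b.length(); j++) {
      prev[j] = j;
    }
    for (int i = 1; i <= a.length(); i++) {
      curr[0] = i;
      for (int j = 1; j <= b.length(); j++) {
        int substitute = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
        curr[j] = Math.min(substitute, Math.min(prev[j] + 1, curr[j - 1] + 1));
      }
      int[] tmp = prev;
      prev = curr;
      curr = tmp;
    }
    return prev[b.length()];
  }

  // Keep only candidates that pass BOTH the length-diff and the cost threshold.
  static List<String> filter(String offending, List<String> candidates) {
    List<String> kept = new ArrayList<>();
    for (String candidate : candidates) {
      boolean closeLength =
          Math.abs(candidate.length() - offending.length()) <= MAX_LENGTH_DIFF;
      if (closeLength && levenshtein(offending, candidate) <= MAX_COST) {
        kept.add(candidate);
      }
    }
    return kept;
  }
}
```

With these assumed thresholds, filtering the candidates false, Class1 and Class2 against the offending token flase keeps only false.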

Parse-message expected tokens (e.g. ANTLR's "expecting 'defines', '@'") are deliberately NOT included. They're already rendered in the inline typeOfError text the parser produces, so adding a separate "did you mean: '@'?" line is redundant noise. They are kept out of all output channels: CLI classic, CLI visual, IDE Problems tooltip, IDE Quick Fix popup, and MCP ek9_query_diagnostics.

Results are ranked by:

  1. Character-bag overlap descending — the strongest typo signal (chars in common). For flase, false shares all 5 unique chars (perfect anagram) and ranks first.
  2. Same-length first (length-diff ascending) — typos rarely change length
  3. Cost ascending — Levenshtein closeness as fine-grained tiebreak
  4. Source priority: parse-token > keyword > symbol
  5. Alphabetical (deterministic)
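
The five ranking keys above map naturally onto a chained Comparator. The sketch below is a hedged illustration: Candidate and its fields, the Source enum, and the rank method are hypothetical stand-ins, not the actual RankedSuggestion type.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RankingSketch {
  // Declaration order encodes source priority: earlier constants sort first.
  enum Source { PARSE_TOKEN, KEYWORD, SYMBOL }

  // Hypothetical carrier for the values the ranking needs.
  static final class Candidate {
    final String name;
    final int bagOverlap;
    final int lengthDiff;
    final int cost;
    final Source source;

    Candidate(String name, int bagOverlap, int lengthDiff, int cost, Source source) {
      this.name = name;
      this.bagOverlap = bagOverlap;
      this.lengthDiff = lengthDiff;
      this.cost = cost;
      this.source = source;
    }
  }

  static final Comparator<Candidate> ORDER =
      Comparator.comparingInt((Candidate c) -> c.bagOverlap).reversed() // 1. overlap descending
          .thenComparingInt(c -> c.lengthDiff)                          // 2. length-diff ascending
          .thenComparingInt(c -> c.cost)                                // 3. cost ascending
          .thenComparing(c -> c.source)                                 // 4. source priority
          .thenComparing(c -> c.name);                                  // 5. alphabetical

  // Returns candidate names in ranked order without mutating the input list.
  static List<String> rank(List<Candidate> candidates) {
    List<Candidate> sorted = new ArrayList<>(candidates);
    sorted.sort(ORDER);
    List<String> names = new ArrayList<>();
    for (Candidate c : sorted) {
      names.add(c.name);
    }
    return names;
  }
}
```

Given a candidate false with overlap 5, length-diff 0 and cost 4, and a candidate case with overlap 3, length-diff 1 and cost 3, this ordering places false first even though case has the lower cost.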

Ranking on bag overlap first, rather than on cost first, is the right choice because the existing EK9 Levenshtein cost metric has a boundary anomaly (insertion cost 1 along the edge but 2 in the body) that under-charges length-mismatch candidates. For flase, raw cost gives case=3 (gaming the boundary) but false=4, which picks the wrong winner. Bag overlap captures the "shares the same letters" signal that actually matters for typo correction.
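
One plausible way to compute a character-bag overlap is a multiset intersection of the two strings' characters. The sketch below assumes that definition; the class and method names are hypothetical, and the real metric may count only unique characters instead.

```java
import java.util.HashMap;
import java.util.Map;

final class BagOverlapSketch {
  // Counts characters shared by both strings, respecting multiplicity.
  static int bagOverlap(String a, String b) {
    Map<Character, Integer> counts = new HashMap<>();
    for (char ch : a.toCharArray()) {
      counts.merge(ch, 1, Integer::sum);
    }
    int shared = 0;
    for (char ch : b.toCharArray()) {
      Integer remaining = counts.get(ch);
      if (remaining != null && remaining > 0) {
        counts.put(ch, remaining - 1); // consume one occurrence from the bag
        shared++;
      }
    }
    return shared;
  }
}
```

Under this definition, bagOverlap("flase", "false") is 5 and bagOverlap("flase", "case") is 3, so false outranks case regardless of how the cost metric scores them.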

  • Method Details

    • extract

      public static List<Suggestion> extract(ErrorListener.ErrorDetails error)
      Build the ranked, filtered suggestion list for an error. Capped at 5 entries.

      Keyword-pool and fuzzy-symbol matches only run when the compiler signalled "this error might be a typo" by attaching a non-null fuzzySearchResults object. Errors like E07370 Unreachable Statement carry a valid likelyOffendingSymbol (the unreachable label) but no fuzzy context — they shouldn't get "Did you mean?" noise. Parse-message tokens always run since they come from the parser's own grammar knowledge.

    • extractNames

      public static List<String> extractNames(ErrorListener.ErrorDetails error)
      Convenience: list of bare replacement names. Used by IDE quick-fix popups and by AI clients requesting unstructured fix data.
    • formatForErrorMessage

      public static String formatForErrorMessage(ErrorListener.ErrorDetails error)
      Format a suggestion list as the inline "Did you mean: X, Y, Z?" string used by ErrorListener.ErrorDetails.toString() (CLI classic mode). Returns the empty string when there are no suggestions.
    • rankedListForTests

      static List<org.ek9lang.compiler.common.QuickFixSuggestionExtractor.RankedSuggestion> rankedListForTests(ErrorListener.ErrorDetails error)
      Backing list helper kept for tests that want the raw flat list. Hidden because consumers should prefer extract(ErrorListener.ErrorDetails) or extractNames(ErrorListener.ErrorDetails).
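
The output-side behaviour described for extract and formatForErrorMessage (a cap of 5 entries, and an empty string when there are no suggestions) can be sketched as below. The class and method names are hypothetical, the input is assumed to be an already-ranked list of bare names, and whether the real output quotes each name is not specified here.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SuggestionOutputSketch {
  static final int MAX_SUGGESTIONS = 5; // cap stated in the extract description

  // Truncates an already-ranked list to the documented cap.
  static List<String> cap(List<String> rankedNames) {
    return rankedNames.stream().limit(MAX_SUGGESTIONS).collect(Collectors.toList());
  }

  // Builds the inline message; empty string when there is nothing to suggest.
  static String formatForMessage(List<String> names) {
    return names.isEmpty() ? "" : "Did you mean: " + String.join(", ", names) + "?";
  }
}
```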