Class DataClumpOrError

java.lang.Object
org.ek9lang.compiler.phase5.DataClumpOrError
All Implemented Interfaces:
BiConsumer<String, List<ParsedModule>>

class DataClumpOrError extends Object implements BiConsumer<String, List<ParsedModule>>
Validates that a module does not contain data clumps (E11053). A data clump occurs when 3+ callables share 4+ parameters with matching types and similar names, which indicates a missing record abstraction.

Detection uses a three-phase algorithm:

  1. Pairwise comparison: For each pair of signatures, check whether they share a clump of 4+ parameter pairs with matching types and matching names (exact, or fuzzy within a Levenshtein distance of 2)
  2. Transitive grouping: Union-Find groups signatures connected by shared clumps
  3. Threshold check: Groups with 3+ members trigger E11053
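
The three phases above can be sketched in Java as follows. This is a minimal illustration, not the EK9 compiler's code: the Param and Signature records, the constant names, and the greedy one-to-one parameter matching are all assumptions made for the example. Only the thresholds (4+ shared parameters, 3+ members, Levenshtein distance 2) come from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the three-phase detection; not the actual EK9 implementation. */
public class DataClumpSketch {

  /** Assumed parameter model: a type name plus a parameter name. */
  public record Param(String type, String name) {}

  /** Assumed callable signature model. */
  public record Signature(String name, List<Param> params) {}

  static final int MIN_SHARED_PARAMS = 4;  // "4+ parameters"
  static final int MIN_GROUP_SIZE = 3;     // "3+ callables"
  static final int MAX_NAME_DISTANCE = 2;  // fuzzy name match limit

  /** Standard dynamic-programming Levenshtein edit distance. */
  public static int levenshtein(String a, String b) {
    int[] prev = new int[b.length() + 1];
    for (int j = 0; j <= b.length(); j++) {
      prev[j] = j;
    }
    for (int i = 1; i <= a.length(); i++) {
      int[] curr = new int[b.length() + 1];
      curr[0] = i;
      for (int j = 1; j <= b.length(); j++) {
        int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
        curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
      }
      prev = curr;
    }
    return prev[b.length()];
  }

  /** Phase 1: greedily count parameters of a that match a distinct parameter of b. */
  public static int sharedParams(Signature a, Signature b) {
    boolean[] used = new boolean[b.params().size()];
    int shared = 0;
    for (Param p : a.params()) {
      for (int j = 0; j < b.params().size(); j++) {
        Param q = b.params().get(j);
        if (!used[j] && p.type().equals(q.type())
            && levenshtein(p.name(), q.name()) <= MAX_NAME_DISTANCE) {
          used[j] = true;
          shared++;
          break;
        }
      }
    }
    return shared;
  }

  /** Union-Find root lookup with path halving. */
  static int find(int[] parent, int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  }

  /** Phases 2 and 3: union signatures that share a clump, then keep groups of 3+ members. */
  public static List<List<String>> findClumps(List<Signature> sigs) {
    int[] parent = new int[sigs.size()];
    for (int i = 0; i < parent.length; i++) {
      parent[i] = i;
    }
    for (int i = 0; i < sigs.size(); i++) {
      for (int j = i + 1; j < sigs.size(); j++) {
        if (sharedParams(sigs.get(i), sigs.get(j)) >= MIN_SHARED_PARAMS) {
          parent[find(parent, i)] = find(parent, j);
        }
      }
    }
    Map<Integer, List<String>> groups = new HashMap<>();
    for (int i = 0; i < sigs.size(); i++) {
      groups.computeIfAbsent(find(parent, i), k -> new ArrayList<>()).add(sigs.get(i).name());
    }
    List<List<String>> clumps = new ArrayList<>();
    for (List<String> group : groups.values()) {
      if (group.size() >= MIN_GROUP_SIZE) {
        clumps.add(group);  // each such group would correspond to one E11053 report
      }
    }
    return clumps;
  }
}
```

Because the grouping is transitive, a signature can join a group without sharing a clump with every other member; it only needs a qualifying pairwise match with one of them. The greedy matching in sharedParams is a simplification and may undercount in edge cases where a parameter could pair with several candidates.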

Based on Martin Fowler's "Refactoring" (1999): data clumps should be extracted into their own record/class to reduce parameter proliferation and improve cohesion.
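
As an illustration of the suggested refactoring, the sketch below shows the same idea in Java rather than EK9. The Address record and both method names are hypothetical: four parameters that always travel together are replaced by a single record parameter.

```java
/** Hypothetical before/after illustration of extracting a data clump into a record. */
public class AddressClumpExample {

  /** The extracted abstraction: the four values that recur across signatures. */
  public record Address(String street, String city, String postcode, String country) {}

  /** Before: the clump is repeated in every signature that needs an address. */
  public static String labelBefore(String street, String city, String postcode, String country) {
    return street + ", " + city + " " + postcode + ", " + country;
  }

  /** After: one parameter carries the whole clump. */
  public static String labelAfter(Address address) {
    return address.street() + ", " + address.city() + " "
        + address.postcode() + ", " + address.country();
  }
}
```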