v4.5.0 9 February 2026 Feature
Lexicographic Lattice Scoring
Lattice-Scored Rule Matching
The categorisation engine now uses a formal lexicographic lattice to select the best matching rule, replacing the previous “highest priority wins” approach. When multiple rules match a transaction, the engine scores each across six dimensions and selects the rule whose score vector ranks highest when the vectors are compared one dimension at a time.
Score Dimensions
Each matching rule is scored as a 6-dimension vector, compared left to right:
- Operator specificity: `equals` (5) beats `starts_with` (4) beats `contains` (3). Regex is scored dynamically based on literal character density.
- Field strength: `reference` and `contact_name` (4) outrank `description` (2) and `amount` (1). More informative fields win.
- Pattern specificity: longer match values are more specific. A rule matching “BRITISH GAS” (11 chars) outranks one matching “GAS” (3 chars).
- Value fit: how much of the input the rule covers. An `equals` match scores 100%; a short `contains` on a long description scores proportionally less.
- Priority: user-assigned priority (0–999), as before.
- Tie-break: earliest creation date, then lowest ID. Guarantees determinism even when all other dimensions are equal.
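The left-to-right comparison described above can be sketched with ordinary Python tuples, which already compare lexicographically. This is only an illustration, not the engine's actual code: the `Rule` type, the `score` and `best_rule` helpers, the value-fit formula (match length over input length) and the encoding of the tie-break as a negated pair are all assumptions made for the example. Only the operator and field weights come from the list above.

```python
from typing import NamedTuple

# Weights quoted in the release notes; regex is left out because its
# score is computed dynamically.
OPERATOR_SCORE = {"equals": 5, "starts_with": 4, "contains": 3}
FIELD_SCORE = {"reference": 4, "contact_name": 4, "description": 2, "amount": 1}


class Rule(NamedTuple):  # hypothetical shape of a rule
    rule_id: int
    operator: str
    field: str
    value: str
    priority: int
    created_day: int  # smaller means created earlier


def score(rule: Rule, field_text: str) -> tuple:
    """Build a score vector in which a larger entry is better at every position."""
    # Assumed value-fit formula: share of the input covered by the match value.
    fit = round(100 * len(rule.value) / len(field_text)) if field_text else 0
    return (
        OPERATOR_SCORE[rule.operator],
        FIELD_SCORE[rule.field],
        len(rule.value),
        min(fit, 100),
        rule.priority,
        (-rule.created_day, -rule.rule_id),  # earlier date, then lower ID, wins
    )


def best_rule(rules, field_text):
    # Tuples compare left to right, so max() picks the lexicographic winner.
    # All rules passed in are assumed to have matched already.
    return max(rules, key=lambda r: score(r, field_text))


text = "TESCO STORES"
broad = Rule(1, "contains", "description", "TESCO", 10, 10)
exact = Rule(2, "equals", "description", "TESCO STORES", 10, 20)
print(best_rule([broad, exact], text).rule_id)  # 2
```

Because the operator score sits in the first position, the `equals` rule wins here even though both rules share the same priority and the `contains` rule is older.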
What This Means
- Specific rules win automatically: an `equals "TESCO STORES"` rule beats a `contains "TESCO"` rule at the same priority, without manual priority tuning.
- Regex penalty: catch-all patterns like `.*` are penalised. A regex `TESCO.*LTD` with 8 literal characters scores well; a regex `.*` scores near zero and loses to any literal match.
- Visible scoring: the test panel now displays the score vector with labelled dimensions, so you can see exactly why one rule won over another.
- Backward compatible: existing rules and priorities continue to work. The lattice adds specificity resolution on top of priority; it doesn’t replace it.
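The release notes do not give the exact formula behind the regex penalty, so the following is only a rough sketch of what a literal-density measure could look like. The `literal_count` and `literal_density` helpers and the set of characters treated as regex syntax are assumptions. The sketch also counts the contents of character classes such as `[a-z]` as literals, which a real implementation would probably handle differently.

```python
# Characters treated as regex syntax rather than literals (an assumption).
_META = set(".*+?[](){}|^$")


def literal_count(pattern: str) -> int:
    """Count the characters in a pattern that must appear verbatim in the input."""
    count = 0
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # An escaped punctuation character (e.g. \.) is a literal;
            # escapes such as \d or \w are classes and are not counted.
            if not pattern[i + 1].isalnum():
                count += 1
            i += 2
            continue
        if ch not in _META:
            count += 1
        i += 1
    return count


def literal_density(pattern: str) -> float:
    """Share of the pattern made up of literal characters (0.0 to 1.0)."""
    return literal_count(pattern) / len(pattern) if pattern else 0.0


print(literal_count("TESCO.*LTD"))  # 8
print(literal_density(".*"))        # 0.0
```

Under this measure `TESCO.*LTD` keeps its 8 literal characters, while a bare `.*` contributes none and would rank below any literal match.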
Evidence Improvements
- The test panel now shows the score vector: `[3, 2, 12, 100, 10, -8]` with dimension labels `(op · field · pattern · fit · priority · tie)`.
- When multiple rules match, the evidence explains which dimension decided the winner (e.g. “won by operator specificity”).
- Ambiguity detection now compares full score vectors rather than just priority.
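One way to derive a “won by …” explanation is to find the first position at which two score vectors differ. The sketch below assumes six-entry vectors in the order shown in the test panel; the `deciding_dimension` helper and the second vector are made up for illustration, and how the engine actually defines ambiguity is not stated in these notes.

```python
LABELS = ("op", "field", "pattern", "fit", "priority", "tie")


def deciding_dimension(winner, runner_up):
    """Return the label of the first dimension where the vectors differ, or None."""
    for label, w, r in zip(LABELS, winner, runner_up):
        if w != r:
            return label
    return None  # identical vectors: nothing separates the two rules


winner = (3, 2, 12, 100, 10, -8)   # vector quoted in the release notes
runner_up = (3, 2, 5, 42, 10, -3)  # hypothetical competing rule
print("won by", deciding_dimension(winner, runner_up))  # won by pattern
```

A check built on this could, for example, flag a pair of rules as ambiguous when the deciding dimension is `tie`, meaning only the creation date or ID separated them. That threshold is an assumption, not documented behaviour.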