Power QueryAutomationFinance

Power Query Rule Engine: Build a Reusable Categorisation Library to Replace Manual AI Categorisation

UUnknown

2026-02-09

11 min read

Replace flaky AI categories with a Power Query rule engine: build, test and govern a reusable categorisation library for bank feeds.

Stop cleaning up AI: build a Power Query rule engine that gives consistent categorisation for bank feeds

Hook: If you’re tired of spending hours fixing inconsistent AI categorisation for bank and card feeds, you’re not alone. Many small businesses and operations teams find that AI gives a fast first pass — but it drifts, changes labels, and forces manual clean-up. The answer in 2026 isn't to abandon AI, it’s to pair it with a maintainable Power Query rule engine that delivers repeatable, auditable categories and replaces repetitive manual edits.

The 2026 reality: why rule engines matter now

Late 2025 and early 2026 saw two important trends: AI-driven categorisation tools matured but also revealed model drift and inconsistency when used at scale, and Power Query received performance and fuzzymatch improvements that make on-workbook rule engines practical for business users. Rather than rely purely on an opaque ML model, UK businesses are standardising with rule libraries that are easy to edit, test and deploy across teams.

“AI can speed up categorisation — but without governance you end up spending more time fixing errors.” — common experience across finance teams in 2025–26

What a Power Query rule engine is — and what it replaces

A Power Query rule engine is a set of tables and functions in your Excel workbook (or Power BI dataset) that evaluate a transaction feed against an ordered list of rules and assign a category. It replaces:

Manual edits to categories on individual transactions
Ad-hoc spreadsheet formulas across different team copies
Unreliable AI-only categories that change over time

Core principles

Rules first, AI second — deterministic rules for clear matches; AI-assisted fuzzy lookups as fallback
Single source of truth — store rules in an Excel table (Query-connected) so multiple analysts use the same logic
Order and precedence — rules execute by Priority; the first match wins
Testability and audit logs — store a change history and create a small test harness
Performance — use joins for exact matches, buffered functions for patterns, and fuzzy-merge only when necessary

Designing your rule set: what to store

Use an Excel table called PQ_Rules with these columns (each row is a single rule):

RuleID — unique code (e.g., R001)
Priority — numeric; lower = earlier
Active — TRUE/FALSE
Field — which feed field to test (Description, Payee, Reference)
MatchType — Exact, Contains, StartsWith, EndsWith, TokenMatch, Regex, Fuzzy
Pattern — text or regex pattern
Category — primary category (e.g., Supplies)
Subcategory — optional
Source — notes (e.g., bank, card, merchant list)
CreatedBy, CreatedOn, ModifiedBy, ModifiedOn — governance

Keep a second table PQ_RuleChangeLog to record edits: who changed a rule, when and why. That gives you auditability and helps when categories shift — treat this like a small policy process and consider guidance from policy and resilience playbooks when defining retention and access.

Architecture: staged matching for speed and predictability

A high-performing engine breaks matching into stages:

Exact matches via merge: for known merchant IDs, normalized payee names, invoice numbers — use Table.Join for O(n) performance. Be mindful of cloud query costs versus on-workbook joins; see notes on per-query economics in the cloud per-query discussion.
Deterministic pattern rules: Contains, StartsWith, EndsWith — implemented via a custom M function that iterates the active rules ordered by Priority.
Token-based matching: split descriptions into tokens and match if X% tokens overlap (good for receipts and multi-word payees).
Fuzzy fallback: use Power Query’s Fuzzy Merge to match against a curated payee dictionary when deterministic rules fail — but keep fuzzy lookups auditable and backed by a curated dictionary rather than open web sources.
AI-assisted suggestions (optional): store the AI suggestion but do not apply it automatically — let the rule engine map AI output to stable categories or flag for review. If you use on-device or local models for suggestions, follow sandboxing patterns like those described in desktop LLM safety guides.

Power Query recipe: a reusable M rule engine

Below is a pragmatic, well-commented Power Query (M) recipe. Put your rules table in the workbook as a table named PQ_Rules, and feed transactions as BankFeed. The main idea: load the rules, sort by Priority, and call a function that returns the first matching category.

Step 1: rules table (PQ_Rules)

Ensure your Excel table PQ_Rules has the columns described above. Create named range/table and load to Power Query: Excel.CurrentWorkbook(){[Name="PQ_Rules"]}[Content]

Step 2: core M functions (paste into Advanced Editor)

The following M code is a compact engine; adapt patterns and fields to your feed structure.

let
  // Load rules table from the workbook
  RulesSource = Excel.CurrentWorkbook(){[Name="PQ_Rules"]}[Content],
  // Keep only active rules and normalize
  ActiveRules = Table.SelectRows(RulesSource, each Record.Field(_, "Active") = true),
  SortedRules = Table.Sort(ActiveRules, { {"Priority", Order.Ascending} }),

  // Normalisation helper: make a lowercase trimmed string
  Normalize = (t as nullable text) as text =>
    if t = null then "" else Text.Trim(Text.Lower(t)),

  // Match a single rule to a field value
  MatchOne = (fieldValue as nullable text, rule as record) as logical =>
    let
      fv = Normalize(fieldValue),
      pat = Normalize(Record.Field(rule, "Pattern")),
      mtype = Record.Field(rule, "MatchType")
    in
      if mtype = "Exact" then fv = pat
      else if mtype = "Contains" then Text.Contains(fv, pat)
      else if mtype = "StartsWith" then Text.StartsWith(fv, pat)
      else if mtype = "EndsWith" then Text.EndsWith(fv, pat)
      else if mtype = "TokenMatch" then
        let tokensFV = List.RemoveItems(List.Distinct(List.Transform(Text.Split(fv, " "), each Text.Trim(_))), {""}),
            tokensPat = List.RemoveItems(List.Distinct(List.Transform(Text.Split(pat, " "), each Text.Trim(_))), {""}),
            inter = List.Intersect({tokensFV, tokensPat}),
            threshold = Number.RoundUp(List.Count(tokensPat) * 0.5) // 50% token overlap default
        in List.Count(inter) >= threshold
      else if mtype = "Regex" then
        try Text.RegexReplace(fv, pat, "$0") <> fv otherwise false
      else false,

  // Evaluate rules in order for a single transaction record
  GetCategoryForRecord = (txnRecord as record, rulesTable as table) as record =>
    let
      fieldsToTry = { "Description", "Payee", "Reference" },
      // iterate rules and return first matching category
      rulesList = Table.ToRecords(rulesTable),
      result = List.First(List.Transform(rulesList, each
        let r = _, fieldName = Record.Field(r, "Field"), fieldValue = try Record.Field(txnRecord, fieldName) otherwise null in
        if MatchOne(fieldValue, r) then
          [Match=true, Category = Record.Field(r, "Category"), Subcategory = Record.FieldOrDefault(r, "Subcategory", null), RuleID = Record.Field(r, "RuleID")] else [Match=false]
      ), (x)=> if x[Match] then x else null)
    in
      if result = null then [Match=false, Category=null, Subcategory=null, RuleID=null] else result,

  // Main query: Transactions from feed
  Source = Excel.CurrentWorkbook(){[Name="BankFeed"]}[Content],
  // Optional: normalise column names
  Renamed = Table.RenameColumns(Source, {}) ,
  // Add a Category column by invoking GetCategoryForRecord
  AddCategory = Table.AddColumn(Renamed, "_CategoryPick", each GetCategoryForRecord(Record.FromList(Record.ToList(_), Record.FieldNames(_)), SortedRules)),
  ExpandCategory = Table.ExpandRecordColumn(AddCategory, "_CategoryPick", {"Match","Category","Subcategory","RuleID"}, {"Matched","Category","Subcategory","RuleID"})
in
  ExpandCategory

Notes:

The MatchOne function supports Exact, Contains, StartsWith, EndsWith, TokenMatch and Regex. Extend to suit your use cases.
Regex requires careful patterns; test them on a small sample.
For very large feeds, the per-row function can be slow. Use staged matching (exact joins first) — see next section.

Improving performance: staged approach with joins and buffering

When you hit thousands of transactions, avoid evaluating every pattern rule against every row. Use these pragmatic steps:

Exact lookup table: create a curated table of normalized payee → category. Use Table.Join with the normalized key to tag many transactions in one efficient merge.
Pattern rules via function: apply the M function only to rows not matched by the exact join.
Buffer your rules: call Rules = Table.Buffer(SortedRules) before invoking the function so M doesn’t re-evaluate the rules table per row — buffered tables and buffered functions avoid repeated I/O during evaluation.
Use Table.FuzzyJoin for fuzzy matches: when patterns fail, perform a single fuzzy merge (Power Query supports fuzzy matching in merges) against a curated merchant dictionary and apply results to the unmatched set.

Example staged flow

Transactions → Normalise payee column (remove punctuation, lower-case).
Merge with MerchantDictionary on NormalisedPayee (inner join) → tag Category.
Filter out tagged rows.
Apply pattern engine function to remaining rows.
For still-unmatched rows, run a Fuzzy Merge against MerchantDictionary with Threshold 0.8.

Fuzzy matching tips (2026)

Power Query’s fuzzy merge got notable reliability improvements in late 2025. Use it as a secondary measure:

Keep a curated merchant dictionary rather than fuzzy-matching to an open internet source.
Use transform options: ignore-case, ignore-space, threshold tuning and transform similarity metrics where available.
Log every fuzzy match with score so you can review low-confidence matches and add a deterministic rule if required.

Governance: versioning, testing and rollback

A rule engine is only as good as its governance. Here’s a short checklist to keep it maintainable:

Rule change log: auto-append changes to PQ_RuleChangeLog on manual edits; include old vs new values.
Unit tests: keep a small sample sheet PQ_TestCases with known Description and ExpectedCategory. Create a query that runs the engine on PQ_TestCases and reports mismatches.
Staging and publish: keep a workbook copy in a staging folder for rule edits; after tests pass, copy the rules to production or use a single shared workbook on SharePoint/OneDrive or Dataverse so all users share the same library.
Rollback: keep snapshots of PQ_Rules (CSV exports) before significant changes so you can revert quickly.
Owner and SLAs: assign a rule owner who reviews low-confidence and fuzzy matches weekly.

Testing harness: quick check query

Create a PQ_TestCases table with columns: Description, ExpectedCategory. Load it into Power Query and run the rule engine — then compare expected vs actual. Report failures with the rule that matched and confidence (fuzzy score if used). For ongoing observability, produce weekly reports that mirror practices in edge observability playbooks so you can prioritise rule additions by impact.

Real-world example: small accounting practice

Example case study: a UK small accounting practice migrated 12 clients from an AI-only categorisation workflow to this Power Query rule engine. Results after 8 weeks:

Manual recategorisation time dropped 82% (from average 3 hours/client/month to 20 minutes)
Category consistency improved — same merchant now always maps to the same category across clients
Fewer disputes with clients about mis-categorised expenses — audit trail simplified reconciliation

Key reason for success: the practice kept a small but disciplined rule table and reviewed fuzzy matches weekly to create deterministic rules for recurring merchants.

Integrating with automation and macros (VBA snippet)

You’ll often want to automate the refresh and export process. Here’s a simple VBA snippet to refresh Power Query connections, export the categorised feed to CSV, and save a snapshot of the rules table.

Sub RefreshAndExportCategorisedFeed()
  ' Refresh queries and export final table to CSV
  Application.ScreenUpdating = False
  ThisWorkbook.RefreshAll
  ' wait for refresh (simple blocking wait)
  Application.CalculateUntilAsyncQueriesDone

  Dim ws As Worksheet
  Set ws = ThisWorkbook.Worksheets("CategorisedFeed") ' sheet where final query loads

  ' Export to CSV (overwrite)
  Dim exportPath As String
  exportPath = ThisWorkbook.Path & "\CategorisedFeed_Snapshot.csv"
  ws.Copy
  With ActiveWorkbook
    .SaveAs Filename:=exportPath, FileFormat:=xlCSV, CreateBackup:=False
    .Close SaveChanges:=False
  End With

  ' Save snapshot of rules
  ThisWorkbook.Worksheets("RulesTable").Copy
  ActiveWorkbook.SaveAs Filename:=ThisWorkbook.Path & "\Rules_Snapshot_" & Format(Now(),"yyyyMMdd_HHmm") & ".xlsx"
  ActiveWorkbook.Close SaveChanges:=False

  Application.ScreenUpdating = True
  MsgBox "Refresh and export complete." , vbInformation
End Sub

Notes: use Application.CalculateUntilAsyncQueriesDone (available in recent Excel) to wait for PQ async queries to finish. If your Excel version lacks this, use a loop to test Workbook.Queries status. For field-toolkit automation patterns and practical macros, see the field toolkit review.

Actionable checklist to implement this week

Create PQ_Rules and PQ_RuleChangeLog tables in your workbook and load them to Power Query.
Populate initial deterministic rules for top 50 merchants (Exact & Contains).
Implement the M engine above and test against a small sample.
Build PQ_TestCases and run unit tests; fix failing rules and log changes.
Set up a weekly review cadence for fuzzy matches and add new deterministic rules.

Advanced strategies and future-proofing (2026+)

As datasets grow, adopt these strategies:

Hybrid AI integration: keep AI for suggestions but map AI outputs to stable categories using rules. This avoids the “AI label drift” problem.
Shared rule service: for multi-workbook environments, host rules on SharePoint list or Dataverse and query them from Power Query so all users share the same library.
CI for rules: treat rules like code — store CSV snapshots in a version control system and run scheduled tests on push. See notes on rapid edge publishing for lightweight CI patterns at rapid edge content publishing.
Observability: produce weekly reports that show top unmatched descriptors and low-confidence fuzzy matches to prioritise new rules.

Common pitfalls and how to avoid them

Overly broad patterns: “Contains ‘shop’” will miscategorise many rows — prefer merchant-specific tokens.
Uncontrolled rule order: always use a Priority field and document why a rule has precedence.
Ignoring performance: don’t evaluate regex/fuzzy on every row if exact matching removes most transactions first.
Not testing changes: always run PQ_TestCases before publishing rule changes.

Wrap-up: why this beats AI-only workflows

AI gives speed but lacks stable governance. A Power Query rule engine gives:

Consistency: same merchant = same category every time
Auditable decisions: RuleID and change logs explain why a transaction was categorised
Maintainability: business users can edit tables in Excel without rewriting code
Hybrid capability: combine deterministic matching, fuzzy joins and AI suggestions for best coverage

Takeaways — quick summary

Store rules in an Excel table and load them into Power Query.
Run matches in stages: exact join → pattern engine → fuzzy merge → AI suggestions.
Buffer rule tables and use joins for performance.
Keep a test harness and change log for governance.
Automate refresh/export with a small VBA macro and review fuzzy matches weekly.

Next step: get the template and a short course

If you want to hit the ground running, download our Power Query Rule Engine template (includes PQ_Rules, PQ_TestCases, example MerchantDictionary and the ready-made M functions) and join our 90-minute workshop that walks through implementing and governing the engine across multiple clients.

Call to action: Download the template and sign up for the workshop to stop cleaning up AI output and build repeatable, auditable transaction categorisation workflows. Need a quick audit of your current categorisation? Contact us for a free 30-minute review and we’ll show you the low-hanging rules that save the most time.

Unknown

Contributor

Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.