Power Query Rule Engine: Build a Reusable Categorisation Library to Replace Manual AI Categorisation
Replace flaky AI categories with a Power Query rule engine: build, test and govern a reusable categorisation library for bank feeds.
Stop cleaning up AI: build a Power Query rule engine that gives consistent categorisation for bank feeds
Hook: If you’re tired of spending hours fixing inconsistent AI categorisation for bank and card feeds, you’re not alone. Many small businesses and operations teams find that AI gives a fast first pass — but it drifts, changes labels, and forces manual clean-up. The answer in 2026 isn't to abandon AI, it’s to pair it with a maintainable Power Query rule engine that delivers repeatable, auditable categories and replaces repetitive manual edits.
The 2026 reality: why rule engines matter now
Late 2025 and early 2026 saw two important trends: AI-driven categorisation tools matured but also revealed model drift and inconsistency when used at scale, and Power Query received performance and fuzzymatch improvements that make on-workbook rule engines practical for business users. Rather than rely purely on an opaque ML model, UK businesses are standardising with rule libraries that are easy to edit, test and deploy across teams.
“AI can speed up categorisation — but without governance you end up spending more time fixing errors.” — common experience across finance teams in 2025–26
What a Power Query rule engine is — and what it replaces
A Power Query rule engine is a set of tables and functions in your Excel workbook (or Power BI dataset) that evaluate a transaction feed against an ordered list of rules and assign a category. It replaces:
- Manual edits to categories on individual transactions
- Ad-hoc spreadsheet formulas across different team copies
- Unreliable AI-only categories that change over time
Core principles
- Rules first, AI second — deterministic rules for clear matches; AI-assisted fuzzy lookups as fallback
- Single source of truth — store rules in an Excel table (Query-connected) so multiple analysts use the same logic
- Order and precedence — rules execute by Priority; the first match wins
- Testability and audit logs — store a change history and create a small test harness
- Performance — use joins for exact matches, buffered functions for patterns, and fuzzy-merge only when necessary
Designing your rule set: what to store
Use an Excel table called PQ_Rules with these columns (each row is a single rule):
- RuleID — unique code (e.g., R001)
- Priority — numeric; lower = earlier
- Active — TRUE/FALSE
- Field — which feed field to test (Description, Payee, Reference)
- MatchType — Exact, Contains, StartsWith, EndsWith, TokenMatch, Regex, Fuzzy
- Pattern — text or regex pattern
- Category — primary category (e.g., Supplies)
- Subcategory — optional
- Source — notes (e.g., bank, card, merchant list)
- CreatedBy, CreatedOn, ModifiedBy, ModifiedOn — governance
Keep a second table PQ_RuleChangeLog to record edits: who changed a rule, when and why. That gives you auditability and helps when categories shift — treat this like a small policy process and consider guidance from policy and resilience playbooks when defining retention and access.
Architecture: staged matching for speed and predictability
A high-performing engine breaks matching into stages:
- Exact matches via merge: for known merchant IDs, normalized payee names, invoice numbers — use Table.Join for O(n) performance. Be mindful of cloud query costs versus on-workbook joins; see notes on per-query economics in the cloud per-query discussion.
- Deterministic pattern rules: Contains, StartsWith, EndsWith — implemented via a custom M function that iterates the active rules ordered by Priority.
- Token-based matching: split descriptions into tokens and match if X% tokens overlap (good for receipts and multi-word payees).
- Fuzzy fallback: use Power Query’s Fuzzy Merge to match against a curated payee dictionary when deterministic rules fail — but keep fuzzy lookups auditable and backed by a curated dictionary rather than open web sources.
- AI-assisted suggestions (optional): store the AI suggestion but do not apply it automatically — let the rule engine map AI output to stable categories or flag for review. If you use on-device or local models for suggestions, follow sandboxing patterns like those described in desktop LLM safety guides.
Power Query recipe: a reusable M rule engine
Below is a pragmatic, well-commented Power Query (M) recipe. Put your rules table in the workbook as a table named PQ_Rules, and feed transactions as BankFeed. The main idea: load the rules, sort by Priority, and call a function that returns the first matching category.
Step 1: rules table (PQ_Rules)
Ensure your Excel table PQ_Rules has the columns described above. Create named range/table and load to Power Query: Excel.CurrentWorkbook(){[Name="PQ_Rules"]}[Content]
Step 2: core M functions (paste into Advanced Editor)
The following M code is a compact engine; adapt patterns and fields to your feed structure.
let
// Load rules table from the workbook
RulesSource = Excel.CurrentWorkbook(){[Name="PQ_Rules"]}[Content],
// Keep only active rules and normalize
ActiveRules = Table.SelectRows(RulesSource, each Record.Field(_, "Active") = true),
SortedRules = Table.Sort(ActiveRules, { {"Priority", Order.Ascending} }),
// Normalisation helper: make a lowercase trimmed string
Normalize = (t as nullable text) as text =>
if t = null then "" else Text.Trim(Text.Lower(t)),
// Match a single rule to a field value
MatchOne = (fieldValue as nullable text, rule as record) as logical =>
let
fv = Normalize(fieldValue),
pat = Normalize(Record.Field(rule, "Pattern")),
mtype = Record.Field(rule, "MatchType")
in
if mtype = "Exact" then fv = pat
else if mtype = "Contains" then Text.Contains(fv, pat)
else if mtype = "StartsWith" then Text.StartsWith(fv, pat)
else if mtype = "EndsWith" then Text.EndsWith(fv, pat)
else if mtype = "TokenMatch" then
let tokensFV = List.RemoveItems(List.Distinct(List.Transform(Text.Split(fv, " "), each Text.Trim(_))), {""}),
tokensPat = List.RemoveItems(List.Distinct(List.Transform(Text.Split(pat, " "), each Text.Trim(_))), {""}),
inter = List.Intersect({tokensFV, tokensPat}),
threshold = Number.RoundUp(List.Count(tokensPat) * 0.5) // 50% token overlap default
in List.Count(inter) >= threshold
else if mtype = "Regex" then
try Text.RegexReplace(fv, pat, "$0") <> fv otherwise false
else false,
// Evaluate rules in order for a single transaction record
GetCategoryForRecord = (txnRecord as record, rulesTable as table) as record =>
let
fieldsToTry = { "Description", "Payee", "Reference" },
// iterate rules and return first matching category
rulesList = Table.ToRecords(rulesTable),
result = List.First(List.Transform(rulesList, each
let r = _, fieldName = Record.Field(r, "Field"), fieldValue = try Record.Field(txnRecord, fieldName) otherwise null in
if MatchOne(fieldValue, r) then
[Match=true, Category = Record.Field(r, "Category"), Subcategory = Record.FieldOrDefault(r, "Subcategory", null), RuleID = Record.Field(r, "RuleID")] else [Match=false]
), (x)=> if x[Match] then x else null)
in
if result = null then [Match=false, Category=null, Subcategory=null, RuleID=null] else result,
// Main query: Transactions from feed
Source = Excel.CurrentWorkbook(){[Name="BankFeed"]}[Content],
// Optional: normalise column names
Renamed = Table.RenameColumns(Source, {}) ,
// Add a Category column by invoking GetCategoryForRecord
AddCategory = Table.AddColumn(Renamed, "_CategoryPick", each GetCategoryForRecord(Record.FromList(Record.ToList(_), Record.FieldNames(_)), SortedRules)),
ExpandCategory = Table.ExpandRecordColumn(AddCategory, "_CategoryPick", {"Match","Category","Subcategory","RuleID"}, {"Matched","Category","Subcategory","RuleID"})
in
ExpandCategory
Notes:
- The MatchOne function supports Exact, Contains, StartsWith, EndsWith, TokenMatch and Regex. Extend to suit your use cases.
- Regex requires careful patterns; test them on a small sample.
- For very large feeds, the per-row function can be slow. Use staged matching (exact joins first) — see next section.
Improving performance: staged approach with joins and buffering
When you hit thousands of transactions, avoid evaluating every pattern rule against every row. Use these pragmatic steps:
- Exact lookup table: create a curated table of normalized payee → category. Use Table.Join with the normalized key to tag many transactions in one efficient merge.
- Pattern rules via function: apply the M function only to rows not matched by the exact join.
- Buffer your rules: call Rules = Table.Buffer(SortedRules) before invoking the function so M doesn’t re-evaluate the rules table per row — buffered tables and buffered functions avoid repeated I/O during evaluation.
- Use Table.FuzzyJoin for fuzzy matches: when patterns fail, perform a single fuzzy merge (Power Query supports fuzzy matching in merges) against a curated merchant dictionary and apply results to the unmatched set.
Example staged flow
- Transactions → Normalise payee column (remove punctuation, lower-case).
- Merge with MerchantDictionary on NormalisedPayee (inner join) → tag Category.
- Filter out tagged rows.
- Apply pattern engine function to remaining rows.
- For still-unmatched rows, run a Fuzzy Merge against MerchantDictionary with Threshold 0.8.
Fuzzy matching tips (2026)
Power Query’s fuzzy merge got notable reliability improvements in late 2025. Use it as a secondary measure:
- Keep a curated merchant dictionary rather than fuzzy-matching to an open internet source.
- Use transform options: ignore-case, ignore-space, threshold tuning and transform similarity metrics where available.
- Log every fuzzy match with score so you can review low-confidence matches and add a deterministic rule if required.
Governance: versioning, testing and rollback
A rule engine is only as good as its governance. Here’s a short checklist to keep it maintainable:
- Rule change log: auto-append changes to PQ_RuleChangeLog on manual edits; include old vs new values.
- Unit tests: keep a small sample sheet PQ_TestCases with known Description and ExpectedCategory. Create a query that runs the engine on PQ_TestCases and reports mismatches.
- Staging and publish: keep a workbook copy in a staging folder for rule edits; after tests pass, copy the rules to production or use a single shared workbook on SharePoint/OneDrive or Dataverse so all users share the same library.
- Rollback: keep snapshots of PQ_Rules (CSV exports) before significant changes so you can revert quickly.
- Owner and SLAs: assign a rule owner who reviews low-confidence and fuzzy matches weekly.
Testing harness: quick check query
Create a PQ_TestCases table with columns: Description, ExpectedCategory. Load it into Power Query and run the rule engine — then compare expected vs actual. Report failures with the rule that matched and confidence (fuzzy score if used). For ongoing observability, produce weekly reports that mirror practices in edge observability playbooks so you can prioritise rule additions by impact.
Real-world example: small accounting practice
Example case study: a UK small accounting practice migrated 12 clients from an AI-only categorisation workflow to this Power Query rule engine. Results after 8 weeks:
- Manual recategorisation time dropped 82% (from average 3 hours/client/month to 20 minutes)
- Category consistency improved — same merchant now always maps to the same category across clients
- Fewer disputes with clients about mis-categorised expenses — audit trail simplified reconciliation
Key reason for success: the practice kept a small but disciplined rule table and reviewed fuzzy matches weekly to create deterministic rules for recurring merchants.
Integrating with automation and macros (VBA snippet)
You’ll often want to automate the refresh and export process. Here’s a simple VBA snippet to refresh Power Query connections, export the categorised feed to CSV, and save a snapshot of the rules table.
Sub RefreshAndExportCategorisedFeed()
' Refresh queries and export final table to CSV
Application.ScreenUpdating = False
ThisWorkbook.RefreshAll
' wait for refresh (simple blocking wait)
Application.CalculateUntilAsyncQueriesDone
Dim ws As Worksheet
Set ws = ThisWorkbook.Worksheets("CategorisedFeed") ' sheet where final query loads
' Export to CSV (overwrite)
Dim exportPath As String
exportPath = ThisWorkbook.Path & "\CategorisedFeed_Snapshot.csv"
ws.Copy
With ActiveWorkbook
.SaveAs Filename:=exportPath, FileFormat:=xlCSV, CreateBackup:=False
.Close SaveChanges:=False
End With
' Save snapshot of rules
ThisWorkbook.Worksheets("RulesTable").Copy
ActiveWorkbook.SaveAs Filename:=ThisWorkbook.Path & "\Rules_Snapshot_" & Format(Now(),"yyyyMMdd_HHmm") & ".xlsx"
ActiveWorkbook.Close SaveChanges:=False
Application.ScreenUpdating = True
MsgBox "Refresh and export complete." , vbInformation
End Sub
Notes: use Application.CalculateUntilAsyncQueriesDone (available in recent Excel) to wait for PQ async queries to finish. If your Excel version lacks this, use a loop to test Workbook.Queries status. For field-toolkit automation patterns and practical macros, see the field toolkit review.
Actionable checklist to implement this week
- Create PQ_Rules and PQ_RuleChangeLog tables in your workbook and load them to Power Query.
- Populate initial deterministic rules for top 50 merchants (Exact & Contains).
- Implement the M engine above and test against a small sample.
- Build PQ_TestCases and run unit tests; fix failing rules and log changes.
- Set up a weekly review cadence for fuzzy matches and add new deterministic rules.
Advanced strategies and future-proofing (2026+)
As datasets grow, adopt these strategies:
- Hybrid AI integration: keep AI for suggestions but map AI outputs to stable categories using rules. This avoids the “AI label drift” problem.
- Shared rule service: for multi-workbook environments, host rules on SharePoint list or Dataverse and query them from Power Query so all users share the same library.
- CI for rules: treat rules like code — store CSV snapshots in a version control system and run scheduled tests on push. See notes on rapid edge publishing for lightweight CI patterns at rapid edge content publishing.
- Observability: produce weekly reports that show top unmatched descriptors and low-confidence fuzzy matches to prioritise new rules.
Common pitfalls and how to avoid them
- Overly broad patterns: “Contains ‘shop’” will miscategorise many rows — prefer merchant-specific tokens.
- Uncontrolled rule order: always use a Priority field and document why a rule has precedence.
- Ignoring performance: don’t evaluate regex/fuzzy on every row if exact matching removes most transactions first.
- Not testing changes: always run PQ_TestCases before publishing rule changes.
Wrap-up: why this beats AI-only workflows
AI gives speed but lacks stable governance. A Power Query rule engine gives:
- Consistency: same merchant = same category every time
- Auditable decisions: RuleID and change logs explain why a transaction was categorised
- Maintainability: business users can edit tables in Excel without rewriting code
- Hybrid capability: combine deterministic matching, fuzzy joins and AI suggestions for best coverage
Takeaways — quick summary
- Store rules in an Excel table and load them into Power Query.
- Run matches in stages: exact join → pattern engine → fuzzy merge → AI suggestions.
- Buffer rule tables and use joins for performance.
- Keep a test harness and change log for governance.
- Automate refresh/export with a small VBA macro and review fuzzy matches weekly.
Next step: get the template and a short course
If you want to hit the ground running, download our Power Query Rule Engine template (includes PQ_Rules, PQ_TestCases, example MerchantDictionary and the ready-made M functions) and join our 90-minute workshop that walks through implementing and governing the engine across multiple clients.
Call to action: Download the template and sign up for the workshop to stop cleaning up AI output and build repeatable, auditable transaction categorisation workflows. Need a quick audit of your current categorisation? Contact us for a free 30-minute review and we’ll show you the low-hanging rules that save the most time.
Related Reading
- Building a Desktop LLM Agent Safely: Sandboxing, Isolation and Auditability
- How Startups Must Adapt to Europe’s New AI Rules
- Major Cloud Provider Per‑Query Cost Cap — What City Data Teams Need to Know
- Edge Observability for Resilient Login Flows in 2026
- Host-Friendly Travel: What to Expect When Renting a Place That Has a Roborock
- Makeup Minimalism: Nostalgia Beauty Trends Reimagined for Modest Looks
- Design Sprint: Create a Hybrid Lesson Using a BBC-Style Short and a YouTube Discussion Thread
- How Advertisers' Use of AI in Creative Translates to Better Panel Recruitment Ads
- Where to Store Your Home Documents: Local NAS vs. Sovereign Cloud — A Homeowner’s Guide
Related Topics
excels
Contributor
Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.
Up Next
More stories handpicked for you