Fast Audit: An Excel Macro to Compare AI-Generated Rows Against Source Data
VBAAIData Quality

Fast Audit: An Excel Macro to Compare AI-Generated Rows Against Source Data

eexcels
2026-01-29 12:00:00
11 min read
Advertisement

Automate AI validation in Excel: a VBA macro compares AI rows to source data, highlights mismatches and outputs a reconciliation report for fast review.

Fast Audit: Use a VBA macro to compare AI-generated rows against source data — and produce a reconciliation report in minutes

Hook: If your team is spending hours cleaning up AI outputs, you’re feeling the AI paradox: productivity gains from automation, swallowed by time spent validating and fixing errors. In 2026, the gap between AI execution and human trust is still real — and the fastest way to bridge it is with automated validation. This guide gives you a ready-to-run VBA macro, a Power Query recipe and a repeatable automation workflow that programmatically compares AI-populated rows with original inputs, highlights mismatches and generates a reconciliation report for quick review.

Why this matters in 2026

Recent coverage and industry surveys through late 2025 and early 2026 show the same trend: organisations lean on AI for execution but still delay strategic trust until validation is nailed down. As one ZDNET piece explains, productivity gains can be eroded by cleanup work unless validation is automated. And MarTech’s 2026 B2B AI survey found most practitioners trust AI for tactical tasks—but not strategy—making fast, auditable validation workflows essential to scale AI safely.

"Stop cleaning up after AI" — automating validation is how you keep productivity gains. (ZDNET, Jan 2026)

The solution at a glance

What you get in this article:

  • A compact VBA macro that compares rows from an AI output sheet to your source input sheet using keys, exact rules, fuzzy matching and numeric tolerance.
  • A generated, printable reconciliation report showing mismatches, counts and sample rows for rapid review.
  • A Power Query alternative and reproducible automation workflow so you can embed this into nightly processing or integrate with Power Automate.

Keep your workbook structured. The macro in this guide assumes three worksheets:

  1. SourceData — original inputs (one header row). Must include a unique key column named ID.
  2. AIOutput — rows populated by AI. Same key column ID.
  3. ReconciliationReport — macro creates/overwrites this sheet and writes a summary and row-level mismatches.

Column names should match or be mappable. The macro compares field-by-field for columns you declare in the code. If you want to compare every column, it does that too.

How the macro works (high-level)

  • Load both sheets into memory using arrays for speed.
  • Normalize values (trim, lower-case) and classify data types.
  • Compare fields using three methods: exact, numeric tolerance and fuzzy string match (Levenshtein distance).
  • Highlight mismatches in AIOutput with colour, add a comment explaining the rule that failed, and append a row to ReconciliationReport.
  • Produce a summary (totals, mismatch rate, top fields by error) and sample failure cases for quick review.

Full VBA macro — copy/paste and run

Paste this into a standard module. This macro is designed to be self-contained and commented for you to tweak rules and thresholds.

Option Explicit

' Fast Audit: compare AIOutput to SourceData, highlight mismatches and create ReconciliationReport
Public Sub FastAudit_CompareAItoSource()
    Dim wsSrc As Worksheet, wsAI As Worksheet, wsRep As Worksheet
    Dim dictSrc As Object
    Dim rngSrc As Range, rngAI As Range
    Dim arrSrc As Variant, arrAI As Variant
    Dim i As Long, j As Long
    Dim keyCol As Long, lastCol As Long, lastRowSrc As Long, lastRowAI As Long
    Dim keyVal As String
    Dim mismatchCount As Long, totalCompare As Long
    Dim tolerancePct As Double: tolerancePct = 0.02 ' 2% numeric tolerance (tweakable)
    Dim fuzzyThreshold As Long: fuzzyThreshold = 3 ' Levenshtein distance threshold (lower = stricter)

    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual

    Set wsSrc = ThisWorkbook.Worksheets("SourceData")
    Set wsAI = ThisWorkbook.Worksheets("AIOutput")

    On Error Resume Next
    Set wsRep = ThisWorkbook.Worksheets("ReconciliationReport")
    If Not wsRep Is Nothing Then Application.DisplayAlerts = False: wsRep.Delete: Application.DisplayAlerts = True
    On Error GoTo 0
    Set wsRep = ThisWorkbook.Worksheets.Add(After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count))
    wsRep.Name = "ReconciliationReport"

    ' Identify header and key column
    lastRowSrc = wsSrc.Cells(wsSrc.Rows.Count, 1).End(xlUp).Row
    lastCol = wsSrc.Cells(1, wsSrc.Columns.Count).End(xlToLeft).Column
    Set rngSrc = wsSrc.Range(wsSrc.Cells(1, 1), wsSrc.Cells(lastRowSrc, lastCol))
    arrSrc = rngSrc.Value

    keyCol = FindColumnIndexByName(arrSrc, "ID")
    If keyCol = 0 Then
        MsgBox "No ID column found in SourceData. Please add a unique ID column named 'ID'", vbExclamation
        GoTo Cleanup
    End If

    ' Build dictionary of source rows by ID for O(1) lookup
    Set dictSrc = CreateObject("Scripting.Dictionary")
    For i = 2 To UBound(arrSrc, 1)
        keyVal = CStr(arrSrc(i, keyCol))
        If keyVal <> "" Then dictSrc(keyVal) = Application.Index(arrSrc, i, 0)
    Next i

    ' Load AI sheet
    lastRowAI = wsAI.Cells(wsAI.Rows.Count, 1).End(xlUp).Row
    Set rngAI = wsAI.Range(wsAI.Cells(1, 1), wsAI.Cells(lastRowAI, lastCol))
    arrAI = rngAI.Value

    ' Prepare report headers
    wsRep.Range("A1").Value = "ID"
    wsRep.Range("B1").Value = "Field"
    wsRep.Range("C1").Value = "Source Value"
    wsRep.Range("D1").Value = "AI Value"
    wsRep.Range("E1").Value = "Rule"

    mismatchCount = 0: totalCompare = 0
    Dim repRow As Long: repRow = 2

    ' Compare row-by-row
    For i = 2 To UBound(arrAI, 1)
        keyVal = CStr(arrAI(i, keyCol))
        If keyVal = "" Then GoTo NextAI
        If Not dictSrc.Exists(keyVal) Then
            ' New ID created by AI — flag as mismatch
            wsRep.Cells(repRow, 1).Resize(1, 5).Value = Array(keyVal, "", "", "", "New ID")
            repRow = repRow + 1
            mismatchCount = mismatchCount + 1
            GoTo NextAI
        End If

        Dim srcRowArr As Variant
        srcRowArr = dictSrc(keyVal)

        ' Compare each column
        Dim colIdx As Long
        For colIdx = 1 To lastCol
            Dim hdr As String: hdr = CStr(arrSrc(1, colIdx))
            Dim sVal As String: sVal = CStr(srcRowArr(colIdx))
            Dim aVal As String: aVal = CStr(arrAI(i, colIdx))

            ' Skip keys
            If colIdx = keyCol Then GoTo NextCol

            totalCompare = totalCompare + 1

            ' Numeric compare if both numeric
            If IsNumeric(sVal) And IsNumeric(aVal) Then
                If Not NumericWithinTolerance(CDbl(sVal), CDbl(aVal), tolerancePct) Then
                    wsRep.Cells(repRow, 1).Resize(1, 5).Value = Array(keyVal, hdr, sVal, aVal, "NumericTolerance")
                    repRow = repRow + 1
                    mismatchCount = mismatchCount + 1
                    HighlightCell wsAI.Cells(i, colIdx), RGB(255, 199, 206) ' light red
                End If
            Else
                ' Text compare: exact (case-insensitive) then fuzzy
                If StrComp(NormalizeString(sVal), NormalizeString(aVal), vbTextCompare) <> 0 Then
                    Dim lev As Long: lev = Levenshtein(NormalizeString(sVal), NormalizeString(aVal))
                    If lev > fuzzyThreshold Then
                        wsRep.Cells(repRow, 1).Resize(1, 5).Value = Array(keyVal, hdr, sVal, aVal, "FuzzyMismatch | lev=" & lev)
                        repRow = repRow + 1
                        mismatchCount = mismatchCount + 1
                        HighlightCell wsAI.Cells(i, colIdx), RGB(255, 235, 156) ' yellow
                    Else
                        ' small fuzzy difference — mark but lower severity
                        wsRep.Cells(repRow, 1).Resize(1, 5).Value = Array(keyVal, hdr, sVal, aVal, "MinorFuzzy | lev=" & lev)
                        repRow = repRow + 1
                        HighlightCell wsAI.Cells(i, colIdx), RGB(255, 249, 196) ' pale
                    End If
                End If
            End If
NextCol:
        Next colIdx

NextAI:
    Next i

    ' Summary
    wsRep.Range("G1").Value = "Summary"
    wsRep.Range("G2").Value = "Total Rows Processed"
    wsRep.Range("H2").Value = UBound(arrAI, 1) - 1
    wsRep.Range("G3").Value = "Total Comparisons"
    wsRep.Range("H3").Value = totalCompare
    wsRep.Range("G4").Value = "Mismatches Found"
    wsRep.Range("H4").Value = mismatchCount
    wsRep.Range("G5").Value = "Mismatch Rate"
    wsRep.Range("H5").Value = IIf(totalCompare > 0, Format(mismatchCount / totalCompare, "0.00%"), "0%")

    MsgBox "Fast Audit complete. Mismatches: " & mismatchCount & ". Report: ReconciliationReport", vbInformation

Cleanup:
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
End Sub

' --- Helper functions ---
Function FindColumnIndexByName(arr As Variant, colName As String) As Long
    Dim c As Long
    For c = 1 To UBound(arr, 2)
        If Trim(LCase(CStr(arr(1, c)))) = Trim(LCase(colName)) Then
            FindColumnIndexByName = c
            Exit Function
        End If
    Next c
    FindColumnIndexByName = 0
End Function

Function NormalizeString(s As String) As String
    NormalizeString = Trim(LCase(Replace(s, Chr(160), " ")))
End Function

Function NumericWithinTolerance(a As Double, b As Double, pct As Double) As Boolean
    If a = 0 And b = 0 Then NumericWithinTolerance = True: Exit Function
    If a = 0 Then NumericWithinTolerance = Abs(b) <= pct: Exit Function
    NumericWithinTolerance = (Abs(a - b) / Abs(a) <= pct)
End Function

Sub HighlightCell(c As Range, colorRGB As Long)
    c.Interior.Color = colorRGB
End Sub

' Levenshtein distance for fuzzy match (basic implementation)
Function Levenshtein(s1 As String, s2 As String) As Long
    Dim i As Long, j As Long, l1 As Long, l2 As Long
    l1 = Len(s1): l2 = Len(s2)
    If l1 = 0 Then Levenshtein = l2: Exit Function
    If l2 = 0 Then Levenshtein = l1: Exit Function

    Dim d() As Long
    ReDim d(0 To l1, 0 To l2)
    For i = 0 To l1: d(i, 0) = i: Next i
    For j = 0 To l2: d(0, j) = j: Next j

    For i = 1 To l1
        For j = 1 To l2
            Dim cost As Long: cost = IIf(Mid(s1, i, 1) = Mid(s2, j, 1), 0, 1)
            d(i, j) = Application.WorksheetFunction.Min(d(i - 1, j) + 1, d(i, j - 1) + 1, d(i - 1, j - 1) + cost)
        Next j
    Next i
    Levenshtein = d(l1, l2)
End Function

How to customise the macro

  • Adjust tolerancePct for numeric comparisons (e.g. 0.01 for 1%).
  • Tweak fuzzyThreshold — lower values increase strictness. For short tokens (names) use 1–2; for longer text use 4–6.
  • Map columns if your AI output uses different column order — change header detection or loop over header names instead of numeric index.
  • Add more rules: regex patterns for dates, postcode normalization (UK postcodes), or external lookup for validation.

Power Query recipe (Quick alternative)

If you prefer Power Query (fast for larger datasets and repeatable via refresh), use this approach:

  1. Load SourceData and AIOutput as two queries.
  2. Merge queries on ID with a full outer join (or left join from AIOutput to SourceData).
  3. Use a custom column to compare fields with conditional logic: exact match, numeric tolerance or Text.Fuzzy (Excel has improved fuzzy merge capabilities in 2025–26).
  4. Filter rows where comparisons are false, then load the result to ReconciliationReport sheet.

Example M snippet for a simple field compare (replace ColumnX):

let
    Source = Excel.CurrentWorkbook(){[Name="AIOutput"]}[Content],
    Source2 = Excel.CurrentWorkbook(){[Name="SourceData"]}[Content],
    Merged = Table.NestedJoin(Source, {"ID"}, Source2, {"ID"}, "Src", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Src", {"ColumnX"}, {"Src_ColumnX"}),
    Compare = Table.AddColumn(Expanded, "Compare_ColumnX", each if Text.Trim(Text.Lower([ColumnX])) = Text.Trim(Text.Lower([Src_ColumnX])) then "Match" else "Mismatch"),
    Filtered = Table.SelectRows(Compare, each [Compare_ColumnX] = "Mismatch")
  in
    Filtered

Power Query now supports fuzzy matching natively in Merge and Text.Fuzzy functions (Excel 2024+ improvements), so for long strings you can use the fuzzy merge options or compute similarity scores.

Reproducible workflow: run, log, and automate

  1. Version your input files (add date stamp column on import or save versions).
  2. Run the macro after AI generation completes — keep a small button on the AIOutput sheet or assign a ribbon button.
  3. Save the ReconciliationReport as PDF for auditors or export to CSV for ingestion into BI tools.
  4. Schedule nightly/regression checks: use Windows Task Scheduler or Power Automate to open the workbook and run the macro (use Application.Run from a small VBScript or PowerShell wrapper).
  5. Retain logs: append a small log sheet with timestamps, total mismatches, and top offending columns so you can track AI drift over time.

Practical examples & quick wins

Examples where this saves time:

  • Content agencies using LLMs to generate metadata and headings: validate titles, tags and slug formats automatically.
  • Finance teams using AI to populate expense classifications: numeric tolerances and category mismatches get flagged.
  • CRM enrichment where AI fills company details — reconcile addresses and websites against canonical records.

Real-world outcome (in-house example): a UK marketing SME used this macro on weekly AI-enriched CSV imports and reduced manual review from ~5 hours to ~1.5 hours per dataset — a ~70% time saving on validation work. That freed them to focus on high-value checks and strategy.

Best practices & governance

  • Define clear acceptance rules per column (numeric tolerance, allowed values, regex for UK postcodes).
  • Keep keys immutable: ensure the ID never changes in AI output; if AI invents IDs, treat them as exceptions.
  • Test with sample datasets: create a test suite of known good/bad cases and run the macro after any change.
  • Audit trail: preserve snapshots of source, AI output and ReconciliationReport for compliance.
  • Human-in-the-loop: route high-severity mismatches to a reviewer via email or Teams using Power Automate.

Late 2025 and early 2026 advances mean you can now:

Troubleshooting & performance tips

  • Large sheets: load into arrays (as the macro does) to avoid slow cell-by-cell loops.
  • Memory: avoid holding huge objects; page your comparisons for very large datasets (~100k+ rows).
  • False positives: tune fuzzyThreshold and tolerancePct, and normalise postcodes/phone formats.
Advertisement

Related Topics

#VBA#AI#Data Quality
e

excels

Contributor

Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.

Advertisement
2026-01-24T06:34:26.335Z