Power Query Recipe: Combine Leads from Multiple CRMs into One Clean Sheet
Power QueryCRMAutomation

Power Query Recipe: Combine Leads from Multiple CRMs into One Clean Sheet

UUnknown
2026-02-27
10 min read
Advertisement

Step-by-step Power Query recipe to ingest CRM exports, standardise fields, fuzzy-dedupe and produce a single canonical lead list for reporting.

Stop wasting hours cleaning CRM exports: a reproducible Power Query recipe to consolidate leads

If you’re a small business owner, operations lead or sales ops professional, you know the pain: multiple CRMs, inconsistent exports, duplicated leads and hours lost massaging spreadsheets. In 2026 those challenges are even more obvious — vendors keep adding features, new connectors and AI tools, but your reporting still struggles with messy source data. This step-by-step Power Query recipe shows how to pull exports from different CRM platforms, standardise fields, deduplicate effectively and produce one clean, canonical lead list for reporting and automation.

  • More CRM choices, more fragmentation. Cloud CRMs like Salesforce, HubSpot, Dynamics 365, Zoho, Pipedrive and specialist tools continue to diversify. Each export format and field name adds friction.
  • Power Query continues to evolve. Since late 2024–2025 Microsoft rolled AI-assisted mapping and faster merge algorithms into Power Query (and into Fabric's Data Factory), making ETL in Excel more powerful and accessible.
  • AI helps, but governance matters. Copilot and AI-suggested mappings can speed work — but you still need reproducible logic, audits and GDPR-safe handling of PII.

What you'll get from this recipe

  • A repeatable folder + Power Query workflow to ingest CRM exports
  • Field mapping technique to standardise disparate column names
  • Deduplication with a priority rule, plus fuzzy matching for near-duplicates
  • An audit column and canonical lead ID so reports are traceable
  • Small VBA snippet to autorefresh and export the canonical list

Overview: the canonical consolidation pattern

  1. Collect exports into a folder, one file per CRM export (CSV/Excel).
  2. Use a Power Query that reads the folder, tags each file with SourcePlatform.
  3. Apply a mapping table (another simple Excel sheet) to standardise column names to your canonical schema (e.g., LeadID, FirstName, LastName, Email, Phone, CreatedDate, Owner, SourcePlatform, RawNotes).
  4. Clean individual fields (trim, normalise phone, lowercase emails, parse names).
  5. Concatenate important keys for dedupe (Email + NormalisedPhone + Name fingerprint).
  6. Merge records using exact and fuzzy matching; apply rules to keep the best record (e.g., most recent CreatedDate or source priority).
  7. Produce the final canonical lead list with audit columns and a canonical LeadID.

Preparation: folder layout and control table

Set up a controlled folder structure that everyone in your team uses:

  • /CRM-Exports/Incoming/ — put raw CSV/XLSX exports here, named like Salesforce_2026-01-01.csv
  • /CRM-Exports/Archive/ — move processed files here
  • In your Excel workbook save a sheet named FieldMap containing two columns: RawName and CanonicalName. This is the mapping table Power Query will use to rename columns.

Step 1 — Import all exports from the folder

Use Power Query’s From Folder connector. This is resilient: you can drop new exports into the folder and refresh the query.

  1. Data > Get Data > From File > From Folder. Point to /CRM-Exports/Incoming/.
  2. In the query editor, expand Content > Use File. Add a column to extract the source platform from the file name (e.g., use Text.BeforeDelimiter([Name], "_")). Name it SourcePlatform.

Key M snippet: read and tag files

<code>let
    Source = Folder.Files("C:\\Path\\To\\CRM-Exports\\Incoming"),
    Filtered = Table.SelectRows(Source, each List.Contains({".csv", ".xlsx"}, Text.Lower(Text.End(Text.From([Extension]), 4)))),
    AddedCustom = Table.AddColumn(Filtered, "Imported", each Csv.Document(File.Contents([Folder Path]&[Name]),[Delimiter=",", Encoding=1252])),
    Expanded = Table.ExpandTableColumn(AddedCustom, "Imported", {"Column1"}),
    AddedPlatform = Table.AddColumn(Expanded, "SourcePlatform", each Text.BeforeDelimiter([Name], "_"))
in
    AddedPlatform
  in
    AddedPlatform</code>

Note: the exact import code varies if files are Excel; the UI is flexible. The goal is a single table where each row comes from a source file and includes SourcePlatform.

Step 2 — Normalise column names using a mapping table

Different CRMs use different column names: Email vs EmailAddress, FirstName vs Given Name. Your FieldMap sheet lets you standardise without hardcoding every variant.

  1. Load FieldMap into Power Query as a table (name it FieldMap).
  2. Create a query that takes your imported table, gets the header row for each file and uses FieldMap to rename columns.

Mapping technique (UI steps)

  • Promote headers in the imported table.
  • Create a list of current column names: Transform > Use First Row as Headers, then go to Advanced Editor and create a Rename step using a merge with FieldMap.

Key M snippet: rename via mapping table

<code>let
    Source = /* your combined source table */ ,
    CurrentNames = Table.ColumnNames(Source),
    MapTable = Excel.CurrentWorkbook(){[Name="FieldMap"]}[Content],
    MapList = Table.ToRows(MapTable),
    RenamePairs = List.Transform(MapList, each {_[0], _[1]}),
    Renamed = Table.RenameColumns(Source, RenamePairs, MissingField.Ignore)
in
    Renamed
  in
    Renamed</code>

This will rename matching raw names to your canonical schema and leave unmapped columns alone. Keep FieldMap up to date as you add new CRMs.

Step 3 — Clean and normalise values

Before deduping, normalise the key fields.

  • Email: Text.Lower, Trim, remove plus tags (e.g., alice+test@example.com → alice@example.com).
  • Phone: Remove non-digits, add country code if missing (simple rule-based normalisation).
  • Names: Trim, Proper case, split into First/Last where possible.
  • Dates: Ensure CreatedDate is DateTime type; use try/otherwise to handle different source formats.

Sample M steps

<code>CleanEmail = Table.TransformColumns(Renamed, {{"Email", each Text.Lower(Text.Trim(_)), type text}}),
  RemovePlus = Table.TransformColumns(CleanEmail, {{"Email", each Text.BeforeDelimiter(_, "+", {0, RelativePosition.FromEnd}), type text}}),
  CleanPhone = Table.TransformColumns(RemovePlus, {{"Phone", each Text.Select(_, {"0".."9"}), type text}})
</code>

Step 4 — Build dedupe keys and priorities

Decide how to choose which duplicate to keep. Common strategies:

  • Keep the most recent record (Latest CreatedDate wins).
  • Keep by Source Priority (e.g., Salesforce > HubSpot > Pipedrive).
  • Keep the most complete record (count of non-null key fields).

Create helper columns: EmailKey, PhoneKey, NameKey and a CombinedKey that concatenates them. Then compute a Score column and a SourceRank based on a lookup table.

M snippet: example SourceRank and Score

<code>SourceRankMap = {{"Salesforce",1},{"Dynamics",2},{"HubSpot",3},{"Pipedrive",4},{"Other",99}},
  AddSourceRank = Table.AddColumn(CleanPhone, "SourceRank", each Record.FieldOrDefault(Record.FromList(List.Transform(SourceRankMap, each _{1}), List.Transform(SourceRankMap, each _{0})), [SourcePlatform], 999), Int64.Type),
  AddCompleteness = Table.AddColumn(AddSourceRank, "Completeness", each List.Count(List.RemoveItems(Record.FieldValues(_), {null, ""})), Int64.Type)
</code>

Step 5 — Exact dedupe pass

First apply exact de-duplication on reliable keys (email and normalized phone). Group by EmailKey and PhoneKey and pick the row with the best Score/SourceRank.

  1. Group By EmailKey (or CombinedKey) > All Rows.
  2. Add a custom column for BestRow = Table.Sort([AllRows], {"SourceRank", Order.Ascending, {"CreatedDate", Order.Descending}}){0}.
  3. Expand BestRow back out.

Step 6 — Fuzzy dedupe for near duplicates

Some duplicates don't share email or phone. Use Power Query's Merge Queries (Fuzzy) or programmatic fuzzy joins for name+company matches. Fuzzy Merge lets you set similarity thresholds and weighting.

  • Duplicate your deduped table to create a self-join candidate set.
  • Merge queries on Name + Company with Fuzzy matching (threshold ~0.8). Inspect the results manually initially to tune threshold.

Fuzzy merge considerations (2026 tip)

Power Query's fuzzy engine in 2025-2026 is faster and supports tokenisation weighting — use it for larger datasets. But always include an IsFuzzyMatch flag and an audit column with matched IDs for human review.

Step 7 — Create canonical LeadID and audit trail

Create a deterministic CanonicalLeadID. Common pattern:

  • If Email exists, use a hash of email.
  • Else use hash of Phone + Name.
<code>AddLeadID = Table.AddColumn(FinalTable, "CanonicalLeadID", each if [Email] <> null then Text.From(Binary.ToText(Binary.FromText(Text.Lower([Email]), BinaryEncoding.Base64), BinaryEncoding.Base64)) else Text.From(Number.Abs(Number.From(Text.Hash(Text.Combine({[Name],[Phone]}))))) ) </code>

Also add an Audit column listing merged source IDs and any fields changed during consolidation. That makes the list trustworthy for reporting.

Step 8 — Load to Excel table and governance

Load the final canonical table to an Excel sheet named CanonicalLeads. Protect the sheet and keep a read-only view for stakeholders. Keep the Power Query steps documented: each query should have a comment describing purpose, source and last modification date.

Step 9 — Automation: refresh and archive

Use a small VBA macro to refresh all queries, export the canonical sheet to CSV, and move processed files to Archive. Run it manually or as a scheduled task using Power Automate Desktop or Task Scheduler (if you keep the workbook on a server.).

Minimal VBA snippet to refresh and export

<code>Sub RefreshAndExportCanonical()
  ThisWorkbook.RefreshAll
  Application.CalculateUntilAsyncQueriesDone
  Dim ws As Worksheet: Set ws = ThisWorkbook.Worksheets("CanonicalLeads")
  Dim outPath As String: outPath = ThisWorkbook.Path & "\\Exports\\CanonicalLeads_" & Format(Now(), "yyyy-mm-dd_hhmm") & ".csv"
  ws.ListObjects(1).Range.Select
  ws.Copy
  ActiveWorkbook.SaveAs Filename:=outPath, FileFormat:=xlCSV
  ActiveWorkbook.Close SaveChanges:=False
  MsgBox "Exported to " & outPath
End Sub</code>

Tip: use Application.CalculateUntilAsyncQueriesDone to ensure Power Query finishes before export (Excel 365 builds in 2024+ support this). If you're on Excel online or Fabric, schedule refreshes in the cloud.

Example case study: how a UK SME saved 6 hours/week

We implemented this recipe for a UK marketing agency in late 2025. They had three CRM systems plus a manual spreadsheet. Results after two weeks:

  • Consolidation time reduced from ~6 hours/week to ~20 minutes refresh.
  • Duplicate rate reduced by 78% after applying fuzzy dedupe and source priority rules.
  • Sales reporting accuracy improved; the canonical list became the single source for campaign targeting.

Data privacy and governance (must-do in 2026)

  • Limit who can put files in /Incoming/. Manage access via SharePoint or secure file share.
  • Redact or pseudonymise sensitive fields when possible. Only export PII when necessary.
  • Keep a change log: every refresh should write a small audit file noting who ran it and which files were ingested.
  • Reference: GDPR still applies to UK businesses; document lawful basis for each processing step.

Advanced tips and 2026-forward strategies

  • Use Power Query parameters for dynamic folder paths and thresholds — makes the workbook portable across environments.
  • Centralise mappings in Fabric Dataflows if you have Microsoft Fabric — reuse FieldMap across teams and build modular ETL.
  • Leverage AI-assisted mapping in Excel/Copilot to suggest field maps, then lock them with your FieldMap table for reproducibility.
  • Expose canonical IDs to your CRM(s) via integrations or API — prevents re-creation of duplicates at source in the long term.

Common pitfalls and how to avoid them

  • Over-reliance on fuzzy matching: It’s powerful but can merge distinct leads if thresholds are too low. Always review fuzzy matches until you're confident.
  • Mapping drift: New exports may introduce new column names. Keep FieldMap updated and monitor query errors.
  • PII leaks: Avoid storing raw exports in shared folders; rotate and archive them securely.
“Automating consolidation doesn’t remove the need for governance — it amplifies the importance of good mapping and audit trails.”

Actionable checklist to implement today

  1. Create the folder structure and FieldMap sheet.
  2. Build the From Folder query and tag SourcePlatform.
  3. Load FieldMap and implement column rename step.
  4. Standardise Email, Phone and Dates; add SourceRank and Completeness score.
  5. Run exact dedupe, then fuzzy dedupe with conservative threshold.
  6. Create CanonicalLeadID, load to CanonicalLeads sheet and protect it.
  7. Add the VBA refresh/export macro and test with sample exports.

Takeaways

Power Query is a practical, repeatable ETL tool for consolidating leads from multiple CRMs in 2026. By combining a folder-based ingestion, a FieldMap-driven schema, thoughtful dedupe rules (exact + fuzzy) and basic automation, you can turn messy exports into a single canonical lead list you trust for reporting and activation. Keep governance, privacy and auditability front of mind as you scale the solution.

Next steps — get the templates and a quick run-through

If you want to skip building from scratch, we offer a downloadable Excel workbook that includes:

  • Pre-built Power Query queries for folder import.
  • FieldMap sample with common CRM mappings.
  • VBA refresh/export macro and documentation.

Ready to stop wrestling your CRM exports? Download the template, try it with two weeks of exports and schedule a 30-minute session with our spreadsheet experts to tailor it to your stack.

Call to action: Download the Canonical Leads Power Query template and book a free 30-minute setup call — or subscribe for monthly updates and support to keep your mappings and dedupe rules current.

Advertisement

Related Topics

#Power Query#CRM#Automation
U

Unknown

Contributor

Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.

Advertisement
2026-02-27T00:18:07.452Z