dev · June 21, 2026 · 10 min read

Salesforce and Databricks for Agentforce: Zero Copy, Federated Search, and Governed Agent Grounding in 2026

At Data + AI Summit 2026, Salesforce and Databricks showed what happens when Agentforce agents get governed access to the lakehouse. Here is what is live now, what is coming in H2 2026, and how to plan your data architecture around it.

[Figure: Salesforce and Databricks expanded partnership for Agentforce agents: Zero Copy, federated search, and governed agent grounding announced at Data + AI Summit 2026]
By Dipojjal Chakrabarti · Founder & Editor, Salesforce Dictionary · Last updated Jun 21, 2026

Your Agentforce agent gets asked which open opportunities belong to accounts that a churn model scored as high-risk last night. The pipeline data is right there in Salesforce. The agent answers that part in a heartbeat. Then it stops. The churn scores live in a Databricks notebook, the agent has no path to them, and it can only see half the question.

That gap is the whole story. The CRM knows who your customers are. The lakehouse knows what they are about to do. For years those two halves have lived in separate systems with a brittle pipeline stitched between them. At Data + AI Summit in June 2026, Salesforce and Databricks announced a set of capabilities meant to close the gap for agents specifically. Some of it shipped. Some of it is a slide with a date on it.

Here is what is real today, what is coming later in 2026, and how to plan your data architecture so you are ready either way.

Why enterprise data is still split

Walk into almost any enterprise and you will find the same division of labor. Salesforce holds the system of record: accounts, contacts, opportunities, cases, the operational truth of who your customers are and what you have done with them. Databricks holds the system of analysis: churn models, propensity scores, feature sets, event telemetry, the things you compute about customers rather than record about them.

This split is not an accident. The two systems are good at different jobs. A CRM is built for transactional reads and writes against well-shaped business objects. A lakehouse is built for large-scale compute over messy, high-volume data. Asking either one to do the other's job ends badly.

The trouble starts when you need both halves in the same answer. The classic fix is a pipeline. You run a nightly job that copies churn scores out of Databricks and writes them onto Account records as a custom field. It works, sort of. Then the model gets retrained and the field schema drifts. The job fails at 2 a.m. and nobody notices until a sales rep asks why every account suddenly shows a churn risk of null. You now maintain a second copy of data that disagrees with the first copy on a schedule you cannot fully predict.

[Figure: The enterprise data gap Agentforce agents cannot cross alone]

Zero Copy is the alternative approach underneath this entire announcement. Instead of moving the data, you query it where it lives. The bytes stay in Databricks. Salesforce holds the metadata and the query path, not a duplicate of the rows. No nightly job. No second copy. No drift.

What Zero Copy already does

Zero Copy federation between Salesforce and Databricks is not new. It went GA well before June 2026, and if you have read the Data Cloud Zero Copy guide you already know the mechanics. The short version: Data Cloud, now branded Data 360, uses Apache Iceberg as a shared table format. Databricks exposes its tables through Unity Catalog, Data 360 registers them as external objects, and queries run against the source.

So the data federation problem was solved. You could already surface a Databricks table inside Data 360 and use it in segments, calculated insights, and reports. What you could not do cleanly was hand that data to an Agentforce agent as trusted grounding.

That is the specific gap the June announcement targets. An agent does not just need the churn score. It needs to know the score is current, where it came from, who owns it, and whether it is safe to repeat to a customer-facing user. A raw federated table gives the agent the number with none of that context. The agent ends up either ignoring the data or quoting it without any sense of whether it should. Governed grounding is the missing layer between "the data is technically reachable" and "the agent can responsibly use it."

The June 2026 partnership expansion

At Data + AI Summit, held June 16 to 18, 2026, the two companies announced four capabilities. They are not all at the same maturity, which matters a great deal for planning, so read the GA versus roadmap section right after this one before you commit to anything.

[Figure: What the June 2026 Databricks partnership expansion adds]

Governed Business Context on Zero Copy. This is the headline. Structured metadata from Unity Catalog now flows into Agentforce agent prompts as trusted grounding. The agent does not just receive the churn score. It receives the governance context around it: the definition of the metric, its owner, its sensitivity classification, and its freshness. The agent can use the number because it knows what the number is.

Bi-directional Federated Search. Agents and users get to search across Salesforce Data 360 and Databricks in a single query. Instead of asking the agent to look in one place and then the other, you ask once and results from both surfaces come back ranked together. The agent stops being a switchboard operator who has to know which system holds the answer.

MuleSoft Agent Scanner for Databricks. This one solves a quieter failure mode. Agents fail not only when they lack data but when they do not know the data exists. The scanner crawls your Databricks workspaces, discovers the available data assets, and indexes them so the agent's tool selection can actually find them. Fewer "the agent did not know what to call" misses.

Slack Genie App for Databricks. A Slack-native interface where users ask Genie questions that pull from Salesforce and Databricks Unity Catalog at the same time. The point is to put the combined data where people already work instead of forcing them into yet another console.
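To make "ranked together" concrete, here is a minimal Python sketch of merging hits from two sources into one ranked list. Everything in it is an assumption for illustration: the `Hit` type, the source labels, the sample titles, and the min-max normalization are hypothetical, and none of it describes how the announced federated search ranks results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    source: str   # hypothetical label, e.g. "data360" or "databricks"
    title: str
    score: float  # raw relevance, only comparable within a single source


def normalize(hits: list[Hit]) -> list[Hit]:
    """Min-max scale scores to [0, 1] so hits from two sources can be compared."""
    if not hits:
        return []
    lo = min(h.score for h in hits)
    hi = max(h.score for h in hits)
    span = hi - lo
    return [
        Hit(h.source, h.title, 1.0 if span == 0 else (h.score - lo) / span)
        for h in hits
    ]


def federated_rank(crm_hits: list[Hit], lake_hits: list[Hit], limit: int = 5) -> list[Hit]:
    """Merge both result sets and return the top hits by normalized score."""
    merged = normalize(crm_hits) + normalize(lake_hits)
    return sorted(merged, key=lambda h: h.score, reverse=True)[:limit]


# Hypothetical sample results from each side.
crm = [
    Hit("data360", "Acme renewal opportunity", 12.0),
    Hit("data360", "Acme support case", 4.0),
]
lake = [
    Hit("databricks", "Acme churn score", 0.9),
    Hit("databricks", "Acme usage telemetry", 0.3),
]

results = federated_rank(crm, lake)
for hit in results:
    print(f"{hit.score:.2f}  {hit.source:<10}  {hit.title}")
```

The design point is the normalization step: raw scores from two engines are not on the same scale, so some reconciliation has to happen before a single ranking means anything.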

What is GA versus roadmap

This is where you have to read the fine print, because the announcement leaned heavily on roadmap items whose ship dates are still in the future.

Here is the honest breakdown as of June 2026:

  • Zero Copy federation with Databricks: GA. Generally available since the 2024 to 2025 timeframe, well before this announcement. This is the foundation, not the news.
  • Governed Business Context: GA. Per the announcement, this is available now. It is the one genuinely new thing you can act on today.
  • MuleSoft Agent Scanner for Databricks: GA. Also available now, per the announcement.
  • Bi-directional Federated Search: coming H2 2026. Roadmap. A date, not a feature you can switch on.
  • Slack Genie App for Databricks: coming H2 2026. Also roadmap.

Two of the four headline items are futures. That is not a knock on the partnership, but it changes how you should treat the announcement. If your plan depends on federated search or the Slack Genie app, your plan depends on a quarter that has not happened yet, against a track record where second-half dates have a way of becoming next-year dates. Build against what is GA. Treat the rest as a reason to prepare, not a reason to promise anything to your stakeholders.

What this means in practice for architects

Strip away the launch energy and a few concrete decisions land on your desk.

Start with the connector, not the agent. None of this works if Data 360 cannot read Databricks in the first place. Get the Apache Iceberg connector working and registering Unity Catalog tables before you write a single agent instruction. If you skip this and go straight to agent configuration, you will spend a week debugging the wrong layer.

Then sort out the governance question, because it has two owners. Unity Catalog owns the data classification, lineage, and access policy on the Databricks side. The Salesforce trust layer owns what the agent is allowed to do with that data once it arrives. These are different jobs held by different teams who, in my experience, do not talk to each other nearly enough. Decide explicitly who owns what before the first sensitive field reaches a prompt.

[Figure: Governed agent grounding: from lakehouse to prompt]

The prompt-grounding question is the one people get wrong because it sounds abstract. "Governed business context" shows up as structured metadata injected into the agent's instructions alongside the data value. The agent does not see a bare number. It sees the number plus its definition, owner, sensitivity tag, and freshness, so it can decide whether and how to use it. The metadata is the difference between an agent that quotes a churn score confidently and one that quotes it correctly.
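A minimal Python sketch of that idea: a value that travels with its definition, owner, sensitivity tag, and freshness, rendered as a grounding block. The `GovernedValue` type, the field names, the sensitivity labels, and the output format are all hypothetical. This is not the format Salesforce or Unity Catalog actually uses, only an illustration of "the number plus its context."

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class GovernedValue:
    name: str
    value: float
    definition: str
    owner: str
    sensitivity: str       # hypothetical tags: "public", "internal", "restricted"
    refreshed_at: datetime


def render_grounding(item: GovernedValue, now: datetime, max_age: timedelta) -> str:
    """Render the value and its governance context as plain text for a prompt."""
    lines = [
        f"metric: {item.name}",
        f"value: {item.value}",
        f"definition: {item.definition}",
        f"owner: {item.owner}",
        f"sensitivity: {item.sensitivity}",
        f"refreshed_at: {item.refreshed_at.isoformat()}",
    ]
    if now - item.refreshed_at > max_age:
        lines.append("warning: value is older than the allowed freshness window")
    if item.sensitivity == "restricted":
        lines.append("instruction: do not repeat this value to customer-facing users")
    return "\n".join(lines)


# Hypothetical churn score, refreshed at 02:00 UTC and read at 09:00 UTC.
score = GovernedValue(
    name="churn_risk",
    value=0.87,
    definition="Probability the account churns within 90 days",
    owner="data-science-team",
    sensitivity="internal",
    refreshed_at=datetime(2026, 6, 21, 2, 0, tzinfo=timezone.utc),
)
now = datetime(2026, 6, 21, 9, 0, tzinfo=timezone.utc)

print(render_grounding(score, now, max_age=timedelta(hours=24)))
```

Tightening `max_age` to one hour would append the staleness warning, which is the kind of signal a bare number cannot carry.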

[Figure: Zero Copy: agents query where the data already lives]

The common mistake is assuming the agent can now query all of Databricks. It cannot. It queries what is in Unity Catalog, and only the assets that have been properly registered and exposed to Data 360. A table that exists in a notebook but was never registered is invisible to the agent, governance metadata and all. The catalog is the boundary. If it is not in the catalog, it does not exist as far as your agent is concerned.
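The boundary can be stated as a simple filter. The sketch below is illustrative only: the `Asset` type, its two flags, and the table names are hypothetical and stand in for whatever registration and exposure state your environment actually tracks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    in_unity_catalog: bool     # registered in Unity Catalog
    exposed_to_data360: bool   # surfaced to Data 360 as an external object


def visible_to_agent(assets: list[Asset]) -> list[str]:
    """An asset is reachable only if it is both registered and exposed."""
    return [a.name for a in assets if a.in_unity_catalog and a.exposed_to_data360]


# Hypothetical inventory.
assets = [
    Asset("main.ml.churn_scores", True, True),
    Asset("main.ml.propensity_scores", True, False),   # registered, not exposed
    Asset("notebook_scratch_table", False, False),     # never registered
]

visible = visible_to_agent(assets)
print(visible)
```

Only the first asset passes both checks. The other two may hold perfectly good data, but from the agent's side they do not exist.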

When not to use Zero Copy for Agentforce

Zero Copy is a federation pattern, not a universal answer. There are cases where grounding an agent on it will hurt you.

Real-time streaming data is the first. Iceberg is a table format, not a streaming protocol. If your use case needs the agent to react to an event the instant it happens, federation against an Iceberg snapshot is the wrong tool. The snapshot is only as fresh as its last metadata refresh.

High-cardinality lookups inside a single agent turn are the second. If answering one question forces the agent to scan millions of rows in Databricks while a user waits, you have built a slow agent. Federation does not make a big scan cheap. It just moves where the scan runs.

The third is when the Databricks data changes faster than the Iceberg metadata refreshes. If your scores update every few minutes but the catalog snapshot refreshes hourly, the agent can serve numbers that are up to roughly an hour old, and without freshness metadata it has no way of knowing. The freshness metadata lets the agent flag the staleness, but it does not make the data fresh.

The fourth is any case that needs sub-second latency in the agent response. Federated queries cross a network boundary into another platform. That round trip has a floor. If your interaction budget cannot absorb it, precompute the value and store it locally instead. Ingestion is not a dirty word when latency is the constraint.
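The four cases above can be turned into a rough pre-flight check. This Python sketch is an assumption-heavy illustration: the `UseCase` fields and both thresholds are hypothetical placeholders, not measured limits of Zero Copy, and you would replace them with numbers from your own environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    needs_event_level_reaction: bool   # must react the instant an event happens
    rows_scanned_per_turn: int         # rows one agent answer forces Databricks to scan
    source_update_minutes: float       # how often the source data changes
    snapshot_refresh_minutes: float    # how often the Iceberg snapshot refreshes
    latency_budget_ms: float           # time the agent has to respond


# Hypothetical thresholds for illustration; measure your own.
MAX_ROWS_PER_TURN = 1_000_000
FEDERATION_LATENCY_FLOOR_MS = 1_000


def zero_copy_concerns(uc: UseCase) -> list[str]:
    """Return the reasons, if any, that federation is a poor fit for this use case."""
    concerns = []
    if uc.needs_event_level_reaction:
        concerns.append("real-time: a table snapshot is not an event stream")
    if uc.rows_scanned_per_turn > MAX_ROWS_PER_TURN:
        concerns.append("scan size: a large scan stays slow wherever it runs")
    if uc.source_update_minutes < uc.snapshot_refresh_minutes:
        concerns.append(
            f"staleness: answers can lag the source by up to about "
            f"{uc.snapshot_refresh_minutes:g} minutes"
        )
    if uc.latency_budget_ms < FEDERATION_LATENCY_FLOOR_MS:
        concerns.append("latency: the cross-platform round trip exceeds the budget")
    return concerns


# Scores update every 5 minutes, snapshot refreshes hourly.
fast_scores = UseCase(False, 50_000, 5, 60, 3_000)
# Scores update nightly, snapshot refreshes hourly.
nightly_scores = UseCase(False, 10_000, 1_440, 60, 5_000)

for label, uc in [("fast_scores", fast_scores), ("nightly_scores", nightly_scores)]:
    print(label, zero_copy_concerns(uc) or "no concerns flagged")
```

An empty result does not prove federation is the right choice. It only means none of these four warning signs fired.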

What to do before H2 2026

The two roadmap items are coming. The work that makes them useful is work you can do now. Here is the prep list, in order.

Do those five and you will be ready the day federated search and the Slack Genie app actually ship, rather than starting from zero when the press release lands.

The single most useful thing you can do this week is the catalog audit in step one. Open Unity Catalog, list every asset an agent might reasonably need, and write down who owns each and whether its sensitivity is tagged. That document is the foundation for everything else here, and it is the one piece no vendor announcement will hand you.
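If the audit lives in a spreadsheet or an export, a few lines of Python can flag the gaps. The sketch below assumes a hand-built inventory of plain dictionaries with hypothetical `name`, `owner`, and `sensitivity` keys. It does not call any Databricks or Salesforce API, and the asset names are made up.

```python
REQUIRED_FIELDS = ("owner", "sensitivity")


def audit_catalog(assets: list[dict]) -> list[tuple[str, list[str]]]:
    """Return (asset name, missing fields) for every entry with a gap."""
    gaps = []
    for asset in assets:
        # A field counts as missing if the key is absent or the value is empty.
        missing = [field for field in REQUIRED_FIELDS if not asset.get(field)]
        if missing:
            gaps.append((asset["name"], missing))
    return gaps


# Hypothetical inventory of assets an agent might reasonably need.
inventory = [
    {"name": "main.ml.churn_scores", "owner": "data-science-team", "sensitivity": "internal"},
    {"name": "main.ml.propensity_scores", "owner": "", "sensitivity": "internal"},
    {"name": "main.events.telemetry", "owner": "platform-team"},
]

gaps = audit_catalog(inventory)
for name, missing in gaps:
    print(f"{name}: missing {', '.join(missing)}")
```

Every line this prints is an asset that should get an owner or a sensitivity tag before it reaches a prompt.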

About the Author

Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.
