Agentforce Headless 360: The API Quota & Security Risks Nobody Mentions

I’ve spent the last few weeks going deep on the Headless 360 documentation and auditing early deployment patterns. And I keep having the same conversation with architects who built it and leadership who approved it where I ask one question and get the same uncomfortable pause.

“What’s your API quota strategy for when the agent is live?”

Silence. Then: “We assumed it would be fine.”

That assumption is the problem. Headless 360, announced at Salesforce TDX in April 2026, is a genuinely significant platform shift - it opens your entire CRM to AI agents via APIs, MCP tools, and CLI commands, no browser required. The marketing is compelling. The demo is clean. What the launch deck doesn’t show you is what happens on day one when real users start talking to your agent, or what happens when someone figures out your agent will do whatever it’s told by anyone.

This post is for the architect who built the thing and knows something feels slightly wrong, and for the CTO who just signed off on the rollout. Two things are coming for your org that nobody in the partner ecosystem is talking about loudly enough. Let’s go through both.

The Quota Wall You’ll Hit Before Lunch

Salesforce Enterprise Edition gives you 100,000 API calls per 24-hour rolling window. That quota is shared - every integration, every sync job, every automation in your org pulls from the same bucket.

Here’s what nobody models before go-live: an AI agent is not a user. It doesn’t know where things live. Every single time someone asks it a question, it has to learn your org before it can answer. It fires off calls to list objects, fetch field definitions, resolve relationships - all before it touches the actual business question. This is the ReAct loop (Reasoning and Acting) in action: the model thinks out loud, using your API quota as its scratchpad.

I’ve been reviewing early Headless 360 deployment patterns and the schema discovery phase alone before any business logic runs routinely generates multiple API calls per conversational turn. That’s not a bug. That’s how the model works.

Now do the math for your org:

Interaction Type	API Behaviour	Quota Risk
Standard UI Navigation	Zero — internal routing	None
Batched Middleware Sync	~200 records per call	Negligible
LLM Conversational Prompt	Multiple calls per turn: schema discovery + query + resolution	Severe at scale

Scale a busy sales team asking pipeline questions throughout the morning, and your quota isn’t a weekly concern. It’s gone before your ERP sync runs at noon. And when it hits zero, Salesforce doesn’t triage by importance your billing pipeline, your automated alerts, your back-office integrations all freeze simultaneously. The agent burned the shared budget learning your schema. Everything else pays the price.

In simple words, your AI pilot can take down your production operations. On a Tuesday. In the first week.

The Security Problem Nobody Wants to Say Out Loud

In July 2025, a security researcher submitted a Web-to-Lead form to a standard Salesforce org. Inside the Description field - just a normal text box, 42,000 characters wide open were hidden instructions disguised as a legitimate prospect inquiry.

When an internal employee later asked the company’s AI agent to check the lead and respond, the agent followed both sets of instructions: the employee’s and the attacker’s. It pulled contact records from the CRM and sent them to a domain the attacker had purchased for $5 - a domain Salesforce’s own security policy still trusted because it had expired unnoticed.

Official severity score: 9.4 out of 10 Critical.

Salesforce patched it. But here’s what the patch didn’t fix: the agent behaved exactly as designed. It couldn’t distinguish between data it should read and instructions it should follow. That gap between data and commands is something humans navigate instinctively. AI agents don’t. The OWASP Foundation ranks this class of attack, prompt injection, as the single most critical vulnerability in AI applications today, with real-world success rates between 50% and 84% depending on model configuration.

Now layer in what I keep seeing in fast-moving Headless 360 deployments: one shared admin credential, handed to the agent as a permanent key, because the demo worked and nobody had time to set up proper per-user authentication.

That credential is God-mode. The agent sees everything in your org — no field-level security, no sharing rules, no record-level restrictions. When a user manipulates the agent through a prompt injection, it doesn’t just see what they’re allowed to see. It sees what the admin account sees. Executive compensation. Confidential pipeline data. Financial records it has no business surfacing.

Auth Model	How It Works	What the Agent Can Access
External Client App (ECA)	Per-user OAuth 2.0 — enforces field-level security and sharing rules	Only what that specific user is permitted to see
Shared Integration User / JWT	One server-to-server credential for all sessions	Everything in the org, no exceptions

A researcher spent $5 and demonstrated this in production. The next person might not publish their findings.

The Architecture That Fixes Both: ROWP

Both problems share the same root cause - treating Salesforce like an unbounded analytical playground instead of a production system with hard limits.

The pattern that solves this is called ROWP: Read Off-Platform, Write On-Platform. It’s built on the principle of CQRS (Command Query Responsibility Segregation), which simply means: physically separate how you read data from how you write it. Don’t use the same system, the same quota, or the same credentials for both workloads.

ROWP Architecture Pattern

In practice:

Use Change Data Capture (CDC) to asynchronously stream your CRM data out to a decoupled read replica - Heroku Postgres, an external vector database, your choice. Think of it as a Digital Garden: a sovereign, off-platform copy of your data that the AI can explore freely without ever touching Salesforce. The agent reads and reasons against this replica as many times as it needs schema discovery, exploratory queries, reasoning loops - consuming zero Salesforce API tokens. It only crosses back to the core CRM when it has a final, verified, deterministic write to execute.

Two guardrails make this production-safe:

1. Named Query API Never give an LLM raw SOQL access. Expose parameterized, immutable query endpoints as bounded MCP tools instead. The agent passes strongly typed inputs into pre-compiled, indexed queries no open-ended exploration, no table scans, no schema discovery burning your quota at runtime.

2. AI Gateways Route agent traffic through an intelligent control plane. The MuleSoft AI Gateway enforces rate limits and token budgets, stopping the aggressive retry loops LLMs generate when they hit errors. Pair it with behavioural monitoring on your MCP payloads to catch prompt injection patterns before they reach your data.

Layer	Tool	What It Protects Against
Data reads	CDC + Read Replica (Heroku Postgres / Vector DB)	Quota exhaustion, live database exposure
Query access	Named Query API as bounded MCP tools	Open-ended queries, table scans
Traffic control	MuleSoft AI Gateway	Retry loops, token budget overruns
Payload security	MCP behavioural monitoring	Prompt injection before it reaches data

There’s a performance bonus most teams don’t model for: when you control the structure of what the agent reads, you can organise your data so the AI reuses context from previous requests. Major LLM providers cache repeated prompt prefixes. If your static schema sits consistently at the start of every prompt, you’re not paying to re-explain your data model on every call. In high-volume deployments, that single structural decision can cut your AI inference costs dramatically.

The Architect’s Note

The hardest part of this conversation isn’t technical. It’s that the shared credential and the open database access were chosen because they were fast the demo worked, the stakeholders were happy, and nobody had time for the harder plumbing before the launch date.

ROWP takes more upfront work. But the org that skips it is one Web-to-Lead form away from a 9.4 severity incident - and the attacker writing the next one may not be interested in publishing a responsible disclosure report.

💬 Let’s Argue

Headless 360 is a real shift. The productivity it unlocks for sales teams, service teams, and operations is genuine and I’m not here to talk you out of it.

But there’s a version of this rollout that goes badly where a quota wall freezes production operations on day three, or where a $5 domain registration turns your integration user into someone else’s master key. That version is the one where the demo worked and nobody asked the uncomfortable questions before go-live.

I keep asking those questions. The pause I get in response tells me most teams haven’t asked them yet either.

You’ve approved the Agentforce rollout. Your integration account has access to everything. What’s your plan for the prompt injection that hasn’t been written yet?

The Quota Wall You’ll Hit Before Lunch#

The Security Problem Nobody Wants to Say Out Loud#

The Architecture That Fixes Both: ROWP#

The Architect’s Note#

💬 Let’s Argue#

📩 Join the Architecture & AI Newsletter

Join the Newsletter

The Quota Wall You’ll Hit Before Lunch

The Security Problem Nobody Wants to Say Out Loud

The Architecture That Fixes Both: ROWP

The Architect’s Note

💬 Let’s Argue