Fix SPF PermError: 10 DNS Lookup Limit

Your transactional domain authenticates fine in testing, then Outlook and corporate gateways start reporting spf=permerror and DMARC alignment collapses, because your SPF record now triggers more than ten DNS lookups. This guide fixes that on a live sending domain and expands on the SPF section of the SPF, DKIM, and DMARC authentication guide.

Every nested include spends part of the ten-lookup budget; flattening and consolidating includes brings evaluation back under the limit.

Root Cause: The RFC 7208 Ten-Lookup Limit

RFC 7208 §4.6.4 caps each SPF evaluation at 10 DNS-querying terms. Those terms are the include, a, mx, ptr, and exists mechanisms plus the redirect modifier; ip4, ip6, and all cost nothing. Critically, the limit applies to the whole evaluation, not to each record, so it is cumulative across nesting: when your record contains include:_spf.google.com, the receiver must resolve Google's record too, and every lookup inside it counts against your budget of ten.

This is why the limit is so easy to blow. A single ESP include can expand into several nested lookups:

include:sendgrid.net resolves to additional nested includes (u####.wl.sendgrid.net-style records).
include:_spf.google.com expands to three further includes (_netblocks, _netblocks2, _netblocks3).
include:spf.protection.outlook.com adds its own.

Stack two or three ESPs plus Google Workspace and you silently cross ten. When that happens, the evaluating server returns PermError, SPF is treated as a failure, and if you have no aligned DKIM, DMARC fails too. Outlook, Microsoft 365, and corporate gateways like Proofpoint treat PermError as a hard fail rather than ignoring it — which is what links this directly to your DMARC outcome and the DMARC enforcement rollout.
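
To make the cumulative arithmetic concrete, here is a minimal sketch that counts lookups over an in-memory table instead of live DNS. The domains and record contents in RECORDS are invented for illustration and do not reflect any provider's real SPF data.

```python
import re

# Hypothetical records for illustration only; not any provider's real SPF data.
RECORDS = {
    "example.test": "v=spf1 include:esp-a.test include:esp-b.test mx -all",
    "esp-a.test": "v=spf1 include:pool1.esp-a.test include:pool2.esp-a.test ~all",
    "pool1.esp-a.test": "v=spf1 ip4:192.0.2.0/24 ~all",
    "pool2.esp-a.test": "v=spf1 ip4:198.51.100.0/24 ~all",
    "esp-b.test": "v=spf1 a ip4:203.0.113.0/24 ~all",
}

LOOKUP_TERMS = {"include", "a", "mx", "ptr", "exists", "redirect"}


def count_lookups(domain, records=RECORDS):
    """Count DNS-querying terms for `domain`, following nested include/redirect targets."""
    total = 0
    for term in records[domain].split()[1:]:          # skip the "v=spf1" version tag
        # Split "include:x", "redirect=x", or "a/24" into a name and an argument.
        parts = re.split(r"[:=/]", term.lstrip("+-~?"), maxsplit=1)
        name = parts[0].lower()
        target = parts[1] if len(parts) > 1 else ""
        if name in LOOKUP_TERMS:
            total += 1                                # the term itself is one lookup
            if name in ("include", "redirect"):
                total += count_lookups(target, records)   # nested terms count too
    return total


print(count_lookups("example.test"))  # prints 6
```

The top-level record shows only three DNS-querying terms, yet the evaluation costs six lookups once the nested records are included. That gap is what makes the limit easy to cross unnoticed.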

Exact Fix

Step 1 — Count the lookups

Expand the record fully and count. A typical over-limit record:

; BEFORE: this record expands to 11+ DNS lookups -> PermError at Outlook/Microsoft 365.
yourdomain.com.  IN  TXT  "v=spf1 include:_spf.google.com include:sendgrid.net include:amazonses.com include:spf.protection.outlook.com include:servers.mcsv.net mx a -all"
;                                  |                       |                   |                     |                                  |                       | |
;                                  | (=3 nested)           | (=3+ nested)      | (=1)                | (=2 nested)                      | Mailchimp (=2)        | +-- 'a' lookup (=1)
;                                  +-- Google Workspace                                                                                                          +-- 'mx' lookup (=1)
; Sum already exceeds 10 before counting every nested record.

Step 2 — Remove unused includes

Most over-limit records carry dead weight: an ESP you migrated off, a Mailchimp include from a campaign that ended, an mx/a mechanism for hosts that never send. Removing include:servers.mcsv.net (Mailchimp) and the mx/a mechanisms — neither of which actually originates transactional mail — often drops you back under ten immediately.
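
As a rough sketch of this pruning step, the helper below drops named terms from a record string. The function name prune_spf is hypothetical, and the list of unused terms is something you have to establish from your own sending inventory, not something the code can infer.

```python
def prune_spf(record, unused):
    """Return `record` without any term whose body (qualifier stripped) is in `unused`."""
    drop = {u.lower() for u in unused}
    kept = [t for t in record.split() if t.lstrip("+-~?").lower() not in drop]
    return " ".join(kept)


before = (
    "v=spf1 include:_spf.google.com include:sendgrid.net include:amazonses.com "
    "include:spf.protection.outlook.com include:servers.mcsv.net mx a -all"
)
# Terms confirmed as non-sending for this domain (an assumption of the example).
after = prune_spf(before, ["include:servers.mcsv.net", "mx", "a"])
print(after)
```

Re-count the lookups after pruning; removing three terms here saves at least three lookups plus whatever the removed include nested.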

Step 3 — Consolidate to a single ESP include

The cleanest fix is to send all transactional mail through one provider so you need only one include. Consolidating onto Amazon SES, for example, reduces the ESP footprint to include:amazonses.com (one lookup). Your ESP selection and integration decisions are the lever here — fewer providers means fewer includes.

; AFTER (consolidation): one ESP include + Google Workspace = well under 10 lookups.
yourdomain.com.  IN  TXT  "v=spf1 include:amazonses.com include:_spf.google.com -all"
;                                  |                     |                       |
;                                  | Amazon SES (=1)     | Google Workspace (=4) +-- hard fail
;                                  +-- total ~5 lookups, comfortably under the limit

Step 4 — Flatten when consolidation is not possible

If you genuinely need multiple providers, flatten: resolve the includes to their underlying IP ranges and publish them directly as ip4/ip6 mechanisms, which cost zero DNS lookups.

; AFTER (flattening): includes replaced by literal IP ranges -> 0 lookups for those senders.
yourdomain.com.  IN  TXT  "v=spf1 ip4:23.249.208.0/20 ip4:198.37.144.0/20 include:amazonses.com -all"
;                                  |                    |                   |                     |
;                                  | flattened ESP-A    | flattened ESP-B   | SES kept dynamic    +-- hard fail
;                                  +-- literal ranges cost no lookups, but must be refreshed when the provider's IPs change
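
A flattened record can be assembled mechanically once the ranges are known. The sketch below assumes you have already resolved the provider ranges by some other means; the function name flatten and the RFC 5737 documentation ranges are placeholders, not real provider addresses.

```python
import ipaddress


def flatten(ranges, keep=(), policy="-all"):
    """Build an SPF record from literal CIDR ranges plus any includes kept dynamic."""
    nets = [ipaddress.ip_network(r) for r in ranges]
    # collapse_addresses merges adjacent or overlapping networks; it cannot mix
    # IPv4 and IPv6 in one call, so the two families are collapsed separately.
    v4 = ipaddress.collapse_addresses(n for n in nets if n.version == 4)
    v6 = ipaddress.collapse_addresses(n for n in nets if n.version == 6)
    terms = [f"ip4:{n}" for n in v4] + [f"ip6:{n}" for n in v6]
    return " ".join(["v=spf1", *terms, *keep, policy])


record = flatten(
    ["198.51.100.0/25", "198.51.100.128/25", "192.0.2.0/24", "2001:db8::/48"],
    keep=["include:amazonses.com"],
)
print(record)
# v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 ip6:2001:db8::/48 include:amazonses.com -all
```

Collapsing adjacent ranges keeps the record shorter, which matters because a flattened record grows with every provider you inline.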

Step 5 — Delegate via a subdomain

Alternatively, delegate a sender to its own subdomain with its own SPF record. Mail sent as notify.yourdomain.com is evaluated against notify.yourdomain.com's SPF, giving that sender a fresh budget of ten lookups and isolating its includes from the root domain.

Variant: Flattening Tradeoffs vs Delegation

Flattening trades correctness-over-time for lookup savings. The risk is IP churn: when SendGrid or another provider rotates its sending ranges, your flattened literals go stale and legitimate mail starts failing SPF. If you flatten, automate a scheduled job that re-resolves the provider's published record and rewrites your ip4/ip6 set, gated by review.
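
The refresh job can be reduced to a diff between what is published and what the provider currently advertises. This is a minimal sketch: it only compares ip4/ip6 terms textually, the record strings are placeholders, and fetching the fresh record is left to whatever resolver tooling you already use.

```python
def ip_terms(record):
    """Return the set of ip4:/ip6: terms in an SPF record, qualifier stripped."""
    terms = (t.lstrip("+").lower() for t in record.split())
    return {t for t in terms if t.startswith(("ip4:", "ip6:"))}


def drift(published, fresh):
    """Report which literal ranges must be added or removed to match `fresh`."""
    pub, new = ip_terms(published), ip_terms(fresh)
    return {"add": sorted(new - pub), "remove": sorted(pub - new)}


published = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 -all"
fresh = "v=spf1 ip4:192.0.2.0/24 ip4:203.0.113.0/24 ~all"
print(drift(published, fresh))
# {'add': ['ip4:203.0.113.0/24'], 'remove': ['ip4:198.51.100.0/24']}
```

Returning a diff rather than writing DNS directly is what keeps the job gated by review: a person approves the change before it is published.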

Subdomain delegation avoids churn entirely — the provider's include: stays dynamic — at the cost of running additional sending domains and configuring each ESP to use the delegated subdomain as its Return-Path. For most teams, delegation is the more durable choice and flattening is a stopgap when you cannot change provider configuration quickly.

A second, often-overlooked variant is the ptr mechanism. RFC 7208 says ptr should not be published, it costs a DNS lookup, and some receivers ignore it entirely; if an old record still carries ptr, removing it both reclaims a lookup and eliminates a mechanism that may be silently disregarded. Likewise, an mx mechanism counts as one lookup toward the ten, but resolving the MX hosts it returns triggers further address queries, which RFC 7208 caps separately at ten per mx mechanism; exceed that cap and the mechanism itself returns PermError. If those MX hosts do not originate your transactional mail, replace mx with explicit ip4/ip6 or an include for the actual sending infrastructure.

Watch also for the difference between a PermError and a TempError. A TempError is a transient DNS resolution failure and resolves itself; a PermError is structural — too many lookups, a malformed record, or a syntax error — and will not fix itself. If your monitoring shows intermittent failures, it is TempError (investigate DNS reliability); if it is constant, it is PermError (count your lookups).
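
Rather than inferring the error type from the failure pattern alone, you can read it from the Authentication-Results headers of delivered test messages. The sketch below tallies the spf= result across a batch of headers; the sample header strings are invented, and the three labels returned by diagnose are this example's own convention.

```python
import re
from collections import Counter

SPF_RESULT = re.compile(r"\bspf=(\w+)", re.IGNORECASE)


def tally(headers):
    """Count SPF results found in a list of Authentication-Results header values."""
    found = (SPF_RESULT.search(h) for h in headers)
    return Counter(m.group(1).lower() for m in found if m)


def diagnose(counts):
    """Map a tally to a next step, following the PermError/TempError distinction."""
    if counts["permerror"]:
        return "structural"   # will not clear on retry: count lookups, check syntax
    if counts["temperror"]:
        return "transient"    # DNS resolution failed: investigate DNS reliability
    return "healthy"


sample = [
    "mx.example.net; spf=pass smtp.mailfrom=yourdomain.com; dkim=pass",
    "mx.example.net; spf=temperror smtp.mailfrom=yourdomain.com",
    "mx.example.net; spf=pass smtp.mailfrom=yourdomain.com",
]
print(tally(sample), diagnose(tally(sample)))
```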

SPF Macros and the `exists` Lookup Trap

A subtler source of runaway lookups is the SPF macro system, most often seen via the exists mechanism. SPF macros let a record build a dynamic DNS query from message attributes — %{i} (the connecting IP), %{s} (the envelope sender), %{d} (the domain) — and exists: then checks whether that constructed hostname resolves. Providers use this for per-IP authorization:

; An exists macro: constructs a hostname from the sending IP and checks if it resolves.
; Each evaluation is one DNS lookup that counts against your budget of ten.
yourdomain.com.  IN  TXT  "v=spf1 exists:%{ir}.%{v}.arpa._spf.provider.example include:amazonses.com -all"
;                                 |                                                |
;                                 | exists = 1 lookup, evaluated per message      | SES include
;                                 +-- %{ir} reverses the IP octets, %{v} is in-addr/ip6

Two things make macros dangerous for your budget. First, an exists mechanism counts as a DNS-querying mechanism just like include, so a record stacking several exists clauses can exhaust the limit on its own. Second, macros are evaluated per message against the actual sending IP, so a record that counts fine in a static analyzer can still behave differently in production if a nested provider record uses macros you did not expand. When you count lookups, treat every exists, a, mx, ptr, include, and redirect as one — macros do not get a discount.
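
To see why a static analyzer cannot enumerate macro lookups in advance, it helps to expand one by hand. The sketch below handles only the %{i}, %{ir}, %{v}, and %{d} macros in their plain lowercase form; the full macro grammar in RFC 7208 (transformers, delimiters, URL-encoding) is deliberately left out, and the function name expand is hypothetical.

```python
import ipaddress


def expand(spec, ip, domain):
    """Expand a small subset of SPF macros for one connecting IP."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        parts, v = str(addr).split("."), "in-addr"
    else:
        # IPv6 addresses expand to dot-separated hex nibbles.
        parts, v = list(addr.exploded.replace(":", "")), "ip6"
    values = {
        "%{ir}": ".".join(reversed(parts)),   # reversed address
        "%{i}": ".".join(parts),              # address as-is
        "%{v}": v,                            # "in-addr" or "ip6"
        "%{d}": domain,                       # domain being evaluated
    }
    for macro, value in values.items():
        spec = spec.replace(macro, value)
    return spec


spec = "%{ir}.%{v}.arpa._spf.provider.example"
print(expand(spec, "192.0.2.1", "yourdomain.com"))
# 1.2.0.192.in-addr.arpa._spf.provider.example
print(expand(spec, "198.51.100.7", "yourdomain.com"))
# 7.100.51.198.in-addr.arpa._spf.provider.example
```

Each connecting IP produces a different hostname, so the query a receiver issues is only known at delivery time. The mechanism still costs exactly one lookup per evaluation.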

The redirect modifier is a related trap: it replaces your record with another domain's SPF entirely, and that target's lookups all count against your ten. redirect is occasionally used to centralize SPF across many domains onto one managed record, but if that central record is itself near the limit, every domain pointing at it inherits the PermError. Prefer include over redirect for sending sources you add on top of your own; reserve redirect for the deliberate case of fully delegating a domain's SPF to a managed authority.

Delegation in Depth: A Fresh Lookup Budget per Subdomain

Subdomain delegation (Step 5 above) is worth expanding because it is the most durable fix for a genuinely multi-sender domain. The mechanism is simple: SPF is evaluated against the envelope-sender (Return-Path) domain, not the visible From. If your ESP's Return-Path is notify.yourdomain.com, the receiver looks up SPF at notify.yourdomain.com — a completely separate record with its own fresh budget of ten lookups.

; Root domain: lean record, only senders that use the root as Return-Path.
yourdomain.com.        IN  TXT  "v=spf1 include:_spf.google.com -all"   ; Google Workspace only (=4 lookups)
; Delegated subdomain: the ESP's includes live here, isolated from the root budget.
notify.yourdomain.com. IN  TXT  "v=spf1 include:amazonses.com include:sendgrid.net -all"  ; ESP senders (=4 lookups)

The catch is that delegation only takes effect if each sender is actually configured to use the delegated subdomain as its custom Return-Path / MAIL FROM — set this in the ESP per the ESP selection and integration guide. For DMARC to still pass, the delegated subdomain must align with your From domain, which under relaxed alignment (the default) a subdomain of the organizational domain satisfies automatically. This is why delegation both fixes the lookup count and keeps DMARC alignment intact, whereas flattening fixes only the count and introduces IP-churn risk.
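
The alignment rule can be expressed as a small check. This sketch takes the organizational domain as an explicit argument, because deriving it properly requires the Public Suffix List; the function names are hypothetical.

```python
def _norm(domain):
    return domain.lower().rstrip(".")


def relaxed_aligned(mail_from_domain, header_from_domain, org_domain):
    """True when both domains sit at or under the same organizational domain."""
    org = _norm(org_domain)

    def under(d):
        d = _norm(d)
        return d == org or d.endswith("." + org)

    return under(mail_from_domain) and under(header_from_domain)


def strict_aligned(mail_from_domain, header_from_domain):
    """True only when the two domains match exactly."""
    return _norm(mail_from_domain) == _norm(header_from_domain)


# Delegated Return-Path, root From: passes relaxed alignment, fails strict.
print(relaxed_aligned("notify.yourdomain.com", "yourdomain.com", "yourdomain.com"))  # True
print(strict_aligned("notify.yourdomain.com", "yourdomain.com"))                     # False
# An ESP's default bounce domain is not aligned at all.
print(relaxed_aligned("bounce.esp.example", "yourdomain.com", "yourdomain.com"))     # False
```

If your DMARC record sets aspf=s (strict), a delegated subdomain no longer aligns through SPF and DKIM has to carry alignment instead.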

Pipeline Integration: Automated Lookup-Count Checks in CI

Manual counting drifts the moment a provider adds a nested include. Gate every DNS change behind an automated lookup-count check so a record that would PermError never ships.

# .github/workflows/spf-check.yml — fail the build before an over-limit SPF record ships.
name: SPF lookup count
on: [pull_request]
jobs:
  spf:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Count SPF DNS lookups
        run: |
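          # checkdmarc resolves the record and prints a JSON report.
          # spf.valid is false when the record fails validation, which includes
          # exceeding the RFC 7208 limit of 10 lookups; jq -e turns that into a
          # non-zero exit. Field names assume checkdmarc's single-domain JSON output.
          # --skip-tls avoids SMTP probes, which hosted runners typically block.
          pip install checkdmarc
          checkdmarc --skip-tls yourdomain.com | jq -e '.spf.valid == true'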

If you want the check self-contained rather than dependent on an external tool, a small script that resolves includes recursively and counts DNS-querying terms makes the failure condition explicit:

# spf_lookup_count.py: fail CI if the live SPF record exceeds RFC 7208's 10-lookup limit.
# Requires dnspython (pip install dnspython).
import re
import sys

import dns.resolver

COUNTING = ("include", "a", "mx", "ptr", "exists", "redirect")  # terms that cost a lookup

def spf_record(domain):
    for r in dns.resolver.resolve(domain, "TXT"):
        txt = b"".join(r.strings).decode()     # rejoin multi-string TXT records
        if txt.lower().startswith("v=spf1"):
            return txt
    return ""                                  # no SPF record published

def count(domain, path=()):
    if domain in path:                         # guard against include loops
        return 0
    total = 0
    for term in spf_record(domain).split()[1:]:
        # Split "include:x", "redirect=x", and "a/24" into a name and an argument.
        parts = re.split(r"[:=/]", term.lstrip("+-~?"), maxsplit=1)
        name = parts[0].lower()
        arg = parts[1] if len(parts) > 1 else ""
        if name in COUNTING:
            total += 1                         # the term itself is one lookup
            # Macro targets cannot be resolved statically, so they are not followed.
            if name in ("include", "redirect") and arg and "%" not in arg:
                total += count(arg, path + (domain,))   # nested lookups count cumulatively
    return total

n = count(sys.argv[1])
print(f"{sys.argv[1]} resolves to {n} SPF DNS lookups")
sys.exit(1 if n > 10 else 0)   # non-zero exit fails the CI job before the record ships

Run either check on a schedule, not just on pull requests, so that provider-side include changes that silently push you over ten are caught before mailbox providers start returning PermError. Surface failures through your webhook event pipelines so an SPF regression pages the on-call engineer.
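
For the scheduled run, it can help to alert before the limit is actually crossed. The sketch below turns a lookup count into an alert payload; the warn_at threshold of 8 and the payload field names are illustrative assumptions, not part of any webhook schema.

```python
def lookup_alert(domain, lookups, limit=10, warn_at=8):
    """Build an alert payload; `warn_at` and the field names are illustrative choices."""
    if lookups > limit:
        status = "fail"      # the record would PermError
    elif lookups >= warn_at:
        status = "warn"      # still valid, but little headroom left
    else:
        status = "ok"
    return {"domain": domain, "lookups": lookups, "limit": limit, "status": status}


print(lookup_alert("yourdomain.com", 9))
# {'domain': 'yourdomain.com', 'lookups': 9, 'limit': 10, 'status': 'warn'}
```

A "warn" result might open a ticket while "fail" pages the on-call engineer, so a provider adding one nested include does not become an outage.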

Validation Checklist

The expanded SPF record resolves to 10 or fewer cumulative DNS lookups.
Unused includes (former ESPs, ended campaigns) and non-sending mx/a mechanisms are removed.
Transactional mail is consolidated to as few ESP includes as practical.
Any flattened ip4/ip6 ranges have an automated refresh job to handle IP churn.
A test message to Outlook and a corporate gateway returns spf=pass, not permerror.
DMARC still passes via aligned SPF or DKIM after the change.
A CI and scheduled lookup-count check blocks any record that would exceed the limit.

SPF, DKIM, and DMARC for transactional email — the full authentication context and SPF record anatomy
Configuring a DMARC policy from p=none to p=reject — why an SPF PermError can break DMARC during enforcement
ESP selection and integration — consolidating providers to reduce required SPF includes
Webhook event pipelines — alerting when a scheduled SPF check regresses
