Case Study · BizOps · GTM Strategy · Shippo · 2026 · Live pilot

GTM teams were building the same AI tools twice.
I shipped a 5-minute system that stops it before it starts.

A real collision between two overlapping AI builds (a partner-recommender tool and a Databricks predictive model) kicked off this work. Nine days into the pilot at Shippo: 11 GTM AI tools surfaced, 3 formally registered, 4 risk and cost flags actively tracked, and a single intake path replacing ad-hoc Slack threads.

Stephen Fung
BizOps / GTM Strategy Ops · AI Ops DRI
The Problem

Three people, three workstreams, zero coordination.

In a single Slack thread, a teammate shared a Claude-built XMS Graduation Qualifier, a partner-recommender tool. Separately, a Databricks-based predictive model covering related territory was already in progress. A third person suggested layering in Salesforce data. Three workstreams, overlapping signals, no shared registry, no one knowing who owned what.

Left ungoverned, this pattern compounds: people build the same thing twice, tools get abandoned when owners leave, and Anthropic API spend stacks up with no visibility. I designed and shipped a lightweight governance system (one rule, three tiers, a 5-minute intake card) to catch overlap in 24 hours instead of 24 weeks.

  • Intake card: 5 min, from idea to registered
  • Tools surfaced: 11 (3 registered · 6 watched · 2 flagged)
  • Tiers: 3, by cost & blast-radius (T1 <$50 · T2 <$500 · T3 above)
  • T2 sign-off SLA: 48 hr (5 business days for T3)

Pilot launched April 21, 2026 in #test-gtm-ai-ops with 4–5 GTM users. Numbers reflect actual pilot state. Quarterly audit cadence begins after the pilot graduates.

The Framework

Five steps. One rule. One source of truth.

Before you build anything AI-powered for GTM, you check the registry and complete the intake card. Full stop.

01 Search Registry · Does this exist?
02 Intake Card · 5 fields · 5 min
03 Tier Check · T1 / T2 / T3
04 Sign-off · SLA by tier
05 Register & Audit · Quarterly review

Workspace Tour


Pilot Mode Owner: BizOps · GTM Strategy Ops · Registry v1.3 · Apr 2026

GTM AI Tool Governance Playbook

The system, the one rule, and how it runs at Shippo.

Why this exists

Without governance, AI tools sprawl. People build the same thing twice, no one knows what already exists, tools get abandoned when owners leave, and Anthropic API and infra spend compounds with no visibility. This playbook turns AI adoption from chaotic to compounding, without becoming bureaucracy.

The one rule

Before you build anything AI-powered for GTM, you check the registry and complete the intake card. Full stop.

Roles & SLAs

Role · Responsibility · Cadence / SLA
AI Ops DRI · Owns registry, T2/T3 sign-off, quarterly audit, real-time overlap flagging · ~2 hrs/month steady-state
Tool DRI · Keeps the tool alive, cost accurate, registry entry current · Quarterly check-in
Data/Eng Contact · Required for any T3 tool; architecture & integration review · 5 business days
T2 sign-off · AI Ops DRI only · 48 hours
T1 · No approval; register within 1 week of going live · n/a

The 5-step process

Search Registry → Intake Card → Tier Check → Sign-off → Register & Audit. Most T1 tools clear the path in under 10 minutes. T2 tools clear in under 48 hours. T3 tools require an architecture pass and budget pre-approval; that's the design intent, not a bug.
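Step 1 reduces to an overlap lookup against the registry. A minimal sketch of that gate: the registry shape, tag fields, and matching rule below are illustrative assumptions, not Shippo's actual schema; only the tool name comes from this case study.

```python
# Hypothetical registry entries; in practice this would be the shared
# registry the playbook describes, not an in-memory list.
REGISTRY = [
    {"name": "XMS Graduation Qualifier", "tags": {"partner", "recommender"}},
]

def find_overlap(proposed_tags: set) -> list:
    """Return names of registered tools sharing any tag with the proposal."""
    return [tool["name"] for tool in REGISTRY if tool["tags"] & proposed_tags]
```

A hit here means extend or justify per the decision criteria; a miss means you proceed to the intake card.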

Decision criteria

  • Tier matches data sensitivity and integration depth (writes to Salesforce/Databricks = T3)
  • Cost estimate provided; "I don't know" is not acceptable, but order-of-magnitude is fine
  • No materially overlapping registered tool, unless the extension is justified
  • One named DRI accountable for the tool; teams don't own tools, people do

Cost triggers

  • Tool hits 2× its estimated cost → DRI notifies AI Ops DRI within 1 week
  • Tool exceeds $200/mo → re-tiered to T2 minimum
  • Tool exceeds $500/mo → automatic escalation to T3 review
  • <3 active users at >$100/mo → flagged for sunset or consolidation
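The four triggers above can be read as one small evaluation pass. A hedged sketch with flag labels invented for illustration; the thresholds come from the list, everything else is mine:

```python
def cost_flags(estimated_monthly: float, actual_monthly: float, active_users: int) -> list:
    """Evaluate the four cost triggers; returns zero or more flag labels."""
    flags = []
    if actual_monthly >= 2 * estimated_monthly:
        flags.append("notify-ai-ops-dri")      # DRI notifies within 1 week
    if actual_monthly > 500:
        flags.append("escalate-to-t3-review")  # automatic escalation
    elif actual_monthly > 200:
        flags.append("retier-to-t2-minimum")
    if active_users < 3 and actual_monthly > 100:
        flags.append("flag-for-sunset")        # sunset or consolidation
    return flags
```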

What this is not

This isn't a tool blocker. The intake takes 5 minutes. Most T1 builds need zero approval. The bottleneck before this program was indecision and silent duplication, not process.

Why we publish a watch list

Passive scans of #shippo-does-ai surface tools people built without filing intake. Watch list ≠ shame list; it's how we close the gap between "I built something cool" and "everyone can find it."

Appendices

  • Intake card template
  • Tier rubric (cost & integration thresholds)
  • Quarterly audit checklist
  • DRI offboarding handoff

Reflection

What I'd do differently.

A pilot is the cheapest place to be wrong. These are the calls I'd revisit if I were starting again, and the ones I already am.

I'd ship the watch list before the registry.

The first scan of #shippo-does-ai surfaced 6 tools nobody had filed. Passive discovery turned out to be more valuable in week one than active intake. Next time, I'd build the scanner first and let it create the demand for governance.

T3 formal review is still being designed, and that's deliberate.

A pilot with 4–5 users will never produce a real T3 case. Designing the T3 process in the abstract would have produced bureaucracy nobody asked for. I deferred it explicitly in the playbook, with a flag-it-to-the-DRI fallback until the first real T3 lands.

I underestimated the demand for a "did this already get built?" lookup.

The Shippo AI Dedup Tool, a Streamlit app that searches Slack/Confluence/Databricks before you build, was a reaction to pilot-week-one feedback, not part of the original spec. If I'd interviewed 3 GTM engineers up front, I'd have built it in parallel from day one.