What runs on your device
- Page-context engine: a pure, deterministic action selector. Runs on both server and client.
- Heuristic enrichment: regex-only signal extraction. Runs on both server and client.
- Local LLM (when the device tier permits): selected from a tier-aware catalog and loaded from our Vercel Blob origin.
- Refinement merger: merges signals by priority (caller first, then LLM, then heuristic) and re-resolves the result through the same deterministic engine, so the LLM can never produce a plan the engine would not.
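The priority chain in the refinement merger can be illustrated with a minimal sketch. The signal fields, the `Action` names as a type, and the rules inside `resolveAction` are hypothetical placeholders, not the production engine; only the ordering (caller, then LLM, then heuristic) and the final pass through a single deterministic resolver come from the description above.

```typescript
// Hypothetical signal shape for illustration only.
type Urgency = "routine" | "urgent" | "aog";
interface Signals {
  urgency?: Urgency;
  documentNeed?: boolean;
  missingFieldCount?: number;
}
type Action = "TraceFit" | "AOG" | "supplier-backup" | "alternate-path" | "quote";

// Return the first defined candidate, in the order given.
function pick<T>(...candidates: (T | undefined)[]): T | undefined {
  return candidates.find((c) => c !== undefined);
}

// caller > LLM > heuristic: for each field, the highest-priority source wins.
function mergeSignals(caller: Signals, llm: Signals, heuristic: Signals): Signals {
  return {
    urgency: pick(caller.urgency, llm.urgency, heuristic.urgency),
    documentNeed: pick(caller.documentNeed, llm.documentNeed, heuristic.documentNeed),
    missingFieldCount: pick(
      caller.missingFieldCount,
      llm.missingFieldCount,
      heuristic.missingFieldCount,
    ),
  };
}

// Placeholder deterministic rules (assumed, not the real engine).
function resolveAction(s: Signals): Action {
  if (s.urgency === "aog") return "AOG";
  if (s.documentNeed) return "TraceFit";
  if ((s.missingFieldCount ?? 0) > 0) return "alternate-path";
  return "quote";
}

// The LLM only contributes signals; the plan always comes from resolveAction.
function refinePlan(caller: Signals, llm: Signals, heuristic: Signals): Action {
  return resolveAction(mergeSignals(caller, llm, heuristic));
}
```

Because the LLM output enters only as one input to `mergeSignals`, it cannot return an action outside the `Action` union, and any field the caller supplies takes precedence over it.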
What never leaves your device
- Your raw query, page context, and any field the model classifies.
- The model's prompt, completion, and intermediate state.
- Any per-event identifier beyond the optional per-day rotating anonymous id used by privacy-first telemetry (see /policies/responsible-ai).
GPC, DNT, and Save-Data honored
When your browser sends the Global Privacy Control signal, has Do Not Track enabled, or sends the Save-Data client hint, we skip the local model load entirely and rely on the deterministic baseline.
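A client-side check for these three signals might look like the sketch below. The function name and the `PrivacySignalSource` interface are assumptions made for illustration; the properties mirror the browser's `navigator.globalPrivacyControl`, `navigator.doNotTrack`, and `navigator.connection.saveData`, none of which is available in every browser, so each is treated as optional.

```typescript
// Subset of the browser `navigator` object this check reads.
// Declared locally so the sketch does not depend on DOM typings.
interface PrivacySignalSource {
  globalPrivacyControl?: boolean;       // Global Privacy Control
  doNotTrack?: string | null;           // "1" when Do Not Track is enabled
  connection?: { saveData?: boolean };  // Network Information API
}

// Returns true when any one of the three signals is present,
// in which case the local model load would be skipped.
function shouldSkipLocalModel(nav: PrivacySignalSource): boolean {
  const gpc = nav.globalPrivacyControl === true;
  const dnt = nav.doNotTrack === "1";
  const saveData = nav.connection?.saveData === true;
  return gpc || dnt || saveData;
}

// In a browser this could be called as:
//   shouldSkipLocalModel(navigator as PrivacySignalSource)
```

Any single signal is enough to fall back to the deterministic baseline; absent or unsupported properties count as "not set".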
How the models load
- Your browser fetches the model catalog from /api/public/adaptive-action-models.
- The Adaptive Action client classifies your device into a tier based on network conditions, memory, CPU core count, WebGPU support, and SIMD support.
- If the tier permits, a worker is spawned and the model shards are pulled from our Vercel Blob origin.
- @mlc-ai/web-llm caches shards in IndexedDB, so subsequent visits load the model from the local cache in seconds instead of downloading it again.
- The deterministic plan is rendered immediately so SSR and first paint never wait on the model.
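The tier classification step above can be sketched as a pure function over the five inputs the page names. Every threshold below is a made-up placeholder, as are the `DeviceProfile` field names and the `"none"` tier used here to mean "no local model"; the real classifier may weigh these inputs differently.

```typescript
type Tier = "none" | "tiny" | "small" | "medium" | "large";

// Hypothetical snapshot of the inputs; in a browser these could be read from
// navigator.connection.effectiveType, navigator.deviceMemory,
// navigator.hardwareConcurrency, and feature detection for WebGPU and SIMD.
interface DeviceProfile {
  effectiveType: string;
  deviceMemoryGb: number;
  cores: number;
  hasWebGpu: boolean;
  hasSimd: boolean;
}

// Illustrative thresholds only (assumed, not the published policy).
function classifyTier(d: DeviceProfile): Tier {
  if (!d.hasWebGpu || !d.hasSimd) return "none";
  if (d.effectiveType !== "4g") return "none"; // avoid large downloads on slow links
  if (d.deviceMemoryGb >= 8 && d.cores >= 8) return "large";
  if (d.deviceMemoryGb >= 8 && d.cores >= 4) return "medium";
  if (d.deviceMemoryGb >= 4 && d.cores >= 4) return "small";
  if (d.deviceMemoryGb >= 2) return "tiny";
  return "none";
}
```

A result of `"none"` would leave the deterministic plan in place with no worker spawned, which matches the ordering above: the baseline renders first and the model is an optional refinement.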
What models we publish
| Tier | Model | Approx. download size |
|---|---|---|
| tiny | SmolLM2-360M-Instruct (q4f16_1) | ~240 MB |
| small | Qwen2.5-0.5B-Instruct (q4f16_1) | ~380 MB |
| medium | Qwen2.5-1.5B-Instruct (q4f16_1) | ~1.1 GB |
| large | Llama-3.2-3B-Instruct (q4f16_1) | ~2.1 GB |
Each model is a static, content-addressed bundle hosted on our Vercel Blob origin. The catalog can be swapped through per-tier NEXT_PUBLIC_ environment variables, so no code change is required to roll a new build forward.
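As a rough illustration of per-tier overrides, the sketch below resolves a catalog from a set of defaults plus environment values. The variable naming scheme `NEXT_PUBLIC_ADAPTIVE_MODEL_<TIER>` and the function name are hypothetical; only the `NEXT_PUBLIC_` prefix and the four tiers come from this page.

```typescript
type ModelTier = "tiny" | "small" | "medium" | "large";

// Defaults mirror the published table; the strings are labels, not real model ids.
const DEFAULT_MODELS: Record<ModelTier, string> = {
  tiny: "SmolLM2-360M-Instruct (q4f16_1)",
  small: "Qwen2.5-0.5B-Instruct (q4f16_1)",
  medium: "Qwen2.5-1.5B-Instruct (q4f16_1)",
  large: "Llama-3.2-3B-Instruct (q4f16_1)",
};

// Hypothetical naming scheme: NEXT_PUBLIC_ADAPTIVE_MODEL_TINY, ..._SMALL, etc.
// A non-empty override replaces the default for that tier only.
function resolveCatalog(
  env: Record<string, string | undefined>,
): Record<ModelTier, string> {
  const catalog = { ...DEFAULT_MODELS };
  for (const tier of Object.keys(catalog) as ModelTier[]) {
    const override = env[`NEXT_PUBLIC_ADAPTIVE_MODEL_${tier.toUpperCase()}`];
    if (override !== undefined && override.trim() !== "") {
      catalog[tier] = override.trim();
    }
  }
  return catalog;
}
```

The function takes a plain object rather than reading `process.env` directly, so a single tier can be rolled forward in a test without touching the others.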
What the LLM can and cannot do
- Can: classify urgency, audience, document need, and missing-field count from the buyer's free-text query.
- Can: refine the deterministic plan toward TraceFit, AOG, supplier-backup, alternate-path, or quote depending on the classified signals.
- Cannot: auto-quote a buyer, certify airworthiness, transmit data off-device, or override caller-supplied context fields.
- Cannot: change the action grammar — every plan it produces still flows through the same deterministic engine.
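One way to enforce these limits is to treat the model's completion as untrusted input and keep only recognized classification fields. The sketch below assumes the model is prompted to return JSON; the field names, the allowed urgency values, and `parseLlmSignals` itself are illustrative assumptions rather than the production schema.

```typescript
const URGENCY_VALUES = ["routine", "urgent", "aog"] as const;
type UrgencyLevel = (typeof URGENCY_VALUES)[number];

// Hypothetical classification output; anything else in the completion is dropped.
interface LlmSignals {
  urgency?: UrgencyLevel;
  documentNeed?: boolean;
  missingFieldCount?: number;
}

// Parse a completion defensively: invalid JSON or unknown fields yield no signal.
function parseLlmSignals(completion: string): LlmSignals {
  let raw: unknown;
  try {
    raw = JSON.parse(completion);
  } catch {
    return {}; // not JSON: the deterministic baseline stands unchanged
  }
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) return {};
  const obj = raw as Record<string, unknown>;
  const out: LlmSignals = {};

  const urgency = URGENCY_VALUES.find((v) => v === obj["urgency"]);
  if (urgency !== undefined) out.urgency = urgency;

  const documentNeed = obj["documentNeed"];
  if (typeof documentNeed === "boolean") out.documentNeed = documentNeed;

  const count = obj["missingFieldCount"];
  if (typeof count === "number" && Number.isInteger(count) && count >= 0) {
    out.missingFieldCount = count;
  }
  return out;
}
```

A completion that tries to name an action directly, such as `{"action": "auto-quote"}`, produces an empty result here, so the plan can only change through signals the deterministic engine already understands.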
Verifying our claims
- Open the DevTools Network panel: model shards load only from blob.vercel-storage.com (or your configured Blob mirror).
- There is no fetch to a third-party LLM endpoint during adaptive-action routing.
- The /.well-known/trust.json manifest declares 'On-device only AI' under standards.
- The Adaptive Action telemetry endpoint accepts a strict allow-list — any field outside the schema is rejected at the boundary, not logged.
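A strict allow-list check of the kind described for the telemetry endpoint can be sketched as follows. The field names in `ALLOWED_FIELDS` are invented for illustration, and this version assumes the whole payload is rejected when any unknown field is present; the real schema and rejection behavior may differ.

```typescript
// Hypothetical allow-list; the real telemetry schema is not shown on this page.
const ALLOWED_FIELDS: ReadonlySet<string> = new Set([
  "event",
  "tier",
  "action",
  "anonDayId",
]);

type ValidationResult =
  | { ok: true; event: Record<string, unknown> }
  | { ok: false; rejected: string[] };

// Reject the payload if it carries any field outside the allow-list.
// Only the offending field names are returned, never their values,
// so nothing outside the schema can end up in a log line.
function validateTelemetry(payload: Record<string, unknown>): ValidationResult {
  const rejected = Object.keys(payload).filter((key) => !ALLOWED_FIELDS.has(key));
  if (rejected.length > 0) return { ok: false, rejected };
  return { ok: true, event: { ...payload } };
}
```

For example, a payload of `{ event: "plan_rendered", query: "..." }` fails validation with `rejected` equal to `["query"]`, and the query text itself never passes the boundary.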
PartsPerk LLC · Delaware, United States · Doc /policies/local-only-ai · v1.0
