How GateTest works
104 deterministic modules. One Claude pass when it's worth it. Zero hype.
Most QA scanners are either purely pattern-matched (cheap, noisy) or purely LLM-driven (expensive, unpredictable). GateTest is neither. The default scan is a static engine with no AI in the loop — predictable, reproducible, no surprise API spend. AI is reserved for fix generation, and even there we try three deterministic layers first.
The pipeline
Customer push hits one of two ingress points, lands in a single Postgres queue, runs the gate, and ships a PR. The same path serves every tier — depth comes from what we layer on top, not from a different pipeline.
The diagram is hand-rolled SVG. Mermaid would have required adding a dependency, and the rule on unapproved dependencies is hard.
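As a rough illustration of the "two ingress points, one queue" idea, the sketch below normalises two payload shapes into a single job record. The `toQueueJob` helper, the field names, and the `gluecron` payload shape are hypothetical and chosen only for illustration; they are not GateTest's actual schema.

```javascript
// Hypothetical sketch: normalise two ingress payloads into one queue-job shape.
// Field names are assumptions for illustration, not GateTest's actual schema.
function toQueueJob(source, payload) {
  switch (source) {
    case "github":
      // Shape loosely follows a GitHub push webhook payload.
      return {
        source,
        repo: payload.repository.full_name,
        commit: payload.after,
        status: "queued",
      };
    case "gluecron":
      // Assumed shape for the second ingress point.
      return {
        source,
        repo: payload.repo,
        commit: payload.sha,
        status: "queued",
      };
    default:
      throw new Error(`Unknown ingress source: ${source}`);
  }
}
```

Once both sources produce the same record, everything downstream of the queue can stay source-agnostic, which is the property the single pipeline relies on.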
The 104 modules
Each module is self-contained, runs in parallel, and emits structured findings. Each entry below includes a representative finding. Modules are grouped by category for browsability; the actual suite assignment lives in src/core/config.js.
104 modules total. Every Full scan ($99) runs the developer suite. URL scans run the live-site subset. Quick scan runs the highest-signal four.
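A minimal sketch of what "self-contained, runs in parallel, emits structured findings" could look like in practice. The module shape (`{ name, run }`), the finding fields, and the `runSuite` name are assumptions made for this example, not the contents of src/core/config.js.

```javascript
// Hypothetical sketch: run independent modules in parallel and collect findings.
// The module shape ({ name, run }) and the finding fields are assumptions.
async function runSuite(modules, context) {
  const results = await Promise.allSettled(
    // Promise.resolve().then(...) also captures synchronous throws from run().
    modules.map((mod) => Promise.resolve().then(() => mod.run(context)))
  );
  const findings = [];
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      for (const finding of result.value) {
        findings.push({ module: modules[i].name, ...finding });
      }
    } else {
      // A crashing module is reported as a finding instead of aborting the scan.
      const reason = result.reason?.message ?? String(result.reason);
      findings.push({
        module: modules[i].name,
        severity: "error",
        message: `module crashed: ${reason}`,
      });
    }
  });
  return findings;
}
```

Using `Promise.allSettled` rather than `Promise.all` is the design choice that matters here: one failing module cannot discard the findings of the others.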
Source & quality (12 modules)
The foundation. Catches the bugs every linter and compiler should have caught but didn't.
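To make one of this category's checks concrete, here is a small sketch of the pitfall the asyncIteration module describes: an async callback passed to `.filter`. The helper names (`isEven`, `brokenFilter`, `awaitedFilter`) are hypothetical, and the repair shown is one common option, not necessarily the fix GateTest generates.

```javascript
// An async predicate returns a Promise, and every Promise is truthy,
// so .filter keeps every element regardless of the awaited result.
const isEven = async (n) => n % 2 === 0;

function brokenFilter(items) {
  return items.filter(async (n) => await isEven(n));
}

// One common repair: resolve all predicates first, then filter on the booleans.
async function awaitedFilter(items, predicate) {
  const keep = await Promise.all(items.map(predicate));
  return items.filter((_, i) => keep[i]);
}
```

`brokenFilter` returns its input unchanged, which is why this class of bug tends to pass casual testing with data that happens to be all valid.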
- syntax: Validates JS, TS, JSON, YAML, CSS, HTML. Example finding: Unclosed bracket at src/api/handler.ts:148
- lint: ESLint, Stylelint, language-aware style rules. Example finding: Unexpected console statement at src/db.ts:42
- codeQuality: console.log, debugger, TODO/FIXME, eval, innerHTML, complexity. Example finding: innerHTML assignment found at components/Comment.tsx:88
- deadCode: Unused exports across JS/TS/Python, orphaned files, rotting commented-out blocks. Example finding: Export 'parseLegacyToken' in src/auth.ts has no importers
- typescriptStrictness: tsconfig regressions, @ts-ignore abuse, any-leak detection on exported signatures. Example finding: strict: false in tsconfig.json (implicit-any leaks across 47 files)
- documentation: README, CHANGELOG, LICENSE, JSDoc coverage, env documentation. Example finding: Missing README section: Installation
- duplicateCode: Copy-pasted blocks that should be extracted into utilities. Example finding: 16-line block duplicated 4x across src/handlers/
- importCycle: Circular dependencies that cause runtime TDZ / undefined-import bugs. Example finding: Cycle: src/user.ts → src/post.ts → src/user.ts
- asyncIteration: Async callbacks handed to .reduce/.filter/.some/.every/.forEach/.map where Promise semantics silently break. Example finding: .filter(async x => await isValid(x)) (predicate is a Promise, always truthy)
- datetimeBug: Naive datetimes, JS 0-vs-1 month, moment-legacy. Example finding: datetime.now() without tz= at jobs/scheduler.py:31
- moneyFloat: IEEE-754 precision loss on currency-named variables. Example finding: parseFloat(amount) on trust-account money in TrustActions.tsx
- homoglyph: Trojan Source bidi overrides, Cyrillic/Greek letters in Latin identifiers, zero-width chars. Example finding: Cyrillic 'а' (U+0430) inside identifier `data` at src/parser.ts:212

Security (15 modules)
OWASP-grade scanning that goes beyond CVE lookups into your actual code paths.
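As an illustration of the kind of guard whose absence the ssrf module in this category reports, here is a hedged sketch of an allowlist check for user-supplied URLs. `ALLOWED_HOSTS`, `isAllowedUrl`, and the example hostnames are hypothetical; a production check would also need to consider redirects and DNS rebinding, which this sketch does not cover.

```javascript
// Hypothetical sketch of the validation an SSRF finding says is missing.
// ALLOWED_HOSTS and isAllowedUrl are illustrative names, not a GateTest API.
const ALLOWED_HOSTS = new Set(["api.example.com", "cdn.example.com"]);

function isAllowedUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  // Compare the parsed hostname, never a substring of the raw input.
  return url.protocol === "https:" && ALLOWED_HOSTS.has(url.hostname);
}
```

Checking the parsed `hostname` matters: a raw string such as `https://api.example.com@evil.test/` contains the allowed name but resolves to a different host.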
- security: OWASP patterns, XSS, SQL injection, innerHTML, shell exec, Docker misconfigs. Example finding: exec(req.body.cmd) at api/run.ts:54 (command injection)
- secrets: AWS keys, GitHub tokens, Stripe keys, passwords, private keys, DB strings. Example finding: AKIA[redacted] hardcoded in config/aws.ts
- secretRotation: Long-lived credentials in git, .env drift, placeholder/real example mismatch. Example finding: API_KEY in .env has been unchanged for 412 days
- ssrf: User-controlled URLs handed to fetch/axios/got/node-http without validation. Example finding: fetch(req.query.url) with no allowlist at api/proxy.ts:18
- tlsSecurity: rejectUnauthorized:false, verify=False, NODE_TLS_REJECT_UNAUTHORIZED=0. Example finding: rejectUnauthorized: false in production https.Agent
- cookieSecurity: httpOnly:false, weak session secrets, SESSION_COOKIE_* misconfigs. Example finding: session cookie httpOnly: false (XSS becomes session takeover)
- redos: Catastrophic-regex detector: nested quantifiers, overlapping alternation, user-controlled patterns. Example finding: (a+)+ at src/validator.ts:30 (catastrophic backtracking)
- authBypass: Routes missing authentication. Example finding: /api/admin/users has no middleware guard
- crossFileTaint: Cross-file taint analysis, user input → dangerous sinks across module boundaries. Example finding: req.body.path → fs.readFile via 3 hops, no validation
- webhookPayload: Webhook handlers that use req.body without validation. Example finding: Stripe webhook handler reads req.body.amount without zod parse
- logPii: Credentials, tokens, and request objects logged in plaintext. Example finding: console.log(user) at auth/login.ts:88 (leaks bcrypt hash)
- wpExposedFiles (WordPress): Sensitive files exposed via public webroot (wp-config.php.bak, debug.log, .git, .env, SQL backups). Example finding: wp-config.php.bak reachable at /wp-config.php.bak (HTTP 200)
- wpXmlrpcExposed (WordPress): /xmlrpc.php exposed (brute-force amplification + DDoS reflector + auth-bypass surface). Example finding: /xmlrpc.php returns 200; disable or block at WAF
- wpMalwarePatterns (WordPress): Rendered HTML/JS scanned for known malware signatures (eval(atob), hidden iframes, base64 payloads). Example finding: eval(atob(...)) found in footer script (likely compromised)
- wpAdminProtection (WordPress): /wp-admin and /wp-login.php checked for rate limit / WAF / 2FA / cookie hardening. Example finding: /wp-login.php has no rate limiting (brute-force open)

Reliability (11 modules)
The bugs that don't break on your machine but break in production at 3am.
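One example of a "works locally, fails silently in production" check in this category is the impossible cron date (Feb 30). The sketch below is a deliberately narrow version under stated assumptions: it handles only plain numeric day-of-month and month fields in a five-field expression, and the function name is hypothetical. It is not the cronExpression module's implementation.

```javascript
// Maximum day per month, counting Feb 29 so leap-year schedules stay valid.
const MAX_DAY = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

// Returns true when a five-field cron string names a day/month pair that can
// never occur. Only plain numeric day-of-month and month fields are handled;
// ranges, lists, steps and aliases are treated as "possibly valid".
function isImpossibleCronDate(expr) {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) return false;
  const [, , dom, month, dow] = fields;
  // A restricted day-of-week can still fire in classic cron (day fields are
  // OR-ed), so only flag expressions where day-of-week is unrestricted.
  if (dow !== "*") return false;
  if (!/^\d+$/.test(dom) || !/^\d+$/.test(month)) return false;
  const d = Number(dom);
  const m = Number(month);
  if (m < 1 || m > 12 || d < 1) return true;
  return d > MAX_DAY[m - 1];
}
```

The check errs on the side of silence: anything it cannot reason about is reported as possible rather than flagged.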
- errorSwallow: Empty catch, .catch(noop), callback-err ignored, floating promises, global silent handlers. Example finding: Empty catch block at db/save.ts:114 (error swallowed)
- nPlusOne: Database calls inside loops across Prisma, Sequelize, TypeORM, Mongoose, Knex, Drizzle. Example finding: await prisma.post.findUnique inside arr.map at feed.ts:42
- retryHygiene: Tight retry loops, no backoff, unbounded retry, retry-on-4xx across fetch/axios/got/node-http. Example finding: while(true) retry with no jitter at api/upload.ts:88
- raceCondition: TOCTOU, get-or-create anti-pattern, lost-update on counters. Example finding: fs.exists() then fs.unlink() on the same path (symlink-race vector)
- resourceLeak: Unclosed streams, file handles, intervals, sockets across fs/net/ws/events. Example finding: fs.createReadStream never piped or closed at importer.ts:31
- envVars: process.env / os.environ reads cross-referenced with .env.example and CI env blocks. Example finding: STRIPE_SECRET_KEY read in code but missing from .env.example
- cronExpression: Invalid / impossible / too-frequent cron strings (Feb 30, * * * * *, typo aliases). Example finding: 0 0 30 2 * (Feb 30 never fires, silent killer)
- featureFlag: Stale flags collapsed into constants and dead-branch conditionals. Example finding: if (true) wrapping 200 lines of code at src/checkout.ts:14
- intentVerification: AI checks that the diff matches the commit message / PR description. Example finding: PR titled 'fix typo' touches 18 files across 3 directories
- regressionPredictor: AI predicts which files this PR is most likely to break. Example finding: Confidence 87%: this change will break tests in checkout/
- rollbackHonesty: Rollback Honesty Checker. Verifies that the advertised rollback path actually rolls back. Example finding: deploy.sh has no rollback function despite docs claiming one

Web & UX (13 modules)
Surfacing the user-visible problems static analysis usually pretends don't exist.
- accessibility: WCAG 2.2 automated audit (AA + AAA-aligned) covering missing alt text, ARIA labels, keyboard traps, heading hierarchy. Example finding: Heading skip h1 → h3 at /pricing (WCAG 1.3.1)
- performance: Dependency count, bundle size analysis, image optimisation checks. Example finding: Hero image 3.4MB unoptimised (LCP penalty)
- visual: Visual & UI Regression Testing. Example finding: Hero CTA shifted 14px between deploys
- seo: Meta tags, Open Graph, structured data, robots.txt, canonical URLs. Example finding: Missing canonical on /compare/snyk
- links: Every broken href, including dead anchors, placeholder links, 404s. Example finding: /docs/guide returns 404 from footer link
- compatibility: Browser matrix validation. Modern API and CSS features without polyfills. Example finding: :has() selector at safari < 15.4 (partial support)
- e2e: End-to-End Test Execution. Example finding: Checkout flow times out at payment step
- liveCrawler: Live site crawl for 404 / 500 / broken-image / redirect-chain on the live URL. Example finding: /blog/old-post → 3 redirects → 404
- explorer: Autonomous Interactive Element Explorer. Clicks every button + form + dropdown via Playwright. Example finding: Submit button on /signup raises uncaught TypeError
- runtimeErrors: Live browser runtime errors, including uncaught JS, console.error/warn, network 4xx/5xx, CSP violations, hydration mismatches. Example finding: Hydration mismatch: server rendered 'Dec', client rendered 'Jan'
- chaos: Chaos & Resilience Testing covering slow network, API failure, offline, missing resources, server timeouts. Runs via the GitHub Action where a headless browser is available; the website-only Forensic scan does not include it. Example finding: App freezes on 3G simulation (no loading state shown)
- webHeaders: CSP/HSTS/XFO/CORS misconfig across Next.js, Vercel, Netlify, Express, Fastify, nginx. Example finding: CSP missing (defaults to inline-everything)
- cacheHeaders: Cache Headers & CDN Configuration. Example finding: /api/user has Cache-Control: public (PII cacheable at CDN)

Infrastructure (18 modules)
Catches the supply-chain takeovers, container exploits, and CI/CD foot-guns.
- dependencies: Supply-chain hygiene across npm, pip, Pipenv, Poetry, go.mod, Cargo, Bundler, Composer, Maven, Gradle. Example finding: left-pad in package.json pinned to 'latest' (supply-chain risk)
- dockerfile: Root user, :latest tags, curl|sh, apt hygiene, secrets-in-layers, cache bloat. Example finding: USER not set (container runs as root)
- ciSecurity: GitHub Actions hardening covering action pinning, pwn-request, shell injection, secrets-in-logs, permissions. Example finding: actions/checkout@v4 unpinned to SHA (supply-chain risk)
- ciParamValidator: Validates GitHub Actions with: inputs against action schemas. Example finding: actions/upload-artifact: invalid input 'retention' (typo of 'retention-days')
- shell: Shell script security covering curl|sh, unsafe rm, eval injection, hardcoded secrets, set -e, POSIX compliance. Example finding: rm -rf $VAR with no quoting at scripts/clean.sh:14
- bashSafety: Bash / Shell Error-Swallow Detector. Example finding: Pipeline lacks set -o pipefail (silent failure)
- sqlMigrations: Drop column/table, non-concurrent indexes, NOT NULL without default, blocking constraints, rolling-deploy renames. Example finding: ALTER TABLE users ADD COLUMN email NOT NULL (blocks writes)
- terraform: Public buckets, wildcard ingress, hardcoded secrets, missing encryption, IAM wildcards. Example finding: aws_s3_bucket.acl = 'public-read' on customer-data bucket
- kubernetes: Privileged pods, host namespaces, :latest images, missing limits/probes, dangerous caps. Example finding: privileged: true in production deployment
- systemd: Systemd Unit File Validator. Example finding: Service has no Restart= policy (won't recover from crash)
- deployScriptValidator: Health-check URL consistency. Example finding: deploy.sh checks :3000 but service listens on :8080
- serviceConsistency: ExecStart / Procfile / PM2 vs package.json start script. Example finding: Procfile runs `node dist/server.js`, package.json runs `node server.js`
- deployContract: Deploy Contract Validator. Example finding: Vercel runtime=edge but route uses fs module
- deployReadiness: Aggregate 0-100 deployment confidence score. Example finding: Deploy readiness: 62/100 (3 critical, 8 high open)
- nativeBundlerGuard: Native Node addons that cannot be bundled. Example finding: import sharp (native binary not bundleable on Vercel edge)
- bundleSize: JS bundles exceeding size budgets. Example finding: main.js 412 KB gzip (budget is 200 KB)
- envIntegrity: Env-File Integrity Linter. Example finding: .env has duplicate STRIPE_KEY entries (last wins silently)
- promptSafety: Browser-exposed API keys, unbounded max_tokens, prompt-injection surfaces, deprecated models. Example finding: Client-bundled NEXT_PUBLIC_* credential shipped to every visitor

Developer hygiene (10 modules)
Pulls bad-process bugs out of CI before they cost a 90-minute review.
- prSize: Blocks unreviewably-large pull requests (files / lines / sprawl across top-level dirs). Example finding: PR touches 142 files, 3,400 lines across 6 top-level dirs
- prQuality: Weak commit messages, missing tests, mixed deps+code. Example finding: Commit message 'fix' on a 200-line change
- flakyTests: Committed .only/.skip, real clock/network/timers, env leaks, self-admitted flakes. Example finding: describe.only( found in tests/checkout.test.ts
- fakeFixDetector: AI-generated symptom patches such as skipped tests, swallowed errors, dead code. Example finding: Test changed from expect(x).toBe(2) to expect(x).toEqual(expect.anything()) (patching the test, not the bug)
- hardcodedUrl: localhost / 127.0.0.1 / RFC1918 / internal TLDs / non-TLS URLs leaking into production. Example finding: A dev URL (loop-back, RFC1918, internal TLD) shipped into the production bundle
- openapiDrift: Routes defined in code missing from openapi.yaml, and spec paths with no matching handler. Example finding: GET /api/v2/orders defined in code but absent from spec
- trpcContract: tRPC procedure definitions vs frontend call sites. Example finding: Frontend calls trpc.user.delete (procedure removed in server)
- monorepoConstraints: Enforces package boundary rules in apps/ packages/ libs/. Example finding: apps/web imports from apps/admin (boundary violation)
- zodSchemaPresence: React components without runtime prop validation. Example finding: <Checkout> exported with no zod parse on prop input
- dataIntegrity: Migration safety, SQL injection patterns, PII in logs, database schema validation. Example finding: Migration drops column 'email' with no backfill

AI & advanced (8 modules)
Where deterministic scanning stops and reasoning starts. Used sparingly, not by default.
- aiReview: Claude reads your code and finds real bugs; not patterns, actual understanding. Example finding: Token refresh races with logout (second auth call uses dead token)
- agentic: Memory-driven AI investigation. Picks hypotheses from past scans, walks the code. Example finding: Recurring null-deref in user.profile (root cause traced to login flow)
- memory: Codebase memory. Compounding intelligence across scans (issue history + fix patterns). Example finding: This file had 14 prior findings; focus areas: auth, session
- aiHallucination: Fake imports, invented APIs, non-existent methods. Example finding: An import named { useFoo } from a library that has no such export
- claudeCompliance: AI-output rot, including mock data in prod, not-implemented stubs, WHAT-not-WHY comment noise, `any` / `@ts-ignore` density. Example finding: John Doe placeholder in src/users.ts:42 (mock data shipped to prod)
- undefinedRef: Variables, functions, methods referenced before they're defined. Example finding: ReferenceError: handleClick is not defined at line 47 (typo'd from handeClick)
- architectureDrift: AI flags code that violates documented architectural conventions. Example finding: src/api/orders.ts bypasses repository layer (direct DB access)
- mutation: Modifies your source code to verify your tests actually catch bugs. Runs via the GitHub Action because it executes your test suite; the website-only Forensic scan does not include it. Example finding: Mutated return true → return false, 11/11 tests still pass

Scanning & testing (2 modules)
The classic suite — unit, integration, end-to-end — wired into the same gate as everything else.
- unitTests: Unit Test Execution. Example finding: tests/cart.test.ts: 14 failed, 218 passed, 0 skipped
- integrationTests: Integration Test Execution. Example finding: Order placement → payment → fulfilment: 1 failure at fulfilment step

Language coverage (9 modules)
Nine non-JS language backends. Same engine, language-aware patterns.
- python: eval/exec, bare-except, SQL injection, pickle. Example finding: pickle.loads(request.data) at api/users.py:12 (RCE vector)
- go: Ignored errors, panics, goroutine hygiene. Example finding: rows, _ := db.Query(...) at db.go:88 (err discarded)
- rust: unwrap/panic/todo, unsafe block review. Example finding: .unwrap() on Option in production code at src/auth.rs:24
- java: System.out, broad catches, empty catches. Example finding: catch (Exception e) {} at OrderService.java:301
- ruby: eval, shell injection, bare rescue. Example finding: system("convert #{input}") at uploader.rb:18 (shell injection)
- php: eval, legacy mysql_, XSS, debug output. Example finding: mysql_query() deprecated API at legacy/db.php:42
- csharp: Console.WriteLine, empty catches. Example finding: Empty catch in OrderController.cs:189 (exception swallowed)
- kotlin: !!, TODO(), println. Example finding: user.profile!! at HomeFragment.kt:71 (NPE risk)
- swift: fatalError, try!, force-unwrap. Example finding: try! JSONDecoder().decode(...) in production at Network.swift:23

WordPress (6 modules)
Live-URL probes for the wp.gatetest.ai product. Run against any public WordPress site.
- wpVersionLeak: Where the site leaks its core version (readme.html, meta generator, RSS feed, CSS/JS ver=). Example finding: Meta generator: 'WordPress 5.8.1' (readable from view-source)
- wpPluginCveCheck: Detects installed plugins via fingerprinting and flags any with known CVEs. Example finding: elementor 3.5.2 detected (CVE-2023-XXXX, critical)
- wpUserEnumerate: Checks if usernames can be enumerated via /?author=1, /wp-json/wp/v2/users, /author/admin/. Example finding: /?author=1 reveals login 'admin' via redirect
- wpPhpVersionEol: Detects the running PHP version and flags it if end-of-life. Example finding: PHP 7.4 detected (EOL since 2022, no security patches)
- wpThemeAbandonment: Detects the active theme and flags it if abandoned, deprecated, or carrying known CVEs. Example finding: Theme 'oldtheme' last updated 2019 (abandoned)
- wpBackupValidation: Whether a backup plugin is installed AND whether any backup files are publicly exposed. Example finding: /backup-2024-01.zip reachable (HTTP 200), full-site dump exposed

The fix flywheel
When the gate produces a finding that you've paid to have fixed, our orchestrator (website/app/lib/try-fix.js) walks four layers in order. The first layer that produces a real patch wins. Each layer is bounded by a 30-second soft timeout; a crash falls through; a no-op patch is rejected. The whole orchestrator never throws.
Layer 1: AST
Babel-parsed deterministic transforms. Currently ~10 canonical patterns covering TLS, cookies, parseInt radix, async-iteration, and the most common config flips.
Applies when the bug is a single config flag or call-site argument that can be flipped without semantic ambiguity.
Before:
https.Agent({
  rejectUnauthorized: false
})
After:
https.Agent({
  rejectUnauthorized: true
})

Layer 2: Rule
Regex and structural pattern engine for shapes the AST doesn't model. Same-file edits, deterministic replacements, fast path for high-frequency patterns.
Applies when the bug is a recognisable line-level shape that AST traversal would have to special-case.
Before:
console.log(user)
After:
logger.info({ user_id: user.id })

Layer 3: Recipe
Cached fixes that Claude solved on a previous scan. The 'auto-distill' step records the before/after when Claude's diff is small and templatey; next time the same shape appears, the recipe wins.
Applies to anything Claude has solved before. The recipe layer is the flywheel: it learns from every paid fix.
Before:
// match by ruleKey + file ext
// hit: js-reject-unauthorized
// applied 7 times
After:
// recipe applied, zero cost
// promoted to 'stable' at 3 hits

Layer 4: Claude
Anthropic Claude Sonnet. Only invoked when the first three layers all return null. Bounded by a 30s per-layer timeout, capped per tier so spend never exceeds margin.
Applies to first-time-seen patterns, bespoke business-logic bugs, and anything the templated layers can't model.
Before:
// novel pattern: ad-hoc auth check
// mixed with feature-flag rollout
// no canonical shape
After:
// Claude reasons from your code
// fix lands, auto-distill records
// next time → recipe layer
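The orchestration rules stated at the top of this section (layers tried in order, a soft per-layer timeout, crashes fall through, no-op patches rejected, never throws) can be sketched roughly as below. This is a sketch under assumptions, not the contents of website/app/lib/try-fix.js: the layer shape (`{ name, run }`), the return object, and the `withTimeout` helper are all hypothetical.

```javascript
// Hypothetical sketch of a layered fix orchestrator. Each layer's run()
// takes (source, finding) and resolves to patched source text or null.
const LAYER_TIMEOUT_MS = 30_000;

// Soft timeout: we stop waiting after `ms`, but the layer is not cancelled.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(null), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function tryFix(layers, source, finding, timeoutMs = LAYER_TIMEOUT_MS) {
  for (const layer of layers) {
    try {
      // Promise.resolve().then(...) also captures synchronous throws.
      const attempt = Promise.resolve().then(() => layer.run(source, finding));
      const patched = await withTimeout(attempt, timeoutMs);
      // Reject timeouts (null), non-string results, and no-op patches.
      if (typeof patched === "string" && patched !== source) {
        return { fixed: true, layer: layer.name, patched };
      }
    } catch {
      // A crashing layer falls through to the next one.
    }
  }
  // Nothing produced a real patch; report that instead of throwing.
  return { fixed: false, layer: null, patched: source };
}
```

Because every failure mode collapses to "try the next layer", the caller only ever has to handle two outcomes: a patch from a named layer, or no fix.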
Cost trend as recipes accumulate
When Claude solves something and the diff is templatey, the auto-distill step records a recipe. Next time the same shape appears, the recipe layer wins and Claude is never called. The Claude ratio is highest on day one and trends toward single digits over time.
Illustrative — actual ratio depends on codebase shape and recipe-hit rate. The architectural goal is that repeat patterns stop reaching Claude entirely.
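A minimal sketch of the recipe mechanism, assuming only what the text states: recipes are matched by rule key plus file extension and are promoted to 'stable' at three hits. The `RecipeStore` class, its fields, the 'candidate' status, and the plain-string matching are hypothetical; the in-memory Map is a stand-in for the Postgres-backed store described elsewhere on this page.

```javascript
// Hypothetical in-memory stand-in for the fix-recipe store. Names, fields and
// the "candidate" status are assumptions made for illustration.
const STABLE_AFTER_HITS = 3;

class RecipeStore {
  constructor() {
    this.recipes = new Map();
  }

  static key(ruleKey, fileExt) {
    return `${ruleKey}:${fileExt}`;
  }

  // Record a before/after pair distilled from a small, templated diff.
  distill(ruleKey, fileExt, before, after) {
    const key = RecipeStore.key(ruleKey, fileExt);
    if (!this.recipes.has(key)) {
      this.recipes.set(key, { before, after, hits: 0, status: "candidate" });
    }
  }

  // Apply a matching recipe, or return null so the caller can fall through.
  apply(ruleKey, fileExt, source) {
    const recipe = this.recipes.get(RecipeStore.key(ruleKey, fileExt));
    if (!recipe || !source.includes(recipe.before)) return null;
    recipe.hits += 1;
    if (recipe.hits >= STABLE_AFTER_HITS) recipe.status = "stable";
    // A function replacer avoids "$" patterns in `after` being interpreted.
    return source.replace(recipe.before, () => recipe.after);
  }
}
```

Each successful `apply` is a fix that never reaches the model, which is the mechanism behind the falling cost trend described above.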
The four tiers
Same engine, same modules, same queue. The tiers differ in what we layer on top of the base scan — and we're honest about what you don't get at each tier. “no” means no.
| What you get | Quick | Full | Scan + Fix | Forensic |
|---|---|---|---|---|
| Price | $29 | $99 | $199 | $399 |
| Modules run | 4 | All | All | All |
| Findings clustering by root cause | ✓ | ✓ | ✓ | ✓ |
| Health score / verdict | ✓ | ✓ | ✓ | ✓ |
| Detailed report (file, line, advisory) | ✓ | ✓ | ✓ | ✓ |
| AI code review (Claude reads your code) | no | ✓ | ✓ | ✓ |
| Auto-PR with working fixes | no | no | ✓ | ✓ |
| Iterative fix loop with retry | no | no | ✓ | ✓ |
| Cross-fix syntax + scanner gate | no | no | ✓ | ✓ |
| Regression test generated per fix | no | no | ✓ | ✓ |
| Pair-review (second Claude critiques fixes) | no | no | ✓ | ✓ |
| Architecture annotations | no | no | ✓ | ✓ |
| Per-finding Claude diagnosis | no | no | no | ✓ |
| Cross-finding attack-chain correlation | no | no | no | ✓ |
| Mutation testing (via GitHub Action) | no | no | no | Action |
| Chaos / fuzz pass (via GitHub Action) | no | no | no | Action |
| CTO-readable executive summary | no | no | no | ✓ |
Action = available via the GitHub Action with mutation: true / chaos: true. These two checks need a CI runner (your test suite for mutation, a headless browser for chaos) so they run wherever your CI runs, not on the website-only scan flow.
Per-scan payment at every tier. One-time charge via Stripe at checkout, no subscription, no auto-renew. If a scan fails to start or crashes mid-way, contact support and we will re-run it or issue a credit at our discretion.
Self-healing CI
Beyond the managed scan, GateTest ships a GitHub Actions workflow that runs in your CI with your Anthropic key. When CI breaks, the workflow pipes the failing log through the same AST → Rule → Recipe → Claude flywheel, applies the fix, and opens a follow-up PR. Same engine, same recipe store, your bill on Anthropic rather than ours.
name: GateTest Self-Healing CI
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]
jobs:
  heal:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npx gatetest heal --pr
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

- 1. CI fails. The workflow_run trigger fires on conclusion: failure.
- 2. Logs in. The heal step downloads the failing job's logs and the diff.
- 3. Flywheel. The same AST → Rule → Recipe → Claude orchestrator runs.
- 4. Fix PR. The patch lands on a follow-up branch and a PR opens against your default branch.
The stack
We deliberately keep the stack small. Every box below earns its place — no “just in case” services, no orchestration layers we don't need. The serverless rule is hard: no in-memory state between requests, ever.
Next.js 16 (App Router) + Tailwind 4. Server Components everywhere except where interactivity demands client.
Vercel serverless functions. Every function is stateless — no in-memory persistence between requests.
Postgres on Neon. Holds scan_queue, audit log, fix-recipe store, customer sessions.
Stripe upfront-charge. One-time payment per scan at checkout. No subscription, no auto-renew.
Anthropic Claude Sonnet. Our key for managed scans; your key for the self-healing CI bot in your repo.
Dual-host: GitHub App webhook and Gluecron Signal Bus. HostBridge abstraction means new hosts plug in without rewiring.
Playwright (open-source, Microsoft) — used internally for chaos, explorer, and runtime-error modules. Not a paid competitor; an implementation detail.
All scan state lives in Postgres or in Stripe's payment-intent metadata. We never write a Map or module-level variable that's expected to survive across requests — the function instance that picked up your second-page poll is not the one that ran your scan.
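To make the stateless rule concrete, here is a small sketch in which every handler reads and writes scan state through an external store passed in by the caller. The `store` interface (async `get` / `set`), the handler names, and the record shape are hypothetical; in the architecture described above, that role is played by Postgres.

```javascript
// Hypothetical sketch: scan state is read from and written to an external
// store on every request. `store` is an assumed async key-value interface;
// nothing is cached in a module-level variable between calls.
async function startScan(store, scanId) {
  await store.set(scanId, { status: "running", findings: [] });
  return { scanId, status: "running" };
}

async function completeScan(store, scanId, findings) {
  await store.set(scanId, { status: "done", findings });
}

// Safe to run on a different function instance than the one that started
// the scan, because the only source of truth is the store.
async function pollScan(store, scanId) {
  const scan = await store.get(scanId);
  return scan ?? { status: "unknown", findings: [] };
}
```

Passing the store in explicitly keeps the handlers free of hidden state, which is what allows a poll to land on any instance.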
What GateTest doesn't do (yet)
Every QA vendor promises the moon. Here's what we don't deliver today. If any of these are blockers for you, the honest answer is “not yet.”
- 01. Doesn't replace a senior engineer's code review. We catch the bugs that have a recognisable shape; humans still own architecture and product judgement.
- 02. Doesn't catch logic bugs that need domain context. If your invariant is 'don't ever discount over 30%', no scanner can know that without you telling it.
- 03. Doesn't fix bugs that span 5+ files without human review. Multi-file refactors are flagged but require an engineer to drive.
- 04. Coverage on Rust, Go, and Java is shallower than JS/TS/Python today. We have language-specific modules for nine non-JS backends, but their depth is thinner than our JS coverage.
- 05. No on-prem deployment yet. Everything runs against our managed Vercel + Neon stack today. Air-gapped customers are on the roadmap.
- 06. No VSCode extension that runs in real time yet. Today's loop is push → CI → PR comment. Editor integration is on the list.
Run it against your code
Architecture is just words until you see the report. The free URL scan takes about ten seconds and returns a real health score against your live site.