QA Automation Engineer
15 production repositories built test-first from day one. pytest, CI/CD with GitHub Actions, LLM evaluation frameworks with RAGAS scoring and adversarial fixtures. Zero test failures across a 3-month production run.
Three pillars of test engineering across every project.
Every test written before the implementation it validates. Zero tests retroactively added. Coverage thresholds enforced from project inception, not bolted on after launch.
Golden eval sets with adversarial fixtures. RAGAS scoring pipelines. Brier score confidence calibration. CI gates that block accuracy regressions before merge.
SAST scanning (bandit), dependency CVE auditing (pip-audit), strict type checking (mypy), and coverage enforcement on every commit. Merge blocked on any failure.
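As an illustration of the calibration gate named in the LLM evaluation pillar above, here is a minimal sketch of a Brier score check written as a pytest test. The function names, the sample confidences, the baseline of 0.08, and the tolerance of 0.01 are all assumptions made for this example, not values taken from any of the repositories.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def calibration_gate(
    probabilities: Sequence[float],
    outcomes: Sequence[int],
    baseline: float,
    tolerance: float = 0.01,  # assumed slack; tune per project
) -> bool:
    """Return True when the Brier score has not regressed past baseline + tolerance."""
    return brier_score(probabilities, outcomes) <= baseline + tolerance


def test_calibration_does_not_regress():
    # Hypothetical golden set: model confidence per item and whether the answer was correct.
    confidences = [0.9, 0.8, 0.3, 0.6]
    correct = [1, 1, 0, 1]
    # Hypothetical baseline recorded from a previous accepted run.
    assert calibration_gate(confidences, correct, baseline=0.08)
```

Run under pytest in CI, a failing assertion here fails the job, which is one way a merge could be blocked on a calibration regression. Lower Brier scores are better, so the gate compares against an upper bound.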
pytest · pytest-asyncio · mypy strict · bandit · pip-audit · GitHub Actions
pytest · RAGAS · LLM-as-judge · Brier score calibration
pytest · pytest-asyncio · Pact v3 · Locust · GitHub Actions
Playwright · TypeScript · 4 viewport breakpoints
rel=noopener enforcement, mailto format validation
pytest · dbt tests · Vitest · API integration tests
Writing tests for codebases I didn't build: the clearest signal of test-first thinking.
Typed exception mapping for BaseLLMHTTPHandler._handle_error on the Anthropic Messages API. Six tests cover RateLimitError, ContextWindowExceededError, AuthenticationError, InternalServerError, no-model backward compatibility, and pass-through of already-typed exceptions. Eliminates silent failures in Router fallback chains.
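The idea behind typed exception mapping can be sketched as follows. This is a standalone illustration, not the contributed code and not the library's API: the exception classes are local stand-ins that only borrow the names listed above, and map_error, its status-code rules, and the "prompt is too long" message check are assumptions made for the example.

```python
from typing import Optional


class LLMError(Exception):
    """Hypothetical base class for typed provider errors."""

    def __init__(self, message: str, status_code: Optional[int] = None):
        super().__init__(message)
        self.status_code = status_code


class AuthenticationError(LLMError):
    pass


class RateLimitError(LLMError):
    pass


class ContextWindowExceededError(LLMError):
    pass


class InternalServerError(LLMError):
    pass


def map_error(error: Exception, status_code: Optional[int] = None) -> LLMError:
    """Convert a raw error into a typed one so callers can branch on its class."""
    # Already-typed errors pass through unchanged.
    if isinstance(error, LLMError):
        return error
    message = str(error)
    if status_code == 401:
        return AuthenticationError(message, status_code)
    if status_code == 429:
        return RateLimitError(message, status_code)
    # Assumed heuristic: an oversized prompt arrives as a 400 with this phrase.
    if status_code == 400 and "prompt is too long" in message.lower():
        return ContextWindowExceededError(message, status_code)
    if status_code is not None and status_code >= 500:
        return InternalServerError(message, status_code)
    # Anything unrecognized still gets a typed wrapper instead of failing silently.
    return LLMError(message, status_code)
```

With errors typed this way, a fallback chain can catch RateLimitError and retry on another deployment while letting AuthenticationError surface immediately, instead of treating every failure as the same generic exception.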
AI companies and SaaS companies with LLM products are a particularly strong fit (direct experience designing LLM evaluation frameworks). Open to: QA Automation Engineer, SDET, Software Engineer in Test, Test Automation Engineer.
US-based (Cathedral City, CA) · Canadian citizen, no sponsorship required · $35-50/hr or $70-90K salaried
caymanroden@gmail.com