AI/LLM Developer
Shipped a Claude-powered lead qualification platform processing 500+ real leads with zero downtime. Built production RAG with 94.6% accuracy on adversarial eval. Published an MCP server toolkit to PyPI. Contributing to LiteLLM (27K+ stars). 9,956+ automated tests across production repos.
Production AI infrastructure — from retrieval pipelines to multi-agent systems
Hybrid retrieval (BM25 + cosine + RRF), citation-aware answers, agentic ReAct reasoning, semantic caching (88% hit rate), 28-fixture adversarial eval suite with prompt injection defense. CI regression gate at 94.6% accuracy.
Domain-specific agent mesh with ReAct orchestration, 3-tier cache (L1 memory, L2 Redis, L3 PostgreSQL), per-agent model routing (Haiku/Sonnet/Opus), circuit-breaker failover, human handoff protocols. MCP server toolkit published to PyPI.
Golden eval suites with RAGAS scoring, LLM-as-judge CI gates, adversarial fixtures (prompt injection, data exfiltration, roleplay override). Metrics include Levenshtein similarity, Brier score calibration, and field-level weighted accuracy.
3 Claude-powered SMS bots handling lead qualification for a real estate firm. 500+ leads processed, under 500ms response time, bilingual EN/ES, zero downtime over 3-month production run.
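Two of the eval metrics named above, Brier score and Levenshtein similarity, can be sketched in a few lines. This is a minimal illustration, not the code used in these projects: the function names are hypothetical, and normalizing edit distance by the longer string's length is an assumed convention.

```python
def brier_score(probs, outcomes):
    """Mean squared gap between predicted probabilities and 0/1 outcomes.

    0.0 is perfectly calibrated and confident; lower is better.
    """
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) via a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if chars match
            ))
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

A field-level check might compare an extracted value against its golden value with `levenshtein_similarity` and pass when the score clears a chosen threshold.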
Async document processing with hybrid retrieval, citation-aware answers, and agentic ReAct reasoning. 94.6% extraction accuracy on 28-fixture golden eval (12 adversarial cases including 4 prompt injection attacks).
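For readers unfamiliar with the fusion step in hybrid retrieval, here is a minimal sketch of reciprocal rank fusion (RRF) over a keyword ranking and a vector ranking. It is illustrative only: the function name and document IDs are hypothetical, and `k=60` is the commonly used default rather than a value taken from this project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a BM25 index and a cosine-similarity index.
bm25_ranking = ["a", "b", "c"]
dense_ranking = ["b", "c", "a"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Because RRF uses only rank positions, it avoids having to normalize BM25 scores against cosine similarities, which live on different scales.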
Domain-specific agent mesh with 3-tier cache achieving 88% aggregate hit rate. 8 agent capabilities, circuit-breaker failover, per-agent model routing, OWASP-hardened security, and OpenTelemetry instrumentation.
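The tiered lookup can be pictured as a read-through cache that checks the fastest tier first and promotes hits upward. The sketch below is an assumption-laden illustration: plain dicts stand in for the Redis and PostgreSQL tiers, the `TieredCache` class is hypothetical, and "aggregate hit rate" is taken to mean hits at any tier divided by total lookups.

```python
class TieredCache:
    """Read-through cache over tiers ordered fastest to slowest."""

    def __init__(self, tiers):
        self.tiers = tiers               # e.g. [memory, redis, postgres] stand-ins
        self.hits = [0] * len(tiers)     # per-tier hit counters
        self.misses = 0

    def get(self, key):
        for level, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                # Promote the value into every faster tier for next time.
                for faster in self.tiers[:level]:
                    faster[key] = value
                self.hits[level] += 1
                return value
        self.misses += 1
        return None

    def set(self, key, value):
        # Write-through: populate every tier.
        for tier in self.tiers:
            tier[key] = value

    def hit_rate(self):
        """Aggregate hit rate: hits at any tier over all lookups."""
        total = sum(self.hits) + self.misses
        return sum(self.hits) / total if total else 0.0


l1, l2, l3 = {}, {}, {"q": "cached answer"}
cache = TieredCache([l1, l2, l3])
first = cache.get("q")     # served from the slowest tier, then promoted
second = cache.get("q")    # now served from the fastest tier
missing = cache.get("unknown")
```

A real deployment would also need TTLs and invalidation per tier, which this sketch omits.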
9 pre-built MCP servers with A2A adapter, auto-caching, rate limiting, auth middleware. MCPTestClient for testing without live API keys. Reduces LLM tool integration from days to a single import.
PR #24551 -- Surfaces AuthenticationError, RateLimitError, and NotFoundError distinctly through the Router fallback chain instead of swallowing them as a generic Exception. This lets callers implement an appropriate recovery strategy for each error type.
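To illustrate why distinct error types matter to callers, here is a small self-contained sketch. It is not the code from the PR or the library's Router: the exception classes are local stand-ins that only share names with the real ones, and `run_fallback_chain` and `choose_recovery` are hypothetical helpers.

```python
class AuthenticationError(Exception):
    """Stand-in for a provider rejecting credentials."""


class RateLimitError(Exception):
    """Stand-in for a provider throttling requests."""


class NotFoundError(Exception):
    """Stand-in for an unknown model or deployment."""


TYPED_ERRORS = (AuthenticationError, RateLimitError, NotFoundError)


def run_fallback_chain(calls):
    """Try each callable in order; if all fail, re-raise the last typed error
    unchanged rather than wrapping it in a generic Exception."""
    last_error = None
    for call in calls:
        try:
            return call()
        except TYPED_ERRORS as exc:
            last_error = exc  # remember the typed error and try the next one
    if last_error is None:
        raise ValueError("fallback chain is empty")
    raise last_error


def choose_recovery(error):
    """Map an error type to a recovery strategy label (illustrative values)."""
    if isinstance(error, RateLimitError):
        return "retry_with_backoff"
    if isinstance(error, AuthenticationError):
        return "refresh_credentials"
    if isinstance(error, NotFoundError):
        return "switch_model"
    return "fail"


def bad_key():
    raise AuthenticationError("invalid API key")


def throttled():
    raise RateLimitError("too many requests")


try:
    run_fallback_chain([bad_key, throttled])
    strategy = None
except Exception as exc:
    strategy = choose_recovery(exc)
```

With a generic Exception the caller could only pick `"fail"`; with the typed error preserved, the throttled case maps to a backoff retry.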
Also: open PRs in FastAPI (80K+ stars, #15217) and pgvector-python (#151)
7 AI/LLM certifications · 439 hours · IBM, DeepLearning.AI, Microsoft, Duke, Google, Anthropic
Targeting teams building production LLM applications, RAG systems, agentic AI, and developer tooling. Open to: AI/LLM Developer, MLOps Engineer, Agentic AI Engineer, AI Platform Engineer.
US-based (Cathedral City, CA) · Canadian citizen, no sponsorship required
caymanroden@gmail.com