Case Studies

Real engineering challenges I solved, with quantified outcomes you can verify in the code. Each case study maps to a public repo.

Verified in Code LLMOps View Repo

89% Token Cost Reduction Across a 3-Bot AI Platform

Challenge

A real estate AI platform with 3 specialized chatbots (lead qualification, buyer matching, seller advisory) was consuming 93,000 tokens per workflow. Each bot needed conversation context, system prompts, user data, and market intelligence — all sent on every API call. Token costs were scaling linearly with conversation volume.

Solution

Built a 3-tier caching system (L1 in-memory, L2 Redis with TTL, L3 PostgreSQL fallback), context window optimization that sends only relevant turns instead of full history (2.3x efficiency), and model routing by task complexity (TaskComplexity enum routes ROUTINE tasks to faster/cheaper models).

FastAPI Claude AI Redis PostgreSQL

Result

93K → 7.8K

Tokens per workflow

87%

Cache hit rate

<200ms

Orchestrator overhead

Verify: services/claude_orchestrator.py (cache layers), core/llm_client.py (TaskComplexity routing). Full benchmarks →
Verified in Code Multi-Agent AI View Repo

3-Bot Cross-Agent Handoff System with Safeguards

Challenge

Three specialized chatbots needed to seamlessly transfer conversations based on detected intent. A lead saying "I want to buy a house" to the lead bot needs to end up with the buyer bot — with full conversation context preserved. Initial implementation had circular handoff loops, race conditions when two bots tried to claim the same contact, and no way to learn from handoff outcomes.

Solution

Built JorgeHandoffService with 5 safeguards: 0.7 confidence threshold (tested against 200+ transcripts), circular prevention (30-minute window blocks same source→target), rate limiting (3/hr, 10/day per contact), contact-level locking (prevents concurrent handoff race conditions), and pattern learning (dynamic threshold adjustment after 10+ data points). GHL-enhanced intent decoders boost scoring with CRM data (tags, lead age, engagement recency).

FastAPI Claude AI GoHighLevel A/B Testing

Result

279

Automated tests

5

Handoff safeguards

<500ms

Handoff P95 latency

Verify: services/jorge/jorge_handoff_service.py (all safeguards), agents/jorge_*_bot.py (public APIs). Full benchmarks →
Verified in Code Multi-Agent AI View Repo

3-Bot Lead Qualification Platform with Cross-Bot Handoff

Challenge

40% lead loss from slow manual response times averaging over 15 minutes. Leads requiring different expertise (buying vs. selling) were stuck with a single generalist response, causing drop-off and missed conversion opportunities.

Solution

Built a 3-bot system (Lead/Buyer/Seller) with cross-bot handoff orchestrated by JorgeHandoffService. Integrated A/B testing for response strategies, GHL CRM integration for real-time lead data, and intent decoding with GHL score boosts (tag analysis, lead age, engagement recency). Shared services layer provides consistent metrics, alerting, and performance tracking across all bots.

Python FastAPI PostgreSQL Redis Claude API GoHighLevel

Result

279

Tests passing

0.7

Confidence threshold

3/hr, 10/day

Rate limiting

Verify: services/jorge/jorge_handoff_service.py (circular prevention, rate limiting), services/jorge/ab_testing_service.py (deterministic variant assignment).
Verified in Code Automation View Repo

AI-Powered Prospecting Pipeline with Security-First Design

Challenge

15+ hours per week spent on manual prospecting and proposal writing. No systematic way to evaluate job fit, qualify opportunities, or detect prompt injection attacks in AI-generated content pipelines.

Solution

Built 3 integrated products: an AI job scanner with a 105-point scoring rubric for automated qualification, a 4-agent proposal pipeline (Prospecting, Credential Sync, Proposal Architect, Engagement) for tailored proposal generation, and a prompt injection tester with 60+ attack patterns across 8 MITRE ATLAS threat categories. Includes a RAG Cost Optimizer for token budget management.

Python FastAPI Claude API BeautifulSoup Pandas

Result

240

Tests passing

105-pt

Scoring rubric

60+

Injection patterns

Verify: product_1_launch_kit/ (injection tester), product_2_rag_cost_optimizer/ (cost optimization), product_3_agent_orchestrator/ (proposal pipeline).

Capability Demonstrations

These projects demonstrate what the systems can do. Each is a fully functional, tested application you can clone and run.

Capability Demo RAG View Repo

Document Q&A with Hybrid Retrieval and Source Citations

A RAG system that ingests PDFs, DOCX, and text documents, then answers questions with cited sources. Uses hybrid retrieval (BM25 keyword search + dense vector similarity + Reciprocal Rank Fusion) to find relevant passages. Includes a prompt engineering lab for A/B testing answer quality and per-query cost tracking.

94

Automated tests

3

Retrieval methods

Mock Mode

No API keys needed

Capability Demo Data Analytics View Repo

CSV Upload to Instant Dashboards, Attribution, and Predictions

Upload a CSV or Excel file and get auto-profiled data, interactive Plotly dashboards, marketing attribution (first-touch, last-touch, linear, time-decay), predictive modeling with SHAP explanations, automated data cleaning, and one-click PDF reports. Three demo datasets included (e-commerce, marketing touchpoints, HR attrition).

63

Automated tests

4

Attribution models

6

Modules

Capability Demo Automation View Repo

End-to-End Automation: Job Scanning, Proposal Generation, Security Testing

An automated pipeline that scans job listings with a 105-point scoring rubric, generates tailored proposals via a 4-agent pipeline (Prospecting, Credential Sync, Proposal Architect, Engagement), and includes a prompt injection testing suite with 60+ detection patterns across 8 MITRE ATLAS threat categories.

240

Automated tests

105-pt

Scoring rubric

60+

Injection patterns