89% Token Cost Reduction Across a 3-Bot AI Platform
Challenge
A real estate AI platform with 3 specialized chatbots (lead qualification, buyer matching, seller advisory) was consuming 93,000 tokens per workflow. Each bot needed conversation context, system prompts, user data, and market intelligence — all sent on every API call. Token costs were scaling linearly with conversation volume.
Solution
Built a 3-tier caching system (L1 in-memory, L2 Redis with TTL, L3 PostgreSQL fallback), context window optimization that sends only relevant turns instead of full history (2.3x efficiency), and model routing by task complexity (TaskComplexity enum routes ROUTINE tasks to faster/cheaper models).
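The tiered lookup described above can be sketched as a read-through cache. This is a minimal illustration, not the platform's code: the `TieredCache` class, its method names, and the 300-second default TTL are assumptions, and plain dictionaries stand in for Redis (L2) and PostgreSQL (L3) so the example runs on its own.

```python
import time
from typing import Callable, Optional


class TieredCache:
    """Read-through cache: L1 memory -> L2 TTL store -> L3 fallback loader."""

    def __init__(self, l3_loader: Callable[[str], Optional[str]],
                 l2_ttl_seconds: float = 300.0):
        self._l1: dict[str, str] = {}                # in-process memory
        self._l2: dict[str, tuple[float, str]] = {}  # stand-in for Redis: (expires_at, value)
        self._l3_loader = l3_loader                  # stand-in for a PostgreSQL query
        self._l2_ttl = l2_ttl_seconds                # hypothetical default TTL
        self.hits = {"l1": 0, "l2": 0, "l3": 0, "miss": 0}

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        # L1: cheapest lookup, no expiry in this sketch.
        if key in self._l1:
            self.hits["l1"] += 1
            return self._l1[key]
        # L2: honor the TTL, then promote the value into L1.
        entry = self._l2.get(key)
        if entry is not None and entry[0] > now:
            self.hits["l2"] += 1
            self._l1[key] = entry[1]
            return entry[1]
        # L3: fall back to the durable store and backfill both upper tiers.
        value = self._l3_loader(key)
        if value is None:
            self.hits["miss"] += 1
            return None
        self.hits["l3"] += 1
        self._l2[key] = (now + self._l2_ttl, value)
        self._l1[key] = value
        return value

    def evict_l1(self, key: str) -> None:
        """Drop a key from L1 only (for example, under memory pressure)."""
        self._l1.pop(key, None)
```

Usage: `TieredCache({"market:austin": "median price data"}.get).get("market:austin")` loads from the fallback once and serves later reads from memory. Tracking hits per tier is one way a cache hit rate could be measured.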
FastAPI, Claude AI, Redis, PostgreSQL
Result
Tokens per workflow: 93K → 7.8K
Cache hit rate: 87%
Orchestrator overhead: <200ms
Verify: services/claude_orchestrator.py (cache layers), core/llm_client.py (TaskComplexity routing). Full benchmarks →
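Routing by task complexity might look roughly like the sketch below. Only the `TaskComplexity` enum and its `ROUTINE` member come from the description above; the other members, the model names, and `select_model` are hypothetical placeholders rather than the contents of core/llm_client.py.

```python
from enum import Enum


class TaskComplexity(Enum):
    ROUTINE = "routine"    # named in the case study
    STANDARD = "standard"  # hypothetical tier, for illustration
    COMPLEX = "complex"    # hypothetical tier, for illustration


# Placeholder model names: substitute whatever models the deployment uses.
MODEL_BY_COMPLEXITY = {
    TaskComplexity.ROUTINE: "fast-cheap-model",
    TaskComplexity.STANDARD: "balanced-model",
    TaskComplexity.COMPLEX: "most-capable-model",
}


def select_model(complexity: TaskComplexity,
                 default: str = "balanced-model") -> str:
    """Return the model for a task tier, falling back to a default."""
    return MODEL_BY_COMPLEXITY.get(complexity, default)
```

Keeping the mapping in one table means a model swap is a one-line change, and every call site only has to classify the task.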
Challenge
Three specialized chatbots needed to transfer conversations seamlessly based on detected intent. A lead who tells the lead bot "I want to buy a house" needs to end up with the buyer bot, with the full conversation context preserved. The initial implementation had circular handoff loops, race conditions when two bots tried to claim the same contact, and no way to learn from handoff outcomes.
Solution
Built JorgeHandoffService with 5 safeguards: 0.7 confidence threshold (tested against 200+ transcripts), circular prevention (30-minute window blocks same source→target), rate limiting (3/hr, 10/day per contact), contact-level locking (prevents concurrent handoff race conditions), and pattern learning (dynamic threshold adjustment after 10+ data points). GHL-enhanced intent decoders boost scoring with CRM data (tags, lead age, engagement recency).
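Three of the five safeguards (confidence threshold, circular prevention, and rate limiting) can be sketched as a single gate. This is an illustration under stated assumptions, not the code in JorgeHandoffService: the `HandoffGuard` class, its `check` method, and the reason strings are hypothetical, the threshold is assumed to be inclusive (a confidence of exactly 0.7 passes), and contact-level locking and pattern learning are left out.

```python
from collections import defaultdict

CONFIDENCE_THRESHOLD = 0.7     # from the case study
CIRCULAR_WINDOW_S = 30 * 60    # 30-minute window
MAX_PER_HOUR = 3
MAX_PER_DAY = 10


class HandoffGuard:
    """Decides whether a bot-to-bot handoff is allowed for a contact."""

    def __init__(self) -> None:
        # contact_id -> list of (timestamp_seconds, source_bot, target_bot)
        self._history: dict[str, list[tuple[float, str, str]]] = defaultdict(list)

    def check(self, contact_id: str, source: str, target: str,
              confidence: float, now: float) -> tuple[bool, str]:
        if confidence < CONFIDENCE_THRESHOLD:
            return False, "low_confidence"
        events = self._history[contact_id]
        # Block a repeat of the same source -> target inside the window.
        if any(s == source and t == target and now - ts < CIRCULAR_WINDOW_S
               for ts, s, t in events):
            return False, "circular"
        # Per-contact rate limits over sliding one-hour and one-day windows.
        if sum(1 for ts, _, _ in events if now - ts < 3600) >= MAX_PER_HOUR:
            return False, "hourly_limit"
        if sum(1 for ts, _, _ in events if now - ts < 86400) >= MAX_PER_DAY:
            return False, "daily_limit"
        events.append((now, source, target))
        return True, "ok"
```

Usage: `HandoffGuard().check("c1", "lead", "buyer", confidence=0.9, now=0.0)` returns `(True, "ok")`. Repeating the same call 60 seconds later on the same guard returns `(False, "circular")`.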
FastAPI, Claude AI, GoHighLevel, A/B Testing
Result
Automated tests: 279
Handoff safeguards: 5
Handoff P95 latency: <500ms
Verify: services/jorge/jorge_handoff_service.py (all safeguards), agents/jorge_*_bot.py (public APIs). Full benchmarks →
3-Bot Lead Qualification Platform with Cross-Bot Handoff
Challenge
40% lead loss from slow manual response times averaging over 15 minutes. Leads requiring different expertise (buying vs. selling) were stuck with a single generalist response, causing drop-off and missed conversion opportunities.
Solution
Built a 3-bot system (Lead/Buyer/Seller) with cross-bot handoff orchestrated by JorgeHandoffService. Integrated A/B testing for response strategies, GHL CRM integration for real-time lead data, and intent decoding with GHL score boosts (tag analysis, lead age, engagement recency). Shared services layer provides consistent metrics, alerting, and performance tracking across all bots.
AI-Powered Prospecting Pipeline with Security-First Design
Challenge
15+ hours per week spent on manual prospecting and proposal writing. No systematic way to evaluate job fit, qualify opportunities, or detect prompt injection attacks in AI-generated content pipelines.
Solution
Built 3 integrated products: an AI job scanner with a 105-point scoring rubric for automated qualification, a 4-agent proposal pipeline (Prospecting, Credential Sync, Proposal Architect, Engagement) for tailored proposal generation, and a prompt injection tester with 60+ attack patterns across 8 MITRE ATLAS threat categories. Includes a RAG Cost Optimizer for token budget management.
Document Q&A with Hybrid Retrieval and Source Citations
A RAG system that ingests PDFs, DOCX, and text documents, then answers questions with cited sources. Uses hybrid retrieval (BM25 keyword search + dense vector similarity + Reciprocal Rank Fusion) to find relevant passages. Includes a prompt engineering lab for A/B testing answer quality and per-query cost tracking.
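Reciprocal Rank Fusion merges the BM25 and dense rankings using rank positions only, so the two retrievers' raw scores never need to be put on a common scale. The sketch below shows the standard formula, score(d) = Σ 1 / (k + rank), with the commonly used constant k = 60. The function name and document IDs are illustrative, and the project's own fusion code may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]],
                           k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs; each list is ordered best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_hits = ["d1", "d2", "d3"]    # keyword ranking (example data)
dense_hits = ["d3", "d1", "d4"]   # vector-similarity ranking (example data)
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print([doc_id for doc_id, _ in fused])  # ['d1', 'd3', 'd2', 'd4']
```

Documents that appear in both lists (d1, d3) rise above those found by only one retriever, which is the behavior hybrid retrieval relies on.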
CSV Upload to Instant Dashboards, Attribution, and Predictions
Upload a CSV or Excel file and get auto-profiled data, interactive Plotly dashboards, marketing attribution (first-touch, last-touch, linear, time-decay), predictive modeling with SHAP explanations, automated data cleaning, and one-click PDF reports. Three demo datasets included (e-commerce, marketing touchpoints, HR attrition).
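The four attribution models differ only in how they split one conversion's credit across the touchpoints that preceded it. Here is a minimal sketch, assuming touchpoints arrive oldest first as (channel, days before conversion) pairs. The `attribute` function and the 7-day half-life for time decay are illustrative assumptions, not the tool's actual parameters.

```python
def attribute(touchpoints: list[tuple[str, float]], model: str,
              half_life_days: float = 7.0) -> dict[str, float]:
    """Split one conversion's credit (summing to 1.0) across channels."""
    if not touchpoints:
        return {}
    n = len(touchpoints)
    if model == "first_touch":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "last_touch":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "linear":
        weights = [1.0 / n] * n
    elif model == "time_decay":
        # Credit halves for every half_life_days before the conversion.
        raw = [0.5 ** (days / half_life_days) for _, days in touchpoints]
        total = sum(raw)
        weights = [w / total for w in raw]
    else:
        raise ValueError(f"unknown attribution model: {model}")
    credit: dict[str, float] = {}
    for (channel, _), weight in zip(touchpoints, weights):
        credit[channel] = credit.get(channel, 0.0) + weight
    return credit


journey = [("email", 14.0), ("search", 7.0), ("ads", 0.0)]  # example data
for name in ("first_touch", "last_touch", "linear", "time_decay"):
    print(name, attribute(journey, name))
```

For this journey, time decay with a 7-day half-life gives email, search, and ads weights of 1/7, 2/7, and 4/7, while linear gives each channel one third.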
An automated pipeline that scans job listings with a 105-point scoring rubric, generates tailored proposals via a 4-agent pipeline (Prospecting, Credential Sync, Proposal Architect, Engagement), and includes a prompt injection testing suite with 60+ detection patterns across 8 MITRE ATLAS threat categories.