About

I build production AI systems — not prototypes, not demos, but software that runs with CI, tests, and monitoring.

Who I Am

I'm an AI systems engineer based in the Inland Empire, California. I specialize in LLM cost optimization, multi-agent orchestration, and RAG pipeline engineering. My work focuses on taking AI from "it works in a notebook" to "it runs in production with 7,800+ tests and 11 CI pipelines."

I hold 19 certifications from Vanderbilt, IBM, Google (including Google Cloud), Microsoft, Duke, Meta, DeepLearning.AI, and the University of Michigan, covering deep learning, generative AI, LLMOps, data analytics, business intelligence, and marketing. I've logged 1,768 hours of structured coursework.

What I Specialize In

Strongest Differentiator

LLM Cost Engineering & Multi-Agent Orchestration

I reduced token consumption from 93K to 7.8K per workflow (about 92%) using 3-tier caching, context window optimization, and model routing by task complexity. I've built a 3-bot system with confidence-based handoff, circular-handoff prevention, rate limiting, and A/B testing of response strategies.
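To make confidence-based handoff with a circular guard concrete, here is a minimal sketch rather than the production code. The bot names, the 0.7 threshold, and the `Conversation` and `route` names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff; a real system would tune this


@dataclass
class Conversation:
    current_bot: str
    visited: list = field(default_factory=list)  # bots that already handed off


def route(conv, scores):
    """Return the bot that should handle the next turn.

    A handoff happens only when another bot clears the threshold, beats the
    current bot's score, and has not already handled this conversation.
    The last condition is the guard against circular handoffs.
    """
    candidates = {
        bot: score
        for bot, score in scores.items()
        if bot != conv.current_bot
        and bot not in conv.visited
        and score >= CONFIDENCE_THRESHOLD
    }
    if not candidates:
        return conv.current_bot  # nobody qualifies, so stay put
    target = max(candidates, key=candidates.get)
    if candidates[target] <= scores.get(conv.current_bot, 0.0):
        return conv.current_bot  # current bot is at least as confident
    conv.visited.append(conv.current_bot)
    conv.current_bot = target
    return target


# Hypothetical bots and scores
conv = Conversation(current_bot="intake")
first = route(conv, {"intake": 0.3, "sales": 0.9, "support": 0.4})
# "intake" scores highest here, but it already handed off, so no bounce back
second = route(conv, {"intake": 0.95, "sales": 0.2, "support": 0.5})
```

In this sketch the first call hands off to `sales`, and the second call stays on `sales` because the only bot above the threshold has already been visited. A stricter or looser guard (for example, allowing a return after a cooldown) is a design choice the sketch does not cover.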

RAG Pipeline Engineering

Three separate repos demonstrate different facets of RAG: hybrid retrieval (BM25 and dense search merged with Reciprocal Rank Fusion), source citation with page numbers, a prompt engineering lab with A/B testing, and per-query cost tracking. All run without API keys in mock/demo mode.
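For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the general technique, not the code from the repos. Each document's fused score is the sum of 1 / (k + rank) over the ranked lists it appears in. The document IDs are made up, and k = 60 is the commonly used default, assumed here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) across the lists that contain
    it, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical keyword results
dense_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical embedding results
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that rank well in both lists rise to the top, which is why `doc_b` edges out `doc_a` here. Because the method uses only ranks, BM25 and dense similarity scores never need to be put on a common scale.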

AI-Powered Business Process Automation

Full automation pipelines from web scraping to AI-powered analysis: YAML-configurable scrapers with change detection, price monitoring with alerts, Excel-to-Streamlit CRUD app generation, SEO content scoring, and 4-agent proposal generation with 105-point job scoring.
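One common way to implement change detection in a scraper is to fingerprint each page's text and compare it with the last stored value. The sketch below assumes that approach; the function names, the in-memory `seen` dictionary, and the example URL are hypothetical stand-ins, and a real pipeline would persist fingerprints in a database.

```python
import hashlib


def fingerprint(text):
    """Hash page text after collapsing whitespace, so layout noise is ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(url, text, seen):
    """Return True if the text for url differs from the stored fingerprint.

    A URL seen for the first time counts as changed. The store is updated
    on every call.
    """
    new_hash = fingerprint(text)
    changed = seen.get(url) != new_hash
    seen[url] = new_hash
    return changed


seen = {}  # stand-in for persistent storage
url = "https://example.com/item"  # placeholder URL
first = has_changed(url, "Price: $10", seen)           # first sighting
same = has_changed(url, "Price:   $10\n", seen)        # whitespace only
updated = has_changed(url, "Price: $12", seen)         # real change
```

Here `first` and `updated` are true while `same` is false, since whitespace differences are normalized away before hashing. A price alert would fire only on the calls that return true.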

How I Work

1. Discovery

Understand your data, existing systems, constraints, and what "done" looks like. No scope creep — explicit deliverables before code starts.

2. Architecture

Design the system: data flow, component boundaries, API contracts, caching strategy, and deployment plan. You review before implementation.

3. TDD Build

Test-driven development: tests first, then implementation, then refactor. CI runs on every push. You see progress in real time via GitHub.

4. Delivery

Deployed with Docker, documented with examples, demo mode included. Handoff includes architecture docs, test coverage report, and a walkthrough call.

Ideal Client Profile

Great Fit

  • You have a working product and need AI/LLM integration
  • Your LLM costs are growing faster than revenue
  • You need multiple AI agents to coordinate without losing context
  • You want RAG over your internal documents with citations
  • You need data pipelines that auto-generate dashboards or reports

Not the Best Fit

  • You need a mobile app or frontend-heavy SPA
  • You need DevOps/infrastructure only (no AI component)
  • You want a no-code/low-code solution
  • Your project requires fine-tuning foundation models

Stack

AI / LLM

Claude API, Gemini, OpenAI, Perplexity, LangChain, BM25, Vector Search, SHAP, XGBoost, scikit-learn

Backend

Python, FastAPI, PostgreSQL, Redis, SQLite, Alembic, Pydantic, Docker, GitHub Actions

Data / BI

Streamlit, Plotly, Pandas, NumPy, BeautifulSoup, httpx, PyPDF2, python-docx

Let's Talk

Open to full-time AI/ML roles, contract work, and fractional AI engineering.

View services & pricing →

View all 19 certifications →