Enterprise LLM Inference WebUI
LLM Ops
Extensible interface for prompt storage, system prompt control, and on-the-fly local model loading with token/s logs.
- Local + edge deployment ready
- Configurable providers and quantized models
Machine Learning Engineer
Hands-on with model fine-tuning, evaluation, retrieval, and scalable deployment. Strong Python/C++ engineering background and product mindset.
A few focused projects that reflect how I approach ML/AI in production.
Extensible interface for prompt storage, system prompt control, and on-the-fly local model loading with token/s logs.
Optimized retrieval pipelines (Ollama + MCP) with context-length & model-size trade-offs for fast, grounded responses.
Led Agile ceremonies; shipped a functional search prototype with clean API boundaries and a lightweight UI.
Python/NumPy/Pandas workflows for preprocessing, regression, and classification with clear visualizations.
Recent roles and impact.
State of California, Department of Toxic Substances Control
Scale AI
DataAnnotation
Tech I use daily.
What I’m currently exploring and notes worth sharing.
Aug 2025
Designing eval sets for 32k+ tokens; latency/quality trade-offs with summaries + reranking.
Aug 2025
Collecting retrieval diagnostics, answer faithfulness metrics, and user feedback loops.
Jul 2025
Exploring 4-bit quantization + cache sharing for fast local inference.
Open to ML/Software Engineering roles and interesting problems at the intersection of product and AI.
West Sacramento, CA · +1 (916) 370-6391