Machine Learning Engineer

I build fast, reliable LLM & RAG systems from prototype to production.

Hands-on with model fine-tuning, evaluation, retrieval, and scalable deployment. Strong Python/C++ engineering background and a product mindset.

See my work · Email me: keatum@gmail.com · View résumé · Open booking page
LLMs · RAG · Python · PyTorch · TensorFlow · Docker · Kubernetes · C++ · CI/CD

Selected Work

A few focused projects that reflect how I approach ML/AI in production.

Enterprise LLM Inference WebUI

LLM Ops

Extensible interface for prompt storage, system prompt control, and on-the-fly local model loading, with tokens-per-second logging.

  • Local + edge deployment ready
  • Configurable providers and quantized models

RAG/CAG Stack & Local Inference

Retrieval

Optimized retrieval pipelines (Ollama + MCP) with context-length & model-size trade-offs for fast, grounded responses.

  • Latency-aware reranking
  • Observability for answer quality

IMDb Search Prototype (API + UI)

APIs

Led Agile ceremonies; shipped a functional search prototype with clean API boundaries and a lightweight UI.

Analytics & Visualization Notebooks

Data

Python/NumPy/Pandas workflows for preprocessing, regression, and classification with clear visualizations.

Experience

Recent roles and impact.

Information Technology Associate — Solutions Engineer

Jul 2024 – Mar 2025

State of California, Department of Toxic Substances Control

  • Implemented and integrated an enterprise-scale GCP DocAI solution to digitize and process 2 to 3 million files.
  • Delivered across the full SDLC: analysis, design, implementation, testing, and documentation.
  • Performed data modeling and in-depth analysis to guide system improvements.

Software Development — AI Trainer

Feb 2024 – Present

Scale AI

  • Stress-tested LLMs to surface failure modes and improve code-gen reliability.
  • Produced high-quality reference solutions across CRUD apps and auth patterns.

AI Trainer — CS/ML/Math

Feb 2024 – Present

DataAnnotation

  • Red-teamed models for safe behavior; corrected physics simulations and concepts.
  • Authored reference answers to boost model performance in CS/ML domains.

Core Skills

Tech I use daily.

Python · C++ · PyTorch · TensorFlow · SQL/NoSQL · Docker · Kubernetes · Git · CI/CD · GPU · REST APIs

Learning log

What I’m currently exploring, plus notes worth sharing.

Aug 2025

Long-context LLM evals

Designing eval sets for 32k+ tokens; latency/quality trade-offs with summaries + reranking.

Aug 2025

RAG observability

Collecting retrieval diagnostics, answer faithfulness metrics, and user feedback loops.

Jul 2025

Quantization + KV cache tricks

Exploring 4-bit quantization and KV cache sharing for fast local inference.

Open to ML/Software Engineering roles and interesting problems at the intersection of product and AI.

West Sacramento, CA · +1 (916) 370-6391