Training

Practitioner-grade AI infrastructure education

Built and taught by the engineers running the systems — not career trainers.

Production LLM Engineering

5 days

Intermediate → Advanced

End-to-end: routing, serving compilers, KV-cache, streaming, evals, cost telemetry. Hands-on with vLLM, TensorRT-LLM, and Triton on real models.

Stand up a production-grade inference endpoint
Tune throughput and latency with compilers and batching
Wire cost-per-call telemetry that holds up in finance review

GPU Performance & Cost

3 days

Intermediate

Bin-packing, spot/on-demand mixing, preemption-safe training, and the dashboards that actually move utilization. Less waste, same hardware.

Diagnose under-utilization in a real cluster
Mix spot, on-demand, and reserved capacity safely
Build cost and utilization dashboards by team and workload

RAG & Retrieval at Scale

3 days

All levels

Chunking, embedding choice, hybrid retrieval, re-ranking, eval design, and the pipelines to keep all of it fresh. Tuned to product metrics, not vanity benchmarks.

Design a retrieval system tied to a product KPI
Operate hybrid (lexical + semantic) retrieval
Build an eval harness that gates deploys

Building Agentic Systems

4 days

Intermediate → Advanced

Tool execution, sandboxing, durable workflow state, per-step evals, and budget enforcement. The boring engineering that lets agents ship.

Design and operate a durable agent runtime
Sandbox tool execution safely
Add per-step evals and budget caps that hold

Formats

Delivered the way your team learns

Private cohort

On-site or remote, scoped to your team's stack and workload. Engineers leave with code, not slides.

Public bootcamp

Mixed-organization cohorts, scheduled quarterly with limited seats.

Custom curriculum

Co-developed for product, platform, or research teams with specific goals.

Request a training proposal →

training@vertotech.io