LLM Inference Visualizer

Vanilla

See what happens inside a large language model as it thinks. Vanilla visualizes every step of inference in real time, including token probabilities, attention patterns, layer activations, residual flows, and parameter space, across four modes that range from deep research to kid-friendly animations. v1.0.4 adds VLM image input, model discovery, Metal-accelerated heatmaps, animated transitions, and error recovery.

Expert Mode — Token probability analysis, attention insights, and layer-by-layer breakdown

Features

Four modes. One vision.

From deep research to a child's first encounter with AI — every perspective is designed.

01

Expert Mode

Full analysis dashboard: Top-K probability bars, attention heatmaps, Logit Lens layer-by-layer predictions, residual flow magnitudes, GDN trends, and parameter space statistics.
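As a rough illustration of what a Top-K probability bar represents, the sketch below turns one step's raw logits into probabilities with a softmax and keeps the k most likely tokens. The `topK` helper, its signature, and the use of plain `Double` arrays are assumptions made for this example; they are not Vanilla's API.

```swift
import Foundation

/// Hypothetical helper (not part of Vanilla): converts raw logits into
/// probabilities and returns the k most likely token IDs, highest first.
func topK(logits: [Double], k: Int) -> [(tokenID: Int, probability: Double)] {
    guard let maxLogit = logits.max(), k > 0 else { return [] }
    // Subtract the largest logit before exponentiating for numerical stability.
    let exps = logits.map { exp($0 - maxLogit) }
    let total = exps.reduce(0, +)
    let ranked = exps.enumerated()
        .map { (tokenID: $0.offset, probability: $0.element / total) }
        .sorted { $0.probability > $1.probability }
    return Array(ranked.prefix(k))
}
```

Each returned pair corresponds to one bar: the token ID identifies the candidate token, and the probability sets the bar length.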

02

Guided Mode

Step-by-step educational walkthrough: from tokenization to attention to final prediction. Perfect for learning how transformers work.

03

Kid Mode

An animated AI sprite character guides children through the inference journey with colorful token cards and playful interactions.

04

Pause & Step

True pause with DispatchSemaphore: freeze inference at any token, step forward one token at a time, or rewind and inspect any layer.
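One way a semaphore-based pause could work is sketched below: the generation thread parks on a `DispatchSemaphore` at a checkpoint before each token, and the UI releases it once per step. The `StepGate` class and its method names are hypothetical, chosen for illustration; this is a minimal sketch, not Vanilla's actual implementation, and it does not cover rewinding.

```swift
import Foundation

/// Hypothetical pause gate (an assumption for illustration, not Vanilla's code).
/// A generation loop calls `checkpoint()` before producing each token.
/// Marked @unchecked Sendable because all mutable state is guarded by `lock`.
final class StepGate: @unchecked Sendable {
    private let lock = NSLock()                          // guards `paused` and `waiters`
    private let semaphore = DispatchSemaphore(value: 0)  // parks the generation thread
    private var paused = false
    private var waiters = 0

    /// Ask the generation loop to stop at its next checkpoint.
    func pause() {
        lock.lock(); defer { lock.unlock() }
        paused = true
    }

    /// Blocks while paused; returns immediately otherwise.
    func checkpoint() {
        lock.lock()
        let shouldWait = paused
        if shouldWait { waiters += 1 }
        lock.unlock()
        if shouldWait { semaphore.wait() }
    }

    /// Lets one parked thread through while staying paused.
    /// Returns false if nothing was waiting.
    @discardableResult
    func step() -> Bool {
        lock.lock()
        let release = waiters > 0
        if release { waiters -= 1 }
        lock.unlock()
        if release { semaphore.signal() }
        return release
    }

    /// Leaves the paused state and releases every parked thread.
    func resume() {
        lock.lock()
        paused = false
        let parked = waiters
        waiters = 0
        lock.unlock()
        for _ in 0..<parked { semaphore.signal() }
    }
}
```

In this sketch the inference thread calls `checkpoint()` between tokens, while a pause button calls `pause()` and a step button calls `step()`. `checkpoint()` blocks its caller, so it should run on a background thread rather than the main thread.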

05

Review Mode

Sequence summary with key findings: confidence distribution, most decisive layers, and generation statistics at a glance.

06

Native Performance

Built with Swift + Metal on Apple Silicon. Metal-accelerated attention heatmaps, real-time particle rendering, and smooth 60fps animated transitions.

07

Model Discovery

Built-in recommended model panel with local scanning. First-time users see suggestions immediately — pick a model and visualize in 30 seconds.

08

Error Recovery

Actionable error states with retry, model switching, and timeout protection. No more app restarts when things go wrong.

Why Vanilla

01

Understand model behavior at every layer

02

Debug and validate model outputs visually

03

Teach AI concepts to anyone — even kids

04

Runs 100% locally, no cloud dependency

05

Interactive exploration, not static charts

06

Built for Apple Silicon with Metal GPU acceleration

Vanilla processes everything on your device. Zero data collection. No account required.

AtomGradient — Bringing AI to the Edge