AI Runtime CLI examples

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

The following examples are complete, end-to-end workloads that you submit from the `air` CLI with `air run -f train.yaml`. Each shows a real multi-GPU pattern on H100 GPUs and includes the workload YAML, bootstrap commands, and code. If you haven't submitted a run before, start with the quickstart.

| Example | Description |
| --- | --- |
| Multi-node LLM fine-tuning with FSDP | Supervised fine-tuning of Llama-3.1-8B across 16 H100 GPUs (2 nodes) using `torchrun` and PyTorch Fully Sharded Data Parallel (FSDP). Logs to MLflow and checkpoints to a Unity Catalog volume. |
| Distributed training with Ray Train | Distributed data-parallel fine-tuning with Ray Train's `TorchTrainer` across 8 H100 GPUs on a single node, with one worker per GPU. |
| Batch inference with Ray Data and vLLM | Offline LLM batch inference with Ray Data and vLLM across 8 H100 GPUs on a single node, running one vLLM replica per GPU and writing results to a Unity Catalog volume as Parquet. |
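
The GPU counts above follow a one-worker-per-GPU layout: 2 nodes with 8 GPUs each give 16 workers. The following is a minimal sketch of how a worker's global rank could map to a node and a local GPU index, assuming ranks are assigned contiguously per node (as `torchrun` typically does through its `RANK` and `LOCAL_RANK` environment variables). The names `WorkerPlacement` and `placement_for` are hypothetical helpers for illustration and are not part of the `air` CLI or any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerPlacement:
    """Where one training worker runs (hypothetical helper type)."""
    global_rank: int  # unique worker id across all nodes (like torchrun's RANK)
    node_index: int   # which node hosts the worker
    local_rank: int   # GPU index on that node (like torchrun's LOCAL_RANK)


def placement_for(global_rank: int, nodes: int, gpus_per_node: int) -> WorkerPlacement:
    """Map a global rank to (node, GPU), assuming contiguous ranks per node."""
    world_size = nodes * gpus_per_node
    if not 0 <= global_rank < world_size:
        raise ValueError(f"rank {global_rank} is outside world size {world_size}")
    node_index, local_rank = divmod(global_rank, gpus_per_node)
    return WorkerPlacement(global_rank, node_index, local_rank)


if __name__ == "__main__":
    # Layout from the FSDP example: 2 nodes x 8 GPUs = 16 workers.
    for rank in range(2 * 8):
        print(placement_for(rank, nodes=2, gpus_per_node=8))
```

Under this assumption, the single-node examples (8 GPUs) place every worker on node 0, and the local rank equals the global rank.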

Last updated on 2026-06-24
