
Serverless GPUs for inference, training, batch jobs and sandboxes — from a few lines of Python.
Not a model — the workshop floor your models run on. Modal gives you on-demand GPUs without managing infrastructure: define a function, attach a GPU, and it scales from zero to many and back. Ideal for fast inference, fine-tuning scripts in ~300 lines, parallel hyperparameter sweeps, and running agents in sandboxes.
Deploy and scale inference for LLMs, audio and image/video generation.
Fine-tune open-source models on single or multi-node clusters instantly.
Parallelize massive jobs with hyper-elastic compute infrastructure.
Programmatically scale ephemeral environments for running untrusted code.
To qualify ·Use Modal for the development or runtime of your app and note it in your Space README. 1st 10,000 · 2nd 7,000 · 3rd 3,000 credits.