1B – 8B

OpenBMB

Tiny, capable models for text, vision, audio and omni — small enough to live on your own hardware.

The MiniCPM family proves you don’t need a giant to get real work done. Each model is tuned to punch far above its parameter count and runs happily on a laptop — or a phone. OpenBMB provide free hosted API access for the jam, and every model also runs locally via llama.cpp or transformers. Pick the modality you need and go.

The kit, sized small

MiniCPM-V 4.6

Image & video understanding, OCR and document understanding for multimodal apps.

  • vision
  • OCR
  • documents
1.3B parameters32B cap
MiniCPM-o 4.5

Full-duplex omni model — voice, vision and language in, speech out. Real-time capable.

  • omni
  • speech
  • realtime
MiniCPM-V 4.5

Strong multimodal understanding and visual reasoning for image & video use cases.

  • vision
  • video
MiniCPM5-1B

Lightweight text generation for local-first and on-device assistants.

  • text
  • on-device
1B parameters32B cap
MiniCPM4.1-8B

Text reasoning with efficient, hybrid local inference.

  • text
  • reasoning
8B parameters32B cap
VoxCPM2

Text-to-speech and creative voice design for voice-enabled apps.

  • TTS
  • voice

If you want to build…

Image understanding, OCR or documents
MiniCPM-V 4.6Vision + OCR
A video-understanding demo
MiniCPM-V 4.6 / 4.5Multimodal video
An omni or voice + vision assistant
MiniCPM-o 4.5Speech in, speech out
A lightweight, local-first text app
MiniCPM5-1B1B, on-device
Text reasoning / problem-solving
MiniCPM4.1-8BBest reasoning
TTS or a creative voice experience
VoxCPM2Voice generation
STARTER SPACE
MiniCPM-V-4.6 Demo

Fork this Gradio Server Space to start from a working MiniCPM-V-4.6 app.

Fork it
OPENBMB · SPONSOR PRIZE

Best MiniCPM Build

To qualify ·Build with MiniCPM models. The pool is split $5k per track (1st $2,500 · 2nd $1,500 · 3rd $1,000).

$10,000
All prizes