Molmo 7B-D by AllenAI – Open Vision-LM with CLIP Backbone
Molmo 7B-D is a powerful open-source vision-language model developed by the Allen Institute for AI (AI2), built on Qwen2-7B and powered by OpenAI CLIP as its visual encoder. Trained on the high-quality PixMo dataset (1M curated image-text pairs), this model delivers state-of-the-art performance for its size across academic benchmarks and human evaluations. Highlights: 🖼️ Multimodal input:…
Molmo 7B-D by AllenAI – Open Vision-LM with CLIP Backbone endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST |
Chat Completions /allenai-molmo/chat |
add your prompt and interact with model |
Molmo 7B-D by AllenAI – Open Vision-LM with CLIP Backbone pricing
| Plan | Price | Rate limit | Quotas |
|---|---|---|---|
| BASIC | Free | — |
|
| PRO | $5 / month | — |
|
| ULTRA | $15 / month | — |
|
| MEGA | $30 / month | — |
|