Qwen-VL
Alibaba's Qwen vision-language model series. As of 2026, the frontier open-source multimodal model is Qwen3-VL-235B-A22B-Instruct, which rivals Gemini 2.5 Pro and GPT-5 on visual reasoning. It is strong at multilingual visual understanding, document parsing, and chart QA.
Pixtral
Mistral's vision-language model, originally available as open weights and via La Plateforme. As of 2026, Pixtral has been merged into Mistral Small 4, which combines vision, text, and code in a single model. New deployments should target Mistral Small 4 instead.