Top OSS multimodal model from OpenGVLab
The InternVL2 series from Shanghai AI Lab is consistently top-ranked among open-source models on multimodal benchmarks. It is strong at document understanding, chart analysis, and multi-image reasoning.
Vision-language models for image understanding, captioning, visual QA, and document parsing
AIchitect's Genome scanner detects InternVL2 in your project via these signals:
transformers

Health ↑ 40 → 55 (12 days ago)
Crossed 10,000 stars ⭐ (3 weeks ago)
Pricing updated (5 weeks ago)
Crossed 5,000 stars ⭐ (6 weeks ago)
Crossed 1,000 stars ⭐ (6 weeks ago)
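
The listing names `transformers` as the signal the Genome scanner uses to detect InternVL2 in a project. How the scanner works internally is not described, so the following is only a minimal sketch of one way a dependency-based signal check could look, assuming the signal is matched against package names in a `requirements.txt`-style file. The `SIGNALS` table and the `parse_requirements` and `detect_tools` functions are hypothetical names chosen for illustration, not part of AIchitect.

```python
import re

# Hypothetical signal table: package names whose presence suggests a tool.
# "transformers" is the only signal the listing names for InternVL2.
SIGNALS = {"InternVL2": {"transformers"}}

# Leading package name of a requirement line (stops at extras, version specs, etc.).
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def parse_requirements(text: str) -> set[str]:
    """Return normalized package names from requirements.txt-style text."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options like -r
            continue
        match = _NAME.match(line)
        if match:
            # Lowercase and collapse runs of -, _, . into a single hyphen.
            names.add(re.sub(r"[-_.]+", "-", match.group(1)).lower())
    return names


def detect_tools(requirements_text: str) -> list[str]:
    """Return tools whose signal packages appear in the requirements text."""
    found = parse_requirements(requirements_text)
    return sorted(tool for tool, signals in SIGNALS.items() if signals & found)
```

Calling `detect_tools` on the contents of a requirements file returns the names of any tools whose signal package is listed. Since `transformers` is a general-purpose library, a match on this signal alone only suggests that InternVL2 might be in use; a real scanner would presumably combine it with stronger evidence.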