Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context, Best Sub-100M Retrieval Quality
Compiled by KHAO Editorial — aggregated from 1 source. See llms.txt for citation guidance.
In this post: Enterprise-Ready by Design · A Strong Sub-100M Multilingual Model · What Changed from R1 · Training the Full-Size 311M Model · Building the compact 97M Multilingual Model · Benchmark Results · Matryoshka Embeddings · Deployment Options · For Framework Integrators · Which Model Should You Use? · Try The Models.
Key facts
- Cutting from 768 to 256 dimensions, a 3x reduction in storage and similarity-computation cost, drops MTEB Multilingual Retrieval by 0.5 points (65.2 → 64.7) and Code Retrieval by 0.5 points (63.9 →
- The 311M model scores 65.2 on MTEB Multilingual Retrieval and 56.3 on the overall average, a 14.5-point average gain over its R1 predecessor
- The 311M model uses the Gemma 3 tokenizer (262K tokens); the 97M model starts from the GPT-OSS tokenizer and prunes it down to a compact 180K-token vocabulary that preserves broad multilingual
- The 311M model is a 22-layer ModernBERT encoder with a 262K-token multilingual vocabulary, trained through a multi-stage pipeline
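
The dimension cut described in the key facts can be illustrated with a minimal sketch. It assumes the usual Matryoshka recipe (keep the leading components, then re-normalize to unit length) and float32 storage. The helper names and the toy vectors are hypothetical stand-ins, not model outputs or an API from the release.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))

def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw float32 storage for the vectors, ignoring index overhead."""
    return num_vectors * dim * bytes_per_value

# Toy 768-dimensional vectors (hypothetical data, not real embeddings).
full_a = [math.sin(i * 0.1) for i in range(768)]
full_b = [math.sin(i * 0.1 + 0.05) for i in range(768)]

# Truncate both to 256 dimensions and compare them in the smaller space.
small_a = truncate_and_normalize(full_a, 256)
small_b = truncate_and_normalize(full_b, 256)
similarity = cosine(small_a, small_b)

# Storage ratio for one million vectors: 768 dims versus 256 dims.
ratio = storage_bytes(1_000_000, 768) / storage_bytes(1_000_000, 256)
```

Here `ratio` evaluates to 3.0, matching the 3x storage figure above. Truncation of this kind is only expected to preserve quality for models trained with Matryoshka support, which the source attributes to the 311M model.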
Summary
Multilingual embedding models face a persistent tension: broad language coverage usually comes at the cost of model size, and small models usually sacrifice languages. Granite Embedding Multilingual R2 comprises two models:
- Granite-embedding-311m-multilingual-r2: a 311M-parameter full-size model with 768-dimensional embeddings, Matryoshka dimension support, and top-tier multilingual retrieval quality.
- Granite-embedding-97m-multilingual-r2: a 97M-parameter compact model with 384-dimensional embeddings that delivers strong retrieval quality for its size.
Both models support 200+ languages with enhanced retrieval quality for 52 languages and programming code, handle context lengths up to 32,768 tokens (a 64x increase over their R1 predecessors), and are released under the Apache 2.0 license.