Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context, Best Sub-100M Retrieval Quality

Compiled by KHAO Editorial — aggregated from 1 source. See llms.txt for citation guidance.

In this post: Enterprise-Ready by Design · A Strong Sub-100M Multilingual Model · What Changed from R1 · Training the Full-Size 311M Model · Building the compact 97M Multilingual Model · Benchmark Results · Matryoshka Embeddings · Deployment Options · For Framework Integrators · Which Model Should You Use? · Try The Models.

Summary

Multilingual embedding models face a persistent tension: broad language coverage usually comes at the cost of model size, and small models usually sacrifice languages. The R2 release comprises two models. Granite-embedding-311m-multilingual-r2 is a 311M-parameter full-size model with 768-dimensional embeddings, Matryoshka dimension support, and top-tier multilingual retrieval quality. Granite-embedding-97m-multilingual-r2 is a 97M-parameter compact model with 384-dimensional embeddings that delivers strong retrieval quality for its size. Both models support 200+ languages, with enhanced retrieval quality for 52 languages and for programming code. Both handle context lengths up to 32,768 tokens (a 64x increase over their R1 predecessors) and are released under the Apache 2.0 license.
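Matryoshka dimension support usually means that the leading components of an embedding can be kept on their own as a shorter vector. The sketch below illustrates that general idea with NumPy only. It is not taken from the Granite release: the random vectors stand in for 768-dimensional model outputs, the target size of 256 is an arbitrary choice for illustration, and `truncate_embedding` is a hypothetical helper name.

```python
import numpy as np


def truncate_embedding(vectors: np.ndarray, dim: int) -> np.ndarray:
    """Keep the leading `dim` components of each row and L2-normalize the result.

    Hypothetical helper: assumes Matryoshka-style embeddings, where a prefix
    of the full vector is itself usable as a smaller embedding.
    """
    if not 0 < dim <= vectors.shape[-1]:
        raise ValueError("dim must be between 1 and the full embedding size")
    head = vectors[..., :dim]
    norms = np.linalg.norm(head, axis=-1, keepdims=True)
    # Guard against division by zero for an all-zero prefix.
    return head / np.clip(norms, 1e-12, None)


# Stand-in data: four random vectors in place of real 768-dim model outputs.
rng = np.random.default_rng(0)
full = rng.standard_normal((4, 768))

# Assumed target size; the dimensions actually supported are not stated here.
small = truncate_embedding(full, 256)

# After normalization, a plain dot product gives cosine similarity.
scores = small @ small.T
```

Shorter vectors reduce index size and speed up similarity search, typically at some cost in retrieval quality, so the truncated size should be validated against the target workload.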

Read full article at Hugging Face →
