Nemotron 3.5 Content Safety: Customizable Multimodal Safety for Global Enterprise AI
Compiled by KHAO Editorial — aggregated from 1 source. See llms.txt for citation guidance.
This post covers what changes in 3.5, the design decisions behind each new capability, and how to integrate the model into production safety pipelines.
Key facts
- Nemotron 3.5 Content Safety averages a harmful-F1 of 97% on Multilingual Aegis Cultural + Adapted (prompt classification) across 12 languages
- Nemotron 3.5 Content Safety is built on Google Gemma 3 4B IT (4B parameters), providing a 128K context window, strong vision-language reasoning, and broad multilingual coverage
- Nemotron 3.5 Content Safety averages a harmful-F1 of 89% on RTPLX (prompt classification) across 12 languages
- On Multilingual Aegis, Nemotron 3.5 averages 96.5% harmful-content classification accuracy across 12 languages
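The benchmark figures above are reported as harmful-F1 averaged across 12 languages. As a rough illustration of what such a number means, the sketch below computes F1 with "harmful" as the positive class for each language and then takes an unweighted mean. The source does not say whether the average is taken per language or over pooled examples, so the unweighted mean is an assumption, and the function names and toy labels are hypothetical.

```python
from typing import Dict, List, Tuple


def harmful_f1(y_true: List[bool], y_pred: List[bool]) -> float:
    """F1 score treating 'harmful' (True) as the positive class."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    # With no harmful examples and no harmful predictions, define F1 as 0.
    return 2 * tp / denom if denom else 0.0


def average_harmful_f1(
    per_language: Dict[str, Tuple[List[bool], List[bool]]]
) -> float:
    """Unweighted mean of per-language harmful-F1 (assumed averaging scheme)."""
    scores = [harmful_f1(t, p) for t, p in per_language.values()]
    return sum(scores) / len(scores)


# Toy labels for two languages, purely illustrative (not benchmark data).
toy = {
    "en": ([True, True, False, False], [True, True, False, False]),
    "fr": ([True, True, False, False], [True, False, False, False]),
}

if __name__ == "__main__":
    print(f"{average_harmful_f1(toy):.3f}")
```

Under this scheme every language counts equally regardless of how many examples it contributes, which is one reason an averaged score can differ from a score computed over the pooled test set.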
Summary
Nemotron 3 introduced image understanding; Nemotron 3.5 deepens the multimodal integration. Nemotron 3.5 maintains the 12-language explicit training coverage of its predecessors (English, French, Spanish, German, Chinese, Japanese, Korean, Arabic, Hindi, Russian, Portuguese, and Italian) while also inheriting strong zero-shot generalization across approximately 140 languages from the Gemma 3 base model. Every safety verdict in Nemotron 3.5 can be accompanied by an auditable reasoning trace via an optional think mode. This is the most significant architectural addition in 3.5 relative to Nemotron 3, and it extends the work first introduced in Nemotron Content Safety Reasoning 4B to the full multimodal, multilingual setting.