AI Safety · France · Spain · Hugging Face

Nemotron 3.5 Content Safety: Customizable Multimodal Safety for Global Enterprise AI

Thu, Jun 4 · 6:57 PM UTC · 2 min read

Compiled by KHAO Editorial — aggregated from 1 source. See llms.txt for citation guidance.


This post covers what changes in 3.5, the design decisions behind each new capability, and how to integrate the model into production safety pipelines.

Key facts

Nemotron 3.5 Content Safety scores an average harmful-F1 of 97% on prompt classification across 12 languages on the Multilingual Aegis Cultural + Adapted benchmark.
Nemotron 3.5 Content Safety is built on Google Gemma 3 4B IT (4B parameters), which provides a 128K context window, strong vision-language reasoning, and broad multilingual coverage.
Nemotron 3.5 Content Safety scores an average harmful-F1 of 89% on prompt classification across 12 languages on RTPLX.
On Multilingual Aegis, Nemotron 3.5 averages 96.5% harmful-content classification accuracy across 12 languages.
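
The benchmark figures above are reported as harmful-F1 averaged across 12 languages. The source does not say how the average is taken, so the following is only a minimal sketch that assumes an unweighted mean of per-language F1 scores with "harmful" as the positive class. The function names, the label strings, and the record layout are hypothetical and are not taken from the model's evaluation code.

```python
from collections import defaultdict


def harmful_f1(gold, pred, positive="harmful"):
    """F1 score with the 'harmful' label treated as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1_by_language(records):
    """records: iterable of (language, gold_label, predicted_label).

    Returns (per-language F1 dict, unweighted mean over languages).
    The unweighted mean is an assumption, not a documented detail.
    """
    by_lang = defaultdict(lambda: ([], []))
    for lang, gold, pred in records:
        by_lang[lang][0].append(gold)
        by_lang[lang][1].append(pred)
    scores = {lang: harmful_f1(g, p) for lang, (g, p) in by_lang.items()}
    return scores, sum(scores.values()) / len(scores)
```

Averaging per language, rather than pooling all examples, keeps a language with few test prompts from being drowned out by one with many. Whether the reported numbers use that convention is not stated in the source.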

Summary

Nemotron 3 introduced image understanding; Nemotron 3.5 deepens the multimodal integration. Nemotron 3.5 keeps the explicit 12-language training coverage of its predecessors (English, French, Spanish, German, Chinese, Japanese, Korean, Arabic, Hindi, Russian, Portuguese, and Italian) and inherits strong zero-shot generalization across approximately 140 languages from the Gemma 3 base model.

Every safety verdict in Nemotron 3.5 can be accompanied by an auditable reasoning trace via an optional think mode. This is the most significant architectural addition in 3.5 relative to Nemotron 3, and it extends the work first introduced in Nemotron Content Safety Reasoning 4B to the full multimodal, multilingual setting.

