Nvidia's Nemotron-Cascade 2 is a 30B MoE model that activates only 3B parameters at inference time, yet achieved gold medal-level performance at the 2025 IMO, IOI, and ICPC World Finals. Nvidia has ...
Researchers at the University of Science and Technology of China have developed a new reinforcement learning (RL) framework that helps train large language models (LLMs) for complex agentic tasks ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results