Descriptions:
InternLM’s latest release, Intern-S2-Preview, is a 35-billion-parameter scientific multimodal model that takes a different approach to capability scaling: rather than increasing parameter count, the team increased the difficulty, diversity, and coverage of scientific tasks across the entire training pipeline, from pre-training through reinforcement learning. In this hands-on video, Fahd Mirza installs and tests the FP8-quantized version locally on an NVIDIA RTX 6000 GPU with 48GB of VRAM. He serves the model with vLLM, and it consumes approximately 45GB of VRAM at inference time.
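The memory figures above can be sanity-checked with a rough estimate. The sketch below assumes every weight is stored in FP8 at one byte per parameter and uses decimal gigabytes; in practice some layers may be kept at higher precision, and vLLM pre-allocates memory for its KV cache, so the split between weights and overhead is only an approximation, not a measurement from the video.

```python
# Back-of-envelope VRAM estimate using the figures quoted in the description.
PARAMS = 35e9              # 35 billion parameters
BYTES_PER_PARAM_FP8 = 1    # assumption: all weights stored in FP8 (1 byte each)
GPU_VRAM_GB = 48           # RTX 6000 capacity
OBSERVED_USAGE_GB = 45     # approximate usage reported in the video


def weight_footprint_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


weights_gb = weight_footprint_gb(PARAMS, BYTES_PER_PARAM_FP8)
# Whatever is not weights is attributed to KV cache, activations, and runtime.
overhead_gb = OBSERVED_USAGE_GB - weights_gb
headroom_gb = GPU_VRAM_GB - OBSERVED_USAGE_GB

print(f"weights  ~{weights_gb:.0f} GB")
print(f"overhead ~{overhead_gb:.0f} GB (KV cache, activations, runtime)")
print(f"headroom ~{headroom_gb:.0f} GB")
```

Under these assumptions the weights alone account for most of the reported usage, which is consistent with the model fitting on a single 48GB card with little room to spare.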
The model, built by continued pre-training from Qwen 3.5, is claimed to match the trillion-parameter Intern S1 Pro on core scientific benchmarks despite being roughly 30 times smaller. It is also reportedly the first open-source model to combine crystal structure generation with strong general reasoning. Technical innovations include shared-weight multi-token prediction (MTP) with a KL loss during reinforcement learning, intended to improve token generation speed, and a chain-of-thought compression technique discussed in the video.
Mirza tests the model on real scientific prompts and on a complex coding task: a Hodgkin-Huxley neuron action potential simulator in a single self-contained HTML file. He notes both where the model succeeds and where it misses (the simulator rendered an interface but failed to animate). This candid evaluation of failures alongside successes makes the video useful for researchers and practitioners assessing whether scientific AI models are ready for local deployment.
📺 Source: Fahd Mirza · Published May 23, 2026
🏷️ Format: Tutorial Demo






