The First Real LLM Breakthrough Is Here… SubQ (1000x Less Compute)

Business & Strategy · 2 weeks ago

Descriptions:

TheAIGRID covers the release of SubQ 1.1 Small, which its developers claim is the first large language model built on a fully sub-quadratic sparse attention architecture. If the claim holds, it is a departure from the quadratic scaling problem that has constrained context window size and cost since the original Transformer paper in 2017. The company’s core claim is that standard dense attention wastes compute by evaluating every token-to-token relationship in a sequence, while SubQ’s Sparse Selective Attention (SSA) learns, per token, which small subset of relationships actually matters and computes full attention only on those.
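To make the dense-versus-sparse contrast concrete, here is a minimal pure-Python sketch. It is not SubQ's SSA, whose selection mechanism the summary does not describe. It uses plain top-k selection by attention score as a stand-in, and all function names are hypothetical. Note that this toy version still scores every key in order to pick the top k, so it illustrates the idea of attending to a subset without itself being sub-quadratic.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def _weighted_sum(weights, rows):
    dim = len(rows[0])
    return [sum(w * r[d] for w, r in zip(weights, rows)) for d in range(dim)]


def dense_attention(queries, keys, values):
    """Standard attention: every query is scored against every key."""
    out = []
    for q in queries:
        scale = math.sqrt(len(q))
        scores = [_dot(q, k) / scale for k in keys]
        out.append(_weighted_sum(_softmax(scores), values))
    return out


def topk_sparse_attention(queries, keys, values, k):
    """Illustrative stand-in: softmax only over the k best-scoring keys.

    Assumption: a real sub-quadratic design would need a cheaper selector
    than scoring all keys, which is what this toy version does.
    """
    out = []
    for q in queries:
        scale = math.sqrt(len(q))
        scores = [_dot(q, key) / scale for key in keys]
        ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
        selected = ranked[:k]
        weights = _softmax([scores[i] for i in selected])
        out.append(_weighted_sum(weights, [values[i] for i in selected]))
    return out


def dense_pairs(n_tokens):
    """Query-key pairs a dense layer evaluates: n squared."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens, keys_per_query):
    """Pairs evaluated if each query attends to a fixed-size subset."""
    return n_tokens * min(keys_per_query, n_tokens)
```

With `k` equal to the number of keys, `topk_sparse_attention` reduces to `dense_attention`. The pair counters show the scaling gap: `dense_pairs(1_000_000)` is 10^12, while `sparse_pairs(1_000_000, 1024)` is about 1.02 × 10^9. The 1,024 budget is an arbitrary illustrative value, not a figure reported for SubQ.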

The numbers are specific: at 1 million tokens, SubQ reports 64.5 times less compute than dense attention and 56 times the throughput of FlashAttention-2. The model ships with a 12 million token context window, despite being trained primarily at 1 million tokens, and scores 100% on needle-in-a-haystack retrieval at 1M and 2M tokens, dropping to 98% at 6M and 12M. On RULER, a long-context benchmark covering retrieval and multi-step reasoning, it scores 99.12% at 128k. On GPQA Diamond (graduate-level science), it scores 85.4, below GPT-4.5 at 93.2 and Opus 4.8 at 92, but above Haiku 4.5 at 67.2.

The video clearly distinguishes SubQ’s content-aware token selection from earlier positional shortcuts like Longformer and BigBird, and from fixed-memory compression approaches like Mamba, making it a useful explainer for developers evaluating whether this architecture represents a genuine scaling inflection or an incremental improvement.
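The difference between positional and content-aware selection can be sketched in a few lines. This is an illustration only: the sliding window below covers just the local-window component of Longformer-style patterns (those models also add global or random tokens), the top-k rule is a generic stand-in for content-based selection rather than SubQ's method, and both function names are hypothetical.

```python
def sliding_window_keys(query_pos, n_tokens, window):
    """Positional selection: keys within `window` positions of the query.

    The choice depends only on position, never on what the tokens contain.
    """
    lo = max(0, query_pos - window)
    hi = min(n_tokens, query_pos + window + 1)
    return list(range(lo, hi))


def content_topk_keys(scores, k):
    """Content-based selection: the k keys with the highest relevance scores.

    `scores` is assumed to hold one relevance score per key for a single query.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# One query at position 7; the most relevant key sits far away at position 1.
scores = [0.1, 9.0, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4]
print(sliding_window_keys(7, len(scores), window=2))  # [5, 6, 7]
print(content_topk_keys(scores, k=2))                 # [1, 7]
```

The window never reaches position 1, however relevant that token is, while the content-based rule picks it up regardless of distance.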

📺 Source: TheAIGRID · Published June 18, 2026
🏷️ Format: News Analysis

Tags

Claude Opus 4.7, Claude Opus 4.8, Transformers
