Seeing if Opus 4.7 sucks [LIVE]

Descriptions:

Matthew Berman hosts a live stream examining Claude Opus 4.7, Anthropic’s latest flagship model, drawing on community feedback from X to assess whether the new release lives up to expectations. The session aggregates real-world reports of failure modes, including hallucinated conversation turns inside Claude Code and problematic MCP tool calls. It sets these against positive data points from benchmarks like Vending Bench, where Opus 4.7 reportedly manages a simulated vending machine over 365 days and finishes with an account balance of $8,000.

Berman covers Opus 4.7’s reported strength in agentic CAD design, noting potential applications for 3D-printing workflows. He also compares open-source models, including Gemma 4 versus Qwen 3.6, for local inference on hardware like the DGX workstation. The stream includes candid commentary on the AI creator ecosystem, with Berman arguing that much of the negative sentiment toward Anthropic models on social media reflects undisclosed sponsorships or personal bias rather than genuine hands-on evaluation.

Although the format is informal and audience-driven, viewers get a grounded look at how Opus 4.7 performs in practice across coding, design generation, and long-horizon agent tasks, with pointed discussion of where the model appears to fall short of its predecessor.


📺 Source: Matthew Berman · Published April 17, 2026
🏷️ Format: Livestream
