GPT-5.4 Is Here — I Tested the New ChatGPT Model

Research & Benchmarks · 4 months ago

Description:

Skill Leap AI tests GPT-5.4 Thinking shortly after its release, walking through the model’s headline capabilities and comparing it with adjacent OpenAI releases and competing frontier models. The video also places GPT-5.4 relative to GPT-5.3 Instant (the fast, non-reasoning variant released days earlier) and GPT-5.4 Pro (a research-grade tier), and explains why the version numbers can look non-sequential: the instant and thinking models are numbered on separate tracks.

Key capabilities demonstrated include native computer use, which is now built directly into GPT-5.4 rather than requiring a separate agent model, with examples covering data entry, email handling, and calendar management. The creator also tests knowledge-work output: a 15-slide PowerPoint generated from a research prompt in roughly five minutes, and a multi-tab Excel spreadsheet with working formulas produced from a single prompt in about ten minutes. On the coding side, GPT-5.4 Thinking is shown matching GPT-5.3 Codex performance while remaining a general-purpose model, and improved tool-calling efficiency reportedly reduces token consumption, even though the per-token price is slightly higher.
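
The point about token consumption versus per-token price comes down to simple arithmetic: total cost is tokens used times price per token, so a model that uses fewer tokens can cost less per task even at a higher rate. The sketch below illustrates this; every figure in it is a hypothetical placeholder chosen for the example, not OpenAI pricing or a measurement from the video, and the function names are made up for illustration.

```python
def total_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of a task that consumes `tokens` at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def break_even_token_ratio(old_price: float, new_price: float) -> float:
    """Fraction of the old token count at which the new model costs the same.

    If the new model uses a smaller fraction than this, it is cheaper overall.
    """
    return old_price / new_price


# Hypothetical numbers for illustration only (not real prices or usage).
OLD_PRICE = 10.0      # dollars per million tokens, older model (assumed)
NEW_PRICE = 11.0      # dollars per million tokens, newer model (assumed)
OLD_TOKENS = 1_000_000  # tokens the older model needs for a task (assumed)
NEW_TOKENS = 800_000    # tokens the newer model needs for the same task (assumed)

old_cost = total_cost(OLD_TOKENS, OLD_PRICE)
new_cost = total_cost(NEW_TOKENS, NEW_PRICE)
ratio = break_even_token_ratio(OLD_PRICE, NEW_PRICE)

print(f"old model: ${old_cost:.2f}, new model: ${new_cost:.2f}")
print(f"new model is cheaper if it uses under {ratio:.1%} of the old token count")
```

Under these assumed numbers, a 10% price increase is outweighed by a 20% drop in tokens. Whether that holds for a real workload depends on the actual prices and on how much the tool-calling improvements reduce token use for that task.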

OpenAI’s internal benchmark comparisons against Anthropic’s Opus 4.6 and Google’s Gemini 3.1 Pro show GPT-5.4 Thinking with a marginal edge on select tasks, though the creator notes results are mixed and the comparison excludes non-OpenAI benchmarks. The video concludes with a live website-building demo using the model’s canvas mode, giving developers a practical reference for real-world output quality.

📺 Source: Skill Leap AI · Published March 05, 2026
🏷️ Format: Review


Tags

Anthropic, Claude Opus 4.6, Gemini 3.1 Pro, Google, GPT 5.3 Codex, GPT-5.2, GPT-5.3 Instant, OpenAI

