AI & Machine Learning

New AI model dropped — first impressions of Claude 4 Opus and GPT-5

Anthropic released Claude 4 Opus last month, and OpenAI followed with GPT-5 two weeks later. I have been testing both daily for work; here are my honest impressions after real usage, not benchmark hype.

Claude 4 Opus:

  • The extended thinking is genuinely different. You can watch it reason through multi-step problems in a way that feels less like autocomplete and more like a colleague working through something.
  • Coding ability is a significant jump. I gave it a 400-line React component with a subtle race condition and it found it in one pass. Claude 3.5 Sonnet would have missed it.
  • The 1M token context window is real. I loaded an entire codebase and asked it to trace a bug across 12 files. It did it correctly.
  • Weakness: still hallucinates on niche library APIs. If you are using something with under 1000 GitHub stars, verify everything.
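To make the "subtle race condition" bullet concrete, here is a minimal sketch of the classic stale-response bug that shows up in React components (this is a hypothetical illustration, not the author's actual 400-line component): two overlapping async requests where the older, slower response resolves last and clobbers newer state, plus a request-counter fix.

```javascript
// Simulate an async request that resolves with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

// Buggy version: whichever response resolves LAST wins, even if it is stale.
async function buggySearch(setState) {
  const slow = delay(50, 'results for "re"');    // issued first, returns late
  const fast = delay(10, 'results for "react"'); // issued second, returns early
  fast.then(setState);
  slow.then(setState); // stale response overwrites the fresh one
  await Promise.all([slow, fast]);
}

// Fixed version: a monotonically increasing request id lets us
// ignore any response that is no longer the most recent request.
async function fixedSearch(setState) {
  let latest = 0;
  const issue = (ms, value) => {
    const id = ++latest;
    return delay(ms, value).then((v) => {
      if (id === latest) setState(v); // only the newest request may write
    });
  };
  const slow = issue(50, 'results for "re"');
  const fast = issue(10, 'results for "react"');
  await Promise.all([slow, fast]);
}
```

In a real component the same guard is usually expressed as a cleanup flag in `useEffect` or by aborting the old fetch with `AbortController`; the point is that the bug only appears when two in-flight requests overlap, which is why it is easy to miss in review.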

GPT-5:

  • Multimodal is where it shines. Image understanding, audio processing, and video analysis in a single model. I fed it a whiteboard photo from a meeting and it generated accurate user stories.
  • The reasoning mode is competitive with Claude but feels more rigid. It follows instructions precisely but sometimes misses the spirit of what you are asking.
  • The voice mode is genuinely useful now. I had a 20-minute conversation debugging an architecture problem while driving.
  • Weakness: the pricing. GPT-5 Pro is $200/month. That is a tough sell when Claude 4 does 80% of the same things at a lower price.

My workflow now:

  • Claude 4 for coding, writing, and analysis
  • GPT-5 for multimodal tasks and voice interactions
  • Still using Gemini 2.5 for quick throwaway queries because the free tier is generous

Sources:

  • Anthropic blog — Claude 4 release notes
  • OpenAI blog — GPT-5 technical report
  • Personal usage across 4 weeks of daily testing

What is your AI stack in 2026?

Source: Community Report (Automated) · Published: Apr 4, 2026, 2:57 PM

The pricing war is heating up. Gemini 2.5 Flash being basically free for most use cases puts pressure on everyone. Google is playing the long game.

GPT-5 voice mode while driving is the sleeper feature nobody talks about. I plan my entire day during my commute on 635 now. It is like having a chief of staff in my car.

I work at a DFW defense contractor and we are not allowed to use any of these. Everything has to be on-prem models. The gap between what we use internally and what is available commercially is painful.

Claude 4 Opus genuinely changed my workflow. I am a backend engineer at a DFW fintech company and the codebase understanding is insane. Loaded our entire repo and it found a production bug we had been hunting for two weeks.

bookmarked immediately

lol nah

Hot take: the benchmarks do not matter anymore. Every frontier model is good enough for 95% of tasks. The differentiator is UX and integration, not raw capability.