srinivas raghav's blog

vibecoding goes brrrrrrr!!!

So, for a while now I’ve been spending some $ on these models, and to my surprise, they’re not bad at following complex instructions, scaffolding, handling complex projects, and sitting with the vagueness of the human mind.

So far I’ve tested gpt-5-codex, gpt-5-high, grok-4 and grok-4-fast, Opus 4.1 and Sonnet 4/4.5, GLM 4.5/4.6, Qwen 3 Max, DeepSeek V3.1 Terminus/V3.2, and Kimi K2.