When Andrej Karpathy — a founding member of OpenAI and one of Silicon Valley’s favorite deep learning minds — says something won’t work yet, people listen.
His latest take? AI agents, the buzzy “autonomous helpers” everyone’s hyping up this year, are nowhere near ready for prime time.
“They just don’t work,” he said flatly on the Dwarkesh Podcast. No sugarcoating, no PR gloss — just frustration from someone who’s built the very tools others are overpromising.
He’s not alone in that skepticism. Analysts dissecting the autonomy gap argue that current models lack what Stanford’s HAI researchers call “memory continuity” — the ability to remember and learn across tasks.
In plain English: they forget what they were doing five minutes ago.
So while investors are dubbing 2025 the “year of the AI agent,” Karpathy’s take feels like a splash of cold water on overheated optimism.
Interestingly, Karpathy doesn’t sound cynical, just pragmatic. He’s building Eureka Labs, his own AI-native school, where he wants machines to collaborate with teachers and students, not replace them.
In a post on X, he wrote that the industry is “overshooting the tooling” — racing toward a future where autonomous bots run everything.
It reminded me of an eerily similar warning from Geoffrey Hinton’s interview with MIT Tech Review, where he cautioned that overestimating AI’s readiness is more dangerous than underestimating it.
And here’s where things get real. AI “agents” — those systems meant to handle emails, schedule tasks, or write code — are failing basic reliability tests.
A 2024 evaluation from Scale AI’s internal research found that agents fully complete multi-step tasks only 30% of the time.
Imagine a human intern who gets things right once every three tries. Cute, maybe. Useful? Not yet.
Still, I get why people are obsessed. The idea of an AI that just “does it all” is seductive — a kind of modern-day genie in a laptop.
But as DeepMind’s Demis Hassabis noted in a recent talk, there’s a huge gulf between a chatbot and a reliable agent.
It’s the difference between a helpful parrot and a capable assistant. And if Karpathy’s right, bridging that gap will take a decade of patient, unglamorous engineering — not just hype.
Call me sentimental, but there’s something grounding about hearing a pioneer admit that progress takes time.
Maybe the real story here isn’t about delayed timelines — it’s about recalibrating our expectations.
After all, AI doesn’t need to replace us tomorrow; it just needs to make today’s work a little smarter, one imperfect step at a time.