Hey PaperLedge crew, Ernis here, ready to dive into another fascinating piece of research! Today, we're talking about something that's becoming increasingly relevant as AI gets woven into more and more aspects of our lives: agentic AI.
Now, you might be thinking, "Agentic AI? What's that?" Think of it like this: instead of just asking a language model (like ChatGPT) a question and getting an answer, agentic AI is about giving the AI a specific job to do and letting it figure out how to do it, step-by-step. Imagine a personal assistant that not only answers your questions but also books your flights, manages your calendar, and even orders your groceries, all on its own. That's the power of agentic AI!
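To make that "step-by-step" idea concrete, here's a minimal sketch of an agentic loop: the agent takes a goal, breaks it into steps, and executes a tool for each step. Everything here is a hypothetical stand-in (the tool functions and the hard-coded plan are invented for illustration, not a real assistant API); in a real system a language model would produce the plan.

```python
# Hypothetical tool functions an agent might call -- not a real API.
def book_flight(destination):
    return f"flight booked to {destination}"

def add_calendar_event(event):
    return f"calendar updated: {event}"

def run_agent(goal):
    # A real agent would ask a language model to turn `goal` into this plan;
    # the plan is hard-coded here to keep the sketch self-contained.
    plan = [
        ("book_flight", "Lisbon"),
        ("add_calendar_event", "Lisbon trip"),
    ]
    tools = {"book_flight": book_flight, "add_calendar_event": add_calendar_event}
    results = []
    for tool_name, arg in plan:
        # The agent acts on each step in order, not just answering a question.
        results.append(tools[tool_name](arg))
    return results
```

The point of the sketch is the shape of the loop: plan, then act, step by step, without a human driving each step.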
For a while now, the focus has been on these massive, super-smart language models – the LLMs – because they seem capable of doing almost anything. But the paper we're looking at today is challenging that assumption. It's basically saying: "Hold on a second! Do we really need to use a sledgehammer to crack a nut?"
The authors make a strong case for small language models (SLMs). They argue that for many of the repetitive, specialized tasks that agentic AI systems perform, these smaller models are actually better suited, more efficient, and ultimately cheaper. It's like using a Formula 1 race car to drive to the grocery store: a regular car gets the job done just fine, and it's much more economical.
Here's the core argument, broken down:
- Capable enough: for the narrow, well-defined subtasks that make up most agentic workflows, small models can meet the quality bar.
- Better suited: agentic subtasks are repetitive and specialized, which plays to the strengths of a model tuned for one job rather than a generalist.
- Cheaper to run: smaller models cost less to serve, so the savings compound across the thousands of calls an agentic system makes.
The paper even suggests that for situations where you do need that broad, conversational ability, you can use a mix-and-match approach – a "heterogeneous agentic system." This means using different models for different parts of the task. Maybe a small model handles the repetitive stuff, and a larger model kicks in for the complex, conversational bits.
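Here's a minimal sketch of what that mix-and-match routing could look like: a router sends narrow, repetitive subtasks to a small model and everything open-ended to a large one. Both model calls are hypothetical stubs, and the keyword-based routing rule is a deliberately simple placeholder for whatever classifier a real system would use.

```python
# Hypothetical stand-ins for a small and a large model -- not real APIs.
def small_model(prompt):
    return f"[SLM] {prompt}"

def large_model(prompt):
    return f"[LLM] {prompt}"

# Prompts that start with these verbs are treated as narrow, repetitive
# subtasks; a production router would use something smarter than keywords.
STRUCTURED_TASKS = ("extract", "classify", "format", "summarize")

def route(prompt):
    # Well-scoped subtasks go to the cheap small model; anything else
    # falls through to the general-purpose large model.
    if prompt.lower().startswith(STRUCTURED_TASKS):
        return small_model(prompt)
    return large_model(prompt)
```

The design choice that matters here is the default: the large model only kicks in when a request falls outside the patterns the small model is known to handle.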
So, why does this matter? Because defaulting to the biggest model for every subtask wastes compute and money, and those costs add up fast when agents are running thousands of small, routine tasks a day.
The authors acknowledge that there might be some hurdles to overcome in switching from LLMs to SLMs, and they even propose a general algorithm for doing just that. They're basically saying, "This is important, let's figure out how to make it happen!"
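In that spirit, here's a rough sketch of one step such a conversion might involve: log what the big model is actually being asked to do, then flag the task types that recur often enough to be worth moving to a small model. This is my illustration of the idea, not the authors' actual algorithm, and the threshold and log format are invented for the example.

```python
from collections import Counter

def migration_candidates(call_log, min_repeats=3):
    # call_log: list of (task_label, prompt) pairs recorded from production.
    # Hypothetical rule: task types that recur at least `min_repeats` times
    # are candidates for handing off to a cheaper small model.
    counts = Counter(task for task, _ in call_log)
    return sorted(task for task, n in counts.items() if n >= min_repeats)

log = [
    ("extract_invoice", "..."), ("extract_invoice", "..."),
    ("extract_invoice", "..."), ("open_chat", "..."),
]
print(migration_candidates(log))  # -> ['extract_invoice']
```

A real pipeline would then fine-tune a small model on each flagged task's logged data and swap it in behind the router, but the core move is the same: let real usage data tell you which work doesn't need the big model.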
Ultimately, this paper is about using AI resources more effectively and lowering the costs of AI for everyone. It's a call to action to think critically about how we're building and deploying AI systems.
Here's a question that popped into my head while reading this: where do you draw the line between a task a small model can own and one that genuinely needs a big model's breadth, and how would a team actually measure that in their own system?
I think this is a really important conversation to be having, and I'm excited to see where it goes. Let me know your thoughts on this! You can find this paper and more at the link in the show notes. Until next time, keep learning!