Introduction to AI Agents

What is an AI Agent?

As technology advances at a rapid pace, AI is being applied across a wider range of fields than ever before. Among the emerging trends, "AI Agents" have drawn significant attention: they are intelligent assistants capable of making decisions and executing tasks autonomously. By combining Large Language Models (LLMs), knowledge graphs, and external tools, these agents "think" through a procedure for reaching a given goal and then take "action" to carry it out. In this post, we explain what AI Agents are, outline their core mechanisms and potential, and discuss key points for implementation.

How Are They Different from LLMs?

Compared with a standalone Large Language Model (LLM), the defining characteristic of an AI Agent is its ability to create a plan (Planning) and carry it out (Actions) to achieve a given objective.

Tasks They Excel At:

Input:

Output:

How They Operate:

Requirements:

Why Are AI Agents Trending?

Image source: Andrewng.org

Deep learning expert Andrew Ng argues that incorporating agent workflows may drive greater gains in LLM performance than the next generation of foundation models.

For instance, on the HumanEval coding benchmark, GPT-3.5 achieves a zero-shot accuracy of 48.1% and GPT-4 achieves 67.0%, but GPT-3.5 wrapped in an agent loop reaches up to 95.1%. In other words, agent workflows can dramatically boost LLM performance. Image source: Agentic Design Patterns

According to Andrew Ng, AI Agents can be implemented with four key elements, referred to as "Design Patterns."

Reflection (Self-Examination): The LLM reviews its own outputs and identifies improvements. For example, it might re-evaluate generated code, pinpoint errors, and suggest enhancements. Image source: Agentic Design Patterns
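The Reflection pattern can be sketched as a generate, critique, revise loop. This is a minimal illustration, not a definitive implementation: `fake_generate` and `fake_critique` are hypothetical stand-ins for LLM calls, and the critic here simply runs the generated code against one check.

```python
from typing import Callable, Optional

def reflect_loop(
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str], Optional[str]],
    task: str,
    max_rounds: int = 3,
) -> str:
    """Generate a draft, then revise it until the critic returns no feedback."""
    draft = generate(task, None)
    for _ in range(max_rounds):
        feedback = critique(draft)
        if feedback is None:  # critic found nothing to fix
            break
        draft = generate(task, feedback)
    return draft

# Hypothetical stand-in for an LLM: the first draft has a bug,
# and the revision made after feedback fixes it.
def fake_generate(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "def add(a, b): return a - b"
    return "def add(a, b): return a + b"

# Hypothetical critic: executes the draft and checks one example.
# (Running model output with exec is unsafe outside a sandbox.)
def fake_critique(draft: str) -> Optional[str]:
    namespace: dict = {}
    exec(draft, namespace)
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should return 5"
    return None

final_draft = reflect_loop(fake_generate, fake_critique, "write add(a, b)")
print(final_draft)
```

The `max_rounds` cap is a design choice for the sketch: without it, a critic that is never satisfied would keep the loop running indefinitely.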

Tool Utilization: The ability of LLMs to use external tools, such as web searches or code execution, to gather and process information. This allows agents to obtain the latest data and carry out complex tasks without human intervention. Image source: Agentic Design Patterns
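One common way to expose tools to a model is a registry that maps tool names to ordinary functions. The sketch below assumes the model has already picked a tool name and arguments; the tool names (`today`, `word_count`) and the `call_tool` helper are hypothetical and not taken from any specific framework.

```python
from datetime import date
from typing import Callable, Dict

# Registry mapping tool names to plain Python functions.
TOOLS: Dict[str, Callable[..., str]] = {}

def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Decorator that registers a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def today() -> str:
    """Return the current date in ISO format."""
    return date.today().isoformat()

@tool
def word_count(text: str) -> str:
    """Count whitespace-separated words in a string."""
    return str(len(text.split()))

def call_tool(name: str, **kwargs: str) -> str:
    """Run a registered tool; report unknown names instead of raising."""
    if name not in TOOLS:
        return f"error: unknown tool '{name}'"
    return TOOLS[name](**kwargs)

result = call_tool("word_count", text="agents can use tools")
print(result)
```

Returning an error string for an unknown tool, rather than raising, lets the message be fed back to the model as an observation so it can try a different tool.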

Planning: The LLM autonomously determines and executes the sequence of steps required to achieve the target goal. By breaking down a complex task into smaller subtasks, it can reach the objective more efficiently. In other words, it makes decisions. Image source: Agentic Design Patterns
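To make the decomposition idea concrete, a plan can be represented as subtasks with dependencies and executed in dependency order. In this sketch the plan is hard-coded with hypothetical step names; in a real agent the LLM would produce it. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

# Each subtask maps to the set of subtasks that must finish first.
# The step names are illustrative only.
plan: Dict[str, Set[str]] = {
    "fetch_sales_data": set(),
    "fetch_cost_data": set(),
    "compute_profit": {"fetch_sales_data", "fetch_cost_data"},
    "write_report": {"compute_profit"},
}

def run_plan(plan: Dict[str, Set[str]], run_step: Callable[[str], None]) -> List[str]:
    """Execute every subtask after all of its prerequisites."""
    order = list(TopologicalSorter(plan).static_order())
    for step in order:
        run_step(step)
    return order

executed: List[str] = []
order = run_plan(plan, executed.append)
print(order)
```

The relative order of the two independent fetch steps is not fixed by the plan; only the dependency constraints are guaranteed.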

Multi-Agent Collaboration: Multiple AI agents each take on specialized roles, working together to efficiently solve tasks. Each agent leverages its expertise to deliver the best outcome. Image source: Agentic Design Patterns
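A very simple form of multi-agent collaboration is a pipeline in which each agent has a role and passes its output to the next. The roles and lambda handlers below are hypothetical placeholders for role-specific LLM calls; real systems often use richer message passing than this linear hand-off.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    role: str
    handle: Callable[[str], str]  # stand-in for a role-specific LLM call

def run_pipeline(agents: List[Agent], task: str) -> List[str]:
    """Pass the task through each agent in turn and record what each said."""
    transcript: List[str] = []
    message = task
    for agent in agents:
        message = agent.handle(message)
        transcript.append(f"{agent.role}: {message}")
    return transcript

team = [
    Agent("researcher", lambda topic: f"notes on {topic}"),
    Agent("writer", lambda notes: f"draft based on {notes}"),
    Agent("reviewer", lambda draft: f"approved: {draft}"),
]

transcript = run_pipeline(team, "AI agents")
for line in transcript:
    print(line)
```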

Can AI Agents Do Everything?

AI Agents excel at simple, well-defined tasks, particularly small-scale, repetitive tasks that occur routinely. On the other hand, they are not as suitable for large-scale, infrequent tasks. Since AI Agents cannot handle every kind of task (they are not AGI), it’s important to maintain realistic expectations and understand current limitations.


How to Implement an AI Agent

Implementing AI Agents involves multiple challenges, such as state management, tool execution, data storage, deployment strategies, and framework selection, which makes it more complex than using LLMs or LMMs (large multimodal models) alone. Some representative challenges include:

Complex State Management:

Safe Actions:

Storage Requirements:

Deployment Difficulties:

Framework Selection:

Context Window Structure:

Multi-Agent Communication:

Azure AI Agent Service

Image source: Azure AI Agent Service

Azure AI Agent Service is an enterprise-grade AI agent development and deployment platform provided by Microsoft. It integrates cutting-edge models, data, tools, and services to support the automation of complex business processes.

Key Features:

Azure AI Agent Service addresses enterprise challenges like rapid process automation, broad tool/system integration, data privacy assurance, agent cost/performance monitoring, interoperability, and scalability.

Amazon Bedrock Agents

Image source: Amazon Bedrock Agents

Amazon Bedrock Agents, offered by AWS, accelerates the development of generative AI applications. Using this service, you can create agents that break down user requests into multiple steps and automatically perform necessary API calls or data retrieval.

Key Features:

These features enable rapid development of generative AI applications with Amazon Bedrock Agents.

Vertex AI Agent Builder

Image source: Vertex AI Agent Builder

Vertex AI Agent Builder from Google Cloud simplifies the building and deployment of generative AI applications. It caters to a range of developer needs, from no-code agent-building consoles to open-source frameworks (e.g., LangChain on Vertex AI).

Key Features:

Vertex AI Agent Builder allows developers to efficiently create AI agents that optimize complex business processes and improve customer experiences.

OSS and Other Services

In addition to the three cloud vendor solutions (Azure/AWS/GCP), a wide array of technologies and stacks is available to suit various use cases. Image source: The AI agents stack

Challenges and Solutions in AI Agent Implementation

The tools and services above support AI Agent development, but additional concepts and techniques help with the core challenges. In particular, enabling agents to make decisions (Planning) and to take actions (Action) are the key hurdles.
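Planning and Action are typically combined in a loop: decide on the next action, execute it, record the observation, and repeat until done. The following is a minimal sketch under stated assumptions: `fake_decide` is a hypothetical stand-in for the LLM's decision step, and `multiply` is a toy tool.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, str]  # (tool name, argument); "finish" ends the loop

def agent_loop(
    decide: Callable[[str, List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    goal: str,
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate between deciding and acting until the agent finishes."""
    history: List[str] = []
    for _ in range(max_steps):
        name, arg = decide(goal, history)
        if name == "finish":
            return arg, history
        observation = tools[name](arg)
        history.append(f"{name}({arg}) -> {observation}")
    return "step limit reached", history

def multiply(expr: str) -> str:
    """Toy tool: multiply two integers written as 'a*b'."""
    left, right = expr.split("*")
    return str(int(left) * int(right))

# Hypothetical decision step: call the tool once, then finish
# with the observation it produced.
def fake_decide(goal: str, history: List[str]) -> Action:
    if not history:
        return ("multiply", "6*7")
    return ("finish", history[-1].split(" -> ")[1])

answer, history = agent_loop(fake_decide, {"multiply": multiply}, "What is 6 times 7?")
print(answer, history)
```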

Chain-of-Thought (CoT)

Chain-of-Thought (CoT) is a prompt engineering technique that guides the model behind an AI agent to break a complex task down into step-by-step reasoning before answering. This approach tends to yield more accurate and consistent results on multi-step problems.

Image source: Mastering Chain of Thought (CoT) Prompting for Practical AI Tasks
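As a rough illustration, a CoT prompt can be assembled from a few worked examples plus a cue that asks for step-by-step reasoning. The `Q:` / `Reasoning:` / `Answer:` layout and both helper functions are assumptions made for this sketch, not a prescribed format.

```python
from typing import Iterable, Tuple

Example = Tuple[str, str, str]  # (question, reasoning, answer)

def build_cot_prompt(question: str, examples: Iterable[Example] = ()) -> str:
    """Build a prompt with optional worked examples and a step-by-step cue."""
    parts = []
    for ex_question, ex_reasoning, ex_answer in examples:
        parts.append(
            f"Q: {ex_question}\nReasoning: {ex_reasoning}\nAnswer: {ex_answer}\n"
        )
    parts.append(f"Q: {question}\nReasoning: Let's think step by step.")
    return "\n".join(parts)

def extract_answer(completion: str) -> str:
    """Return the text after the last 'Answer:' line, or '' if none exists."""
    for line in reversed(completion.splitlines()):
        if line.startswith("Answer:"):
            return line[len("Answer:"):].strip()
    return ""

prompt = build_cot_prompt(
    "A box holds 4 pens. How many pens are in 3 boxes?",
    examples=[("What is 2 + 2 * 3?", "Multiply first: 2 * 3 = 6, then 2 + 6 = 8.", "8")],
)
print(prompt)

# A completion in the same layout can then be parsed for the final answer.
parsed = extract_answer("Reasoning: 3 boxes x 4 pens = 12 pens.\nAnswer: 12")
```

Separating the reasoning from a clearly marked final answer line makes the result easy to parse while keeping the intermediate steps available for inspection.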

Best Practices for CoT Prompting:

Large Action Models (LAM)

Large Action Models (LAMs) integrate decision-making abilities into LLMs, allowing them to understand human intentions and autonomously execute complex tasks. Image source: Large action models (LAMs): The foundation of AI agents

Creating LAMs from LLMs:

“The main talk about LAMs started with Rabbit AI’s release of R1, but there are a few other players in the game. In particular, the recent release of Anthropic’s Claude features shook the AI community with what’s possible in agentic AI.” Large action models (LAMs): The foundation of AI agents

Tools for Enabling Actions

A major challenge is enabling AI agents to perform actions (e.g., calling APIs, running scripts). Cloud vendors support this within their own ecosystems, while outside those ecosystems you can use the tool-calling features offered by OpenAI, Anthropic, and others.
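Whichever provider is used, the agent's requested action usually arrives as structured data that should be validated before anything runs. The sketch below assumes a hypothetical JSON shape with `name` and `arguments` keys and a hypothetical allowlist; actual provider formats differ, so treat this only as an illustration of the validation step.

```python
import json
from typing import Dict, Set, Tuple

# Hypothetical allowlist: action name -> exact set of required argument names.
ALLOWED_ACTIONS: Dict[str, Set[str]] = {
    "get_weather": {"city"},
    "send_email": {"to", "subject", "body"},
}

def parse_action(raw: str) -> Tuple[str, dict]:
    """Parse a model-emitted action and reject anything not on the allowlist."""
    try:
        action = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(action, dict):
        raise ValueError("action must be a JSON object")
    name = action.get("name")
    args = action.get("arguments", {})
    if name not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {name!r}")
    if not isinstance(args, dict) or set(args) != ALLOWED_ACTIONS[name]:
        raise ValueError(f"wrong arguments for {name!r}")
    return name, args

name, args = parse_action('{"name": "get_weather", "arguments": {"city": "Tokyo"}}')
print(name, args)
```

Rejecting unknown actions and unexpected arguments up front keeps the execution layer small and makes unsafe requests fail before they reach any real API.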


Conclusion

AI Agents represent the evolution of AI from information tools to intelligent partners capable of autonomous decision-making and action. They can handle complex decision-making, task decomposition, and external tool utilization, proving valuable in both business and everyday applications.

However, challenges remain in real-world deployment, including reliability, safety, error control, and explainability. As related technologies and frameworks advance, AI Agents are expected to become more accessible, reliable, and instrumental in creating new sources of value.


References
