Why Agents Are the Most Useful Part of AI Chat

The journey of so many users of generative AI begins with the blank message box. And then it ends there too! The entire world of options is open before a user, and it is too much. It’s easier to go back to Word, Outlook, and Excel than it is to stare down infinite possibility.

Nearly as bad, it’s essentially guaranteed that the first thing a new user tries will end in failure. They probably will not be specific enough and will get a generic slop answer, or they will try something that the AI doesn’t have access to and get a hallucination instead of an answer.

Agents address these problems: they give users something to try that gives great results, and they help with useful tasks. Agents turn expensive software licenses into dramatic productivity gains for all information workers.

Why do we have agents?

Agents in Microsoft 365 Copilot are simply collections of related functionality that are easy to access. When selected, they change the functionality of Copilot Chat. This isn’t always more capability; they often have less access to data and tools than the base mode of Copilot. This sounds pretty basic, so what good are they?

AI chat would be too difficult to use without agents, and it would be much less effective. Agents serve three major purposes within Microsoft 365 Copilot: wayfinding, prompt engineering, and accessing external systems.

Finding your place

Chatting with a generative AI is very powerful, but it is also difficult to learn. The first purpose that agents serve is a type of user education.

There’s little more humbling than watching a new user open up Copilot Chat in a user study. The empty chat box and submit button provide very little to “grab onto.” Beyond the suggested prompts, there’s little on the screen that guides the user to what they can do.

This is the first purpose of agents: a familiar handle that is already in the user’s navigation bar (because their organization installed it for them, or because they clicked a share link from a teammate). When the agent interface is implemented well, there will be a familiar product or purpose visible on the screen, and the user is going to start there.

We call the screen that opens when you click on an agent the “focused view.” It continues user education with conversation starters: prompts to try, as suggested by the developer. Once the user is successful with an agent, they can use it again. By using an agent, they are also learning to use the rest of Copilot Chat better.

The view is focused in another way, too: on a specific domain or purpose, through knowledge selection and prompt engineering.

Prompt engineering

The number of tasks that Copilot can help with is astronomical. Due to the immensity of opportunity, you have to be quite specific about what exactly you want to do. Learning to work with AI through an iterative conversation is a good model, but it’s even better to succeed with a short prompt, immediately.

Prompt engineering is pretty hard though, and it takes time. Agents allow for someone with skill in prompt engineering to share their work with their colleagues or customers. Some of the agents I create only require one-word prompts, because I put in the time to make the instructions as effective as possible.

Engineering an agent to be excellent at a skill or domain fits naturally with selecting a subset of grounding knowledge. If you’re an information worker, you have access to millions of pieces of content. It can be hard to find exactly the information you are looking for. Scoping what documents and locations an agent can access makes it (and the user) more successful at the task.
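To make the scoping idea concrete, here is a minimal sketch in Python. It is not how Copilot implements knowledge sources; the `Document` type, the location strings, and the keyword `search` function are all hypothetical stand-ins. It only illustrates why a narrower set of locations returns fewer irrelevant hits for the same short query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    location: str  # hypothetical folder or site path, e.g. "hr/policies"
    title: str
    text: str


def scope_documents(documents, allowed_locations):
    """Keep only documents stored under one of the allowed locations."""
    prefixes = tuple(loc.rstrip("/") + "/" for loc in allowed_locations)
    return [
        d for d in documents
        if (d.location.rstrip("/") + "/").startswith(prefixes)
    ]


def search(documents, query):
    """Naive keyword match, standing in for a real retrieval system."""
    terms = query.lower().split()
    return [d for d in documents if all(t in d.text.lower() for t in terms)]


# Made-up content for illustration only.
ALL_DOCS = [
    Document("hr/policies", "Parental leave", "Leave policy: 16 weeks of parental leave."),
    Document("sales/decks", "Q3 pitch", "Customers may leave feedback on the deck."),
    Document("hr/benefits", "Time off", "Annual leave accrues monthly."),
]

# An HR-focused agent only sees documents under "hr".
hr_docs = scope_documents(ALL_DOCS, ["hr"])

unscoped_hits = [d.title for d in search(ALL_DOCS, "leave")]
scoped_hits = [d.title for d in search(hr_docs, "leave")]
```

With everything in scope, the one-word query "leave" also matches the sales deck. With the agent scoped to the HR locations, the same query only returns the two HR documents.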

When an agent has a role that is familiar to the user, engineered instructions, and only the relevant content, it becomes an amazing specialist at that role’s tasks. A “specialist” is one of the two patterns we see for agents in our user research. The other is a “doer.” A doer agent is able to take actions beyond returning text.

Taking action

As agents are already focused on a role or task and have a surface to teach users effective use patterns, they’re the ideal place to host tools. Tools to a language model are simply descriptions of programmatic interfaces to another computer system. These tools (also called functions or skills) are critical, as they allow the AI to do things other than send text to the user.
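A small sketch may help show what "a description of a programmatic interface" means in practice. The `Tool` class, the `get_ticket_status` function, and the JSON shape of the model's request below are assumptions made for illustration; they are not the format Copilot or any particular model actually uses. The point is the split: the model only sees the description, and ordinary code does the work.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    description: str             # what the model reads to decide when to call it
    parameters: dict[str, Any]   # JSON-Schema-style description of the arguments
    handler: Callable[..., Any]  # the real code that talks to the other system

    def describe(self) -> dict[str, Any]:
        """The part shown to the model: a description, not the code itself."""
        return {
            "name": self.name,
            "description": self.description,
            "parameters": self.parameters,
        }


def get_ticket_status(ticket_id: str) -> dict[str, str]:
    # Hypothetical stand-in for a call to a real ticketing system.
    fake_db = {"T-100": "open", "T-101": "closed"}
    return {"ticket_id": ticket_id, "status": fake_db.get(ticket_id, "unknown")}


ticket_tool = Tool(
    name="get_ticket_status",
    description="Look up the current status of a support ticket by its ID.",
    parameters={
        "type": "object",
        "properties": {"ticket_id": {"type": "string"}},
        "required": ["ticket_id"],
    },
    handler=get_ticket_status,
)


def run_tool_call(tools: dict[str, Tool], call_json: str) -> Any:
    """Execute a tool call that the model emitted as JSON."""
    call = json.loads(call_json)
    tool = tools[call["name"]]
    return tool.handler(**call["arguments"])


tools = {ticket_tool.name: ticket_tool}

# Pretend the model replied with this structured request instead of plain text.
model_output = '{"name": "get_ticket_status", "arguments": {"ticket_id": "T-100"}}'
result = run_tool_call(tools, model_output)
```

Here `result` holds the ticket's status as returned by the stand-in function. In a real system, that result would be passed back to the model so it can phrase an answer for the user.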

Tools expand what the chat interface can do. The buzzword RAG (Retrieval Augmented Generation) is just one type of tool. As with the wide scope of available data, there are any number of computer systems and software applications that the user could need to work with.

Focusing an agent on a related set of tools doesn’t only help the user; it also helps the model. If you attempt to give an AI dozens or hundreds of tools, its reliability decreases quickly. Having different agents directs the user to the right place to go and limits the AI to a set of tools it can use reliably.
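As a rough illustration of that last point, the sketch below gives each agent a short list of tool names and hands the model only those descriptions. The `Agent` class, the agent names, and the tool catalog are hypothetical and exist only for this example; real agent platforms configure this differently.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    instructions: str
    tool_names: list[str] = field(default_factory=list)


# Hypothetical catalog of every tool the organization has connected.
ALL_TOOLS = {
    "search_hr_policies": "Search HR policy documents.",
    "submit_time_off": "Submit a time-off request.",
    "lookup_invoice": "Look up an invoice by number.",
    "create_expense_report": "Create an expense report.",
    "get_ticket_status": "Check a support ticket.",
}

# Each agent is limited to the few tools that match its role.
AGENTS = {
    "HR Helper": Agent(
        "HR Helper",
        "Answer HR questions.",
        ["search_hr_policies", "submit_time_off"],
    ),
    "Finance Helper": Agent(
        "Finance Helper",
        "Help with finance tasks.",
        ["lookup_invoice", "create_expense_report"],
    ),
}


def tools_for(agent_name: str) -> dict[str, str]:
    """Return only the tool descriptions the selected agent may use."""
    agent = AGENTS[agent_name]
    return {name: ALL_TOOLS[name] for name in agent.tool_names}


# Selecting the HR agent shrinks the model's choices from five tools to two.
hr_tools = tools_for("HR Helper")
```

The user picks the agent by role, and that choice quietly decides which small set of tools the model has to choose from.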

Don’t let your licenses gather dust

I’m very proud of how agents are helping employees all over the world. But I know that we have a lot of improvements to make in each of their purposes:

  • Wayfinding: we just redesigned the interface for agents in the Microsoft 365 Copilot app. Now they are easier to discover, create, edit, and share. Still, we have only implemented the first phase of our updates. We aim to help all users get more value from agents, whether new or experienced.
  • Prompt engineering: although we recently released many tools for developers, it is still difficult to perform automated evaluation. Meanwhile, updated models and orchestration make specialized prompt engineering less and less important.
  • Tools: the recent release of support for MCP servers as tools was an important first step toward MCP’s objective of becoming the “USB-C for AI” in Copilot. We are in early days, and we have a lot of ideas.

I’ll keep working to make agents in Microsoft 365 Copilot even better, but right now deploying them remains the smartest decision a leader can make for their company. Don’t let your licenses gather dust. Forget the training videos. Find or build excellent agents, then pre-install them for your users.

If you need help building agents on any platform, check out my Agent Best Practices series. You’ll get real results, company-wide. Agents are not magic, but they are just enough structure to get any user moving and succeeding.