Agent2Agent: Why Agents Need to Talk to Each Other

This blog is solely my personal opinion.

This week, Google announced the open Agent2Agent protocol, developed in partnership with more than 50 technology companies. I think Agent2Agent will be much more important than MCP.

This weekend I built an agent compatible with the standard. Check out what I learned about Agent2Agent, Semantic Kernel, Azure Functions, and the future of the internet!

Why do we need this?

We’ve had the current form of AI Agents since OpenAI built the custom GPTs feature in November 2023. Instead of training an ML model or building an AI wrapper application around APIs, developers only need to specify instructions, knowledge, and tools to make an agent. This is easy enough for regular users to do, making application development more democratic than ever.

I’ve tried several of the agent-building products in the industry as competitive research. Every agent product designed for end users shares one trait: the agents are locked into the host software. You can’t use a custom GPT outside of ChatGPT, despite that feature request being logged the same month the feature launched.

Agent2Agent is designed as a protocol specifically to break down that barrier. If the agent-building system is compatible, all agents created through it can be used in any chat interface that also supports Agent2Agent.

Discovery

The first problem Agent2Agent solves is discovery of agents. Click this link for the agent I created this weekend:

https://agent-too-agent.azurewebsites.net/api/.well-known/agent.json

If I’m still hosting this agent while you’re reading this, you’ll get a JSON document that describes the agent’s capabilities, AKA the Agent Card. Any website or software can add an agent and describe it in a path like this, to be discovered by any AI.
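The exact schema comes from the Agent2Agent spec, but the shape is easy to sketch. Here is a rough illustration of an Agent Card as a Python dict; the field names follow the early protocol drafts and the values are placeholders for what my agent actually serves, so treat this as approximate rather than authoritative:

```python
import json

# Illustrative Agent Card, hedged against the evolving A2A spec.
# Field names ("capabilities", "skills", etc.) follow early drafts.
agent_card = {
    "name": "Travel Agent",
    "description": "Plans trips and converts currencies.",
    "url": "https://agent-too-agent.azurewebsites.net/api/",
    "version": "1.0.0",
    "capabilities": {
        "streaming": True,          # supports tasks/sendSubscribe
        "pushNotifications": False,
    },
    "skills": [
        {
            "id": "currency_exchange",
            "name": "Currency Exchange",
            "description": "Converts amounts between currencies.",
        }
    ],
}

print(json.dumps(agent_card, indent=2))
```

The important idea is that the card is static, public JSON at a well-known path: anything that can fetch a URL can discover what the agent does.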

I’ll continue the meme of overusing “book travel” as an end-user agent scenario. Many startups and labs demo their web-browsing AI learning on the fly how to use an airline website. Right now this sometimes works, slowly. If Agent2Agent succeeds, your AI agent will just ask the airline’s booking agent to book travel.

The advantage here is that the airline has an incentive to make its business very usable by users’ AIs. The airline is best suited to build an AI version of its booking process, rather than leaving expensive models to book by navigating a website designed for humans.

Send a message

I built my agent with the Semantic Kernel sample in the Agent2Agent repository. The repository is set up for local use and development, so I put it into an Azure Functions project. I’ve never actually used Azure Functions, so this was a double learning experience for me.

It seems odd to me that there has never been a standard way to call an agent developed by someone else. Now there is, and it is very simple. The most basic message to an Agent2Agent endpoint is to POST {"user_input": "What can you do?", "session_id": "123"}. The response’s content will be a similarly simple JSON document. You can run this PowerShell script to interact with my travel planning and currency conversion agent:

$prompt = "Plan a 1-week trip to Paris with currency conversion and 5 things to see"
$functionUrl = "https://agent-too-agent.azurewebsites.net/api/tasks/send"

$requestBody = @{
    user_input = $prompt
    session_id = (New-Guid).Guid
} | ConvertTo-Json

Invoke-RestMethod -Uri $functionUrl -Method Post -Body $requestBody -ContentType "application/json" | fl

PowerShell actually makes this look harder than it is. There’s not much in computer-to-computer interfaces simpler than POSTing two parameters to a URL!

EDIT: I made a mistake in this article, and this PowerShell command will no longer work. My mistake was that Agent2Agent uses JSON-RPC, not REST. That means you need to POST multiple parameters, including the name of the method you want to call, to a single endpoint. I updated my server to use JSON-RPC instead, but I will leave this article alone beyond this note.
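For reference, the JSON-RPC version wraps the same request in an envelope that names the method being called. Here is a sketch of what the corrected request body looks like; the field names (“tasks/send”, “sessionId”, the message “parts”) follow the early protocol drafts, so check the spec for the current schema:

```python
import json
import uuid

# Sketch of an A2A request as JSON-RPC rather than plain REST.
# The envelope names the method; the params carry the task details.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",
    "params": {
        "id": str(uuid.uuid4()),          # task id
        "sessionId": str(uuid.uuid4()),
        "message": {
            "role": "user",
            "parts": [{"type": "text", "text": "What can you do?"}],
        },
    },
}

body = json.dumps(request)
```

Every call now goes to a single endpoint, and the `method` field inside the body replaces the per-operation URL path.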

This is backed by GPT-4.1-nano, so please don’t expect excellent answers here. It does seem to reliably give correct currency exchange rates. This is actually a three-agent system: TravelManagerAgent, ActivityPlannerAgent, and CurrencyExchangeAgent. This last one has a plugin to the Frankfurter currency API.
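The manager/specialist pattern is simple enough to sketch without Semantic Kernel. The agent names below come from my sample, but the routing logic is my own toy illustration: the real system uses an LLM to decide which specialist handles each sub-task, and a keyword match stands in for that here.

```python
# Toy sketch of manager-agent delegation. In the real sample, an LLM
# routes sub-tasks; a keyword match substitutes for it here.
def currency_exchange_agent(task: str) -> str:
    return f"CurrencyExchangeAgent: looked up rates for '{task}'"

def activity_planner_agent(task: str) -> str:
    return f"ActivityPlannerAgent: planned '{task}'"

def travel_manager_agent(user_input: str) -> list[str]:
    """Route each sub-task to the specialist whose skill matches."""
    results = []
    for task in user_input.split(" and "):
        if "currency" in task.lower():
            results.append(currency_exchange_agent(task))
        else:
            results.append(activity_planner_agent(task))
    return results

print(travel_manager_agent("plan a week in Paris and currency conversion"))
```

The point of the pattern is that the caller only ever talks to the manager; the specialists stay an internal detail of the agent.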

And if it stops working entirely, I may have run out of money I’m willing to spend on hosting a travel agent!

Some of the Functions Code

Azure Functions isn’t actually too hard once you get the hang of it, assuming you have an AI help you write it. Functions is basically a web endpoint that runs a bit of short-lived code. Here it runs some Python code to handle the JSON (I’ve removed the logging and error-handling for simplicity):

@app.route(route="tasks/send", methods=["POST"])
async def send_task(req: func.HttpRequest) -> func.HttpResponse:
    """Handle the /tasks/send function for sending tasks."""
    # Pull the two parameters out of the POSTed JSON body
    body = req.get_json()
    user_input = body["user_input"]
    session_id = body["session_id"]

    response = await travel_agent.invoke(user_input, session_id)
    response_json = json.dumps(response)
    return func.HttpResponse(
        body=response_json,
        status_code=200,
        mimetype="application/json"
    )

It otherwise calls the invoke method of the travel_agent that I’ve instantiated from the Semantic Kernel sample.

A neat thing about Azure Functions is that it is very easy to test both locally and when hosted publicly on Azure. GitHub Copilot gave me a trick to give a different URL property in the Agent Card based on where it is currently being hosted:

@app.route(route=".well-known/agent.json")
def get_agent_card(req: func.HttpRequest) -> func.HttpResponse:
    # Copy the card so the shared template isn't mutated per request
    modified_card = agent_card.model_copy()
    # Derive the base URL from whatever host served this request
    base_url = str(req.url).rsplit('/.well-known/agent.json', 1)[0]
    modified_card.url = base_url + '/'

    agent_card_json = json.dumps(modified_card, default=lambda o: o.__dict__)
    return func.HttpResponse(agent_card_json, mimetype='application/json')

More complicated features

If all it took for AI agents to talk to each other was POSTing two properties, we wouldn’t really need a whole standard protocol. There’s actually way more to handle:

  • Streaming responses
  • Partial answers and “in progress” messages
  • Answers that take minutes or hours
  • Images and documents
  • Sharing citations
  • Push notifications

My live agent now supports “in progress” messages if you call the tasks/sendSubscribe endpoint. I’d like to add each capability of the Agent2Agent protocol. You can follow along in my GitHub repository, Agent-Too-Agent.
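Streaming in Agent2Agent rides on Server-Sent Events: the tasks/sendSubscribe endpoint emits `data: <json>` lines as the task progresses. Here is a sketch of parsing that stream with only the standard library; the `status` and `text` fields in the fake payloads are illustrative, not the exact protocol schema, and the `StringIO` stream stands in for a live HTTP response:

```python
import json
from io import StringIO

# Sketch of consuming the Server-Sent Events that tasks/sendSubscribe
# returns. Payload fields here are illustrative, not the exact schema.
def read_sse_events(stream):
    """Yield each JSON payload from a stream of SSE 'data:' lines."""
    for line in stream:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):])

# Simulated response so the sketch runs without a live agent.
fake_response = StringIO(
    'data: {"status": "working"}\n'
    '\n'
    'data: {"status": "completed", "text": "Trip planned!"}\n'
)

events = list(read_sse_events(fake_response))
```

This is what makes “in progress” messages cheap to support: the client just keeps reading events until one reports a terminal status.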

But really, we don’t know what else the protocol will need in the future. This is like the HTML 1.0 version of the protocol! Next week I hope to stand up a free web Agent2Agent client so you don’t need to copy/paste PowerShell commands.

Abram Jackson
