# A2A Protocol & MCP Context Funnel Demo (a2a-protocol-demo3)

This is a complete Python project example designed to demonstrate two core pillars of modern Agent systems: the **A2A (Agent-to-Agent) communication protocol** and **MCP (Model Context Protocol) context management**.
The project implements standardized communication between Agents (JSON-RPC + SSE) and, more importantly, builds a **Context Funnel** engine. This engine visually demonstrates how data is transformed between "Raw Slots" and the final "Prompt," addressing the token-overflow problem in LLM development through dynamic pruning and semantic compression.
## 🌟 Core Features
1. **A2A Communication Layer**:
   * Uses **JSON-RPC 2.0** for command interaction.
   * Implements **SSE (Server-Sent Events)** for streaming responses and status feedback.
   * Provides a standard `AgentCard` capability-discovery mechanism.
2. **MCP Data Layer (Slots)**:
   * Defines the **Slot** structure: the smallest semantic unit that can be compressed, stored, and retrieved.
   * Distinguishes between **Artifact** (transmission state) and **Prompt Slot** (thinking state).
3. **Core Engine: Context Funnel**:
   * Simulates the limited token window of an LLM.
   * Implements a **dynamic compression algorithm**: when the conversation grows too long, old history Slots are automatically merged into a single `Summary Slot`, while the latest user input is retained verbatim.
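The Slot structure described above can be sketched as a small dataclass. Note that the field names and the word-based token estimate here are illustrative assumptions; the real definitions live in `protocol.py`.

```python
# Hypothetical sketch of the Slot structure; field names and the naive
# token estimate are assumptions -- the real definitions live in protocol.py.
from dataclasses import dataclass, field
import time


@dataclass
class Slot:
    role: str                 # e.g. "user", "assistant", "system_summary"
    content: str              # the semantic payload
    created_at: float = field(default_factory=time.time)
    compressed: bool = False  # True once merged into a Summary Slot

    def token_count(self) -> int:
        # Naive token estimate: whitespace-separated words.
        return len(self.content.split())


slot = Slot(role="user", content="Hello, Agent!")
print(slot.token_count())  # 2
```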
## 📂 Project Structure
```text
a2a-protocol-demo3/
├── requirements.txt # Dependency list
├── protocol.py # [Base] Defines Slot data structure and A2A protocol package
├── engine.py # [Core] Context Funnel engine (compression and pruning logic)
├── server.py # [Server] Agent entity, includes memory storage and API
└── client.py # [Client] Simulates user interaction to observe Prompt changes
```
## 🚀 Quick Start
### 1. Install Dependencies
Make sure you are using Python 3.8+.
```bash
pip install -r requirements.txt
```
*If you haven't created `requirements.txt`, please install the following packages:*
`pip install fastapi uvicorn sse-starlette httpx pydantic`
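Based on the packages listed above, a matching `requirements.txt` could look like this (unpinned versions, since the project does not specify any):

```text
fastapi
uvicorn
sse-starlette
httpx
pydantic
```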
### 2. Start the Server
The server runs at `http://localhost:8000` and hosts an Agent with a deliberately tiny context window (**50 tokens**), forcing it to compress its memory frequently.
```bash
python server.py
```
### 3. Run the Client
The client will simulate multi-turn conversations with the Agent and print out the **internal thought process of the server**.
```bash
python client.py
```
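Under the hood, each client request is wrapped in the JSON-RPC 2.0 envelope mentioned in the feature list. A minimal sketch of building such an envelope follows; the method name `tasks/send` and the params shape are illustrative assumptions, not necessarily this project's actual API.

```python
# Sketch of a JSON-RPC 2.0 request envelope, as used by the A2A layer.
# The method name and params shape are assumptions for illustration.
import json
import uuid


def make_jsonrpc_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope with a unique id."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": method,
        "params": params,
    }


req = make_jsonrpc_request("tasks/send", {"message": "Hello!"})
print(json.dumps(req, indent=2))
```

The server would stream its response back over SSE rather than returning a single JSON body, which is why the client reads events incrementally.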
## 🔍 Key Observations (Learning Notes)
When running `client.py`, pay special attention to the **[Under the Hood]** section of the console output. You will see the following evolution process:
### Stage 1: Short Conversation (Uncompressed)
At this point, the history is short, and the Context Funnel passes through directly.
```text
[User]: Hello!
🔍 [Under the Hood]
- Actual Prompt fed to LLM:
[USER]: Hello!
```
### Stage 2: Trigger Compression (Token Overflow)
When a long message is sent, the token count exceeds the 50-token limit, triggering the Context Funnel to compress old history into a summary.
```text
[User]: I am learning A2A protocol...
🔍 [Under the Hood]
- Actual Prompt fed to LLM:
[SYSTEM_SUMMARY]: Summary(2 rounds) <-- Old memory becomes a summary
[USER]: I am learning A2A protocol... <-- Latest input is fully retained
```
## 🧠 Core Architecture Analysis
### Context Funnel
This is the logical core of the Demo (`engine.py`). It addresses the contradiction of **"infinite conversation history vs. limited model window"** in LLM development.
Flowchart:
```mermaid
graph LR
    A["Raw Memory (Database)"] -->|Retrieve full history| B(Context Funnel)
    B -->|Calculate tokens| C{Exceeds limit?}
    C -- No --> D[Directly generate Prompt]
    C -- Yes --> E[Compression strategy]
    E -->|1. Retain latest user input| F[Prompt]
    E -->|2. Compress old history into Summary| F
    F -->|Final input| G[LLM]
```
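The decision flow above can be sketched in a few lines of Python. This is a simplified illustration, not the actual `engine.py` logic: the word-based token count and the `Summary(N rounds)` format are assumptions modeled on the sample output shown earlier.

```python
# Simplified sketch of the Context Funnel decision flow.
# Token counting (whitespace words) and the summary format are assumptions;
# the real compression logic lives in engine.py.
from typing import List


def count_tokens(text: str) -> int:
    return len(text.split())


def context_funnel(history: List[str], latest_input: str, limit: int = 50) -> str:
    """Return the prompt fed to the LLM, compressing old history if needed."""
    full = history + [latest_input]
    total = sum(count_tokens(line) for line in full)
    if total <= limit:
        # Under the limit: pass the history through unchanged.
        return "\n".join(full)
    # Over the limit: collapse old history into one summary slot,
    # keeping the latest user input fully intact.
    summary = f"[SYSTEM_SUMMARY]: Summary({len(history)} rounds)"
    return "\n".join([summary, latest_input])


history = ["[USER]: Hello!", "[AGENT]: Hi, how can I help?"]
long_input = "[USER]: " + "A2A " * 60  # well over the 50-token budget
print(context_funnel(history, long_input).splitlines()[0])
# [SYSTEM_SUMMARY]: Summary(2 rounds)
```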
## 📝 License
MIT