# A2A: Multi-Agent System for Automated Invoice Processing
A2A is a multi-agent system written in Python that automates the invoice processing workflow end to end: file scanning, OCR, data extraction, report generation, and email notifications.
## System Architecture
Architecture diagram:

Workflow diagram:

## Core Components
### 1. Agent Framework (a2a/)
- Base agent implementation with message passing protocol
- Asynchronous communication between agents
- Robust error handling and logging
- Message queue management
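
The message-passing idea can be sketched with `asyncio.Queue`. This is a minimal illustration, not the code in `a2a/`: the `Message` dataclass, the `worker` coroutine, and the `None` stop sentinel are all assumptions made for the example.

```python
import asyncio
import logging
from dataclasses import dataclass, field
from typing import Any, Dict, List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("a2a-sketch")


@dataclass
class Message:
    """Hypothetical message envelope; the real a2a/ protocol may differ."""
    sender: str
    recipient: str
    payload: Dict[str, Any] = field(default_factory=dict)


async def worker(name: str, inbox: asyncio.Queue, handled: List[Message]) -> None:
    """Drain the inbox until a None sentinel arrives, logging bad messages."""
    while True:
        message = await inbox.get()
        try:
            if message is None:
                return
            if "path" not in message.payload:
                raise ValueError("payload is missing 'path'")
            handled.append(message)
        except ValueError:
            # A malformed message is logged and skipped so the agent keeps running.
            logger.exception("%s could not handle a message from %s", name, message.sender)
        finally:
            inbox.task_done()


async def demo() -> List[Message]:
    inbox: asyncio.Queue = asyncio.Queue()
    handled: List[Message] = []
    task = asyncio.create_task(worker("ocr", inbox, handled))
    await inbox.put(Message("scanner", "ocr", {"path": "invoices/a.pdf"}))
    await inbox.put(Message("scanner", "ocr", {}))  # malformed on purpose
    await inbox.put(None)  # sentinel: stop the worker
    await task
    return handled
```

Running `asyncio.run(demo())` handles the first message, logs the malformed one, and then stops at the sentinel.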
### 2. Specialized Agents (agents/)
#### File Scanner Agent
- Monitors invoice directory for new files
- Supports PDF and image formats (.pdf, .jpg, .jpeg, .png)
- Uses watchdog for real-time file system events
- Maintains file processing state in PostgreSQL
- Deduplication using file checksums
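
Checksum-based deduplication might look roughly like the sketch below. It assumes SHA-256 and keeps the seen checksums in an in-memory set for brevity, whereas the agent itself stores processing state in PostgreSQL; the function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Set

# Extensions listed above as supported by the File Scanner Agent.
SUPPORTED_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png"}


def file_checksum(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large scans are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_new_invoice(path: Path, seen: Set[str]) -> bool:
    """Return True only for supported files whose content has not been seen before."""
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        return False
    checksum = file_checksum(path)
    if checksum in seen:
        return False
    seen.add(checksum)
    return True
```

Because the check is on content rather than file name, a renamed copy of an already processed invoice is skipped.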
#### OCR Agent
- Extracts text from invoice documents
- Uses Tesseract OCR engine
- Supports multiple document formats
- Pre-processing for better OCR accuracy
#### Data Entry Agent
- Uses local LLM (Mistral via Ollama) for data extraction
- Structured data parsing from OCR text
- Excel and PostgreSQL data storage
- Data validation and error handling
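
The validation step can be illustrated with a small parser for the model's reply. This sketch assumes the LLM is prompted to answer with a JSON object; the field names in `REQUIRED_FIELDS` are hypothetical and not taken from the project's schema.

```python
import json
from typing import Any, Dict

# Hypothetical schema for illustration; the project's actual fields may differ.
REQUIRED_FIELDS = ("invoice_number", "vendor", "total")


def parse_invoice_json(llm_output: str) -> Dict[str, Any]:
    """Pull the JSON object out of the model output and validate its fields."""
    # Models often wrap JSON in extra prose, so slice from the first "{" to the last "}".
    start = llm_output.find("{")
    end = llm_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    record = json.loads(llm_output[start:end + 1])  # JSONDecodeError is a ValueError
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    record["total"] = float(record["total"])  # normalize "12.50" to 12.5
    return record
```

Every failure surfaces as `ValueError`, so a caller can catch one exception type and mark the invoice for manual review.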
#### Report Agent
- Generates customizable reports
- Supports scheduled and immediate reporting
- Data visualization using matplotlib
- Excel report generation
#### Email Agent
- Automated email notifications
- Report attachment handling
- SMTP configuration
- Multiple recipient support
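
A report email with an attachment can be assembled with the standard library's `email.message.EmailMessage`. The helper below is a sketch under assumptions: the function name and arguments are hypothetical, and it only builds the message without sending it.

```python
from email.message import EmailMessage
from typing import List

# MIME subtype for .xlsx files.
XLSX_SUBTYPE = "vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_report_email(sender: str, recipients: List[str], subject: str,
                       body: str, attachment_name: str,
                       attachment_bytes: bytes) -> EmailMessage:
    """Build a multipart message with a plain-text body and one Excel attachment."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(recipients)  # multiple recipients in one header
    message["Subject"] = subject
    message.set_content(body)
    message.add_attachment(
        attachment_bytes,
        maintype="application",
        subtype=XLSX_SUBTYPE,
        filename=attachment_name,
    )
    return message
```

To send it, one option is `smtplib.SMTP(server, port)` followed by `starttls()`, `login()`, and `send_message(message)`, which matches the STARTTLS port 587 used in the example configuration.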
### 3. Web Interface (app.py)
- Built with Gradio
- Real-time processing status
- File upload interface
- System monitoring dashboard
## Technical Requirements
### System Dependencies
- Python 3.8+
- PostgreSQL 12+
- Tesseract OCR
- Poppler Utils
- Ollama (for local LLM)
### Python Dependencies
Key packages and their purposes:
- `asyncio`: Asynchronous I/O
- `watchdog`: File system monitoring
- `pandas`: Data manipulation
- `gradio`: Web interface
- `matplotlib`: Data visualization
- `psycopg2-binary`: PostgreSQL adapter
- `python-dotenv`: Environment management
- `Pillow`: Image processing
- `pytesseract`: OCR integration
- `ollama`: Local LLM integration
## Quick Start Guide
1. Clone the repository:
```bash
git clone https://github.com/namnd00/a2a-multi-agent-system-for-automated-invoice-processing.git
cd a2a-multi-agent-system-for-automated-invoice-processing
```
2. Install all dependencies:
```bash
chmod +x install.sh
./install.sh
```
3. Configure environment variables:
Required environment variables:
```env
# Email Configuration
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
SENDER_EMAIL=your-email@gmail.com
EMAIL_PASSWORD=your-app-password
RECIPIENT_EMAILS=recipient1@email.com,recipient2@email.com
# Database Configuration
DATABASE_URL=postgresql://user:password@localhost:5432/invoice_db
# Report Configuration
REPORT_HOUR=10
REPORT_MINUTE=0
REPORT_MODE=scheduled # or 'immediate'
# LLM Configuration
DATA_ENTRY_MODEL_NAME=mistral # Ollama model name
```
4. Launch the web interface:
```bash
python app.py
```
The interface will be available at `http://localhost:7860`
5. Alternatively, run in CLI mode:
```bash
python main.py
```
## Step-by-Step Installation Guide
1. Clone the repository:
```bash
git clone https://github.com/namnd00/a2a-multi-agent-system-for-automated-invoice-processing.git
cd a2a-multi-agent-system-for-automated-invoice-processing
```
2. Install system dependencies:
```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y tesseract-ocr poppler-utils postgresql postgresql-contrib
# macOS
brew install tesseract poppler postgresql
```
3. Install Ollama:
```bash
# Run the installation script
./install.sh
```
4. Set up the Python environment:
```bash
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# OR
.\venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
```
5. Initialize the database:
```bash
# Start PostgreSQL service
sudo service postgresql start # Linux
# OR
brew services start postgresql # macOS
# Create database and tables
./scripts/init_db.sh
```
6. Configure environment variables:
```bash
# Copy example env file
cp .env.example .env
# Edit .env with your settings
nano .env
```
Required environment variables:
```env
# Email Configuration
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
SENDER_EMAIL=your-email@gmail.com
EMAIL_PASSWORD=your-app-password
RECIPIENT_EMAILS=recipient1@email.com,recipient2@email.com
# Database Configuration
DATABASE_URL=postgresql://user:password@localhost:5432/invoice_db
# Report Configuration
REPORT_HOUR=10
REPORT_MINUTE=0
REPORT_MODE=scheduled # or 'immediate'
# LLM Configuration
DATA_ENTRY_MODEL_NAME=mistral # Ollama model name
```
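
As an illustration of how these variables might be read and checked at startup, here is a minimal sketch. The `Settings` class, `load_settings` function, and the fallback defaults are assumptions for the example, not the project's actual configuration code.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the variables above."""
    smtp_port: int
    recipients: List[str]
    report_hour: int
    report_minute: int
    report_mode: str
    model_name: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and validate settings; the default values here are assumptions."""
    mode = env.get("REPORT_MODE", "scheduled").strip().lower()
    if mode not in ("scheduled", "immediate"):
        raise ValueError(f"REPORT_MODE must be 'scheduled' or 'immediate', got {mode!r}")
    hour = int(env.get("REPORT_HOUR", "10"))
    minute = int(env.get("REPORT_MINUTE", "0"))
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("REPORT_HOUR/REPORT_MINUTE are out of range")
    # RECIPIENT_EMAILS is comma-separated; drop blanks and surrounding spaces.
    raw = env.get("RECIPIENT_EMAILS", "")
    recipients = [address.strip() for address in raw.split(",") if address.strip()]
    return Settings(
        smtp_port=int(env.get("SMTP_PORT", "587")),
        recipients=recipients,
        report_hour=hour,
        report_minute=minute,
        report_mode=mode,
        model_name=env.get("DATA_ENTRY_MODEL_NAME", "mistral"),
    )
```

With `python-dotenv`, calling `load_dotenv()` before `load_settings()` makes the values from `.env` visible in `os.environ`.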
## Usage Guide
### Starting the System
1. Start the Ollama service:
```bash
./scripts/run_ollama.sh
```
2. Launch the web interface:
```bash
python app.py
```
The interface will be available at `http://localhost:7860`
3. Alternatively, run in CLI mode:
```bash
python main.py
```
### Processing Invoices
1. Web Interface:
   - Upload invoices through the web interface
   - Monitor processing status in real time
   - View processing history
2. Directory Monitoring:
   - Place invoice files in the `invoices/` directory
   - The system automatically detects and processes new files
   - Results are stored in `data/invoices.xlsx` and PostgreSQL
### Report Generation
Two modes are available:
1. Scheduled:
   - Daily reports at the specified time
   - Set `REPORT_HOUR` and `REPORT_MINUTE` in `.env`
2. Immediate:
   - A report is generated after each invoice
   - Set `REPORT_MODE=immediate` in `.env`
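
For scheduled mode, the time of the next daily report can be derived from `REPORT_HOUR` and `REPORT_MINUTE`. The helper below is a sketch of that arithmetic; its name is hypothetical and the Report Agent's own scheduling logic may differ.

```python
from datetime import datetime, timedelta


def next_report_time(now: datetime, hour: int, minute: int) -> datetime:
    """Return the next occurrence of hour:minute strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

With the example values `REPORT_HOUR=10` and `REPORT_MINUTE=0`, a call at 09:30 returns 10:00 the same day, and a call at 10:00 or later returns 10:00 the next day.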
Reports include:
- Invoice summary
- Processing statistics
- Data visualizations
- Error reports
## Development Guide
### Project Structure
```
.
├── a2a/ # Core agent framework
├── agents/ # Agent implementations
├── database/ # Database models and utilities
├── utils/ # Helper functions
├── scripts/ # Utility scripts
├── tests/ # Test suite
├── app.py # Web interface
├── main.py # CLI entry point
└── requirements.txt # Python dependencies
```
### Adding New Features
1. Creating a New Agent:
   - Inherit from `BaseAgent` in `a2a/base_agent.py`
   - Implement the `process_message()` and `run()` methods
   - Register the agent in the `AgentSystem` class
2. Extending Database Schema:
   - Add models to `database/models.py`
   - Create a migration script
   - Update the data entry logic
3. Adding Report Types:
   - Extend the `ReportAgent` class
   - Add visualization functions
   - Update email templates
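
The shape of a new agent can be sketched as follows. The `BaseAgent` here is a stand-in written for this example, since the real class in `a2a/base_agent.py` may have different signatures; `ArchiveAgent` and the dictionary message format are likewise hypothetical, and registration in `AgentSystem` is not shown.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseAgent(ABC):
    """Stand-in for the class in a2a/base_agent.py; the real interface may differ."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    async def process_message(self, message: Dict[str, Any]) -> Dict[str, Any]:
        """Handle one message and return a result."""

    @abstractmethod
    async def run(self, inbox: asyncio.Queue) -> None:
        """Consume messages until told to stop."""


class ArchiveAgent(BaseAgent):
    """Hypothetical agent that records which invoices it was asked to archive."""

    def __init__(self) -> None:
        super().__init__("archive")
        self.archived: List[str] = []

    async def process_message(self, message: Dict[str, Any]) -> Dict[str, Any]:
        self.archived.append(message["path"])
        return {"status": "archived", "path": message["path"]}

    async def run(self, inbox: asyncio.Queue) -> None:
        while True:
            message = await inbox.get()
            if message is None:  # assumed stop sentinel
                break
            await self.process_message(message)


async def demo() -> List[str]:
    inbox: asyncio.Queue = asyncio.Queue()
    agent = ArchiveAgent()
    await inbox.put({"path": "invoices/a.pdf"})
    await inbox.put(None)
    await agent.run(inbox)
    return agent.archived
```

Keeping per-message logic in `process_message()` and the loop in `run()` lets the handler be unit-tested without a running queue.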
### Testing
Run the test suite:
```bash
pytest tests/
```
## Troubleshooting
Common issues and solutions:
1. OCR Quality Issues:
   - Check image resolution
   - Verify Tesseract installation
   - Adjust pre-processing parameters
2. Database Connection:
   - Verify PostgreSQL service
   - Check `DATABASE_URL` format
   - Ensure database exists
3. Email Notifications:
   - Check SMTP settings
   - Verify app password
   - Check recipient format
4. LLM Processing:
   - Ensure Ollama is running
   - Check model availability
   - Verify memory requirements
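
For the `DATABASE_URL` format check, a quick sanity test can be written with `urllib.parse`. This is a diagnostic sketch with a hypothetical function name; it inspects only the URL's shape and does not connect to the database.

```python
from typing import List
from urllib.parse import urlsplit


def database_url_problems(url: str) -> List[str]:
    """Return a list of format problems; an empty list means the URL looks well formed."""
    parts = urlsplit(url)
    problems: List[str] = []
    # Accept driver suffixes such as postgresql+psycopg2 as well.
    if parts.scheme.split("+")[0] not in ("postgresql", "postgres"):
        problems.append("scheme should be postgresql://")
    if not parts.username:
        problems.append("missing user")
    if not parts.hostname:
        problems.append("missing host")
    if not parts.path.strip("/"):
        problems.append("missing database name")
    try:
        parts.port  # raises ValueError when the port is not a valid number
    except ValueError:
        problems.append("port is not a valid number")
    return problems
```

The example value `postgresql://user:password@localhost:5432/invoice_db` passes all checks, while a URL without a database name or user is reported as such.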
## Contributing
1. Fork the repository
2. Create a feature branch
3. Implement changes with tests
4. Submit pull request
Follow our coding standards:
- Black for formatting
- Flake8 for linting
- Type hints required
- Docstrings for functions
## License
This project is licensed under the Apache License - see the [LICENSE](LICENSE) file for details.
## Support
For issues and feature requests:
- Create GitHub issue
- Provide detailed description
- Include relevant logs
- Specify environment details