Debugging a Silent Data Pipeline: How I Fixed My AI's Proactive Intelligence System
January 8, 2026
A case study in tracing data flow through autonomous systems.
The Problem: A System That Never Acted
I built a Proactive Intelligence system for my AI agent - designed to anticipate needs before being asked. Pattern learning, opportunity detection, intelligent surfacing. The architecture was sound. The code compiled. The tests passed.
But in production? 0% action rate.
The dashboard showed opportunities detected: 0. Suggestions delivered: 0. The system sat dormant, waiting for data that never arrived.
The Architecture
Before diving into debugging, here's what the system was supposed to do:
```
┌────────────────────────────────────────────────────────────────────┐
│                   PROACTIVE INTELLIGENCE SYSTEM                    │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│  ┌───────────────┐   ┌─────────────────────┐   ┌────────────────┐  │
│  │ Email Triage  │──▶│ email_classific-    │──▶│ Opportunity    │  │
│  │ (Manual)      │   │ ations (Postgres)   │   │ Detector       │  │
│  └───────────────┘   └─────────────────────┘   └────────────────┘  │
│                                                        │           │
│  ┌───────────────┐   ┌─────────────────────┐           │           │
│  │ Calendar Sync │──▶│ calendar_events     │───────────┤           │
│  │               │   │ (Postgres)          │           │           │
│  └───────────────┘   └─────────────────────┘           │           │
│                                                        │           │
│  ┌───────────────┐   ┌─────────────────────┐           │           │
│  │ Pattern       │──▶│ learned_patterns    │───────────┤           │
│  │ Learner       │   │ (Postgres)          │           │           │
│  └───────────────┘   └─────────────────────┘           ▼           │
│                                                ┌────────────────┐  │
│                                                │ Suggestion     │  │
│                                                │ Queue          │  │
│                                                └────────────────┘  │
│                                                        │           │
│                                                        ▼           │
│                                                ┌────────────────┐  │
│                                                │ Proactive      │  │
│                                                │ Surfacer       │  │
│                                                └────────────────┘  │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
```
The OpportunityDetector runs every 30 minutes, checking:
1. High-priority emails that need responses
2. Calendar conflicts or preparation needs
3. Patterns that suggest upcoming work
When it finds something, it creates a Suggestion - a proactive recommendation surfaced before I ask for it.
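To make the moving pieces concrete, here's a minimal sketch of the shapes involved. The dataclass fields mirror the Opportunity and Suggestion objects shown later in this post; the `db` handle, the query, and the relevance values are illustrative assumptions, not the production code.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Opportunity:
    type: str                 # e.g. 'email_response'
    relevance: float          # 0.0-1.0, how strongly this warrants action
    context: dict[str, Any] = field(default_factory=dict)

@dataclass
class Suggestion:
    content: str              # human-readable recommendation
    relevance: float
    source: str               # which detector produced it, e.g. 'email_opportunity'

def detect_email_opportunities(db) -> list[Opportunity]:
    """Sketch of one detector check: recent high-priority emails needing action."""
    # `db` stands in for whatever database handle the system uses; treat the
    # query as pseudocode for "unhandled, recently classified, needs action".
    rows = db.execute(
        """
        SELECT subject, priority FROM email_classifications
        WHERE requires_action = TRUE
          AND classified_at > NOW() - INTERVAL '24 hours'
        """
    ).fetchall()
    return [
        Opportunity(
            type="email_response",
            relevance=0.9 if row["priority"] == "CRITICAL" else 0.7,
            context={"subject": row["subject"]},
        )
        for row in rows
    ]
```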
The Investigation: Follow the Data
The first rule of debugging data pipelines: trace the data flow from source to sink.
Step 1: Check the Database
```sql
SELECT COUNT(*) FROM email_classifications
WHERE classified_at > NOW() - INTERVAL '24 hours';
```
Result: 0 records
The email_classifications table was stale. Last entries from two days ago. No wonder the OpportunityDetector found nothing - there was nothing to find.
Step 2: Trace the Writer
Who writes to email_classifications? I searched the codebase:
```python
# aegis/email/processor.py
def save_classification(self, triaged: TriagedEmail) -> bool:
    """Save a triaged email classification to the database."""
    db.execute(
        """
        INSERT INTO email_classifications
            (email_id, sender, subject, priority, category,
             requires_action, classified_at, metadata)
        VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s)
        ON CONFLICT (email_id) DO UPDATE SET ...
        """, ...
    )
```
The method existed. It worked when called directly. But who was calling it?
Step 3: Find the Gap
The email triage workflow:
Gmail MCP API → EmailProcessor.triage() → Return results → ???
The ??? was supposed to be save_classification(). But the email-triage command never called it. The data was processed, classified, and then... discarded.
The workflow terminated before persisting results.
The Root Cause
Here's what the email-triage command was doing:
## Steps
1. Query Gmail for unread emails (max 20)
2. For each email, run rule-based classification
3. For HIGH/CRITICAL emails, use AI for detailed triage
4. Post summary to Discord #alerts if any CRITICAL items found
5. Return structured triage results
Notice what's missing? No step to save classifications to the database.
The command triaged emails perfectly. It posted alerts correctly. But the downstream Proactive Intelligence system was starving for data because nothing persisted the results.
The Fix
Simple once identified:
## Steps
1. Query Gmail for unread emails (max 20)
2. For each email, run rule-based classification
3. For HIGH/CRITICAL emails, use AI for detailed triage
4. **IMPORTANT**: Save classifications to database for Proactive Intelligence:
```python
from aegis.email.processor import EmailProcessor
processor = EmailProcessor(user_id="aegis@richardbankole.com")
processor.save_classifications(triaged_list)
```
5. Post summary to Discord #alerts if any CRITICAL items found
6. Return structured triage results
The key addition: explicit persistence step with a comment explaining why.
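If the plural save_classifications call looks unfamiliar next to the single-record save_classification shown earlier: a thin batch wrapper is all it takes. This is a sketch of the idea, not necessarily the exact helper in Aegis:

```python
# aegis/email/processor.py -- sketch of a batch wrapper (assumed, not verbatim)
def save_classifications(self, triaged_list: list[TriagedEmail]) -> int:
    """Persist a batch of triaged emails; returns the number of rows written."""
    saved = 0
    for triaged in triaged_list:
        if self.save_classification(triaged):  # single-record writer shown earlier
            saved += 1
    return saved
```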
Verification: End-to-End Test
After fixing:
- Ran email triage → 3 emails classified and saved
- Added a test high-priority email to the database
- Triggered OpportunityDetector manually
- Result: Suggestion created with 0.9 relevance score
```python
# The system found the opportunity
Opportunity(
    type='email_response',
    relevance=0.9,
    context={'subject': 'URGENT: Production deployment review needed'}
)

# And created an actionable suggestion
Suggestion(
    content='Review and respond to urgent deployment email',
    relevance=0.9,
    source='email_opportunity'
)
```
The pipeline was alive.
Lessons Learned
1. Explicit Data Contracts
Every component in a data pipeline should have explicit contracts:
- What data does it consume?
- What data does it produce?
- Where does each persist?
Don't assume downstream systems will handle persistence. Make it explicit in the workflow.
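One lightweight way to do that is to declare the contract as data next to the workflow, so a missing persistence target is visible at review time. A sketch, with made-up names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StageContract:
    """Explicit contract for one pipeline stage: what it reads, emits, and must persist."""
    consumes: tuple[str, ...]
    produces: tuple[str, ...]
    persists_to: tuple[str, ...]

EMAIL_TRIAGE_CONTRACT = StageContract(
    consumes=("gmail:unread",),
    produces=("TriagedEmail",),
    persists_to=("email_classifications",),  # the step that was silently missing
)
```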
2. Monitoring the Gaps
I had monitoring on the components:
- Gmail API calls tracked
- Triage success rate measured
- OpportunityDetector runs logged
But no monitoring on the data handoffs:
- No alert when email_classifications goes stale
- No metric for records written vs emails processed
Monitor the seams, not just the services.
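A staleness check on the handoff table is cheap to add. Something like the following, where the 6-hour window, the `db` handle, and the `alert` callback are placeholders:

```python
# Hypothetical seam monitor: alert when the handoff table stops receiving rows,
# even while every service around it reports success.
STALENESS_SQL = """
    SELECT COUNT(*) AS recent
    FROM email_classifications
    WHERE classified_at > NOW() - INTERVAL '6 hours'
"""

def check_classification_freshness(db, alert) -> None:
    recent = db.execute(STALENESS_SQL).fetchone()["recent"]
    if recent == 0:
        alert("email_classifications is stale: no rows written in the last 6 hours")
```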
3. Test with Real Data Flow
Unit tests passed because they mocked the database. Integration tests passed because they ran in isolation.
But no test verified: "When email-triage runs in production, does data reach the OpportunityDetector?"
End-to-end tests with real (or realistic) data flow would have caught this immediately.
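Here's roughly what that test looks like. The fixtures (db, email_processor, opportunity_detector) are assumed; the point is that the real triage path runs and the assertion crosses the seam:

```python
# Sketch of the end-to-end check that would have caught this failure.
def test_triage_feeds_opportunity_detector(db, email_processor, opportunity_detector):
    before = db.execute("SELECT COUNT(*) FROM email_classifications").fetchone()[0]

    email_processor.triage()  # run the real triage path, not a mock

    after = db.execute("SELECT COUNT(*) FROM email_classifications").fetchone()[0]
    assert after > before, "triage ran but persisted nothing - the silent failure"

    opportunities = opportunity_detector.run()
    assert any(o.type == "email_response" for o in opportunities)
```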
4. Comments as Documentation
The fix included:
4. **IMPORTANT**: Save classifications to database for Proactive Intelligence:
That comment isn't for the code - it's for future me. It explains why this step exists and what breaks if removed.
Broader Implications
This pattern appears constantly in autonomous systems:
- Feature extraction that doesn't persist
- Analysis pipelines that log but don't store
- AI inference results discarded after display
The solution is always the same:
1. Map the full data flow on paper
2. Identify every persistence point
3. Verify each handoff produces expected records
4. Add monitoring for data staleness
Conclusion
The Proactive Intelligence system wasn't broken. It was starving. A single missing line - one call to save_classification() - severed the entire pipeline.
Silent failures in data pipelines are insidious because everything looks healthy. The Gmail API works. Triage succeeds. Dashboard renders. But downstream, systems sit idle, waiting for data that never arrives.
Always trace the data. Always verify the handoffs. Always check the database.
This post documents debugging work on Project Aegis, an autonomous AI agent system. The Proactive Intelligence module anticipates user needs by analyzing email patterns, calendar events, and historical behavior.