MattGPT

Two-Track Architecture Documentation

🚀 MattGPT Two-Track Architecture Overview

🧠 Local AI Foundation
Powered by IBM's Granite 3.2:8b model running locally via Ollama, providing complete privacy and no cloud dependencies for all AI operations.
🏗️ Separation of Concerns
Two-Track Architecture separates complex automated workflows from conversational interactions, preventing interference and enabling specialized optimization.
📊 Data Processing Pipeline
Multi-source content curation from RSS feeds, web scraping, and PDF processing with AI-powered relevance scoring and hot topics detection.
🎙️ Content Generation
Edge-TTS voice synthesis creates daily podcasts with dual speaker personalities, intelligent script generation, and audio assembly.
👥 Customer Intelligence
CSV-based customer database with cluster organization, reference materials caching, and intelligent research insights generation.
📈 Analytics & Visualization
D3.js-powered word clouds, trend analysis, keyword frequency tables, and comprehensive weekly review generation.

🎯 Core System Capabilities

Local-First Architecture
The AI model and all data processing run locally with no cloud AI dependency, ensuring privacy and control over the data.
Modular Component Design
Independent modules for content curation, current events analysis, chat assistance, and customer management
Intelligent Progress Tracking
Real-time progress monitoring across all operations with detailed phase information and completion status
Flexible Configuration
JSON-based configuration for RSS sources, search providers, interests, and analysis parameters
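
As a rough illustration of the JSON-based configuration described above, the sketch below parses a small config and checks that the expected sections are present. The key names (rss_sources, search_providers, interests, analysis) and every value are assumptions made for this example; the actual MattGPT schema is not documented here.

```python
import json

# Hypothetical configuration; every key and value here is illustrative only.
EXAMPLE_CONFIG = """
{
  "rss_sources": [
    {"name": "Example Feed", "url": "https://example.com/feed.xml"}
  ],
  "search_providers": ["example-provider"],
  "interests": ["local ai", "hybrid cloud"],
  "analysis": {"days_back": 7, "min_relevance": 0.5}
}
"""

REQUIRED_KEYS = ("rss_sources", "search_providers", "interests", "analysis")


def load_config(text: str) -> dict:
    """Parse a JSON config string and check that the expected sections exist."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing config sections: {', '.join(missing)}")
    return config


config = load_config(EXAMPLE_CONFIG)
```

Failing fast on a missing section keeps a misconfigured source list from surfacing later as an empty content refresh.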

🔄 Two-Track System Architecture

🎯 Architecture Philosophy

The Two-Track Architecture implements complete separation of concerns between complex automated workflows and conversational interactions, preventing interference while enabling specialized optimization for each use case.

🔗 Track 1: Chain Operations
Purpose: Complex automated workflows
Trigger: Button-driven operations
Examples: "What's Going On?", Week in Review, Daily Podcast
Features: Progress tracking, multi-step orchestration, result modals
💬 Track 2: Enhanced Chat
Purpose: Conversational interactions
Trigger: Text input and messaging
Examples: Article discussions, URL analysis, customer queries
Features: Session management, context awareness, streaming responses

🔄 Track Interaction Patterns

User Interface Layer
A single UI routes each interaction to the appropriate track based on input type
Button clicks → Chain Operations / Text input → Enhanced Chat
Request Classification
Intelligent routing determines which track should handle the request
Intent detection, session context, and operation type analysis
Track 1: Chain Operations Manager
Handles complex workflows with direct module orchestration
Direct API calls to content curator, current events analyzer, podcast generator
Track 2: Enhanced Chat Manager
Manages conversational interactions with session state
Article context, customer database queries, URL analysis, PDF processing
Response Formatting
Track-specific response formatting and delivery to frontend
Chain: Progress tracking + Modals / Chat: Streaming responses + Context
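
The first routing rule above (button clicks to Chain Operations, text input to Enhanced Chat) could be sketched as follows. The UIEvent type and its kind values are hypothetical names introduced for illustration, and the sketch leaves out the intent detection and session-context analysis that the Request Classification step also mentions.

```python
from dataclasses import dataclass
from enum import Enum


class Track(Enum):
    CHAIN = "chain_operations"
    CHAT = "enhanced_chat"


@dataclass
class UIEvent:
    kind: str     # "button" or "text" (hypothetical event kinds)
    payload: str  # button identifier or message text


def route(event: UIEvent) -> Track:
    """Send button clicks to Chain Operations and text input to Enhanced Chat."""
    if event.kind == "button":
        return Track.CHAIN
    if event.kind == "text":
        return Track.CHAT
    raise ValueError(f"unknown event kind: {event.kind!r}")


track = route(UIEvent(kind="button", payload="daily-podcast"))
```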

🎯 Track Isolation Benefits

⚡ Performance Optimization
Each track is optimized for its specific use case without interference from the other track's operations or state management.
🔄 Independent Scaling
Chain operations can handle long-running workflows while chat maintains responsiveness for immediate interactions.
🛡️ Error Isolation
Failures in one track don't affect the other, ensuring system stability and graceful degradation.
🔧 Simplified Debugging
Clear separation makes it easier to troubleshoot issues and implement new features without cross-track interference.

🔗 Chain Operations Track

🎯 Chain Operations Manager

The Chain Operations Manager orchestrates complex workflows directly, with no dependency on the research assistant. It manages multi-step operations, tracks their progress, and delivers the results.

🔍 "What's Going On?" Workflow
Steps: Current events analysis → Hot topics detection → Article collection → Relevance scoring
Output: Ranked articles with boost scores and trending topics modal
Features: Force refresh option, existing data detection
📊 Week in Review Generation
Steps: File discovery → Data parsing → LLM consolidation → Visualization
Output: D3.js word clouds, keyword frequency tables, comprehensive summaries
Features: 7-day analysis, caching, print-friendly format
🎙️ Daily Podcast Creation
Steps: Article selection → Script generation → Voice synthesis → Audio assembly
Output: MP3 podcast with dual speaker dialogue (Alex & Sam)
Features: Edge-TTS integration, fallback options, metadata generation
🔄 Force Content Refresh
Steps: Source validation → Content collection → AI analysis → Database update
Output: Fresh content with updated relevance scores
Features: Configurable timeframe, selective refresh options
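
The "ranked articles with boost scores" output of the "What's Going On?" workflow can be pictured with a small scoring sketch. The additive formula, the topic weights, and the article fields below are assumptions for illustration; the source does not state how the boost is actually computed.

```python
def boosted_score(relevance: float, title: str, hot_topics: dict[str, float]) -> float:
    """Add the weight of every hot topic found in the title to the base relevance."""
    lowered = title.lower()
    boost = sum(weight for topic, weight in hot_topics.items() if topic.lower() in lowered)
    return relevance + boost


def rank_articles(articles: list[dict], hot_topics: dict[str, float]) -> list[dict]:
    """Return articles sorted by boosted score, highest first."""
    return sorted(
        articles,
        key=lambda a: boosted_score(a["relevance"], a["title"], hot_topics),
        reverse=True,
    )


# Illustrative data only.
articles = [
    {"title": "Quarterly earnings recap", "relevance": 0.70},
    {"title": "New open-source LLM released", "relevance": 0.60},
]
hot_topics = {"llm": 0.25}
ranked = rank_articles(articles, hot_topics)
```

In this example the second article overtakes the first because its title matches a trending topic, which is the effect a hot topics boost is meant to have.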

🔗 Chain Operation Flow

🚀 Operation Initialization
Create session, check existing data, initialize progress tracking
🔍 Current Events Analysis
RSS feed collection, headline analysis, hot topics detection
📰 Content Collection
Multi-source article gathering, web scraping, content extraction
🤖 AI Processing
Granite 3.2 analysis, summarization, relevance scoring
🔥 Hot Topics Boost
Apply trending topic scores to enhance article rankings
📊 Results Compilation
Generate modal content, prepare downloads, update progress
Delivery Complete
Present results in interactive modal with filtering and chat options
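
A minimal sketch of how a flow like the one above might run its steps in order while reporting progress. The run_chain function, the shared state dictionary, and the percent-based callback are hypothetical; the step bodies are placeholders rather than the real modules.

```python
from typing import Callable

# A step is a display name plus a function that updates the shared state.
Step = tuple[str, Callable[[dict], None]]


def run_chain(steps: list[Step], on_progress: Callable[[str, int], None]) -> dict:
    """Run steps in order, sharing one state dict and reporting percent complete."""
    state: dict = {}
    for index, (name, func) in enumerate(steps, start=1):
        func(state)
        on_progress(name, round(100 * index / len(steps)))
    return state


# Placeholder steps named after three stages of the flow.
steps = [
    ("Operation Initialization", lambda s: s.update(session_id="demo")),
    ("Content Collection", lambda s: s.update(articles=["a", "b"])),
    ("Results Compilation", lambda s: s.update(count=len(s["articles"]))),
]
events: list[tuple[str, int]] = []
result = run_chain(steps, lambda name, pct: events.append((name, pct)))
```

Passing progress through a callback keeps the orchestration loop independent of how progress is stored or served to the frontend.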

🔧 API Endpoints

POST /api/chain/whats-going-on
Start "What's Going On?" analysis with optional force refresh
POST /api/chain/week-review
Generate comprehensive week in review with D3.js visualizations
POST /api/chain/podcast
Create daily podcast with Edge-TTS voice synthesis
GET /api/chain/progress/{session_id}
Get real-time progress updates for chain operations
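
A client might poll the progress endpoint as sketched below. The base URL, the response fields (status, percent), and the "complete" value are assumptions; only the endpoint path comes from the list above.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed host and port


def fetch_progress(session_id: str) -> dict:
    """GET the progress endpoint and decode the JSON body."""
    url = f"{BASE_URL}/api/chain/progress/{session_id}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def wait_for_completion(session_id, fetch=fetch_progress, interval=1.0, max_polls=60):
    """Poll until the assumed 'status' field reads 'complete' or polls run out."""
    for _ in range(max_polls):
        progress = fetch(session_id)
        if progress.get("status") == "complete":
            return progress
        time.sleep(interval)
    raise TimeoutError(f"session {session_id} did not finish")
```

The fetch parameter is injectable so the polling loop can be exercised without a running server.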

💬 Enhanced Chat Track

🧠 Enhanced Chat Manager

The Enhanced Chat Manager provides stateful conversational interactions with advanced context awareness, session management, and specialized capabilities for article discussion and customer database queries.

💬 Session Management
Features: Persistent session state, conversation history, context preservation
Database: SQLite session storage with activity tracking
Lifecycle: Auto-cleanup, timeout handling, context switching
📰 Article Discussions
Features: Article context injection, relevance-based selection, intelligent responses
Capabilities: Summarization, key points extraction, follow-up questions
Integration: Direct database article access, metadata utilization
🌐 URL Analysis
Features: Web scraping, PDF processing, content extraction
Supported: Web pages, PDF documents, academic papers
Processing: Multi-library PDF support, graceful fallbacks
👥 Customer Intelligence
Features: Customer database queries, cluster analysis, research insights
Data Source: CSV-based customer management with cache integration
Capabilities: Fuzzy matching, multi-step workflows, battle card generation
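
Fuzzy matching against a CSV-based customer list could look roughly like this, using the difflib module from Python's standard library. The column names, sample rows, and 0.6 cutoff are invented for the example; the source does not say which matching technique MattGPT uses.

```python
import csv
import difflib
import io

# Hypothetical CSV layout; the real column names are not given in the source.
CUSTOMERS_CSV = """name,cluster
Acme Corporation,Manufacturing
Globex Industries,Energy
Initech,Software
"""


def load_customers(text: str) -> list[dict]:
    """Read customer rows from CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def find_customer(query: str, customers: list[dict], cutoff: float = 0.6):
    """Return the customer whose name is the closest fuzzy match, or None."""
    by_lower_name = {c["name"].lower(): c for c in customers}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lower_name), n=1, cutoff=cutoff
    )
    return by_lower_name[matches[0]] if matches else None


customers = load_customers(CUSTOMERS_CSV)
match = find_customer("Acme Corp", customers)
```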

🎯 Chat Intent Detection

Message Analysis
Parse user input for intent patterns and context clues
Database search patterns, article references, customer keywords
Intent Classification
Determine appropriate response strategy based on detected intent
Article discussion, database search, URL analysis, customer management
Context Enrichment
Gather relevant context data for enhanced responses
Article content, customer data, URL processing, session history
Response Generation
Generate contextually aware response using Granite 3.2
Streaming responses, follow-up suggestions, action recommendations
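
The message analysis and intent classification steps above can be approximated with a simple pattern-based classifier. The regular expressions, their priority order, and the general_chat fallback are assumptions; the source lists the intent categories but not the detection rules.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
INTENT_PATTERNS = [
    ("url_analysis", re.compile(r"https?://\S+")),
    ("customer_management", re.compile(r"\b(customer|account|battle card)\b", re.I)),
    ("database_search", re.compile(r"\b(search|find|look up)\b", re.I)),
    ("article_discussion", re.compile(r"\b(article|summar\w*|key points)\b", re.I)),
]


def classify_intent(message: str) -> str:
    """Return the first intent whose pattern matches, else 'general_chat'."""
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(message):
            return intent
    return "general_chat"


intent = classify_intent("Please analyze https://example.com/paper.pdf")
```

A rule list like this is easy to audit, though a production classifier might also consult session context, as the Context Enrichment step suggests.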

🔧 Enhanced Chat Features

🏗️ Technical Implementation Details

🔧 System Components

🧠 AI Engine
IBM Granite 3.2:8b model via Ollama providing local AI capabilities without cloud dependencies.
  • generate_summary()
  • analyze_relevance()
  • extract_keywords()
  • chat_completion()
📊 Content Curator
Multi-source content collection and curation with RSS feeds, web scraping, and AI analysis.
  • collect_rss_feeds()
  • scrape_article_content()
  • analyze_trending_topics()
  • calculate_relevance_scores()
🎙️ Podcast Generator
AI-powered podcast creation with Edge-TTS voice synthesis and intelligent script generation.
  • generate_dialogue_script()
  • synthesize_voice_audio()
  • assemble_podcast_segments()
  • create_metadata()
📈 Analytics Engine
D3.js-powered visualizations with keyword analysis and trend detection capabilities.
  • generate_word_clouds()
  • analyze_keyword_frequency()
  • create_trend_visualizations()
  • export_analysis_reports()
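
To make the AI Engine's local-model calls concrete, here is a sketch of how a function like generate_summary() might talk to Ollama's generate endpoint on its default local port. The prompt wording, the timeout, the function signature, and the exact model tag string are assumptions; the real implementation is not shown in the source.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "granite3.2:8b"  # assumed Ollama tag for the Granite 3.2 8b model


def build_summary_request(article_text: str) -> dict:
    """Build a non-streaming request body for Ollama's generate endpoint."""
    return {
        "model": MODEL,
        "prompt": f"Summarize the following article in three sentences:\n\n{article_text}",
        "stream": False,
    }


def generate_summary(article_text: str) -> str:
    """Send the request to a locally running Ollama server and return its text."""
    body = json.dumps(build_summary_request(article_text)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)["response"]
```

Calling generate_summary() requires an Ollama server running locally with the model already pulled; building the payload in a separate function lets it be inspected without a server.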

🔄 Data Flow Architecture

Data Ingestion
RSS feeds, web scraping, PDF processing, and user inputs
Content Processing
AI analysis, summarization, keyword extraction, and relevance scoring
Data Storage
SQLite database with session management and cache optimization
Visualization & Output
D3.js visualizations, podcast generation, and interactive reports
User Interface
Responsive web interface with real-time updates and progress tracking
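
The Data Storage stage mentions SQLite with session management. A minimal sketch of session activity tracking and timeout cleanup, using Python's built-in sqlite3 module, might look like this. The table name, columns, and helper functions are hypothetical.

```python
import sqlite3
import time
from typing import Optional


def open_session_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) sessions table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        "session_id TEXT PRIMARY KEY, last_active REAL NOT NULL)"
    )
    return conn


def touch_session(conn, session_id: str, now: Optional[float] = None) -> None:
    """Insert the session or overwrite its last-activity timestamp."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO sessions (session_id, last_active) VALUES (?, ?)",
        (session_id, now),
    )
    conn.commit()


def cleanup_sessions(conn, timeout_seconds: float, now: Optional[float] = None) -> int:
    """Delete sessions idle longer than the timeout; return how many were removed."""
    now = time.time() if now is None else now
    cursor = conn.execute(
        "DELETE FROM sessions WHERE last_active < ?", (now - timeout_seconds,)
    )
    conn.commit()
    return cursor.rowcount
```

The optional now argument makes the timeout logic deterministic to test; in normal use it defaults to the current time.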

🔌 API Architecture

GET /api/health
System health check and component status
POST /api/chat/message
Send message to enhanced chat system
GET /api/articles/search
Search articles database with filters
POST /api/analysis/url
Analyze URL content with AI processing
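
As a small usage illustration for the article search endpoint, the sketch below builds a query URL. The base URL and the parameter names (q, days) are assumptions; the source only says that the endpoint accepts filters.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumed host and port


def article_search_url(query: str, **filters: str) -> str:
    """Build a search URL with URL-encoded query parameters."""
    params = {"q": query, **filters}
    return f"{BASE_URL}/api/articles/search?{urlencode(params)}"


url = article_search_url("granite models", days="7")
```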