# TLC Search + Feed Master Integration

This directory contains an integrated setup combining:

- **TLC Search**: Flask app for searching YouTube transcripts (Elasticsearch/Qdrant)
- **Feed Master**: RSS aggregator for YouTube channels
- **RSS Bridge**: Converts YouTube channels to RSS feeds

All services share the same source of truth for YouTube channels from `channels.yml` and the adjacent `urls.txt` in this repository.

## Architecture

```
┌─────────────────────┐
│   channels.yml      │  Source of truth (this repo)
│  (python_app repo)  │
└──────────┬──────────┘
           │
           ├─────────────────────────────┬────────────────────────┐
           │                             │                        │
           v                             v                        v
   ┌──────────────┐              ┌──────────────┐        ┌─────────────────┐
   │  TLC Search  │              │  RSS Bridge  │        │  Feed Master    │
   │ (Flask App)  │              │ (Port 3001)  │───────>│  (Port 8097)    │
   │  Port 8080   │              └──────────────┘        └─────────────────┘
   │              │                                               │
   │ Elasticsearch│                                               │
   │ Qdrant       │                                               │
   └──────────────┘                                               │
                                                                  v
                                      http://localhost:8097/rss/youtube-unified
```

## Services

### 1. TLC Search (Port 8080)

- Indexes and searches YouTube transcripts
- Uses Elasticsearch for metadata and Qdrant for vector search
- Connects to remote Elasticsearch/Qdrant instances

### 2. RSS Bridge (Port 3001)

- Converts YouTube channels to RSS feeds
- Supports both channel IDs and @handles
- Used by Feed Master to aggregate feeds

### 3. Feed Master (Port 8097)

- Aggregates all YouTube channel RSS feeds into one unified feed
- Updates every 5 minutes
- Keeps the most recent 200 items from all channels

## Setup

### Prerequisites

- Docker and Docker Compose
- Python 3.x

### Configuration

1. **Environment Variables**: Create `.env` file with:

   ```bash
   # Elasticsearch
   ELASTIC_URL=https://your-elasticsearch-url
   ELASTIC_INDEX=this_little_corner_py
   ELASTIC_USERNAME=your_username
   ELASTIC_PASSWORD=your_password

   # Qdrant
   QDRANT_URL=https://your-qdrant-url
   QDRANT_COLLECTION=tlc-captions-full

   # Optional UI links
   RSS_FEED_URL=/rss/youtube-unified
   CHANNELS_PATH=/app/python_app/channels.yml
   RSS_FEED_UPSTREAM=http://feed-master:8080
   ```

2. **Generate Feed Configuration**:

   ```bash
   # Regenerate feed-master config from the channels list
   python3 -m python_app.generate_feed_config_simple
   ```

   This reads `channels.yml` and generates `feed-master-config/fm.yml`.

### Starting Services

```bash
# Start all services
docker compose up -d

# View logs
docker compose logs -f

# View specific service logs
docker compose logs -f feed-master
docker compose logs -f rss-bridge
docker compose logs -f app
```

### Stopping Services

```bash
# Stop all services
docker compose down

# Stop specific service
docker compose stop feed-master
```

## Usage

### Unified RSS Feed

Access the aggregated feed through the TLC app (recommended):

- **URL**: http://localhost:8080/rss
- **Format**: RSS/Atom XML
- **Behavior**: Filters RSS-Bridge error items and prefixes titles with channel name
- **Updates**: Every 5 minutes (feed-master schedule)
- **Items**: Most recent 200 items across all channels

Direct feed-master access still works:

- **URL**: http://localhost:8097/rss/youtube-unified

### TLC Search

Access the search interface at:

- **URL**: http://localhost:8080

### Channel List Endpoints

- **Plain text list**: http://localhost:8080/channels.txt
- **JSON metadata**: http://localhost:8080/api/channel-list

### RSS Bridge

Access individual channel feeds or the web interface at:

- **URL**: http://localhost:3001

## Updating Channel List

When channels are added/removed from `channels.yml`:

```bash
# 1. Regenerate feed configuration
cd /var/core/this-little-corner/src/python_app
python3 -m python_app.generate_feed_config_simple

# 2. Restart feed-master to pick up changes
docker compose restart feed-master
```

## File Structure

```
python_app/
├── docker-compose.yml              # All services configuration
├── channels.yml                    # Canonical YouTube channel list
├── urls.txt                        # URL list kept in sync with channels.yml
├── generate_feed_config_simple.py  # Config generator script (run via python -m)
├── feed-master-config/
│   ├── fm.yml                      # Feed Master configuration (auto-generated)
│   ├── var/                        # Feed Master database
│   └── images/                     # Cached images
├── data/                           # TLC Search data (read-only)
└── README-FEED-MASTER.md           # This file
```

## Troubleshooting

### Feed Master not updating

```bash
# Check if RSS Bridge is accessible
curl http://localhost:3001

# Restart both services in order
docker compose restart rss-bridge
sleep 10
docker compose restart feed-master
```

### Configuration issues

```bash
# Regenerate configuration
python -m python_app.generate_feed_config_simple

# Validate the YAML
cat feed-master-config/fm.yml

# Restart feed-master
docker compose restart feed-master
```

### View feed-master logs

```bash
docker compose logs -f feed-master | grep -E "(ERROR|WARN|youtube)"
```

## Integration Notes

- **Single Source of Truth**: All channel URLs come from `channels.yml` and `urls.txt` in this repo
- **Automatic Regeneration**: Run `python3 -m python_app.generate_feed_config_simple` when `channels.yml` changes
- **No Manual Editing**: Don't edit `fm.yml` directly - regenerate it from the script
- **Handle Support**: Supports both `/channel/ID` and `/@handle` URL formats
- **Shared Channels**: Same channels used for transcript indexing (TLC Search) and RSS aggregation (Feed Master)
- **Skip Broken RSS**: Set `rss: false` in `channels.yml` to exclude a channel from RSS aggregation

## Future Enhancements

- [ ] Automated config regeneration on git pull
- [ ] Channel name lookup from YouTube API
- [ ] Integration with TLC Search for unified UI
- [ ] Webhook notifications for new videos
- [ ] OPML export for other RSS readers
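
As an illustration of the handle support listed under Integration Notes, the following is a minimal sketch of how the two URL formats (`/channel/ID` and `/@handle`) could be told apart. The function name `classify_channel_url` and the example URLs are hypothetical, and this is not the code used by `generate_feed_config_simple`, whose actual parsing may differ.

```python
from urllib.parse import urlparse


def classify_channel_url(url: str) -> tuple[str, str]:
    """Classify a YouTube channel URL (hypothetical helper, not part of this repo).

    Returns ("id", value) for /channel/ID URLs and ("handle", value)
    for /@handle URLs. Raises ValueError for any other path shape.
    """
    # Split the URL path into its non-empty leading segments.
    parts = urlparse(url).path.strip("/").split("/")

    # /channel/<ID>[/...]
    if len(parts) >= 2 and parts[0] == "channel" and parts[1]:
        return ("id", parts[1])

    # /@<handle>[/...]
    if parts[0].startswith("@") and len(parts[0]) > 1:
        return ("handle", parts[0][1:])

    raise ValueError(f"Unrecognized channel URL: {url}")
```

A check like this could run before regenerating `fm.yml`, so that a malformed entry is reported early instead of surfacing later as a broken feed.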