TLC Search + Feed Master Integration
This directory contains an integrated setup combining:
- TLC Search: Flask app for searching YouTube transcripts (Elasticsearch/Qdrant)
- Feed Master: RSS aggregator for YouTube channels
- RSS Bridge: Converts YouTube channels to RSS feeds
All services read their YouTube channel list from a single source of truth: channels.yml and the
adjacent urls.txt in this repository.
Architecture
┌─────────────────────┐
│  channels.yml       │  Source of truth (this repo)
│  (python_app repo)  │
└──────────┬──────────┘
           │
           ├─────────────────────────────┬─────────────────────────┐
           │                             │                         │
           v                             v                         v
    ┌──────────────┐              ┌──────────────┐        ┌─────────────────┐
    │ TLC Search   │              │ RSS Bridge   │        │ Feed Master     │
    │ (Flask App)  │              │ (Port 3001)  │───────>│ (Port 8097)     │
    │ Port 8080    │              └──────────────┘        └─────────────────┘
    │              │                                               │
    │ Elasticsearch│                                               │
    │ Qdrant       │                                               │
    └──────────────┘                                               │
                                                                   v
                                               http://localhost:8097/rss/youtube-unified
Services
1. TLC Search (Port 8080)
- Indexes and searches YouTube transcripts
- Uses Elasticsearch for metadata and Qdrant for vector search
- Connects to remote Elasticsearch/Qdrant instances
2. RSS Bridge (Port 3001)
- Converts YouTube channels to RSS feeds
- Supports both channel IDs and @handles
- Used by Feed Master to aggregate feeds
3. Feed Master (Port 8097)
- Aggregates all YouTube channel RSS feeds into one unified feed
- Updates every 5 minutes
- Keeps the most recent 200 items from all channels
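The retention rule above (one unified feed, newest items first, capped at 200) can be pictured with a minimal sketch. This is not Feed Master's own code; the `FeedItem` type and `merge_feeds` function are hypothetical names used only to illustrate the merge-sort-truncate idea.

```python
from datetime import datetime
from typing import NamedTuple


class FeedItem(NamedTuple):
    """Hypothetical, simplified view of one RSS item."""
    channel: str
    title: str
    published: datetime


def merge_feeds(feeds: dict[str, list[FeedItem]], max_items: int = 200) -> list[FeedItem]:
    """Flatten per-channel item lists, sort newest first, keep max_items."""
    all_items = [item for items in feeds.values() for item in items]
    all_items.sort(key=lambda item: item.published, reverse=True)
    return all_items[:max_items]
```

With three channels of 100 items each, the result holds the 200 newest items regardless of which channel they came from.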
Setup
Prerequisites
- Docker and Docker Compose
- Python 3.x
Configuration
- Environment Variables: Create a .env file with:
# Elasticsearch
ELASTIC_URL=https://your-elasticsearch-url
ELASTIC_INDEX=this_little_corner_py
ELASTIC_USERNAME=your_username
ELASTIC_PASSWORD=your_password
# Qdrant
QDRANT_URL=https://your-qdrant-url
QDRANT_COLLECTION=tlc-captions-full
# Optional UI links
RSS_FEED_URL=/rss/youtube-unified
CHANNELS_PATH=/app/python_app/channels.yml
RSS_FEED_UPSTREAM=http://feed-master:8080
- Generate Feed Configuration:
# Regenerate feed-master config from the channels list
python3 -m python_app.generate_feed_config_simple
This reads channels.yml and generates feed-master-config/fm.yml.
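As a rough illustration of what such a generator might do, the sketch below turns a list of channel entries into a config dictionary. Everything about the shapes here is an assumption: the channel entry keys (`name`, `id`, `handle`, `rss`), the RSS-Bridge base address and query parameters, and the config keys (`feeds`, `sources`, `system`) are hypothetical and may not match the real generate_feed_config_simple script or the fm.yml it writes.

```python
from urllib.parse import urlencode

# Assumed in-network address of the rss-bridge container (hypothetical).
RSS_BRIDGE_BASE = "http://rss-bridge"


def bridge_url(channel: dict) -> str:
    """Build an RSS-Bridge query URL for one channel entry (assumed parameters)."""
    if "id" in channel:
        params = {"context": "By channel id", "c": channel["id"]}
    else:
        params = {"context": "By custom name", "custom": channel["handle"]}
    query = {"action": "display", "bridge": "YoutubeBridge", **params, "format": "Mrss"}
    return f"{RSS_BRIDGE_BASE}/?{urlencode(query)}"


def build_fm_config(channels: list[dict]) -> dict:
    """Return a config dict; channels marked rss: false are left out."""
    sources = [
        {"name": ch["name"], "url": bridge_url(ch)}
        for ch in channels
        if ch.get("rss", True)
    ]
    return {
        "feeds": {"youtube-unified": {"title": "YouTube Unified", "sources": sources}},
        "system": {"update": "5m", "max_total": 200},
    }
```

A real generator would load the entries from channels.yml and serialize the result to feed-master-config/fm.yml with a YAML library; both steps are omitted so the sketch stays dependency-free.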
Starting Services
# Start all services
docker compose up -d
# View logs
docker compose logs -f
# View specific service logs
docker compose logs -f feed-master
docker compose logs -f rss-bridge
docker compose logs -f app
Stopping Services
# Stop all services
docker compose down
# Stop specific service
docker compose stop feed-master
Usage
Unified RSS Feed
Access the aggregated feed through the TLC app (recommended):
- URL: http://localhost:8080/rss
- Format: RSS/Atom XML
- Behavior: Filters RSS-Bridge error items and prefixes titles with channel name
- Updates: Every 5 minutes (feed-master schedule)
- Items: Most recent 200 items across all channels
Direct feed-master access still works:
- URL: http://localhost:8097/rss/youtube-unified
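The filtering and title-prefixing behavior of the /rss endpoint could look roughly like the sketch below. It is not the TLC app's actual code. It assumes that RSS-Bridge error items have titles starting with "Bridge returned error" and that each item carries the channel name in an `<author>` element; both are assumptions, and `clean_feed` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumed marker for RSS-Bridge error items.
ERROR_PREFIX = "Bridge returned error"


def clean_feed(rss_xml: str) -> str:
    """Drop error items and prefix each remaining title with its channel name."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 document: missing <channel>")
    for item in list(channel.findall("item")):
        title = item.find("title")
        text = (title.text or "") if title is not None else ""
        if text.startswith(ERROR_PREFIX):
            channel.remove(item)
            continue
        # Assumption: the channel name is available in <author>.
        author = item.findtext("author")
        if title is not None and author and not text.startswith(f"{author}: "):
            title.text = f"{author}: {text}"
    return ET.tostring(root, encoding="unicode")
```

The prefix check makes the function safe to run twice on the same feed, so already-prefixed titles are not prefixed again.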
TLC Search
Access the search interface at:
- URL: http://localhost:8080
Channel List Endpoints
- Plain text list: http://localhost:8080/channels.txt
- JSON metadata: http://localhost:8080/api/channel-list
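A small client for these two endpoints might look like the following sketch. It assumes the plain-text list contains one channel URL per line and makes no assumption about the JSON structure beyond it being valid JSON; the helper names are hypothetical.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8080"


def parse_channels_txt(text: str) -> list[str]:
    """Return non-empty, non-comment lines (assumes one channel URL per line)."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def fetch_text(path: str, timeout: float = 10.0) -> str:
    """GET a path from the TLC app and decode the body as UTF-8."""
    with urlopen(f"{BASE_URL}{path}", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def main() -> None:
    """Print a short summary of both endpoints (requires the app to be running)."""
    channels = parse_channels_txt(fetch_text("/channels.txt"))
    print(f"{len(channels)} channels in the plain-text list")
    metadata = json.loads(fetch_text("/api/channel-list"))
    print(json.dumps(metadata, indent=2)[:500])
```

Calling `main()` with the stack running prints the channel count and the first part of the JSON metadata.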
RSS Bridge
Access individual channel feeds or the web interface at:
- URL: http://localhost:3001
Updating Channel List
When channels are added/removed from channels.yml:
# 1. Regenerate feed configuration
cd /var/core/this-little-corner/src/python_app
python3 -m python_app.generate_feed_config_simple
# 2. Restart feed-master to pick up changes
docker compose restart feed-master
File Structure
python_app/
├── docker-compose.yml # All services configuration
├── channels.yml # Canonical YouTube channel list
├── urls.txt # URL list kept in sync with channels.yml
├── generate_feed_config_simple.py # Config generator script (run via python -m)
├── feed-master-config/
│ ├── fm.yml # Feed Master configuration (auto-generated)
│ ├── var/ # Feed Master database
│ └── images/ # Cached images
├── data/ # TLC Search data (read-only)
└── README-FEED-MASTER.md # This file
Troubleshooting
Feed Master not updating
# Check if RSS Bridge is accessible
curl http://localhost:3001
# Restart both services in order
docker compose restart rss-bridge
sleep 10
docker compose restart feed-master
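The fixed `sleep 10` between the two restarts is a guess at how long RSS Bridge needs to come up. One alternative, sketched here under the assumption that any HTTP response from http://localhost:3001 means the bridge is ready, is to poll until it answers. `wait_for_http` is a hypothetical helper, not part of this repository.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll url until it returns any HTTP response or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            return True  # the server answered, even if with an error status
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A wrapper script could call `wait_for_http("http://localhost:3001")` after restarting rss-bridge and only restart feed-master once it returns True.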
Configuration issues
# Regenerate configuration
python3 -m python_app.generate_feed_config_simple
# Print the generated YAML for inspection
cat feed-master-config/fm.yml
# Restart feed-master
docker compose restart feed-master
View feed-master logs
docker compose logs -f feed-master | grep -E "(ERROR|WARN|youtube)"
Integration Notes
- Single Source of Truth: All channel URLs come from channels.yml and urls.txt in this repo
- Automatic Regeneration: Run python3 -m python_app.generate_feed_config_simple when channels.yml changes
- No Manual Editing: Don't edit fm.yml directly - regenerate it from the script
- Handle Support: Supports both /channel/ID and /@handle URL formats
- Shared Channels: Same channels used for transcript indexing (TLC Search) and RSS aggregation (Feed Master)
- Skip Broken RSS: Set rss: false in channels.yml to exclude a channel from RSS aggregation
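To make the two supported URL formats concrete, the sketch below classifies a channel URL as either a channel ID or a handle. The regular expressions and the `classify_channel_url` name are illustrative assumptions, not the project's actual parsing code.

```python
import re
from urllib.parse import urlparse

# Assumed path shapes: /channel/<ID> and /@<handle>, optionally followed by a sub-path.
CHANNEL_ID_RE = re.compile(r"^/channel/([\w-]+)(?:/.*)?$")
HANDLE_RE = re.compile(r"^/(@[\w.-]+)(?:/.*)?$")


def classify_channel_url(url: str) -> tuple[str, str]:
    """Return ("id", channel_id) or ("handle", "@name") for a YouTube channel URL."""
    path = urlparse(url).path
    if m := CHANNEL_ID_RE.match(path):
        return ("id", m.group(1))
    if m := HANDLE_RE.match(path):
        return ("handle", m.group(1))
    raise ValueError(f"unrecognized channel URL: {url}")
```

Raising on anything else keeps a mistyped URL from silently producing a broken feed source.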
Future Enhancements
- Automated config regeneration on git pull
- Channel name lookup from YouTube API
- Integration with TLC Search for unified UI
- Webhook notifications for new videos
- OPML export for other RSS readers