From 63fe922860b82f5d9bf91fa042f92a861397b07f Mon Sep 17 00:00:00 2001
From: knight
Date: Thu, 8 Jan 2026 22:46:30 -0500
Subject: [PATCH] Document channel feeds

---
 README-FEED-MASTER.md | 209 ++++++++++++++++++++++++++++++++++++++++++
 urls.txt              |  74 +++++++++++++++
 2 files changed, 283 insertions(+)
 create mode 100644 README-FEED-MASTER.md
 create mode 100644 urls.txt

diff --git a/README-FEED-MASTER.md b/README-FEED-MASTER.md
new file mode 100644
index 0000000..ddb531b
--- /dev/null
+++ b/README-FEED-MASTER.md
@@ -0,0 +1,209 @@
+# TLC Search + Feed Master Integration
+
+This directory contains an integrated setup combining:
+- **TLC Search**: Flask app for searching YouTube transcripts (Elasticsearch/Qdrant)
+- **Feed Master**: RSS aggregator for YouTube channels
+- **RSS Bridge**: Converts YouTube channels to RSS feeds
+
+All services share the same source of truth for YouTube channels from `channels.yml` and the adjacent
+`urls.txt` in this repository.
+
+## Architecture
+
+```
+┌─────────────────────┐
+│ channels.yml        │  Source of truth (this repo)
+│ (python_app repo)   │
+└──────────┬──────────┘
+           │
+           ├─────────────────────────────┬────────────────────────┐
+           │                             │                        │
+           v                             v                        v
+    ┌──────────────┐              ┌──────────────┐        ┌─────────────────┐
+    │ TLC Search   │              │ RSS Bridge   │        │ Feed Master     │
+    │ (Flask App)  │              │ (Port 3001)  │───────>│ (Port 8097)     │
+    │ Port 8080    │              └──────────────┘        └─────────────────┘
+    │              │                                              │
+    │ Elasticsearch│                                              │
+    │ Qdrant       │                                              │
+    └──────────────┘                                              │
+                                                                  v
+                                              http://localhost:8097/rss/youtube-unified
+```
+
+## Services
+
+### 1. TLC Search (Port 8080)
+- Indexes and searches YouTube transcripts
+- Uses Elasticsearch for metadata and Qdrant for vector search
+- Connects to remote Elasticsearch/Qdrant instances
+
+### 2. RSS Bridge (Port 3001)
+- Converts YouTube channels to RSS feeds
+- Supports both channel IDs and @handles
+- Used by Feed Master to aggregate feeds
+
+### 3. Feed Master (Port 8097)
+- Aggregates all YouTube channel RSS feeds into one unified feed
+- Updates every 5 minutes
+- Keeps the most recent 200 items from all channels
+
+## Setup
+
+### Prerequisites
+- Docker and Docker Compose
+- Python 3.x
+
+### Configuration
+
+1. **Environment Variables**: Create a `.env` file with:
+```bash
+# Elasticsearch
+ELASTIC_URL=https://your-elasticsearch-url
+ELASTIC_INDEX=this_little_corner_py
+ELASTIC_USERNAME=your_username
+ELASTIC_PASSWORD=your_password
+
+# Qdrant
+QDRANT_URL=https://your-qdrant-url
+QDRANT_COLLECTION=tlc-captions-full
+
+# Optional UI links
+RSS_FEED_URL=/rss/youtube-unified
+CHANNELS_PATH=/app/python_app/channels.yml
+RSS_FEED_UPSTREAM=http://feed-master:8080
+```
+
+2. **Generate Feed Configuration**:
+```bash
+# Regenerate feed-master config from the channels list
+python3 -m python_app.generate_feed_config_simple
+```
+
+This reads `channels.yml` and generates `feed-master-config/fm.yml`.
+
+### Starting Services
+
+```bash
+# Start all services
+docker compose up -d
+
+# View logs
+docker compose logs -f
+
+# View specific service logs
+docker compose logs -f feed-master
+docker compose logs -f rss-bridge
+docker compose logs -f app
+```
+
+### Stopping Services
+
+```bash
+# Stop all services
+docker compose down
+
+# Stop specific service
+docker compose stop feed-master
+```
+
+## Usage
+
+### Unified RSS Feed
+Access the aggregated feed through the TLC app (recommended):
+- **URL**: http://localhost:8080/rss
+- **Format**: RSS/Atom XML
+- **Behavior**: Filters RSS-Bridge error items and prefixes titles with channel name
+- **Updates**: Every 5 minutes (feed-master schedule)
+- **Items**: Most recent 200 items across all channels
+
+Direct feed-master access still works:
+- **URL**: http://localhost:8097/rss/youtube-unified
+
+### TLC Search
+Access the search interface at:
+- **URL**: http://localhost:8080
+
+### Channel List Endpoints
+- **Plain text list**: http://localhost:8080/channels.txt
+- **JSON metadata**: http://localhost:8080/api/channel-list
+
+### RSS Bridge
+Access individual channel feeds or the web interface at:
+- **URL**: http://localhost:3001
+
+## Updating Channel List
+
+When channels are added/removed from `channels.yml`:
+
+```bash
+# 1. Regenerate feed configuration
+cd /var/core/this-little-corner/src/python_app
+python3 -m python_app.generate_feed_config_simple
+
+# 2. Restart feed-master to pick up changes
+docker compose restart feed-master
+```
+
+## File Structure
+
+```
+python_app/
+├── docker-compose.yml              # All services configuration
+├── channels.yml                    # Canonical YouTube channel list
+├── urls.txt                        # URL list kept in sync with channels.yml
+├── generate_feed_config_simple.py  # Config generator script (run via python -m)
+├── feed-master-config/
+│   ├── fm.yml                      # Feed Master configuration (auto-generated)
+│   ├── var/                        # Feed Master database
+│   └── images/                     # Cached images
+├── data/                           # TLC Search data (read-only)
+└── README-FEED-MASTER.md           # This file
+```
+
+## Troubleshooting
+
+### Feed Master not updating
+```bash
+# Check if RSS Bridge is accessible
+curl http://localhost:3001
+
+# Restart both services in order
+docker compose restart rss-bridge
+sleep 10
+docker compose restart feed-master
+```
+
+### Configuration issues
+```bash
+# Regenerate configuration
+python3 -m python_app.generate_feed_config_simple
+
+# Inspect the generated YAML
+cat feed-master-config/fm.yml
+
+# Restart feed-master
+docker compose restart feed-master
+```
+
+### View feed-master logs
+```bash
+docker compose logs -f feed-master | grep -E "(ERROR|WARN|youtube)"
+```
+
+## Integration Notes
+
+- **Single Source of Truth**: All channel URLs come from `channels.yml` and `urls.txt` in this repo
+- **Manual Regeneration**: Run `python3 -m python_app.generate_feed_config_simple` whenever `channels.yml` changes
+- **No Manual Editing**: Don't edit `fm.yml` directly; regenerate it with the script
+- **Handle Support**: Supports both `/channel/ID` and `/@handle` URL formats
+- **Shared Channels**: Same channels used for transcript indexing (TLC Search) and RSS aggregation (Feed Master)
+- **Skip Broken RSS**: Set `rss: false` in `channels.yml` to exclude a channel from RSS aggregation
+
+## Future Enhancements
+
+- [ ] Automated config regeneration on git pull
+- [ ] Channel name lookup from YouTube API
+- [ ] Integration with TLC Search for unified UI
+- [ ] Webhook notifications for new videos
+- [ ] OPML export for other RSS readers
diff --git a/urls.txt b/urls.txt
new file mode 100644
index 0000000..f580ada
--- /dev/null
+++ b/urls.txt
@@ -0,0 +1,74 @@
+https://www.youtube.com/channel/UCCebR16tXbv5Ykk9_WtCCug/videos
+https://www.youtube.com/channel/UC6vg0HkKKlgsWk-3HfV-vnw/videos
+https://www.youtube.com/channel/UCeWWxwzgLYUbfjWowXhVdYw/videos
+https://www.youtube.com/channel/UC952hDf_C4nYJdqwK7VzTxA/videos
+https://www.youtube.com/channel/UCU5SNBfTo4umhjYz6M0Jsmg/videos
+https://www.youtube.com/channel/UC6Tvr9mBXNaAxLGRA_sUSRA/videos
+https://www.youtube.com/channel/UC4Rmxg7saTfwIpvq3QEzylQ/videos
+https://www.youtube.com/channel/UCTdH4nh6JTcfKUAWvmnPoIQ/videos
+https://www.youtube.com/channel/UCsi_x8c12NW9FR7LL01QXKA/videos
+https://www.youtube.com/channel/UCAqTQ5yLHHH44XWwWXLkvHQ/videos
+https://www.youtube.com/channel/UCprytROeCztMOMe8plyJRMg/videos
+https://www.youtube.com/channel/UCpqDUjTsof-kTNpnyWper_Q/videos
+https://www.youtube.com/channel/UCL_f53ZEJxp8TtlOkHwMV9Q/videos
+https://www.youtube.com/channel/UCez1fzMRGctojfis2lfRYug/videos
+https://www.youtube.com/channel/UC2leFZRD0ZlQDQxpR2Zd8oA/videos
+https://www.youtube.com/channel/UC8SErJkYnDsYGh1HxoZkl-g/videos
+https://www.youtube.com/channel/UCEPOn4cgvrrerg_-q_Ygw1A/videos
+https://www.youtube.com/channel/UC2yCyOMUeem-cYwliC-tLJg/videos
+https://www.youtube.com/channel/UCGsDIP_K6J6VSTqlq-9IPlg/videos
+https://www.youtube.com/channel/UCEzWTLDYmL8soRdQec9Fsjw/videos
+https://www.youtube.com/channel/UC1KgNsMdRoIA_njVmaDdHgA/videos
+https://www.youtube.com/channel/UCFQ6Gptuq-sLflbJ4YY3Umw/videos
+https://www.youtube.com/channel/UCEY1vGNBPsC3dCatZyK3Jkw/videos
+https://www.youtube.com/channel/UCIAtCuzdvgNJvSYILnHtdWA/videos
+https://www.youtube.com/channel/UClIDP7_Kzv_7tDQjTv9EhrA/videos
+https://www.youtube.com/channel/UC-QiBn6GsM3JZJAeAQpaGAA/videos
+https://www.youtube.com/channel/UCiJmdXTb76i8eIPXdJyf8ZQ/videos
+https://www.youtube.com/channel/UCM9Z05vuQhMEwsV03u6DrLA/videos
+https://www.youtube.com/channel/UCgp_r6WlBwDSJrP43Mz07GQ/videos
+https://www.youtube.com/channel/UC5uv-BxzCrN93B_5qbOdRWw/videos
+https://www.youtube.com/channel/UCtCTSf3UwRU14nYWr_xm-dQ/videos
+https://www.youtube.com/channel/UC1a4VtU_SMSfdRiwMJR33YQ/videos
+https://www.youtube.com/channel/UCg7Ed0lecvko58ibuX1XHng/videos
+https://www.youtube.com/channel/UCMVG5eqpYFVEB-a9IqAOuHA/videos
+https://www.youtube.com/channel/UC8mJqpS_EBbMcyuzZDF0TEw/videos
+https://www.youtube.com/channel/UCGHuURJ1XFHzPSeokf6510A/videos
+https://www.youtube.com/@chrishoward8473/videos
+https://www.youtube.com/channel/UChptV-kf8lnncGh7DA2m8Pw/videos
+https://www.youtube.com/channel/UCzX6R3ZLQh5Zma_5AsPcqPA/videos
+https://www.youtube.com/channel/UCiukuaNd_qzRDTW9qe2OC1w/videos
+https://www.youtube.com/channel/UC5yLuFQCms4nb9K2bGQLqIw/videos
+https://www.youtube.com/channel/UCVdSgEf9bLXFMBGSMhn7x4Q/videos
+https://www.youtube.com/channel/UC_dnk5D4tFCRYCrKIcQlcfw/videos
+https://www.youtube.com/@Freerilian/videos
+https://www.youtube.com/@marks.-ry7bm/videos
+https://www.youtube.com/@Adams-Fall/videos
+https://www.youtube.com/@mcmosav/videos
+https://www.youtube.com/@Landbeorht/videos
+https://www.youtube.com/@Corner_Citizen/videos
+https://www.youtube.com/@ethan.caughey/videos
+https://www.youtube.com/@MarcInTbilisi/videos
+https://www.youtube.com/@climbingmt.sophia/videos
+https://www.youtube.com/@Skankenstein/videos
+https://www.youtube.com/@UpCycleClub/videos
+https://www.youtube.com/@JessPurviance/videos
+https://www.youtube.com/@greyhamilton52/videos
+https://www.youtube.com/@paulrenenichols/videos
+https://www.youtube.com/@OfficialSecularKoranism/videos
+https://www.youtube.com/@FromWhomAllBlessingsFlow/videos
+https://www.youtube.com/@FoodTruckEmily/videos
+https://www.youtube.com/@O.G.Rose.Michelle.and.Daniel/videos
+https://www.youtube.com/@JonathanDumeer/videos
+https://www.youtube.com/@JordanGreenhall/videos
+https://www.youtube.com/@NechamaGluck/videos
+https://www.youtube.com/@justinsmorningcoffee/videos
+https://www.youtube.com/@grahampardun/videos
+https://www.youtube.com/@michaelmartin8681/videos
+https://www.youtube.com/@davidbusuttil9086/videos
+https://www.youtube.com/@matthewparlato5626/videos
+https://www.youtube.com/@lancecleaver227/videos
+https://www.youtube.com/@theplebistocrat/videos
+https://www.youtube.com/@RightInChrist/videos
+https://www.youtube.com/@RafeKelley/videos
+https://www.youtube.com/@WavesOfObsession/videos