Document channel feeds
parent 1ac076e5f2
commit 63fe922860
209
README-FEED-MASTER.md
Normal file
@ -0,0 +1,209 @@
# TLC Search + Feed Master Integration

This directory contains an integrated setup combining:
- **TLC Search**: Flask app for searching YouTube transcripts (Elasticsearch/Qdrant)
- **Feed Master**: RSS aggregator for YouTube channels
- **RSS Bridge**: Converts YouTube channels to RSS feeds

All services read the YouTube channel list from a single source of truth: `channels.yml` and the
adjacent `urls.txt` in this repository.

## Architecture

```
┌─────────────────────┐
│  channels.yml       │  Source of truth (this repo)
│  (python_app repo)  │
└──────────┬──────────┘
           │
           ├─────────────────────────────┬────────────────────────┐
           │                             │                        │
           v                             v                        v
┌──────────────┐                  ┌──────────────┐        ┌─────────────────┐
│  TLC Search  │                  │  RSS Bridge  │        │  Feed Master    │
│  (Flask App) │                  │  (Port 3001) │───────>│  (Port 8097)    │
│  Port 8080   │                  └──────────────┘        └─────────────────┘
│              │                                                  │
│ Elasticsearch│                                                  │
│ Qdrant       │                                                  │
└──────────────┘                                                  │
                                                                  v
                                              http://localhost:8097/rss/youtube-unified
```

## Services

### 1. TLC Search (Port 8080)
- Indexes and searches YouTube transcripts
- Uses Elasticsearch for metadata and Qdrant for vector search
- Connects to remote Elasticsearch/Qdrant instances

### 2. RSS Bridge (Port 3001)
- Converts YouTube channels to RSS feeds
- Supports both channel IDs and @handles
- Used by Feed Master to aggregate feeds
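
The two URL shapes that appear in `urls.txt` (`/channel/ID` and `/@handle`) can be told apart with a few lines of Python. This is only a minimal sketch: `classify_channel_url` is a hypothetical helper, not a function from this repo, and it says nothing about how RSS Bridge itself resolves handles.

```python
from urllib.parse import urlparse


def classify_channel_url(url: str) -> tuple[str, str]:
    """Return ("channel_id", ID) or ("handle", "@name") for a YouTube channel URL.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Split the path into non-empty segments, e.g. ["channel", "UC...", "videos"].
    segments = [part for part in urlparse(url).path.split("/") if part]
    if len(segments) >= 2 and segments[0] == "channel":
        return ("channel_id", segments[1])
    if segments and segments[0].startswith("@"):
        return ("handle", segments[0])
    raise ValueError(f"Unrecognized YouTube channel URL: {url}")


if __name__ == "__main__":
    print(classify_channel_url("https://www.youtube.com/channel/UCCebR16tXbv5Ykk9_WtCCug/videos"))
    print(classify_channel_url("https://www.youtube.com/@Freerilian/videos"))
```

Raising on anything else keeps a malformed line in the channel list from silently producing a broken feed source.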

### 3. Feed Master (Port 8097)
- Aggregates all YouTube channel RSS feeds into one unified feed
- Updates every 5 minutes
- Keeps the most recent 200 items from all channels

## Setup

### Prerequisites
- Docker and Docker Compose
- Python 3.x

### Configuration

1. **Environment Variables**: Create a `.env` file with:
   ```bash
   # Elasticsearch
   ELASTIC_URL=https://your-elasticsearch-url
   ELASTIC_INDEX=this_little_corner_py
   ELASTIC_USERNAME=your_username
   ELASTIC_PASSWORD=your_password

   # Qdrant
   QDRANT_URL=https://your-qdrant-url
   QDRANT_COLLECTION=tlc-captions-full

   # Optional UI links
   RSS_FEED_URL=/rss/youtube-unified
   CHANNELS_PATH=/app/python_app/channels.yml
   RSS_FEED_UPSTREAM=http://feed-master:8080
   ```

2. **Generate Feed Configuration**:
   ```bash
   # Regenerate feed-master config from the channels list
   python3 -m python_app.generate_feed_config_simple
   ```

   This reads `channels.yml` and generates `feed-master-config/fm.yml`.
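
To give a rough idea of what such a generator does, the sketch below turns a list of channel entries into a nested structure that a YAML library could then write out. Everything here is an assumption for illustration: the real `generate_feed_config_simple` script, the `channels.yml` schema, the `fm.yml` key names, and the RSS Bridge URL format are not shown in this README, so the keys and the URL template are placeholders.

```python
def build_feed_config(channels, source_url_template, feed_name="youtube-unified"):
    """Build a feed-config-like dict from channel entries.

    Assumptions (not taken from this repo): each channel is a dict with `name`
    and `id`, and the output key names are placeholders for whatever fm.yml
    actually expects.
    """
    sources = [
        {"name": ch["name"], "url": source_url_template.format(channel=ch["id"])}
        for ch in channels
    ]
    return {
        "feeds": {feed_name: {"title": "YouTube Unified", "sources": sources}},
        # Mirrors the schedule and item cap described in this README.
        "system": {"update": "5m", "max_total": 200},
    }


if __name__ == "__main__":
    demo_channels = [{"name": "Example Channel", "id": "UCCebR16tXbv5Ykk9_WtCCug"}]
    # Placeholder template; the real RSS Bridge query string will differ.
    config = build_feed_config(demo_channels, "http://rss-bridge/?channel={channel}")
    print(config["feeds"]["youtube-unified"]["sources"])
```

Keeping the URL template as a parameter means the same function works whichever bridge URL format the deployment uses.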

### Starting Services

```bash
# Start all services
docker compose up -d

# View logs
docker compose logs -f

# View specific service logs
docker compose logs -f feed-master
docker compose logs -f rss-bridge
docker compose logs -f app
```

### Stopping Services

```bash
# Stop all services
docker compose down

# Stop specific service
docker compose stop feed-master
```

## Usage

### Unified RSS Feed
Access the aggregated feed through the TLC app (recommended):
- **URL**: http://localhost:8080/rss
- **Format**: RSS/Atom XML
- **Behavior**: Filters RSS-Bridge error items and prefixes titles with channel name
- **Updates**: Every 5 minutes (feed-master schedule)
- **Items**: Most recent 200 items across all channels

Direct feed-master access still works:
- **URL**: http://localhost:8097/rss/youtube-unified
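
The post-processing that the TLC app applies (dropping RSS-Bridge error items and prefixing titles with the channel name) could look roughly like the following. This is a sketch under stated assumptions, not the app's actual code: the error marker string, the `channel`/`title` item keys, and the `"Channel: Title"` separator are all guesses made for the example.

```python
def clean_items(items, error_marker="Bridge returned error"):
    """Drop error items and prefix each remaining title with its channel name.

    The marker text and the item keys are assumptions for illustration.
    """
    cleaned = []
    for item in items:
        if error_marker in item["title"]:
            # Skip placeholder entries emitted when a bridge request failed.
            continue
        cleaned.append({**item, "title": f'{item["channel"]}: {item["title"]}'})
    return cleaned


if __name__ == "__main__":
    demo = [
        {"channel": "Example Channel", "title": "New video"},
        {"channel": "Broken Channel", "title": "Bridge returned error 500"},
    ]
    for item in clean_items(demo):
        print(item["title"])
```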

### TLC Search
Access the search interface at:
- **URL**: http://localhost:8080

### Channel List Endpoints
- **Plain text list**: http://localhost:8080/channels.txt
- **JSON metadata**: http://localhost:8080/api/channel-list

### RSS Bridge
Access individual channel feeds or the web interface at:
- **URL**: http://localhost:3001

## Updating Channel List

When channels are added/removed from `channels.yml`:

```bash
# 1. Regenerate feed configuration
cd /var/core/this-little-corner/src/python_app
python3 -m python_app.generate_feed_config_simple

# 2. Restart feed-master to pick up changes
docker compose restart feed-master
```

## File Structure

```
python_app/
├── docker-compose.yml              # All services configuration
├── channels.yml                    # Canonical YouTube channel list
├── urls.txt                        # URL list kept in sync with channels.yml
├── generate_feed_config_simple.py  # Config generator script (run via python -m)
├── feed-master-config/
│   ├── fm.yml                      # Feed Master configuration (auto-generated)
│   ├── var/                        # Feed Master database
│   └── images/                     # Cached images
├── data/                           # TLC Search data (read-only)
└── README-FEED-MASTER.md           # This file
```

## Troubleshooting

### Feed Master not updating
```bash
# Check if RSS Bridge is accessible
curl http://localhost:3001

# Restart both services in order
docker compose restart rss-bridge
sleep 10
docker compose restart feed-master
```

### Configuration issues
```bash
# Regenerate configuration
python3 -m python_app.generate_feed_config_simple

# Inspect the generated YAML
cat feed-master-config/fm.yml

# Restart feed-master
docker compose restart feed-master
```

### View feed-master logs
```bash
docker compose logs -f feed-master | grep -E "(ERROR|WARN|youtube)"
```

## Integration Notes

- **Single Source of Truth**: All channel URLs come from `channels.yml` and `urls.txt` in this repo
- **Config Regeneration**: Run `python3 -m python_app.generate_feed_config_simple` whenever `channels.yml` changes
- **No Manual Editing**: Don't edit `fm.yml` directly; regenerate it with the script
- **Handle Support**: Supports both `/channel/ID` and `/@handle` URL formats
- **Shared Channels**: The same channels are used for transcript indexing (TLC Search) and RSS aggregation (Feed Master)
- **Skip Broken RSS**: Set `rss: false` in `channels.yml` to exclude a channel from RSS aggregation
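
As an illustration of the `rss: false` switch, the sketch below filters already-parsed channel entries so that only channels without an explicit `rss: false` remain. The entry shape (a list of dicts with a `url` key) is an assumption about how `channels.yml` parses, and `rss_enabled` is a hypothetical name, not a function from this repo.

```python
def rss_enabled(channels):
    """Keep entries unless they explicitly set `rss` to False.

    Assumes `channels` is the parsed channel list as a list of dicts;
    a missing `rss` key is treated as enabled.
    """
    return [ch for ch in channels if ch.get("rss", True) is not False]


if __name__ == "__main__":
    demo = [
        {"url": "https://www.youtube.com/@Freerilian/videos"},
        {"url": "https://www.youtube.com/@mcmosav/videos", "rss": False},
        {"url": "https://www.youtube.com/@Landbeorht/videos", "rss": True},
    ]
    for channel in rss_enabled(demo):
        print(channel["url"])
```

A channel excluded this way would still be available for transcript indexing, since only the RSS side consults the flag in this sketch.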

## Future Enhancements

- [ ] Automated config regeneration on git pull
- [ ] Channel name lookup from YouTube API
- [ ] Integration with TLC Search for unified UI
- [ ] Webhook notifications for new videos
- [ ] OPML export for other RSS readers
74
urls.txt
Normal file
@ -0,0 +1,74 @@
https://www.youtube.com/channel/UCCebR16tXbv5Ykk9_WtCCug/videos
https://www.youtube.com/channel/UC6vg0HkKKlgsWk-3HfV-vnw/videos
https://www.youtube.com/channel/UCeWWxwzgLYUbfjWowXhVdYw/videos
https://www.youtube.com/channel/UC952hDf_C4nYJdqwK7VzTxA/videos
https://www.youtube.com/channel/UCU5SNBfTo4umhjYz6M0Jsmg/videos
https://www.youtube.com/channel/UC6Tvr9mBXNaAxLGRA_sUSRA/videos
https://www.youtube.com/channel/UC4Rmxg7saTfwIpvq3QEzylQ/videos
https://www.youtube.com/channel/UCTdH4nh6JTcfKUAWvmnPoIQ/videos
https://www.youtube.com/channel/UCsi_x8c12NW9FR7LL01QXKA/videos
https://www.youtube.com/channel/UCAqTQ5yLHHH44XWwWXLkvHQ/videos
https://www.youtube.com/channel/UCprytROeCztMOMe8plyJRMg/videos
https://www.youtube.com/channel/UCpqDUjTsof-kTNpnyWper_Q/videos
https://www.youtube.com/channel/UCL_f53ZEJxp8TtlOkHwMV9Q/videos
https://www.youtube.com/channel/UCez1fzMRGctojfis2lfRYug/videos
https://www.youtube.com/channel/UC2leFZRD0ZlQDQxpR2Zd8oA/videos
https://www.youtube.com/channel/UC8SErJkYnDsYGh1HxoZkl-g/videos
https://www.youtube.com/channel/UCEPOn4cgvrrerg_-q_Ygw1A/videos
https://www.youtube.com/channel/UC2yCyOMUeem-cYwliC-tLJg/videos
https://www.youtube.com/channel/UCGsDIP_K6J6VSTqlq-9IPlg/videos
https://www.youtube.com/channel/UCEzWTLDYmL8soRdQec9Fsjw/videos
https://www.youtube.com/channel/UC1KgNsMdRoIA_njVmaDdHgA/videos
https://www.youtube.com/channel/UCFQ6Gptuq-sLflbJ4YY3Umw/videos
https://www.youtube.com/channel/UCEY1vGNBPsC3dCatZyK3Jkw/videos
https://www.youtube.com/channel/UCIAtCuzdvgNJvSYILnHtdWA/videos
https://www.youtube.com/channel/UClIDP7_Kzv_7tDQjTv9EhrA/videos
https://www.youtube.com/channel/UC-QiBn6GsM3JZJAeAQpaGAA/videos
https://www.youtube.com/channel/UCiJmdXTb76i8eIPXdJyf8ZQ/videos
https://www.youtube.com/channel/UCM9Z05vuQhMEwsV03u6DrLA/videos
https://www.youtube.com/channel/UCgp_r6WlBwDSJrP43Mz07GQ/videos
https://www.youtube.com/channel/UC5uv-BxzCrN93B_5qbOdRWw/videos
https://www.youtube.com/channel/UCtCTSf3UwRU14nYWr_xm-dQ/videos
https://www.youtube.com/channel/UC1a4VtU_SMSfdRiwMJR33YQ/videos
https://www.youtube.com/channel/UCg7Ed0lecvko58ibuX1XHng/videos
https://www.youtube.com/channel/UCMVG5eqpYFVEB-a9IqAOuHA/videos
https://www.youtube.com/channel/UC8mJqpS_EBbMcyuzZDF0TEw/videos
https://www.youtube.com/channel/UCGHuURJ1XFHzPSeokf6510A/videos
https://www.youtube.com/@chrishoward8473/videos
https://www.youtube.com/channel/UChptV-kf8lnncGh7DA2m8Pw/videos
https://www.youtube.com/channel/UCzX6R3ZLQh5Zma_5AsPcqPA/videos
https://www.youtube.com/channel/UCiukuaNd_qzRDTW9qe2OC1w/videos
https://www.youtube.com/channel/UC5yLuFQCms4nb9K2bGQLqIw/videos
https://www.youtube.com/channel/UCVdSgEf9bLXFMBGSMhn7x4Q/videos
https://www.youtube.com/channel/UC_dnk5D4tFCRYCrKIcQlcfw/videos
https://www.youtube.com/@Freerilian/videos
https://www.youtube.com/@marks.-ry7bm/videos
https://www.youtube.com/@Adams-Fall/videos
https://www.youtube.com/@mcmosav/videos
https://www.youtube.com/@Landbeorht/videos
https://www.youtube.com/@Corner_Citizen/videos
https://www.youtube.com/@ethan.caughey/videos
https://www.youtube.com/@MarcInTbilisi/videos
https://www.youtube.com/@climbingmt.sophia/videos
https://www.youtube.com/@Skankenstein/videos
https://www.youtube.com/@UpCycleClub/videos
https://www.youtube.com/@JessPurviance/videos
https://www.youtube.com/@greyhamilton52/videos
https://www.youtube.com/@paulrenenichols/videos
https://www.youtube.com/@OfficialSecularKoranism/videos
https://www.youtube.com/@FromWhomAllBlessingsFlow/videos
https://www.youtube.com/@FoodTruckEmily/videos
https://www.youtube.com/@O.G.Rose.Michelle.and.Daniel/videos
https://www.youtube.com/@JonathanDumeer/videos
https://www.youtube.com/@JordanGreenhall/videos
https://www.youtube.com/@NechamaGluck/videos
https://www.youtube.com/@justinsmorningcoffee/videos
https://www.youtube.com/@grahampardun/videos
https://www.youtube.com/@michaelmartin8681/videos
https://www.youtube.com/@davidbusuttil9086/videos
https://www.youtube.com/@matthewparlato5626/videos
https://www.youtube.com/@lancecleaver227/videos
https://www.youtube.com/@theplebistocrat/videos
https://www.youtube.com/@RightInChrist/videos
https://www.youtube.com/@RafeKelley/videos
https://www.youtube.com/@WavesOfObsession/videos