Compare commits


24 Commits

Author SHA1 Message Date
d23888c68d Add last_posted date to /api/channel-list from Elasticsearch
Some checks failed
docker-build / build (push) Has been cancelled
Queries the latest video date per channel and includes it in the
channel-list JSON response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:14:53 -04:00
c019730666 Fix remaining placeholder channel names
Some checks failed
docker-build / build (push) Has been cancelled
- UCCebR16tXbv5Ykk9_WtCCug -> Christian T. Golden
- UC4YwC5zA9S_2EwthE27Xlew -> CMA

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:04:50 -04:00
bb2850ef98 Add /channels HTML page and fix placeholder channel names
Some checks failed
docker-build / build (push) Has been cancelled
- Add /channels route serving a simple HTML page with channel names
  linked to their YouTube pages
- Fix names for UCehAungJpAeC (Wholly Unfocused) and UCiJmdXTb76i
  (Bridges of Meaning Hub) from Elasticsearch data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:01:45 -04:00
7fdb31bf18 Add 3 missing channels from jet-alone to channels.yml source of truth
Some checks failed
docker-build / build (push) Has been cancelled
Syncs channels.yml (canonical) and urls.txt with channels that existed
only on jet-alone: LeviathanForPlay, UCehAungJpAeC-F3R5FwvvCQ,
UC4YwC5zA9S_2EwthE27Xlew.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 11:39:06 -04:00
Ubuntu
090f5943c3 Add notes page
Some checks failed
docker-build / build (push) Has been cancelled
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 20:40:53 +00:00
d168287636 Add Rigel Windsong Thurston
Some checks failed
docker-build / build (push) Has been cancelled
2026-01-10 13:36:10 -05:00
6534db6f64 Ignore .gemini artifacts
Some checks failed
docker-build / build (push) Has been cancelled
2026-01-08 22:55:33 -05:00
30503628b5 Add unified channel feed 2026-01-08 22:53:30 -05:00
63fe922860 Document channel feeds 2026-01-08 22:46:30 -05:00
1ac076e5f2 Harden search responses 2026-01-08 15:42:21 -05:00
1c95f47766 Add API rate limits 2026-01-08 15:24:05 -05:00
6a3d1ee491 Disable vector search 2026-01-08 15:20:06 -05:00
8e4c57a93a Security: add security headers, CSP, request size limits 2026-01-08 14:53:44 -05:00
1565c8db38 Security: disable debug mode, sanitize query input, validate Qdrant filters, add size/offset bounds 2026-01-08 14:41:42 -05:00
d26edda029 Add graph traversal endpoints and sort metrics by channel name 2026-01-08 14:22:01 -05:00
9dd74111e7 Change default sort to newer first 2026-01-08 14:12:15 -05:00
93774c025f Respect external filter in metrics and graph
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-20 09:54:41 -05:00
b0c9d319ef Remove full graph node cap
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-20 09:42:14 -05:00
82c334b131 Add full reference graph mode
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-19 15:23:21 -05:00
7f74aaced8 Persist search settings locally
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-19 10:20:00 -05:00
c88d1886c9 Fix backlink badge query to target referencing videos
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-18 23:47:07 -05:00
c6b46edacc Default external off and filter channels/backlink queries
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-18 23:42:49 -05:00
4c20329f36 Add external reference toggle and badges
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-18 23:07:13 -05:00
b267a0ecc6 Add Gitea workflow for Docker image builds
Some checks failed
docker-build / build (push) Has been cancelled
2025-11-18 19:14:20 -05:00
24 changed files with 2756 additions and 898 deletions


@@ -9,3 +9,5 @@ node_modules
data
videos
*.log
feed-master-config/var
feed-master-config/images


@@ -0,0 +1,37 @@
# Build and push the TLC Search Docker image whenever changes land on master.
name: docker-build

on:
  push:
    branches:
      - master

env:
  IMAGE_NAME: gitea.ghost.tel/knight/tlc-search

jobs:
  build:
    runs-on: docker
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to registry
        uses: docker/login-action@v2
        with:
          registry: gitea.ghost.tel
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Build and push image
        uses: docker/build-push-action@v5
        with:
          context: .
          file: Dockerfile
          push: true
          tags: |
            ${{ env.IMAGE_NAME }}:latest
            ${{ env.IMAGE_NAME }}:${{ github.sha }}

5
.gitignore vendored

@@ -33,6 +33,7 @@ env/
# IDE
.vscode/
.idea/
.gemini/
*.swp
*.swo
*~
@@ -51,6 +52,10 @@ Thumbs.db
# Logs
*.log
# Feed Master runtime cache
feed-master-config/var/
feed-master-config/images/
# Testing
.pytest_cache/
.coverage

87
Makefile Normal file

@@ -0,0 +1,87 @@
# Makefile for TLC Search + Feed Master

.PHONY: help config up down restart logs logs-feed logs-bridge logs-app status restart-feed update-channels

help:
	@echo "TLC Search + Feed Master - Management Commands"
	@echo ""
	@echo "Configuration:"
	@echo " make config - Regenerate feed-master configuration from channels.yml"
	@echo ""
	@echo "Service Management:"
	@echo " make up - Start all services"
	@echo " make down - Stop all services"
	@echo " make restart - Restart all services"
	@echo " make logs - View all service logs"
	@echo " make status - Check service status"
	@echo ""
	@echo "Updates:"
	@echo " make update-channels - Regenerate config and restart feed-master"
	@echo ""
	@echo "Individual Services:"
	@echo " make logs-feed - View feed-master logs"
	@echo " make logs-bridge - View rss-bridge logs"
	@echo " make logs-app - View TLC Search logs"
	@echo " make restart-feed - Restart feed-master only"

# Generate feed-master configuration from channels.yml
config:
	@echo "Generating feed-master configuration..."
	python3 -m python_app.generate_feed_config_simple
	@echo "Configuration updated!"

# Start all services
up:
	docker compose up -d
	@echo ""
	@echo "Services started!"
	@echo " - RSS Bridge: http://localhost:3001"
	@echo " - Feed Master: http://localhost:8097/rss/youtube-unified"
	@echo " - TLC Search: http://localhost:8080"

# Stop all services
down:
	docker compose down

# Restart all services
restart:
	docker compose restart

# View all logs
logs:
	docker compose logs -f

# View feed-master logs
logs-feed:
	docker compose logs -f feed-master

# View rss-bridge logs
logs-bridge:
	docker compose logs -f rss-bridge

# View TLC Search logs
logs-app:
	docker compose logs -f app

# Check service status
status:
	@docker compose ps
	@echo ""
	@echo "Endpoints:"
	@echo " - RSS Bridge: http://localhost:3001"
	@echo " - Feed Master: http://localhost:8097/rss/youtube-unified"
	@echo " - TLC Search: http://localhost:8080"

# Restart only feed-master
restart-feed:
	docker compose restart feed-master

# Regenerate configuration from channels.yml and restart feed-master
update-channels:
	@echo "Regenerating feed-master configuration..."
	python3 -m python_app.generate_feed_config_simple
	@echo ""
	@echo "Restarting feed-master..."
	docker compose restart feed-master
	@echo ""
	@echo "Update complete!"

209
README-FEED-MASTER.md Normal file

@@ -0,0 +1,209 @@
# TLC Search + Feed Master Integration
This directory contains an integrated setup combining:
- **TLC Search**: Flask app for searching YouTube transcripts (Elasticsearch/Qdrant)
- **Feed Master**: RSS aggregator for YouTube channels
- **RSS Bridge**: Converts YouTube channels to RSS feeds
All services read their channel list from `channels.yml`, the canonical source of truth in this repository; the adjacent
`urls.txt` is kept in sync with it.
## Architecture
```
┌─────────────────────┐
│  channels.yml       │  Source of truth (this repo)
│  (python_app repo)  │
└──────────┬──────────┘
           ├─────────────────────────────┬────────────────────────┐
           │                             │                        │
           v                             v                        v
    ┌──────────────┐             ┌──────────────┐        ┌─────────────────┐
    │  TLC Search  │             │  RSS Bridge  │        │  Feed Master    │
    │  (Flask App) │             │  (Port 3001) │───────>│  (Port 8097)    │
    │  Port 8080   │             └──────────────┘        └─────────────────┘
    │              │                                              │
    │ Elasticsearch│                                              │
    │ Qdrant       │                                              │
    └──────────────┘                                              │
                                                                  v
                                              http://localhost:8097/rss/youtube-unified
```
## Services
### 1. TLC Search (Port 8080)
- Indexes and searches YouTube transcripts
- Uses Elasticsearch for metadata and Qdrant for vector search
- Connects to remote Elasticsearch/Qdrant instances
### 2. RSS Bridge (Port 3001)
- Converts YouTube channels to RSS feeds
- Supports both channel IDs and @handles
- Used by Feed Master to aggregate feeds
### 3. Feed Master (Port 8097)
- Aggregates all YouTube channel RSS feeds into one unified feed
- Updates every 5 minutes
- Keeps the most recent 200 items from all channels
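As a rough illustration of the aggregation described above (not Feed Master's actual implementation), the sketch below merges items from several per-channel lists and keeps only the newest `limit` entries. `FeedItem` and `merge_recent` are hypothetical names introduced for this example only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class FeedItem:
    """Hypothetical, simplified stand-in for one RSS item."""
    channel: str
    title: str
    published: datetime


def merge_recent(feeds: Iterable[List[FeedItem]], limit: int = 200) -> List[FeedItem]:
    """Flatten all per-channel item lists and keep the newest `limit` items."""
    all_items = [item for feed in feeds for item in feed]
    # Newest first, so slicing keeps the most recent entries.
    all_items.sort(key=lambda item: item.published, reverse=True)
    return all_items[:limit]


if __name__ == "__main__":
    def ts(day: int) -> datetime:
        return datetime(2026, 1, day, tzinfo=timezone.utc)

    feed_a = [FeedItem("A", "a1", ts(1)), FeedItem("A", "a2", ts(5))]
    feed_b = [FeedItem("B", "b1", ts(3))]
    for item in merge_recent([feed_a, feed_b], limit=2):
        print(item.channel, item.title)
```

With a limit of 200, a single very active channel can crowd older items from quieter channels out of the unified feed, which may be worth keeping in mind when choosing the limit.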
## Setup
### Prerequisites
- Docker and Docker Compose
- Python 3.x
### Configuration
1. **Environment Variables**: Create `.env` file with:
```bash
# Elasticsearch
ELASTIC_URL=https://your-elasticsearch-url
ELASTIC_INDEX=this_little_corner_py
ELASTIC_USERNAME=your_username
ELASTIC_PASSWORD=your_password
# Qdrant
QDRANT_URL=https://your-qdrant-url
QDRANT_COLLECTION=tlc-captions-full
# Optional UI links
RSS_FEED_URL=/rss/youtube-unified
CHANNELS_PATH=/app/python_app/channels.yml
RSS_FEED_UPSTREAM=http://feed-master:8080
```
2. **Generate Feed Configuration**:
```bash
# Regenerate feed-master config from the channels list
python3 -m python_app.generate_feed_config_simple
```
This reads `channels.yml` and generates `feed-master-config/fm.yml`.
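The generator script itself is not shown here, so the following is only a hypothetical sketch of what such a step might look like: it turns already-parsed channel entries into a feed-master style YAML document. The function names, the reduced set of feed fields, and the YAML indentation are assumptions made for illustration; the RSS-Bridge URL shape follows `build_rss_bridge_url` in `channel_config.py`.

```python
from typing import Dict, List, Optional


def bridge_url(entry: Dict[str, object], host: str = "rss-bridge") -> Optional[str]:
    """Build an RSS-Bridge URL from a channel ID or, failing that, a handle."""
    base = f"http://{host}/?action=display&bridge=YoutubeBridge"
    if entry.get("id"):
        return f"{base}&context=By+channel+id&c={entry['id']}&format=Mrss"
    if entry.get("handle"):
        return f"{base}&context=By+username&u={entry['handle']}&format=Mrss"
    return None


def render_feed_config(entries: List[Dict[str, object]]) -> str:
    """Render a feed-master style YAML document as text (assumed layout).

    Names are written unquoted, so a name containing ": " would need quoting.
    """
    lines = [
        "feeds:",
        "  youtube-unified:",
        "    title: YouTube Unified Feed",
        "    sources:",
    ]
    for entry in sorted(entries, key=lambda e: str(e["name"]).lower()):
        url = bridge_url(entry)
        # Skip channels that opted out of RSS or cannot be addressed.
        if not entry.get("rss_enabled", True) or url is None:
            continue
        lines.append(f"      - name: {entry['name']}")
        lines.append(f"        url: {url}")
    return "\n".join(lines) + "\n"
```

Sorting by lower-cased name keeps the generated file stable between runs, which makes changes to `fm.yml` easier to review.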
### Starting Services
```bash
# Start all services
docker compose up -d
# View logs
docker compose logs -f
# View specific service logs
docker compose logs -f feed-master
docker compose logs -f rss-bridge
docker compose logs -f app
```
### Stopping Services
```bash
# Stop all services
docker compose down
# Stop specific service
docker compose stop feed-master
```
## Usage
### Unified RSS Feed
Access the aggregated feed through the TLC app (recommended):
- **URL**: http://localhost:8080/rss
- **Format**: RSS/Atom XML
- **Behavior**: Filters RSS-Bridge error items and prefixes titles with channel name
- **Updates**: Every 5 minutes (feed-master schedule)
- **Items**: Most recent 200 items across all channels
Direct feed-master access still works:
- **URL**: http://localhost:8097/rss/youtube-unified
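The filtering and title-prefixing behavior listed above could be sketched roughly as follows. This is not the app's actual proxy code: it assumes that RSS-Bridge error items carry a title starting with `Bridge returned error` and that each item exposes its channel name in an `<author>` element. Both are assumptions, and `clean_feed` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumed marker for items that RSS-Bridge emits when a bridge fails.
ERROR_PREFIX = "Bridge returned error"


def clean_feed(xml_text: str) -> str:
    """Drop error items and prefix each remaining title with its channel name."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if channel is None:
        return xml_text
    for item in list(channel.findall("item")):
        title_el = item.find("title")
        title = (title_el.text or "") if title_el is not None else ""
        if title.startswith(ERROR_PREFIX):
            channel.remove(item)
            continue
        # Assumption: the channel name is available in <author>.
        author = item.findtext("author", default="").strip()
        if title_el is not None and author and not title.startswith(f"{author}: "):
            title_el.text = f"{author}: {title}"
    return ET.tostring(root, encoding="unicode")
```

Checking for an existing prefix keeps the function idempotent, so running it twice over the same feed does not stack channel names.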
### TLC Search
Access the search interface at:
- **URL**: http://localhost:8080
### Channel List Endpoints
- **Plain text list**: http://localhost:8080/channels.txt
- **JSON metadata**: http://localhost:8080/api/channel-list
### RSS Bridge
Access individual channel feeds or the web interface at:
- **URL**: http://localhost:3001
## Updating Channel List
When channels are added/removed from `channels.yml`:
```bash
# 1. Regenerate feed configuration
cd /var/core/this-little-corner/src/python_app
python3 -m python_app.generate_feed_config_simple
# 2. Restart feed-master to pick up changes
docker compose restart feed-master
```
## File Structure
```
python_app/
├── docker-compose.yml              # All services configuration
├── channels.yml                    # Canonical YouTube channel list
├── urls.txt                        # URL list kept in sync with channels.yml
├── generate_feed_config_simple.py  # Config generator script (run via python -m)
├── feed-master-config/
│   ├── fm.yml                      # Feed Master configuration (auto-generated)
│   ├── var/                        # Feed Master database
│   └── images/                     # Cached images
├── data/                           # TLC Search data (read-only)
└── README-FEED-MASTER.md           # This file
```
## Troubleshooting
### Feed Master not updating
```bash
# Check if RSS Bridge is accessible
curl http://localhost:3001
# Restart both services in order
docker compose restart rss-bridge
sleep 10
docker compose restart feed-master
```
### Configuration issues
```bash
# Regenerate configuration
python3 -m python_app.generate_feed_config_simple
# Inspect the generated YAML
cat feed-master-config/fm.yml
# Restart feed-master
docker compose restart feed-master
```
### View feed-master logs
```bash
docker compose logs -f feed-master | grep -E "(ERROR|WARN|youtube)"
```
## Integration Notes
- **Single Source of Truth**: All channel URLs come from `channels.yml`; `urls.txt` in this repo is kept in sync with it
- **Automatic Regeneration**: Run `python3 -m python_app.generate_feed_config_simple` when `channels.yml` changes
- **No Manual Editing**: Don't edit `fm.yml` directly - regenerate it from the script
- **Handle Support**: Supports both `/channel/ID` and `/@handle` URL formats
- **Shared Channels**: Same channels used for transcript indexing (TLC Search) and RSS aggregation (Feed Master)
- **Skip Broken RSS**: Set `rss: false` in `channels.yml` to exclude a channel from RSS aggregation
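The handle support and `rss: false` notes above can be illustrated with a small, self-contained sketch. The regular expressions mirror the patterns in `channel_config.py`; `parse_channel_url` and `include_in_feed` are hypothetical helper names, and checking only the `rss` key is a simplification for the example.

```python
import re
from typing import Optional, Tuple

CHANNEL_ID = re.compile(r"(?:https?://)?(?:www\.)?youtube\.com/channel/([^/?#]+)")
HANDLE = re.compile(r"(?:https?://)?(?:www\.)?youtube\.com/@([^/?#]+)")


def parse_channel_url(url: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (channel_id, handle); normally at most one of them is set."""
    id_match = CHANNEL_ID.search(url)
    handle_match = HANDLE.search(url)
    return (
        id_match.group(1) if id_match else None,
        handle_match.group(1) if handle_match else None,
    )


def include_in_feed(entry: dict) -> bool:
    """A channel is included unless `rss` is explicitly false-like."""
    value = entry.get("rss")
    if value is None:
        return True
    # str() handles both a real boolean and a YAML-style string value.
    return str(value).strip().lower() not in {"0", "false", "no", "n"}


if __name__ == "__main__":
    print(parse_channel_url("https://www.youtube.com/@LeviathanForPlay/videos"))
    print(include_in_feed({"name": "Example", "rss": "false"}))
```

Comparing against `None` explicitly, rather than relying on truthiness, matters here because a boolean `False` is itself falsy and would otherwise be indistinguishable from a missing key.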
## Future Enhancements
- [ ] Automated config regeneration on git pull
- [ ] Channel name lookup from YouTube API
- [ ] Integration with TLC Search for unified UI
- [ ] Webhook notifications for new videos
- [ ] OPML export for other RSS readers


@@ -102,5 +102,17 @@ Other tunables (defaults shown in compose):
- `ELASTIC_VERIFY_CERTS` (set to `1` for real TLS verification)
- `QDRANT_COLLECTION` (default `tlc-captions-full`)
- `QDRANT_VECTOR_NAME` / `QDRANT_VECTOR_SIZE` / `QDRANT_EMBED_MODEL`
- `RATE_LIMIT_ENABLED` (default `1`)
- `RATE_LIMIT_REQUESTS` (default `60`)
- `RATE_LIMIT_WINDOW_SECONDS` (default `60`)
Port 8080 on the host is forwarded to the app. Mount `./data` (read-only) if you want local fallbacks for metrics (`LOCAL_DATA_DIR=/app/data/video_metadata`); otherwise the app will rely purely on the remote backends. Stop the container with `docker compose down`.
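The `RATE_LIMIT_*` tunables listed above describe a per-client request budget over a time window. The app's actual limiter is not shown in this diff, so the following is only a minimal sketch of one common way to honor such settings, a fixed-window counter. `FixedWindowLimiter` and its injectable `clock` parameter are hypothetical names for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `requests` calls per client within each window."""

    def __init__(
        self,
        requests: int = 60,
        window_seconds: int = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.requests = requests
        self.window_seconds = window_seconds
        self._clock = clock
        # client key -> (window start time, requests seen in that window)
        self._state: Dict[str, Tuple[float, int]] = {}

    def allow(self, client: str) -> bool:
        now = self._clock()
        start, count = self._state.get(client, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired: start a fresh one
        if count >= self.requests:
            self._state[client] = (start, count)
            return False
        self._state[client] = (start, count + 1)
        return True


if __name__ == "__main__":
    limiter = FixedWindowLimiter(requests=2, window_seconds=60)
    print([limiter.allow("203.0.113.7") for _ in range(3)])
```

A fixed window is simple and cheap but allows short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of more bookkeeping.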
## CI (Docker build)
A Gitea Actions workflow (`.gitea/workflows/docker-build.yml`) builds and pushes the Docker image on every push to `master`. Configure the following repository secrets in Gitea:
- `DOCKER_USERNAME`
- `DOCKER_PASSWORD`
The image is tagged as `gitea.ghost.tel/knight/tlc-search:latest` and with the commit SHA. Adjust `IMAGE_NAME` in the workflow if you need a different registry/repo.

162
channel_config.py Normal file

@@ -0,0 +1,162 @@
from __future__ import annotations

import json
import re
from pathlib import Path
from typing import Any, Dict, List, Optional

_CHANNEL_ID_PATTERN = re.compile(r"(?:https?://)?(?:www\.)?youtube\.com/channel/([^/?#]+)")
_HANDLE_PATTERN = re.compile(r"(?:https?://)?(?:www\.)?youtube\.com/@([^/?#]+)")


def _strip_quotes(value: str) -> str:
    if len(value) >= 2 and value[0] == value[-1] and value[0] in {"'", '"'}:
        return value[1:-1]
    return value


def _parse_yaml_channels(text: str) -> List[Dict[str, str]]:
    # Minimal line-based parser for the flat `channels:` list; not a general YAML parser.
    channels: List[Dict[str, str]] = []
    current: Dict[str, str] = {}

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line == "channels:":
            continue
        if line.startswith("- "):
            # A new list item starts: flush the previous entry.
            if current:
                channels.append(current)
            current = {}
            line = line[2:].strip()
            if not line:
                continue
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        current[key.strip()] = _strip_quotes(value.strip())

    if current:
        channels.append(current)
    return channels


def _extract_from_url(url: str) -> Dict[str, Optional[str]]:
    channel_id = None
    handle = None

    channel_match = _CHANNEL_ID_PATTERN.search(url)
    if channel_match:
        channel_id = channel_match.group(1)

    handle_match = _HANDLE_PATTERN.search(url)
    if handle_match:
        handle = handle_match.group(1)

    return {"id": channel_id, "handle": handle}


def _normalize_handle(handle: Optional[str]) -> Optional[str]:
    if not handle:
        return None
    return handle.lstrip("@").strip() or None


def _parse_bool(value: Optional[object]) -> Optional[bool]:
    if isinstance(value, bool):
        return value
    if value is None:
        return None
    text = str(value).strip().lower()
    if text in {"1", "true", "yes", "y"}:
        return True
    if text in {"0", "false", "no", "n"}:
        return False
    return None


def _normalize_entry(entry: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    channel_id = entry.get("id") or entry.get("channel_id")
    handle = _normalize_handle(entry.get("handle") or entry.get("username"))
    url = entry.get("url")
    name = entry.get("name")

    # Take the first RSS key that is actually present; chaining with `or`
    # would discard an explicit boolean False (e.g. from a JSON file).
    rss_raw = None
    for key in ("rss_enabled", "rss", "include_in_feed"):
        if entry.get(key) is not None:
            rss_raw = entry[key]
            break
    rss_flag = _parse_bool(rss_raw)

    if url:
        extracted = _extract_from_url(url)
        channel_id = channel_id or extracted.get("id")
        handle = handle or extracted.get("handle")

    if not url:
        if channel_id:
            url = f"https://www.youtube.com/channel/{channel_id}"
        elif handle:
            url = f"https://www.youtube.com/@{handle}"

    if not name:
        name = handle or channel_id

    if not name or not url:
        return None

    normalized = {
        "id": channel_id or "",
        "handle": handle or "",
        "name": name,
        "url": url,
        "rss_enabled": True if rss_flag is None else rss_flag,
    }
    return normalized


def load_channel_entries(path: Path) -> List[Dict[str, Any]]:
    if not path.exists():
        raise FileNotFoundError(path)

    if path.suffix.lower() == ".json":
        payload = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(payload, dict):
            raw_entries = payload.get("channels", [])
        else:
            raw_entries = payload
    else:
        raw_entries = _parse_yaml_channels(path.read_text(encoding="utf-8"))

    entries: List[Dict[str, Any]] = []
    for raw in raw_entries:
        if not isinstance(raw, dict):
            continue
        raw_payload: Dict[str, Any] = {}
        for key, value in raw.items():
            if value is None:
                continue
            if isinstance(value, bool):
                raw_payload[str(key).strip()] = value
            else:
                raw_payload[str(key).strip()] = str(value).strip()
        normalized = _normalize_entry(raw_payload)
        if normalized:
            entries.append(normalized)

    entries.sort(key=lambda item: item["name"].lower())
    return entries


def build_rss_bridge_url(entry: Dict[str, Any], rss_bridge_host: str = "rss-bridge") -> Optional[str]:
    channel_id = entry.get("id") or ""
    handle = _normalize_handle(entry.get("handle"))

    # Prefer the stable channel ID; fall back to the handle.
    if channel_id:
        return (
            f"http://{rss_bridge_host}/?action=display&bridge=YoutubeBridge"
            f"&context=By+channel+id&c={channel_id}&format=Mrss"
        )
    if handle:
        return (
            f"http://{rss_bridge_host}/?action=display&bridge=YoutubeBridge"
            f"&context=By+username&u={handle}&format=Mrss"
        )
    return None

271
channels.yml Normal file

@@ -0,0 +1,271 @@
# Shared YouTube Channel Configuration
# Used by both TLC Search (transcript collection) and Feed Master (RSS aggregation)
channels:
- id: UCCebR16tXbv5Ykk9_WtCCug
name: Christian T. Golden
url: https://www.youtube.com/channel/UCCebR16tXbv5Ykk9_WtCCug/videos
- id: UC6vg0HkKKlgsWk-3HfV-vnw
name: A Quality Existence
url: https://www.youtube.com/channel/UC6vg0HkKKlgsWk-3HfV-vnw/videos
- id: UCeWWxwzgLYUbfjWowXhVdYw
name: Andrea with the Bangs
url: https://www.youtube.com/channel/UCeWWxwzgLYUbfjWowXhVdYw/videos
- id: UC952hDf_C4nYJdqwK7VzTxA
name: Charlie's Little Corner
url: https://www.youtube.com/channel/UC952hDf_C4nYJdqwK7VzTxA/videos
- id: UCU5SNBfTo4umhjYz6M0Jsmg
name: Christian Baxter
url: https://www.youtube.com/channel/UCU5SNBfTo4umhjYz6M0Jsmg/videos
- id: UC6Tvr9mBXNaAxLGRA_sUSRA
name: Finding Ideas
url: https://www.youtube.com/channel/UC6Tvr9mBXNaAxLGRA_sUSRA/videos
- id: UC4Rmxg7saTfwIpvq3QEzylQ
name: Ein Sof - Infinite Reflections
url: https://www.youtube.com/channel/UC4Rmxg7saTfwIpvq3QEzylQ/videos
- id: UCTdH4nh6JTcfKUAWvmnPoIQ
name: Eric Seitz
url: https://www.youtube.com/channel/UCTdH4nh6JTcfKUAWvmnPoIQ/videos
- id: UCsi_x8c12NW9FR7LL01QXKA
name: Grail Country
url: https://www.youtube.com/channel/UCsi_x8c12NW9FR7LL01QXKA/videos
- id: UCAqTQ5yLHHH44XWwWXLkvHQ
name: Grizwald Grim
url: https://www.youtube.com/channel/UCAqTQ5yLHHH44XWwWXLkvHQ/videos
- id: UCprytROeCztMOMe8plyJRMg
name: faturechi
url: https://www.youtube.com/channel/UCprytROeCztMOMe8plyJRMg/videos
- id: UCpqDUjTsof-kTNpnyWper_Q
name: John Vervaeke
url: https://www.youtube.com/channel/UCpqDUjTsof-kTNpnyWper_Q/videos
- id: UCL_f53ZEJxp8TtlOkHwMV9Q
name: Jordan B Peterson
url: https://www.youtube.com/channel/UCL_f53ZEJxp8TtlOkHwMV9Q/videos
- id: UCez1fzMRGctojfis2lfRYug
name: Lucas Vos
url: https://www.youtube.com/channel/UCez1fzMRGctojfis2lfRYug/videos
- id: UC2leFZRD0ZlQDQxpR2Zd8oA
name: Mary Kochan
url: https://www.youtube.com/channel/UC2leFZRD0ZlQDQxpR2Zd8oA/videos
- id: UC8SErJkYnDsYGh1HxoZkl-g
name: Sartori Studios
url: https://www.youtube.com/channel/UC8SErJkYnDsYGh1HxoZkl-g/videos
- id: UCEPOn4cgvrrerg_-q_Ygw1A
name: More Christ
url: https://www.youtube.com/channel/UCEPOn4cgvrrerg_-q_Ygw1A/videos
- id: UC2yCyOMUeem-cYwliC-tLJg
name: Paul Anleitner
url: https://www.youtube.com/channel/UC2yCyOMUeem-cYwliC-tLJg/videos
- id: UCGsDIP_K6J6VSTqlq-9IPlg
name: Paul VanderKlay
url: https://www.youtube.com/channel/UCGsDIP_K6J6VSTqlq-9IPlg/videos
- id: UCEzWTLDYmL8soRdQec9Fsjw
name: Randos United
url: https://www.youtube.com/channel/UCEzWTLDYmL8soRdQec9Fsjw/videos
- id: UC1KgNsMdRoIA_njVmaDdHgA
name: Randos United 2
url: https://www.youtube.com/channel/UC1KgNsMdRoIA_njVmaDdHgA/videos
- id: UCFQ6Gptuq-sLflbJ4YY3Umw
name: Rebel Wisdom
url: https://www.youtube.com/channel/UCFQ6Gptuq-sLflbJ4YY3Umw/videos
- id: UCEY1vGNBPsC3dCatZyK3Jkw
name: Strange Theology
url: https://www.youtube.com/channel/UCEY1vGNBPsC3dCatZyK3Jkw/videos
- id: UCIAtCuzdvgNJvSYILnHtdWA
name: The Anadromist
url: https://www.youtube.com/channel/UCIAtCuzdvgNJvSYILnHtdWA/videos
- id: UClIDP7_Kzv_7tDQjTv9EhrA
name: The Chris Show
url: https://www.youtube.com/channel/UClIDP7_Kzv_7tDQjTv9EhrA/videos
- id: UC-QiBn6GsM3JZJAeAQpaGAA
name: TheCommonToad
url: https://www.youtube.com/channel/UC-QiBn6GsM3JZJAeAQpaGAA/videos
- id: UCiJmdXTb76i8eIPXdJyf8ZQ
name: Bridges of Meaning Hub
url: https://www.youtube.com/channel/UCiJmdXTb76i8eIPXdJyf8ZQ/videos
- id: UCM9Z05vuQhMEwsV03u6DrLA
name: Cassidy van der Kamp
url: https://www.youtube.com/channel/UCM9Z05vuQhMEwsV03u6DrLA/videos
- id: UCgp_r6WlBwDSJrP43Mz07GQ
name: The Meaning Code
url: https://www.youtube.com/channel/UCgp_r6WlBwDSJrP43Mz07GQ/videos
- id: UC5uv-BxzCrN93B_5qbOdRWw
name: TheScrollersPodcast
url: https://www.youtube.com/channel/UC5uv-BxzCrN93B_5qbOdRWw/videos
- id: UCtCTSf3UwRU14nYWr_xm-dQ
name: Jonathan Pageau
url: https://www.youtube.com/channel/UCtCTSf3UwRU14nYWr_xm-dQ/videos
- id: UC1a4VtU_SMSfdRiwMJR33YQ
name: The Young Levite
url: https://www.youtube.com/channel/UC1a4VtU_SMSfdRiwMJR33YQ/videos
- id: UCg7Ed0lecvko58ibuX1XHng
name: Transfigured
url: https://www.youtube.com/channel/UCg7Ed0lecvko58ibuX1XHng/videos
- id: UCMVG5eqpYFVEB-a9IqAOuHA
name: President Foxman
url: https://www.youtube.com/channel/UCMVG5eqpYFVEB-a9IqAOuHA/videos
- id: UC8mJqpS_EBbMcyuzZDF0TEw
name: Neal Daedalus
url: https://www.youtube.com/channel/UC8mJqpS_EBbMcyuzZDF0TEw/videos
- id: UCGHuURJ1XFHzPSeokf6510A
name: Aphrael Pilotson
url: https://www.youtube.com/channel/UCGHuURJ1XFHzPSeokf6510A/videos
- id: UC704NVL2DyzYg3rMU9r1f7A
handle: chrishoward8473
name: Chris Howard
url: https://www.youtube.com/@chrishoward8473/videos
- id: UChptV-kf8lnncGh7DA2m8Pw
name: Shoulder Serf
url: https://www.youtube.com/channel/UChptV-kf8lnncGh7DA2m8Pw/videos
- id: UCzX6R3ZLQh5Zma_5AsPcqPA
name: Restoring Meaning
url: https://www.youtube.com/channel/UCzX6R3ZLQh5Zma_5AsPcqPA/videos
- id: UCiukuaNd_qzRDTW9qe2OC1w
name: Kale Zelden
url: https://www.youtube.com/channel/UCiukuaNd_qzRDTW9qe2OC1w/videos
- id: UC5yLuFQCms4nb9K2bGQLqIw
name: Ron Copperman
url: https://www.youtube.com/channel/UC5yLuFQCms4nb9K2bGQLqIw/videos
- id: UCVdSgEf9bLXFMBGSMhn7x4Q
name: Mark D Parker
url: https://www.youtube.com/channel/UCVdSgEf9bLXFMBGSMhn7x4Q/videos
- id: UC_dnk5D4tFCRYCrKIcQlcfw
name: Luke Thompson
url: https://www.youtube.com/channel/UC_dnk5D4tFCRYCrKIcQlcfw/videos
- id: UCT8Lq3ufaGEnCSS8WpFatqw
handle: Freerilian
name: Free Rilian
url: https://www.youtube.com/@Freerilian/videos
- id: UC977g6oGYIJDQnsZOGjQBBA
handle: marks.-ry7bm
name: Mark S
url: https://www.youtube.com/@marks.-ry7bm/videos
- id: UCbD1Pm0TOcRK2zaCrwgcTTg
handle: Adams-Fall
name: Adams Fall
url: https://www.youtube.com/@Adams-Fall/videos
- id: UCnojyPW0IgLWTQ0SaDQ1KBA
handle: mcmosav
name: mcmosav
url: https://www.youtube.com/@mcmosav/videos
- id: UCiOZYvBGHw1Y6wyzffwEp9g
handle: Landbeorht
name: Joseph Lambrecht
url: https://www.youtube.com/@Landbeorht/videos
- id: UCAXyF_HFeMgwS8nkGVeroAA
handle: Corner_Citizen
name: Corner Citizen
url: https://www.youtube.com/@Corner_Citizen/videos
- id: UCv2Qft5mZrmA9XAwnl9PU-g
handle: ethan.caughey
name: Ethan Caughey
url: https://www.youtube.com/@ethan.caughey/videos
- id: UCMJCtS8jKouJ2d8UIYzW3vg
handle: MarcInTbilisi
name: Marc Jackson
url: https://www.youtube.com/@MarcInTbilisi/videos
- id: UCk9O91WwruXmgu1NQrKZZEw
handle: climbingmt.sophia
name: Climbing Mt Sophia
url: https://www.youtube.com/@climbingmt.sophia/videos
- id: UCUSyTPWW4JaG1YfUPddw47Q
handle: Skankenstein
name: Skankenstein
url: https://www.youtube.com/@Skankenstein/videos
- id: UCzw2FNI3IRphcAoVcUENOgQ
handle: UpCycleClub
name: UpCycleClub
url: https://www.youtube.com/@UpCycleClub/videos
- id: UCQ7rVoApmYIpcmU7fB9RPyw
handle: JessPurviance
name: Jesspurviance
url: https://www.youtube.com/@JessPurviance/videos
- id: UCrZyTWGMdRM9_P26RKPvh3A
handle: greyhamilton52
name: Grey Hamilton
url: https://www.youtube.com/@greyhamilton52/videos
- id: UCDCfI162vhPvwdxW6X4nmiw
handle: paulrenenichols
name: Paul Rene Nichols
url: https://www.youtube.com/@paulrenenichols/videos
- id: UCFLovlJ8RFApfjrf2y157xg
handle: OfficialSecularKoranism
name: Secular Koranism
url: https://www.youtube.com/@OfficialSecularKoranism/videos
- id: UC_-YQbnPfBbIezMr1adZZiQ
handle: FromWhomAllBlessingsFlow
name: From Whom All Blessings Flow
url: https://www.youtube.com/@FromWhomAllBlessingsFlow/videos
- id: UCn5mf-fcpBmkepIpZ8eFRng
handle: FoodTruckEmily
name: Emily Rajeh
url: https://www.youtube.com/@FoodTruckEmily/videos
- id: UC6zHDj4D323xJkblnPTvY3Q
handle: O.G.Rose.Michelle.and.Daniel
name: OG Rose
url: https://www.youtube.com/@O.G.Rose.Michelle.and.Daniel/videos
- id: UC4GiA5Hnwy415uVRymxPK-w
handle: JonathanDumeer
name: Jonathan Dumeer
url: https://www.youtube.com/@JonathanDumeer/videos
- id: UCMzT-mdCqoyEv_-YZVtE7MQ
handle: JordanGreenhall
name: Jordan Hall
url: https://www.youtube.com/@JordanGreenhall/videos
- id: UC5goUoFM4LPim4eY4pwRXYw
handle: NechamaGluck
name: Nechama Gluck
url: https://www.youtube.com/@NechamaGluck/videos
- id: UCPUVeoQYyq8cndWwyczX6RA
handle: justinsmorningcoffee
name: Justinsmorningcoffee
url: https://www.youtube.com/@justinsmorningcoffee/videos
- id: UCB0C8DEIQlQzvSGuGriBxtA
handle: grahampardun
name: Grahampardun
url: https://www.youtube.com/@grahampardun/videos
- id: UCpLJJLVB_7v4Igq-9arja1A
handle: michaelmartin8681
name: Michaelmartin8681
url: https://www.youtube.com/@michaelmartin8681/videos
- id: UCxV18lwwh29DiWuooz7UCvg
handle: davidbusuttil9086
name: Davidbusuttil9086
url: https://www.youtube.com/@davidbusuttil9086/videos
- id: UCosBhpwwGh_ueYq4ZSi5dGw
handle: matthewparlato5626
name: Matthewparlato5626
url: https://www.youtube.com/@matthewparlato5626/videos
- id: UCwF5LWNOFou_50bT65bq4Bg
handle: lancecleaver227
name: Lancecleaver227
url: https://www.youtube.com/@lancecleaver227/videos
- id: UCaJ0CqiiMSTq4X0rycUOIjw
handle: theplebistocrat
name: the plebistocrat
url: https://www.youtube.com/@theplebistocrat/videos
- id: UCWehDXDEdUpB58P7-Bg1cHg
handle: rigelwindsongthurston
name: Rigel Windsong Thurston
url: https://www.youtube.com/@rigelwindsongthurston/videos
- id: UCZA5mUAyYcCL1kYgxbeMNrA
handle: RightInChrist
name: Rightinchrist
url: https://www.youtube.com/@RightInChrist/videos
- id: UCDIPXp88qjAV3TiaR5Uo3iQ
handle: RafeKelley
name: Rafekelley
url: https://www.youtube.com/@RafeKelley/videos
- id: UCedgru6YCto3zyXjlbuQuqA
handle: WavesOfObsession
name: Wavesofobsession
url: https://www.youtube.com/@WavesOfObsession/videos
- handle: LeviathanForPlay
name: LeviathanForPlay
url: https://www.youtube.com/@LeviathanForPlay/videos
- id: UCehAungJpAeC-F3R5FwvvCQ
name: Wholly Unfocused
url: https://www.youtube.com/channel/UCehAungJpAeC-F3R5FwvvCQ/videos
- id: UC4YwC5zA9S_2EwthE27Xlew
name: CMA
url: https://www.youtube.com/channel/UC4YwC5zA9S_2EwthE27Xlew/videos


@@ -6,7 +6,13 @@ Environment Variables:
ELASTIC_USERNAME / ELASTIC_PASSWORD: Optional basic auth credentials.
ELASTIC_INDEX: Target index name (default: this_little_corner_py).
LOCAL_DATA_DIR: Root folder containing JSON metadata (default: ../data/video_metadata).
CHANNELS_PATH: Path to the canonical channel list (default: ./channels.yml).
RSS_FEED_URL: Public URL/path for the unified RSS feed (default: /rss/youtube-unified).
RSS_FEED_UPSTREAM: Base URL to proxy feed requests (default: http://localhost:8097).
YOUTUBE_API_KEY: Optional API key for pulling metadata directly from YouTube.
RATE_LIMIT_ENABLED: Toggle API rate limiting (default: 1).
RATE_LIMIT_REQUESTS: Max requests per window per client (default: 60).
RATE_LIMIT_WINDOW_SECONDS: Window size in seconds (default: 60).
"""
from __future__ import annotations
@@ -53,16 +59,27 @@ class YoutubeSettings:
    api_key: Optional[str]


@dataclass(frozen=True)
class RateLimitSettings:
    enabled: bool
    requests: int
    window_seconds: int


@dataclass(frozen=True)
class AppConfig:
    elastic: ElasticSettings
    data: DataSettings
    youtube: YoutubeSettings
    rate_limit: RateLimitSettings
    qdrant_url: str
    qdrant_collection: str
    qdrant_vector_name: Optional[str]
    qdrant_vector_size: int
    qdrant_embed_model: str
    channels_path: Path
    rss_feed_url: str
    rss_feed_upstream: str


def _env(name: str, default: Optional[str] = None) -> Optional[str]:
@@ -94,15 +111,29 @@ def load_config() -> AppConfig:
    )
    data = DataSettings(root=data_root)
    youtube = YoutubeSettings(api_key=_env("YOUTUBE_API_KEY"))
    rate_limit = RateLimitSettings(
        enabled=_env("RATE_LIMIT_ENABLED", "1") in {"1", "true", "True"},
        requests=max(int(_env("RATE_LIMIT_REQUESTS", "60")), 0),
        window_seconds=max(int(_env("RATE_LIMIT_WINDOW_SECONDS", "60")), 1),
    )
    channels_path = Path(
        _env("CHANNELS_PATH", str(Path(__file__).parent / "channels.yml"))
    ).expanduser()
    rss_feed_url = _env("RSS_FEED_URL", "/rss/youtube-unified")
    rss_feed_upstream = _env("RSS_FEED_UPSTREAM", "http://localhost:8097")

    return AppConfig(
        elastic=elastic,
        data=data,
        youtube=youtube,
        rate_limit=rate_limit,
        qdrant_url=_env("QDRANT_URL", "http://localhost:6333"),
        qdrant_collection=_env("QDRANT_COLLECTION", "tlc_embeddings"),
        qdrant_vector_name=_env("QDRANT_VECTOR_NAME"),
        qdrant_vector_size=int(_env("QDRANT_VECTOR_SIZE", "1024")),
        qdrant_embed_model=_env("QDRANT_EMBED_MODEL", "BAAI/bge-large-en-v1.5"),
        channels_path=channels_path,
        rss_feed_url=rss_feed_url or "",
        rss_feed_upstream=rss_feed_upstream or "",
    )


@@ -1,8 +1,47 @@
version: "3.9"
# Runs only the Flask app container, pointing to remote Elasticsearch/Qdrant.
# TLC Search + Feed Master - Complete YouTube content indexing & RSS aggregation
# Provide ELASTIC_URL / QDRANT_URL (and related) via environment or a .env file.
services:
# RSS Bridge - Converts YouTube channels to RSS feeds
rss-bridge:
image: rssbridge/rss-bridge:latest
container_name: tlc-rss-bridge
hostname: rss-bridge
restart: unless-stopped
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
ports:
- "3001:80"
# Feed Master - Aggregates multiple RSS feeds into unified feed
feed-master:
image: umputun/feed-master:latest
container_name: tlc-feed-master
hostname: feed-master
restart: unless-stopped
depends_on:
- rss-bridge
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
environment:
- DEBUG=false
- FM_DB=/srv/var/feed-master.bdb
- FM_CONF=/srv/etc/fm.yml
volumes:
- ./feed-master-config:/srv/etc
- ./feed-master-config/var:/srv/var
- ./feed-master-config/images:/srv/images
ports:
- "8097:8080"
# TLC Search - Flask app for searching YouTube transcripts
app:
build:
context: .
@@ -16,6 +55,9 @@ services:
ELASTIC_PASSWORD: ${ELASTIC_PASSWORD:-}
ELASTIC_API_KEY: ${ELASTIC_API_KEY:-}
ELASTIC_VERIFY_CERTS: ${ELASTIC_VERIFY_CERTS:-0}
CHANNELS_PATH: ${CHANNELS_PATH:-/app/python_app/channels.yml}
RSS_FEED_URL: ${RSS_FEED_URL:-/rss/youtube-unified}
RSS_FEED_UPSTREAM: ${RSS_FEED_UPSTREAM:-http://feed-master:8080}
QDRANT_URL: ${QDRANT_URL:?set QDRANT_URL to your remote Qdrant URL}
QDRANT_COLLECTION: ${QDRANT_COLLECTION:-tlc-captions-full}
QDRANT_VECTOR_NAME: ${QDRANT_VECTOR_NAME:-}
@@ -23,4 +65,5 @@ services:
QDRANT_EMBED_MODEL: ${QDRANT_EMBED_MODEL:-BAAI/bge-large-en-v1.5}
LOCAL_DATA_DIR: ${LOCAL_DATA_DIR:-/app/data/video_metadata}
volumes:
- ./channels.yml:/app/python_app/channels.yml:ro
- ./data:/app/data:ro

feed-master-config/fm.yml (new file, 168 lines)

@@ -0,0 +1,168 @@
# Feed Master Configuration
# Auto-generated from channels.yml
# Do not edit manually - regenerate using generate_feed_config_simple.py
feeds:
youtube-unified:
title: YouTube Unified Feed
description: Aggregated feed from all YouTube channels
link: https://youtube.com
language: "en-us"
sources:
- name: A Quality Existence
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC6vg0HkKKlgsWk-3HfV-vnw&format=Mrss
- name: Adams Fall
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCbD1Pm0TOcRK2zaCrwgcTTg&format=Mrss
- name: Andrea with the Bangs
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCeWWxwzgLYUbfjWowXhVdYw&format=Mrss
- name: Aphrael Pilotson
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCGHuURJ1XFHzPSeokf6510A&format=Mrss
- name: Cassidy van der Kamp
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCM9Z05vuQhMEwsV03u6DrLA&format=Mrss
- name: Channel UCCebR16tXbv
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCCebR16tXbv5Ykk9_WtCCug&format=Mrss
- name: Channel UCiJmdXTb76i
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCiJmdXTb76i8eIPXdJyf8ZQ&format=Mrss
- name: Charlie's Little Corner
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC952hDf_C4nYJdqwK7VzTxA&format=Mrss
- name: Chris Howard
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC704NVL2DyzYg3rMU9r1f7A&format=Mrss
- name: Christian Baxter
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCU5SNBfTo4umhjYz6M0Jsmg&format=Mrss
- name: Climbing Mt Sophia
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCk9O91WwruXmgu1NQrKZZEw&format=Mrss
- name: Corner Citizen
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCAXyF_HFeMgwS8nkGVeroAA&format=Mrss
- name: Davidbusuttil9086
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCxV18lwwh29DiWuooz7UCvg&format=Mrss
- name: Ein Sof - Infinite Reflections
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC4Rmxg7saTfwIpvq3QEzylQ&format=Mrss
- name: Emily Rajeh
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCn5mf-fcpBmkepIpZ8eFRng&format=Mrss
- name: Eric Seitz
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCTdH4nh6JTcfKUAWvmnPoIQ&format=Mrss
- name: Ethan Caughey
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCv2Qft5mZrmA9XAwnl9PU-g&format=Mrss
- name: faturechi
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCprytROeCztMOMe8plyJRMg&format=Mrss
- name: Finding Ideas
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC6Tvr9mBXNaAxLGRA_sUSRA&format=Mrss
- name: Free Rilian
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCT8Lq3ufaGEnCSS8WpFatqw&format=Mrss
- name: From Whom All Blessings Flow
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC_-YQbnPfBbIezMr1adZZiQ&format=Mrss
- name: Grahampardun
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCB0C8DEIQlQzvSGuGriBxtA&format=Mrss
- name: Grail Country
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCsi_x8c12NW9FR7LL01QXKA&format=Mrss
- name: Grey Hamilton
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCrZyTWGMdRM9_P26RKPvh3A&format=Mrss
- name: Grizwald Grim
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCAqTQ5yLHHH44XWwWXLkvHQ&format=Mrss
- name: Jesspurviance
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCQ7rVoApmYIpcmU7fB9RPyw&format=Mrss
- name: John Vervaeke
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCpqDUjTsof-kTNpnyWper_Q&format=Mrss
- name: Jonathan Dumeer
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC4GiA5Hnwy415uVRymxPK-w&format=Mrss
- name: Jonathan Pageau
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCtCTSf3UwRU14nYWr_xm-dQ&format=Mrss
- name: Jordan B Peterson
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCL_f53ZEJxp8TtlOkHwMV9Q&format=Mrss
- name: Jordan Hall
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCMzT-mdCqoyEv_-YZVtE7MQ&format=Mrss
- name: Joseph Lambrecht
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCiOZYvBGHw1Y6wyzffwEp9g&format=Mrss
- name: Justinsmorningcoffee
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCPUVeoQYyq8cndWwyczX6RA&format=Mrss
- name: Kale Zelden
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCiukuaNd_qzRDTW9qe2OC1w&format=Mrss
- name: Lancecleaver227
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCwF5LWNOFou_50bT65bq4Bg&format=Mrss
- name: Lucas Vos
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCez1fzMRGctojfis2lfRYug&format=Mrss
- name: Luke Thompson
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC_dnk5D4tFCRYCrKIcQlcfw&format=Mrss
- name: Marc Jackson
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCMJCtS8jKouJ2d8UIYzW3vg&format=Mrss
- name: Mark D Parker
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCVdSgEf9bLXFMBGSMhn7x4Q&format=Mrss
- name: Mark S
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC977g6oGYIJDQnsZOGjQBBA&format=Mrss
- name: Mary Kochan
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC2leFZRD0ZlQDQxpR2Zd8oA&format=Mrss
- name: Matthewparlato5626
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCosBhpwwGh_ueYq4ZSi5dGw&format=Mrss
- name: mcmosav
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCnojyPW0IgLWTQ0SaDQ1KBA&format=Mrss
- name: Michaelmartin8681
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCpLJJLVB_7v4Igq-9arja1A&format=Mrss
- name: More Christ
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCEPOn4cgvrrerg_-q_Ygw1A&format=Mrss
- name: Neal Daedalus
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC8mJqpS_EBbMcyuzZDF0TEw&format=Mrss
- name: Nechama Gluck
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC5goUoFM4LPim4eY4pwRXYw&format=Mrss
- name: OG Rose
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC6zHDj4D323xJkblnPTvY3Q&format=Mrss
- name: Paul Anleitner
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC2yCyOMUeem-cYwliC-tLJg&format=Mrss
- name: Paul Rene Nichols
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCDCfI162vhPvwdxW6X4nmiw&format=Mrss
- name: Paul VanderKlay
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCGsDIP_K6J6VSTqlq-9IPlg&format=Mrss
- name: President Foxman
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCMVG5eqpYFVEB-a9IqAOuHA&format=Mrss
- name: Rafekelley
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCDIPXp88qjAV3TiaR5Uo3iQ&format=Mrss
- name: Randos United
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCEzWTLDYmL8soRdQec9Fsjw&format=Mrss
- name: Randos United 2
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC1KgNsMdRoIA_njVmaDdHgA&format=Mrss
- name: Rebel Wisdom
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCFQ6Gptuq-sLflbJ4YY3Umw&format=Mrss
- name: Restoring Meaning
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCzX6R3ZLQh5Zma_5AsPcqPA&format=Mrss
- name: Rigel Windsong Thurston
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCWehDXDEdUpB58P7-Bg1cHg&format=Mrss
- name: Rightinchrist
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCZA5mUAyYcCL1kYgxbeMNrA&format=Mrss
- name: Ron Copperman
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC5yLuFQCms4nb9K2bGQLqIw&format=Mrss
- name: Sartori Studios
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC8SErJkYnDsYGh1HxoZkl-g&format=Mrss
- name: Secular Koranism
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCFLovlJ8RFApfjrf2y157xg&format=Mrss
- name: Shoulder Serf
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UChptV-kf8lnncGh7DA2m8Pw&format=Mrss
- name: Skankenstein
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCUSyTPWW4JaG1YfUPddw47Q&format=Mrss
- name: Strange Theology
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCEY1vGNBPsC3dCatZyK3Jkw&format=Mrss
- name: The Anadromist
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCIAtCuzdvgNJvSYILnHtdWA&format=Mrss
- name: The Chris Show
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UClIDP7_Kzv_7tDQjTv9EhrA&format=Mrss
- name: The Meaning Code
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCgp_r6WlBwDSJrP43Mz07GQ&format=Mrss
- name: the plebistocrat
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCaJ0CqiiMSTq4X0rycUOIjw&format=Mrss
- name: The Young Levite
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC1a4VtU_SMSfdRiwMJR33YQ&format=Mrss
- name: TheCommonToad
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC-QiBn6GsM3JZJAeAQpaGAA&format=Mrss
- name: TheScrollersPodcast
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UC5uv-BxzCrN93B_5qbOdRWw&format=Mrss
- name: Transfigured
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCg7Ed0lecvko58ibuX1XHng&format=Mrss
- name: UpCycleClub
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCzw2FNI3IRphcAoVcUENOgQ&format=Mrss
- name: Wavesofobsession
url: http://rss-bridge/?action=display&bridge=YoutubeBridge&context=By+channel+id&c=UCedgru6YCto3zyXjlbuQuqA&format=Mrss
system:
update: 5m
max_per_feed: 5
max_total: 200
max_keep: 1000
base_url: http://localhost:8097
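
The source URLs above all follow one pattern, so a single channel id is enough to derive an entry. The following is a minimal sketch of that derivation, assuming the query parameters seen in this file; `bridge_url_for_channel` is a hypothetical name used only for illustration and is not the project's `build_rss_bridge_url`, whose implementation is not shown in this diff.

```python
from urllib.parse import urlencode


def bridge_url_for_channel(channel_id: str, rss_bridge_host: str = "rss-bridge") -> str:
    """Hypothetical helper: build an RSS-Bridge YoutubeBridge URL for one channel id."""
    query = urlencode(
        {
            "action": "display",
            "bridge": "YoutubeBridge",
            "context": "By channel id",  # urlencode turns the spaces into '+'
            "c": channel_id,
            "format": "Mrss",
        }
    )
    return f"http://{rss_bridge_host}/?{query}"


url = bridge_url_for_channel("UC6vg0HkKKlgsWk-3HfV-vnw")
print(url)
```

With the first channel id from the list, this should reproduce the URL recorded for "A Quality Existence".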

generate_feed_config.py (new file, 91 lines)

@@ -0,0 +1,91 @@
#!/usr/bin/env python3
"""
Generate feed-master configuration from channels.yml.
This ensures a single source of truth for the YouTube channels.
"""
import sys
from pathlib import Path

from .channel_config import build_rss_bridge_url, load_channel_entries


def generate_fm_config(channels_file, output_file, rss_bridge_host="rss-bridge"):
    """Generate feed-master YAML configuration from channels.yml"""
    print(f"Reading channels from {channels_file}")
    channels = load_channel_entries(Path(channels_file))
    print(f"Found {len(channels)} channels")

    # Generate feed configuration
    config = []
    config.append("# Feed Master Configuration")
    config.append("# Auto-generated from channels.yml")
    config.append("# Do not edit manually - regenerate using generate_feed_config.py")
    config.append("")
    config.append("feeds:")
    config.append("  youtube-unified:")
    config.append("    title: YouTube Unified Feed")
    config.append("    description: Aggregated feed from all YouTube channels")
    config.append("    link: https://youtube.com")
    config.append('    language: "en-us"')
    config.append("    sources:")

    processed = 0
    skipped = 0
    for channel in channels:
        if not channel.get("rss_enabled", True):
            skipped += 1
            continue
        bridge_url = build_rss_bridge_url(channel, rss_bridge_host=rss_bridge_host)
        if not bridge_url:
            skipped += 1
            continue
        name = channel.get("name", "Unknown")
        config.append(f"      - name: {name}")
        config.append(f"        url: {bridge_url}")
        processed += 1

    # Add system configuration
    config.append("")
    config.append("system:")
    config.append("  update: 5m")
    config.append("  max_per_feed: 5")
    config.append("  max_total: 200")
    config.append("  max_keep: 1000")
    config.append("  base_url: http://localhost:8097")

    # Write output
    print(f"\nProcessed {processed} channels, skipped {skipped}")
    with open(output_file, 'w') as f:
        f.write('\n'.join(config))
    print(f"Configuration written to {output_file}")
    print(f"\nTo apply this configuration:")
    print(f"  1. Copy {output_file} to feed-master/etc/fm.yml")
    print(f"  2. Restart the feed-master service")


if __name__ == "__main__":
    # Default paths
    script_dir = Path(__file__).parent
    channels_file = script_dir / "channels.yml"
    output_file = script_dir / "feed-master-config" / "fm.yml"

    # Allow overriding via command line
    if len(sys.argv) > 1:
        channels_file = Path(sys.argv[1])
    if len(sys.argv) > 2:
        output_file = Path(sys.argv[2])

    if not channels_file.exists():
        print(f"Error: {channels_file} not found", file=sys.stderr)
        print(f"\nUsage: {sys.argv[0]} [channels.yml] [output.yml]", file=sys.stderr)
        sys.exit(1)

    # Ensure output directory exists
    output_file.parent.mkdir(parents=True, exist_ok=True)
    generate_fm_config(channels_file, output_file)
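
The generator writes each channel name into the YAML unquoted, which is fine for the names currently in the list but could break on a name containing `: ` or ` #`. As a hedged sketch only, one way to guard against that is to quote the name with `json.dumps`, since a JSON string literal is also a valid YAML double-quoted scalar. `render_source` and its default indent are hypothetical and not part of the script above.

```python
import json
from typing import List


def render_source(name: str, url: str, indent: str = "      ") -> List[str]:
    """Hypothetical helper: emit one feed-master source entry with a quoted name."""
    # json.dumps escapes quotes and backslashes, so the result can be used
    # as a YAML double-quoted scalar without pulling in a YAML library.
    quoted = json.dumps(name, ensure_ascii=False)
    return [
        f"{indent}- name: {quoted}",
        f"{indent}  url: {url}",
    ]


# Example name chosen to contain characters that are special in YAML.
lines = render_source("Q&A: Live #1", "http://rss-bridge/?c=EXAMPLE")
print("\n".join(lines))
```

The URL is left unquoted here because the generated URLs contain no `: ` or ` #` sequences; that is an assumption about `build_rss_bridge_url`, not something this diff confirms.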

generate_feed_config_simple.py (new executable file, 88 lines)

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""
Generate feed-master configuration from channels.yml.
Simplified version that doesn't require RSS-Bridge to be running.
"""
import sys
from pathlib import Path

from .channel_config import build_rss_bridge_url, load_channel_entries


def generate_fm_config(channels_file, output_file, rss_bridge_host="rss-bridge"):
    """Generate feed-master YAML configuration from channels.yml"""
    print(f"Reading channels from {channels_file}")
    channels = load_channel_entries(Path(channels_file))
    print(f"Found {len(channels)} channels")

    # Generate feed configuration
    config = []
    config.append("# Feed Master Configuration")
    config.append("# Auto-generated from channels.yml")
    config.append("# Do not edit manually - regenerate using generate_feed_config_simple.py")
    config.append("")
    config.append("feeds:")
    config.append("  youtube-unified:")
    config.append("    title: YouTube Unified Feed")
    config.append("    description: Aggregated feed from all YouTube channels")
    config.append("    link: https://youtube.com")
    config.append('    language: "en-us"')
    config.append("    sources:")

    processed = 0
    skipped = 0
    for channel in channels:
        if not channel.get("rss_enabled", True):
            skipped += 1
            continue
        bridge_url = build_rss_bridge_url(channel, rss_bridge_host=rss_bridge_host)
        if not bridge_url:
            skipped += 1
            continue
        name = channel.get("name", "Unknown")
        config.append(f"      - name: {name}")
        config.append(f"        url: {bridge_url}")
        processed += 1

    # Add system configuration
    config.append("")
    config.append("system:")
    config.append("  update: 5m")
    config.append("  max_per_feed: 5")
    config.append("  max_total: 200")
    config.append("  max_keep: 1000")
    config.append("  base_url: http://localhost:8097")

    # Write output
    print(f"\nProcessed {processed} channels, skipped {skipped}")
    with open(output_file, 'w') as f:
        f.write('\n'.join(config))
    print(f"Configuration written to {output_file}")


if __name__ == "__main__":
    # Default paths
    script_dir = Path(__file__).parent
    channels_file = script_dir / "channels.yml"
    output_file = script_dir / "feed-master-config" / "fm.yml"

    # Allow overriding via command line
    if len(sys.argv) > 1:
        channels_file = Path(sys.argv[1])
    if len(sys.argv) > 2:
        output_file = Path(sys.argv[2])

    if not channels_file.exists():
        print(f"Error: {channels_file} not found", file=sys.stderr)
        print(f"\nUsage: {sys.argv[0]} [channels.yml] [output.yml]", file=sys.stderr)
        sys.exit(1)

    # Ensure output directory exists
    output_file.parent.mkdir(parents=True, exist_ok=True)
    generate_fm_config(channels_file, output_file)
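
The generated fm.yml in this diff still contains placeholder names such as "Channel UCCebR16tXbv", which later commits correct by hand. A small post-generation check could surface them earlier. The sketch below is illustrative only: `find_placeholder_names` and the placeholder pattern are assumptions based on the two entries visible in this diff, not part of either generator script.

```python
import re
from typing import List

# Assumed placeholder shape, based on entries such as "Channel UCCebR16tXbv".
PLACEHOLDER_NAME = re.compile(r"^\s*- name: (Channel UC[\w-]+)\s*$")


def find_placeholder_names(config_text: str) -> List[str]:
    """Hypothetical check: list source names that still look like placeholders."""
    found = []
    for line in config_text.splitlines():
        match = PLACEHOLDER_NAME.match(line)
        if match:
            found.append(match.group(1))
    return found


sample = "\n".join(
    [
        "      - name: Chris Howard",
        "      - name: Channel UCCebR16tXbv",
        "      - name: Channel UCiJmdXTb76i",
    ]
)
placeholders = find_placeholder_names(sample)
print(placeholders)
```

Run against the real file, a non-empty result would suggest that channels.yml is missing a display name for those channel ids.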


@@ -4,4 +4,3 @@ youtube-transcript-api>=0.6
google-api-python-client>=2.0.0
python-dotenv>=0.19.0
requests>=2.31.0
sentence-transformers>=2.7.0

File diff suppressed because it is too large


@@ -39,11 +39,16 @@
const exactToggle = document.getElementById("exactToggle");
const fuzzyToggle = document.getElementById("fuzzyToggle");
const phraseToggle = document.getElementById("phraseToggle");
const externalToggle = document.getElementById("externalToggle");
const queryToggle = document.getElementById("queryStringToggle");
const searchBtn = document.getElementById("searchBtn");
const aboutBtn = document.getElementById("aboutBtn");
const aboutPanel = document.getElementById("aboutPanel");
const aboutCloseBtn = document.getElementById("aboutCloseBtn");
const rssButton = document.getElementById("rssButton");
const rssFeedLink = document.getElementById("rssFeedLink");
const channelListLink = document.getElementById("channelListLink");
const channelCount = document.getElementById("channelCount");
const resultsDiv = document.getElementById("results");
const metaDiv = document.getElementById("meta");
const metricsContainer = document.getElementById("metrics");
@@ -55,13 +60,28 @@
const graphModalClose = document.getElementById("graphModalClose");
const channelMap = new Map();
const transcriptCache = new Map();
const SETTINGS_KEY = "tlc-search-settings";
const DEFAULT_SETTINGS = {
channel: "",
year: "",
sort: "newer",
size: "10",
exact: true,
fuzzy: true,
phrase: true,
external: false,
queryString: false,
};
let settings = loadSettings();
let lastFocusBeforeModal = null;
let pendingChannelSelection = "";
let channelsReady = false;
let previousToggleState = { exact: true, fuzzy: true, phrase: true };
let currentPage =
parseInt(qs.get("page") || "0", 10) ||
0;
let previousToggleState = {
exact: settings.exact,
fuzzy: settings.fuzzy,
phrase: settings.phrase,
};
let currentPage = 0;
function toggleAboutPanel(show) {
if (!aboutPanel) return;
@@ -72,13 +92,6 @@
}
}
function parseBoolParam(name, defaultValue) {
const raw = qs.get(name);
if (raw === null) return defaultValue;
const lowered = raw.toLowerCase();
return !["0", "false", "no"].includes(lowered);
}
function parseChannelParam(params) {
if (!params) return "";
const seen = new Set();
@@ -101,6 +114,69 @@
return first || "";
}
function loadSettings() {
try {
const raw = localStorage.getItem(SETTINGS_KEY);
if (!raw) return { ...DEFAULT_SETTINGS };
const parsed = JSON.parse(raw);
return { ...DEFAULT_SETTINGS, ...parsed };
} catch (err) {
console.warn("Failed to load settings", err);
return { ...DEFAULT_SETTINGS };
}
}
function persistSettings() {
try {
localStorage.setItem(SETTINGS_KEY, JSON.stringify(settings));
} catch (err) {
console.warn("Failed to persist settings", err);
}
}
function applyStoredSettings() {
yearSel.value = settings.year || "";
sortSel.value = settings.sort || "relevant";
sizeSel.value = settings.size || "10";
exactToggle.checked = settings.exact;
fuzzyToggle.checked = settings.fuzzy;
phraseToggle.checked = settings.phrase;
if (externalToggle) {
externalToggle.checked = settings.external;
}
if (queryToggle) {
queryToggle.checked = settings.queryString;
}
}
function currentTogglePreferences() {
if (queryToggle && queryToggle.checked) {
return { ...previousToggleState };
}
return {
exact: !!exactToggle.checked,
fuzzy: !!fuzzyToggle.checked,
phrase: !!phraseToggle.checked,
};
}
function syncSettingsFromControls() {
const togglePrefs = currentTogglePreferences();
const next = {
...settings,
channel: channelSelect ? channelSelect.value || "" : "",
year: yearSel.value || "",
sort: sortSel.value || "relevant",
size: sizeSel.value || "10",
external: externalToggle ? !!externalToggle.checked : false,
queryString: queryToggle ? !!queryToggle.checked : false,
...togglePrefs,
};
settings = next;
persistSettings();
return settings;
}
function getSelectedChannels() {
if (!channelSelect) return [];
const value = channelSelect.value;
@@ -129,17 +205,18 @@
function setFromQuery() {
qInput.value = qs.get("q") || "";
yearSel.value = qs.get("year") || "";
sortSel.value = qs.get("sort") || "relevant";
sizeSel.value = qs.get("size") || "10";
pendingChannelSelection = parseChannelParam(qs);
const urlChannel = parseChannelParam(qs);
if (urlChannel) {
pendingChannelSelection = urlChannel;
settings.channel = urlChannel;
persistSettings();
} else {
pendingChannelSelection = settings.channel || "";
}
applyStoredSettings();
if (channelSelect) {
channelSelect.value = pendingChannelSelection || "";
}
exactToggle.checked = parseBoolParam("exact", true);
fuzzyToggle.checked = parseBoolParam("fuzzy", true);
phraseToggle.checked = parseBoolParam("phrase", true);
queryToggle.checked = parseBoolParam("query_string", false);
applyQueryMode();
rememberToggleState();
}
@@ -153,6 +230,8 @@
fuzzy: fuzzyToggle.checked,
phrase: phraseToggle.checked,
};
settings = { ...settings, ...previousToggleState };
persistSettings();
}
exactToggle.checked = false;
fuzzyToggle.checked = false;
@@ -168,6 +247,8 @@
fuzzyToggle.checked = previousToggleState.fuzzy;
phraseToggle.checked = previousToggleState.phrase;
}
settings.queryString = !!(queryToggle && queryToggle.checked);
persistSettings();
}
function rememberToggleState() {
@@ -177,6 +258,8 @@
fuzzy: !!fuzzyToggle.checked,
phrase: !!phraseToggle.checked,
};
settings = { ...settings, ...previousToggleState };
persistSettings();
}
}
@@ -188,6 +271,10 @@
if (!graphOverlay || !graphUiAvailable()) {
return;
}
const includeExternal = externalToggle ? !!externalToggle.checked : false;
if (graphUiAvailable() && typeof window.GraphUI.setIncludeExternal === "function") {
window.GraphUI.setIncludeExternal(includeExternal);
}
lastFocusBeforeModal =
document.activeElement instanceof HTMLElement ? document.activeElement : null;
graphOverlay.classList.add("active");
@@ -203,7 +290,10 @@
graphVideoField.value = videoId;
}
if (videoId) {
window.GraphUI.load(videoId, undefined, undefined, { updateInputs: true });
window.GraphUI.load(videoId, undefined, undefined, {
updateInputs: true,
includeExternal,
});
}
window.GraphUI.focusInput();
});
@@ -277,7 +367,11 @@
});
if (!collected.length) return null;
const escaped = collected.map((id) => `"${escapeQueryValue(id)}"`);
return `${field}:(${escaped.join(" OR ")})`;
const variants = field.endsWith(".keyword")
? [field]
: [`${field}.keyword`, field];
const clauses = variants.map((fname) => `${fname}:(${escaped.join(" OR ")})`);
return clauses.length > 1 ? `(${clauses.join(" OR ")})` : clauses[0];
}
async function loadChannels() {
@@ -286,7 +380,8 @@
return;
}
try {
const res = await fetch("/api/channels");
const includeExternal = externalToggle ? externalToggle.checked : false;
const res = await fetch(`/api/channels?external=${includeExternal ? "1" : "0"}`);
const data = await res.json();
channelMap.clear();
channelSelect.innerHTML = '<option value="">All Channels</option>';
@@ -304,6 +399,8 @@
} else {
channelSelect.value = "";
}
settings.channel = channelSelect.value || "";
persistSettings();
channelsReady = true;
} catch (err) {
@@ -313,24 +410,75 @@
}
}
function updateUrl(q, sort, channels, year, page, size, exact, fuzzy, phrase, queryMode) {
async function loadChannelListInfo() {
if (!rssFeedLink && !channelListLink && !channelCount) return;
try {
const res = await fetch("/api/channel-list");
const payload = await res.json();
if (rssFeedLink) {
const feedUrl = payload.rss_feed_url || "";
if (feedUrl) {
rssFeedLink.href = feedUrl;
rssFeedLink.textContent = feedUrl;
} else {
rssFeedLink.textContent = "Unavailable";
rssFeedLink.removeAttribute("href");
}
}
if (rssButton) {
const feedUrl = payload.rss_feed_url || "";
if (feedUrl) {
rssButton.href = feedUrl;
rssButton.classList.remove("is-disabled");
rssButton.removeAttribute("aria-disabled");
} else {
rssButton.removeAttribute("href");
rssButton.classList.add("is-disabled");
rssButton.setAttribute("aria-disabled", "true");
}
}
if (channelCount) {
const count = Array.isArray(payload.channels) ? payload.channels.length : 0;
channelCount.textContent = count ? `${count} channels` : "No channels loaded";
}
if (channelListLink && payload.error) {
channelListLink.textContent = "Channel list unavailable";
}
} catch (err) {
console.error("Failed to load channel list", err);
if (rssFeedLink) {
rssFeedLink.textContent = "Unavailable";
rssFeedLink.removeAttribute("href");
}
if (rssButton) {
rssButton.removeAttribute("href");
rssButton.classList.add("is-disabled");
rssButton.setAttribute("aria-disabled", "true");
}
if (channelCount) {
channelCount.textContent = "Channel list unavailable";
}
}
}
function updateUrl(q) {
const next = new URL(window.location.href);
next.searchParams.set("q", q);
next.searchParams.set("sort", sort);
if (q) {
next.searchParams.set("q", q);
} else {
next.searchParams.delete("q");
}
next.searchParams.delete("page");
next.searchParams.delete("sort");
next.searchParams.delete("channel_id");
next.searchParams.delete("channel");
channels.forEach((id) => next.searchParams.append("channel_id", id));
if (year) {
next.searchParams.set("year", year);
} else {
next.searchParams.delete("year");
}
next.searchParams.set("page", page);
next.searchParams.set("size", size);
next.searchParams.set("exact", exact ? "1" : "0");
next.searchParams.set("fuzzy", fuzzy ? "1" : "0");
next.searchParams.set("phrase", phrase ? "1" : "0");
next.searchParams.set("query_string", queryMode ? "1" : "0");
next.searchParams.delete("year");
next.searchParams.delete("size");
next.searchParams.delete("exact");
next.searchParams.delete("fuzzy");
next.searchParams.delete("phrase");
next.searchParams.delete("query_string");
next.searchParams.delete("external");
history.pushState({}, "", next.toString());
}
@@ -934,7 +1082,7 @@
}
}
function renderMetrics(data) {
function renderMetrics(data) {
if (!metricsContent) return;
metricsContent.innerHTML = "";
if (!data) return;
@@ -972,7 +1120,8 @@ async function loadMetrics() {
metricsStatus.textContent = "Loading metrics…";
}
try {
const res = await fetch("/api/metrics");
const includeExternal = externalToggle ? !!externalToggle.checked : false;
const res = await fetch(`/api/metrics?external=${includeExternal ? "1" : "0"}`);
const data = await res.json();
renderMetrics(data);
metricsContainer.dataset.loaded = "1";
@@ -1193,10 +1342,11 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
if (queryMode) {
params.set("query_string", "1");
}
const { exact = true, fuzzy = true, phrase = true } = toggles || {};
const { exact = true, fuzzy = true, phrase = true, external = false } = toggles || {};
params.set("exact", exact ? "1" : "0");
params.set("fuzzy", fuzzy ? "1" : "0");
params.set("phrase", phrase ? "1" : "0");
params.set("external", external ? "1" : "0");
clearFrequency("Loading timeline…");
try {
@@ -1238,18 +1388,32 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
} of ${payload.totalPages}`;
(payload.items || []).forEach((item) => {
const isExternal = !!item.external_reference;
const hasTitle = typeof item.title === "string" && item.title.trim().length > 0;
if (isExternal && !hasTitle) {
return;
}
const el = document.createElement("div");
el.className = "item";
const rawTitle = item.title || "Untitled";
const rawDescription = item.description || "";
const titleHtml =
item.titleHtml || escapeHtml(item.title || "Untitled");
item.titleHtml || escapeHtml(rawTitle);
const descriptionHtml =
item.descriptionHtml || escapeHtml(item.description || "");
item.descriptionHtml || escapeHtml(rawDescription);
const header = document.createElement("div");
header.className = "result-header";
const headerMain = document.createElement("div");
headerMain.className = "result-header-main";
const badgeDefs = [];
if (item.external_reference) {
badgeDefs.push({
label: "External",
badgeType: "external",
title: "Indexed from an external reference source",
});
}
if (item.highlightSource && item.highlightSource.primary) {
badgeDefs.push({ label: "primary transcript", badgeType: "transcript-primary" });
}
@@ -1264,13 +1428,10 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
const refToIds = Array.isArray(item.internal_references) ? item.internal_references : [];
if (refByCount > 0) {
let query = null;
if (item.video_id) {
let query = buildFieldClause("video_id", refByIds);
if (!query && item.video_id) {
query = buildFieldClause("internal_references", [item.video_id]);
}
if (!query) {
query = buildFieldClause("video_id", refByIds);
}
badgeDefs.push({
label: `${refByCount} backlink${refByCount !== 1 ? "s" : ""}`,
query,
@@ -1291,7 +1452,11 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
}
const titleEl = document.createElement("strong");
titleEl.innerHTML = titleHtml;
if (item.titleHtml) {
titleEl.innerHTML = titleHtml;
} else {
titleEl.textContent = rawTitle;
}
headerMain.appendChild(titleEl);
const metaLine = document.createElement("div");
@@ -1415,7 +1580,11 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
if (descriptionHtml) {
const desc = document.createElement("div");
desc.className = "muted description-block";
desc.innerHTML = descriptionHtml;
if (item.descriptionHtml) {
desc.innerHTML = descriptionHtml;
} else {
desc.textContent = rawDescription;
}
el.appendChild(desc);
}
@@ -1516,6 +1685,7 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
let exact = !!exactToggle.checked;
let fuzzy = !!fuzzyToggle.checked;
let phrase = !!phraseToggle.checked;
const includeExternal = externalToggle ? externalToggle.checked : false;
if (queryMode) {
exact = false;
fuzzy = false;
@@ -1526,12 +1696,14 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
fuzzy,
phrase,
};
settings = { ...settings, ...previousToggleState };
persistSettings();
}
const page = pageOverride != null ? pageOverride : currentPage;
currentPage = page;
if (pushState) {
updateUrl(q, sort, channels, year, page, size, exact, fuzzy, phrase, queryMode);
updateUrl(q);
}
const params = new URLSearchParams();
@@ -1543,13 +1715,16 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
params.set("fuzzy", fuzzy ? "1" : "0");
params.set("phrase", phrase ? "1" : "0");
params.set("query_string", queryMode ? "1" : "0");
params.set("external", includeExternal ? "1" : "0");
channels.forEach((id) => params.append("channel_id", id));
if (year) params.set("year", year);
syncSettingsFromControls();
const res = await fetch(`/api/search?${params.toString()}`);
const payload = await res.json();
renderResults(payload, page);
updateFrequencyChart(q, channels, year, queryMode, { exact, fuzzy, phrase });
updateFrequencyChart(q, channels, year, queryMode, { exact, fuzzy, phrase, external: includeExternal });
}
searchBtn.addEventListener("click", () => runSearch(0));
@@ -1569,31 +1744,50 @@ async function updateFrequencyChart(term, channels, year, queryMode, toggles = {
if (channelSelect) {
channelSelect.addEventListener("change", () => {
pendingChannelSelection = channelSelect.value || "";
settings.channel = pendingChannelSelection;
persistSettings();
if (channelsReady) {
runSearch(0);
}
});
}
yearSel.addEventListener("change", () => runSearch(0));
sortSel.addEventListener("change", () => runSearch(0));
sizeSel.addEventListener("change", () => runSearch(0));
exactToggle.addEventListener("change", () => { rememberToggleState(); runSearch(0); });
fuzzyToggle.addEventListener("change", () => { rememberToggleState(); runSearch(0); });
phraseToggle.addEventListener("change", () => { rememberToggleState(); runSearch(0); });
yearSel.addEventListener("change", () => { syncSettingsFromControls(); runSearch(0); });
sortSel.addEventListener("change", () => { syncSettingsFromControls(); runSearch(0); });
sizeSel.addEventListener("change", () => { syncSettingsFromControls(); runSearch(0); });
exactToggle.addEventListener("change", () => { rememberToggleState(); syncSettingsFromControls(); runSearch(0); });
fuzzyToggle.addEventListener("change", () => { rememberToggleState(); syncSettingsFromControls(); runSearch(0); });
phraseToggle.addEventListener("change", () => { rememberToggleState(); syncSettingsFromControls(); runSearch(0); });
if (externalToggle) {
externalToggle.addEventListener("change", () => {
pendingChannelSelection = "";
settings.external = !!externalToggle.checked;
persistSettings();
loadChannels().then(() => runSearch(0));
loadMetrics();
if (graphUiAvailable()) {
window.GraphUI.setIncludeExternal(settings.external);
}
});
}
if (queryToggle) {
queryToggle.addEventListener("change", () => { applyQueryMode(); runSearch(0); });
queryToggle.addEventListener("change", () => {
applyQueryMode();
syncSettingsFromControls();
runSearch(0);
});
}
window.addEventListener("popstate", () => {
qs = new URLSearchParams(window.location.search);
setFromQuery();
currentPage = parseInt(qs.get("page") || "0", 10) || 0;
currentPage = 0;
runSearch(currentPage, false);
});
setFromQuery();
loadMetrics();
loadYears();
loadChannelListInfo();
loadChannels().then(() => runSearch(currentPage));
})();
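
The hunk above adds an `external` flag next to the existing toggle parameters sent to `/api/search`. A minimal sketch of how that query string is assembled, assuming only the parameter names visible in the diff (`fuzzy`, `phrase`, `query_string`, `external`, `channel_id`, `year`); the helper name `buildSearchQuery` and its options object are hypothetical and do not appear in the source:

```javascript
// Hypothetical helper (not in the diff): mirrors the params.set(...) calls
// shown in the hunk so the flag encoding can be checked in isolation.
function buildSearchQuery({ fuzzy, phrase, queryMode, includeExternal, channels = [], year = "" }) {
  const params = new URLSearchParams();
  // Boolean toggles are encoded as "1" / "0", as in the diff.
  params.set("fuzzy", fuzzy ? "1" : "0");
  params.set("phrase", phrase ? "1" : "0");
  params.set("query_string", queryMode ? "1" : "0");
  params.set("external", includeExternal ? "1" : "0");
  // Each selected channel becomes its own repeated channel_id entry.
  channels.forEach((id) => params.append("channel_id", id));
  // The year filter is only sent when one is selected.
  if (year) params.set("year", year);
  return params.toString();
}
```

Because `external` is always set explicitly here, the server never has to guess a default for the search endpoint, whatever default it may apply elsewhere.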


@@ -54,6 +54,17 @@
</select>
</div>
<div class="field-group">
<label class="checkbox">
<input type="checkbox" id="graphFullToggle" name="full_graph" />
Attempt entire reference graph
</label>
<p class="field-hint">
Includes every video that references another (ignores depth; may be slow). Max nodes still
applies.
</p>
</div>
<div class="field-group">
<label for="graphLabelSize">Labels</label>
<select id="graphLabelSize" name="label_size">


@@ -7,6 +7,7 @@
const depthInput = document.getElementById("graphDepth");
const maxNodesInput = document.getElementById("graphMaxNodes");
const labelSizeInput = document.getElementById("graphLabelSize");
const fullGraphToggle = document.getElementById("graphFullToggle");
const statusEl = document.getElementById("graphStatus");
const container = document.getElementById("graphContainer");
const isEmbedded =
@@ -133,6 +134,10 @@
let currentDepth = sanitizeDepth(depthInput.value);
let currentMaxNodes = sanitizeMaxNodes(maxNodesInput.value);
let currentSimulation = null;
let currentFullGraph = false;
let currentIncludeExternal = true;
let previousMaxNodesValue = maxNodesInput ? maxNodesInput.value : "200";
function setStatus(message, isError = false) {
if (!statusEl) return;
@@ -148,11 +153,61 @@
return (value || "").trim();
}
async function fetchGraph(videoId, depth, maxNodes) {
function isFullGraphMode(forceValue) {
if (typeof forceValue === "boolean") {
return forceValue;
}
return fullGraphToggle ? !!fullGraphToggle.checked : false;
}
function applyFullGraphState(forceValue) {
const enabled = isFullGraphMode(forceValue);
if (typeof forceValue === "boolean" && fullGraphToggle) {
fullGraphToggle.checked = forceValue;
}
if (depthInput) {
depthInput.disabled = enabled;
}
if (maxNodesInput) {
if (enabled) {
previousMaxNodesValue = maxNodesInput.value || previousMaxNodesValue || "200";
maxNodesInput.value = "0";
maxNodesInput.disabled = true;
} else {
if (maxNodesInput.disabled) {
maxNodesInput.value = previousMaxNodesValue || "200";
}
maxNodesInput.disabled = false;
}
}
if (videoInput) {
if (enabled) {
videoInput.removeAttribute("required");
} else {
videoInput.setAttribute("required", "required");
}
}
}
async function fetchGraph(
videoId,
depth,
maxNodes,
fullGraphMode = false,
includeExternal = true
) {
const params = new URLSearchParams();
params.set("video_id", videoId);
params.set("depth", String(depth));
params.set("max_nodes", String(maxNodes));
if (videoId) {
params.set("video_id", videoId);
}
if (fullGraphMode) {
params.set("full_graph", "1");
params.set("max_nodes", "0");
} else {
params.set("depth", String(depth));
params.set("max_nodes", String(maxNodes));
}
params.set("external", includeExternal ? "1" : "0");
const response = await fetch(`/api/graph?${params.toString()}`);
if (!response.ok) {
const errorPayload = await response.json().catch(() => ({}));
@@ -320,7 +375,10 @@
})
.on("contextmenu", (event, d) => {
event.preventDefault();
loadGraph(d.id, currentDepth, currentMaxNodes, { updateInputs: true });
loadGraph(d.id, currentDepth, currentMaxNodes, {
updateInputs: true,
includeExternal: currentIncludeExternal,
});
});
nodeSelection
@@ -399,24 +457,44 @@
currentSimulation = simulation;
}
async function loadGraph(videoId, depth, maxNodes, { updateInputs = false } = {}) {
async function loadGraph(
videoId,
depth,
maxNodes,
{ updateInputs = false, fullGraph, includeExternal } = {}
) {
const wantsFull = isFullGraphMode(
typeof fullGraph === "boolean" ? fullGraph : undefined
);
const includeFlag =
typeof includeExternal === "boolean" ? includeExternal : currentIncludeExternal;
currentIncludeExternal = includeFlag;
const sanitizedId = sanitizeId(videoId);
if (!sanitizedId) {
if (!wantsFull && !sanitizedId) {
setStatus("Please enter a video ID.", true);
return;
}
const safeDepth = sanitizeDepth(depth);
const safeMaxNodes = sanitizeMaxNodes(maxNodes);
const safeDepth = wantsFull ? currentDepth || 1 : sanitizeDepth(depth);
const safeMaxNodes = wantsFull ? 0 : sanitizeMaxNodes(maxNodes);
if (updateInputs) {
videoInput.value = sanitizedId;
depthInput.value = String(safeDepth);
depthInput.value = String(wantsFull ? currentDepth || 1 : safeDepth);
maxNodesInput.value = String(safeMaxNodes);
applyFullGraphState(wantsFull);
} else {
applyFullGraphState();
}
setStatus("Loading graph…");
setStatus(wantsFull ? "Loading full reference graph…" : "Loading graph…");
try {
const data = await fetchGraph(sanitizedId, safeDepth, safeMaxNodes);
const data = await fetchGraph(
sanitizedId,
safeDepth,
safeMaxNodes,
wantsFull,
includeFlag
);
if (!data.nodes || data.nodes.length === 0) {
setStatus("No nodes returned for this video.", true);
container.innerHTML = "";
@@ -428,12 +506,22 @@
currentGraphData = data;
currentDepth = safeDepth;
currentMaxNodes = safeMaxNodes;
currentFullGraph = wantsFull;
renderGraph(data, getLabelSize());
renderLegend(data.nodes);
setStatus(
`Showing ${data.nodes.length} nodes and ${data.links.length} links (depth ${data.depth})`
`Showing ${data.nodes.length} nodes and ${data.links.length} links (${
data.meta?.mode === "full" ? "full graph" : `depth ${data.depth}`
})`
);
updateUrlState(
sanitizedId,
safeDepth,
safeMaxNodes,
getLabelSize(),
wantsFull,
includeFlag
);
updateUrlState(sanitizedId, safeDepth, safeMaxNodes, getLabelSize());
} catch (err) {
console.error(err);
setStatus(err.message || "Failed to build graph.", true);
@@ -448,6 +536,8 @@
event.preventDefault();
await loadGraph(videoInput.value, depthInput.value, maxNodesInput.value, {
updateInputs: true,
fullGraph: isFullGraphMode(),
includeExternal: currentIncludeExternal,
});
}
@@ -559,14 +649,37 @@
}
}
function updateUrlState(videoId, depth, maxNodes, labelSize) {
function updateUrlState(
videoId,
depth,
maxNodes,
labelSize,
fullGraphMode,
includeExternal
) {
if (isEmbedded) {
return;
}
const next = new URL(window.location.href);
next.searchParams.set("video_id", videoId);
next.searchParams.set("depth", String(depth));
next.searchParams.set("max_nodes", String(maxNodes));
if (videoId) {
next.searchParams.set("video_id", videoId);
} else {
next.searchParams.delete("video_id");
}
if (fullGraphMode) {
next.searchParams.set("full_graph", "1");
next.searchParams.delete("depth");
next.searchParams.set("max_nodes", "0");
} else {
next.searchParams.set("depth", String(depth));
next.searchParams.delete("full_graph");
next.searchParams.set("max_nodes", String(maxNodes));
}
if (!includeExternal) {
next.searchParams.set("external", "0");
} else {
next.searchParams.delete("external");
}
if (labelSize && labelSize !== "normal") {
next.searchParams.set("label_size", labelSize);
} else {
@@ -579,27 +692,52 @@
const params = new URLSearchParams(window.location.search);
const videoId = sanitizeId(params.get("video_id"));
const depth = sanitizeDepth(params.get("depth") || "");
const maxNodes = sanitizeMaxNodes(params.get("max_nodes") || "");
const rawMaxNodes = params.get("max_nodes");
let maxNodes = sanitizeMaxNodes(rawMaxNodes || "");
if (rawMaxNodes && rawMaxNodes.trim() === "0") {
maxNodes = 0;
}
const labelSizeParam = params.get("label_size");
const fullGraphParam = params.get("full_graph");
const viewFull =
fullGraphParam && ["1", "true", "yes"].includes(fullGraphParam.toLowerCase());
const externalParam = params.get("external");
const includeExternal =
!externalParam ||
!["0", "false", "no"].includes(externalParam.toLowerCase());
currentIncludeExternal = includeExternal;
if (videoId) {
videoInput.value = videoId;
}
depthInput.value = String(depth);
maxNodesInput.value = String(maxNodes);
maxNodesInput.value = String(viewFull ? 0 : maxNodes);
if (fullGraphToggle) {
fullGraphToggle.checked = !!viewFull;
}
applyFullGraphState();
if (labelSizeParam && isValidLabelSize(labelSizeParam)) {
setLabelSizeInput(labelSizeParam);
} else {
setLabelSizeInput(getLabelSize());
}
if (!videoId || isEmbedded) {
if ((isEmbedded && !viewFull) || (!videoId && !viewFull)) {
return;
}
loadGraph(videoId, depth, maxNodes, { updateInputs: false });
loadGraph(videoId, depth, maxNodes, {
updateInputs: false,
fullGraph: viewFull,
includeExternal,
});
}
resizeContainer();
window.addEventListener("resize", resizeContainer);
form.addEventListener("submit", handleSubmit);
if (fullGraphToggle) {
fullGraphToggle.addEventListener("change", () => {
applyFullGraphState();
});
}
labelSizeInput.addEventListener("change", () => {
const size = getLabelSize();
if (currentGraphData) {
@@ -610,7 +748,9 @@
sanitizeId(videoInput.value),
currentDepth,
currentMaxNodes,
size
size,
currentFullGraph,
currentIncludeExternal
);
});
initFromQuery();
@@ -619,8 +759,34 @@
load(videoId, depth, maxNodes, options = {}) {
const targetDepth = depth != null ? depth : currentDepth;
const targetMax = maxNodes != null ? maxNodes : currentMaxNodes;
const explicitFull =
typeof options.fullGraph === "boolean"
? options.fullGraph
: undefined;
if (fullGraphToggle && typeof explicitFull === "boolean") {
fullGraphToggle.checked = explicitFull;
}
applyFullGraphState(
typeof explicitFull === "boolean" ? explicitFull : undefined
);
const fullFlag =
typeof explicitFull === "boolean"
? explicitFull
: isFullGraphMode();
const explicitInclude =
typeof options.includeExternal === "boolean"
? options.includeExternal
: undefined;
if (typeof explicitInclude === "boolean") {
currentIncludeExternal = explicitInclude;
}
return loadGraph(videoId, targetDepth, targetMax, {
updateInputs: options.updateInputs !== false,
fullGraph: fullFlag,
includeExternal:
typeof explicitInclude === "boolean"
? explicitInclude
: currentIncludeExternal,
});
},
setLabelSize(size) {
@@ -659,8 +825,14 @@
labelSize: getLabelSize(),
nodes: currentGraphData ? currentGraphData.nodes.slice() : [],
links: currentGraphData ? currentGraphData.links.slice() : [],
fullGraph: currentFullGraph,
includeExternal: currentIncludeExternal,
};
},
setIncludeExternal(value) {
if (typeof value !== "boolean") return;
currentIncludeExternal = value;
},
isEmbedded,
});
GraphUI.ready = true;
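
The `initFromQuery` changes above read two boolean flags with opposite defaults: `full_graph` is off unless the value is one of `1`, `true`, `yes`, while `external` is on unless the value is one of `0`, `false`, `no`. A small sketch of that parsing in isolation; the function name `parseGraphFlags` is hypothetical, and the logic is a restatement of the diff rather than the project's actual helper:

```javascript
// Hypothetical helper (not in the diff): restates the flag parsing from
// initFromQuery so the two defaults can be compared side by side.
function parseGraphFlags(search) {
  const params = new URLSearchParams(search);
  const fullRaw = params.get("full_graph");
  const externalRaw = params.get("external");
  return {
    // Opt-in: a missing or unrecognized value means "not full graph".
    fullGraph: !!fullRaw && ["1", "true", "yes"].includes(fullRaw.toLowerCase()),
    // Opt-out: a missing or unrecognized value means "include external".
    includeExternal: !externalRaw || !["0", "false", "no"].includes(externalRaw.toLowerCase()),
  };
}
```

This asymmetry matches `updateUrlState`, which writes `external=0` only when external items are excluded and removes the parameter otherwise.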


@@ -5,9 +5,9 @@
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>TLC Search</title>
<link rel="icon" href="/static/favicon.png" type="image/png" />
<link rel="stylesheet" href="https://unpkg.com/xp.css" />
<link rel="stylesheet" href="https://unpkg.com/xp.css" integrity="sha384-isKk8ZXKlU28/m3uIrnyTfuPaamQIF4ONLeGSfsWGEe3qBvaeLU5wkS4J7cTIwxI" crossorigin="anonymous" />
<link rel="stylesheet" href="/static/style.css" />
<script src="https://cdn.jsdelivr.net/npm/d3@7/dist/d3.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/d3@7/dist/d3.min.js" integrity="sha384-CjloA8y00+1SDAUkjs099PVfnY2KmDC2BZnws9kh8D/lX1s46w6EPhpXdqMfjK6i" crossorigin="anonymous"></script>
</head>
<body>
<div class="window" style="max-width: 1200px; margin: 20px auto;">
@@ -21,11 +21,23 @@
</div>
</div>
<div class="window-body">
<div class="window-actions">
<a
id="rssButton"
class="rss-button"
href="/rss"
target="_blank"
rel="noopener"
title="Unified RSS feed"
aria-label="Unified RSS feed"
>
<svg class="rss-button__icon" viewBox="0 0 24 24" aria-hidden="true">
<path d="M6 18a2 2 0 1 0 0 4a2 2 0 0 0 0-4zm-4 6a4 4 0 0 1 4-4a4 4 0 0 1 4 4h-2a2 2 0 0 0-2-2a2 2 0 0 0-2 2zm0-8v-2c6.627 0 12 5.373 12 12h-2c0-5.523-4.477-10-10-10zm0-4V4c11.046 0 20 8.954 20 20h-2c0-9.941-8.059-18-18-18z"/>
</svg>
<span class="rss-button__label">RSS</span>
</a>
</div>
<p>Enter a phrase to query title, description, and transcript text.</p>
<p style="font-size: 11px;">
Looking for semantic matches? Try the
<a href="/vector-search">vector search beta</a>.
</p>
<fieldset>
<legend>Search</legend>
@@ -81,6 +93,12 @@
<span class="toggle-help">Boost exact phrases inside transcripts.</span>
</div>
<div class="toggle-item">
<input type="checkbox" id="externalToggle" />
<label for="externalToggle">External</label>
<span class="toggle-help">Include externally referenced items.</span>
</div>
<div class="toggle-item">
<input type="checkbox" id="queryStringToggle" />
<label for="queryStringToggle">Query string mode</label>
@@ -127,6 +145,15 @@
<p>Use the toggles to choose exact, fuzzy, or phrase matching. Query string mode accepts raw Lucene syntax.</p>
<p>Results are ranked by your chosen sort order; the timeline summarizes the same query.</p>
<p>You can download transcripts, copy MLA citations, or explore references via the graph button.</p>
<div class="about-panel__section">
<div class="about-panel__label">Unified RSS feed</div>
<a id="rssFeedLink" href="#" target="_blank" rel="noopener">Loading…</a>
</div>
<div class="about-panel__section">
<div class="about-panel__label">Channel list</div>
<a id="channelListLink" href="/api/channel-list" target="_blank" rel="noopener">View JSON</a>
<div id="channelCount" class="about-panel__meta"></div>
</div>
</div>
</div>
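
The `<link>` and `<script>` tags in this file gain `integrity` attributes (Subresource Integrity). As a rough sketch of how such a value can be derived with Node.js, assuming the file contents are already available as a string or `Buffer`; the function name `sriHash` is hypothetical, and this does not reproduce how the hashes in the diff were actually generated:

```javascript
const crypto = require("crypto");

// Hypothetical helper (not in the diff): computes an SRI value of the form
// "<algorithm>-<base64 digest>" for the given file contents.
function sriHash(content, algorithm = "sha384") {
  const digest = crypto.createHash(algorithm).update(content).digest("base64");
  return `${algorithm}-${digest}`;
}
```

Note that the diff pins hashes against floating URLs (`https://unpkg.com/xp.css` and `d3@7`). If the CDN starts serving a newer build under the same URL, the hash will no longer match and the browser will refuse to load the resource, so pinning an exact version alongside the hash may be worth considering.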

61
static/notes.html Normal file

@@ -0,0 +1,61 @@
<!doctype html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Notes</title>
<link rel="icon" href="/static/favicon.png" type="image/png" />
<link rel="stylesheet" href="https://unpkg.com/xp.css" integrity="sha384-isKk8ZXKlU28/m3uIrnyTfuPaamQIF4ONLeGSfsWGEe3qBvaeLU5wkS4J7cTIwxI" crossorigin="anonymous" />
<link rel="stylesheet" href="/static/style.css" />
<style>
.notes-content {
line-height: 1.6;
}
.notes-content h2 {
margin-top: 1.5em;
margin-bottom: 0.5em;
border-bottom: 1px solid #ccc;
padding-bottom: 0.25em;
}
.notes-content h2:first-child {
margin-top: 0;
}
.notes-content p {
margin: 0.75em 0;
}
.notes-content ul, .notes-content ol {
margin: 0.75em 0;
padding-left: 1.5em;
}
.notes-content li {
margin: 0.25em 0;
}
</style>
</head>
<body>
<div class="window" style="max-width: 800px; margin: 20px auto;">
<div class="title-bar">
<div class="title-bar-text">Notes</div>
<div class="title-bar-controls">
<button aria-label="Minimize"></button>
<button aria-label="Maximize"></button>
<button aria-label="Close"></button>
</div>
</div>
<div class="window-body">
<p style="margin-bottom: 16px;"><a href="/">← Back to search</a></p>
<div class="notes-content">
<h2>Welcome</h2>
<p>This is a space for thoughts, observations, and notes related to this project and beyond.</p>
<!-- Add your notes below -->
</div>
</div>
<div class="status-bar">
<p class="status-bar-field">Last updated: January 2026</p>
</div>
</div>
</body>
</html>


@@ -196,6 +196,13 @@ body.dimmed {
font-weight: bold;
}
.graph-controls .field-hint {
font-size: 10px;
color: #3c3c3c;
margin: 0;
max-width: 280px;
}
.graph-controls input,
.graph-controls select {
min-width: 160px;
@@ -503,6 +510,22 @@ body.modal-open {
color: #000;
}
.about-panel__section {
margin-top: 8px;
padding-top: 6px;
border-top: 1px solid #c0c0c0;
}
.about-panel__label {
font-weight: bold;
margin-bottom: 2px;
}
.about-panel__meta {
font-size: 10px;
color: #555;
}
.about-panel__header button {
border: none;
background: transparent;
@@ -542,6 +565,50 @@ body.modal-open {
box-sizing: border-box;
}
.window-actions {
display: flex;
justify-content: flex-end;
margin-bottom: 6px;
}
.rss-button {
display: inline-flex;
align-items: center;
gap: 4px;
padding: 2px 6px;
border: 1px solid;
border-color: ButtonHighlight ButtonShadow ButtonShadow ButtonHighlight;
background: ButtonFace;
color: #000;
text-decoration: none;
font-size: 11px;
cursor: pointer;
}
.rss-button:hover {
background: #f3f3f3;
}
.rss-button:active {
border-color: ButtonShadow ButtonHighlight ButtonHighlight ButtonShadow;
}
.rss-button.is-disabled {
opacity: 0.5;
cursor: default;
pointer-events: none;
}
.rss-button__icon {
width: 14px;
height: 14px;
fill: #f38b00;
}
.rss-button__label {
font-weight: bold;
}
/* Badges */
.badge-row {
margin-top: 6px;
@@ -571,6 +638,12 @@ body.modal-open {
background: #8f4bff;
}
.badge--external {
background: #f5d08a;
color: #000;
border: 1px solid #cfa74f;
}
.badge-clickable {
cursor: pointer;
}


@@ -1,46 +0,0 @@
<!doctype html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>TLC Vector Search</title>
<link rel="icon" href="/static/favicon.png" type="image/png" />
<link rel="stylesheet" href="https://unpkg.com/xp.css" />
<link rel="stylesheet" href="/static/style.css" />
</head>
<body>
<div class="window" style="max-width: 1200px; margin: 20px auto;">
<div class="title-bar">
<div class="title-bar-text">Vector Search (Experimental)</div>
<div class="title-bar-controls">
<a class="title-bar-link" href="/">⬅ Back to Search</a>
</div>
</div>
<div class="window-body">
<p>Enter a natural language prompt; results come from the Qdrant vector index.</p>
<fieldset>
<legend>Vector Query</legend>
<div class="field-row" style="margin-bottom: 8px;">
<label for="vectorQuery" style="width: 60px;">Query:</label>
<input id="vectorQuery" type="text" placeholder="Describe what you are looking for" style="flex: 1;" />
<button id="vectorSearchBtn">Search</button>
</div>
</fieldset>
<div id="vectorMeta" style="margin-top: 12px; font-size: 11px;"></div>
<fieldset style="margin-top: 16px;">
<legend>Results</legend>
<div id="vectorResults"></div>
</fieldset>
</div>
<div class="status-bar">
<p class="status-bar-field">Experimental mode • Qdrant</p>
</div>
</div>
<script src="/static/vector.js"></script>
</body>
</html>


@@ -1,423 +0,0 @@
(() => {
const queryInput = document.getElementById("vectorQuery");
const searchBtn = document.getElementById("vectorSearchBtn");
const resultsDiv = document.getElementById("vectorResults");
const metaDiv = document.getElementById("vectorMeta");
const transcriptCache = new Map();
if (!queryInput || !searchBtn || !resultsDiv || !metaDiv) {
console.error("Vector search elements missing");
return;
}
/** Utility helpers **/
const escapeHtml = (str) =>
(str || "").replace(/[&<>"']/g, (ch) => {
switch (ch) {
case "&":
return "&amp;";
case "<":
return "&lt;";
case ">":
return "&gt;";
case '"':
return "&quot;";
case "'":
return "&#39;";
default:
return ch;
}
});
const fmtDate = (value) => {
try {
return (value || "").split("T")[0];
} catch {
return value;
}
};
const fmtSimilarity = (score) => {
if (typeof score !== "number" || Number.isNaN(score)) return "";
return score.toFixed(3);
};
const getVideoStatus = (item) =>
(item && item.video_status ? String(item.video_status).toLowerCase() : "");
const isLikelyDeleted = (item) => getVideoStatus(item) === "deleted";
const formatTimestamp = (seconds) => {
if (!seconds && seconds !== 0) return "00:00";
const hours = Math.floor(seconds / 3600);
const mins = Math.floor((seconds % 3600) / 60);
const secs = Math.floor(seconds % 60);
if (hours > 0) {
return `${hours}:${mins.toString().padStart(2, "0")}:${secs
.toString()
.padStart(2, "0")}`;
}
return `${mins}:${secs.toString().padStart(2, "0")}`;
};
const formatSegmentTimestamp = (segment) => {
if (!segment) return "";
if (segment.timestamp) return segment.timestamp;
const fields = [
segment.start_seconds,
segment.start,
segment.offset,
segment.time,
];
for (const value of fields) {
if (value == null) continue;
const num = parseFloat(value);
if (!Number.isNaN(num)) {
return formatTimestamp(num);
}
}
return "";
};
const serializeTranscriptSection = (label, parts, fullText) => {
let content = "";
if (typeof fullText === "string" && fullText.trim()) {
content = fullText.trim();
} else if (Array.isArray(parts) && parts.length) {
content = parts
.map((segment) => {
const ts = formatSegmentTimestamp(segment);
const text = segment && segment.text ? segment.text : "";
return ts ? `[${ts}] ${text}` : text;
})
.join("\n")
.trim();
}
if (!content) return "";
return `${label}\n${content}\n`;
};
const fetchTranscriptData = async (videoId) => {
if (!videoId) return null;
if (transcriptCache.has(videoId)) {
return transcriptCache.get(videoId);
}
const res = await fetch(`/api/transcript?video_id=${encodeURIComponent(videoId)}`);
if (!res.ok) {
throw new Error(`Transcript fetch failed (${res.status})`);
}
const data = await res.json();
transcriptCache.set(videoId, data);
return data;
};
const buildTranscriptDownloadText = (item, transcriptData) => {
const lines = [];
lines.push(`Title: ${item.title || "Untitled"}`);
if (item.channel_name) lines.push(`Channel: ${item.channel_name}`);
if (item.date) lines.push(`Published: ${item.date}`);
if (item.url) lines.push(`URL: ${item.url}`);
lines.push("");
const primaryText = serializeTranscriptSection(
"Primary Transcript",
transcriptData.transcript_parts,
transcriptData.transcript_full
);
const secondaryText = serializeTranscriptSection(
"Secondary Transcript",
transcriptData.transcript_secondary_parts,
transcriptData.transcript_secondary_full
);
if (primaryText) lines.push(primaryText);
if (secondaryText) lines.push(secondaryText);
if (!primaryText && !secondaryText) {
lines.push("No transcript available.");
}
return lines.join("\n").trim() + "\n";
};
const flashButtonMessage = (button, message, duration = 1800) => {
if (!button) return;
const original = button.dataset.originalLabel || button.textContent;
button.dataset.originalLabel = original;
button.textContent = message;
setTimeout(() => {
button.textContent = button.dataset.originalLabel || original;
}, duration);
};
const handleTranscriptDownload = async (item, button) => {
if (!item.video_id) return;
button.disabled = true;
try {
const transcriptData = await fetchTranscriptData(item.video_id);
if (!transcriptData) throw new Error("Transcript unavailable");
const text = buildTranscriptDownloadText(item, transcriptData);
const blob = new Blob([text], { type: "text/plain" });
const url = URL.createObjectURL(blob);
const link = document.createElement("a");
link.href = url;
link.download = `${item.video_id}.txt`;
document.body.appendChild(link);
link.click();
document.body.removeChild(link);
URL.revokeObjectURL(url);
flashButtonMessage(button, "Downloaded");
} catch (err) {
console.error("Download failed", err);
alert("Unable to download transcript right now.");
} finally {
button.disabled = false;
}
};
const formatMlaDate = (value) => {
if (!value) return "n.d.";
const parsed = new Date(value);
if (Number.isNaN(parsed.valueOf())) return value;
const months = [
"Jan.", "Feb.", "Mar.", "Apr.", "May", "June",
"July", "Aug.", "Sept.", "Oct.", "Nov.", "Dec.",
];
return `${parsed.getDate()} ${months[parsed.getMonth()]} ${parsed.getFullYear()}`;
};
const buildMlaCitation = (item) => {
const channel = (item.channel_name || item.channel_id || "Unknown").trim();
const title = (item.title || "Untitled").trim();
const url = item.url || "";
const publishDate = formatMlaDate(item.date);
const today = formatMlaDate(new Date().toISOString().split("T")[0]);
return `${channel}. "${title}." YouTube, uploaded by ${channel}, ${publishDate}, ${url}. Accessed ${today}.`;
};
const handleCopyCitation = async (item, button) => {
const citation = buildMlaCitation(item);
try {
if (navigator.clipboard && window.isSecureContext) {
await navigator.clipboard.writeText(citation);
} else {
const textarea = document.createElement("textarea");
textarea.value = citation;
textarea.style.position = "fixed";
textarea.style.opacity = "0";
document.body.appendChild(textarea);
textarea.select();
document.execCommand("copy");
document.body.removeChild(textarea);
}
flashButtonMessage(button, "Copied!");
} catch (err) {
console.error("Citation copy failed", err);
alert(citation);
}
};
/** Rendering helpers **/
const createHighlightRows = (entries) => {
if (!Array.isArray(entries) || !entries.length) return null;
const container = document.createElement("div");
container.className = "transcript highlight-list";
entries.forEach((entry) => {
if (!entry) return;
const row = document.createElement("div");
row.className = "highlight-row";
const textBlock = document.createElement("div");
textBlock.className = "highlight-text";
const html = entry.html || entry.text || entry;
textBlock.innerHTML = html || "";
row.appendChild(textBlock);
const indicator = document.createElement("span");
indicator.className = "highlight-source-indicator highlight-source-indicator--primary";
indicator.title = "Vector highlight";
row.appendChild(indicator);
container.appendChild(row);
});
return container;
};
const createActions = (item) => {
const actions = document.createElement("div");
actions.className = "result-actions";
const downloadBtn = document.createElement("button");
downloadBtn.type = "button";
downloadBtn.className = "result-action-btn";
downloadBtn.textContent = "Download transcript";
downloadBtn.addEventListener("click", () => handleTranscriptDownload(item, downloadBtn));
actions.appendChild(downloadBtn);
const citationBtn = document.createElement("button");
citationBtn.type = "button";
citationBtn.className = "result-action-btn";
citationBtn.textContent = "Copy citation";
citationBtn.addEventListener("click", () => handleCopyCitation(item, citationBtn));
actions.appendChild(citationBtn);
const graphBtn = document.createElement("button");
graphBtn.type = "button";
graphBtn.className = "result-action-btn graph-launch-btn";
graphBtn.textContent = "Graph";
graphBtn.disabled = !item.video_id;
graphBtn.addEventListener("click", () => {
if (!item.video_id) return;
const target = `/graph?video_id=${encodeURIComponent(item.video_id)}`;
window.open(target, "_blank", "noopener");
});
actions.appendChild(graphBtn);
return actions;
};
const renderVectorResults = (payload) => {
resultsDiv.innerHTML = "";
const items = payload.items || [];
if (!items.length) {
metaDiv.textContent = "No vector matches for this prompt.";
return;
}
metaDiv.textContent = `Matches: ${items.length} (vector mode)`;
items.forEach((item) => {
const el = document.createElement("div");
el.className = "item";
const header = document.createElement("div");
header.className = "result-header";
const headerMain = document.createElement("div");
headerMain.className = "result-header-main";
const titleEl = document.createElement("strong");
titleEl.innerHTML = item.titleHtml || escapeHtml(item.title || "Untitled");
headerMain.appendChild(titleEl);
const metaLine = document.createElement("div");
metaLine.className = "muted result-meta";
const channelLabel = item.channel_name || item.channel_id || "Unknown";
const dateLabel = fmtDate(item.date);
let durationSeconds = null;
if (typeof item.duration === "number") {
durationSeconds = item.duration;
} else if (typeof item.duration === "string" && item.duration.trim()) {
const parsed = parseFloat(item.duration);
if (!Number.isNaN(parsed)) {
durationSeconds = parsed;
}
}
const durationLabel = durationSeconds != null ? `${formatTimestamp(durationSeconds)}` : "";
metaLine.textContent = channelLabel ? `${channelLabel}${dateLabel}${durationLabel}` : `${dateLabel}${durationLabel}`;
if (isLikelyDeleted(item)) {
metaLine.appendChild(document.createTextNode(" "));
const statusEl = document.createElement("span");
statusEl.className = "result-status result-status--deleted";
statusEl.textContent = "Likely deleted";
metaLine.appendChild(statusEl);
}
headerMain.appendChild(metaLine);
if (item.url) {
const linkLine = document.createElement("div");
linkLine.className = "muted";
const anchor = document.createElement("a");
anchor.href = item.url;
anchor.target = "_blank";
anchor.rel = "noopener";
anchor.textContent = "Open on YouTube";
linkLine.appendChild(anchor);
headerMain.appendChild(linkLine);
}
if (typeof item.distance === "number") {
const scoreLine = document.createElement("div");
scoreLine.className = "muted";
scoreLine.textContent = `Similarity score: ${fmtSimilarity(item.distance)}`;
headerMain.appendChild(scoreLine);
}
header.appendChild(headerMain);
header.appendChild(createActions(item));
el.appendChild(header);
if (item.descriptionHtml || item.description) {
const desc = document.createElement("div");
desc.className = "muted description-block";
desc.innerHTML = item.descriptionHtml || escapeHtml(item.description);
el.appendChild(desc);
}
if (item.chunkText) {
const chunkBlock = document.createElement("div");
chunkBlock.className = "vector-chunk";
if (item.chunkTimestamp && item.url) {
const tsObj =
typeof item.chunkTimestamp === "object"
? item.chunkTimestamp
: { timestamp: item.chunkTimestamp };
const ts = formatSegmentTimestamp(tsObj);
const tsLink = document.createElement("a");
const paramValue =
typeof item.chunkTimestamp === "number"
? Math.floor(item.chunkTimestamp)
: item.chunkTimestamp;
tsLink.href = `${item.url}${item.url.includes("?") ? "&" : "?"}t=${encodeURIComponent(
paramValue
)}`;
tsLink.target = "_blank";
tsLink.rel = "noopener";
tsLink.textContent = ts ? `[${ts}]` : "[timestamp]";
chunkBlock.appendChild(tsLink);
chunkBlock.appendChild(document.createTextNode(" "));
}
const chunkTextSpan = document.createElement("span");
chunkTextSpan.textContent = item.chunkText;
chunkBlock.appendChild(chunkTextSpan);
el.appendChild(chunkBlock);
}
const highlights = createHighlightRows(item.toHighlight);
if (highlights) {
el.appendChild(highlights);
}
resultsDiv.appendChild(el);
});
};
/** Search handler **/
const runVectorSearch = async () => {
const query = queryInput.value.trim();
if (!query) {
alert("Please enter a query.");
return;
}
metaDiv.textContent = "Searching vector index…";
resultsDiv.innerHTML = "";
searchBtn.disabled = true;
try {
const res = await fetch("/api/vector-search", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ query }),
});
if (!res.ok) {
throw new Error(`Vector search failed (${res.status})`);
}
const data = await res.json();
if (data.error) {
metaDiv.textContent = "Vector search unavailable.";
return;
}
renderVectorResults(data);
} catch (err) {
console.error(err);
metaDiv.textContent = "Vector search unavailable.";
} finally {
searchBtn.disabled = false;
}
};
searchBtn.addEventListener("click", runVectorSearch);
queryInput.addEventListener("keypress", (event) => {
if (event.key === "Enter") {
runVectorSearch();
}
});
})();

78
urls.txt Normal file

@@ -0,0 +1,78 @@
https://www.youtube.com/channel/UCCebR16tXbv5Ykk9_WtCCug/videos
https://www.youtube.com/channel/UC6vg0HkKKlgsWk-3HfV-vnw/videos
https://www.youtube.com/channel/UCeWWxwzgLYUbfjWowXhVdYw/videos
https://www.youtube.com/channel/UC952hDf_C4nYJdqwK7VzTxA/videos
https://www.youtube.com/channel/UCU5SNBfTo4umhjYz6M0Jsmg/videos
https://www.youtube.com/channel/UC6Tvr9mBXNaAxLGRA_sUSRA/videos
https://www.youtube.com/channel/UC4Rmxg7saTfwIpvq3QEzylQ/videos
https://www.youtube.com/channel/UCTdH4nh6JTcfKUAWvmnPoIQ/videos
https://www.youtube.com/channel/UCsi_x8c12NW9FR7LL01QXKA/videos
https://www.youtube.com/channel/UCAqTQ5yLHHH44XWwWXLkvHQ/videos
https://www.youtube.com/channel/UCprytROeCztMOMe8plyJRMg/videos
https://www.youtube.com/channel/UCpqDUjTsof-kTNpnyWper_Q/videos
https://www.youtube.com/channel/UCL_f53ZEJxp8TtlOkHwMV9Q/videos
https://www.youtube.com/channel/UCez1fzMRGctojfis2lfRYug/videos
https://www.youtube.com/channel/UC2leFZRD0ZlQDQxpR2Zd8oA/videos
https://www.youtube.com/channel/UC8SErJkYnDsYGh1HxoZkl-g/videos
https://www.youtube.com/channel/UCEPOn4cgvrrerg_-q_Ygw1A/videos
https://www.youtube.com/channel/UC2yCyOMUeem-cYwliC-tLJg/videos
https://www.youtube.com/channel/UCGsDIP_K6J6VSTqlq-9IPlg/videos
https://www.youtube.com/channel/UCEzWTLDYmL8soRdQec9Fsjw/videos
https://www.youtube.com/channel/UC1KgNsMdRoIA_njVmaDdHgA/videos
https://www.youtube.com/channel/UCFQ6Gptuq-sLflbJ4YY3Umw/videos
https://www.youtube.com/channel/UCEY1vGNBPsC3dCatZyK3Jkw/videos
https://www.youtube.com/channel/UCIAtCuzdvgNJvSYILnHtdWA/videos
https://www.youtube.com/channel/UClIDP7_Kzv_7tDQjTv9EhrA/videos
https://www.youtube.com/channel/UC-QiBn6GsM3JZJAeAQpaGAA/videos
https://www.youtube.com/channel/UCiJmdXTb76i8eIPXdJyf8ZQ/videos
https://www.youtube.com/channel/UCM9Z05vuQhMEwsV03u6DrLA/videos
https://www.youtube.com/channel/UCgp_r6WlBwDSJrP43Mz07GQ/videos
https://www.youtube.com/channel/UC5uv-BxzCrN93B_5qbOdRWw/videos
https://www.youtube.com/channel/UCtCTSf3UwRU14nYWr_xm-dQ/videos
https://www.youtube.com/channel/UC1a4VtU_SMSfdRiwMJR33YQ/videos
https://www.youtube.com/channel/UCg7Ed0lecvko58ibuX1XHng/videos
https://www.youtube.com/channel/UCMVG5eqpYFVEB-a9IqAOuHA/videos
https://www.youtube.com/channel/UC8mJqpS_EBbMcyuzZDF0TEw/videos
https://www.youtube.com/channel/UCGHuURJ1XFHzPSeokf6510A/videos
https://www.youtube.com/@chrishoward8473/videos
https://www.youtube.com/channel/UChptV-kf8lnncGh7DA2m8Pw/videos
https://www.youtube.com/channel/UCzX6R3ZLQh5Zma_5AsPcqPA/videos
https://www.youtube.com/channel/UCiukuaNd_qzRDTW9qe2OC1w/videos
https://www.youtube.com/channel/UC5yLuFQCms4nb9K2bGQLqIw/videos
https://www.youtube.com/channel/UCVdSgEf9bLXFMBGSMhn7x4Q/videos
https://www.youtube.com/channel/UC_dnk5D4tFCRYCrKIcQlcfw/videos
https://www.youtube.com/@Freerilian/videos
https://www.youtube.com/@marks.-ry7bm/videos
https://www.youtube.com/@Adams-Fall/videos
https://www.youtube.com/@mcmosav/videos
https://www.youtube.com/@Landbeorht/videos
https://www.youtube.com/@Corner_Citizen/videos
https://www.youtube.com/@ethan.caughey/videos
https://www.youtube.com/@MarcInTbilisi/videos
https://www.youtube.com/@climbingmt.sophia/videos
https://www.youtube.com/@Skankenstein/videos
https://www.youtube.com/@UpCycleClub/videos
https://www.youtube.com/@JessPurviance/videos
https://www.youtube.com/@greyhamilton52/videos
https://www.youtube.com/@paulrenenichols/videos
https://www.youtube.com/@OfficialSecularKoranism/videos
https://www.youtube.com/@FromWhomAllBlessingsFlow/videos
https://www.youtube.com/@FoodTruckEmily/videos
https://www.youtube.com/@O.G.Rose.Michelle.and.Daniel/videos
https://www.youtube.com/@JonathanDumeer/videos
https://www.youtube.com/@JordanGreenhall/videos
https://www.youtube.com/@NechamaGluck/videos
https://www.youtube.com/@justinsmorningcoffee/videos
https://www.youtube.com/@grahampardun/videos
https://www.youtube.com/@michaelmartin8681/videos
https://www.youtube.com/@davidbusuttil9086/videos
https://www.youtube.com/@matthewparlato5626/videos
https://www.youtube.com/@lancecleaver227/videos
https://www.youtube.com/@theplebistocrat/videos
https://www.youtube.com/@rigelwindsongthurston/videos
https://www.youtube.com/@RightInChrist/videos
https://www.youtube.com/@RafeKelley/videos
https://www.youtube.com/@WavesOfObsession/videos
https://www.youtube.com/@LeviathanForPlay/videos
https://www.youtube.com/channel/UCehAungJpAeC-F3R5FwvvCQ/videos
https://www.youtube.com/channel/UC4YwC5zA9S_2EwthE27Xlew/videos
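
The list above mixes two URL forms: `/channel/<ID>/videos` and `/@<handle>/videos`. A consumer of urls.txt may want to tell them apart. The following is a rough sketch only: `parseChannelUrl` is a hypothetical name, and the pattern is inferred from the entries shown here, not taken from the repository's own tooling.

```javascript
// Hypothetical parser (an assumption, not part of the repository).
// Classifies one urls.txt line as a channel-ID URL or a handle URL.
function parseChannelUrl(line) {
  const match = line
    .trim()
    .match(/^https:\/\/www\.youtube\.com\/(?:channel\/(UC[\w-]+)|@([^\/]+))\/videos$/);
  if (!match) return null; // Not one of the two forms seen in the list.
  return match[1]
    ? { kind: "id", value: match[1] }
    : { kind: "handle", value: match[2] };
}

console.log(parseChannelUrl("https://www.youtube.com/channel/UCehAungJpAeC-F3R5FwvvCQ/videos"));
console.log(parseChannelUrl("https://www.youtube.com/@LeviathanForPlay/videos"));
```

Returning `null` for unrecognized lines lets a caller decide whether to skip or report them.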