# Scripts

## build-morphology-db.py

See `docs/MORPHEUS-BUILD-GUIDE.md` for the complete script and instructions.

## ingest-nt.ts

Ingests all 27 books of the SBLGNT (MorphGNT edition) into the `passages` table as Biblical Greek passages, with World English Bible translations.

**Prerequisites:** The MorphGNT `.txt` files must be present at `/app/data/morphgnt/` inside the container (one file per book, named `61-Mt.txt` through `87-Re.txt`).

**Run (full NT):**
```sh
docker exec -w /app lector-app npx tsx scripts/ingest-nt.ts
```

**Options:**
```sh
# Ingest specific books only
docker exec -w /app lector-app npx tsx scripts/ingest-nt.ts --books=Matt,John,Rev

# Skip fetching WEB translations (faster; passages will have no translation)
docker exec -w /app lector-app npx tsx scripts/ingest-nt.ts --no-translation
```
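For orientation, the two flags above could be read from the command line roughly as follows. This is a minimal sketch, not the script's actual parser: `IngestOptions` and `parseArgs` are hypothetical names, and the real script may validate or normalize book names differently.

```typescript
// Hypothetical option shape; the real script may track more settings.
interface IngestOptions {
  books: string[] | null; // null means "ingest every book"
  translation: boolean;   // false when --no-translation is passed
}

// Parse the arguments that follow the script name.
function parseArgs(argv: string[]): IngestOptions {
  const opts: IngestOptions = { books: null, translation: true };
  for (const arg of argv) {
    if (arg.startsWith("--books=")) {
      // Split the comma-separated list and drop empty entries.
      opts.books = arg
        .slice("--books=".length)
        .split(",")
        .map((s) => s.trim())
        .filter((s) => s.length > 0);
    } else if (arg === "--no-translation") {
      opts.translation = false;
    }
  }
  return opts;
}

// Typical call site (assumption): parseArgs(process.argv.slice(2))
```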

**What it does:**
- Reads MorphGNT per-word records (`BBCCVV POS PARSING TEXT WORD NORM LEMMA`)
- Groups words into verse-boundary-respecting chunks of 40–160 words (target 90)
- Fetches WEB translations one chapter at a time from bolls.life, throttled to one request every 80 ms
- Stores passages with `language = 'greek'`, `genre_tag = 'biblical'`
- Upserts on `(source_work, section_ref)` so re-running is safe
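The per-word record format listed above can be parsed with a few lines of TypeScript. The sketch below is illustrative only: `MorphGntWord` and `parseMorphGntLine` are hypothetical names, and it assumes the seven columns are separated by whitespace, with `BBCCVV` being two digits each for book, chapter, and verse.

```typescript
// Hypothetical record shape; field names mirror the column list above.
interface MorphGntWord {
  book: number;    // BB
  chapter: number; // CC
  verse: number;   // VV
  pos: string;     // part-of-speech code
  parsing: string; // parsing code
  text: string;    // word as printed, including punctuation
  word: string;    // word with punctuation stripped
  norm: string;    // normalized form
  lemma: string;   // dictionary form
}

// Returns null for blank or malformed lines instead of throwing.
function parseMorphGntLine(line: string): MorphGntWord | null {
  const fields = line.trim().split(/\s+/);
  if (fields.length !== 7) return null;
  const [ref, pos, parsing, text, word, norm, lemma] = fields;
  if (!/^\d{6}$/.test(ref)) return null;
  return {
    book: Number(ref.slice(0, 2)),
    chapter: Number(ref.slice(2, 4)),
    verse: Number(ref.slice(4, 6)),
    pos,
    parsing,
    text,
    word,
    norm,
    lemma,
  };
}
```

Returning `null` for malformed lines lets a caller filter them out rather than abort a whole book on one bad record; whether the real script is this lenient is an assumption.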

**Sources:**
- Greek text: SBLGNT (CC BY 4.0)
- Morphology: MorphGNT (CC BY-SA 3.0)
- Translation: World English Bible (public domain) via bolls.life

## ingest-lxx.ts

Ingests all 55 books of the Greek Septuagint (LXX), Swete edition, as Biblical Greek passages, with Brenton English translations.

**Run (full LXX):**
```sh
docker exec -w /app lector-app npx tsx scripts/ingest-lxx.ts
```

**Options:**
```sh
# Specific books (slug, abbreviation, or full name)
docker exec -w /app lector-app npx tsx scripts/ingest-lxx.ts --books=Gen,Ps,Isa

# Skip translation fetch
docker exec -w /app lector-app npx tsx scripts/ingest-lxx.ts --no-translation
```

**What it does:**
- Fetches Greek text from nathans/lxx-swete on GitHub (CC BY-SA 4.0)
- Groups into verse-boundary-respecting chunks of 40–160 words (target 90)
- Fetches Brenton (1851) translations from the bolls.life LXXE API, throttled to one request every 80 ms
- Attaches morphology from `morphology.db` on a best-effort basis; Morpheus handles the remaining words at read time
- Stores passages with `language = 'greek'`, `genre_tag = 'biblical'`, `period_tag = 'hellenistic'`
- Upserts on `id`, so re-running is safe
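Both ingest scripts describe the same chunking rule: whole verses are grouped into chunks of 40 to 160 words, aiming for about 90. One way such a rule could work is sketched below. The greedy strategy, the `Verse` type, and the `chunkVerses` name are assumptions for illustration; the scripts' actual algorithm may differ.

```typescript
// Hypothetical verse shape: a reference plus its words.
interface Verse {
  ref: string;
  words: string[];
}

const MIN_WORDS = 40;
const TARGET_WORDS = 90;
const MAX_WORDS = 160;

const wordCount = (vs: Verse[]): number =>
  vs.reduce((n, v) => n + v.words.length, 0);

// Greedy grouping: verses are never split. A chunk is closed once it has
// at least MIN_WORDS and adding the next verse would either exceed
// MAX_WORDS or move the total no closer to TARGET_WORDS.
function chunkVerses(verses: Verse[]): Verse[][] {
  const chunks: Verse[][] = [];
  let current: Verse[] = [];
  let count = 0;

  for (const verse of verses) {
    const next = count + verse.words.length;
    const stopIsCloser =
      Math.abs(TARGET_WORDS - count) <= Math.abs(TARGET_WORDS - next);
    if (count >= MIN_WORDS && (next > MAX_WORDS || stopIsCloser)) {
      chunks.push(current);
      current = [];
      count = 0;
    }
    current.push(verse);
    count += verse.words.length;
  }

  if (current.length > 0) {
    const last = chunks[chunks.length - 1];
    // Fold a short tail into the previous chunk when it still fits.
    if (last && count < MIN_WORDS && wordCount(last) + count <= MAX_WORDS) {
      last.push(...current);
    } else {
      chunks.push(current);
    }
  }
  return chunks;
}
```

Because verses are kept whole, this sketch can still produce a chunk outside the 40 to 160 range, for example when a single verse is very long or a book is very short. How the real scripts handle those cases is not stated here.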

**Notes:**
- Ezra-Nehemiah (Esdras_B), Odes, and Psalms of Solomon have no Brenton translation mapped; Greek text is ingested without translation
- Skips Theodotion versions of Daniel, Susanna, and Bel (uses LXX originals)
- Verse 0 entries (chapter superscriptions) are skipped

**Sources:**
- Greek text: Swete LXX via [nathans/lxx-swete](https://github.com/nathans/lxx-swete) (CC BY-SA 4.0)
- Translation: Brenton LXX (1851) via bolls.life LXXE (public domain)

## Word Lists

Place your word lists in this directory:
- `wordlist-greek.txt` — one Greek word per line
- `wordlist-latin.txt` — one Latin word per line

Good sources for core vocabulary:
- DCC Core Greek: https://dcc.dickinson.edu/greek-core-list
- DCC Core Latin: https://dcc.dickinson.edu/latin-core-list
