From ReadAware
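Ingest a book into the readaware library: extract the body text, parse the volume/chapter structure, and generate a manifest so that read can locate passages in it. Triggered when the user wants to "add a book", "import this epub/txt to read", or "start reading a new book".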
Slash command: /readaware:ingest
Goal: put the user's book into the library `~/.claude/readaware/books/<slug>/`, holding
text.txt (cleaned body text) and manifest.json (which paragraph each volume/chapter
starts at, and the front-matter/back-matter boundaries). Afterwards readaware:read uses
this manifest to precisely locate the passages the user throws at it.
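To make "which paragraph each volume/chapter starts at" concrete, here is a minimal sketch of how a lookup over such a manifest could work. It is not the actual locate.py logic, which this document does not show. The `markers` list and `para_idx` field are named later in this document; the `title` field, the sample values, and the helper `marker_for_paragraph` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical manifest fragment. Only "markers" and "para_idx" are named in
# this document; "title" and the values below are made up for illustration.
manifest = {
    "markers": [
        {"para_idx": 12, "title": "Part One"},
        {"para_idx": 13, "title": "Chapter 1"},
        {"para_idx": 87, "title": "Chapter 2"},
    ]
}


def marker_for_paragraph(manifest, para_idx):
    """Return the last marker that starts at or before para_idx, or None."""
    starts = [m["para_idx"] for m in manifest["markers"]]
    pos = bisect_right(starts, para_idx)  # number of markers starting <= para_idx
    return manifest["markers"][pos - 1] if pos else None


print(marker_for_paragraph(manifest, 40)["title"])  # Chapter 1
```

A binary search like this only works if the marker indices are sorted, which is one reason the ordering of markers matters.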
The scripts live in ${CLAUDE_PLUGIN_ROOT}/scripts/: extract_text.py turns .epub/.html/.txt
into the body format, and build_manifest.py does the structure parsing.
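As a rough picture of what "the body format" means for HTML-like input, the sketch below strips tags and scripts, decodes entities, and emits one line per block using only the standard library. It is an illustration under assumptions, not the code of extract_text.py: the class `BlockTextExtractor`, the function `html_to_lines`, and the choice of block tags are all hypothetical.

```python
from html.parser import HTMLParser

# Assumed set of block-level tags; the real extractor may use a different list.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "blockquote"}
SKIP_TAGS = {"script", "style"}


class BlockTextExtractor(HTMLParser):
    """Collect the text of each block element as one whitespace-normalized line."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # entities are decoded by the parser
        self.lines = []
        self._buf = []
        self._skip = 0

    def flush(self):
        text = " ".join("".join(self._buf).split())  # collapse all whitespace runs
        if text:
            self.lines.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if not self._skip:
            self._buf.append(data)


def html_to_lines(html):
    parser = BlockTextExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text outside a block element
    return parser.lines


sample = "<h1>Chapter 1</h1><p>It was&nbsp;late,\n and <b>cold</b>.</p><script>x=1</script>"
print(html_to_lines(sample))  # ['Chapter 1', 'It was late, and cold.']
```

Inline tags such as `<b>` do not split a paragraph here, so each block ends up as exactly one non-empty line.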
Use the bundled extractor — it's pure-stdlib, no pandoc/Calibre needed, and handles .epub,
.html/.xhtml, and .txt:
python3 "${CLAUDE_PLUGIN_ROOT}/scripts/extract_text.py" input.epub \
-o ~/.claude/readaware/books/<slug>/text.txt
For .epub it walks the spine in reading order, strips tags/scripts, decodes entities, and
collapses each block into one line. It also writes a text.struct.json sidecar from the
epub's own NCX/nav table of contents; build_manifest.py uses that authoritative structure
instead of guessing chapter boundaries from text patterns.
The output has one paragraph per line (build_manifest.py treats every non-empty line as one paragraph). pandoc/ebook-convert remain fine alternatives if the user prefers them.
Pick a slug for the book's directory (e.g. karamazov, brothers-k).
Run mkdir -p ~/.claude/readaware/books/<slug>, copy the body text in, and name it text.txt.
First run --toc to see whether parsing is right, and hand the result to the user to verify:
python3 "${CLAUDE_PLUGIN_ROOT}/scripts/build_manifest.py" \
~/.claude/readaware/books/<slug>/text.txt \
--title "Title" --translator "Translator" --toc
The parser reports a parse mode:
epub-struct: used the epub's own NCX/nav TOC (via the text.struct.json sidecar). Most reliable; this is what you get for a normal epub.
toc: a .txt with an in-text table of contents; parsed it and matched chapters in the body.
scan: a .txt with no TOC; scanned the body directly for headings.
It auto-detects layout either way: Chinese (第X卷/第X章/bare 一 标题) and Western
(Part/Book + Chapter, Arabic/Roman numbering). The volume/part layer is optional
(flat chapter-only books work too).
Check two things: whether the part/chapter counts look right; and (in toc mode) if it warns
"TOC chapter count ≠ body match count", some chapter headings didn't line up and locating will drift.
Most books need no tuning. When an unusual layout doesn't line up, adjust the command-line
parameters and retry (locate.py never moves — the book-specific knowledge stays here):
--part-re: volume/part-title regex(es); group 1 = number, group 2 = title. Defaults cover 第X卷/部/篇 and Part/Book/Volume X.
--chap-re: extra prefixed chapter-title regex(es), appended to the built-in 第X章/Chapter X.
--epilogue: names of unnumbered closing parts, e.g. 尾声, Epilogue.
--front: front-matter section titles; --back: back-matter/afterword titles.
Once --toc looks right, go to the next step.
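Before passing a custom pattern to --part-re, it can help to try it on a few sample lines. The sketch below assumes the option accepts Python `re` syntax (plausible since the scripts are Python, but not confirmed here). The pattern `PART_RE`, the sample lines, and the helper `match_parts` are hypothetical; only the group convention (group 1 = number, group 2 = title) comes from the option description above.

```python
import re

# Hypothetical pattern for headings written "卷一 少年" rather than "第一卷".
# Group 1 captures the number, group 2 the (possibly empty) title.
PART_RE = r"^卷([一二三四五六七八九十百零〇]+)\s*(.*)$"

lines = ["卷一 少年", "他走进了房间。", "卷十二", "第一卷 不匹配"]


def match_parts(lines, pattern=PART_RE):
    """Return (number, title) for every line the pattern recognizes as a part heading."""
    return [(m.group(1), m.group(2)) for line in lines if (m := re.match(pattern, line))]


print(match_parts(lines))  # [('一', '少年'), ('十二', '')]
```

Anchoring with `^` keeps body sentences that merely contain 卷 from being picked up as headings.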
python3 "${CLAUDE_PLUGIN_ROOT}/scripts/build_manifest.py" \
~/.claude/readaware/books/<slug>/text.txt \
-o ~/.claude/readaware/books/<slug>/manifest.json \
--title "Title" --translator "Translator" # carry over any parameters you tuned in step 4
Don't trust the parse by eye. Run the checker:
python3 "${CLAUDE_PLUGIN_ROOT}/scripts/verify_manifest.py" ~/.claude/readaware/books/<slug>
Exit 0 = good (skim any ⚠️ WARN — usually fine). Exit 1 = FAIL: the structure is wrong (citation initials like "V." mistaken for chapter numbers, the whole book collapsed into "one chapter", counts that don't add up, markers out of order…). Fix it before continuing.
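For intuition about the failure classes listed above, here is a simplified sketch of two of them (a collapsed structure and out-of-order markers) mapped to the same 0/1 exit convention. It is not verify_manifest.py, whose checks are not shown in this document; `check_markers` and `exit_code` are hypothetical names, and the marker dicts only use the `para_idx` field.

```python
def check_markers(markers):
    """Return a list of problem descriptions; an empty list means these checks pass."""
    problems = []
    idxs = [m["para_idx"] for m in markers]
    if len(idxs) <= 1:
        problems.append("structure collapsed: at most one marker found")
    for prev, cur in zip(idxs, idxs[1:]):
        if cur <= prev:  # para_idx must be strictly increasing
            problems.append(f"markers out of order: {prev} followed by {cur}")
    return problems


def exit_code(markers):
    """Mirror the documented convention: 0 = good, 1 = FAIL."""
    return 1 if check_markers(markers) else 0


good = [{"para_idx": 12}, {"para_idx": 13}, {"para_idx": 87}]
bad = [{"para_idx": 12}, {"para_idx": 87}, {"para_idx": 13}]
print(exit_code(good), exit_code(bad))  # 0 1
```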
The repair principle: the shared scripts (build_manifest.py, locate.py, extract_text.py)
stay universal — never edit them per book. We can't make one parser fit every book; a book the
defaults can't handle is fixed with data in that book's own directory, then re-verified.
Escalate only as far as you need:
Tune parameters (cheapest). Re-run step 5 with --part-re/--chap-re/--front/--back, then
re-verify. Good when the layout is regular but unusual (卷一 not 第一卷, Book I headings…).
Hand-author the structure sidecar when parameters can't express the layout (chapter numbers
buried in markup, OCR noise, an epub with no usable TOC). You become the parser: read text.txt,
find where each chapter/part actually starts, and write text.struct.json beside it —
build_manifest.py uses it verbatim (mode epub-struct). Format (reading order; depth 1 = part,
2 = chapter; drop parts entirely for a flat book):
{"headings": [
{"para_idx": 12, "title": "Part One", "depth": 1},
{"para_idx": 13, "title": "Chapter 1: Adventurers", "depth": 2}
]}
para_idx is the 0-based index among non-empty lines of text.txt. To find indices, dump the
short (heading-ish) lines with their indices, then pick the real ones:
python3 - ~/.claude/readaware/books/<slug>/text.txt <<'PY'
import sys
# The index counts non-empty lines only, so it matches para_idx.
for i, line in enumerate(p for p in open(sys.argv[1], encoding="utf-8") if p.strip()):
    if len(line.strip()) <= 40:
        print(i, repr(line.strip()[:50]))
PY
Then re-run step 5 (it picks up the sidecar) and re-verify.
Edit manifest.json directly (last resort) for a handful of wrong markers — fix/add/remove
markers[] entries or body_start/body_end, keeping para_idx strictly increasing — re-verify.
Loop until verify_manifest.py passes (or you've reasoned a lone WARN is genuinely fine). Whatever
you wrote — tuned params noted in your summary, or the sidecar/manifest in the book dir — stays with
the book, so re-ingest is reproducible.
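For the hand-authored sidecar route, a small helper can write text.struct.json in the format shown above and catch ordering mistakes before the manifest is rebuilt. This is a sketch under assumptions: `write_struct_sidecar` is a hypothetical helper, and requiring strictly increasing `para_idx` in the sidecar is a precaution borrowed from the manifest rule, not something the sidecar format is stated to enforce.

```python
import json
import tempfile
from pathlib import Path


def write_struct_sidecar(book_dir, headings):
    """Validate (para_idx, title, depth) tuples and write text.struct.json."""
    book_dir = Path(book_dir)
    with open(book_dir / "text.txt", encoding="utf-8") as f:
        n_paragraphs = sum(1 for line in f if line.strip())  # non-empty lines only
    last = -1
    for para_idx, title, depth in headings:
        if depth not in (1, 2):  # 1 = part, 2 = chapter
            raise ValueError(f"unexpected depth {depth} for {title!r}")
        if not last < para_idx < n_paragraphs:
            raise ValueError(f"para_idx {para_idx} for {title!r} is out of order or out of range")
        last = para_idx
    payload = {
        "headings": [{"para_idx": i, "title": t, "depth": d} for i, t, d in headings]
    }
    out = book_dir / "text.struct.json"
    out.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")
    return out


# Demo in a throwaway directory so no real book directory is touched.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "text.txt").write_text(
        "Part One\n\nChapter 1: Adventurers\nBody text.\n", encoding="utf-8"
    )
    out = write_struct_sidecar(tmp, [(0, "Part One", 1), (1, "Chapter 1: Adventurers", 2)])
    written = json.loads(out.read_text(encoding="utf-8"))
    print(written["headings"][1])
```

Keeping the heading list in a script like this, alongside the book, is one way to make the hand-authored structure reproducible on re-ingest.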
In the book dir, write card.md: this book's core motifs, main characters, and structures
worth stopping for. read references it during analysis so the interpretation hugs this
book instead of being generic. Write it from what you know about the book, flag anything
uncertain, and don't make things up.
Record the new book as the active one in ~/.claude/readaware/state.json:
{"active_book": "<slug>"}
(Create the file if it doesn't exist. read defaults to active_book; it can be omitted
when the library has only one book.)
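The create-if-missing update can be sketched as follows. `set_active_book` is a hypothetical helper, and preserving any other keys already in the file is an assumption, since this document only describes the `active_book` key.

```python
import json
import tempfile
from pathlib import Path


def set_active_book(slug, state_path):
    """Set active_book in state.json, creating the file (and parent dirs) if needed."""
    state_path = Path(state_path)
    state = {}
    if state_path.exists():
        # Assumption: keep whatever other keys the file already holds.
        state = json.loads(state_path.read_text(encoding="utf-8"))
    state["active_book"] = slug
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(
        json.dumps(state, ensure_ascii=False, indent=2) + "\n", encoding="utf-8"
    )
    return state


# Demo against a temporary path; the real file would be
# Path.home() / ".claude" / "readaware" / "state.json".
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "readaware" / "state.json"
    first = set_active_book("karamazov", demo_path)
    second = set_active_book("brothers-k", demo_path)
    on_disk = json.loads(demo_path.read_text(encoding="utf-8"))
    print(first, second)
```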
Tell the user: which book was ingested, how many parts/chapters were parsed (and the parse
mode), that it passed verify_manifest.py (and any repair you had to do), where it lives,
and whether a card was written. They can start readaware:read.
npx claudepluginhub ahpxex/readaware.skill