From sci-brain
Indexes paper collections (Zotero library, PDF folder, or Google Scholar profile) into a structured knowledge base under a project directory.
Slash command: /sci-brain:researchstyle
Turn an existing paper collection into a structured knowledge base under `<project>/.knowledge/` (or an advisor KB). The output uses the same KB format as the `survey` and `download-ref` skills — project and advisor KBs can coexist cleanly.
Step 1 — Identify the researcher and source. First, ask whose papers to index:
"Whose papers should I index? (Give me a name, or leave blank for your own collection.)"
Then ask which source to use:
"Where are the papers?"
- (a) Zotero library
- (b) A PDF folder (give me the path)
- (c) Google Scholar profile (give me the URL)
Note: the Zotero option is only meaningful for indexing your own collection (it's your local DB). For another researcher, choose (b) or (c).
Step 2 — Index the collection.
Zotero:
Locate zotero.sqlite — check in order: ~/Zotero/, ~/Library/Application Support/Zotero/, ~/snap/zotero-snap/common/Zotero/. If not found, use find ~ -maxdepth 4 -name "zotero.sqlite" as fallback. If still not found, ask for the path.
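The lookup order above can be sketched in Python as follows. This is a minimal illustration, not part of the bundled script: the function name `find_zotero_db` and the `home` parameter are hypothetical, and the fallback only approximates `find ~ -maxdepth 4`.

```python
import os
from pathlib import Path
from typing import Optional

# Candidate data directories, relative to the home directory, in the order listed above.
CANDIDATE_DIRS = [
    "Zotero",
    "Library/Application Support/Zotero",
    "snap/zotero-snap/common/Zotero",
]


def find_zotero_db(home: Path, max_depth: int = 4) -> Optional[Path]:
    """Return the first zotero.sqlite found, or None so the caller can ask for a path."""
    for rel in CANDIDATE_DIRS:
        candidate = home / rel / "zotero.sqlite"
        if candidate.is_file():
            return candidate

    # Fallback: bounded-depth walk, roughly `find ~ -maxdepth 4 -name zotero.sqlite`.
    for root, dirs, files in os.walk(home):
        dirs.sort()  # deterministic traversal order
        depth = len(Path(root).relative_to(home).parts)
        if depth + 1 >= max_depth:
            dirs.clear()  # files below this level would exceed max_depth
        if "zotero.sqlite" in files:
            return Path(root) / "zotero.sqlite"
    return None
```

Calling `find_zotero_db(Path.home())` returns a path, or `None` when the user has to be asked.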
Run the bundled script:
python3 <skill-base-dir>/parse_zotero.py <path-to-zotero.sqlite> <output_dir>
The script handles: copying the DB to avoid locking, pivot queries to avoid cartesian products, author extraction, cite key deduplication, topic classification, and generating structured output.
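To illustrate the "copy the DB to avoid locking" idea, here is a minimal sketch. It is not the actual code of parse_zotero.py; the function name `open_snapshot` and its parameters are hypothetical. It copies the database file into a scratch directory and opens the copy read-only, so a running Zotero instance never contends with the reader.

```python
import shutil
import sqlite3
from pathlib import Path


def open_snapshot(db_path: Path, scratch_dir: Path) -> sqlite3.Connection:
    """Copy the database into scratch_dir and open the copy read-only."""
    snapshot = scratch_dir / db_path.name
    shutil.copy2(db_path, snapshot)
    # mode=ro makes SQLite reject writes, so the snapshot cannot be modified by accident.
    uri = snapshot.resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Queries then run against the snapshot only; the original file is read once during the copy.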
Important — treat <output_dir> as a scratch directory, not the KB. The script writes legacy-format index files (a topic index and a .bib file) into <output_dir>. Pick a temp path (e.g., /tmp/zotero-export-$$/). Steps 3–6 are the authoritative writes — they read those intermediate files from <output_dir> as input data, then emit .raw/{arxiv,doi}/<id>.json into $KB and append to $KB/references.bib. After Steps 3–6 finish, the contents of <output_dir> can be deleted.
Review the output — the script's topic classification uses keyword matching and may need manual adjustment. Check the topic distribution it prints and offer to re-classify if the user's field isn't well covered by the default patterns.
For papers missing abstracts or DOIs, find the PDF via the itemAttachments table. PDFs are at <zotero-data-dir>/storage/<key>/<filename>.pdf. Read them to extract the abstract.
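The attachment lookup might look like the sketch below. The column names (`parentItemID`, `contentType`, `path`, `items.key`) and the `storage:` prefix on stored file paths are assumptions about the Zotero schema; check them against your copy of the database before relying on them. The function name `pdf_paths` is hypothetical.

```python
import sqlite3
from pathlib import Path
from typing import List

# Assumed schema: itemAttachments rows point at a parent item, and the attachment's
# own row in `items` carries the storage key used as the folder name.
ATTACHMENT_SQL = """
SELECT att_item.key, att.path
FROM itemAttachments AS att
JOIN items AS att_item ON att_item.itemID = att.itemID
WHERE att.parentItemID = ?
  AND att.contentType = 'application/pdf'
  AND att.path LIKE 'storage:%'
"""


def pdf_paths(conn: sqlite3.Connection, data_dir: Path, parent_item_id: int) -> List[Path]:
    """Return <data_dir>/storage/<key>/<filename> for each stored PDF of one paper."""
    rows = conn.execute(ATTACHMENT_SQL, (parent_item_id,)).fetchall()
    prefix = "storage:"
    return [data_dir / "storage" / key / path[len(prefix):] for key, path in rows]
```

Linked (non-stored) attachments are deliberately ignored here, since their paths do not live under the storage directory.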
PDF folder:
pdfgrep -r -i "KEYWORD" <folder> (install via package manager if missing, e.g., apt install pdfgrep or brew install pdfgrep).
Google Scholar:
Note: Google Scholar actively blocks automated access — WebFetch may hit CAPTCHAs or rate limits. If scraping fails, suggest alternatives: export BibTeX manually from the Scholar profile page (Scholar → select all → export BibTeX), use ORCID or DBLP profiles instead (both have machine-friendly APIs), or switch to the PDF folder method with downloaded papers.
Processing tips:
- Use a script (e.g., parse_zotero.py for Zotero). Don't try to do it inline with shell commands; even for small libraries, a script is more reliable and easier to debug.
- Adjust TOPIC_PATTERNS in the script or ask the user to provide keywords for their domain.
The KB target is decided by the caller:
# Standalone (indexes the user's own collection into the project KB):
KB=$(python3 skills/download-ref/helpers/resolve_kb.py)
# Invoked from /incarnate (indexes another researcher's collection into the advisor KB):
KB=$(python3 skills/download-ref/helpers/resolve_kb.py --advisor <slug>)
Ensure $KB/.raw/arxiv/ and $KB/.raw/doi/ exist.
.raw/ JSON per paper
For each indexed paper, write metadata to $KB/.raw/{arxiv,doi}/<id>.json in the exact shape fetch_metadata.py produces (top-level keys: title, authors, year, venue, abstract, externalIds, citationStyles, openAccessPdf). Use <safe-doi> (DOI with / → -) for DOI filenames.
For papers without a DOI or arXiv ID, skip — they don't fit the canonical KB; mention them to the user.
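The filename rule and the skip rule can be sketched together. The function name `raw_json_path` is hypothetical, and preferring the arXiv ID when a paper has both identifiers is an assumption the text above does not settle.

```python
from pathlib import Path
from typing import Optional


def raw_json_path(kb: Path, arxiv_id: Optional[str] = None,
                  doi: Optional[str] = None) -> Optional[Path]:
    """Return the .raw/ JSON path for a paper, or None if it has no usable ID."""
    if arxiv_id:
        # Assumption: arXiv wins when both IDs are present.
        return kb / ".raw" / "arxiv" / f"{arxiv_id}.json"
    if doi:
        safe_doi = doi.replace("/", "-")  # <safe-doi>: every "/" becomes "-"
        return kb / ".raw" / "doi" / f"{safe_doi}.json"
    return None  # no DOI or arXiv ID: skip and mention the paper to the user
```

Collecting the papers for which this returns `None` gives the list to report back to the user.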
Per indexed paper:
KEY=$(python3 skills/download-ref/helpers/append_bibtex.py propose \
--kb "$KB" --id "$ID" --type "$TYPE" | python3 -c 'import sys,json; print(json.load(sys.stdin)["proposed_key"])')
python3 skills/download-ref/helpers/append_bibtex.py append \
--kb "$KB" --id "$ID" --type "$TYPE" --key "$KEY" \
--bib "$KB/references.bib"
Auto-accept the proposed key — per-paper confirmation is unworkable at 100+ papers.
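For a large collection, the propose/append pair can be driven from a loop. The sketch below uses only the subcommands and flags shown above; the function names are hypothetical, and the accepted values for `--type` are not specified here, so pass whatever the helper expects.

```python
import json
import subprocess
from typing import List

HELPER = "skills/download-ref/helpers/append_bibtex.py"


def propose_cmd(kb: str, paper_id: str, id_type: str) -> List[str]:
    """Command that asks the helper to propose a cite key."""
    return ["python3", HELPER, "propose", "--kb", kb, "--id", paper_id, "--type", id_type]


def append_cmd(kb: str, paper_id: str, id_type: str, key: str) -> List[str]:
    """Command that appends the entry under the given key to references.bib."""
    return ["python3", HELPER, "append", "--kb", kb, "--id", paper_id,
            "--type", id_type, "--key", key, "--bib", f"{kb}/references.bib"]


def parse_proposed_key(stdout: str) -> str:
    """Extract the key from the helper's JSON output."""
    return json.loads(stdout)["proposed_key"]


def register(kb: str, paper_id: str, id_type: str) -> str:
    """Propose a key, auto-accept it, and append the BibTeX entry."""
    result = subprocess.run(propose_cmd(kb, paper_id, id_type),
                            check=True, capture_output=True, text=True)
    key = parse_proposed_key(result.stdout)
    subprocess.run(append_cmd(kb, paper_id, id_type, key), check=True)
    return key
```

With `check=True`, a failing helper call raises `subprocess.CalledProcessError`, so a batch run stops at the first paper that fails rather than silently skipping it.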
python3 skills/download-ref/helpers/index.py \
--kb "$KB" \
--title "<advisor-slug or 'project'> — researcher index" \
--source-note "Built by /researchstyle on $(date -u +%Y-%m-%d)."
Write or extend $KB/NOTES.md with:
Reference papers as [@<cite-key>]. If NOTES.md exists, extend rather than overwrite.
After Steps 3–6 complete, the KB is populated with metadata but PDFs aren't downloaded yet. Ask the user via AskUserQuestion:
"Index built. What next?"
- (a) Fetch PDFs for all refs: invokes download-ref --from-bib $KB/references.bib --kb $KB (bulk mode)
- (b) Add specific refs by ID: invokes download-ref with explicit IDs (single-shot, per-ref cite-key confirmation)
- (c) Continue to /brainstorm-ideas: start brainstorming with the indexed literature loaded
- (d) Stop: leave the KB as-is
For (a) and (b), see skills/download-ref/SKILL.md. For (c), invoke /brainstorm-ideas in the current session.